IBM Models

The IBM Models are a sequence of models of increasing complexity, starting with lexical translation probabilities and then adding models for reordering and word duplication.
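As a rough illustration of what estimating lexical translation probabilities involves, the sketch below runs EM for a Model 1 style lexical table t(f | e) on a toy corpus. The function name `train_model1`, the `<null>` token, and the toy sentences are assumptions made for this example; they are not taken from any toolkit, and the sketch omits the reordering and fertility components of the later models.

```python
from collections import defaultdict

NULL = "<null>"  # hypothetical empty word that every sentence may align to


def train_model1(corpus, iterations=10):
    """Estimate t[(f, e)] ~ p(f | e) with EM.

    corpus: list of (e_tokens, f_tokens) sentence pairs.
    """
    f_vocab = {f for _, fs in corpus for f in fs}
    # Uniform initialisation over the vocabulary of generated words.
    t = defaultdict(lambda: 1.0 / len(f_vocab))
    for _ in range(iterations):
        count = defaultdict(float)  # expected counts c(f, e)
        total = defaultdict(float)  # expected counts c(e)
        for es, fs in corpus:
            es = [NULL] + list(es)
            for f in fs:
                # E-step: posterior over which e generated f.
                z = sum(t[(f, e)] for e in es)
                for e in es:
                    p = t[(f, e)] / z
                    count[(f, e)] += p
                    total[e] += p
        # M-step: renormalise expected counts per conditioning word e.
        t = defaultdict(float, {(f, e): c / total[e] for (f, e), c in count.items()})
    return t


toy_corpus = [
    (["the", "house"], ["das", "Haus"]),
    (["the", "book"], ["das", "Buch"]),
    (["a", "book"], ["ein", "Buch"]),
]
table = train_model1(toy_corpus)
```

After training, `table[("das", "the")]` holds the estimated probability of generating "das" from "the"; on this toy data it should dominate the other entries for "the".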

IBM Models is the main subject of 45 publications. 28 are discussed here.

Topics in WordBasedModels

Publications

The IBM models are described in detail by Brown et al. (1993), who originally presented the statistical machine translation approach in earlier papers (Brown et al., 1988; Brown et al., 1990). See also the introductions by Knight (1997, 1999).

During a 1999 Johns Hopkins University Workshop, the IBM models were implemented in a toolkit called GIZA (Al-Onaizan et al., 1999), later refined into GIZA++ by Och and Ney (2000). GIZA++ is open source and widely used. The estimation of the bilingual word classes is described by Och (1999).

Instead of hill-climbing to the Viterbi alignment, algorithms such as Estimation of Distributions may be employed (Rodríguez et al., 2006). The stochastic modelling approach for translation is described by Ney (2001).

A variation on the IBM models is the HMM model, which uses relative distortion but not fertility (Vogel et al., 1996). This model was extended by treating jumps to other source words differently from repeated translations of the same source word (Toutanova et al., 2002), and by conditioning jumps on the source word (He, 2007).
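To make "relative distortion" concrete, the following sketch computes an HMM-style transition distribution that depends only on the jump width between the previous and the current aligned position, normalised over the positions available in the sentence. The function name `jump_transition_probs`, the 0-based positions, the uniform fallback, and the example counts are assumptions for illustration; the extensions mentioned above are not modelled.

```python
def jump_transition_probs(jump_counts, prev_pos, sent_len):
    """Return p(i | prev_pos, sent_len) for i in 0..sent_len-1.

    jump_counts maps a jump width (i - prev_pos) to a non-negative weight,
    so the distribution depends on relative, not absolute, position.
    """
    weights = [jump_counts.get(i - prev_pos, 0.0) for i in range(sent_len)]
    z = sum(weights)
    if z == 0.0:
        # Assumed fallback: uniform when no jump in range has any mass.
        return [1.0 / sent_len] * sent_len
    return [w / z for w in weights]


# Hypothetical jump weights favouring a monotone step of +1.
example_counts = {-1: 1.0, 0: 2.0, 1: 5.0, 2: 2.0}
probs = jump_transition_probs(example_counts, prev_pos=1, sent_len=4)
```

Because the weights are indexed by jump width, the same table is reused for every previous position and sentence length; only the normalisation changes.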

IBM models have been extended using maximum entropy models (Foster, 2000) to include position (Foster, 2000) and part-of-speech tag information (Kim et al., 2000), even within the EM training algorithm (García-Varea et al., 2002; García-Varea et al., 2002b). Improvements have also been obtained by adding bilingual dictionaries (Wu and Wang, 2004) and context vectors estimated from monolingual corpora (Wang and Zhou, 2004), lemmatizing words (Dejean et al., 2003; Popovic and Ney, 2004; Pianta and Bentivogli, 2004), interpolating lemma and word alignment models (Zhang and Sumita, 2007), as well as smoothing (Moore, 2004). Mixture models for word translation probabilities have been explored to automatically learn topic-dependent translation models (Zhao and Xing, 2006; Civera and Juan, 2006). Packing words that typically occur in many-to-one alignments into a single token may improve alignment quality (Ma et al., 2007).

Benchmarks

Discussion

New Publications

Eyigöz, Elif and Gildea, Daniel and Oflazer, Kemal (2013): Multi-Rate HMMs for Word Alignment, Proceedings of the Eighth Workshop on Statistical Machine Translation
@InProceedings{eyigoz-gildea-oflazer:2013:WMT,
author = {Eyig\"{o}z, Elif and Gildea, Daniel and Oflazer, Kemal},
title = {{Multi-Rate} {HMMs} for Word Alignment},
booktitle = {Proceedings of the Eighth Workshop on Statistical Machine Translation},
month = {August},
address = {Sofia, Bulgaria},
publisher = {Association for Computational Linguistics},
pages = {494--502},
url = {http://www.aclweb.org/anthology/W13-2262},
year = 2013
}
Eyigöz et al. (2013)
Schulz, Philip and Aziz, Wilker (2016): Fast Collocation-Based Bayesian HMM Word Alignment, Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers
@InProceedings{schulz-aziz:2016:COLING,
author = {Schulz, Philip and Aziz, Wilker},
title = {Fast Collocation-Based Bayesian HMM Word Alignment},
booktitle = {Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers},
month = {December},
address = {Osaka, Japan},
publisher = {The COLING 2016 Organizing Committee},
pages = {3146--3155},
url = {http://aclweb.org/anthology/C16-1296},
year = 2016
}
Schulz and Aziz (2016)
Dyer, Chris and Chahuneau, Victor and Smith, Noah A. (2013): A Simple, Fast, and Effective Reparameterization of IBM Model 2, Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
@InProceedings{dyer-chahuneau-smith:2013:NAACL-HLT,
author = {Dyer, Chris and Chahuneau, Victor and Smith, Noah A.},
title = {A Simple, Fast, and Effective Reparameterization of IBM Model 2},
booktitle = {Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies},
month = {June},
address = {Atlanta, Georgia},
publisher = {Association for Computational Linguistics},
pages = {644--648},
url = {http://www.aclweb.org/anthology/N13-1073},
year = 2013
}
Dyer et al. (2013)
Gal, Yarin and Blunsom, Phil (2013): A Systematic Bayesian Treatment of the IBM Alignment Models, Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
@InProceedings{gal-blunsom:2013:NAACL-HLT,
author = {Gal, Yarin and Blunsom, Phil},
title = {A Systematic Bayesian Treatment of the IBM Alignment Models},
booktitle = {Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies},
month = {June},
address = {Atlanta, Georgia},
publisher = {Association for Computational Linguistics},
pages = {969--977},
url = {http://www.aclweb.org/anthology/N13-1117},
year = 2013
}
Gal and Blunsom (2013)
Simion, Andrei and Collins, Michael and Stein, Cliff (2013): A Convex Alternative to IBM Model 2, Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing
@InProceedings{simion-collins-stein:2013:EMNLP,
author = {Simion, Andrei and Collins, Michael and Stein, Cliff},
title = {A Convex Alternative to {IBM} Model 2},
booktitle = {Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing},
month = {October},
address = {Seattle, Washington, USA},
publisher = {Association for Computational Linguistics},
pages = {1574--1583},
url = {http://www.aclweb.org/anthology/D13-1164},
year = 2013
}
Simion et al. (2013)
Simion, Andrei and Collins, Michael and Stein, Cliff (2014): Some Experiments with a Convex IBM Model 2, Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, volume 2: Short Papers
@InProceedings{simion-collins-stein:2014:EACL2014-SP,
author = {Simion, Andrei and Collins, Michael and Stein, Cliff},
title = {Some Experiments with a Convex IBM Model 2},
booktitle = {Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, volume 2: Short Papers},
month = {April},
address = {Gothenburg, Sweden},
publisher = {Association for Computational Linguistics},
pages = {180--184},
url = {http://www.aclweb.org/anthology/E14-4035},
year = 2014
}
Simion et al. (2014)
Schoenemann, Thomas (2013): Training Nondeficient Variants of IBM-3 and IBM-4 for Word Alignment, Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
@InProceedings{schoenemann:2013:ACL2013,
author = {Schoenemann, Thomas},
title = {Training Nondeficient Variants of IBM-3 and IBM-4 for Word Alignment},
booktitle = {Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
month = {August},
address = {Sofia, Bulgaria},
publisher = {Association for Computational Linguistics},
pages = {22--31},
url = {http://www.aclweb.org/anthology/P13-1003},
year = 2013
}
Schoenemann (2013)
Gelling, Douwe and Cohn, Trevor (2014): Simple extensions and POS Tags for a reparameterised IBM Model 2, Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
@InProceedings{gelling-cohn:2014:P14-2,
author = {Gelling, Douwe and Cohn, Trevor},
title = {Simple extensions and POS Tags for a reparameterised IBM Model 2},
booktitle = {Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)},
month = {June},
address = {Baltimore, Maryland},
publisher = {Association for Computational Linguistics},
pages = {150--154},
url = {http://www.aclweb.org/anthology/P14-2025},
year = 2014
}
Gelling and Cohn (2014)
Vaswani, Ashish and Huang, Liang and Chiang, David (2012): Smaller Alignment Models for Better Translations: Unsupervised Word Alignment with the l0-norm, Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
@InProceedings{vaswani-huang-chiang:2012:ACL2012,
author = {Vaswani, Ashish and Huang, Liang and Chiang, David},
title = {Smaller Alignment Models for Better Translations: Unsupervised Word Alignment with the l0-norm},
booktitle = {Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
month = {July},
address = {Jeju Island, Korea},
publisher = {Association for Computational Linguistics},
pages = {311--319},
url = {http://www.aclweb.org/anthology/P12-1033},
year = 2012
}
Vaswani et al. (2012)
Riley, Darcey and Gildea, Daniel (2012): Improving the IBM Alignment Models Using Variational Bayes, Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
@InProceedings{riley-gildea:2012:ACL2012short,
author = {Riley, Darcey and Gildea, Daniel},
title = {Improving the IBM Alignment Models Using Variational Bayes},
booktitle = {Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)},
month = {July},
address = {Jeju Island, Korea},
publisher = {Association for Computational Linguistics},
pages = {306--310},
url = {http://www.aclweb.org/anthology/P12-2060},
year = 2012
}
Riley and Gildea (2012)
Sujith Ravi and Kevin Knight (2010): Squibs: Does GIZA++ Make Search Errors?, Computational Linguistics
@Article{CL:2010-3001,
author = {Sujith Ravi and Kevin Knight},
title = {Squibs: Does {GIZA++} Make Search Errors?},
journal = {Computational Linguistics},
volume = {36},
number = {3},
url = {http://aclweb.org/anthology-new/J/J10/J10-3001.pdf},
year = 2010
}
Ravi and Knight (2010)
Christer Samuelsson (2012): HAL: Challenging Three Key Aspects of IBM-style Statistical Machine Translation, Proceedings of the Tenth Conference of the Association for Machine Translation in the Americas (AMTA)
@inproceedings{AMTA-2012-Samuelsson,
author = {Christer Samuelsson},
title = {{HAL}: Challenging Three Key Aspects of {IBM}-style Statistical Machine Translation},
url = {http://www.mt-archive.info/AMTA-2012-Samuelsson.pdf},
booktitle = {Proceedings of the Tenth Conference of the Association for Machine Translation in the Americas (AMTA)},
location = {San Diego, California},
year = 2012
}
Samuelsson (2012)
Brunning, Jamie and de Gispert, Adrià and Byrne, William (2009): Context-Dependent Alignment Models for Statistical Machine Translation, Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
@InProceedings{brunning-degispert-byrne:2009:NAACLHLT09,
author = {Brunning, Jamie and de Gispert, Adri\`{a} and Byrne, William},
title = {Context-Dependent Alignment Models for Statistical Machine Translation},
booktitle = {Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics},
month = {June},
address = {Boulder, Colorado},
publisher = {Association for Computational Linguistics},
pages = {110--118},
url = {http://www.aclweb.org/anthology/N/N09/N09-1013},
year = 2009
}
Brunning et al. (2009)
Gao, Qin and Bach, Nguyen and Vogel, Stephan (2010): A Semi-Supervised Word Alignment Algorithm with Partial Manual Alignments, Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR
@InProceedings{gao-bach-vogel:2010:WMT,
author = {Gao, Qin and Bach, Nguyen and Vogel, Stephan},
title = {A Semi-Supervised Word Alignment Algorithm with Partial Manual Alignments},
booktitle = {Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR},
month = {July},
address = {Uppsala, Sweden},
publisher = {Association for Computational Linguistics},
pages = {1--10},
url = {http://www.aclweb.org/anthology/W10-1701},
year = 2010
}
Gao et al. (2010)
Schoenemann, Thomas (2010): Computing Optimal Alignments for the IBM-3 Translation Model, Proceedings of the Fourteenth Conference on Computational Natural Language Learning
@InProceedings{schoenemann:2010:CONLL,
author = {Schoenemann, Thomas},
title = {Computing Optimal Alignments for the {IBM}-3 Translation Model},
booktitle = {Proceedings of the Fourteenth Conference on Computational Natural Language Learning},
month = {July},
address = {Uppsala, Sweden},
publisher = {Association for Computational Linguistics},
pages = {98--106},
url = {http://www.aclweb.org/anthology/W10-2913},
year = 2010
}
Schoenemann (2010)
Toutanova, Kristina and Galley, Michel (2011): Why Initialization Matters for IBM Model 1: Multiple Optima and Non-Strict Convexity, Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies
@InProceedings{toutanova-galley:2011:ACL-HLT2011,
author = {Toutanova, Kristina and Galley, Michel},
title = {Why Initialization Matters for IBM Model 1: Multiple Optima and Non-Strict Convexity},
booktitle = {Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies},
month = {June},
address = {Portland, Oregon, USA},
publisher = {Association for Computational Linguistics},
pages = {461--466},
url = {http://www.aclweb.org/anthology/P11-2081},
year = 2011
}
Toutanova and Galley (2011)
Lopez, Adam and Resnik, Philip (2005): Improved HMM Alignment Models for Languages with Scarce Resources, Proceedings of the ACL Workshop on Building and Using Parallel Texts
@InProceedings{lopez-resnik:2005:WPT,
author = {Lopez, Adam and Resnik, Philip},
title = {Improved {HMM} Alignment Models for Languages with Scarce Resources},
booktitle = {Proceedings of the ACL Workshop on Building and Using Parallel Texts},
month = {June},
address = {Ann Arbor, Michigan},
publisher = {Association for Computational Linguistics},
pages = {83--86},
url = {http://www.aclweb.org/anthology/W/W05/W05-0812},
year = 2005
}
Lopez and Resnik (2005)

MT Research Survey Wiki

A Comprehensive Survey of Neural and Statistical Machine Translation Research Publications
