Word-Based Models
Word-based models were the first models for statistical machine translation; they are built around the translation of individual words.
The topic Word-Based Models and its 13 sub-topics are the main subject of 395 publications.
Publications
The initial approach to statistical machine translation led to the development of the IBM Models (Brown et al., 1988; Brown et al., 1990; Brown et al., 1993). A popular implementation of the training of these models is GIZA++ (Och and Ney, 2000), which is still used for word alignment as an initial training step of more complex models.

Peter F. Brown, John Cocke, Stephen A. Della-Pietra, Vincent J. Della-Pietra, Frederick Jelinek, Robert L. Mercer and Paul Rossin (1988): A Statistical Approach to Language Translation, Proceedings of the International Conference on Computational Linguistics (COLING).
Peter F. Brown, John Cocke, Stephen A. Della-Pietra, Vincent J. Della-Pietra, Frederick Jelinek, John D. Lafferty, Robert L. Mercer and Paul Rossin (1990): A Statistical Approach to Machine Translation, Computational Linguistics 16(2), pages 76--85.
Peter F. Brown, Stephen A. Della-Pietra, Vincent J. Della-Pietra and Robert L. Mercer (1993): The Mathematics of Statistical Machine Translation, Computational Linguistics 19(2), pages 263--313. http://acl.ldc.upenn.edu/J/J93/J93-2003.pdf
Franz Josef Och and Hermann Ney (2000): Improved Statistical Alignment Models, Proceedings of the 38th Annual Meeting of the Association of Computational Linguistics (ACL). http://acl.ldc.upenn.edu/P/P00/P00-1056.pdf
Benchmarks
Discussion
None of the currently competitive machine translation systems is a word-based model. Nevertheless, principles such as generative modelling and the use of the expectation maximization algorithm are still core methods today. Moreover, word alignment based on word-based models is more often than not the first step in training more complex models.
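To make the role of expectation maximization and word alignment concrete, the following is a minimal sketch of EM training for a lexical translation table in the style of IBM Model 1, followed by extraction of the most likely alignment. It is an illustration under simplifying assumptions, not the GIZA++ implementation: the function names `train_model1` and `viterbi_align`, the toy corpus, and the iteration count are all hypothetical, and the NULL word and the higher IBM Models are omitted for brevity.

```python
from collections import defaultdict


def train_model1(corpus, iterations=10):
    """Estimate t(f | e) with EM in the style of IBM Model 1 (no NULL word).

    corpus: list of (source_tokens, target_tokens) sentence pairs.
    Returns a dict mapping (target_word, source_word) to a probability.
    """
    target_vocab = {f for _, tgt in corpus for f in tgt}
    # Uniform initialisation over all co-occurring word pairs.
    t = {(f, e): 1.0 / len(target_vocab)
         for src, tgt in corpus for f in tgt for e in src}

    for _ in range(iterations):
        count = defaultdict(float)  # expected count of f being generated by e
        total = defaultdict(float)  # expected count of e generating any word
        for src, tgt in corpus:
            for f in tgt:
                # E-step: posterior over which source word generated f.
                z = sum(t[(f, e)] for e in src)
                for e in src:
                    p = t[(f, e)] / z
                    count[(f, e)] += p
                    total[e] += p
        # M-step: renormalise the expected counts per source word.
        t = {(f, e): c / total[e] for (f, e), c in count.items()}
    return t


def viterbi_align(t, src, tgt):
    """For each target word, return the index of its most likely source word."""
    return [max(range(len(src)), key=lambda i: t.get((f, src[i]), 0.0))
            for f in tgt]


# Hypothetical toy corpus: German source, English target.
corpus = [
    (["das", "Haus"], ["the", "house"]),
    (["das", "Buch"], ["the", "book"]),
    (["ein", "Buch"], ["a", "book"]),
]
t = train_model1(corpus)
alignment = viterbi_align(t, ["das", "Haus"], ["the", "house"])
print(alignment)
```

The E-step distributes each target word over the source words of its sentence in proportion to the current table, and the M-step turns those expected counts back into probabilities. Because the model is generative and the alignments are hidden, the same table that is learned for translation also yields the word alignment that more complex models take as their starting point.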
New Publications