Word Graphs
For various applications, it is useful to view the space of all possible (or probable) translations as a word graph in which each path constitutes one possible translation.
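As a rough illustration of this view, the sketch below stores a toy word graph as an adjacency list and enumerates every path from the start node to the final node. The graph contents, the node numbering, and the helper name `translations` are made up for illustration and do not come from any of the publications listed here.

```python
from typing import Dict, Iterator, List, Tuple

# Hypothetical toy word graph: node -> list of (word, next_node) edges.
# Node 0 is the start node and node 3 is the final node (assumed layout).
WordGraph = Dict[int, List[Tuple[str, int]]]

graph: WordGraph = {
    0: [("the", 1), ("this", 1)],
    1: [("house", 2), ("home", 2)],
    2: [("is small", 3)],
    3: [],
}


def translations(g: WordGraph, node: int, final: int) -> Iterator[str]:
    """Yield the word sequence of every path from `node` to `final`."""
    if node == final:
        yield ""
        return
    for word, nxt in g[node]:
        for rest in translations(g, nxt, final):
            # strip() removes the trailing space left by the empty suffix
            yield (word + " " + rest).strip()


all_paths = list(translations(graph, 0, 3))
for sentence in all_paths:
    print(sentence)
```

Four edges with two alternatives at two positions already yield four distinct translations, which is why a graph is a far more compact representation than an explicit list.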
Word graphs are the main subject of 4 publications, all 4 of which are discussed here.
Publications
The generation of word graphs from the search graph of the decoding process is first described by
Ueffing, Nicola and Och, Franz Josef and Ney, Hermann (2002):
Generation of Word Graphs in Statistical Machine Translation, Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP)
@inproceedings{Ueffing:2002,
author = {Ueffing, Nicola and Och, Franz Josef and Ney, Hermann},
title = {Generation of Word Graphs in Statistical Machine Translation},
booktitle = {Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP)},
month = {July},
address = {Philadelphia},
publisher = {Association for Computational Linguistics},
pages = {156--163},
year = 2002
}
Ueffing et al. (2002).
Zens, Richard and Ney, Hermann (2005):
Word Graphs for Statistical Machine Translation, Proceedings of the ACL Workshop on Building and Using Parallel Texts
@InProceedings{zens-ney:2005:WPT,
author = {Zens, Richard and Ney, Hermann},
title = {Word Graphs for Statistical Machine Translation},
booktitle = {Proceedings of the ACL Workshop on Building and Using Parallel Texts},
month = {June},
address = {Ann Arbor, Michigan},
publisher = {Association for Computational Linguistics},
pages = {191--198},
url = {http://www.aclweb.org/anthology/W/W05/W05-0834},
year = 2005
}
Zens and Ney (2005) describe additional pruning methods to achieve a more compact graph. Word graphs may be converted into confusion networks or mined for n-best lists. Alternatively, the n-best list may be extended with additional candidate translations generated with a language model that is trained on the n-best list
Chen, Boxing and Federico, Marcello and Cettolo, Mauro (2007):
Better N-best Translations through Generative n-gram Language Models, Proceedings of the MT Summit XI
@inproceedings{Chen:2007:MTSummit,
author = {Chen, Boxing and Federico, Marcello and Cettolo, Mauro},
title = {Better N-best Translations through Generative n-gram Language Models},
url = {http://mt-archive.info/MTS-2007-Chen.pdf},
googlescholar = {14409701864859694322},
booktitle = {Proceedings of the {MT} Summit XI},
year = 2007
}
(Chen et al., 2007) or from overlapping n-grams in the original n-best list
Chen, Boxing and Zhang, Min and Aw, Aiti and Li, Haizhou (2008):
Regenerating Hypotheses for Statistical Machine Translation, Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)
@InProceedings{chen-EtAl:2008:PAPERS1,
author = {Chen, Boxing and Zhang, Min and Aw, Aiti and Li, Haizhou},
title = {Regenerating Hypotheses for Statistical Machine Translation},
booktitle = {Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)},
month = {August},
address = {Manchester, UK},
publisher = {Coling 2008 Organizing Committee},
pages = {105--112},
url = {http://www.aclweb.org/anthology/C08-1014},
year = 2008
}
(Chen et al., 2008).
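To make the idea of mining a word graph for an n-best list concrete, here is a minimal sketch that runs a generic best-first search over a weighted toy graph. It is not the procedure of any of the cited papers. The edge costs (imagined as scaled negative log-probabilities), the graph itself, and the function name `n_best` are assumptions for illustration, and the search relies on all edge costs being non-negative.

```python
import heapq
from typing import Dict, List, Tuple

# Hypothetical weighted word graph: node -> list of (word, cost, next_node).
# Lower cost means a more probable edge; costs must be non-negative.
WeightedGraph = Dict[int, List[Tuple[str, float, int]]]

weighted_graph: WeightedGraph = {
    0: [("the", 1.0, 1), ("this", 7.0, 1)],
    1: [("house", 2.0, 2), ("home", 5.0, 2)],
    2: [("is small", 1.0, 3)],
    3: [],
}


def n_best(g: WeightedGraph, start: int, final: int, n: int) -> List[Tuple[float, str]]:
    """Return up to n lowest-cost paths from start to final, best first."""
    # Queue entries: (cost so far, tie-breaker, current node, words so far).
    queue: List[Tuple[float, int, int, Tuple[str, ...]]] = [(0.0, 0, start, ())]
    counter = 1  # unique tie-breaker so the heap never compares word tuples
    results: List[Tuple[float, str]] = []
    while queue and len(results) < n:
        cost, _, node, words = heapq.heappop(queue)
        if node == final:
            # Complete paths leave the queue in order of increasing cost.
            results.append((cost, " ".join(words)))
            continue
        for word, edge_cost, nxt in g[node]:
            heapq.heappush(queue, (cost + edge_cost, counter, nxt, words + (word,)))
            counter += 1
    return results


top_two = n_best(weighted_graph, 0, 3, 2)
for cost, sentence in top_two:
    print(cost, sentence)
```

This brute-force expansion of partial paths is adequate for a toy graph. On realistic word graphs the number of partial paths grows quickly, which is one reason to prune the graph first.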
Benchmarks
Discussion
Related Topics
New Publications