Pruning Large Translation Models

Physical constraints, such as limited memory, may restrict the size of translation models that can be used in practice, so we may have to prune the models by removing the least relevant phrase pairs.

Pruning Large Models is the main subject of 23 publications. 11 are discussed here.

Publications

Quirk and Menezes (2006) argue that extracting only minimal phrases, i.e. the smallest phrase pairs that map each entire sentence pair, does not hurt performance. This is also the basis of the n-gram translation model (Mariño et al., 2006; Costa-jussà et al., 2007), a variant of the phrase-based model.

Discarding unlikely phrase pairs based on significance tests on their more-than-random occurrence reduces the phrase table drastically and may even yield increases in performance (Johnson et al., 2007). Wu and Wang (2007) propose a method for filtering the noise in the phrase translation table based on a log-likelihood ratio. Kutsumi et al. (2005) use a support vector machine for cleaning phrase tables.
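To make the idea of significance-based pruning concrete, here is a minimal sketch in Python. It assumes a one-sided Fisher exact test on sentence-level co-occurrence counts and keeps a phrase pair only if its negative log p-value exceeds a threshold. The function names, the toy counts, and the threshold value are all hypothetical illustrations, not taken from any of the cited papers.

```python
from math import comb, log


def neg_log_p_value(n_pairs, count_src, count_tgt, count_joint):
    """Negative log of a one-sided Fisher exact p-value.

    The p-value is the probability of observing at least `count_joint`
    co-occurrences if the source and target phrases were independent
    (hypergeometric tail). Counts are numbers of sentence pairs.
    """
    upper = min(count_src, count_tgt)
    tail = sum(
        comb(count_src, k) * comb(n_pairs - count_src, count_tgt - k)
        for k in range(count_joint, upper + 1)
    )
    total = comb(n_pairs, count_tgt)
    # Work with logs of exact integers to avoid floating-point underflow.
    return log(total) - log(tail)


def prune_by_significance(table, n_pairs, threshold):
    """Keep only phrase pairs whose -log(p) is above `threshold`."""
    return {
        pair: counts
        for pair, counts in table.items()
        if neg_log_p_value(n_pairs, *counts) > threshold
    }


# Made-up counts: (source count, target count, joint count).
N_PAIRS = 1000
TABLE = {
    ("maison", "house"): (10, 10, 8),   # co-occur far more often than chance
    ("de", "the"): (500, 500, 250),     # co-occur about as often as chance
    ("xyzzy", "plugh"): (1, 1, 1),      # each seen once, together
}

# Assumed threshold: slightly above log(N), so that pairs seen exactly
# once together (whose -log(p) equals log(N)) are discarded.
KEPT = prune_by_significance(TABLE, N_PAIRS, threshold=log(N_PAIRS) + 0.01)
```

With these toy counts only the first pair survives. The threshold is the knob that trades table size against coverage, and a real system would tune it on held-out data.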

Such considerations may also be taken into account in a second-pass phrase extraction stage that does not extract bad phrase pairs (Zettlemoyer and Moore, 2007). When porting phrase-based models to small devices such as PDAs (Zhang and Vogel, 2007), the translation table has to be reduced to fit a fixed amount of memory. Eck et al. (2007, 2007b) prune the translation table based on how often a phrase pair was considered during decoding and how often it was used in the best translation. Sanchis-Trilles et al. (2011) retranslate the training corpus to re-estimate the phrase table from phrases used in the best derivations, which greatly reduces the size of the phrase table at some loss of quality.
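Usage-based pruning under a fixed size budget can be sketched as follows. This is only an illustration: it assumes that decoding statistics have already been collected per phrase pair, and the ranking rule (first by how often a pair was used in the best translation, then by how often it was considered) is a hypothetical choice, not the scoring function of the cited work.

```python
def prune_to_budget(usage, max_entries):
    """Keep the `max_entries` most useful phrase pairs.

    `usage` maps a phrase pair to (times_considered, times_used_in_best),
    i.e. statistics gathered while decoding some development text.
    """
    ranked = sorted(
        usage.items(),
        # Assumed ranking: usage in the best translation first,
        # with the consideration count as a tie-breaker.
        key=lambda item: (item[1][1], item[1][0]),
        reverse=True,
    )
    return dict(ranked[:max_entries])


# Made-up decoding statistics for four phrase pairs.
USAGE = {
    ("a", "x"): (100, 40),
    ("b", "y"): (5, 0),
    ("c", "z"): (50, 40),
    ("d", "w"): (200, 0),
}

SMALL_TABLE = prune_to_budget(USAGE, max_entries=2)
```

Expressing the budget as a number of entries keeps the sketch simple. A memory limit in bytes would additionally require an estimate of the storage cost of each entry.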

Benchmarks

Discussion

New Publications

Xu, Wenduan and Zhang, Yue and Williams, Philip and Koehn, Philipp (2013): Learning to Prune: Context-Sensitive Pruning for Syntactic MT, Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
@InProceedings{xu-EtAl:2013:Short1,
author = {Xu, Wenduan and Zhang, Yue and Williams, Philip and Koehn, Philipp},
title = {Learning to Prune: Context-Sensitive Pruning for Syntactic MT},
booktitle = {Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)},
month = {August},
address = {Sofia, Bulgaria},
publisher = {Association for Computational Linguistics},
pages = {352--357},
url = {http://www.aclweb.org/anthology/P13-2063},
year = 2013
}
Xu et al. (2013)
Matthias Eck and Stephan Vogel and Alex Waibel (2005): Low Cost Portability for Statistical Machine Translation based on N-gram Frequency and TF-IDF, Proc. of the International Workshop on Spoken Language Translation
@InProceedings{eck:2005b:iwslt,
author = {Matthias Eck and Stephan Vogel and Alex Waibel},
title = {Low Cost Portability for Statistical Machine Translation based on N-gram Frequency and {TF-IDF}},
url = {http://20.210-193-52.unknown.qala.com.sg/archive/iwslt\_05/papers/slt5\_061.pdf},
googlescholar = {18408661374131908591},
booktitle = {Proc. of the International Workshop on Spoken Language Translation},
location = {Pittsburgh, PA, USA},
month = {October},
year = 2005
}
Eck et al. (2005)
Martzoukos, Spyros and Monz, Christof and Costa Florencio, Christophe (2014): Maximizing Component Quality in Bilingual Word-Aligned Segmentations, Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics
@InProceedings{martzoukos-monz-costaflorencio:2014:EACL,
author = {Martzoukos, Spyros and Monz, Christof and Costa Florencio, Christophe},
title = {Maximizing Component Quality in Bilingual Word-Aligned Segmentations},
booktitle = {Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics},
month = {April},
address = {Gothenburg, Sweden},
publisher = {Association for Computational Linguistics},
pages = {30--38},
url = {http://www.aclweb.org/anthology/E14-1004},
year = 2014
}
Martzoukos et al. (2014)
Ling, Wang and Graça, João and Trancoso, Isabel and Black, Alan (2012): Entropy-based Pruning for Phrase-based Machine Translation, Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
@InProceedings{ling-EtAl:2012:EMNLP-CoNLL,
author = {Ling, Wang and Gra\c{c}a, Jo\~{a}o and Trancoso, Isabel and Black, Alan},
title = {Entropy-based Pruning for Phrase-based Machine Translation},
booktitle = {Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning},
month = {July},
address = {Jeju Island, Korea},
publisher = {Association for Computational Linguistics},
pages = {962--971},
url = {http://www.aclweb.org/anthology/D12-1088},
year = 2012
}
Ling et al. (2012)
Zens, Richard and Stanton, Daisy and Xu, Peng (2012): A Systematic Comparison of Phrase Table Pruning Techniques, Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
@InProceedings{zens-stanton-xu:2012:EMNLP-CoNLL,
author = {Zens, Richard and Stanton, Daisy and Xu, Peng},
title = {A Systematic Comparison of Phrase Table Pruning Techniques},
booktitle = {Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning},
month = {July},
address = {Jeju Island, Korea},
publisher = {Association for Computational Linguistics},
pages = {972--983},
url = {http://www.aclweb.org/anthology/D12-1089},
year = 2012
}
Zens et al. (2012)
Lee, Seung-Wook and Zhang, Dongdong and Li, Mu and Zhou, Ming and Rim, Hae-Chang (2012): Translation Model Size Reduction for Hierarchical Phrase-based Statistical Machine Translation, Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
@InProceedings{lee-EtAl:2012:ACL2012short,
author = {Lee, Seung-Wook and Zhang, Dongdong and Li, Mu and Zhou, Ming and Rim, Hae-Chang},
title = {Translation Model Size Reduction for Hierarchical Phrase-based Statistical Machine Translation},
booktitle = {Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)},
month = {July},
address = {Jeju Island, Korea},
publisher = {Association for Computational Linguistics},
pages = {291--295},
url = {http://www.aclweb.org/anthology/P12-2057},
year = 2012
}
Lee et al. (2012)
J Howard Johnson (2012): Conditional Significance Pruning: Discarding More of Huge Phrase Tables, Proceedings of the Tenth Conference of the Association for Machine Translation in the Americas (AMTA)
@inproceedings{AMTA-2012-Johnson,
author = {J Howard Johnson},
title = {Conditional Significance Pruning: Discarding More of Huge Phrase Tables},
url = {http://www.mt-archive.info/AMTA-2012-Johnson.pdf},
booktitle = {Proceedings of the Tenth Conference of the Association for Machine Translation in the Americas (AMTA)},
location = {San Diego, California},
year = 2012
}
Johnson (2012)
Nadi Tomeh and Nicola Cancedda and Marc Dymetman (2009): Complexity-Based Phrase-Table Filtering for Statistical Machine Translation, Proceedings of the Twelfth Machine Translation Summit (MT Summit XII)
@inproceedings{MTS09:Tomeh,
author = {Nadi Tomeh and Nicola Cancedda and Marc Dymetman},
title = {Complexity-Based Phrase-Table Filtering for Statistical Machine Translation},
url = {http://eprints.pascal-network.org/archive/00006145/01/Tomeh-et-al-09.pdf},
googlescholar = {14142713203469977579},
booktitle = {Proceedings of the Twelfth Machine Translation Summit (MT Summit XII)},
publisher = {International Association for Machine Translation},
location = {Ottawa, Ontario, Canada},
year = 2009
}
Tomeh et al. (2009)
He, Zhongjun and Meng, Yao and Lü, Yajuan and Yu, Hao and Liu, Qun (2009): Reducing SMT Rule Table with Monolingual Key Phrase, Proceedings of the ACL-IJCNLP 2009 Conference Short Papers
@InProceedings{he-EtAl:2009:Short,
author = {He, Zhongjun and Meng, Yao and L\"{u}, Yajuan and Yu, Hao and Liu, Qun},
title = {Reducing {SMT} Rule Table with Monolingual Key Phrase},
booktitle = {Proceedings of the ACL-IJCNLP 2009 Conference Short Papers},
month = {August},
address = {Suntec, Singapore},
publisher = {Association for Computational Linguistics},
pages = {121--124},
url = {http://www.aclweb.org/anthology/P/P09/P09-2031},
year = 2009
}
He et al. (2009)
Iglesias, Gonzalo and de Gispert, Adrià and Banga, Eduardo R. and Byrne, William (2009): Rule Filtering by Pattern for Efficient Hierarchical Translation, Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009)
@InProceedings{iglesias-EtAl:2009:EACL,
author = {Iglesias, Gonzalo and de Gispert, Adri\`{a} and Banga, Eduardo R. and Byrne, William},
title = {Rule Filtering by Pattern for Efficient Hierarchical Translation},
booktitle = {Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009)},
month = {March},
address = {Athens, Greece},
publisher = {Association for Computational Linguistics},
pages = {380--388},
url = {http://www.aclweb.org/anthology/E09-1044},
year = 2009
}
Iglesias et al. (2009)
Germann, Ulrich and Joanis, Eric and Larkin, Samuel (2009): Tightly Packed Tries: How to Fit Large Models into Memory, and Make them Load Fast, Too, Proceedings of the Workshop on Software Engineering, Testing, and Quality Assurance for Natural Language Processing (SETQA-NLP 2009)
@InProceedings{germann-joanis-larkin:2009:SETQA-NLP,
author = {Germann, Ulrich and Joanis, Eric and Larkin, Samuel},
title = {Tightly Packed Tries: How to Fit Large Models into Memory, and Make them Load Fast, Too},
booktitle = {Proceedings of the Workshop on Software Engineering, Testing, and Quality Assurance for Natural Language Processing (SETQA-NLP 2009)},
month = {June},
address = {Boulder, Colorado},
publisher = {Association for Computational Linguistics},
pages = {31--39},
url = {http://www.aclweb.org/anthology/W09-1505},
year = 2009
}
Germann et al. (2009)
Matthias Eck and Stephan Vogel and Alex Waibel (2005): Low Cost Portability for Statistical Machine Translation based on N-gram Coverage, Proceedings of the Tenth Machine Translation Summit (MT Summit X)
@InProceedings{Eck:2005:MTS,
author = {Matthias Eck and Stephan Vogel and Alex Waibel},
title = {Low Cost Portability for Statistical Machine Translation based on N-gram Coverage},
url = {http://i13pc106.ira.uka.de/fileadmin/publication-files/2\_MTS-2005-ECK.pdf},
googlescholar = {16517770361722260065},
booktitle = {Proceedings of the Tenth Machine Translation Summit (MT Summit X)},
month = {September},
address = {Phuket, Thailand},
year = 2005
}
Eck et al. (2005)

MT Research Survey Wiki

A Comprehensive Survey of Neural and Statistical Machine Translation Research Publications
