Pruning Large Translation Models
Physical constraints, such as available memory, may limit the size of translation models that can be used in practice, so the models may have to be pruned by removing the least relevant phrase pairs.
Pruning Large Models is the main subject of 23 publications. 11 are discussed here.
Publications
Quirk, Chris and Menezes, Arul (2006):
Do we need phrases? Challenging the conventional wisdom in Statistical Machine Translation, Proceedings of the Human Language Technology Conference of the NAACL, Main Conference
@InProceedings{quirk-menezes:2006:HLT-NAACL06-Main,
author = {Quirk, Chris and Menezes, Arul},
title = {Do we need phrases? Challenging the conventional wisdom in Statistical Machine Translation},
booktitle = {Proceedings of the Human Language Technology Conference of the NAACL, Main Conference},
month = {June},
address = {New York City, USA},
publisher = {Association for Computational Linguistics},
pages = {9--16},
url = {
http://www.aclweb.org/anthology/N/N06/N06-1002},
year = 2006
}
Quirk and Menezes (2006) argue that extracting only minimal phrases, i.e. the smallest phrase pairs that map each entire sentence pair, does not hurt performance. This is also the basis of the n-gram translation model
José B. Mariño and Rafael E. Banchs and Josep M. Crego and Adrià de Gispert and Patrik Lambert and José A. R. Fonollosa and Marta Ruiz Costa-jussà (2006):
N-gram-based Machine Translation, Computational Linguistics
@Article{Marino:CL:2006,
author = {Jos{\'e} B. Mari{\~n}o and Rafael E. Banchs and Josep M. Crego and Adri{\`a} de Gispert and Patrik Lambert and Jos{\'e} A. R. Fonollosa and Marta Ruiz Costa-juss{\`a}},
title = {N-gram-based Machine Translation},
url = {
http://upcommons.upc.edu/e-prints/bitstream/2117/2104/1/coli.2006.32.4.527.pdf},
googlescholar = {3958992429316331449},
journal = {Computational Linguistics},
volume = {32},
number = {4},
year = 2006
}
(Mariño et al., 2006;
Costa-jussà, Marta Ruiz and Crego, Josep M. and Vilar, David and Fonollosa, José A. R. and Mariño, José B. and Ney, Hermann (2007):
Analysis and System Combination of Phrase- and N-Gram-Based Statistical Machine Translation Systems, Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Companion Volume, Short Papers
@InProceedings{rcostajussa-EtAl:2007:ShortPapers,
author = {Costa-juss\`{a}, Marta Ruiz and Crego, Josep M. and Vilar, David and Fonollosa, Jos\'{e} A. R. and Mari{\~n}o, Jos\'{e} B. and Ney, Hermann},
title = {Analysis and System Combination of Phrase- and {N}-Gram-Based Statistical Machine Translation Systems},
booktitle = {Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Companion Volume, Short Papers},
month = {April},
address = {Rochester, New York},
publisher = {Association for Computational Linguistics},
pages = {137--140},
url = {
http://www.aclweb.org/anthology/N/N07/N07-2035},
year = 2007
}
Costa-jussà et al., 2007), a variant of the phrase-based model.
Discarding unlikely phrase pairs based on significance tests on their more-than-random occurrence reduces the phrase table drastically and may even yield increases in performance
Johnson, Howard and Martin, Joel and Foster, George and Kuhn, Roland (2007):
Improving Translation Quality by Discarding Most of the Phrasetable, Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)
@InProceedings{johnson-EtAl:2007:EMNLP-CoNLL2007,
author = {Johnson, Howard and Martin, Joel and Foster, George and Kuhn, Roland},
title = {Improving Translation Quality by Discarding Most of the Phrasetable},
booktitle = {Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)},
pages = {967--975},
url = {
http://www.aclweb.org/anthology/D/D07/D07-1103},
year = 2007
}
(Johnson et al., 2007).
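As an illustration of this kind of significance test, the following minimal sketch scores a phrase pair with a one-sided Fisher's exact test on sentence-level co-occurrence counts and keeps only pairs whose negative log p-value reaches a threshold. The function names, the count layout `(c_st, c_s, c_t)`, the toy counts, and the threshold value are assumptions made for this example and are not taken from the cited paper's implementation. It needs Python 3.8 or later for `math.comb`.

```python
import math

def neg_log_p(c_st, c_s, c_t, n):
    """Return -ln(p) of a one-sided Fisher's exact test for a phrase pair.

    c_st: sentence pairs containing both phrases
    c_s:  sentence pairs containing the source phrase
    c_t:  sentence pairs containing the target phrase
    n:    total sentence pairs in the corpus
    """
    # p = P(X >= c_st) under the hypergeometric distribution, i.e. the chance
    # of at least this many co-occurrences if the two phrases were unrelated.
    tail = sum(
        math.comb(c_s, k) * math.comb(n - c_s, c_t - k)
        for k in range(c_st, min(c_s, c_t) + 1)
    )
    # Take logs of the exact integers so tiny p-values do not underflow.
    return math.log(math.comb(n, c_t)) - math.log(tail)

def prune_by_significance(counts, n, threshold):
    """Keep phrase pairs whose -ln(p) is at least `threshold` (hypothetical helper)."""
    return {pair: c for pair, c in counts.items() if neg_log_p(*c, n) >= threshold}

# Toy counts, invented for illustration: (c_st, c_s, c_t) per phrase pair.
counts = {
    ("la maison", "the house"): (8, 9, 10),
    ("la maison", "of"): (3, 9, 30),
}
kept = prune_by_significance(counts, n=100, threshold=5.0)
print(sorted(kept))
```

A pair that occurs exactly once, with both phrases also occurring once, gets p = 1/n under this test, which is why a threshold near ln(n) is a natural reference point when choosing a cutoff.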
Hua Wu and Haifeng Wang (2007):
Comparative Study of Word Alignment Heuristics and Phrase-Based SMT, Proceedings of the MT Summit XI
@inproceedings{Wu:2007:MTSummit,
author = {Hua Wu and Haifeng Wang},
title = {Comparative Study of Word Alignment Heuristics and Phrase-Based {SMT}},
url = {
http://mt-archive.info/MTS-2007-Wu.pdf},
googlescholar = {7851682665236592125},
booktitle = {Proceedings of the {MT} Summit XI},
year = 2007
}
Wu and Wang (2007) propose a method for filtering the noise in the phrase translation table based on a log likelihood ratio.
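To make the idea of a log likelihood ratio filter concrete, here is a minimal sketch that uses the standard G² statistic on a 2x2 contingency table of sentence-level co-occurrence counts. This is an assumed formulation chosen for illustration; the exact statistic, counts, and threshold used by Wu and Wang (2007) may differ, and the function names are hypothetical.

```python
import math

def log_likelihood_ratio(c_st, c_s, c_t, n):
    """G^2 statistic for the association between a source and a target phrase.

    c_st: sentence pairs containing both phrases
    c_s:  sentence pairs containing the source phrase
    c_t:  sentence pairs containing the target phrase
    n:    total sentence pairs in the corpus
    """
    observed = [
        [c_st, c_s - c_st],
        [c_t - c_st, n - c_s - c_t + c_st],
    ]
    rows = [c_s, n - c_s]
    cols = [c_t, n - c_t]
    g = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # empty cells contribute nothing to the sum
                expected = rows[i] * cols[j] / n
                g += o * math.log(o / expected)
    return 2.0 * g

def keep_pair(c_st, c_s, c_t, n, threshold):
    """Keep a pair only if it co-occurs more often than chance and G^2 is high."""
    # G^2 is also large for negatively associated phrases, so check direction.
    more_than_chance = c_st * n > c_s * c_t
    return more_than_chance and log_likelihood_ratio(c_st, c_s, c_t, n) >= threshold

print(keep_pair(10, 10, 10, 100, threshold=10.0))  # strongly associated pair
print(keep_pair(2, 10, 20, 100, threshold=10.0))   # co-occurs exactly as chance predicts
```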
Takeshi Kutsumi and Takehiko Yoshimi and Katsunori Kotani and Ichiko Sata and Hitoshi Isahara (2005):
Selection of Entries for a Bilingual Dictionary from Aligned Translation Equivalents using Support Vector Machines, Proceedings of the Tenth Machine Translation Summit (MT Summit X)
@InProceedings{Kutsumi:2005:MTS,
author = {Takeshi Kutsumi and Takehiko Yoshimi and Katsunori Kotani and Ichiko Sata and Hitoshi Isahara},
title = {Selection of Entries for a Bilingual Dictionary from Aligned Translation Equivalents using Support Vector Machines},
url = {
http://www.mt-archive.info/MTS-2005-Kutsumi.pdf},
googlescholar = {12230418553492439993},
booktitle = {Proceedings of the Tenth Machine Translation Summit (MT Summit X)},
month = {September},
address = {Phuket, Thailand},
year = 2005
}
Kutsumi et al. (2005) use a support vector machine for cleaning phrase tables.
Such considerations may also be taken into account in a second-pass phrase extraction stage that does not extract bad phrase pairs
Zettlemoyer, Luke and Moore, Robert C. (2007):
Selective Phrase Pair Extraction for Improved Statistical Machine Translation, Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Companion Volume, Short Papers
@InProceedings{zettlemoyer-moore:2007:ShortPapers,
author = {Zettlemoyer, Luke and Moore, Robert C.},
title = {Selective Phrase Pair Extraction for Improved Statistical Machine Translation},
booktitle = {Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Companion Volume, Short Papers},
month = {April},
address = {Rochester, New York},
publisher = {Association for Computational Linguistics},
pages = {209--212},
url = {
http://www.aclweb.org/anthology/N/N07/N07-2053},
year = 2007
}
(Zettlemoyer and Moore, 2007). When faced with porting phrase-based models to small devices such as PDAs
Ying Zhang and Stephan Vogel (2007):
PanDoRA: A Large-scale Two-way Statistical Machine Translation System for Hand-held Devices, Proceedings of the MT Summit XI
@inproceedings{Zhang2:2007:MTSummit,
author = {Ying Zhang and Stephan Vogel},
title = {PanDoRA: A Large-scale Two-way Statistical Machine Translation System for Hand-held Devices},
booktitle = {Proceedings of the {MT} Summit XI},
year = 2007
}
(Zhang and Vogel, 2007), the translation table has to be reduced to fit a fixed amount of memory.
Eck, Matthias and Vogel, Stephan and Waibel, Alex (2007):
Translation Model Pruning via Usage Statistics for Statistical Machine Translation, Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Companion Volume, Short Papers
@InProceedings{eck-vogel-waibel:2007:ShortPapers,
author = {Eck, Matthias and Vogel, Stephan and Waibel, Alex},
title = {Translation Model Pruning via Usage Statistics for Statistical Machine Translation},
booktitle = {Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Companion Volume, Short Papers},
month = {April},
address = {Rochester, New York},
publisher = {Association for Computational Linguistics},
pages = {21--24},
url = {
http://www.aclweb.org/anthology/N/N07/N07-2006},
year = 2007
}
Eck et al. (2007a);
Matthias Eck and Stephan Vogel and Alex Waibel (2007):
Estimating Phrase Pair Relevance for Translation Model Pruning, Proceedings of the MT Summit XI
@inproceedings{Eck:2007:MTSummit,
author = {Matthias Eck and Stephan Vogel and Alex Waibel},
title = {Estimating Phrase Pair Relevance for Translation Model Pruning},
booktitle = {Proceedings of the {MT} Summit XI},
year = 2007
}
Eck et al. (2007b) prune the translation table based on how often a phrase pair was considered during decoding and how often it was used in the best translation.
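A usage-based pruning step of this kind could be sketched as follows: each phrase pair carries two counts collected while decoding some text (how often it was considered and how often it appeared in the best translation), and the table is cut down to a fixed number of entries. The scoring formula, the weight, and all names below are assumptions made for this sketch; the cited papers define their own relevance scores.

```python
def prune_by_usage(stats, max_entries, considered_weight=0.01):
    """Keep the `max_entries` phrase pairs with the highest usage score.

    stats maps (source, target) -> (considered, used), where `considered` is
    how often the pair was looked at during decoding and `used` is how often
    it ended up in the best translation.
    """
    def score(item):
        considered, used = item[1]
        # Hypothetical score: use in the best translation dominates, and
        # merely being considered adds a small bonus.
        return used + considered_weight * considered

    # Sort by descending score; break ties on the phrase pair for determinism.
    ranked = sorted(stats.items(), key=lambda item: (-score(item), item[0]))
    return dict(ranked[:max_entries])

# Toy statistics, invented for illustration.
stats = {
    ("a", "x"): (100, 5),
    ("b", "y"): (10, 0),
    ("c", "z"): (200, 1),
}
pruned = prune_by_usage(stats, max_entries=2)
print(sorted(pruned))
```

Expressing the budget as a number of entries is a simplification; on a memory-limited device the cutoff would more likely be derived from the size of the stored table.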
Germán Sanchis-Trilles and Daniel Ortiz-Martínez and Jesús González-Rubio and Jorge González and Francisco Casacuberta (2011):
Bilingual segmentation for phrasetable pruning in Statistical Machine Translation, Proceedings of the 15th International Conference of the European Association for Machine Translation (EAMT)
Mentioned in Phrase Based Model EM and Pruning Large Models
@inproceedings{eamt11:Sanchis-Trilles,
author = {Germ{\'a}n Sanchis-Trilles and Daniel Ortiz-Mart{\'i}nez and Jes{\'u}s Gonz{\'a}lez-Rubio and Jorge Gonz{\'a}lez and Francisco Casacuberta},
title = {Bilingual segmentation for phrasetable pruning in Statistical Machine Translation},
url = {
http://www.mt-archive.info/EAMT-2011-Sanchis-Trilles.pdf},
googlescholar = {4525003709044849218},
pages = {257--264},
booktitle = {Proceedings of the 15th International Conference of the European Association for Machine Translation (EAMT)},
location = {Leuven, Belgium},
editor = {Mikel L. Forcada and Heidi Depraetere and Vincent Vandeghinste},
year = 2011
}
Sanchis-Trilles et al. (2011) retranslate the training corpus to re-estimate the phrase table from phrases used in the best derivations, reducing the size of the phrase table vastly with some loss of quality.
Benchmarks
Discussion
Related Topics
New Publications
Xu, Wenduan and Zhang, Yue and Williams, Philip and Koehn, Philipp (2013):
Learning to Prune: Context-Sensitive Pruning for Syntactic MT, Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
@InProceedings{xu-EtAl:2013:Short1,
author = {Xu, Wenduan and Zhang, Yue and Williams, Philip and Koehn, Philipp},
title = {Learning to Prune: Context-Sensitive Pruning for Syntactic MT},
booktitle = {Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)},
month = {August},
address = {Sofia, Bulgaria},
publisher = {Association for Computational Linguistics},
pages = {352--357},
url = {
http://www.aclweb.org/anthology/P13-2063},
year = 2013
}
Xu et al. (2013)
Matthias Eck and Stephan Vogel and Alex Waibel (2005):
Low Cost Portability for Statistical Machine Translation based on N-gram Frequency and TF-IDF, Proc. of the International Workshop on Spoken Language Translation
@InProceedings{eck:2005b:iwslt,
author = {Matthias Eck and Stephan Vogel and Alex Waibel},
title = {Low Cost Portability for Statistical Machine Translation based on N-gram Frequency and {TF-IDF}},
url = {
http://20.210-193-52.unknown.qala.com.sg/archive/iwslt\_05/papers/slt5\_061.pdf},
googlescholar = {18408661374131908591},
booktitle = {Proc. of the International Workshop on Spoken Language Translation},
location = {Pittsburgh, PA, USA},
month = {October},
year = 2005
}
Eck et al. (2005)
Martzoukos, Spyros and Monz, Christof and Costa Florencio, Christophe (2014):
Maximizing Component Quality in Bilingual Word-Aligned Segmentations, Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics
@InProceedings{martzoukos-monz-costaflorencio:2014:EACL,
author = {Martzoukos, Spyros and Monz, Christof and Costa Florencio, Christophe},
title = {Maximizing Component Quality in Bilingual Word-Aligned Segmentations},
booktitle = {Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics},
month = {April},
address = {Gothenburg, Sweden},
publisher = {Association for Computational Linguistics},
pages = {30--38},
url = {
http://www.aclweb.org/anthology/E14-1004},
year = 2014
}
Martzoukos et al. (2014)
Ling, Wang and Graça, João and Trancoso, Isabel and Black, Alan (2012):
Entropy-based Pruning for Phrase-based Machine Translation, Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
@InProceedings{ling-EtAl:2012:EMNLP-CoNLL,
author = {Ling, Wang and Gra\c{c}a, Jo\~{a}o and Trancoso, Isabel and Black, Alan},
title = {Entropy-based Pruning for Phrase-based Machine Translation},
booktitle = {Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning},
month = {July},
address = {Jeju Island, Korea},
publisher = {Association for Computational Linguistics},
pages = {962--971},
url = {
http://www.aclweb.org/anthology/D12-1088},
year = 2012
}
Ling et al. (2012)
Zens, Richard and Stanton, Daisy and Xu, Peng (2012):
A Systematic Comparison of Phrase Table Pruning Techniques, Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
@InProceedings{zens-stanton-xu:2012:EMNLP-CoNLL,
author = {Zens, Richard and Stanton, Daisy and Xu, Peng},
title = {A Systematic Comparison of Phrase Table Pruning Techniques},
booktitle = {Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning},
month = {July},
address = {Jeju Island, Korea},
publisher = {Association for Computational Linguistics},
pages = {972--983},
url = {
http://www.aclweb.org/anthology/D12-1089},
year = 2012
}
Zens et al. (2012)
Lee, Seung-Wook and Zhang, Dongdong and Li, Mu and Zhou, Ming and Rim, Hae-Chang (2012):
Translation Model Size Reduction for Hierarchical Phrase-based Statistical Machine Translation, Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
@InProceedings{lee-EtAl:2012:ACL2012short,
author = {Lee, Seung-Wook and Zhang, Dongdong and Li, Mu and Zhou, Ming and Rim, Hae-Chang},
title = {Translation Model Size Reduction for Hierarchical Phrase-based Statistical Machine Translation},
booktitle = {Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)},
month = {July},
address = {Jeju Island, Korea},
publisher = {Association for Computational Linguistics},
pages = {291--295},
url = {
http://www.aclweb.org/anthology/P12-2057},
year = 2012
}
Lee et al. (2012)
J Howard Johnson (2012):
Conditional Significance Pruning: Discarding More of Huge Phrase Tables, Proceedings of the Tenth Conference of the Association for Machine Translation in the Americas (AMTA)
@inproceedings{AMTA-2012-Johnson,
author = {J Howard Johnson},
title = {Conditional Significance Pruning: Discarding More of Huge Phrase Tables},
url = {
http://www.mt-archive.info/AMTA-2012-Johnson.pdf},
booktitle = {Proceedings of the Tenth Conference of the Association for Machine Translation in the Americas (AMTA)},
location = {San Diego, California},
year = 2012
}
Johnson (2012)
Nadi Tomeh and Nicola Cancedda and Marc Dymetman (2009):
Complexity-Based Phrase-Table Filtering for Statistical Machine Translation, Proceedings of the Twelfth Machine Translation Summit (MT Summit XII)
@inproceedings{MTS09:Tomeh,
author = {Nadi Tomeh and Nicola Cancedda and Marc Dymetman},
title = {Complexity-Based Phrase-Table Filtering for Statistical Machine Translation},
url = {
http://eprints.pascal-network.org/archive/00006145/01/Tomeh-et-al-09.pdf},
googlescholar = {14142713203469977579},
booktitle = {Proceedings of the Twelfth Machine Translation Summit (MT Summit XII)},
publisher = {International Association for Machine Translation},
location = {Ottawa, Ontario, Canada},
year = 2009
}
Tomeh et al. (2009)
He, Zhongjun and Meng, Yao and Lü, Yajuan and Yu, Hao and Liu, Qun (2009):
Reducing SMT Rule Table with Monolingual Key Phrase, Proceedings of the ACL-IJCNLP 2009 Conference Short Papers
@InProceedings{he-EtAl:2009:Short,
author = {He, Zhongjun and Meng, Yao and L\"{u}, Yajuan and Yu, Hao and Liu, Qun},
title = {Reducing {SMT} Rule Table with Monolingual Key Phrase},
booktitle = {Proceedings of the ACL-IJCNLP 2009 Conference Short Papers},
month = {August},
address = {Suntec, Singapore},
publisher = {Association for Computational Linguistics},
pages = {121--124},
url = {
http://www.aclweb.org/anthology/P/P09/P09-2031},
year = 2009
}
He et al. (2009)
Iglesias, Gonzalo and de Gispert, Adrià and Banga, Eduardo R. and Byrne, William (2009):
Rule Filtering by Pattern for Efficient Hierarchical Translation, Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009)
@InProceedings{iglesias-EtAl:2009:EACL,
author = {Iglesias, Gonzalo and de Gispert, Adri\`{a} and Banga, Eduardo R. and Byrne, William},
title = {Rule Filtering by Pattern for Efficient Hierarchical Translation},
booktitle = {Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009)},
month = {March},
address = {Athens, Greece},
publisher = {Association for Computational Linguistics},
pages = {380--388},
url = {
http://www.aclweb.org/anthology/E09-1044},
year = 2009
}
Iglesias et al. (2009)
Germann, Ulrich and Joanis, Eric and Larkin, Samuel (2009):
Tightly Packed Tries: How to Fit Large Models into Memory, and Make them Load Fast, Too, Proceedings of the Workshop on Software Engineering, Testing, and Quality Assurance for Natural Language Processing (SETQA-NLP 2009)
@InProceedings{germann-joanis-larkin:2009:SETQA-NLP,
author = {Germann, Ulrich and Joanis, Eric and Larkin, Samuel},
title = {Tightly Packed Tries: How to Fit Large Models into Memory, and Make them Load Fast, Too},
booktitle = {Proceedings of the Workshop on Software Engineering, Testing, and Quality Assurance for Natural Language Processing (SETQA-NLP 2009)},
month = {June},
address = {Boulder, Colorado},
publisher = {Association for Computational Linguistics},
pages = {31--39},
url = {
http://www.aclweb.org/anthology/W09-1505},
year = 2009
}
Germann et al. (2009)
Matthias Eck and Stephan Vogel and Alex Waibel (2005):
Low Cost Portability for Statistical Machine Translation based on N-gram Coverage, Proceedings of the Tenth Machine Translation Summit (MT Summit X)
@InProceedings{Eck:2005:MTS,
author = {Matthias Eck and Stephan Vogel and Alex Waibel},
title = {Low Cost Portability for Statistical Machine Translation based on N-gram Coverage},
url = {
http://i13pc106.ira.uka.de/fileadmin/publication-files/2\_MTS-2005-ECK.pdf},
googlescholar = {16517770361722260065},
booktitle = {Proceedings of the Tenth Machine Translation Summit (MT Summit X)},
month = {September},
address = {Phuket, Thailand},
year = 2005
}
Eck et al. (2005)