Pruning Large Translation Models
Physical requirements may limit the size of translation models that can be used in practice, so we may have to prune the models by removing the least relevant phrase pairs.
Pruning Large Models is the main subject of 23 publications. 11 are discussed here.
Publications
Quirk, Chris  and  Menezes, Arul (2006): 
Do we need phrases? Challenging the conventional wisdom in Statistical Machine Translation, Proceedings of the Human Language Technology Conference of the NAACL, Main Conference 
 
 
 
@InProceedings{quirk-menezes:2006:HLT-NAACL06-Main,
author = {Quirk, Chris  and  Menezes, Arul},
title = {Do we need phrases? Challenging the conventional wisdom in Statistical Machine Translation},
booktitle = {Proceedings of the Human Language Technology Conference of the NAACL, Main Conference},
month = {June},
address = {New York City, USA},
publisher = {Association for Computational Linguistics},
pages = {9--16},
url = {
http://www.aclweb.org/anthology/N/N06/N06-1002},
year = 2006
}
 Quirk and Menezes (2006) argue that extracting only minimal phrases, i.e. the smallest phrase pairs that map each entire sentence pair, does not hurt performance. This is also the basis of the n-gram translation model 
José B. Mariño and Rafael E. Banchs and Josep M. Crego and Adrià de Gispert and Patrik Lambert and José A. R. Fonollosa and Marta Ruiz Costa-jussà (2006): 
N-gram-based Machine Translation, Computational Linguistics 
 
 
 
@Article{Marino:CL:2006,
author = {Jos{\'e} B. Mari{\~n}o and Rafael E. Banchs and Josep M. Crego and Adri{\`a} de Gispert and Patrik Lambert and Jos{\'e} A. R. Fonollosa and Marta Ruiz Costa-juss{\`a}},
title = {N-gram-based Machine Translation},
url = {
http://upcommons.upc.edu/e-prints/bitstream/2117/2104/1/coli.2006.32.4.527.pdf},
googlescholar = {3958992429316331449},
journal = {Computational Linguistics},
volume = {32},
number = {4},
year = 2006
}
 (Mariño et al., 2006; 
Costa-jussà, Marta Ruiz and  Crego, Josep M.  and  Vilar, David  and  Fonollosa, José A. R. and  Mariño, José B.  and  Ney, Hermann (2007): 
Analysis and System Combination of Phrase- and N-Gram-Based Statistical Machine Translation Systems, Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Companion Volume, Short Papers 
 
 
 
@InProceedings{rcostajussa-EtAl:2007:ShortPapers,
author = {Costa-juss\`{a}, Marta Ruiz and  Crego, Josep M.  and  Vilar, David  and  Fonollosa, Jos\'{e} A. R. and  Mari{\~n}o, Jos\'{e} B.  and  Ney, Hermann},
title = {Analysis and System Combination of Phrase- and {N}-Gram-Based Statistical Machine Translation Systems},
booktitle = {Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Companion Volume, Short Papers},
month = {April},
address = {Rochester, New York},
publisher = {Association for Computational Linguistics},
pages = {137--140},
url = {
http://www.aclweb.org/anthology/N/N07/N07-2035},
year = 2007
}
 Costa-jussà et al., 2007), a variant of the phrase-based model. 
Discarding unlikely phrase pairs based on significance tests on their more-than-random occurrence reduces the phrase table drastically and may even yield increases in performance 
Johnson, Howard  and  Martin, Joel  and  Foster, George  and  Kuhn, Roland (2007): 
Improving Translation Quality by Discarding Most of the Phrasetable, Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL) 
 
 
 
@InProceedings{johnson-EtAl:2007:EMNLP-CoNLL2007,
author = {Johnson, Howard  and  Martin, Joel  and  Foster, George  and  Kuhn, Roland},
title = {Improving Translation Quality by Discarding Most of the Phrasetable},
booktitle = {Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)},
pages = {967--975},
url = {
http://www.aclweb.org/anthology/D/D07/D07-1103},
year = 2007
}
 (Johnson et al., 2007). 
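The significance-based filtering described above can be sketched as follows. This is a minimal illustration, assuming a one-sided Fisher's exact test over sentence-level co-occurrence counts. The function names, the toy counts, and the threshold are hypothetical and not taken from the paper.

```python
from math import comb, log


def fisher_p_value(c_st: int, c_s: int, c_t: int, n: int) -> float:
    """One-sided p-value: chance of at least c_st co-occurrences, given that
    the source phrase occurs in c_s and the target phrase in c_t of the
    n sentence pairs (hypergeometric tail)."""
    upper = min(c_s, c_t)
    tail = sum(
        comb(c_s, k) * comb(n - c_s, c_t - k) for k in range(c_st, upper + 1)
    )
    return tail / comb(n, c_t)


def prune_by_significance(counts, n, min_neg_log_p):
    """Keep phrase pairs whose negative log p-value reaches the threshold.

    counts maps (source, target) -> (c_st, c_s, c_t).
    """
    kept = {}
    for pair, (c_st, c_s, c_t) in counts.items():
        p = fisher_p_value(c_st, c_s, c_t, n)
        # A p-value that underflows to 0.0 is treated as maximally significant.
        if p == 0.0 or -log(p) >= min_neg_log_p:
            kept[pair] = (c_st, c_s, c_t)
    return kept


# Hypothetical counts over n = 1000 sentence pairs.
toy_counts = {
    ("haus", "house"): (10, 10, 10),   # always seen together
    ("xyz", "qqq"): (1, 1, 1),         # seen once, together
    ("der", "the"): (1, 100, 100),     # frequent words, rarely together
}
kept_pairs = prune_by_significance(toy_counts, 1000, log(1000) + 0.01)
```

Setting the threshold just above log(n) removes pairs whose source and target phrase each occur exactly once, since their p-value is 1/n.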
Hua Wu and Haifeng Wang (2007): 
Comparative Study of Word Alignment Heuristics and Phrase-Based SMT, Proceedings of the MT Summit XI 
 
 
 
@inproceedings{Wu:2007:MTSummit,
author = {Hua Wu and Haifeng Wang},
title = {Comparative Study of Word Alignment Heuristics and Phrase-Based {SMT}},
url = {
http://mt-archive.info/MTS-2007-Wu.pdf},
googlescholar = {7851682665236592125},
booktitle = {Proceedings of the {MT} Summit XI},
year = 2007
}
 Wu and Wang (2007) propose a method for filtering the noise in the phrase translation table based on a log likelihood ratio. 
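A log-likelihood-ratio filter of this kind might look like the sketch below. It assumes the common G² statistic over a 2x2 contingency table of co-occurrence counts, which may differ from the exact formulation in the paper. The function names and the threshold parameter are hypothetical.

```python
from math import log


def log_likelihood_ratio(c_st: int, c_s: int, c_t: int, n: int) -> float:
    """G^2 statistic for the association between a source and a target phrase.

    c_st: co-occurrence count, c_s / c_t: marginal counts, n: total events.
    """
    observed = [
        [c_st, c_s - c_st],
        [c_t - c_st, n - c_s - c_t + c_st],
    ]
    row_totals = [c_s, n - c_s]
    col_totals = [c_t, n - c_t]
    g2 = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # empty cells contribute nothing
                expected = row_totals[i] * col_totals[j] / n
                g2 += o * log(o / expected)
    return 2.0 * g2


def filter_by_llr(counts, n, min_llr):
    """Keep pairs whose score reaches min_llr (an assumed tuning parameter)."""
    return {
        pair: c for pair, c in counts.items()
        if log_likelihood_ratio(*c, n) >= min_llr
    }
```

The statistic is zero when the two phrases co-occur exactly as often as independence predicts. It also grows for pairs that co-occur less often than expected, so a practical filter may need an additional check on the direction of the association.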
Takeshi Kutsumi and Takehiko Yoshimi and Katsunori Kotani and Ichiko Sata and Hitoshi Isahara (2005): 
Selection of Entries for a Bilingual Dictionary from Aligned Translation Equivalents using Support Vector Machines, Proceedings of the Tenth Machine Translation Summit (MT Summit X) 
 
 
 
@InProceedings{Kutsumi:2005:MTS,
author = {Takeshi Kutsumi and Takehiko Yoshimi and Katsunori Kotani and Ichiko Sata and Hitoshi Isahara},
title = {Selection of Entries for a Bilingual Dictionary from Aligned Translation Equivalents using Support Vector Machines},
url = {
http://www.mt-archive.info/MTS-2005-Kutsumi.pdf},
googlescholar = {12230418553492439993},
booktitle = {Proceedings of the Tenth Machine Translation Summit (MT Summit X)},
month = {September},
address = {Phuket, Thailand},
year = 2005
}
 Kutsumi et al. (2005) use a support vector machine for cleaning phrase tables. 
Such considerations may also be taken into account in a second-pass phrase extraction stage that does not extract bad phrase pairs 
Zettlemoyer, Luke  and  Moore, Robert C. (2007): 
Selective Phrase Pair Extraction for Improved Statistical Machine Translation, Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Companion Volume, Short Papers 
 
 
 
@InProceedings{zettlemoyer-moore:2007:ShortPapers,
author = {Zettlemoyer, Luke  and  Moore, Robert C.},
title = {Selective Phrase Pair Extraction for Improved Statistical Machine Translation},
booktitle = {Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Companion Volume, Short Papers},
month = {April},
address = {Rochester, New York},
publisher = {Association for Computational Linguistics},
pages = {209--212},
url = {
http://www.aclweb.org/anthology/N/N07/N07-2053},
year = 2007
}
 (Zettlemoyer and Moore, 2007). When faced with porting phrase-based models to small devices such as PDAs 
Ying Zhang and Stephan Vogel (2007): 
PanDoRA: A Large-scale Two-way Statistical Machine Translation System for Hand-held Devices, Proceedings of the MT Summit XI 
 
 
@inproceedings{Zhang2:2007:MTSummit,
author = {Ying Zhang and Stephan Vogel},
title = {PanDoRA: A Large-scale Two-way Statistical Machine Translation System for Hand-held Devices},
booktitle = {Proceedings of the {MT} Summit XI},
year = 2007
}
 (Zhang and Vogel, 2007), the translation table has to be reduced to fit a fixed amount of memory. 
Eck, Matthias  and  Vogel, Stephan  and  Waibel, Alex (2007): 
Translation Model Pruning via Usage Statistics for Statistical Machine Translation, Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Companion Volume, Short Papers 
 
 
 
@InProceedings{eck-vogel-waibel:2007:ShortPapers,
author = {Eck, Matthias  and  Vogel, Stephan  and  Waibel, Alex},
title = {Translation Model Pruning via Usage Statistics for Statistical Machine Translation},
booktitle = {Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Companion Volume, Short Papers},
month = {April},
address = {Rochester, New York},
publisher = {Association for Computational Linguistics},
pages = {21--24},
url = {
http://www.aclweb.org/anthology/N/N07/N07-2006},
year = 2007
}
 Eck et al. (2007a); 
Matthias Eck and Stephan Vogel and Alex Waibel (2007): 
Estimating Phrase Pair Relevance for Translation Model Pruning, Proceedings of the MT Summit XI 
 
 
@inproceedings{Eck:2007:MTSummit,
author = {Matthias Eck and Stephan Vogel and Alex Waibel},
title = {Estimating Phrase Pair Relevance for Translation Model Pruning},
booktitle = {Proceedings of the {MT} Summit XI},
year = 2007
}
 Eck et al. (2007b) prune the translation table based on how often a phrase pair was considered during decoding and how often it was used in the best translation. 
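Usage-based pruning can be sketched as below, assuming the decoder has already logged, for each phrase pair, how often it was considered and how often it appeared in the best translation. The scoring function is an illustrative way to combine the two counts and is not necessarily the formula used in the papers. All names are hypothetical.

```python
from math import log


def usage_score(considered: int, used: int) -> float:
    """Illustrative relevance score: rewards pairs that were used in the best
    translation, with a damped bonus for pairs that were often considered."""
    return log(considered + 1) * (used + 1)


def prune_by_usage(stats, keep):
    """Return the `keep` highest-scoring phrase pairs.

    stats maps (source, target) -> (considered_count, used_count).
    """
    ranked = sorted(
        stats, key=lambda pair: usage_score(*stats[pair]), reverse=True
    )
    return ranked[:keep]


# Hypothetical decoding statistics.
toy_stats = {
    ("das haus", "the house"): (100, 20),
    ("das haus", "the home"): (100, 0),
    ("haus", "building"): (0, 0),
}
top_pairs = prune_by_usage(toy_stats, keep=2)
```

Choosing `keep` directly makes it easy to hit a fixed table size, which matters when the model has to fit into a given amount of memory.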
Germán Sanchis-Trilles and Daniel Ortiz-Martínez and Jesús González-Rubio and Jorge González and Francisco Casacuberta (2011): 
Bilingual segmentation for phrasetable pruning in Statistical Machine Translation, Proceedings of the 15th International Conference of the European Association for Machine Translation (EAMT) 
 
 
 
@inproceedings{eamt11:Sanchis-Trilles,
author = {Germ{\'a}n Sanchis-Trilles and Daniel Ortiz-Mart{\'i}nez and Jes{\'u}s Gonz{\'a}lez-Rubio and Jorge Gonz{\'a}lez and Francisco Casacuberta},
title = {Bilingual segmentation for phrasetable pruning in Statistical Machine Translation},
url = {
http://www.mt-archive.info/EAMT-2011-Sanchis-Trilles.pdf},
googlescholar = {4525003709044849218},
pages = {257--264},
booktitle = {Proceedings of the 15th International Conference of the European Association for Machine Translation (EAMT)},
location = {Leuven, Belgium},
editor = {Mikel L. Forcada and Heidi Depraetere and Vincent Vandeghinste},
year = 2011
}
 Sanchis-Trilles et al. (2011) retranslate the training corpus and re-estimate the phrase table from the phrases used in the best derivations, which greatly reduces the size of the phrase table at some cost in quality.
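The re-estimation step can be illustrated with the sketch below. It assumes the best derivation for each training sentence pair is already available as a list of (source phrase, target phrase) pairs, and it uses plain relative frequencies. The decoding step is not shown, and the function name and toy data are hypothetical.

```python
from collections import Counter


def reestimate_phrase_table(derivations):
    """Build a phrase table from the phrase pairs used in best derivations.

    derivations: iterable of derivations, each a list of (source, target).
    Returns a dict mapping (source, target) -> relative frequency p(target | source).
    Phrase pairs that never appear in a best derivation are dropped implicitly.
    """
    pair_counts = Counter()
    source_counts = Counter()
    for derivation in derivations:
        for source, target in derivation:
            pair_counts[(source, target)] += 1
            source_counts[source] += 1
    return {
        (source, target): count / source_counts[source]
        for (source, target), count in pair_counts.items()
    }


# Hypothetical best derivations for three training sentence pairs.
toy_derivations = [
    [("das haus", "the house"), ("ist klein", "is small")],
    [("das haus", "the home")],
    [("das haus", "the house")],
]
new_table = reestimate_phrase_table(toy_derivations)
```

The resulting table contains only the phrase pairs that the decoder actually chose, which is where the size reduction comes from.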
New Publications
Xu, Wenduan  and  Zhang, Yue  and  Williams, Philip  and  Koehn, Philipp (2013): 
Learning to Prune: Context-Sensitive Pruning for Syntactic MT, Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers) 
 
 
 
@InProceedings{xu-EtAl:2013:Short1,
author = {Xu, Wenduan  and  Zhang, Yue  and  Williams, Philip  and  Koehn, Philipp},
title = {Learning to Prune: Context-Sensitive Pruning for Syntactic MT},
booktitle = {Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)},
month = {August},
address = {Sofia, Bulgaria},
publisher = {Association for Computational Linguistics},
pages = {352--357},
url = {
http://www.aclweb.org/anthology/P13-2063},
year = 2013
}
 Xu et al. (2013)
Matthias Eck and Stephan Vogel and Alex Waibel (2005): 
Low Cost Portability for Statistical Machine Translation based on N-gram Frequency and TF-IDF, Proc. of the International Workshop on Spoken Language Translation 
 
 
 
@InProceedings{eck:2005b:iwslt,
author = {Matthias Eck and Stephan Vogel and Alex Waibel},
title = {Low Cost Portability for Statistical Machine Translation based on N-gram Frequency and {TF-IDF}},
url = {
http://20.210-193-52.unknown.qala.com.sg/archive/iwslt\_05/papers/slt5\_061.pdf},
googlescholar = {18408661374131908591},
booktitle = {Proc. of the International Workshop on Spoken Language Translation},
location = {Pittsburgh, PA, USA},
month = {October},
year = 2005
}
 Eck et al. (2005)
Martzoukos, Spyros  and  Monz, Christof  and  Costa Florencio, Christophe (2014): 
Maximizing Component Quality in Bilingual Word-Aligned Segmentations, Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics 
 
 
 
@InProceedings{martzoukos-monz-costaflorencio:2014:EACL,
author = {Martzoukos, Spyros  and  Monz, Christof  and  Costa Florencio, Christophe},
title = {Maximizing Component Quality in Bilingual Word-Aligned Segmentations},
booktitle = {Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics},
month = {April},
address = {Gothenburg, Sweden},
publisher = {Association for Computational Linguistics},
pages = {30--38},
url = {
http://www.aclweb.org/anthology/E14-1004},
year = 2014
}
 Martzoukos et al. (2014)
Ling, Wang  and  Graça, João  and  Trancoso, Isabel  and  Black, Alan (2012): 
Entropy-based Pruning for Phrase-based Machine Translation, Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning 
 
 
 
@InProceedings{ling-EtAl:2012:EMNLP-CoNLL,
author = {Ling, Wang  and  Gra\c{c}a, Jo\~{a}o  and  Trancoso, Isabel  and  Black, Alan},
title = {Entropy-based Pruning for Phrase-based Machine Translation},
booktitle = {Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning},
month = {July},
address = {Jeju Island, Korea},
publisher = {Association for Computational Linguistics},
pages = {962--971},
url = {
http://www.aclweb.org/anthology/D12-1088},
year = 2012
}
 Ling et al. (2012)
Zens, Richard  and  Stanton, Daisy  and  Xu, Peng (2012): 
A Systematic Comparison of Phrase Table Pruning Techniques, Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning 
 
 
 
@InProceedings{zens-stanton-xu:2012:EMNLP-CoNLL,
author = {Zens, Richard  and  Stanton, Daisy  and  Xu, Peng},
title = {A Systematic Comparison of Phrase Table Pruning Techniques},
booktitle = {Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning},
month = {July},
address = {Jeju Island, Korea},
publisher = {Association for Computational Linguistics},
pages = {972--983},
url = {
http://www.aclweb.org/anthology/D12-1089},
year = 2012
}
 Zens et al. (2012)
Lee, Seung-Wook  and  Zhang, Dongdong  and  Li, Mu  and  Zhou, Ming  and  Rim, Hae-Chang (2012): 
Translation Model Size Reduction for Hierarchical Phrase-based Statistical Machine Translation, Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers) 
 
 
 
@InProceedings{lee-EtAl:2012:ACL2012short,
author = {Lee, Seung-Wook  and  Zhang, Dongdong  and  Li, Mu  and  Zhou, Ming  and  Rim, Hae-Chang},
title = {Translation Model Size Reduction for Hierarchical Phrase-based Statistical Machine Translation},
booktitle = {Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)},
month = {July},
address = {Jeju Island, Korea},
publisher = {Association for Computational Linguistics},
pages = {291--295},
url = {
http://www.aclweb.org/anthology/P12-2057},
year = 2012
}
 Lee et al. (2012)
J Howard Johnson (2012): 
Conditional Significance Pruning: Discarding More of Huge Phrase Tables, Proceedings of the Tenth Conference of the Association for Machine Translation in the Americas (AMTA) 
 
 
 
@inproceedings{AMTA-2012-Johnson,
author = {J Howard Johnson},
title = {Conditional Significance Pruning: Discarding More of Huge Phrase Tables},
url = {
http://www.mt-archive.info/AMTA-2012-Johnson.pdf},
booktitle = {Proceedings of the Tenth Conference of the Association for Machine Translation in the Americas (AMTA)},
location = {San Diego, California},
year = 2012
}
 Johnson (2012)
Nadi Tomeh and Nicola Cancedda and Marc Dymetman (2009): 
Complexity-Based Phrase-Table Filtering for Statistical Machine Translation, Proceedings of the Twelfth Machine Translation Summit (MT Summit XII) 
 
 
 
@inproceedings{MTS09:Tomeh,
author = {Nadi Tomeh and Nicola Cancedda and Marc Dymetman},
title = {Complexity-Based Phrase-Table Filtering for Statistical Machine Translation},
url = {
http://eprints.pascal-network.org/archive/00006145/01/Tomeh-et-al-09.pdf},
googlescholar = {14142713203469977579},
booktitle = {Proceedings of the Twelfth Machine Translation Summit (MT Summit XII)},
publisher = {International Association for Machine Translation},
location = {Ottawa, Ontario, Canada},
year = 2009
}
 Tomeh et al. (2009)
He, Zhongjun  and  Meng, Yao  and  Lü, Yajuan  and  Yu, Hao  and  Liu, Qun (2009): 
Reducing SMT Rule Table with Monolingual Key Phrase, Proceedings of the ACL-IJCNLP 2009 Conference Short Papers 
 
 
 
@InProceedings{he-EtAl:2009:Short,
author = {He, Zhongjun  and  Meng, Yao  and  L\"{u}, Yajuan  and  Yu, Hao  and  Liu, Qun},
title = {Reducing {SMT} Rule Table with Monolingual Key Phrase},
booktitle = {Proceedings of the ACL-IJCNLP 2009 Conference Short Papers},
month = {August},
address = {Suntec, Singapore},
publisher = {Association for Computational Linguistics},
pages = {121--124},
url = {
http://www.aclweb.org/anthology/P/P09/P09-2031},
year = 2009
}
 He et al. (2009)
Iglesias, Gonzalo  and  de Gispert, Adrià  and  Banga, Eduardo R.  and  Byrne, William (2009): 
Rule Filtering by Pattern for Efficient Hierarchical Translation, Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009) 
 
 
 
@InProceedings{iglesias-EtAl:2009:EACL,
author = {Iglesias, Gonzalo  and  de Gispert, Adri\`{a}  and  Banga, Eduardo R.  and  Byrne, William},
title = {Rule Filtering by Pattern for Efficient Hierarchical Translation},
booktitle = {Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009)},
month = {March},
address = {Athens, Greece},
publisher = {Association for Computational Linguistics},
pages = {380--388},
url = {
http://www.aclweb.org/anthology/E09-1044},
year = 2009
}
 Iglesias et al. (2009)
Germann, Ulrich  and  Joanis, Eric  and  Larkin, Samuel (2009): 
Tightly Packed Tries: How to Fit Large Models into Memory, and Make them Load Fast, Too, Proceedings of the Workshop on Software Engineering, Testing, and Quality Assurance for Natural Language Processing (SETQA-NLP 2009) 
 
 
 
@InProceedings{germann-joanis-larkin:2009:SETQA-NLP,
author = {Germann, Ulrich  and  Joanis, Eric  and  Larkin, Samuel},
title = {Tightly Packed Tries: How to Fit Large Models into Memory, and Make them Load Fast, Too},
booktitle = {Proceedings of the Workshop on Software Engineering, Testing, and Quality Assurance for Natural Language Processing (SETQA-NLP 2009)},
month = {June},
address = {Boulder, Colorado},
publisher = {Association for Computational Linguistics},
pages = {31--39},
url = {
http://www.aclweb.org/anthology/W09-1505},
year = 2009
}
 Germann et al. (2009)
Matthias Eck and Stephan Vogel and Alex Waibel (2005): 
Low Cost Portability for Statistical Machine Translation based on N-gram Coverage, Proceedings of the Tenth Machine Translation Summit (MT Summit X) 
 
 
 
@InProceedings{Eck:2005:MTS,
author = {Matthias Eck and Stephan Vogel and Alex Waibel},
title = {Low Cost Portability for Statistical Machine Translation based on N-gram Coverage},
url = {
http://i13pc106.ira.uka.de/fileadmin/publication-files/2\_MTS-2005-ECK.pdf},
googlescholar = {16517770361722260065},
booktitle = {Proceedings of the Tenth Machine Translation Summit (MT Summit X)},
month = {September},
address = {Phuket, Thailand},
year = 2005
}
 Eck et al. (2005)