Very Large Language Models
Since language models trained on vast monolingual corpora easily grow very large, handling such models efficiently is a constant research topic.
Very Large Language Models is the main subject of 15 publications, 10 of which are discussed here.
Publications
Very large language models that require more space than the available working memory may be distributed over a cluster of machines; at such scale, they may not need sophisticated smoothing methods
Brants, Thorsten and Popat, Ashok C. and Xu, Peng and Och, Franz Josef and Dean, Jeffrey (2007):
Large Language Models in Machine Translation, Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)
@InProceedings{brants-EtAl:2007:EMNLP-CoNLL2007,
author = {Brants, Thorsten and Popat, Ashok C. and Xu, Peng and Och, Franz Josef and Dean, Jeffrey},
title = {Large Language Models in Machine Translation},
booktitle = {Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)},
pages = {858--867},
url = {http://www.aclweb.org/anthology/D/D07/D07-1090},
year = 2007
}
(Brants et al., 2007).
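At this scale, Brants et al. dispense with sophisticated smoothing in favour of "stupid backoff", which scores n-grams with relative frequencies and a fixed backoff penalty. Below is a minimal sketch of that scoring rule; the plain in-memory dictionaries stand in for the paper's distributed count tables.

# Minimal sketch of stupid backoff (Brants et al., 2007). Scores are
# relative frequencies with a fixed penalty ALPHA for backing off;
# they are not normalized probabilities.

ALPHA = 0.4  # fixed backoff factor recommended in the paper

def stupid_backoff(ngram, counts, unigram_total):
    """Return the score S(w_n | w_1..w_{n-1}) for ngram = (w_1, ..., w_n)."""
    if len(ngram) == 1:
        return counts.get(ngram, 0) / unigram_total
    if counts.get(ngram, 0) > 0:
        return counts[ngram] / counts[ngram[:-1]]
    # back off to the shorter context with a fixed penalty
    return ALPHA * stupid_backoff(ngram[1:], counts, unigram_total)

# toy usage
counts = {("the",): 6, ("cat",): 2, ("the", "cat"): 2}
print(stupid_backoff(("the", "cat"), counts, unigram_total=8))  # 2/6
print(stupid_backoff(("a", "cat"), counts, unigram_total=8))    # 0.4 * 2/8

Alternatively, storing the language model on disk using memory mapping is an option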
Federico, Marcello and Cettolo, Mauro (2007):
Efficient Handling of N-gram Language Models for Statistical Machine Translation, Proceedings of the Second Workshop on Statistical Machine Translation
@InProceedings{federico-cettolo:2007:WMT,
author = {Federico, Marcello and Cettolo, Mauro},
title = {Efficient Handling of N-gram Language Models for Statistical Machine Translation},
booktitle = {Proceedings of the Second Workshop on Statistical Machine Translation},
month = {June},
address = {Prague, Czech Republic},
publisher = {Association for Computational Linguistics},
pages = {88--95},
url = {http://www.aclweb.org/anthology/W/W07/W07-0212},
year = 2007
}
(Federico and Cettolo, 2007). Methods for quantizing language model probabilities are presented by
Federico, Marcello and Bertoldi, Nicola (2006):
How Many Bits Are Needed To Store Probabilities for Phrase-Based Translation?, Proceedings on the Workshop on Statistical Machine Translation
@InProceedings{federico-bertoldi:2006:WMT,
author = {Federico, Marcello and Bertoldi, Nicola},
title = {How Many Bits Are Needed To Store Probabilities for Phrase-Based Translation?},
booktitle = {Proceedings on the Workshop on Statistical Machine Translation},
month = {June},
address = {New York City},
publisher = {Association for Computational Linguistics},
pages = {94--101},
url = {http://www.aclweb.org/anthology/W/W06/W06-3113},
year = 2006
}
Federico and Bertoldi (2006), who also examine this for translation model probabilities.
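A minimal sketch of this kind of quantization: log-probabilities are partitioned into 2^k quantile bins, each represented by its mean, so every value is stored as a k-bit index into a shared codebook. The quantile binning here is a simple illustrative variant, not necessarily the exact scheme of the paper.

# k-bit quantization of log-probabilities by quantile binning:
# split the sorted values into 2^k bins of roughly equal size and
# represent each bin by its mean. With 8 bits, every probability
# costs one byte plus its share of the small shared codebook.
import numpy as np

def build_codebook(logprobs, bits=8):
    values = np.sort(np.asarray(logprobs))
    bins = np.array_split(values, 2 ** bits)
    return np.array([b.mean() for b in bins if len(b)])

def quantize(logprobs, codebook):
    """Map each value to the index of the nearest codeword."""
    idx = np.searchsorted(codebook, logprobs).clip(1, len(codebook) - 1)
    left, right = codebook[idx - 1], codebook[idx]
    return np.where(np.abs(logprobs - left) <= np.abs(right - logprobs),
                    idx - 1, idx).astype(np.uint8)

# toy usage
logprobs = np.random.uniform(-10.0, 0.0, size=100000)
cb = build_codebook(logprobs, bits=8)
codes = quantize(logprobs, cb)          # one byte per probability
restored = cb[codes]
print("max abs error:", np.abs(restored - logprobs).max())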
Heafield, Kenneth (2011):
KenLM: Faster and Smaller Language Model Queries, Proceedings of the Sixth Workshop on Statistical Machine Translation
@InProceedings{heafield:2011:WMT,
author = {Heafield, Kenneth},
title = {KenLM: Faster and Smaller Language Model Queries},
booktitle = {Proceedings of the Sixth Workshop on Statistical Machine Translation},
month = {July},
address = {Edinburgh, Scotland},
publisher = {Association for Computational Linguistics},
pages = {187--197},
url = {http://www.aclweb.org/anthology/W11-2123},
year = 2011
}
Heafield (2011) introduces two data structures, a speed-optimized probing hash table and a memory-optimized trie, that enable compact storage and fast lookup, outperforming earlier toolkits in both respects.
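The following rough sketch illustrates the probing idea: an open-addressing hash table with linear probing that maps a 64-bit n-gram hash to its log-probability and backoff weight. KenLM implements this in C++ over carefully packed memory; this Python rendering only illustrates the lookup strategy.

# Linear-probing n-gram table in the spirit of KenLM's "probing"
# structure: one flat array, 64-bit n-gram hashes as keys, with
# probability and backoff stored inline.
import hashlib

class ProbingTable:
    def __init__(self, n_entries, multiplier=1.5):
        # oversize the array so probe chains stay short
        self.size = int(n_entries * multiplier) + 1
        self.slots = [None] * self.size

    @staticmethod
    def key(ngram):
        """64-bit hash of the n-gram (stand-in for KenLM's hashing)."""
        h = hashlib.blake2b(" ".join(ngram).encode(), digest_size=8)
        return int.from_bytes(h.digest(), "little")

    def insert(self, ngram, logprob, backoff):
        k = self.key(ngram)
        i = k % self.size
        while self.slots[i] is not None:        # linear probing
            i = (i + 1) % self.size
        self.slots[i] = (k, logprob, backoff)

    def lookup(self, ngram):
        k = self.key(ngram)
        i = k % self.size
        while self.slots[i] is not None:
            if self.slots[i][0] == k:
                return self.slots[i][1], self.slots[i][2]
            i = (i + 1) % self.size
        return None                             # n-gram not in the model

# toy usage
table = ProbingTable(n_entries=2)
table.insert(("the", "cat"), -1.2, -0.4)
print(table.lookup(("the", "cat")))   # (-1.2, -0.4)
print(table.lookup(("a", "dog")))     # None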
Alternatively, lossy data structures such as Bloom filters may be used to store very large language models efficiently (a concrete sketch follows after this paragraph)
Talbot, David and Osborne, Miles (2007):
Smoothed Bloom Filter Language Models: Tera-Scale LMs on the Cheap, Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)
@InProceedings{talbot-osborne:2007:EMNLP-CoNLL2007,
author = {Talbot, David and Osborne, Miles},
title = {Smoothed {Bloom} Filter Language Models: Tera-Scale {LMs} on the Cheap},
booktitle = {Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)},
pages = {468--476},
url = {http://www.aclweb.org/anthology/D/D07/D07-1049},
year = 2007
}
(Talbot and Osborne, 2007a;
Talbot, David and Osborne, Miles (2007):
Randomised Language Modelling for Statistical Machine Translation, Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics
@InProceedings{talbot-osborne:2007:ACLMain,
author = {Talbot, David and Osborne, Miles},
title = {Randomised Language Modelling for Statistical Machine Translation},
booktitle = {Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics},
month = {June},
address = {Prague, Czech Republic},
publisher = {Association for Computational Linguistics},
pages = {512--519},
url = {http://www.aclweb.org/anthology/P/P07/P07-1065},
year = 2007
}
Talbot and Osborne, 2007b). Such randomized language models allow for incremental updating from an incoming stream of new training data
Levenberg, Abby and Osborne, Miles (2009):
Stream-based Randomised Language Models for SMT, Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing
@InProceedings{levenberg-osborne:2009:EMNLP,
author = {Levenberg, Abby and Osborne, Miles},
title = {Stream-based Randomised Language Models for {SMT}},
booktitle = {Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing},
month = {August},
address = {Singapore},
publisher = {Association for Computational Linguistics},
pages = {756--764},
url = {http://www.aclweb.org/anthology/D/D09/D09-1079},
year = 2009
}
(Levenberg and Osborne, 2009), or even multiple streams
Levenberg, Abby and Osborne, Miles and Matthews, David (2011):
Multiple-stream Language Models for Statistical Machine Translation, Proceedings of the Sixth Workshop on Statistical Machine Translation
@InProceedings{levenberg-osborne-matthews:2011:WMT,
author = {Levenberg, Abby and Osborne, Miles and Matthews, David},
title = {Multiple-stream Language Models for Statistical Machine Translation},
booktitle = {Proceedings of the Sixth Workshop on Statistical Machine Translation},
month = {July},
address = {Edinburgh, Scotland},
publisher = {Association for Computational Linguistics},
pages = {177--186},
url = {http://www.aclweb.org/anthology/W11-2122},
year = 2011
}
(Levenberg et al., 2011).
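To make the Bloom filter approach concrete: Talbot and Osborne encode an n-gram's quantized log count c by inserting the n-gram tagged with each value 1..c into the filter, and retrieve it by probing tags upward until the first miss. Since insertion never disturbs existing entries, n-grams arriving on a stream can be folded in on the fly, which is the property the stream-based models exploit. The sketch below follows that idea; filter size, hash count, and log base are illustrative choices.

# Sketch of a log-frequency Bloom filter language model store.
# Errors are one-sided: counts may occasionally be overestimated,
# never underestimated.
import hashlib
import math

class LogFreqBloomLM:
    def __init__(self, n_bits=1 << 20, n_hashes=3, base=2.0):
        self.bits = bytearray(n_bits // 8)
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.base = base                  # log base for count quantization

    def _positions(self, item):
        for i in range(self.n_hashes):
            h = hashlib.blake2b(item.encode(), digest_size=8,
                                salt=bytes([i])).digest()
            yield int.from_bytes(h, "little") % self.n_bits

    def _set(self, item):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def _test(self, item):
        return all(self.bits[p // 8] >> (p % 8) & 1
                   for p in self._positions(item))

    def add(self, ngram, count):
        """Insert the n-gram tagged 1..qc, qc its quantized log count."""
        qc = 1 + int(math.log(count, self.base))
        for c in range(1, qc + 1):
            self._set(f"{ngram}#{c}")

    def quantized_count(self, ngram):
        """Probe tags upward until the first miss."""
        c = 0
        while self._test(f"{ngram}#{c + 1}"):
            c += 1
        return self.base ** (c - 1) if c else 0

# toy usage; add() can be called as new data streams in
lm = LogFreqBloomLM()
lm.add("the cat", 5)
print(lm.quantized_count("the cat"))   # 4.0, lower edge of count 5's bin
print(lm.quantized_count("a dog"))     # 0 with high probability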
The use of very large language models is often restricted to a reranking stage
Olteanu, Marian and Suriyentrakorn, Pasin and Moldovan, Dan (2006):
Language Models and Reranking for Machine Translation, Proceedings on the Workshop on Statistical Machine Translation
@InProceedings{olteanu-suriyentrakorn-moldovan:2006:WMT,
author = {Olteanu, Marian and Suriyentrakorn, Pasin and Moldovan, Dan},
title = {Language Models and Reranking for Machine Translation},
booktitle = {Proceedings on the Workshop on Statistical Machine Translation},
month = {June},
address = {New York City},
publisher = {Association for Computational Linguistics},
pages = {150--153},
url = {http://www.aclweb.org/anthology/W/W06/W06-3122},
year = 2006
}
(Olteanu et al., 2006).
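In such a setup, the decoder first produces an n-best list with a small language model, and each hypothesis is then rescored with the large one. A minimal sketch, where the scoring callback and the interpolation weight are assumptions for illustration:

# Minimal sketch of n-best reranking with a large language model:
# the decoder's model score is interpolated with a log-probability
# from the big LM, and the best rescored hypothesis is returned.

def rerank(nbest, large_lm_logprob, weight=0.5):
    """nbest: list of (hypothesis, decoder_score) pairs."""
    rescored = [
        (hyp, score + weight * large_lm_logprob(hyp))
        for hyp, score in nbest
    ]
    return max(rescored, key=lambda pair: pair[1])[0]

# toy usage with a stand-in LM
toy_lm = {"the cat sat": -2.0, "cat the sat": -9.0}
nbest = [("cat the sat", 1.0), ("the cat sat", 0.8)]
print(rerank(nbest, lambda h: toy_lm[h]))   # "the cat sat"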
Benchmarks
Discussion
Related Topics
New Publications
Heafield, Kenneth and Pouzyrevsky, Ivan and Clark, Jonathan H. and Koehn, Philipp (2013):
Scalable Modified Kneser-Ney Language Model Estimation, Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
@InProceedings{heafield-EtAl:2013:Short,
author = {Heafield, Kenneth and Pouzyrevsky, Ivan and Clark, Jonathan H. and Koehn, Philipp},
title = {Scalable Modified Kneser-Ney Language Model Estimation},
booktitle = {Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)},
month = {August},
address = {Sofia, Bulgaria},
publisher = {Association for Computational Linguistics},
pages = {690--696},
url = {http://www.aclweb.org/anthology/P13-2121},
year = 2013
}
Heafield et al. (2013)
Yasuhara, Makoto and Tanaka, Toru and Norimatsu, Jun-ya and Yamamoto, Mikio (2013):
An Efficient Language Model Using Double-Array Structures, Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing
@InProceedings{yasuhara-EtAl:2013:EMNLP,
author = {Yasuhara, Makoto and Tanaka, Toru and Norimatsu, Jun-ya and Yamamoto, Mikio},
title = {An Efficient Language Model Using Double-Array Structures},
booktitle = {Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing},
month = {October},
address = {Seattle, Washington, USA},
publisher = {Association for Computational Linguistics},
pages = {222--232},
url = {http://www.aclweb.org/anthology/D13-1023},
year = 2013
}
Yasuhara et al. (2013)
Vaswani, Ashish and Zhao, Yinggong and Fossum, Victoria and Chiang, David (2013):
Decoding with Large-Scale Neural Language Models Improves Translation, Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing
@InProceedings{vaswani-EtAl:2013:EMNLP,
author = {Vaswani, Ashish and Zhao, Yinggong and Fossum, Victoria and Chiang, David},
title = {Decoding with Large-Scale Neural Language Models Improves Translation},
booktitle = {Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing},
month = {October},
address = {Seattle, Washington, USA},
publisher = {Association for Computational Linguistics},
pages = {1387--1392},
url = {http://www.aclweb.org/anthology/D13-1140},
year = 2013
}
Vaswani et al. (2013)
Dani Yogatama and Chong Wang and Bryan R. Routledge and Noah A. Smith and Eric P. Xing (2014):
Dynamic Language Models for Streaming Text, Transactions of the Association for Computational Linguistics (TACL)
@article{tacl14-Yogatama,
author = {Dani Yogatama and Chong Wang and Bryan R. Routledge and Noah A. Smith and Eric P. Xing},
title = {Dynamic Language Models for Streaming Text},
number = {2},
pages = {181--192},
url = {http://www.aclweb.org/anthology/Q/Q14/Q14-1015.pdf},
journal = {Transactions of the Association for Computational Linguistics (TACL)},
year = 2014
}
Yogatama et al. (2014)
Guthrie, David and Hepple, Mark (2010):
Storing the Web in Memory: Space Efficient Language Models with Constant Time Retrieval, Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
@InProceedings{guthrie-hepple:2010:EMNLP,
author = {Guthrie, David and Hepple, Mark},
title = {Storing the Web in Memory: Space Efficient Language Models with Constant Time Retrieval},
booktitle = {Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing},
month = {October},
address = {Cambridge, MA},
publisher = {Association for Computational Linguistics},
pages = {262--272},
url = {http://www.aclweb.org/anthology/D/D10/D10-1026},
year = 2010
}
Guthrie and Hepple (2010)
Tan, Ming and Zhou, Wenli and Zheng, Lei and Wang, Shaojun (2011):
A Large Scale Distributed Syntactic, Semantic and Lexical Language Model for Machine Translation, Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies
@InProceedings{tan-EtAl:2011:ACL-HLT2011,
author = {Tan, Ming and Zhou, Wenli and Zheng, Lei and Wang, Shaojun},
title = {A Large Scale Distributed Syntactic, Semantic and Lexical Language Model for Machine Translation},
booktitle = {Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies},
month = {June},
address = {Portland, Oregon, USA},
publisher = {Association for Computational Linguistics},
pages = {201--210},
url = {http://www.aclweb.org/anthology/P11-1021},
year = 2011
}
Tan et al. (2011)