Neural Language Models
Various neural network architectures have been applied to the basic task of language modelling, such as feed-forward n-gram models, recurrent neural networks, and convolutional neural networks.
Neural language models are the main subject of 31 publications; 15 are discussed here.
Publications
The first wave of neural network research tackled language models. A prominent reference for neural language models is
Yoshua Bengio and Réjean Ducharme and Pascal Vincent and Christian Jauvin (2003):
A Neural Probabilistic Language Model, Journal of Machine Learning Research
@article{bengio:JMLR:2003,
author = {Yoshua Bengio and R{\'e}jean Ducharme and Pascal Vincent and Christian Jauvin},
title = {A Neural Probabilistic Language Model},
journal = {Journal of Machine Learning Research},
volume = {3},
month = {February},
pages = {1137--1155},
year = 2003
}
Bengio et al. (2003), who implement an n-gram language model as a feed-forward neural network with the history words as input and the predicted word as output.
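To make the architecture concrete, here is a minimal sketch in Python/NumPy of such a feed-forward n-gram language model; the vocabulary size, layer dimensions, and random weights are illustrative assumptions, not values from the paper.
import numpy as np

rng = np.random.default_rng(0)
V, d, h, n = 10000, 100, 200, 4           # vocabulary size, embedding size, hidden size, n-gram order

C = rng.normal(0, 0.1, (V, d))            # word embedding matrix
W = rng.normal(0, 0.1, (h, (n - 1) * d))  # concatenated history embeddings -> hidden layer
U = rng.normal(0, 0.1, (V, h))            # hidden layer -> output scores

def next_word_distribution(history_ids):
    """P(w | history) for the n-1 most recent word ids."""
    x = np.concatenate([C[i] for i in history_ids])  # look up and concatenate history embeddings
    hidden = np.tanh(W @ x)
    scores = U @ hidden
    scores -= scores.max()                           # for numerical stability
    probs = np.exp(scores)
    return probs / probs.sum()

p = next_word_distribution([4, 17, 256])             # three history words for a 4-gram model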
Schwenk, Holger and Dechelotte, Daniel and Gauvain, Jean-Luc (2006):
Continuous Space Language Models for Statistical Machine Translation, Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions
@InProceedings{schwenk-dechelotte-gauvain:2006:POS,
author = {Schwenk, Holger and Dechelotte, Daniel and Gauvain, Jean-Luc},
title = {Continuous Space Language Models for Statistical Machine Translation},
booktitle = {Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions},
month = {July},
address = {Sydney, Australia},
publisher = {Association for Computational Linguistics},
pages = {723--730},
url = {http://www.aclweb.org/anthology/P/P06/P06-2093},
year = 2006
}
Schwenk et al. (2006) introduce such language models (also called "continuous space language models") to machine translation and use them for re-ranking, similar to earlier work in speech recognition.
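In re-ranking, the neural language model is applied only to the n-best translations produced by the baseline system. A minimal sketch of this setup, where the scoring function nlm_logprob and the interpolation weight are illustrative assumptions (in practice the weight is tuned, e.g. with MERT):
def rerank(nbest, nlm_logprob, lm_weight=0.5):
    """nbest: list of (hypothesis_tokens, baseline_model_score) pairs."""
    rescored = [(hyp, score + lm_weight * nlm_logprob(hyp)) for hyp, score in nbest]
    return max(rescored, key=lambda pair: pair[1])[0]  # best hypothesis after rescoring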
Holger Schwenk (2007):
Continuous space language models, Computer Speech and Language
@article{schwenk:csl:2007,
author = {Holger Schwenk},
title = {Continuous space language models},
journal = {Computer Speech and Language},
url = {https://wiki.inf.ed.ac.uk/twiki/pub/CSTR/ListenSemester2_2009_10/sdarticle.pdf},
volume = {21},
number = {3},
pages = {492--518},
year = 2007
}
Schwenk (2007) proposes a number of speed-ups. The implementation was made available as an open source toolkit
Holger Schwenk (2010):
Continuous-Space Language Models for Statistical Machine Translation, The Prague Bulletin of Mathematical Linguistics
@article{pbml-93-schwenk,
author = {Holger Schwenk},
title = {Continuous-Space Language Models for Statistical Machine Translation},
url = {http://ufal.mff.cuni.cz/pbml/93/art-schwenk.pdf},
pages = {137--146},
journal = {The Prague Bulletin of Mathematical Linguistics},
volume = {93},
year = 2010
}
(Schwenk, 2010), which also supports training on a graphics processing unit (GPU)
Schwenk, Holger and Rousseau, Anthony and Attik, Mohammed (2012):
Large, Pruned or Continuous Space Language Models on a GPU for Statistical Machine Translation, Proceedings of the NAACL-HLT 2012 Workshop: Will We Ever Really Replace the N-gram Model? On the Future of Language Modeling for HLT
@InProceedings{schwenk-rousseau-attik:2012:WLM,
author = {Schwenk, Holger and Rousseau, Anthony and Attik, Mohammed},
title = {Large, Pruned or Continuous Space Language Models on a GPU for Statistical Machine Translation},
booktitle = {Proceedings of the NAACL-HLT 2012 Workshop: Will We Ever Really Replace the N-gram Model? On the Future of Language Modeling for HLT},
month = {June},
address = {Montr{\'e}al, Canada},
publisher = {Association for Computational Linguistics},
pages = {11--19},
url = {http://www.aclweb.org/anthology/W12-2702},
year = 2012
}
(Schwenk et al., 2012).
By first clustering words into classes and encoding each word as a pair of class and word-within-class bits,
Paul Baltescu and Phil Blunsom and Hieu Hoang (2014):
OxLM: A Neural Language Modelling Framework for Machine Translation, The Prague Bulletin of Mathematical Linguistics
@article{pbml-102-baltescu-blunsom-hoang,
author = {Paul Baltescu and Phil Blunsom and Hieu Hoang},
title = {OxLM: A Neural Language Modelling Framework for Machine Translation},
url = {http://ufal.mff.cuni.cz/pbml/102/art-baltescu-blunsom-hoang.pdf},
pages = {81--92},
journal = {The Prague Bulletin of Mathematical Linguistics},
volume = {102},
month = {October},
year = 2014
}
Baltescu et al. (2014) reduce the computational complexity sufficiently to allow integration of the neural network language model into the decoder. Another way to reduce computational complexity and enable decoder integration is noise-contrastive estimation, used by
Vaswani, Ashish and Zhao, Yinggong and Fossum, Victoria and Chiang, David (2013):
Decoding with Large-Scale Neural Language Models Improves Translation, Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing
@InProceedings{vaswani-EtAl:2013:EMNLP,
author = {Vaswani, Ashish and Zhao, Yinggong and Fossum, Victoria and Chiang, David},
title = {Decoding with Large-Scale Neural Language Models Improves Translation},
booktitle = {Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing},
month = {October},
address = {Seattle, Washington, USA},
publisher = {Association for Computational Linguistics},
pages = {1387--1392},
url = {http://www.aclweb.org/anthology/D13-1140},
year = 2013
}
Vaswani et al. (2013), which roughly self-normalizes the output scores of the model during training, hence removing the need to compute the values for all possible output words.
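As a rough sketch of the idea, noise-contrastive estimation trains the model to distinguish the observed next word from a handful of words sampled from a noise distribution, treating the unnormalized score as if it were already a log-probability. The function score() and the noise distribution q below are placeholders, not the API of a particular toolkit.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def nce_loss(score, history, true_word, noise_words, q, k):
    """Binary classification of the true next word against k sampled noise words."""
    def logit(w):
        # unnormalized model log-score minus log of (k * noise probability)
        return score(history, w) - np.log(k * q[w])
    loss = -np.log(sigmoid(logit(true_word)))      # the true word should be classified as "data"
    for w in noise_words:
        loss -= np.log(1.0 - sigmoid(logit(w)))    # sampled words should be classified as "noise"
    return loss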
Baltescu, Paul and Blunsom, Phil (2015):
Pragmatic Neural Language Modelling in Machine Translation, Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
@InProceedings{baltescu-blunsom:2015:NAACL-HLT,
author = {Baltescu, Paul and Blunsom, Phil},
title = {Pragmatic Neural Language Modelling in Machine Translation},
booktitle = {Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies},
month = {May--June},
address = {Denver, Colorado},
publisher = {Association for Computational Linguistics},
pages = {820--829},
url = {http://www.aclweb.org/anthology/N15-1083},
year = 2015
}
Baltescu and Blunsom (2015) compare the two techniques (class-based word encoding with normalized scores versus noise-contrastive estimation without normalized scores) and show that the latter gives better performance at much higher speed.
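The class-based factorization mentioned above amounts to a two-level softmax, P(w|h) = P(class(w)|h) * P(w|class(w),h), which reduces the cost per prediction from |V| output scores to roughly |C| + |V|/|C|. A sketch, with illustrative data structures that are assumptions rather than any toolkit's API:
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def class_factored_prob(hidden, word, word2class, class_members, W_class, W_word):
    """hidden: hidden-layer vector; W_class: (|C|, h); W_word[c]: (|V_c|, h),
    with rows ordered like the word ids in class_members[c]."""
    c = word2class[word]
    p_class = softmax(W_class @ hidden)[c]                           # P(class | history)
    members = class_members[c]
    p_in_class = softmax(W_word[c] @ hidden)[members.index(word)]    # P(word | class, history)
    return p_class * p_in_class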
As another way to allow straightforward decoder integration,
Wang, Rui and Utiyama, Masao and Goto, Isao and Sumita, Eiichro and Zhao, Hai and Lu, Bao-Liang (2013):
Converting Continuous-Space Language Models into N-Gram Language Models for Statistical Machine Translation, Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing
@InProceedings{wang-EtAl:2013:EMNLP2,
author = {Wang, Rui and Utiyama, Masao and Goto, Isao and Sumita, Eiichro and Zhao, Hai and Lu, Bao-Liang},
title = {Converting Continuous-Space Language Models into N-Gram Language Models for Statistical Machine Translation},
booktitle = {Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing},
month = {October},
address = {Seattle, Washington, USA},
publisher = {Association for Computational Linguistics},
pages = {845--850},
url = {http://www.aclweb.org/anthology/D13-1082},
year = 2013
}
Wang et al. (2013) convert a continuous space language model for a short list of 8192 words into a traditional n-gram language model in ARPA (SRILM) format.
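A much simplified sketch of such a conversion: every n-gram collected from some data is scored by the continuous space model and written out in ARPA format. The query function nnlm_prob is a placeholder, and a real conversion also has to produce back-off weights for unseen n-grams.
import math

def write_arpa(path, ngrams, nnlm_prob, order=3):
    """ngrams: dict mapping n -> list of n-gram tuples over the short-list vocabulary."""
    with open(path, "w") as f:
        f.write("\\data\\\n")
        for n in range(1, order + 1):
            f.write("ngram %d=%d\n" % (n, len(ngrams[n])))
        for n in range(1, order + 1):
            f.write("\n\\%d-grams:\n" % n)
            for gram in ngrams[n]:
                p = nnlm_prob(gram[:-1], gram[-1])            # query the neural model
                f.write("%.6f\t%s\n" % (math.log10(p), " ".join(gram)))
        f.write("\n\\end\\\n")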
Wang, Rui and Zhao, Hai and Lu, Bao-Liang and Utiyama, Masao and Sumita, Eiichiro (2014):
Neural Network Based Bilingual Language Model Growing for Statistical Machine Translation, Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)
@InProceedings{wang-EtAl:2014:EMNLP20142,
author = {Wang, Rui and Zhao, Hai and Lu, Bao-Liang and Utiyama, Masao and Sumita, Eiichiro},
title = {Neural Network Based Bilingual Language Model Growing for Statistical Machine Translation},
booktitle = {Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
month = {October},
address = {Doha, Qatar},
publisher = {Association for Computational Linguistics},
pages = {189--195},
url = {http://www.aclweb.org/anthology/D14-1023},
year = 2014
}
Wang et al. (2014) present a method to merge (or "grow") a continuous space language model with a traditional n-gram language model, to take advantage of both the better estimates for words in the short list and the full coverage of the traditional model.
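The intuition can be illustrated with a deliberately simplified combination rule; this is not Wang et al.'s actual growing procedure, and all function names are placeholders.
def combined_prob(history, word, shortlist, nnlm_prob, ngram_prob):
    # neural estimate for short-list words, n-gram estimate for full coverage
    if word in shortlist:
        return nnlm_prob(history, word)
    return ngram_prob(history, word)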
Finch, Andrew and Dixon, Paul and Sumita, Eiichiro (2012):
Rescoring a Phrase-based Machine Transliteration System with Recurrent Neural Network Language Models, Proceedings of the 4th Named Entity Workshop (NEWS) 2012
@InProceedings{finch-dixon-sumita:2012:NEWS2012,
author = {Finch, Andrew and Dixon, Paul and Sumita, Eiichiro},
title = {Rescoring a Phrase-based Machine Transliteration System with Recurrent Neural Network Language Models},
booktitle = {Proceedings of the 4th Named Entity Workshop (NEWS) 2012},
month = {July},
address = {Jeju, Korea},
publisher = {Association for Computational Linguistics},
pages = {47--51},
url = {http://www.aclweb.org/anthology/W12-4406},
year = 2012
}
Finch et al. (2012) use a recurrent neural network language model to rescore n-best lists for a transliteration system.
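Unlike the feed-forward model, a recurrent language model conditions on the entire history through a hidden state that is updated word by word. A minimal Elman-style sketch of scoring a sentence, with illustrative shapes and weights:
import numpy as np

rng = np.random.default_rng(0)
V, h = 5000, 128
E = rng.normal(0, 0.1, (V, h))      # input word embeddings
W = rng.normal(0, 0.1, (h, h))      # recurrent weights
U = rng.normal(0, 0.1, (V, h))      # output weights

def sentence_logprob(token_ids):
    """Sum of log P(w_t | w_1 .. w_{t-1}) over the sentence."""
    state = np.zeros(h)
    logprob = 0.0
    for prev, nxt in zip(token_ids[:-1], token_ids[1:]):
        state = np.tanh(E[prev] + W @ state)        # update hidden state with the previous word
        scores = U @ state
        scores -= scores.max()
        probs = np.exp(scores) / np.exp(scores).sum()
        logprob += np.log(probs[nxt])
    return logprob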
Sundermeyer, Martin and Oparin, Ilya and Gauvain, Jean-Luc and Freiberg, Ben and Schlüter, Ralf and Ney, Hermann (2013):
Comparison of Feedforward and Recurrent Neural Network Language Models, IEEE International Conference on Acoustics, Speech, and Signal Processing
@InProceedings {sundermeyer13:cmp,
author = {Sundermeyer, Martin and Oparin, Ilya and Gauvain, Jean-Luc and Freiberg, Ben and Schl{\"u}ter, Ralf and Ney, Hermann},
title = {Comparison of Feedforward and Recurrent Neural Network Language Models},
booktitle = {IEEE International Conference on Acoustics, Speech, and Signal Processing},
pages = {8430--8434},
address = {Vancouver, Canada},
month = {May},
booktitlelink = {http://www.icassp2013.com/},
url = {http://www.eu-bridge.eu/downloads/_Comparison_of_Feedforward_and_Recurrent_Neural_Network_Language_Models.pdf},
year = 2013
}
Sundermeyer et al. (2013) compare feed-forward with long short-term memory (LSTM) neural network language models, a variant of recurrent neural network language models, showing better performance for the latter in a speech recognition re-ranking task.
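The long short-term memory variant replaces the plain recurrent update with a gated memory cell, which makes it easier to retain information over long histories. A sketch of a single LSTM step, with illustrative weight shapes:
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, Wx, Wh, b):
    """One time step; x: input embedding (d,), Wx: (4h, d), Wh: (4h, h), b: (4h,)."""
    z = Wx @ x + Wh @ h_prev + b
    i, f, o, g = np.split(z, 4)
    i, f, o, g = sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)  # input, forget, output gates and candidate
    c = f * c_prev + i * g          # updated memory cell
    h = o * np.tanh(c)              # new hidden state
    return h, c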
Mikolov (2012) reports significant improvements when re-ranking n-best lists of machine translation systems with a recurrent neural network language model.
Neural language models are typically not deep learning models in the sense of using many hidden layers. However,
Luong, Thang and Kayser, Michael and Manning, Christopher D. (2015):
Deep Neural Language Models for Machine Translation, Proceedings of the Nineteenth Conference on Computational Natural Language Learning
@InProceedings{luong-kayser-manning:2015:CoNLL,
author = {Luong, Thang and Kayser, Michael and Manning, Christopher D.},
title = {Deep Neural Language Models for Machine Translation},
booktitle = {Proceedings of the Nineteenth Conference on Computational Natural Language Learning},
month = {July},
address = {Beijing, China},
publisher = {Association for Computational Linguistics},
pages = {305--309},
url = {http://www.aclweb.org/anthology/K15-1031},
year = 2015
}
Luong et al. (2015) show that using 3 to 4 hidden layers improves over the typical single hidden layer.
Language Models in Neural Machine Translation: Traditional statistical machine translation models have a straightforward mechanism for integrating additional knowledge sources, such as a large out-of-domain language model. This is harder for end-to-end neural machine translation.
Çaglar Gülçehre and Orhan Firat and Kelvin Xu and Kyunghyun Cho and Loïc Barrault and Huei-Chi Lin and Fethi Bougares and Holger Schwenk and Yoshua Bengio (2015):
On Using Monolingual Corpora in Neural Machine Translation, CoRR
@article{DBLP:journals/corr/GulcehreFXCBLBS15,
author = {{\c{C}}aglar G{\"{u}}l{\c{c}}ehre and Orhan Firat and Kelvin Xu and Kyunghyun Cho and Lo{\"{\i}}c Barrault and Huei{-}Chi Lin and Fethi Bougares and Holger Schwenk and Yoshua Bengio},
title = {On Using Monolingual Corpora in Neural Machine Translation},
journal = {CoRR},
volume = {abs/1503.03535},
url = {http://arxiv.org/abs/1503.03535},
year = 2015
}
Gülçehre et al. (2015) add a language model trained on additional monolingual data to the neural translation model, in the form of a recurrent neural network that runs in parallel. They compare using the language model in re-ranking (or re-scoring) against deeper integration, where a gated unit regulates the relative contributions of the language model and the translation model when predicting a word.
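The deeper integration can be sketched as follows: a scalar gate computed from the language model's hidden state scales its contribution before both hidden states feed the layer that scores the next word. Variable names and the exact output parameterization are illustrative assumptions, not the authors' code.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fused_output_scores(s_tm, s_lm, v_gate, b_gate, W_out):
    """s_tm: decoder hidden state, s_lm: language model hidden state,
    W_out: (|V|, len(s_tm) + len(s_lm))."""
    g = sigmoid(v_gate @ s_lm + b_gate)        # how much to trust the language model here
    fused = np.concatenate([s_tm, g * s_lm])
    return W_out @ fused                       # unnormalized scores over the output vocabulary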
Benchmarks
Discussion
Related Topics
New Publications
Herold, Christian and Gao, Yingbo and Ney, Hermann (2018):
Improving Neural Language Models with Weight Norm Initialization and Regularization, Proceedings of the Third Conference on Machine Translation: Research Papers
@inproceedings{W18-6310,
author = {Herold, Christian and Gao, Yingbo and Ney, Hermann},
title = {Improving Neural Language Models with Weight Norm Initialization and Regularization},
booktitle = {Proceedings of the Third Conference on Machine Translation: Research Papers},
month = {October},
address = {Belgium, Brussels},
publisher = {Association for Computational Linguistics},
url = {https://www.aclweb.org/anthology/W18-6310},
pages = {93--100},
year = 2018
}
Herold et al. (2018)
Stahlberg, Felix and Cross, James and Stoyanov, Veselin (2018):
Simple Fusion: Return of the Language Model, Proceedings of the Third Conference on Machine Translation: Research Papers
@inproceedings{W18-6321,
author = {Stahlberg, Felix and Cross, James and Stoyanov, Veselin},
title = {Simple Fusion: Return of the Language Model},
booktitle = {Proceedings of the Third Conference on Machine Translation: Research Papers},
month = {October},
address = {Belgium, Brussels},
publisher = {Association for Computational Linguistics},
url = {https://www.aclweb.org/anthology/W18-6321},
pages = {204--211},
year = 2018
}
Stahlberg et al. (2018)
Aram Ter-Sarkisov and Holger Schwenk and Fethi Bougares and Loïc Barrault (2014):
Incremental Adaptation Strategies for Neural Network Language Models, CoRR
@article{DBLP:journals/corr/Ter-SarkisovSBB14,
author = {Aram Ter{-}Sarkisov and Holger Schwenk and Fethi Bougares and Lo{\"{\i}}c Barrault},
title = {Incremental Adaptation Strategies for Neural Network Language Models},
journal = {CoRR},
volume = {abs/1412.6650},
url = {http://arxiv.org/abs/1412.6650},
year = 2014
}
Ter-Sarkisov et al. (2014)
Verwimp, Lyan and Pelemans, Joris and Van hamme, Hugo and Wambacq, Patrick (2017):
Character-Word LSTM Language Models, Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers
@InProceedings{verwimp-EtAl:2017:EACLlong,
author = {Verwimp, Lyan and Pelemans, Joris and Van hamme, Hugo and Wambacq, Patrick},
title = {Character-Word LSTM Language Models},
booktitle = {Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers},
month = {April},
address = {Valencia, Spain},
publisher = {Association for Computational Linguistics},
pages = {417--427},
url = {http://www.aclweb.org/anthology/E17-1040},
year = 2017
}
Verwimp et al. (2017)
Pham, Ngoc-Quan and Kruszewski, Germán and Boleda, Gemma (2016):
Convolutional Neural Network Language Models, Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing
@InProceedings{pham-kruszewski-boleda:2016:EMNLP2016,
author = {Pham, Ngoc-Quan and Kruszewski, Germ\'{a}n and Boleda, Gemma},
title = {Convolutional Neural Network Language Models},
booktitle = {Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing},
month = {November},
address = {Austin, Texas},
publisher = {Association for Computational Linguistics},
pages = {1153--1162},
url = {https://aclweb.org/anthology/D16-1123},
year = 2016
}
Pham et al. (2016)
Miyamoto, Yasumasa and Cho, Kyunghyun (2016):
Gated Word-Character Recurrent Language Model, Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing
@InProceedings{miyamoto-cho:2016:EMNLP2016,
author = {Miyamoto, Yasumasa and Cho, Kyunghyun},
title = {Gated Word-Character Recurrent Language Model},
booktitle = {Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing},
month = {November},
address = {Austin, Texas},
publisher = {Association for Computational Linguistics},
pages = {1992--1997},
url = {https://aclweb.org/anthology/D16-1209},
year = 2016
}
Miyamoto and Cho (2016)
Neubig, Graham and Dyer, Chris (2016):
Generalizing and Hybridizing Count-based and Neural Language Models, Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing
@InProceedings{neubig-dyer:2016:EMNLP2016,
author = {Neubig, Graham and Dyer, Chris},
title = {Generalizing and Hybridizing Count-based and Neural Language Models},
booktitle = {Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing},
month = {November},
address = {Austin, Texas},
publisher = {Association for Computational Linguistics},
pages = {1163--1172},
url = {https://aclweb.org/anthology/D16-1124},
year = 2016
}
Neubig and Dyer (2016)
Niehues, Jan and Ha, Thanh-Le and Cho, Eunah and Waibel, Alex (2016):
Using Factored Word Representation in Neural Network Language Models, Proceedings of the First Conference on Machine Translation
@InProceedings{niehues-EtAl:2016:WMT,
author = {Niehues, Jan and Ha, Thanh-Le and Cho, Eunah and Waibel, Alex},
title = {Using Factored Word Representation in Neural Network Language Models},
booktitle = {Proceedings of the First Conference on Machine Translation},
month = {August},
address = {Berlin, Germany},
publisher = {Association for Computational Linguistics},
pages = {74--82},
url = {http://www.aclweb.org/anthology/W/W16/W16-2208},
year = 2016
}
Niehues et al. (2016)
Chen, Wenlin and Grangier, David and Auli, Michael (2016):
Strategies for Training Large Vocabulary Neural Language Models, Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
@InProceedings{chen-grangier-auli:2016:P16-1,
author = {Chen, Wenlin and Grangier, David and Auli, Michael},
title = {Strategies for Training Large Vocabulary Neural Language Models},
booktitle = {Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
month = {August},
address = {Berlin, Germany},
publisher = {Association for Computational Linguistics},
pages = {1975--1985},
url = {http://www.aclweb.org/anthology/P16-1186},
year = 2016
}
Chen et al. (2016)
Chen, Yunchuan and Mou, Lili and Xu, Yan and Li, Ge and Jin, Zhi (2016):
Compressing Neural Language Models by Sparse Word Representations, Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
@InProceedings{chen-EtAl:2016:P16-11,
author = {Chen, Yunchuan and Mou, Lili and Xu, Yan and Li, Ge and Jin, Zhi},
title = {Compressing Neural Language Models by Sparse Word Representations},
booktitle = {Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
month = {August},
address = {Berlin, Germany},
publisher = {Association for Computational Linguistics},
pages = {226--235},
url = {http://www.aclweb.org/anthology/P16-1022},
year = 2016
}
Chen et al. (2016)
Devlin, Jacob and Quirk, Chris and Menezes, Arul (2015):
Pre-Computable Multi-Layer Neural Network Language Models, Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing
@InProceedings{devlin-quirk-menezes:2015:EMNLP,
author = {Devlin, Jacob and Quirk, Chris and Menezes, Arul},
title = {Pre-Computable Multi-Layer Neural Network Language Models},
booktitle = {Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing},
month = {September},
address = {Lisbon, Portugal},
publisher = {Association for Computational Linguistics},
pages = {256--260},
url = {http://aclweb.org/anthology/D15-1029},
year = 2015
}
Devlin et al. (2015)
Walid Aransa and Holger Schwenk and Loïc Barrault (2015):
Improving continuous space language models using auxiliary features, Proceedings of the International Workshop on Spoken Language Translation (IWSLT)
@inproceedings{IWSLT-2015-Aransa,
author = {Walid Aransa and Holger Schwenk and Loïc Barrault},
title = {Improving continuous space language models using auxiliary features},
pages = {151--158},
booktitle = {Proceedings of the International Workshop on Spoken Language Translation (IWSLT)},
location = {Da Nang, Vietnam},
url = {http://www.mt-archive.info/15/IWSLT-2015-aransa.pdf},
month = {December},
year = 2015
}
Aransa et al. (2015)
Auli, Michael and Gao, Jianfeng (2014):
Decoder Integration and Expected BLEU Training for Recurrent Neural Network Language Models, Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
@InProceedings{auli-gao:2014:P14-2,
author = {Auli, Michael and Gao, Jianfeng},
title = {Decoder Integration and Expected BLEU Training for Recurrent Neural Network Language Models},
booktitle = {Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)},
month = {June},
address = {Baltimore, Maryland},
publisher = {Association for Computational Linguistics},
pages = {136--142},
url = {http://www.aclweb.org/anthology/P14-2023},
year = 2014
}
Auli and Gao (2014)
Jan Niehues and Alexander Allauzen and François Yvon and Alex Waibel (2014):
Combining techniques from different NN-based language models for machine translation, Proceedings of the Eleventh Conference of the Association for Machine Translation in the Americas (AMTA)
@inproceedings{AMTA-2014-Niehues,
author = {Jan Niehues and Alexander Allauzen and Fran{\c{c}}ois Yvon and Alex Waibel},
title = {Combining techniques from different NN-based language models for machine translation},
pages = {222--233},
url = {http://www.mt-archive.info/10/AMTA-2014-Niehues.pdf},
volume = {1},
booktitle = {Proceedings of the Eleventh Conference of the Association for Machine Translation in the Americas (AMTA)},
location = {Vancouver, BC, Canada},
year = 2014
}
Niehues et al. (2014)
Jan Niehues and Alex Waibel (2012):
Continuous space language models using restricted Boltzmann machines, Proceedings of the seventh International Workshop on Spoken Language Translation (IWSLT)
@inproceedings{iwslt12:Niehues,
author = {Jan Niehues and Alex Waibel},
title = {Continuous space language models using restricted Boltzmann machines},
url = {http://www.mt-archive.info/IWSLT-2012-Niehues.pdf},
pages = {164--170},
booktitle = {Proceedings of the seventh International Workshop on Spoken Language Translation (IWSLT)},
location = {Hong Kong},
year = 2012
}
Niehues and Waibel (2012)
Alkhouli, Tamer and Rietig, Felix and Ney, Hermann (2015):
Investigations on Phrase-based Decoding with Recurrent Neural Network Language and Translation Models, Proceedings of the Tenth Workshop on Statistical Machine Translation
@InProceedings{alkhouli-rietig-ney:2015:WMT,
author = {Alkhouli, Tamer and Rietig, Felix and Ney, Hermann},
title = {Investigations on Phrase-based Decoding with Recurrent Neural Network Language and Translation Models},
booktitle = {Proceedings of the Tenth Workshop on Statistical Machine Translation},
month = {September},
address = {Lisbon, Portugal},
publisher = {Association for Computational Linguistics},
pages = {294--303},
url = {http://aclweb.org/anthology/W15-3034},
year = 2015
}
Alkhouli et al. (2015)
Wang, Rui and Utiyama, Masao and Goto, Isao and Sumita, Eiichro and Zhao, Hai and Lu, Bao-Liang (2013):
Converting Continuous-Space Language Models into N-Gram Language Models for Statistical Machine Translation, Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing
@InProceedings{wang-EtAl:2013:EMNLP2,
author = {Wang, Rui and Utiyama, Masao and Goto, Isao and Sumita, Eiichro and Zhao, Hai and Lu, Bao-Liang},
title = {Converting Continuous-Space Language Models into N-Gram Language Models for Statistical Machine Translation},
booktitle = {Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing},
month = {October},
address = {Seattle, Washington, USA},
publisher = {Association for Computational Linguistics},
pages = {845--850},
url = {http://www.aclweb.org/anthology/D13-1082},
year = 2013
}
Wang et al. (2013)