Embeddings

Embeddings of words, phrases, sentences, and entire documents have several uses; one of them is to work towards interlingual representations of meaning.

Embeddings is the main subject of 26 publications. 10 are discussed here.

Publications

Word embeddings have become a common feature in current research in natural language processing. Mikolov et al. (2013) propose the skip-gram method to obtain these representations. Mikolov et al. (2013) introduce efficient training methods for the skip-gram and continuous bag-of-words models, which are used in the very popular word2vec implementation and in publicly available word embedding sets for many languages.
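The skip-gram idea of predicting the neighbors of a word can be illustrated by the way training pairs are extracted from a sentence. The following is a minimal sketch, not the word2vec implementation; the function name skipgram_pairs and the window size are assumptions made for illustration.

```python
from typing import List, Tuple


def skipgram_pairs(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Return (center, context) pairs for each token and its neighbors.

    Hypothetical helper: a context word is any token at most `window`
    positions to the left or right of the center word.
    """
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


sentence = "the cat sat on the mat".split()
pairs = skipgram_pairs(sentence, window=1)
for center, context in pairs:
    print(center, "->", context)
```

A skip-gram model would then be trained to assign high probability to each context word given its center word; the training itself is omitted here.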

Pennington et al. (2014) train word embedding models on the co-occurrence statistics of a word over the entire corpus.
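Corpus-wide co-occurrence statistics of this kind can be collected with a simple counting pass. The sketch below assumes a symmetric context window and unweighted counts; both choices, as well as the name cooccurrence_counts, are illustrative assumptions rather than the setup of Pennington et al. (2014).

```python
from collections import Counter
from typing import List


def cooccurrence_counts(corpus: List[List[str]], window: int = 2) -> Counter:
    """Count how often two words appear within `window` positions of each other.

    Hypothetical helper: counts are symmetric, so both (w1, w2) and
    (w2, w1) are incremented for every co-occurrence.
    """
    counts = Counter()
    for sentence in corpus:
        for i, word in enumerate(sentence):
            # Look only to the right; the symmetric update covers the left side.
            for j in range(i + 1, min(len(sentence), i + window + 1)):
                counts[(word, sentence[j])] += 1
                counts[(sentence[j], word)] += 1
    return counts


corpus = [["ice", "is", "cold"], ["steam", "is", "hot"]]
counts = cooccurrence_counts(corpus, window=1)
print(counts[("is", "cold")], counts[("ice", "cold")])
```

An embedding model trained on such statistics fits word vectors to the whole count table rather than to one context window at a time.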

Contextualized Word Embeddings

Peters et al. (2018) demonstrate that various natural language tasks can be improved by contextualizing word embeddings through bi-directional neural language model layers (called ELMo), just as it is done in encoders in machine translation. Devlin et al. (2019) show superior results with a method called BERT, which pre-trains word embeddings on masked language model and next sentence prediction tasks using the transformer architecture. Yang et al. (2019) refine the BERT model by predicting one masked word at a time, with permutation of the order of the masked words. They call their variant XLNet.
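The masked language model task can be pictured as hiding some input tokens and asking the model to recover them. The following is a simplified sketch under stated assumptions: the 15% masking rate, the "[MASK]" symbol, and the function name mask_tokens are chosen for illustration, and the published BERT procedure additionally keeps or randomly replaces some of the selected tokens, which is omitted here.

```python
import random
from typing import List, Optional, Tuple

MASK = "[MASK]"  # assumed placeholder symbol


def mask_tokens(
    tokens: List[str], mask_prob: float = 0.15, seed: int = 0
) -> Tuple[List[str], List[Optional[str]]]:
    """Replace each token by MASK with probability `mask_prob`.

    Returns the masked sequence and, per position, the original token
    to be predicted (None where the token was left unchanged).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    masked: List[str] = []
    targets: List[Optional[str]] = []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(MASK)
            targets.append(tok)
        else:
            masked.append(tok)
            targets.append(None)
    return masked, targets


tokens = "the quick brown fox jumps over the lazy dog".split()
masked, targets = mask_tokens(tokens, mask_prob=0.3)
print(masked)
print(targets)
```

Only the positions with a non-empty target contribute to the training loss in this setup.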

Using Pre-Trained Word Embeddings

Xing et al. (2015) point out inconsistencies between the representation of word embeddings and the objective function for translation transforms between word embeddings, which they address with normalization. Hirasawa et al. (2019) de-bias word embeddings and show gains with pre-trained word embeddings in a low-resource setting.
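One common form of normalization is to scale every word vector to unit length, so that the inner product of two vectors equals their cosine similarity. The sketch below assumes this L2 normalization; the helper names normalize and dot are hypothetical, and the learning of the translation transform itself is not shown.

```python
import math
from typing import List


def normalize(vector: List[float]) -> List[float]:
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]


def dot(a: List[float], b: List[float]) -> float:
    """Inner product; for unit-length inputs this is the cosine similarity."""
    return sum(x * y for x, y in zip(a, b))


# Toy two-dimensional "embeddings", made up for illustration.
u = normalize([3.0, 4.0])
w = normalize([4.0, 3.0])
print(dot(u, u), dot(u, w))
```

With unit-length vectors, a training objective based on inner products and an evaluation based on cosine similarity measure the same quantity.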

Phrase Embeddings

Zhang et al. (2014) learn phrase embeddings using recursive neural networks and auto-encoders, together with a mapping between input and output phrases, to add an additional score to the phrase translations and to filter the phrase table. Hu et al. (2015) use convolutional neural networks to encode the input and output phrase and pass them to a matching layer that computes their similarity. They include the full input sentence context and use a learning strategy called curriculum learning that first learns from the easy training examples and then from the harder ones.
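The easy-to-hard ordering of curriculum learning can be sketched as sorting the training examples by a difficulty score and releasing them in stages. This is an illustration only: the use of sentence length as the difficulty score, the cumulative stages, and the name curriculum_stages are assumptions and not the procedure of Hu et al. (2015).

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def curriculum_stages(
    examples: List[T], difficulty: Callable[[T], float], num_stages: int = 3
) -> List[List[T]]:
    """Split examples into cumulative stages ordered from easy to hard.

    Stage k contains the easiest k/num_stages share of the data (rounded up),
    so the last stage always contains every example.
    """
    ordered = sorted(examples, key=difficulty)
    stages = []
    for k in range(1, num_stages + 1):
        # Integer ceiling of len(ordered) * k / num_stages.
        cutoff = (len(ordered) * k + num_stages - 1) // num_stages
        stages.append(ordered[:cutoff])
    return stages


# Assumed difficulty measure: number of words in the sentence.
examples = ["a b c d", "a", "a b", "a b c d e f"]
stages = curriculum_stages(examples, difficulty=lambda s: len(s.split()), num_stages=2)
for number, stage in enumerate(stages, start=1):
    print("stage", number, stage)
```

A training loop would run one or more epochs on each stage in turn, so that the model sees the harder examples only after it has been exposed to the easier ones.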

Benchmarks

Discussion

New Publications

Jauregi Unanue, Inigo and Zare Borzeshi, Ehsan and Esmaili, Nazanin and Piccardi, Massimo (2019): ReWE: Regressing Word Embeddings for Regularization of Neural Machine Translation Systems, Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)
@inproceedings{jauregi-unanue-etal-2019-rewe,
author = {Jauregi Unanue, Inigo and Zare Borzeshi, Ehsan and Esmaili, Nazanin and Piccardi, Massimo},
title = {{R}e{WE}: Regressing Word Embeddings for Regularization of Neural Machine Translation Systems},
booktitle = {Proceedings of the 2019 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)},
month = {jun},
address = {Minneapolis, Minnesota},
publisher = {Association for Computational Linguistics},
url = {https://www.aclweb.org/anthology/N19-1041},
pages = {430--436},
year = 2019
}
Jauregi Unanue et al. (2019)
McCann, Bryan and Bradbury, James and Xiong, Caiming and Socher, Richard (2017): Learned in Translation: Contextualized Word Vectors, Advances in Neural Information Processing Systems 30
@incollection{NIPS2017-7209,
author = {McCann, Bryan and Bradbury, James and Xiong, Caiming and Socher, Richard},
title = {Learned in Translation: Contextualized Word Vectors},
booktitle = {Advances in Neural Information Processing Systems 30},
editor = {I. Guyon and U. V. Luxburg and S. Bengio and H. Wallach and R. Fergus and S. Vishwanathan and R. Garnett},
pages = {6294--6305},
publisher = {Curran Associates, Inc.},
url = {http://papers.nips.cc/paper/7209-learned-in-translation-contextualized-word-vectors.pdf},
year = 2017
}
McCann et al. (2017)
Mrksic, Nikola and Vulić, Ivan and O Seaghdha, Diarmuid and Leviant, Ira and Reichart, Roi and Gasic, Milica and Korhonen, Anna and Young, Steve (2017): Semantic Specialization of Distributional Word Vector Spaces using Monolingual and Cross-Lingual Constraints, Transactions of the Association for Computational Linguistics
@article{TACL1171,
author = {Mrksic, Nikola and Vuli\'{c}, Ivan and O Seaghdha, Diarmuid and Leviant, Ira and Reichart, Roi and Gasic, Milica and Korhonen, Anna and Young, Steve},
title = {Semantic Specialization of Distributional Word Vector Spaces using Monolingual and Cross-Lingual Constraints},
journal = {Transactions of the Association for Computational Linguistics},
volume = {5},
issn = {2307-387X},
url = {https://transacl.org/ojs/index.php/tacl/article/view/1171},
pages = {309--324},
year = 2017
}
Mrksic et al. (2017)
Wieting, John and Gimpel, Kevin (2018): ParaNMT-50M: Pushing the Limits of Paraphrastic Sentence Embeddings with Millions of Machine Translations, Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
@InProceedings{P18-1042,
author = {Wieting, John and Gimpel, Kevin},
title = {ParaNMT-50M: Pushing the Limits of Paraphrastic Sentence Embeddings with Millions of Machine Translations},
booktitle = {Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
publisher = {Association for Computational Linguistics},
pages = {451--462},
location = {Melbourne, Australia},
url = {http://aclweb.org/anthology/P18-1042},
year = 2018
}
Wieting and Gimpel (2018)
Pilehvar, Mohammad Taher and Collier, Nigel (2017): Inducing Embeddings for Rare and Unseen Words by Leveraging Lexical Resources, Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers
@InProceedings{pilehvar-collier:2017:EACLshort,
author = {Pilehvar, Mohammad Taher and Collier, Nigel},
title = {Inducing Embeddings for Rare and Unseen Words by Leveraging Lexical Resources},
booktitle = {Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers},
month = {April},
address = {Valencia, Spain},
publisher = {Association for Computational Linguistics},
pages = {388--393},
url = {http://www.aclweb.org/anthology/E17-2062},
year = 2017
}
Pilehvar and Collier (2017)
Passban, Peyman and Liu, Qun and Way, Andy (2016): Enriching Phrase Tables for Statistical Machine Translation Using Mixed Embeddings, Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers
@InProceedings{passban-liu-way:2016:COLING,
author = {Passban, Peyman and Liu, Qun and Way, Andy},
title = {Enriching Phrase Tables for Statistical Machine Translation Using Mixed Embeddings},
booktitle = {Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers},
month = {December},
address = {Osaka, Japan},
publisher = {The COLING 2016 Organizing Committee},
pages = {2582--2591},
url = {http://aclweb.org/anthology/C16-1243},
year = 2016
}
Passban et al. (2016)
Sergienya, Irina and Schütze, Hinrich (2015): Learning Better Embeddings for Rare Words Using Distributional Representations, Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing
@InProceedings{sergienya-schutze:2015:EMNLP,
author = {Sergienya, Irina and Sch\"{u}tze, Hinrich},
title = {Learning Better Embeddings for Rare Words Using Distributional Representations},
booktitle = {Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing},
month = {September},
address = {Lisbon, Portugal},
publisher = {Association for Computational Linguistics},
pages = {280--285},
url = {http://aclweb.org/anthology/D15-1033},
year = 2015
}
Sergienya and Schütze (2015)
Köhn, Arne (2015): What's in an Embedding? Analyzing Word Embeddings through Multilingual Evaluation, Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing
@InProceedings{kohn:2015:EMNLP,
author = {K\"{o}hn, Arne},
title = {What's in an Embedding? Analyzing Word Embeddings through Multilingual Evaluation},
booktitle = {Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing},
month = {September},
address = {Lisbon, Portugal},
publisher = {Association for Computational Linguistics},
pages = {2067--2073},
url = {http://aclweb.org/anthology/D15-1246},
year = 2015
}
Köhn (2015)
Sachdeva, Kunal and Sharma, Dipti (2015): Exploring the effect of semantic similarity for Phrase-based Machine Translation, Proceedings of the 3rd Workshop on Continuous Vector Space Models and their Compositionality
@InProceedings{sachdeva-sharma:2015:CVSC,
author = {Sachdeva, Kunal and Sharma, Dipti},
title = {Exploring the effect of semantic similarity for Phrase-based Machine Translation},
booktitle = {Proceedings of the 3rd Workshop on Continuous Vector Space Models and their Compositionality},
month = {July},
address = {Beijing, China},
publisher = {Association for Computational Linguistics},
pages = {41--47},
url = {http://www.aclweb.org/anthology/W15-4005},
year = 2015
}
Sachdeva and Sharma (2015)
Zhao, Kai and Hassan, Hany and Auli, Michael (2015): Learning Translation Models from Monolingual Continuous Representations, Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
@InProceedings{zhao-hassan-auli:2015:NAACL-HLT,
author = {Zhao, Kai and Hassan, Hany and Auli, Michael},
title = {Learning Translation Models from Monolingual Continuous Representations},
booktitle = {Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies},
month = {May--June},
address = {Denver, Colorado},
publisher = {Association for Computational Linguistics},
pages = {1527--1536},
url = {http://www.aclweb.org/anthology/N15-1176},
year = 2015
}
Zhao et al. (2015)
Martinez Garcia, Eva and Tiedemann, Jörg and España-Bonet, Cristina and Màrquez, Lluís (2014): Word's Vector Representations meet Machine Translation, Proceedings of SSST-8, Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation
@InProceedings{martinezgarcia-EtAl:2014:SSST-8,
author = {Martinez Garcia, Eva and Tiedemann, J\"{o}rg and Espa\~{n}a-Bonet, Cristina and M\`{a}rquez, Llu\'{i}s},
title = {Word's Vector Representations meet Machine Translation},
booktitle = {Proceedings of SSST-8, Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation},
month = {October},
address = {Doha, Qatar},
publisher = {Association for Computational Linguistics},
pages = {132--134},
url = {http://www.aclweb.org/anthology/W14-4015},
year = 2014
}
Martinez Garcia et al. (2014)
Thanh-Le Ha and Jan Niehues and Alex Waibel (2014): Lexical Translation Model Using A Deep Neural Network Architecture, Proceedings of the International Workshop on Spoken Language Translation (IWSLT); mentioned in Lexical Choice, Context Features and Embeddings
@inproceedings{Ha:iwslt:2014,
author = {Thanh-Le Ha and Jan Niehues and Alex Waibel},
title = {Lexical Translation Model Using A Deep Neural Network Architecture},
pages = {223--229},
booktitle = {Proceedings of the International Workshop on Spoken Language Translation (IWSLT)},
year = 2014
}
Ha et al. (2014)
Gao, Jianfeng and He, Xiaodong and Yih, Wen-tau and Deng, Li (2014): Learning Continuous Phrase Representations for Translation Modeling, Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
@InProceedings{gao-EtAl:2014:P14-1,
author = {Gao, Jianfeng and He, Xiaodong and Yih, Wen-tau and Deng, Li},
title = {Learning Continuous Phrase Representations for Translation Modeling},
booktitle = {Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
month = {June},
address = {Baltimore, Maryland},
publisher = {Association for Computational Linguistics},
pages = {699--709},
url = {http://www.aclweb.org/anthology/P14-1066},
year = 2014
}
Gao et al. (2014)
Cho, Kyunghyun and van Merrienboer, Bart and Gulcehre, Caglar and Bahdanau, Dzmitry and Bougares, Fethi and Schwenk, Holger and Bengio, Yoshua (2014): Learning Phrase Representations using RNN Encoder--Decoder for Statistical Machine Translation, Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)
@InProceedings{cho-EtAl:2014:EMNLP2014,
author = {Cho, Kyunghyun and van Merrienboer, Bart and Gulcehre, Caglar and Bahdanau, Dzmitry and Bougares, Fethi and Schwenk, Holger and Bengio, Yoshua},
title = {Learning Phrase Representations using RNN Encoder--Decoder for Statistical Machine Translation},
booktitle = {Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
month = {October},
address = {Doha, Qatar},
publisher = {Association for Computational Linguistics},
pages = {1724--1734},
url = {http://www.aclweb.org/anthology/D14-1179},
year = 2014
}
Cho et al. (2014)
Levinboim, Tomer and Chiang, David (2015): Supervised Phrase Table Triangulation with Neural Word Embeddings for Low-Resource Languages, Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing
@InProceedings{levinboim-chiang:2015:EMNLP,
author = {Levinboim, Tomer and Chiang, David},
title = {Supervised Phrase Table Triangulation with Neural Word Embeddings for Low-Resource Languages},
booktitle = {Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing},
month = {September},
address = {Lisbon, Portugal},
publisher = {Association for Computational Linguistics},
pages = {1079--1083},
url = {http://aclweb.org/anthology/D15-1126},
year = 2015
}
Levinboim and Chiang (2015)
Alkhouli, Tamer and Guta, Andreas and Ney, Hermann (2014): Vector Space Models for Phrase-based Machine Translation, Proceedings of SSST-8, Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation
@InProceedings{alkhouli-guta-ney:2014:SSST-8,
author = {Alkhouli, Tamer and Guta, Andreas and Ney, Hermann},
title = {Vector Space Models for Phrase-based Machine Translation},
booktitle = {Proceedings of SSST-8, Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation},
month = {October},
address = {Doha, Qatar},
publisher = {Association for Computational Linguistics},
pages = {1--10},
url = {http://www.aclweb.org/anthology/W14-4001},
year = 2014
}
Alkhouli et al. (2014)

MT Research Survey Wiki

A Comprehensive Survey of Neural and Statistical Machine Translation Research Publications
