Monolingual Data
Monolingual data is much more plentiful than parallel data and has proven valuable for informing models of fluency and the representation of words.
Monolingual data is the main subject of 33 publications, 22 of which are discussed here.
Publications
Backtranslation
Sennrich, Rico and Haddow, Barry and Birch, Alexandra (2016):
Improving Neural Machine Translation Models with Monolingual Data, Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
@InProceedings{sennrich-haddow-birch:2016:P16-11,
author = {Sennrich, Rico and Haddow, Barry and Birch, Alexandra},
title = {Improving Neural Machine Translation Models with Monolingual Data},
booktitle = {Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
month = {August},
address = {Berlin, Germany},
publisher = {Association for Computational Linguistics},
pages = {86--96},
url = {http://www.aclweb.org/anthology/P16-1009},
year = 2016
}
Sennrich et al. (2016) back-translate the monolingual data into the input language and use the obtained synthetic parallel corpus as additional training data.
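To make the data flow concrete, here is a minimal sketch in Python; translate_t2s is a hypothetical stand-in for any trained target-to-source model, not the authors' implementation.

def build_synthetic_corpus(mono_target, translate_t2s):
    """Pair each monolingual target sentence with its machine back-translation."""
    synthetic = []
    for tgt in mono_target:
        src = translate_t2s(tgt)      # synthetic source, produced by the reverse model
        synthetic.append((src, tgt))  # the target side remains genuine human text
    return synthetic

# The synthetic pairs are simply mixed with the real parallel data:
# training_data = real_parallel + build_synthetic_corpus(mono_target, translate_t2s)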
Hoang, Vu Cong Duy and Koehn, Philipp and Haffari, Gholamreza and Cohn, Trevor (2018):
Iterative Back-Translation for Neural Machine Translation, Proceedings of the 2nd Workshop on Neural Machine Translation and Generation
@InProceedings{W18-2703,
author = {Hoang, Vu Cong Duy and Koehn, Philipp and Haffari, Gholamreza and Cohn, Trevor},
title = {Iterative Back-Translation for Neural Machine Translation},
booktitle = {Proceedings of the 2nd Workshop on Neural Machine Translation and Generation},
publisher = {Association for Computational Linguistics},
pages = {18--24},
location = {Melbourne, Australia},
url = {http://aclweb.org/anthology/W18-2703},
year = 2018
}
Hoang et al. (2018) show that the quality of the machine translation system matters and can be improved by iterative back-translation.
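A hedged sketch of that loop, with train and the models' translate method standing in for a real NMT toolkit:

def iterative_backtranslation(parallel, mono_src, mono_tgt, train, rounds=3):
    flipped = [(t, s) for s, t in parallel]
    model_fwd = train(parallel)  # source -> target
    model_bwd = train(flipped)   # target -> source
    for _ in range(rounds):
        # Back-translate monolingual data with the current models ...
        syn_fwd = [(model_bwd.translate(t), t) for t in mono_tgt]
        syn_bwd = [(model_fwd.translate(s), s) for s in mono_src]
        # ... and retrain each direction on real plus fresh synthetic data,
        # so better models produce better synthetic data in the next round.
        model_fwd = train(parallel + syn_fwd)
        model_bwd = train(flipped + syn_bwd)
    return model_fwd, model_bwd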
Burlot, Franck and Yvon, François (2018):
Using Monolingual Data in Neural Machine Translation: a Systematic Study, Proceedings of the Third Conference on Machine Translation: Research Papers
@inproceedings{W18-6315,
author = {Burlot, Franck and Yvon, Fran{\c{c}}ois},
title = {Using Monolingual Data in Neural Machine Translation: a Systematic Study},
booktitle = {Proceedings of the Third Conference on Machine Translation: Research Papers},
month = {oct},
address = {Belgium, Brussels},
publisher = {Association for Computational Linguistics},
url = {https://www.aclweb.org/anthology/W18-6315},
pages = {144--155},
year = 2018
}
Burlot and Yvon (2018) also show that backtranslation quality matters and carry out additional analysis.
Edunov, Sergey and Ott, Myle and Auli, Michael and Grangier, David (2018):
Understanding Back-Translation at Scale, Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
@inproceedings{D18-1045,
author = {Edunov, Sergey and Ott, Myle and Auli, Michael and Grangier, David},
title = {Understanding Back-Translation at Scale},
booktitle = {Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing},
address = {Brussels, Belgium},
publisher = {Association for Computational Linguistics},
url = {https://www.aclweb.org/anthology/D18-1045},
pages = {489--500},
year = 2018
}
Edunov et al. (2018) show better results with Monte Carlo search to generate the backtranslation data, i.e., randomly selecting word translations based on the predicted probability distribution.
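The following sketch contrasts the usual greedy choice with such sampling; it assumes only PyTorch, and the vocabulary-sized scores are made up for illustration.

import torch

def pick_next_word(logits: torch.Tensor, sampling: bool = True) -> int:
    probs = torch.softmax(logits, dim=-1)
    if sampling:
        # Monte Carlo choice: each word is drawn with its model probability,
        # so rarer words appear in the synthetic data at their natural rate.
        return int(torch.multinomial(probs, num_samples=1))
    # Greedy choice, as used by standard greedy or beam decoding.
    return int(torch.argmax(probs))

logits = torch.randn(32000)  # made-up scores over a 32k-word vocabulary
print(pick_next_word(logits))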
Imamura, Kenji and Fujita, Atsushi and Sumita, Eiichiro (2018):
Enhancement of Encoder and Attention Using Target Monolingual Corpora in Neural Machine Translation, Proceedings of the 2nd Workshop on Neural Machine Translation and Generation
@InProceedings{W18-2707,
author = {Imamura, Kenji and Fujita, Atsushi and Sumita, Eiichiro},
title = {Enhancement of Encoder and Attention Using Target Monolingual Corpora in Neural Machine Translation},
booktitle = {Proceedings of the 2nd Workshop on Neural Machine Translation and Generation},
publisher = {Association for Computational Linguistics},
pages = {55--63},
location = {Melbourne, Australia},
url = {http://aclweb.org/anthology/W18-2707},
year = 2018
}
Imamura et al. (2018);
Imamura, Kenji and Sumita, Eiichiro (2018):
NICT Self-Training Approach to Neural Machine Translation at NMT-2018, Proceedings of the 2nd Workshop on Neural Machine Translation and Generation
@InProceedings{W18-2713,
author = {Imamura, Kenji and Sumita, Eiichiro},
title = {NICT Self-Training Approach to Neural Machine Translation at NMT-2018},
booktitle = {Proceedings of the 2nd Workshop on Neural Machine Translation and Generation},
publisher = {Association for Computational Linguistics},
pages = {110--115},
location = {Melbourne, Australia},
url = {http://aclweb.org/anthology/W18-2713},
year = 2018
}
Imamura and Sumita (2018) also confirm that better translation quality can be obtained when backtranslating with such sampling and offer some refinements.
Caswell, Isaac and Chelba, Ciprian and Grangier, David (2019):
Tagged Back-Translation, Proceedings of the Fourth Conference on Machine Translation
@InProceedings{caswell-chelba-grangier:2019:WMT,
author = {Caswell, Isaac and Chelba, Ciprian and Grangier, David},
title = {Tagged Back-Translation},
booktitle = {Proceedings of the Fourth Conference on Machine Translation},
month = {August},
address = {Florence, Italy},
publisher = {Association for Computational Linguistics},
pages = {53--63},
url = {http://www.aclweb.org/anthology/W19-5206},
year = 2019
}
Caswell et al. (2019) argue that the noise introduced by this type of stochastic search flags to the model that the data is backtranslated; the same effect can be achieved more simply with an explicit special token marking the synthetic data.
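A minimal sketch of the tagging step; the token string <BT> is an illustrative choice, not a fixed convention.

BT_TAG = "<BT>"  # reserved token, added to the vocabulary

def tag_synthetic(pairs):
    """Prefix the synthetic source side so the model can tell it apart."""
    return [(f"{BT_TAG} {src}", tgt) for src, tgt in pairs]

print(tag_synthetic([("das ist ein test", "this is a test")]))
# [('<BT> das ist ein test', 'this is a test')]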
Currey, Anna and Miceli Barone, Antonio Valerio and Heafield, Kenneth (2017):
Copied Monolingual Data Improves Low-Resource Neural Machine Translation, Proceedings of the Second Conference on Machine Translation, Volume 1: Research Paper
@InProceedings{currey-micelibarone-heafield:2017:WMT,
author = {Currey, Anna and Miceli Barone, Antonio Valerio and Heafield, Kenneth},
title = {Copied Monolingual Data Improves Low-Resource Neural Machine Translation},
booktitle = {Proceedings of the Second Conference on Machine Translation, Volume 1: Research Paper},
month = {September},
address = {Copenhagen, Denmark},
publisher = {Association for Computational Linguistics},
pages = {148--156},
url = {http://www.aclweb.org/anthology/W17-4715},
year = 2017
}
Currey et al. (2017) show that in low-resource conditions, simply copying target-side data to the source side also generates beneficial training data.
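The copying setup amounts to a one-line transformation; this is a sketch of the idea, not the authors' code.

def copied_pairs(mono_target):
    # Each target sentence doubles as its own "source" sentence.
    return [(sentence, sentence) for sentence in mono_target]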
Fadaee, Marzieh and Monz, Christof (2018):
Back-Translation Sampling by Targeting Difficult Words in Neural Machine Translation, Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
@inproceedings{D18-1040,
author = {Fadaee, Marzieh and Monz, Christof},
title = {Back-Translation Sampling by Targeting Difficult Words in Neural Machine Translation},
booktitle = {Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing},
address = {Brussels, Belgium},
publisher = {Association for Computational Linguistics},
url = {https://www.aclweb.org/anthology/D18-1040},
pages = {436--446},
year = 2018
}
Fadaee and Monz (2018) see gains with synthetic data generated by forward translation (also called self-training). They also report gains from subsampling the backtranslation data to favor rare or difficult-to-generate words (words with high loss during training).
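A hedged sketch of such targeted subsampling, where difficult_words stands in for a set derived from training losses or frequency statistics:

def subsample_by_difficulty(mono_target, difficult_words, k):
    """Keep the k sentences with the most occurrences of difficult words."""
    def score(sentence):
        return sum(word in difficult_words for word in sentence.split())
    return sorted(mono_target, key=score, reverse=True)[:k]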
Dual Learning
He, Di and Xia, Yingce and Qin, Tao and Wang, Liwei and Yu, Nenghai and Liu, Tieyan and Ma, Wei-Ying (2016):
Dual Learning for Machine Translation, Advances in Neural Information Processing Systems 29
@incollection{NIPS2016-6469,
author = {He, Di and Xia, Yingce and Qin, Tao and Wang, Liwei and Yu, Nenghai and Liu, Tieyan and Ma, Wei-Ying},
title = {Dual Learning for Machine Translation},
booktitle = {Advances in Neural Information Processing Systems 29},
editor = {D. D. Lee and M. Sugiyama and U. V. Luxburg and I. Guyon and R. Garnett},
pages = {820--828},
publisher = {Curran Associates, Inc.},
url = {http://papers.nips.cc/paper/6469-dual-learning-for-machine-translation.pdf},
year = 2016
}
He et al. (2016) use monolingual data in a dual learning setup. Machine translation engines are trained in both directions, and in addition to regular model training from parallel data, monolingual data is translated in a round trip (e to f to e) and scored with a language model for language f and the reconstruction match back to e as cost functions to drive gradient descent updates to the model.
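The round-trip reward can be sketched as follows; all model objects and their methods (translate, log_prob, log_prob_of) are hypothetical placeholders, and real implementations update both directions by policy gradient on this signal.

def roundtrip_reward(sent_e, model_ef, model_fe, lm_f, alpha=0.5):
    sent_f = model_ef.translate(sent_e)  # e -> f translation
    fluency = lm_f.log_prob(sent_f)      # how natural is the f-side output?
    # How well does the reverse model recover the original sentence?
    reconstruction = model_fe.log_prob_of(sent_e, given=sent_f)
    # Interpolated reward driving gradient updates to both models.
    return alpha * fluency + (1.0 - alpha) * reconstruction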
Zhaopeng Tu and Yang Liu and Lifeng Shang and Xiaohua Liu and Hang Li (2017):
Neural Machine Translation with Reconstruction, Proceedings of the 31st AAAI Conference on Artificial Intelligence
@inproceedings{Tu-EtAl:2017:AAAI,
author = {Zhaopeng Tu and Yang Liu and Lifeng Shang and Xiaohua Liu and Hang Li},
title = {Neural Machine Translation with Reconstruction},
booktitle = {Proceedings of the 31st AAAI Conference on Artificial Intelligence},
url = {http://arxiv.org/abs/1611.01874},
year = 2017
}
Tu et al. (2017) augment the translation model with a reconstruction step. The generated output is translated back into the input language, and the training objective is extended to include not only the likelihood of the target sentence but also the likelihood of the reconstructed input sentence.
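In equation form (a paraphrase, with s the decoder hidden states of the generated output, θ the translation parameters, γ the reconstructor parameters, and λ an interpolation weight), the extended objective for a sentence pair (x, y) reads roughly:

\mathcal{L}(\theta, \gamma) = \log P(y \mid x; \theta) + \lambda \log P(x \mid s; \theta, \gamma)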
Niu, Xing and Denkowski, Michael and Carpuat, Marine (2018):
Bi-Directional Neural Machine Translation with Synthetic Parallel Data, Proceedings of the 2nd Workshop on Neural Machine Translation and Generation
@InProceedings{W18-2710,
author = {Niu, Xing and Denkowski, Michael and Carpuat, Marine},
title = {Bi-Directional Neural Machine Translation with Synthetic Parallel Data},
booktitle = {Proceedings of the 2nd Workshop on Neural Machine Translation and Generation},
publisher = {Association for Computational Linguistics},
pages = {84--91},
location = {Melbourne, Australia},
url = {http://aclweb.org/anthology/W18-2710},
year = 2018
}
Niu et al. (2018) simultaneously train a model in both translation directions, with the identity of the source language indicated by a marker token.
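A minimal sketch of the corresponding data preparation for an English-French model; the tag strings <2en> and <2fr> are illustrative assumptions.

def bidirectional_pairs(parallel_en_fr):
    data = []
    for en, fr in parallel_en_fr:
        data.append((f"<2fr> {en}", fr))  # ask the shared model for French
        data.append((f"<2en> {fr}", en))  # ask the shared model for English
    return data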
Niu, Xing and Xu, Weijia and Carpuat, Marine (2019):
Bi-Directional Differentiable Input Reconstruction for Low-Resource Neural Machine Translation, Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)
@inproceedings{niu-etal-2019-bi,
author = {Niu, Xing and Xu, Weijia and Carpuat, Marine},
title = {Bi-Directional Differentiable Input Reconstruction for Low-Resource Neural Machine Translation},
booktitle = {Proceedings of the 2019 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)},
month = {jun},
address = {Minneapolis, Minnesota},
publisher = {Association for Computational Linguistics},
url = {https://www.aclweb.org/anthology/N19-1043},
pages = {442--448},
year = 2019
}
Niu et al. (2019) extend this work to round-trip translation training on monolingual data, allowing the forward translation and the reconstruction step to operate on the same model. They use the Gumbel softmax to make the round trip differentiable.
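PyTorch's built-in gumbel_softmax illustrates why this makes the round trip differentiable: the forward pass emits (near) one-hot word choices while gradients flow through the soft relaxation. The toy dimensions below are illustrative.

import torch
import torch.nn.functional as F

logits = torch.randn(5, 100, requires_grad=True)  # 5 output positions, 100-word vocabulary
soft_words = F.gumbel_softmax(logits, tau=0.5, hard=True)  # one-hot forward, soft backward
loss = soft_words.sum()   # stand-in for the reconstruction loss
loss.backward()           # gradients reach the forward-translation logits
print(logits.grad.shape)  # torch.Size([5, 100])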
Unsupervised Machine Translation
The idea of backtranslation is also crucial for the ambitious goal of unsupervised machine translation, i.e., the training of machine translation systems with monolingual data only. These methods typically start with multilingual word embeddings, which may also be induced from monolingual data. Given such a word translation model,
Guillaume Lample and Alexis Conneau and Ludovic Denoyer and Marc'Aurelio Ranzato (2018):
Unsupervised Machine Translation Using Monolingual Corpora Only, International Conference on Learning Representations
@inproceedings{lample2018unsupervised,
author = {Guillaume Lample and Alexis Conneau and Ludovic Denoyer and Marc'Aurelio Ranzato},
title = {Unsupervised Machine Translation Using Monolingual Corpora Only},
booktitle = {International Conference on Learning Representations},
url = {https://openreview.net/forum?id=rkYTTf-AZ},
year = 2018
}
Lample et al. (2018) propose to translate sentences from one language into the other with a simple word-by-word translation model, using a shared encoder and decoder for both languages involved. They define three objectives in their setup: the ability to reconstruct a source sentence from its intermediate representation, even with added noise (randomly dropped words); the ability to reconstruct a source sentence from its translation into the target language; and an adversarial component that attempts to classify the identity of the language from the intermediate representation of a sentence in either language.
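The noise model of the first objective is easy to make concrete; in this sketch the drop probability and shuffle window are illustrative values in the spirit of the paper, not its exact settings.

import random

def add_noise(sentence, p_drop=0.1, k=3):
    words = [w for w in sentence.split() if random.random() > p_drop]
    # Local shuffle: jitter each position by up to k before re-sorting, so
    # words stay near their original place but exact copying is impossible.
    keys = [i + random.uniform(0, k) for i in range(len(words))]
    shuffled = [w for _, w in sorted(zip(keys, words), key=lambda kv: kv[0])]
    return " ".join(shuffled)

print(add_noise("the quick brown fox jumps over the lazy dog"))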
Mikel Artetxe and Gorka Labaka and Eneko Agirre and Kyunghyun Cho (2018):
Unsupervised Neural Machine Translation, International Conference on Learning Representations
@inproceedings{artetxe2018unsupervised,
author = {Mikel Artetxe and Gorka Labaka and Eneko Agirre and Kyunghyun Cho},
title = {Unsupervised Neural Machine Translation},
booktitle = {International Conference on Learning Representations},
url = {https://openreview.net/forum?id=Sy2ogebAW},
year = 2018
}
Artetxe et al. (2018) use a similar setup, with a shared encoder and language-specific decoder, relying on the idea of a denoising auto-encoder (just like the first objective above), and the ability to reconstruct the source sentence from a translation into the target language.
Sun, Haipeng and Wang, Rui and Chen, Kehai and Utiyama, Masao and Sumita, Eiichiro and Zhao, Tiejun (2019):
Unsupervised Bilingual Word Embedding Agreement for Unsupervised Neural Machine Translation, Proceedings of the 57th Conference of the Association for Computational Linguistics
@inproceedings{sun-etal-2019-unsupervised,
author = {Sun, Haipeng and Wang, Rui and Chen, Kehai and Utiyama, Masao and Sumita, Eiichiro and Zhao, Tiejun},
title = {Unsupervised Bilingual Word Embedding Agreement for Unsupervised Neural Machine Translation},
booktitle = {Proceedings of the 57th Conference of the Association for Computational Linguistics},
month = {jul},
address = {Florence, Italy},
publisher = {Association for Computational Linguistics},
url = {https://www.aclweb.org/anthology/P19-1119},
pages = {1235--1245},
year = 2019
}
Sun et al. (2019) note that the bilingual word embeddings deteriorate during the training of the neural machine translation model. They therefore add the training objective for inducing the bilingual word embeddings to the objective function of neural machine translation training.
Yang, Zhen and Chen, Wei and Wang, Feng and Xu, Bo (2018):
Unsupervised Neural Machine Translation with Weight Sharing, Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
@InProceedings{P18-1005,
author = {Yang, Zhen and Chen, Wei and Wang, Feng and Xu, Bo},
title = {Unsupervised Neural Machine Translation with Weight Sharing},
booktitle = {Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
publisher = {Association for Computational Linguistics},
pages = {46--55},
location = {Melbourne, Australia},
url = {http://aclweb.org/anthology/P18-1005},
year = 2018
}
Yang et al. (2018) use language-specific encoders with some shared weights in a similar setup.
Artetxe, Mikel and Labaka, Gorka and Agirre, Eneko (2018):
Unsupervised Statistical Machine Translation, Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
@inproceedings{D18-1399,
author = {Artetxe, Mikel and Labaka, Gorka and Agirre, Eneko},
title = {Unsupervised Statistical Machine Translation},
booktitle = {Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing},
address = {Brussels, Belgium},
publisher = {Association for Computational Linguistics},
url = {https://www.aclweb.org/anthology/D18-1399},
pages = {3632--3642},
year = 2018
}
Artetxe et al. (2018) show better results when inducing phrase translations from phrase embeddings and using them in a statistical phrase-based machine translation model, which includes an explicit language model. They refine their model with synthetic data generated by iterative backtranslation.
Lample, Guillaume and Ott, Myle and Conneau, Alexis and Denoyer, Ludovic and Ranzato, Marc'Aurelio (2018):
Phrase-Based & Neural Unsupervised Machine Translation, Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
@inproceedings{lample-etal-2018-phrase,
author = {Lample, Guillaume and Ott, Myle and Conneau, Alexis and Denoyer, Ludovic and Ranzato, Marc{'}Aurelio},
title = {Phrase-Based {\&} Neural Unsupervised Machine Translation},
booktitle = {Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing},
month = {oct-nov},
address = {Brussels, Belgium},
publisher = {Association for Computational Linguistics},
url = {https://www.aclweb.org/anthology/D18-1549},
pages = {5039--5049},
year = 2018
}
Lample et al. (2018) combine unsupervised statistical and neural machine translation models. Their phrase-based model is initialized with word translations obtained from multilingual word embeddings and then iteratively refined into phrase translations.
Ren, Shuo and Zhang, Zhirui and Liu, Shujie and Zhou, Ming and Ma, Shuai (2019):
Unsupervised Neural Machine Translation with SMT as Posterior Regularization, Proceedings of the AAAI Conference on Artificial Intelligence
@inproceedings{Ren_Zhang_Liu_Zhou_Ma_2019,
author = {Ren, Shuo and Zhang, Zhirui and Liu, Shujie and Zhou, Ming and Ma, Shuai},
title = {Unsupervised Neural Machine Translation with SMT as Posterior Regularization},
url = {https://aaai.org/ojs/index.php/AAAI/article/view/3791},
doi = {10.1609/aaai.v33i01.3301241},
booktitle = {Proceedings of the AAAI Conference on Artificial Intelligence},
month = {Jul.},
pages = {241--248},
year = 2019
}
Ren et al. (2019) more closely tie together training of unsupervised statistical and neural machine translation systems by using the statistical machine translation model as a regularizer for the neural model training.
Artetxe, Mikel and Labaka, Gorka and Agirre, Eneko (2019):
Bilingual Lexicon Induction through Unsupervised Machine Translation, Proceedings of the 57th Conference of the Association for Computational Linguistics
@inproceedings{artetxe-etal-2019-bilingual,
author = {Artetxe, Mikel and Labaka, Gorka and Agirre, Eneko},
title = {Bilingual Lexicon Induction through Unsupervised Machine Translation},
booktitle = {Proceedings of the 57th Conference of the Association for Computational Linguistics},
month = {jul},
address = {Florence, Italy},
publisher = {Association for Computational Linguistics},
url = {https://www.aclweb.org/anthology/P19-1494},
pages = {5002--5007},
year = 2019
}
Artetxe et al. (2019) improve their unsupervised statistical machine translation model with a feature that favors similarly spelled translations and an unsupervised method to tune the weights of the statistical components. Circling back to bilingual lexicon induction,
Artetxe, Mikel and Labaka, Gorka and Agirre, Eneko (2019):
An Effective Approach to Unsupervised Machine Translation, Proceedings of the 57th Conference of the Association for Computational Linguistics
@inproceedings{artetxe-etal-2019-effective,
author = {Artetxe, Mikel and Labaka, Gorka and Agirre, Eneko},
title = {An Effective Approach to Unsupervised Machine Translation},
booktitle = {Proceedings of the 57th Conference of the Association for Computational Linguistics},
month = {jul},
address = {Florence, Italy},
publisher = {Association for Computational Linguistics},
url = {https://www.aclweb.org/anthology/P19-1019},
pages = {194--203},
year = 2019
}
Artetxe et al. (2019) use such an unsupervised machine translation model to synthesize a parallel corpus by translating monolingual data, process it with word alignment methods, and extract a bilingual dictionary using maximum likelihood estimation.
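A minimal sketch of the final extraction step, assuming the word alignment links over the synthetic corpus are already computed; the maximum likelihood choice is simply each source word's most frequently aligned target word.

from collections import Counter, defaultdict

def extract_dictionary(alignment_links):
    """alignment_links: iterable of (source_word, target_word) pairs."""
    counts = defaultdict(Counter)
    for src, tgt in alignment_links:
        counts[src][tgt] += 1
    # Maximum likelihood estimate: pick the most frequently aligned word.
    return {src: c.most_common(1)[0][0] for src, c in counts.items()}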
Benchmarks
Discussion
Related Topics
New Publications
Unsupervised Machine Translation
Guillaume Lample and Alexis Conneau (2019):
Cross-lingual Language Model Pretraining, CoRR
@article{DBLP:journals/corr/abs-1901-07291,
author = {Guillaume Lample and Alexis Conneau},
title = {Cross-lingual Language Model Pretraining},
journal = {CoRR},
volume = {abs/1901.07291},
url = {http://arxiv.org/abs/1901.07291},
archiveprefix = {arXiv},
eprint = {1901.07291},
timestamp = {Fri, 01 Feb 2019 13:39:59 +0100},
biburl = {https://dblp.org/rec/bib/journals/corr/abs-1901-07291},
bibsource = {dblp computer science bibliography, https://dblp.org},
year = 2019
}
Lample and Conneau (2019)
Backtranslation
Graça, Miguel and Kim, Yunsu and Schamper, Julian and Khadivi, Shahram and Ney, Hermann (2019):
Generalizing Back-Translation in Neural Machine Translation, Proceedings of the Fourth Conference on Machine Translation
@InProceedings{graa-EtAl:2019:WMT,
author = {Gra{\c{c}}a, Miguel and Kim, Yunsu and Schamper, Julian and Khadivi, Shahram and Ney, Hermann},
title = {Generalizing Back-Translation in Neural Machine Translation},
booktitle = {Proceedings of the Fourth Conference on Machine Translation},
month = {August},
address = {Florence, Italy},
publisher = {Association for Computational Linguistics},
pages = {45--52},
url = {http://www.aclweb.org/anthology/W19-5205},
year = 2019
}
Graça et al. (2019)
Miguel Domingo and Francisco Casacuberta (2018):
A Machine Translation Approach for Modernizing Historical Documents Using Back Translation, Proceedings of the International Workshop on Spoken Language Translation (IWSLT)
@inproceedings{iwslt18-Historical-Domingo,
author = {Miguel Domingo and Francisco Casacuberta},
title = {A Machine Translation Approach for Modernizing Historical Documents Using Back Translation},
booktitle = {Proceedings of the International Workshop on Spoken Language Translation (IWSLT)},
year = 2018
}
Domingo and Casacuberta (2018)
Prabhumoye, Shrimai and Tsvetkov, Yulia and Salakhutdinov, Ruslan and Black, Alan W (2018):
Style Transfer Through Back-Translation, Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
@InProceedings{P18-1080,
author = {Prabhumoye, Shrimai and Tsvetkov, Yulia and Salakhutdinov, Ruslan and Black, Alan W},
title = {Style Transfer Through Back-Translation},
booktitle = {Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
publisher = {Association for Computational Linguistics},
pages = {866--876},
location = {Melbourne, Australia},
url = {http://aclweb.org/anthology/P18-1080},
year = 2018
}
Prabhumoye et al. (2018)
Other
Luo, Jiaming and Cao, Yuan and Barzilay, Regina (2019):
Neural Decipherment via Minimum-Cost Flow: From Ugaritic to Linear B, Proceedings of the 57th Conference of the Association for Computational Linguistics
@inproceedings{luo-etal-2019-neural,
author = {Luo, Jiaming and Cao, Yuan and Barzilay, Regina},
title = {Neural Decipherment via Minimum-Cost Flow: From Ugaritic to Linear B},
booktitle = {Proceedings of the 57th Conference of the Association for Computational Linguistics},
month = {jul},
address = {Florence, Italy},
publisher = {Association for Computational Linguistics},
url = {https://www.aclweb.org/anthology/P19-1303},
pages = {3146--3155},
year = 2019
}
Luo et al. (2019)
Pourdamghani, Nima and Aldarrab, Nada and Ghazvininejad, Marjan and Knight, Kevin and May, Jonathan (2019):
Translating Translationese: A Two-Step Approach to Unsupervised Machine Translation, Proceedings of the 57th Conference of the Association for Computational Linguistics
@inproceedings{pourdamghani-etal-2019-translating,
author = {Pourdamghani, Nima and Aldarrab, Nada and Ghazvininejad, Marjan and Knight, Kevin and May, Jonathan},
title = {Translating Translationese: A Two-Step Approach to Unsupervised Machine Translation},
booktitle = {Proceedings of the 57th Conference of the Association for Computational Linguistics},
month = {jul},
address = {Florence, Italy},
publisher = {Association for Computational Linguistics},
url = {https://www.aclweb.org/anthology/P19-1293},
pages = {3057--3062},
year = 2019
}
Pourdamghani et al. (2019)
Xia, Mengzhou and Kong, Xiang and Anastasopoulos, Antonios and Neubig, Graham (2019):
Generalized Data Augmentation for Low-Resource Translation, Proceedings of the 57th Conference of the Association for Computational Linguistics
@inproceedings{xia-etal-2019-generalized,
author = {Xia, Mengzhou and Kong, Xiang and Anastasopoulos, Antonios and Neubig, Graham},
title = {Generalized Data Augmentation for Low-Resource Translation},
booktitle = {Proceedings of the 57th Conference of the Association for Computational Linguistics},
month = {jul},
address = {Florence, Italy},
publisher = {Association for Computational Linguistics},
url = {https://www.aclweb.org/anthology/P19-1579},
pages = {5786--5796},
year = 2019
}
Xia et al. (2019)
Marie, Benjamin and Fujita, Atsushi (2019):
Unsupervised Extraction of Partial Translations for Neural Machine Translation, Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)
@inproceedings{marie-fujita-2019-unsupervised,
author = {Marie, Benjamin and Fujita, Atsushi},
title = {Unsupervised Extraction of Partial Translations for Neural Machine Translation},
booktitle = {Proceedings of the 2019 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)},
month = {jun},
address = {Minneapolis, Minnesota},
publisher = {Association for Computational Linguistics},
url = {https://www.aclweb.org/anthology/N19-1384},
pages = {3834--3844},
year = 2019
}
Marie and Fujita (2019)
Wu et al. (2019)
Wang, Yining and Zhao, Yang and Zhang, Jiajun and Zong, Chengqing and Xue, Zhengshan (2017):
Towards Neural Machine Translation with Partially Aligned Corpora, Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
@InProceedings{wang-EtAl:2017:I17-11,
author = {Wang, Yining and Zhao, Yang and Zhang, Jiajun and Zong, Chengqing and Xue, Zhengshan},
title = {Towards Neural Machine Translation with Partially Aligned Corpora},
booktitle = {Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)},
month = {November},
address = {Taipei, Taiwan},
publisher = {Asian Federation of Natural Language Processing},
pages = {384--393},
url = {http://www.aclweb.org/anthology/I17-1039},
year = 2017
}
Wang et al. (2017)
Shen, Tianxiao and Lei, Tao and Barzilay, Regina and Jaakkola, Tommi (2017):
Style Transfer from Non-Parallel Text by Cross-Alignment, Advances in Neural Information Processing Systems 30
@incollection{NIPS2017-7259,
author = {Shen, Tianxiao and Lei, Tao and Barzilay, Regina and Jaakkola, Tommi},
title = {Style Transfer from Non-Parallel Text by Cross-Alignment},
booktitle = {Advances in Neural Information Processing Systems 30},
editor = {I. Guyon and U. V. Luxburg and S. Bengio and H. Wallach and R. Fergus and S. Vishwanathan and R. Garnett},
pages = {6830--6841},
publisher = {Curran Associates, Inc.},
url = {http://papers.nips.cc/paper/7259-style-transfer-from-non-parallel-text-by-cross-alignment.pdf},
year = 2017
}
Shen et al. (2017)