Adaptation
Domain adaptation has been widely studied in traditional statistical machine translation. These techniques have been carried over, and new techniques developed, to adapt neural machine translation models to a specific domain or other stylistic aspects.
Adaptation is the main subject of 72 publications. 41 are discussed here.
Publications
There is often a domain mismatch between the bulk (or even all) of the training data for a translation system and its test data during deployment. There is a rich literature on this topic in traditional statistical machine translation.
Fine Tuning:
A common approach for neural models is to first train on all available training data, and then run a few iterations on in-domain data only
Minh-Thang Luong and Christopher Manning (2015):
Stanford neural machine translation systems for spoken language domains, Proceedings of the International Workshop on Spoken Language Translation (IWSLT)
@inproceedings{IWSLT-2015-Luong,
author = {Minh-Thang Luong and Christopher Manning},
title = {Stanford neural machine translation systems for spoken language domains},
pages = {76-79},
booktitle = {Proceedings of the International Workshop on Spoken Language Translation (IWSLT)},
location = {Da Nang, Vietnam},
url = {http://www.mt-archive.info/15/IWSLT-2015-luong.pdf},
month = {December},
year = 2015
}
(Luong and Manning, 2015), as already pioneered in neural language model adaptation
Ter-Sarkisov, Alex and Schwenk, Holger and Bougares, Fethi and Barrault, Loïc (2015):
Incremental Adaptation Strategies for Neural Network Language Models, Proceedings of the 3rd Workshop on Continuous Vector Space Models and their Compositionality

@InProceedings{tersarkisov-EtAl:2015:CVSC,
author = {Ter-Sarkisov, Alex and Schwenk, Holger and Bougares, Fethi and Barrault, Lo\"{i}c},
title = {Incremental Adaptation Strategies for Neural Network Language Models},
booktitle = {Proceedings of the 3rd Workshop on Continuous Vector Space Models and their Compositionality},
month = {July},
address = {Beijing, China},
publisher = {Association for Computational Linguistics},
pages = {48--56},
url = {http://www.aclweb.org/anthology/W15-4006},
year = 2015
}
(Ter-Sarkisov et al., 2015).
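As an illustration, the two-stage recipe can be sketched with a toy one-parameter model standing in for a full neural translation model; the data, learning rates, and epoch counts below are hypothetical.

```python
# Hedged sketch of fine-tuning: train on all available data first,
# then continue training the same parameters on in-domain data only.

def sgd(w, data, lr, epochs):
    """Plain SGD on a toy model y = w * x with squared error."""
    for _ in range(epochs):
        for x, y in data:
            grad = 2 * (w * x - y) * x  # d/dw of (w*x - y)^2
            w -= lr * grad
    return w

# General-domain data is consistent with w = 1.0, in-domain data with w = 1.5.
general = [(1.0, 1.0), (2.0, 2.0), (0.5, 0.5)]
in_domain = [(1.0, 1.5), (2.0, 3.0)]

w = sgd(0.0, general, lr=0.1, epochs=50)          # stage 1: all training data
w_adapted = sgd(w, in_domain, lr=0.1, epochs=5)   # stage 2: few in-domain iterations

assert round(w, 2) == 1.0
assert round(w_adapted, 2) == 1.5
```

A few adaptation epochs suffice to move the model toward the in-domain distribution, which is exactly why the method is attractive for small in-domain sets.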
Christophe Servan and Josep Maria Crego and Jean Senellart (2016):
Domain specialization: a post-training domain adaptation for Neural Machine Translation, CoRR

@article{DBLP:journals/corr/ServanCS16,
author = {Christophe Servan and Josep Maria Crego and Jean Senellart},
title = {Domain specialization: a post-training domain adaptation for Neural Machine Translation},
journal = {CoRR},
volume = {abs/1612.06141},
url = {http://arxiv.org/abs/1612.06141},
timestamp = {Wed, 07 Jun 2017 14:41:57 +0200},
biburl = {http://dblp.uni-trier.de/rec/bib/journals/corr/ServanCS16},
bibsource = {dblp computer science bibliography, http://dblp.org},
year = 2016
}
Servan et al. (2016) demonstrate the effectiveness of this adaptation method with small in-domain sets consisting of as few as 500 sentence pairs.
Thierry Etchegoyhen and Anna Fernández Torné and Andoni Azpeitia and Eva Martínez Garcia and Anna Matamala (2018):
Evaluating Domain Adaptation for Machine Translation Across Scenarios, Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

@InProceedings{LREC2018-ETCHEGOYHEN18.568,
author = {Thierry Etchegoyhen and Anna Fern{\'a}ndez Torn{\'e} and Andoni Azpeitia and Eva Mart{\'i}nez Garcia and Anna Matamala},
title = {Evaluating Domain Adaptation for Machine Translation Across Scenarios},
booktitle = {Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)},
month = {May 7-12, 2018},
address = {Miyazaki, Japan},
publisher = {European Language Resources Association (ELRA)},
isbn = {979-10-95546-00-9},
language = {english},
year = 2018
}
Etchegoyhen et al. (2018) evaluate the quality of such domain-adapted systems using subjective assessments and post-editor productivity measures.
Chu, Chenhui and Dabre, Raj and Kurohashi, Sadao (2017):
An Empirical Comparison of Domain Adaptation Methods for Neural Machine Translation, Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

@InProceedings{chu-dabre-kurohashi:2017:Short,
author = {Chu, Chenhui and Dabre, Raj and Kurohashi, Sadao},
title = {An Empirical Comparison of Domain Adaptation Methods for Neural Machine Translation},
booktitle = {Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)},
month = {July},
address = {Vancouver, Canada},
publisher = {Association for Computational Linguistics},
pages = {385--391},
url = {http://aclweb.org/anthology/P17-2061},
year = 2017
}
Chu et al. (2017) argue that fine-tuning on only a small amount of in-domain data leads to overfitting and suggest mixing in-domain and out-of-domain data during adaptation.
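A minimal sketch of constructing such mixed adaptation batches; the oversampling ratio, batch size, and toy sentence pairs are illustrative assumptions, not the exact setup of the paper.

```python
import random

def mixed_batches(in_domain, out_domain, batch_size, in_ratio=0.5, seed=0):
    """Yield adaptation batches mixing oversampled in-domain pairs with
    out-of-domain pairs, so the model is not fine-tuned on the small
    in-domain set alone (a hedged sketch of mixed fine-tuning)."""
    rng = random.Random(seed)
    n_in = int(batch_size * in_ratio)
    while True:
        batch = [rng.choice(in_domain) for _ in range(n_in)]
        batch += [rng.choice(out_domain) for _ in range(batch_size - n_in)]
        yield batch

# A tiny in-domain set is heavily oversampled relative to its size.
in_domain = [("med src", "med tgt")] * 3
out_domain = [("news src", "news tgt")] * 1000
batch = next(mixed_batches(in_domain, out_domain, batch_size=8))
assert sum(pair in in_domain for pair in batch) == 4
```

Even with only three in-domain pairs, every batch keeps the in-domain signal present while the out-of-domain half anchors the model against overfitting.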
Markus Freitag and Yaser Al-Onaizan (2016):
Fast Domain Adaptation for Neural Machine Translation, CoRR

@article{Freitag:2016:unpublished,
author = {Markus Freitag and Yaser Al{-}Onaizan},
title = {Fast Domain Adaptation for Neural Machine Translation},
journal = {CoRR},
volume = {abs/1612.06897},
url = {http://arxiv.org/abs/1612.06897},
archiveprefix = {arXiv},
eprint = {1612.06897},
timestamp = {Mon, 13 Aug 2018 16:48:24 +0200},
biburl = {https://dblp.org/rec/bib/journals/corr/FreitagA16},
bibsource = {dblp computer science bibliography, https://dblp.org},
year = 2016
}
Freitag and Al-Onaizan (2016) identify the same problem and suggest using an ensemble of the baseline model and the adapted model to avoid overfitting.
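Ensembling can be sketched as interpolating the word distributions of the baseline and adapted models at each decoding step; the uniform weights and toy distributions below are illustrative assumptions.

```python
def ensemble(dists, weights=None):
    """Interpolate next-word distributions from several models,
    e.g. an unadapted baseline and a fine-tuned model."""
    weights = weights or [1.0 / len(dists)] * len(dists)
    vocab = set().union(*dists)
    return {w: sum(wt * d.get(w, 0.0) for wt, d in zip(weights, dists))
            for w in vocab}

# Hypothetical predictions for the next target word.
baseline = {"drug": 0.2, "medicine": 0.8}
adapted  = {"drug": 0.9, "medicine": 0.1}
combined = ensemble([baseline, adapted])
assert abs(combined["drug"] - 0.55) < 1e-9
```

The baseline model tempers overconfident in-domain predictions, which is the intended safeguard against overfitting to the small adaptation set.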
Álvaro Peris and Luis Cebrián and Francisco Casacuberta (2017):
Online Learning for Neural Machine Translation Post-editing, CoRR

@article{DBLP:journals/corr/PerisCC17,
author = {{\'{A}}lvaro Peris and Luis Cebri{\'{a}}n and Francisco Casacuberta},
title = {Online Learning for Neural Machine Translation Post-editing},
journal = {CoRR},
volume = {abs/1706.03196},
url = {http://arxiv.org/abs/1706.03196},
timestamp = {Mon, 03 Jul 2017 13:29:02 +0200},
biburl = {http://dblp.uni-trier.de/rec/bib/journals/corr/PerisCC17},
bibsource = {dblp computer science bibliography, http://dblp.org},
year = 2017
}
Peris et al. (2017) consider alternative training methods for the adaptation phase but do not find consistently better results than traditional gradient descent training.
Vilar, David (2018):
Learning Hidden Unit Contribution for Adapting Neural Machine Translation Models, Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)

@InProceedings{N18-2080,
author = {Vilar, David},
title = {Learning Hidden Unit Contribution for Adapting Neural Machine Translation Models},
booktitle = {Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)},
publisher = {Association for Computational Linguistics},
pages = {500--505},
location = {New Orleans, Louisiana},
url = {http://aclweb.org/anthology/N18-2080},
year = 2018
}
Vilar (2018) leaves the general model parameters fixed during fine-tuning and only updates an adaptation layer in the recurrent states.
Paul Michel and Graham Neubig (2018):
Extreme Adaptation for Personalized Neural Machine Translation, Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

@InProceedings{ACL2018-michel,
author = {Paul Michel and Graham Neubig},
title = {Extreme Adaptation for Personalized Neural Machine Translation},
booktitle = {Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)},
publisher = {Association for Computational Linguistics},
year = 2018
}
Michel and Neubig (2018) only update an additional bias term in the output softmax.
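This can be sketched as adding a trainable per-user bias to otherwise frozen output logits before the softmax; the two-word vocabulary and the bias update below are hypothetical.

```python
import math

def softmax(scores):
    """Standard softmax over a dict of word scores."""
    m = max(scores.values())
    exps = {w: math.exp(s - m) for w, s in scores.items()}
    z = sum(exps.values())
    return {w: e / z for w, e in exps.items()}

# Frozen general-model logits for the next target word.
logits = {"colour": 1.0, "color": 1.0}
# The per-user bias vector is the only parameter updated during adaptation.
bias = {"colour": 0.0, "color": 0.0}
bias["colour"] += 2.0  # hypothetical update after observing the user's data

probs = softmax({w: logits[w] + bias[w] for w in logits})
assert probs["colour"] > probs["color"]
```

Because only a single vocabulary-sized vector is stored per user, such "extreme" adaptation is very cheap compared to keeping a full fine-tuned model per user.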
Thompson, Brian and Khayrallah, Huda and Anastasopoulos, Antonios and McCarthy, Arya D. and Duh, Kevin and Marvin, Rebecca and McNamee, Paul and Gwinnup, Jeremy and Anderson, Tim and Koehn, Philipp (2018):
Freezing Subnetworks to Analyze Domain Adaptation in Neural Machine Translation, Proceedings of the Third Conference on Machine Translation: Research Papers

@inproceedings{W18-6313,
author = {Thompson, Brian and Khayrallah, Huda and Anastasopoulos, Antonios and McCarthy, Arya D. and Duh, Kevin and Marvin, Rebecca and McNamee, Paul and Gwinnup, Jeremy and Anderson, Tim and Koehn, Philipp},
title = {Freezing Subnetworks to Analyze Domain Adaptation in Neural Machine Translation},
booktitle = {Proceedings of the Third Conference on Machine Translation: Research Papers},
month = {oct},
address = {Belgium, Brussels},
publisher = {Association for Computational Linguistics},
url = {https://www.aclweb.org/anthology/W18-6313},
pages = {124--132},
year = 2018
}
Thompson et al. (2018) explore which parameters (input embedding, recurrent state propagation, etc.) may be left unchanged while still obtaining good adaptation results.
Praveen Dakwale and Christof Monz (2017):
Fine-Tuning for Neural Machine Translation with Limited Degradation across In- and Out-of-Domain Data, Machine Translation Summit XVI

@inproceedings{mtsummit2017:Dakwale,
author = {Praveen Dakwale and Christof Monz},
title = {Fine-Tuning for Neural Machine Translation with Limited Degradation across In- and Out-of-Domain Data},
booktitle = {Machine Translation Summit XVI},
location = {Nagoya, Japan},
year = 2017
}
Dakwale and Monz (2017);
Khayrallah, Huda and Thompson, Brian and Duh, Kevin and Koehn, Philipp (2018):
Regularized Training Objective for Continued Training for Domain Adaptation in Neural Machine Translation, Proceedings of the 2nd Workshop on Neural Machine Translation and Generation
@InProceedings{W18-2705,
author = {Khayrallah, Huda and Thompson, Brian and Duh, Kevin and Koehn, Philipp},
title = {Regularized Training Objective for Continued Training for Domain Adaptation in Neural Machine Translation},
booktitle = {Proceedings of the 2nd Workshop on Neural Machine Translation and Generation},
publisher = {Association for Computational Linguistics},
pages = {36--44},
location = {Melbourne, Australia},
url = {http://aclweb.org/anthology/W18-2705},
year = 2018
}
Khayrallah et al. (2018) regularize the training objective to include a term that penalizes departure from the word predictions of the unadapted baseline model.
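A hedged sketch of such a regularized objective, using a KL-divergence term toward the baseline's word predictions; the exact form of the penalty and the weight used here are illustrative, not taken from the paper.

```python
import math

def kl(p, q):
    """KL divergence between two word distributions over the same vocabulary."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p if p[w] > 0)

def regularized_loss(model_probs, target, baseline_probs, lam=0.1):
    """Negative log-likelihood of the reference word plus a penalty for
    drifting away from the baseline model's word predictions."""
    nll = -math.log(model_probs[target])
    return nll + lam * kl(baseline_probs, model_probs)

baseline = {"drug": 0.5, "medicine": 0.5}
drifted  = {"drug": 0.9, "medicine": 0.1}
# Drifting from the baseline costs more than the plain NLL alone.
assert regularized_loss(drifted, "drug", baseline) > -math.log(0.9)
```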
Miceli Barone, Antonio Valerio and Haddow, Barry and Germann, Ulrich and Sennrich, Rico (2017):
Regularization techniques for fine-tuning in neural machine translation, Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

@InProceedings{D17-1156,
author = {Miceli Barone, Antonio Valerio and Haddow, Barry and Germann, Ulrich and Sennrich, Rico},
title = {Regularization techniques for fine-tuning in neural machine translation},
booktitle = {Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing},
publisher = {Association for Computational Linguistics},
pages = {1490--1495},
location = {Copenhagen, Denmark},
url = {http://aclweb.org/anthology/D17-1156},
year = 2017
}
Miceli Barone et al. (2017) use the L2 norm between baseline parameter values and adapted parameter values as a regularizer in the objective function, in addition to dropout techniques.
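The L2 regularizer can be written down directly; the parameter values and weight below are illustrative.

```python
def l2_penalty(params, baseline_params, lam):
    """Squared L2 distance between adapted and baseline parameters,
    added to the adaptation loss so fine-tuning stays close to the
    general model."""
    return lam * sum((p - b) ** 2 for p, b in zip(params, baseline_params))

# No penalty when the adapted model equals the baseline; a quadratic
# penalty as parameters move away from it.
assert l2_penalty([1.0, 2.0], [1.0, 2.0], lam=0.01) == 0.0
assert abs(l2_penalty([1.5, 2.0], [1.0, 2.0], lam=0.01) - 0.0025) < 1e-12
```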
Thompson, Brian and Gwinnup, Jeremy and Khayrallah, Huda and Duh, Kevin and Koehn, Philipp (2019):
Overcoming Catastrophic Forgetting During Domain Adaptation of Neural Machine Translation, Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

@inproceedings{thompson-etal-2019-overcoming,
author = {Thompson, Brian and Gwinnup, Jeremy and Khayrallah, Huda and Duh, Kevin and Koehn, Philipp},
title = {Overcoming Catastrophic Forgetting During Domain Adaptation of Neural Machine Translation},
booktitle = {Proceedings of the 2019 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)},
month = {jun},
address = {Minneapolis, Minnesota},
publisher = {Association for Computational Linguistics},
url = {https://www.aclweb.org/anthology/N19-1209},
pages = {2062--2068},
year = 2019
}
Thompson et al. (2019) show superior results with elastic weight consolidation, a technique that tends to preserve the model parameters that were important for the general translation quality of the model.
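Elastic weight consolidation can be sketched as a Fisher-weighted quadratic penalty on parameter changes; the diagonal Fisher values and weight below are illustrative toy numbers.

```python
def ewc_penalty(params, baseline_params, fisher, lam):
    """Elastic weight consolidation: quadratic penalty on parameter
    changes, weighted by each parameter's (diagonal) Fisher information,
    so parameters important to the general model move least."""
    return lam * sum(f * (p - b) ** 2
                     for p, b, f in zip(params, baseline_params, fisher))

# The same change to an 'important' parameter (high Fisher value) is
# penalized more than to an unimportant one.
important = ewc_penalty([1.5, 2.0], [1.0, 2.0], fisher=[10.0, 0.1], lam=1.0)
unimportant = ewc_penalty([1.0, 2.5], [1.0, 2.0], fisher=[10.0, 0.1], lam=1.0)
assert important > unimportant
```

Compared with a plain L2 penalty, the Fisher weighting lets unimportant parameters adapt freely while protecting those the general model relies on.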
Curriculum Training:
van der Wees, Marlies and Bisazza, Arianna and Monz, Christof (2017):
Dynamic Data Selection for Neural Machine Translation, Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
@InProceedings{D17-1148,
author = {van der Wees, Marlies and Bisazza, Arianna and Monz, Christof},
title = {Dynamic Data Selection for Neural Machine Translation},
booktitle = {Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing},
publisher = {Association for Computational Linguistics},
pages = {1411--1421},
location = {Copenhagen, Denmark},
url = {http://aclweb.org/anthology/D17-1147},
year = 2017
}
van der Wees et al. (2017) adopt curriculum training for the adaptation problem. They start with a corpus consisting of all the data, and then train on smaller and smaller subsets that are increasingly in-domain, as determined by a language model.
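A minimal sketch of such shrinking data selection, using word overlap with a toy in-domain vocabulary as a stand-in for the language model relevance score:

```python
def curriculum_subsets(corpus, relevance, fractions=(1.0, 0.5, 0.25)):
    """Return training subsets of decreasing size, each keeping the
    sentences most relevant to the target domain (here scored by a
    stand-in relevance function instead of a language model)."""
    ranked = sorted(corpus, key=relevance, reverse=True)
    return [ranked[:max(1, int(len(ranked) * f))] for f in fractions]

corpus = ["dose of aspirin", "stock market news",
          "patient history", "football score"]
in_domain_words = {"dose", "aspirin", "patient", "history"}
score = lambda s: sum(w in in_domain_words for w in s.split())

subsets = curriculum_subsets(corpus, score)
# Training proceeds over subsets[0], then subsets[1], then subsets[2].
assert subsets[-1] == ["dose of aspirin"]
```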
Kocmi, Tom and Bojar, Ondřej (2017):
Curriculum Learning and Minibatch Bucketing in Neural Machine Translation, Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017

@inproceedings{kocmi-bojar-2017-curriculum,
author = {Kocmi, Tom and Bojar, Ond{\v{r}}ej},
title = {Curriculum Learning and Minibatch Bucketing in Neural Machine Translation},
booktitle = {Proceedings of the International Conference Recent Advances in Natural Language Processing, {RANLP} 2017},
month = {sep},
address = {Varna, Bulgaria},
publisher = {INCOMA Ltd.},
url = {https://doi.org/10.26615/978-954-452-049-6\_050},
doi = {10.26615/978-954-452-049-6\_050},
pages = {379--386},
year = 2017
}
Kocmi and Bojar (2017) employ curriculum training by first training on simpler sentence pairs, measured by the length of the sentences, the number of coordinating conjunctions, and the frequency of words.
Platanios, Emmanouil Antonios and Stretcu, Otilia and Neubig, Graham and Poczos, Barnabas and Mitchell, Tom (2019):
Competence-based Curriculum Learning for Neural Machine Translation, Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

@inproceedings{platanios-etal-2019-competence,
author = {Platanios, Emmanouil Antonios and Stretcu, Otilia and Neubig, Graham and Poczos, Barnabas and Mitchell, Tom},
title = {Competence-based Curriculum Learning for Neural Machine Translation},
booktitle = {Proceedings of the 2019 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)},
month = {jun},
address = {Minneapolis, Minnesota},
publisher = {Association for Computational Linguistics},
url = {https://www.aclweb.org/anthology/N19-1119},
pages = {1162--1172},
year = 2019
}
Platanios et al. (2019) show that a refined scheme that selects data of increasing difficulty based on the training progress converges faster and gives better performance for Transformer models.
Zhang, Xuan and Shapiro, Pamela and Kumar, Gaurav and McNamee, Paul and Carpuat, Marine and Duh, Kevin (2019):
Curriculum Learning for Domain Adaptation in Neural Machine Translation, Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

@inproceedings{zhang-etal-2019-curriculum,
author = {Zhang, Xuan and Shapiro, Pamela and Kumar, Gaurav and McNamee, Paul and Carpuat, Marine and Duh, Kevin},
title = {Curriculum Learning for Domain Adaptation in Neural Machine Translation},
booktitle = {Proceedings of the 2019 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)},
month = {jun},
address = {Minneapolis, Minnesota},
publisher = {Association for Computational Linguistics},
url = {https://www.aclweb.org/anthology/N19-1189},
pages = {1903--1915},
year = 2019
}
Zhang et al. (2019) explore various other curriculum schedules based on difficulty, including training on the hard examples first.
Kumar, Gaurav and Foster, George and Cherry, Colin and Krikun, Maxim (2019):
Reinforcement Learning based Curriculum Optimization for Neural Machine Translation, Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

@inproceedings{kumar-etal-2019-reinforcement,
author = {Kumar, Gaurav and Foster, George and Cherry, Colin and Krikun, Maxim},
title = {Reinforcement Learning based Curriculum Optimization for Neural Machine Translation},
booktitle = {Proceedings of the 2019 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)},
month = {jun},
address = {Minneapolis, Minnesota},
publisher = {Association for Computational Linguistics},
url = {https://www.aclweb.org/anthology/N19-1208},
pages = {2054--2061},
year = 2019
}
Kumar et al. (2019) learn a curriculum for data of different degrees of noisiness with reinforcement learning using gains on the validation set as rewards.
Wang, Rui and Utiyama, Masao and Sumita, Eiichiro (2018):
Dynamic Sentence Sampling for Efficient Training of Neural Machine Translation, Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

@InProceedings{P18-2048,
author = {Wang, Rui and Utiyama, Masao and Sumita, Eiichiro},
title = {Dynamic Sentence Sampling for Efficient Training of Neural Machine Translation},
booktitle = {Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)},
publisher = {Association for Computational Linguistics},
pages = {298--304},
location = {Melbourne, Australia},
url = {http://aclweb.org/anthology/P18-2048},
year = 2018
}
Wang et al. (2018) argue that sentence pairs that are already correctly predicted do not contribute to further improvement of the model, and they increasingly remove sentence pairs whose training objective cost does not improve between iterations.
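A hedged sketch of this pruning criterion, with hypothetical per-sentence losses and an illustrative improvement threshold:

```python
def prune_static_pairs(losses_prev, losses_now, data, min_improvement=0.01):
    """Keep only sentence pairs whose training loss still improves between
    epochs; pairs the model already predicts well are dropped."""
    return [pair for pair, before, after
            in zip(data, losses_prev, losses_now)
            if before - after > min_improvement]

data = ["pair_a", "pair_b", "pair_c"]
# pair_b's loss has stagnated, so it is removed from the next epoch.
kept = prune_static_pairs([1.0, 0.30, 0.80], [0.60, 0.299, 0.50], data)
assert kept == ["pair_a", "pair_c"]
```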
Sentence-Level Adaptation to Fuzzy Matches:
Before translating a sentence,
Farajian, M. Amin and Turchi, Marco and Negri, Matteo and Federico, Marcello (2017):
Multi-Domain Neural Machine Translation through Unsupervised Adaptation, Proceedings of the Second Conference on Machine Translation, Volume 1: Research Paper

@InProceedings{farajian-EtAl:2017:WMT,
author = {Farajian, M. Amin and Turchi, Marco and Negri, Matteo and Federico, Marcello},
title = {Multi-Domain Neural Machine Translation through Unsupervised Adaptation},
booktitle = {Proceedings of the Second Conference on Machine Translation, Volume 1: Research Paper},
month = {September},
address = {Copenhagen, Denmark},
publisher = {Association for Computational Linguistics},
pages = {127--137},
url = {http://www.aclweb.org/anthology/W17-4713},
year = 2017
}
Farajian et al. (2017);
Xiaoqing Li and Jiajun Zhang and Chengqing Zong (2018):
One Sentence One Model for Neural Machine Translation, Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

@InProceedings{LREC2018-LI18.195,
author = {Xiaoqing Li and Jiajun Zhang and Chengqing Zong},
title = {One Sentence One Model for Neural Machine Translation},
booktitle = {Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)},
month = {May 7-12, 2018},
address = {Miyazaki, Japan},
publisher = {European Language Resources Association (ELRA)},
isbn = {979-10-95546-00-9},
language = {english},
year = 2018
}
Li et al. (2018) propose fetching a few similar sentences and their translations from a parallel corpus and adapting the neural translation model to this subsampled training set.
Similarly, using only monolingual source-side data,
Chinea-Rios, Mara and Peris, Álvaro and Casacuberta, Francisco (2017):
Adapting Neural Machine Translation with Parallel Synthetic Data, Proceedings of the Second Conference on Machine Translation, Volume 1: Research Paper

@InProceedings{chinearios-peris-casacuberta:2017:WMT,
author = {Chinea-Rios, Mara and Peris, \'{A}lvaro and Casacuberta, Francisco},
title = {Adapting Neural Machine Translation with Parallel Synthetic Data},
booktitle = {Proceedings of the Second Conference on Machine Translation, Volume 1: Research Paper},
month = {September},
address = {Copenhagen, Denmark},
publisher = {Association for Computational Linguistics},
pages = {138--147},
url = {http://www.aclweb.org/anthology/W17-4714},
year = 2017
}
Chinea-Rios et al. (2017) subsample sentences similar to the sentences in a document to be translated and perform a self-training step. Self-training first translates the source text and then adapts the model to this synthetic parallel corpus.
Jiatao Gu and Yong Wang and Kyunghyun Cho and Victor O.K. Li (2018):
Search Engine Guided Non-Parametric Neural Machine Translation, Proceedings of the American Association for Artificial Intelligence

@inproceedings{AAAI2018-Gu,
author = {Jiatao Gu and Yong Wang and Kyunghyun Cho and Victor O.K. Li},
title = {Search Engine Guided Non-Parametric Neural Machine Translation},
booktitle = {Proceedings of the American Association for Artificial Intelligence},
url = {https://arxiv.org/pdf/1705.07267},
year = 2018
}
Gu et al. (2018) modify the model architecture to include the retrieved sentence pairs. These sentence pairs are stored in a neural key-value memory and words from these sentence pairs may be either copied over directly or fused with predictions of the baseline neural machine translation model.
Zhang, Jingyi and Utiyama, Masao and Sumita, Eiichro and Neubig, Graham and Nakamura, Satoshi (2018):
Guiding Neural Machine Translation with Retrieved Translation Pieces, Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)

@InProceedings{N18-1120,
author = {Zhang, Jingyi and Utiyama, Masao and Sumita, Eiichro and Neubig, Graham and Nakamura, Satoshi},
title = {Guiding Neural Machine Translation with Retrieved Translation Pieces},
booktitle = {Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)},
publisher = {Association for Computational Linguistics},
pages = {1325--1335},
location = {New Orleans, Louisiana},
url = {http://aclweb.org/anthology/N18-1120},
year = 2018
}
Zhang et al. (2018) extract phrase pairs from the retrieved sentence pairs and add a bonus during search to hypotheses that contain them.
Bapna, Ankur and Firat, Orhan (2019):
Non-Parametric Adaptation for Neural Machine Translation, Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

@inproceedings{bapna-firat-2019-non,
author = {Bapna, Ankur and Firat, Orhan},
title = {Non-Parametric Adaptation for Neural Machine Translation},
booktitle = {Proceedings of the 2019 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)},
month = {jun},
address = {Minneapolis, Minnesota},
publisher = {Association for Computational Linguistics},
url = {https://www.aclweb.org/anthology/N19-1191},
pages = {1921--1931},
year = 2019
}
Bapna and Firat (2019) retrieve similar sentence pairs from a domain-specific corpus at inference time and provide these as additional conditioning context. Similarly,
Bulte, Bram and Tezcan, Arda (2019):
Neural Fuzzy Repair: Integrating Fuzzy Matches into Neural Machine Translation, Proceedings of the 57th Conference of the Association for Computational Linguistics

@inproceedings{bulte-tezcan-2019-neural,
author = {Bulte, Bram and Tezcan, Arda},
title = {Neural Fuzzy Repair: Integrating Fuzzy Matches into Neural Machine Translation},
booktitle = {Proceedings of the 57th Conference of the Association for Computational Linguistics},
month = {jul},
address = {Florence, Italy},
publisher = {Association for Computational Linguistics},
url = {https://www.aclweb.org/anthology/P19-1175},
pages = {1800--1809},
year = 2019
}
Bulte and Tezcan (2019) add the target side of similar sentence pairs to the source sentence.
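A minimal sketch of this source-side augmentation, using naive word overlap as the fuzzy match score; the separator token and scoring function are illustrative assumptions, not the method's exact details.

```python
def augment_with_fuzzy_match(source, translation_memory, sep=" ||| "):
    """Append the target side of the most similar translation memory
    entry to the source sentence, so the model can copy from it."""
    def overlap(a, b):
        return len(set(a.split()) & set(b.split()))
    best_src, best_tgt = max(translation_memory,
                             key=lambda pair: overlap(source, pair[0]))
    return source + sep + best_tgt

tm = [("the cat sleeps", "die Katze schläft"),
      ("the dog barks", "der Hund bellt")]
augmented = augment_with_fuzzy_match("the cat eats", tm)
assert augmented == "the cat eats ||| die Katze schläft"
```

The translation model is then trained on such augmented inputs, learning to reuse the appended fuzzy match translation where it fits the source.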
Sentence-Level Instant Updating:
Kothur, Sachith Sri Ram and Knowles, Rebecca and Koehn, Philipp (2018):
Document-Level Adaptation for Neural Machine Translation, Proceedings of the 2nd Workshop on Neural Machine Translation and Generation

@InProceedings{W18-2708,
author = {Kothur, Sachith Sri Ram and Knowles, Rebecca and Koehn, Philipp},
title = {Document-Level Adaptation for Neural Machine Translation},
booktitle = {Proceedings of the 2nd Workshop on Neural Machine Translation and Generation},
publisher = {Association for Computational Linguistics},
pages = {64--73},
location = {Melbourne, Australia},
url = {http://aclweb.org/anthology/W18-2708},
year = 2018
}
Kothur et al. (2018) show that machine translation systems can be adapted instantly to the post-edits of a translator working through a single document. They show gains both from fine-tuning on the edited sentence pairs and from adding new word translations via fine-tuning.
Wuebker, Joern and Simianer, Patrick and DeNero, John (2018):
Compact Personalized Models for Neural Machine Translation, Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

@inproceedings{D18-1104,
author = {Wuebker, Joern and Simianer, Patrick and DeNero, John},
title = {Compact Personalized Models for Neural Machine Translation},
booktitle = {Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing},
address = {Brussels, Belgium},
publisher = {Association for Computational Linguistics},
url = {https://www.aclweb.org/anthology/D18-1104},
pages = {881--886},
year = 2018
}
Wuebker et al. (2018) build personalized translation models in a similar scenario. They modify only the output layer predictions and use group lasso regularization to limit the divergence between the general model and the personalized offsets.
Simianer, Patrick and Wuebker, Joern and DeNero, John (2019):
Measuring Immediate Adaptation Performance for Neural Machine Translation, Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

@inproceedings{simianer-etal-2019-measuring,
author = {Simianer, Patrick and Wuebker, Joern and DeNero, John},
title = {Measuring Immediate Adaptation Performance for Neural Machine Translation},
booktitle = {Proceedings of the 2019 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)},
month = {jun},
address = {Minneapolis, Minnesota},
publisher = {Association for Computational Linguistics},
url = {https://www.aclweb.org/anthology/N19-1206},
pages = {2038--2046},
year = 2019
}
Simianer et al. (2019) compare different sentence-level adaptation training methods in terms of how well they translate words that occur once in an adaptation sentence pair, as well as new words not yet encountered during adaptation. They show that lasso adaptation
(Wuebker et al., 2018) improves on once-seen words without degrading on words not yet encountered.
Subsampling and Instance Weighting:
Inspired by domain adaptation work in statistical machine translation on sub-sampling,
Wang, Rui and Finch, Andrew and Utiyama, Masao and Sumita, Eiichiro (2017):
Sentence Embedding for Neural Machine Translation Domain Adaptation, Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

@InProceedings{wang-EtAl:2017:Short3,
author = {Wang, Rui and Finch, Andrew and Utiyama, Masao and Sumita, Eiichiro},
title = {Sentence Embedding for Neural Machine Translation Domain Adaptation},
booktitle = {Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)},
month = {July},
address = {Vancouver, Canada},
publisher = {Association for Computational Linguistics},
pages = {560--566},
url = {http://aclweb.org/anthology/P17-2089},
year = 2017
}
Wang et al. (2017) augment the canonical neural translation model with a sentence embedding state, computed as the sum of all input word representations and used as the initial state of the decoder. This sentence embedding allows them to distinguish between in-domain and out-of-domain sentences, using the centroids of all in-domain and out-of-domain sentence embeddings, respectively. Out-of-domain sentences that are closer to the in-domain centroid are included in the training data.
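A toy sketch of this centroid-based selection, with hypothetical 2-dimensional word vectors in place of learned representations:

```python
def embed(sentence, word_vecs):
    """Sentence embedding as the sum of its word vectors (2-d toy vectors)."""
    vec = [0.0, 0.0]
    for w in sentence.split():
        wx, wy = word_vecs.get(w, (0.0, 0.0))
        vec[0] += wx
        vec[1] += wy
    return vec

def centroid(vecs):
    n = len(vecs)
    return [sum(v[0] for v in vecs) / n, sum(v[1] for v in vecs) / n]

def dist2(a, b):
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

# Hypothetical word vectors: medical words point one way, news words another.
word_vecs = {"dose": (1, 0), "patient": (1, 0.2),
             "stocks": (0, 1), "market": (0.1, 1)}
in_dom_sents = ["dose patient", "patient dose"]
out_dom_sents = ["stocks market", "market stocks", "dose patient market"]

c_in = centroid([embed(s, word_vecs) for s in in_dom_sents])
c_out = centroid([embed(s, word_vecs) for s in out_dom_sents])

# Keep out-of-domain sentences whose embedding is closer to the
# in-domain centroid than to the out-of-domain centroid.
selected = [s for s in out_dom_sents
            if dist2(embed(s, word_vecs), c_in) < dist2(embed(s, word_vecs), c_out)]
assert selected == ["dose patient market"]
```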
Chen, Boxing and Cherry, Colin and Foster, George and Larkin, Samuel (2017):
Cost Weighting for Neural Machine Translation Domain Adaptation, Proceedings of the First Workshop on Neural Machine Translation

@InProceedings{chen-EtAl:2017:NMT,
author = {Chen, Boxing and Cherry, Colin and Foster, George and Larkin, Samuel},
title = {Cost Weighting for Neural Machine Translation Domain Adaptation},
booktitle = {Proceedings of the First Workshop on Neural Machine Translation},
month = {August},
address = {Vancouver},
publisher = {Association for Computational Linguistics},
pages = {40--46},
url = {http://www.aclweb.org/anthology/W17-3205},
year = 2017
}
Chen et al. (2017) combine the idea of sub-sampling with sentence weighting. They build an in-domain vs. out-of-domain classifier for sentence pairs in the training data, and then use its prediction score to reduce the learning rate for sentence pairs that are out of domain.
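The learning-rate scaling can be sketched as follows; the floor value is an illustrative choice so out-of-domain pairs still contribute a little, not a detail from the paper.

```python
def scaled_learning_rate(base_lr, in_domain_prob, floor=0.1):
    """Scale the per-sentence learning rate by a domain classifier's
    probability that the sentence pair is in-domain."""
    return base_lr * max(in_domain_prob, floor)

# A confidently in-domain pair gets nearly the full learning rate;
# a clearly out-of-domain pair is down-weighted to the floor.
lr_in = scaled_learning_rate(0.001, 0.9)
lr_out = scaled_learning_rate(0.001, 0.0)
assert lr_in > lr_out
```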
Wang, Rui and Utiyama, Masao and Liu, Lemao and Chen, Kehai and Sumita, Eiichiro (2017):
Instance Weighting for Neural Machine Translation Domain Adaptation, Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

@InProceedings{D17-1155,
author = {Wang, Rui and Utiyama, Masao and Liu, Lemao and Chen, Kehai and Sumita, Eiichiro},
title = {Instance Weighting for Neural Machine Translation Domain Adaptation},
booktitle = {Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing},
publisher = {Association for Computational Linguistics},
pages = {1483--1489},
location = {Copenhagen, Denmark},
url = {
http://aclweb.org/anthology/D17-1155},
year = 2017
}
Wang et al. (2017) also explore such sentence-level learning rate scaling, and compare it against oversampling of in-domain data, showing similar results.
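The oversampling baseline they compare against is itself a one-liner: replicate the in-domain sentence pairs several times before shuffling them into the full training data. A sketch, with `factor` as an illustrative knob rather than a value from the paper:

```python
# Oversampling of in-domain data: replicate in-domain sentence pairs and mix
# them with the out-of-domain data before training.
import random

def oversample(in_domain, out_of_domain, factor=5, seed=0):
    """Replicate in-domain pairs `factor` times and shuffle with the rest."""
    data = in_domain * factor + out_of_domain
    random.Random(seed).shuffle(data)
    return data
```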
Farajian, M. Amin and Turchi, Marco and Negri, Matteo and Bertoldi, Nicola and Federico, Marcello (2017):
Neural vs. Phrase-Based Machine Translation in a Multi-Domain Scenario, Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers

@InProceedings{farajian-EtAl:2017:EACLshort,
author = {Farajian, M. Amin and Turchi, Marco and Negri, Matteo and Bertoldi, Nicola and Federico, Marcello},
title = {Neural vs. Phrase-Based Machine Translation in a Multi-Domain Scenario},
booktitle = {Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers},
month = {April},
address = {Valencia, Spain},
publisher = {Association for Computational Linguistics},
pages = {280--284},
url = {
http://www.aclweb.org/anthology/E17-2045},
year = 2017
}
Farajian et al. (2017) show that traditional statistical machine translation outperforms neural machine translation when general-purpose systems are trained on a broad collection of data and then tested on niche domains. With adaptation, neural machine translation catches up.
Domain Tokens:
A multi-domain model may be trained and informed at run-time about the domain of the input sentence.
Kobus, Catherine and Crego, Josep and Senellart, Jean (2017):
Domain Control for Neural Machine Translation, Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017

@inproceedings{kobus-etal-2017-domain,
author = {Kobus, Catherine and Crego, Josep and Senellart, Jean},
title = {Domain Control for Neural Machine Translation},
booktitle = {Proceedings of the International Conference Recent Advances in Natural Language Processing, {RANLP} 2017},
month = {sep},
address = {Varna, Bulgaria},
publisher = {INCOMA Ltd.},
url = {
https://doi.org/10.26615/978-954-452-049-6\_049},
doi = {10.26615/978-954-452-049-6\_049},
pages = {372--378},
year = 2017
}
Kobus et al. (2017) apply an idea initially proposed by
Sennrich, Rico and Haddow, Barry and Birch, Alexandra (2016):
Controlling Politeness in Neural Machine Translation via Side Constraints, Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

@InProceedings{sennrich-haddow-birch:2016:N16-1,
author = {Sennrich, Rico and Haddow, Barry and Birch, Alexandra},
title = {Controlling Politeness in Neural Machine Translation via Side Constraints},
booktitle = {Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies},
month = {June},
address = {San Diego, California},
publisher = {Association for Computational Linguistics},
pages = {35--40},
url = {
http://www.aclweb.org/anthology/N16-1005},
year = 2016
}
Sennrich et al. (2016), who augmented input sentences with a politeness feature token to control register, to the domain adaptation problem. They add a domain token to each training and test sentence.
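The domain token amounts to simple input preprocessing: prepend an artificial token marking the domain to every source sentence, at training and at test time. The `<domain>` token format below is illustrative:

```python
# Domain tokens in the spirit of Kobus et al. (2017): an artificial token
# marking the domain is prepended to each source sentence, so a single
# multi-domain model can be steered at run time.

def add_domain_token(sentence, domain):
    """Prepend an artificial domain token to a source sentence."""
    return "<{}> {}".format(domain, sentence)
```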
Sander Tars and Mark Fishel (2018):
Multi-Domain Neural Machine Translation, Proceedings of the 21st Annual Conference of the European Association for Machine Translation

@inproceedings{eamt18-Tars,
author = {Sander Tars and Mark Fishel},
title = {Multi-Domain Neural Machine Translation},
booktitle = {Proceedings of the 21st Annual Conference of the European Association for Machine Translation},
location = {Alicante, Spain},
year = 2018
}
Tars and Fishel (2018) present results showing that domain tokens outperform fine-tuning, and also explore word-level domain factors.
Topic Models:
If the data contains sentences from multiple domains but their composition is unknown, automatically detecting the different domains (in this setting typically called topics) with methods such as LDA is an option.
Zhang, Jian and Li, Liangyou and Way, Andy and Liu, Qun (2016):
Topic-Informed Neural Machine Translation, Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

@InProceedings{zhang-EtAl:2016:COLING3,
author = {Zhang, Jian and Li, Liangyou and Way, Andy and Liu, Qun},
title = {Topic-Informed Neural Machine Translation},
booktitle = {Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers},
month = {December},
address = {Osaka, Japan},
publisher = {The COLING 2016 Organizing Committee},
pages = {1807--1817},
url = {
http://aclweb.org/anthology/C16-1170},
year = 2016
}
Zhang et al. (2016) apply such clustering and then compute a topic distribution vector for each word. It is used alongside the word embedding to inform the encoder and decoder in an otherwise canonical neural translation model. Instead of word-level topic vectors,
Wenhu Chen and Evgeny Matusov and Shahram Khadivi and Jan-Thorsten Peter (2016):
Guided Alignment Training for Topic-Aware Neural Machine Translation, CoRR
mentioned in Coverage and Adaptation@article{DBLP:journals/corr/ChenMKP16,
author = {Wenhu Chen and Evgeny Matusov and Shahram Khadivi and Jan{-}Thorsten Peter},
title = {Guided Alignment Training for Topic-Aware Neural Machine Translation},
journal = {CoRR},
volume = {abs/1607.01628},
url = {
https://arxiv.org/pdf/1607.01628.pdf},
timestamp = {Tue, 02 Aug 2016 12:59:27 +0200},
biburl = {
http://dblp.uni-trier.de/rec/bib/journals/corr/ChenMKP16},
bibsource = {dblp computer science bibliography,
http://dblp.org},
year = 2016
}
Chen et al. (2016) encode the given domain membership of each sentence as an additional input vector to the conditioning context of the word prediction layer.
Sander Tars and Mark Fishel (2018):
Multi-Domain Neural Machine Translation, Proceedings of the 21st Annual Conference of the European Association for Machine Translation

@inproceedings{eamt18-Tars,
author = {Sander Tars and Mark Fishel},
title = {Multi-Domain Neural Machine Translation},
booktitle = {Proceedings of the 21st Annual Conference of the European Association for Machine Translation},
location = {Alicante, Spain},
year = 2018
}
Tars and Fishel (2018) use sentence embeddings and k-means clustering to obtain topic clusters.
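Clustering sentence embeddings into topics can be sketched with a tiny hand-rolled k-means; the implementation and data below are illustrative, not Tars and Fishel's setup (they use k-means over sentence embeddings, but not this code).

```python
# Sketch of topic clustering via k-means over sentence embeddings. Embeddings
# are plain lists of floats; a few Lloyd iterations assign each sentence to
# the nearest cluster center, giving unsupervised "topic" labels.
import math

def assign(points, centers):
    """Index of the nearest center for each point (Euclidean distance)."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return [min(range(len(centers)), key=lambda c: dist(p, centers[c]))
            for p in points]

def kmeans(points, centers, iterations=10):
    """Run Lloyd iterations; return the final cluster label per point."""
    for _ in range(iterations):
        labels = assign(points, centers)
        for c in range(len(centers)):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign(points, centers)
```

In practice one would use a library implementation (e.g. scikit-learn's KMeans) over real sentence embeddings; the point here is only the pipeline: embed sentences, cluster, treat cluster IDs as topic labels.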
Noisy Data:
Text to be translated by machine translation models may be noisy, due to misspellings or the creative language use common in social media text. Machine translation models may be adapted to such noise to make them more robust.
Vaibhav, Vaibhav and Singh, Sumeet and Stewart, Craig and Neubig, Graham (2019):
Improving Robustness of Machine Translation with Synthetic Noise, Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

@inproceedings{vaibhav-etal-2019-improving,
author = {Vaibhav, Vaibhav and Singh, Sumeet and Stewart, Craig and Neubig, Graham},
title = {Improving Robustness of Machine Translation with Synthetic Noise},
booktitle = {Proceedings of the 2019 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)},
month = {jun},
address = {Minneapolis, Minnesota},
publisher = {Association for Computational Linguistics},
url = {
https://www.aclweb.org/anthology/N19-1190},
pages = {1916--1920},
year = 2019
}
Vaibhav et al. (2019) add synthetic training data that contains types of noise similar to what has been seen in a test set of web discussion forum posts.
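Synthetic noise of this kind can be generated by corrupting clean training sentences and adding the noisy copies to the training data. The noise operations and rate below are illustrative choices, not the specific noise types Vaibhav et al. (2019) model:

```python
# Synthetic character-level noise: randomly swap, drop, or duplicate
# characters at a given rate, producing noisy copies of clean sentences.
import random

def add_noise(sentence, rate=0.1, rng=None):
    """Corrupt a sentence with random swap/drop/duplicate operations."""
    rng = rng or random.Random(0)
    chars = list(sentence)
    out = []
    i = 0
    while i < len(chars):
        if rng.random() < rate:
            op = rng.choice(["swap", "drop", "dup"])
            if op == "swap" and i + 1 < len(chars):
                out.extend([chars[i + 1], chars[i]])  # transpose two chars
                i += 2
                continue
            if op == "drop":
                i += 1  # skip this char entirely
                continue
            out.append(chars[i] * 2)  # duplicate the char
        else:
            out.append(chars[i])
        i += 1
    return "".join(out)
```

Each clean training pair (source, target) then yields an extra pair (add_noise(source), target), teaching the model to translate through the noise.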
Anastasopoulos, Antonios and Lui, Alison and Nguyen, Toan Q. and Chiang, David (2019):
Neural Machine Translation of Text from Non-Native Speakers, Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

@inproceedings{anastasopoulos-etal-2019-neural,
author = {Anastasopoulos, Antonios and Lui, Alison and Nguyen, Toan Q. and Chiang, David},
title = {Neural Machine Translation of Text from Non-Native Speakers},
booktitle = {Proceedings of the 2019 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)},
month = {jun},
address = {Minneapolis, Minnesota},
publisher = {Association for Computational Linguistics},
url = {
https://www.aclweb.org/anthology/N19-1311},
pages = {3070--3080},
year = 2019
}
Anastasopoulos et al. (2019) employ corpora from grammatical error correction tasks (sentences with errors from non-native speakers alongside their corrections) to create synthetic input that mirrors these errors. They compare translation quality on clean versus noisy input and reduce the gap by adding similar synthetic noisy data to training.
Benchmarks
Discussion
Related Topics
New Publications
Hu, Junjie and Xia, Mengzhou and Neubig, Graham and Carbonell, Jaime (2019):
Domain Adaptation of Neural Machine Translation by Lexicon Induction, Proceedings of the 57th Conference of the Association for Computational Linguistics

@inproceedings{hu-etal-2019-domain,
author = {Hu, Junjie and Xia, Mengzhou and Neubig, Graham and Carbonell, Jaime},
title = {Domain Adaptation of Neural Machine Translation by Lexicon Induction},
booktitle = {Proceedings of the 57th Conference of the Association for Computational Linguistics},
month = {jul},
address = {Florence, Italy},
publisher = {Association for Computational Linguistics},
url = {
https://www.aclweb.org/anthology/P19-1286},
pages = {2989--3001},
year = 2019
}
Hu et al. (2019)
Saunders, Danielle and Stahlberg, Felix and de Gispert, Adrià and Byrne, Bill (2019):
Domain Adaptive Inference for Neural Machine Translation, Proceedings of the 57th Conference of the Association for Computational Linguistics

@inproceedings{saunders-etal-2019-domain,
author = {Saunders, Danielle and Stahlberg, Felix and de Gispert, Adri{\`a} and Byrne, Bill},
title = {Domain Adaptive Inference for Neural Machine Translation},
booktitle = {Proceedings of the 57th Conference of the Association for Computational Linguistics},
month = {jul},
address = {Florence, Italy},
publisher = {Association for Computational Linguistics},
url = {
https://www.aclweb.org/anthology/P19-1022},
pages = {222--228},
year = 2019
}
Saunders et al. (2019)
Shu, Raphael and Nakayama, Hideki and Cho, Kyunghyun (2019):
Generating Diverse Translations with Sentence Codes, Proceedings of the 57th Conference of the Association for Computational Linguistics

@inproceedings{shu-etal-2019-generating,
author = {Shu, Raphael and Nakayama, Hideki and Cho, Kyunghyun},
title = {Generating Diverse Translations with Sentence Codes},
booktitle = {Proceedings of the 57th Conference of the Association for Computational Linguistics},
month = {jul},
address = {Florence, Italy},
publisher = {Association for Computational Linguistics},
url = {
https://www.aclweb.org/anthology/P19-1177},
pages = {1823--1827},
year = 2019
}
Shu et al. (2019)
Wang, Wei and Caswell, Isaac and Chelba, Ciprian (2019):
Dynamically Composing Domain-Data Selection with Clean-Data Selection by ``Co-Curricular Learning'' for Neural Machine Translation, Proceedings of the 57th Conference of the Association for Computational Linguistics

@inproceedings{wang-etal-2019-dynamically,
author = {Wang, Wei and Caswell, Isaac and Chelba, Ciprian},
title = {Dynamically Composing Domain-Data Selection with Clean-Data Selection by {``}Co-Curricular Learning{''} for Neural Machine Translation},
booktitle = {Proceedings of the 57th Conference of the Association for Computational Linguistics},
month = {jul},
address = {Florence, Italy},
publisher = {Association for Computational Linguistics},
url = {
https://www.aclweb.org/anthology/P19-1123},
pages = {1282--1292},
year = 2019
}
Wang et al. (2019)
Variš, Dušan and Bojar, Ondřej (2019):
Unsupervised Pretraining for Neural Machine Translation Using Elastic Weight Consolidation, Proceedings of the 57th Conference of the Association for Computational Linguistics: Student Research Workshop

@inproceedings{varis-bojar-2019-unsupervised,
author = {Vari{\v{s}}, Du{\v{s}}an and Bojar, Ond{\v{r}}ej},
title = {Unsupervised Pretraining for Neural Machine Translation Using Elastic Weight Consolidation},
booktitle = {Proceedings of the 57th Conference of the Association for Computational Linguistics: Student Research Workshop},
month = {jul},
address = {Florence, Italy},
publisher = {Association for Computational Linguistics},
url = {
https://www.aclweb.org/anthology/P19-2017},
pages = {130--135},
year = 2019
}
Variš and Bojar (2019)
Kalimuthu, Marimuthu and Barz, Michael and Sonntag, Daniel (2019):
Incremental Domain Adaptation for Neural Machine Translation in Low-Resource Settings, Proceedings of the Fourth Arabic Natural Language Processing Workshop

@inproceedings{kalimuthu-etal-2019-incremental,
author = {Kalimuthu, Marimuthu and Barz, Michael and Sonntag, Daniel},
title = {Incremental Domain Adaptation for Neural Machine Translation in Low-Resource Settings},
booktitle = {Proceedings of the Fourth Arabic Natural Language Processing Workshop},
month = {aug},
address = {Florence, Italy},
publisher = {Association for Computational Linguistics},
url = {
https://www.aclweb.org/anthology/W19-4601},
pages = {1--10},
year = 2019
}
Kalimuthu et al. (2019)
M. Amin Farajian and Nicola Bertoldi and Matteo Negri and Marco Turchi and Marcello Federico (2018):
Evaluation of Terminology Translation in Instance-Based Neural MT Adaptation, Proceedings of the 21st Annual Conference of the European Association for Machine Translation

@inproceedings{eamt18-Farajian,
author = {M. Amin Farajian and Nicola Bertoldi and Matteo Negri and Marco Turchi and Marcello Federico},
title = {Evaluation of Terminology Translation in Instance-Based Neural MT Adaptation},
booktitle = {Proceedings of the 21st Annual Conference of the European Association for Machine Translation},
location = {Alicante, Spain},
year = 2018
}
Farajian et al. (2018)
Shen Yan and Shahram Khadivi and Leonard Dahlmann and Pavel Petrushkov and Sanjika Hewavitharana (2018):
Word-based Domain Adaptation for Neural Machine Translation, Proceedings of the International Workshop on Spoken Language Translation (IWSLT)

@inproceedings{iwslt18-Word-based-Yan,
author = {Shen Yan and Shahram Khadivi and Leonard Dahlmann and Pavel Petrushkov and Sanjika Hewavitharana},
title = {Word-based Domain Adaptation for Neural Machine Translation},
booktitle = {Proceedings of the International Workshop on Spoken Language Translation (IWSLT)},
year = 2018
}
Yan et al. (2018)
Gu, Shuhao and Feng, Yang and Liu, Qun (2019):
Improving Domain Adaptation Translation with Domain Invariant and Specific Information, Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

@inproceedings{gu-etal-2019-improving,
author = {Gu, Shuhao and Feng, Yang and Liu, Qun},
title = {Improving Domain Adaptation Translation with Domain Invariant and Specific Information},
booktitle = {Proceedings of the 2019 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)},
month = {jun},
address = {Minneapolis, Minnesota},
publisher = {Association for Computational Linguistics},
url = {
https://www.aclweb.org/anthology/N19-1312},
pages = {3081--3091},
year = 2019
}
Gu et al. (2019)
Yamagishi, Hayahide and Kanouchi, Shin and Sato, Takayuki and Komachi, Mamoru (2017):
Improving Japanese-to-English Neural Machine Translation by Voice Prediction, Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

@inproceedings{yamagishi-etal-2017-improving,
author = {Yamagishi, Hayahide and Kanouchi, Shin and Sato, Takayuki and Komachi, Mamoru},
title = {Improving {J}apanese-to-{E}nglish Neural Machine Translation by Voice Prediction},
booktitle = {Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers)},
month = {nov},
address = {Taipei, Taiwan},
publisher = {Asian Federation of Natural Language Processing},
url = {
https://www.aclweb.org/anthology/I17-2047},
pages = {277--282},
year = 2017
}
Yamagishi et al. (2017)
Hassan Sajjad and Nadir Durrani and Fahim Dalvi and Yonatan Belinkov and Stephan Vogel (2017):
Neural Machine Translation Training in a Multi-Domain Scenario, Proceedings of the International Workshop on Spoken Language Translation (IWSLT)

@inproceedings{IWSLT2017:Sajjad,
author = {Hassan Sajjad and Nadir Durrani and Fahim Dalvi and Yonatan Belinkov and Stephan Vogel},
title = {Neural Machine Translation Training in a Multi-Domain Scenario},
url = {
http://workshop2017.iwslt.org/downloads/P01-Paper.pdf},
booktitle = {Proceedings of the International Workshop on Spoken Language Translation (IWSLT)},
location = {Tokyo, Japan},
year = 2017
}
Sajjad et al. (2017)
Lucía Santamaría and Amittai Axelrod (2017):
Data Selection with Cluster-Based Language Difference Models and Cynical Selection, Proceedings of the International Workshop on Spoken Language Translation (IWSLT)

@inproceedings{IWSLT2017:Santamaria,
author = {Luc{\'i}a Santamar{\'i}a and Amittai Axelrod},
title = {Data Selection with Cluster-Based Language Difference Models and Cynical Selection},
url = {
http://workshop2017.iwslt.org/downloads/O02-2-Paper.pdf},
booktitle = {Proceedings of the International Workshop on Spoken Language Translation (IWSLT)},
location = {Tokyo, Japan},
year = 2017
}
Santamaría and Axelrod (2017)
A. Valerio Miceli Barone and Barry Haddow and Ulrich Germann and Rico Sennrich (2017):
Regularization techniques for fine-tuning in neural machine translation, ArXiv e-prints

@ARTICLE{2017arXiv170709920V,
author = {A. Valerio Miceli~Barone and Barry Haddow and Ulrich Germann and Rico Sennrich},
title = {Regularization techniques for fine-tuning in neural machine translation},
journal = {ArXiv e-prints},
archiveprefix = {arXiv},
eprint = {1707.09920},
primaryclass = {cs.CL},
keywords = {Computer Science - Computation and Language},
month = {jul},
url = {
https://arxiv.org/pdf/1707.09920.pdf},
adsurl = {
http://adsabs.harvard.edu/abs/2017arXiv170709920V},
adsnote = {Provided by the SAO/NASA Astrophysics Data System},
year = 2017
}
Miceli Barone et al. (2017)
Niu, Xing and Rao, Sudha and Carpuat, Marine (2018):
Multi-Task Neural Models for Translating Between Styles Within and Across Languages, Proceedings of the 27th International Conference on Computational Linguistics

@inproceedings{C18-1086,
author = {Niu, Xing and Rao, Sudha and Carpuat, Marine},
title = {Multi-Task Neural Models for Translating Between Styles Within and Across Languages},
booktitle = {Proceedings of the 27th International Conference on Computational Linguistics},
month = {aug},
address = {Santa Fe, New Mexico, USA},
publisher = {Association for Computational Linguistics},
url = {
https://www.aclweb.org/anthology/C18-1086},
pages = {1008--1021},
year = 2018
}
Niu et al. (2018)
Chu, Chenhui and Wang, Rui (2018):
A Survey of Domain Adaptation for Neural Machine Translation, Proceedings of the 27th International Conference on Computational Linguistics

@inproceedings{C18-1111,
author = {Chu, Chenhui and Wang, Rui},
title = {A Survey of Domain Adaptation for Neural Machine Translation},
booktitle = {Proceedings of the 27th International Conference on Computational Linguistics},
month = {aug},
address = {Santa Fe, New Mexico, USA},
publisher = {Association for Computational Linguistics},
url = {
https://www.aclweb.org/anthology/C18-1111},
pages = {1304--1319},
year = 2018
}
Chu and Wang (2018)
Li, Yachao and Li, Junhui and Zhang, Min (2018):
Adaptive Weighting for Neural Machine Translation, Proceedings of the 27th International Conference on Computational Linguistics

@inproceedings{C18-1257,
author = {Li, Yachao and Li, Junhui and Zhang, Min},
title = {Adaptive Weighting for Neural Machine Translation},
booktitle = {Proceedings of the 27th International Conference on Computational Linguistics},
month = {aug},
address = {Santa Fe, New Mexico, USA},
publisher = {Association for Computational Linguistics},
url = {
https://www.aclweb.org/anthology/C18-1257},
pages = {3038--3048},
year = 2018
}
Li et al. (2018)
Zhang, Shiqi and Xiong, Deyi (2018):
Sentence Weighting for Neural Machine Translation Domain Adaptation, Proceedings of the 27th International Conference on Computational Linguistics

@inproceedings{C18-1269,
author = {Zhang, Shiqi and Xiong, Deyi},
title = {Sentence Weighting for Neural Machine Translation Domain Adaptation},
booktitle = {Proceedings of the 27th International Conference on Computational Linguistics},
month = {aug},
address = {Santa Fe, New Mexico, USA},
publisher = {Association for Computational Linguistics},
url = {
https://www.aclweb.org/anthology/C18-1269},
pages = {3181--3190},
year = 2018
}
Zhang and Xiong (2018)
Li, Zhongwei and Wang, Xuancong and Aw, AiTi and Chng, Eng Siong and Li, Haizhou (2018):
Named-Entity Tagging and Domain adaptation for Better Customized Translation, Proceedings of the Seventh Named Entities Workshop

@inproceedings{W18-2407,
author = {Li, Zhongwei and Wang, Xuancong and Aw, AiTi and Chng, Eng Siong and Li, Haizhou},
title = {Named-Entity Tagging and Domain adaptation for Better Customized Translation},
booktitle = {Proceedings of the Seventh Named Entities Workshop},
month = {jul},
address = {Melbourne, Australia},
publisher = {Association for Computational Linguistics},
url = {
https://www.aclweb.org/anthology/W18-2407},
pages = {41--46},
year = 2018
}
Li et al. (2018)
Sandipan Dandapat and Christian Federmann (2018):
Iterative Data Augmentation for Neural Machine Translation: a Low Resource Case Study for English–Telugu, Proceedings of the 21st Annual Conference of the European Association for Machine Translation

@inproceedings{eamt18-Dandapat,
author = {Sandipan Dandapat and Christian Federmann},
title = {Iterative Data Augmentation for Neural Machine Translation: a Low Resource Case Study for English–Telugu},
booktitle = {Proceedings of the 21st Annual Conference of the European Association for Machine Translation},
location = {Alicante, Spain},
year = 2018
}
Dandapat and Federmann (2018)
Zuzanna Parcheta and Germán Sanchis-Trilles and Francisco Casacuberta (2018):
Data selection for NMT using Infrequent n-gram Recovery, Proceedings of the 21st Annual Conference of the European Association for Machine Translation

@inproceedings{eamt18-Parcheta,
author = {Zuzanna Parcheta and Germ{\'a}n Sanchis-Trilles and Francisco Casacuberta},
title = {Data selection for NMT using Infrequent n-gram Recovery},
booktitle = {Proceedings of the 21st Annual Conference of the European Association for Machine Translation},
location = {Alicante, Spain},
year = 2018
}
Parcheta et al. (2018)
Alberto Poncelas and Gideon Maillette de Buy Wenniger and Andy Way (2018):
Feature Decay Algorithms for Neural Machine Translation, Proceedings of the 21st Annual Conference of the European Association for Machine Translation

@inproceedings{eamt18-Poncelas,
author = {Alberto Poncelas and Gideon Maillette de Buy Wenniger and Andy Way},
title = {Feature Decay Algorithms for Neural Machine Translation},
booktitle = {Proceedings of the 21st Annual Conference of the European Association for Machine Translation},
location = {Alicante, Spain},
year = 2018
}
Poncelas et al. (2018)
Costa-jussà, Marta R. and Zampieri, Marcos and Pal, Santanu (2018):
A Neural Approach to Language Variety Translation, Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2018)

@inproceedings{W18-3931,
author = {Costa-juss{\`a}, Marta R. and Zampieri, Marcos and Pal, Santanu},
title = {A Neural Approach to Language Variety Translation},
booktitle = {Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2018)},
month = {aug},
address = {Santa Fe, New Mexico, USA},
publisher = {Association for Computational Linguistics},
url = {
https://www.aclweb.org/anthology/W18-3931},
pages = {275--282},
year = 2018
}
Costa-jussà et al. (2018)
Wang, Wei and Watanabe, Taro and Hughes, Macduff and Nakagawa, Tetsuji and Chelba, Ciprian (2018):
Denoising Neural Machine Translation Training with Trusted Data and Online Data Selection, Proceedings of the Third Conference on Machine Translation: Research Papers

@inproceedings{W18-6314,
author = {Wang, Wei and Watanabe, Taro and Hughes, Macduff and Nakagawa, Tetsuji and Chelba, Ciprian},
title = {Denoising Neural Machine Translation Training with Trusted Data and Online Data Selection},
booktitle = {Proceedings of the Third Conference on Machine Translation: Research Papers},
month = {oct},
address = {Belgium, Brussels},
publisher = {Association for Computational Linguistics},
url = {
https://www.aclweb.org/anthology/W18-6314},
pages = {133--143},
year = 2018
}
Wang et al. (2018)
Silva, Catarina Cruz and Liu, Chao-Hong and Poncelas, Alberto and Way, Andy (2018):
Extracting In-domain Training Corpora for Neural Machine Translation Using Data Selection Methods, Proceedings of the Third Conference on Machine Translation: Research Papers

@inproceedings{W18-6323,
author = {Silva, Catarina Cruz and Liu, Chao-Hong and Poncelas, Alberto and Way, Andy},
title = {Extracting In-domain Training Corpora for Neural Machine Translation Using Data Selection Methods},
booktitle = {Proceedings of the Third Conference on Machine Translation: Research Papers},
month = {oct},
address = {Belgium, Brussels},
publisher = {Association for Computational Linguistics},
url = {
https://www.aclweb.org/anthology/W18-6323},
pages = {224--231},
year = 2018
}
Silva et al. (2018)
Kocmi, Tom and Bojar, Ondřej (2018):
Trivial Transfer Learning for Low-Resource Neural Machine Translation, Proceedings of the Third Conference on Machine Translation: Research Papers

@inproceedings{W18-6325,
author = {Kocmi, Tom and Bojar, Ond{\v{r}}ej},
title = {Trivial Transfer Learning for Low-Resource Neural Machine Translation},
booktitle = {Proceedings of the Third Conference on Machine Translation: Research Papers},
month = {oct},
address = {Belgium, Brussels},
publisher = {Association for Computational Linguistics},
url = {
https://www.aclweb.org/anthology/W18-6325},
pages = {244--252},
year = 2018
}
Kocmi and Bojar (2018)
Pintu Lohar and Haithem Afli and Andy Way (2018):
Nearest Neighbour Class-Combination Method for Balancing Translation Quality and Sentiment Preservation, Annual Meeting of the Association for Machine Translation in the Americas (AMTA)

@inproceedings{AMTA2018-Lohar,
author = {Pintu Lohar and Haithem Afli and Andy Way},
title = {Nearest Neighbour Class-Combination Method for Balancing Translation Quality and Sentiment Preservation},
booktitle = {Annual Meeting of the Association for Machine Translation in the Americas (AMTA)},
location = {Boston, USA},
year = 2018
}
Lohar et al. (2018)
Ghazvininejad, Marjan and Choi, Yejin and Knight, Kevin (2018):
Neural Poetry Translation, Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)

@InProceedings{N18-2011,
author = {Ghazvininejad, Marjan and Choi, Yejin and Knight, Kevin},
title = {Neural Poetry Translation},
booktitle = {Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)},
publisher = {Association for Computational Linguistics},
pages = {67--71},
location = {New Orleans, Louisiana},
url = {
http://aclweb.org/anthology/N18-2011},
year = 2018
}
Ghazvininejad et al. (2018)
Michel, Paul and Neubig, Graham (2018):
Extreme Adaptation for Personalized Neural Machine Translation, Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

@InProceedings{P18-2050,
author = {Michel, Paul and Neubig, Graham},
title = {Extreme Adaptation for Personalized Neural Machine Translation},
booktitle = {Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)},
publisher = {Association for Computational Linguistics},
pages = {312--318},
location = {Melbourne, Australia},
url = {
http://aclweb.org/anthology/P18-2050},
year = 2018
}
Michel and Neubig (2018)
Zeng, Jiali and Su, Jinsong and Wen, Huating and Liu, Yang and Xie, Jun and Yin, Yongjing and Zhao, Jianqiang (2018):
Multi-Domain Neural Machine Translation with Word-Level Domain Context Discrimination, Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

@inproceedings{D18-1041,
author = {Zeng, Jiali and Su, Jinsong and Wen, Huating and Liu, Yang and Xie, Jun and Yin, Yongjing and Zhao, Jianqiang},
title = {Multi-Domain Neural Machine Translation with Word-Level Domain Context Discrimination},
booktitle = {Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing},
address = {Brussels, Belgium},
publisher = {Association for Computational Linguistics},
url = {
https://www.aclweb.org/anthology/D18-1041},
pages = {447--457},
year = 2018
}
Zeng et al. (2018)
Britz, Denny and Le, Quoc and Pryzant, Reid (2017):
Effective Domain Mixing for Neural Machine Translation, Proceedings of the Second Conference on Machine Translation, Volume 1: Research Paper

@InProceedings{britz-le-pryzant:2017:WMT,
author = {Britz, Denny and Le, Quoc and Pryzant, Reid},
title = {Effective Domain Mixing for Neural Machine Translation},
booktitle = {Proceedings of the Second Conference on Machine Translation, Volume 1: Research Paper},
month = {September},
address = {Copenhagen, Denmark},
publisher = {Association for Computational Linguistics},
pages = {118--126},
url = {
http://www.aclweb.org/anthology/W17-4712},
year = 2017
}
Britz et al. (2017)
Joty, Shafiq and Sajjad, Hassan and Durrani, Nadir and Al-Mannai, Kamla and Abdelali, Ahmed and Vogel, Stephan (2015):
How to Avoid Unwanted Pregnancies: Domain Adaptation using Neural Network Models, Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

@InProceedings{joty-EtAl:2015:EMNLP2,
author = {Joty, Shafiq and Sajjad, Hassan and Durrani, Nadir and Al-Mannai, Kamla and Abdelali, Ahmed and Vogel, Stephan},
title = {How to Avoid Unwanted Pregnancies: Domain Adaptation using Neural Network Models},
booktitle = {Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing},
month = {September},
address = {Lisbon, Portugal},
publisher = {Association for Computational Linguistics},
pages = {1259--1270},
url = {
http://aclweb.org/anthology/D15-1147},
year = 2015
}
Joty et al. (2015)