
Neural Machine Translation Winter School 2017

To bring everybody up to speed on neural machine translation, we are organizing an unofficial winter school in January 2017. It is aimed at the more junior graduate students in the MT group, but anybody else at CLSP is welcome as well.

We will meet from 10 to 12 in Hackerman 306.

I'll update my survey of research papers and neural machine translation chapter.

Schedule

Day        Presenter            Topic
Tu Jan 10  Philipp Koehn        Introduction into NLP with neural networks
Th Jan 12  Kevin Duh            Hands-on implementation of neural networks with Theano (reference nlm.py implementation)
Tu Jan 17  Rebecca Knowles      NMT with Attention
Th Jan 19  Shuoyang Ding        Running Nematus and AmuNMT on CLSP machines; Notes on AmuNMT Code
Tu Jan 24  Group Presentations  Advanced topics:
                                Supervised Alignment in Attention Model
                                Domain Adaptation
Th Jan 26  Gaurav Kumar         Tensorflow Code

Background reading

Advanced Topics

Coverage in Attention Model (Ashish)

  • Tu, Zhaopeng and Lu, Zhengdong and Liu, Yang and Liu, Xiaohua and Li, Hang (2016): Modeling Coverage for Neural Machine Translation, Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, pdf
  • Cohn, Trevor and Hoang, Cong Duy Vu and Vymolova, Ekaterina and Yao, Kaisheng and Dyer, Chris and Haffari, Gholamreza (2016): Incorporating Structural Alignment Biases into an Attentional Neural Translation Model, Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pdf
  • Zhaopeng Tu and Yang Liu and Zhengdong Lu and Xiaohua Liu and Hang Li (2016): Context Gates for Neural Machine Translation, pdf
  • Zhaopeng Tu and Yang Liu and Lifeng Shang and Xiaohua Liu and Hang Li (2017): Neural Machine Translation with Reconstruction, Proceedings of the 31st AAAI Conference on Artificial Intelligence, pdf
  • Meng, Fandong and Lu, Zhengdong and Li, Hang and Liu, Qun (2016): Interactive Attention for Neural Machine Translation, Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics, pdf

Supervised Alignment in Attention Model (Becky, Pengyu)

  • Wenhu Chen and Evgeny Matusov and Shahram Khadivi and Jan-Thorsten Peter (2016): Guided Alignment Training for Topic-Aware Neural Machine Translation, pdf
  • Alkhouli, Tamer and Bretschner, Gabriel and Peter, Jan-Thorsten and Hethnawi, Mohammed and Guta, Andreas and Ney, Hermann (2016): Alignment-Based Neural Machine Translation, Proceedings of the First Conference on Machine Translation, pdf
  • Liu, Lemao and Utiyama, Masao and Finch, Andrew and Sumita, Eiichiro (2016): Neural Machine Translation with Supervised Attention, Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics, pdf

Character-Based Models (Winston)

  • Costa-jussà, Marta R. and España-Bonet, Cristina and Madhyastha, Pranava and Escolano, Carlos and Fonollosa, José A. R. (2016): The TALP-UPC Spanish-English WMT Biomedical Task: Bilingual Embeddings and Char-based Neural Language Model Rescoring in a Phrase-based System, Proceedings of the First Conference on Machine Translation, pdf
  • Costa-jussà, Marta R. and Fonollosa, José A. R. (2016): Character-based Neural Machine Translation, Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, pdf
  • Yang, Zhen and Chen, Wei and Wang, Feng and Xu, Bo (2016): A Character-Aware Encoder for Neural Machine Translation, Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pdf
  • Chung, Junyoung and Cho, Kyunghyun and Bengio, Yoshua (2016): A Character-level Decoder without Explicit Segmentation for Neural Machine Translation, Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, pdf
  • Luong, Minh-Thang and Manning, Christopher D. (2016): Achieving Open Vocabulary Neural Machine Translation with Hybrid Word-Character Models, Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, pdf

Multi-Source Models (Huda, Chris)

  • Zoph, Barret and Knight, Kevin (2016): Multi-Source Neural Translation, Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pdf
  • Firat, Orhan and Cho, Kyunghyun and Bengio, Yoshua (2016): Multi-Way, Multilingual Neural Machine Translation with a Shared Attention Mechanism, Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pdf
  • Garmash, Ekaterina and Monz, Christof (2016): Ensemble Learning for Multi-Source Neural Machine Translation, Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics, pdf
  • Melvin Johnson, Mike Schuster, Quoc V. Le, Maxim Krikun, Yonghui Wu, Zhifeng Chen, Nikhil Thorat, Fernanda Viégas, Martin Wattenberg, Greg Corrado, Macduff Hughes, Jeffrey Dean (2016): Google’s Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation, pdf

Domain Adaptation (Matt, Rebecca)

  • Minh-Thang Luong and Christopher Manning (2015): Stanford neural machine translation systems for spoken language domains, Proceedings of the International Workshop on Spoken Language Translation (IWSLT), pdf
  • Markus Freitag and Yaser Al-Onaizan (2016): Fast Domain Adaptation for Neural Machine Translation, pdf
  • Catherine Kobus and Josep Crego and Jean Senellart (2016): Domain Control for Neural Machine Translation, pdf

System Overview

Mini Projects

  • Google's research paper had a hacky solution to coverage during search: add up the attention weights that each input word receives over the output, then divide the hypothesis score by sum(min(attention-per-word,1)). This would be very easy to implement.
  • Thinking about beam search, especially the role of the attention state
  • Pass-through of unknown words, placeholders for numbers, dates, etc. during decoding
  • Thinking about domain adaptation
  • Adding dictionaries and other non-sentence resources
  • Lattice rescoring (mini-SCALE project)
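As a starting point for the coverage mini project, here is a minimal sketch of the heuristic as described in the list above: sum the attention each input word receives, cap it at 1, and divide the hypothesis score by the total. All function names, the attention layout (one list of weights per output word), and the `beta` and `eps` values are assumptions for illustration and are not taken from any toolkit. To my understanding, the published Google formulation adds a log-based term instead of dividing, so an additive variant is included for comparison.

```python
import math


def coverage_per_source_word(attention):
    """Sum the attention each source word receives over all output words.

    attention[j][i] is the weight on source word i when producing
    target word j (hypothetical layout; each row should sum to 1).
    """
    n_src = len(attention[0])
    return [sum(step[i] for step in attention) for i in range(n_src)]


def capped_coverage(attention):
    """sum(min(attention-per-word, 1)) over all source words."""
    return sum(min(c, 1.0) for c in coverage_per_source_word(attention))


def rescore_by_division(log_prob, attention):
    """Heuristic from the list above: divide the score by capped coverage.

    With a negative log-probability as the score, higher coverage moves
    the result closer to zero, so better-covered hypotheses rank higher.
    """
    return log_prob / capped_coverage(attention)


def log_coverage_penalty(attention, beta=0.2, eps=1e-10):
    """Additive variant: beta * sum_i log(min(coverage_i, 1)).

    beta is an illustrative weight; eps is an assumption added here to
    avoid log(0) for source words that received no attention at all.
    """
    coverage = coverage_per_source_word(attention)
    return beta * sum(math.log(max(min(c, 1.0), eps)) for c in coverage)


if __name__ == "__main__":
    # Toy example: 3 source words, 3 target words.
    full = [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1], [0.0, 0.1, 0.9]]
    skewed = [[0.9, 0.1, 0.0], [0.9, 0.1, 0.0], [0.8, 0.2, 0.0]]
    for name, att in (("full", full), ("skewed", skewed)):
        print(name, rescore_by_division(-6.0, att), log_coverage_penalty(att))
```

In beam search, either function would be applied when comparing finished hypotheses, which requires keeping the attention weights of each hypothesis around until the end of decoding.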
Page last modified on January 30, 2017, at 04:52 PM