Possible Topics for Spring 2015
Domain Adaptation
- Ann Irvine and John Morgan and Marine Carpuat and Hal Daume III and Dragos Munteanu (2013): Measuring Machine Translation Errors in New Domains, Transactions of the Association for Computational Linguistics (TACL), pdf
- Eva Hasler and Barry Haddow and Philipp Koehn (2014): Combining domain and topic adaptation for SMT, Proceedings of the Eleventh Conference of the Association for Machine Translation in the Americas (AMTA), pdf
- Amittai Axelrod and QingJun Li and William D. Lewis (2012): Applications of data selection via cross-entropy difference for real-world statistical machine translation, Proceedings of the Seventh International Workshop on Spoken Language Translation (IWSLT), pdf
Sparse Feature Training
- Green, Spence and Wang, Sida and Cer, Daniel and Manning, Christopher D (2013): Fast and Adaptive Online Training of Feature-Rich Translation Models, pdf
- Chiang, David and Marton, Yuval and Resnik, Philip (2008): Online Large-Margin Training of Syntactic and Structural Translation Features, Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, pdf
- Cherry, Colin (2013): Improved Reordering for Phrase-Based Translation using Sparse Features, Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pdf
Computer Aided Translation - Online Updating
- Michael Denkowski and Alon Lavie and Isabel Lacruz and Chris Dyer (2014): Real time adaptive machine translation: cdec and TransCenter, Proceedings of the Third Workshop on Post-Editing Technology and Practice (WPTP-3), pdf
- Ulrich Germann (2014): Dynamic phrase tables for machine translation in an interactive post-editing scenario, Proceedings of the Workshop on Interactive and Adaptive Machine Translation, pdf
- Nicola Bertoldi and Mauro Cettolo and Marcello Federico (2013): Cache-based Online Adaptation for Machine Translation Enhanced Computer Assisted Translation, Machine Translation Summit XIV, pdf
Computer Aided Translation - Postediting
- Isabel Lacruz and Michael Denkowski and Alon Lavie (2014): Cognitive demand and cognitive effort in post-editing, Proceedings of the Third Workshop on Post-Editing Technology and Practice (WPTP-3), pdf
- Federico Gaspari and Antonio Toral and Sudip Kumar Naskar and Declan Groves and Andy Way (2014): Perception vs. reality: measuring machine translation post-editing productivity, Proceedings of the Third Workshop on Post-Editing Technology and Practice (WPTP-3), pdf
- Koehn, Philipp and Germann, Ulrich (2014): The Impact of Machine Translation Quality on Human Post-Editing, Proceedings of the EACL 2014 Workshop on Humans and Computer-assisted Translation, pdf
- several more papers of unknown quality
Syntax - esp. soft labels
- Saluja, Avneesh and Dyer, Chris and Cohen, Shay B. (2014): Latent-Variable Synchronous CFGs for Hierarchical Translation, Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pdf
- Mino, Hideya and Watanabe, Taro and Sumita, Eiichiro (2014): Syntax-Augmented Machine Translation using Syntax-Label Clustering, Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pdf
- Huck, Matthias and Hoang, Hieu and Koehn, Philipp (2014): Augmenting String-to-Tree and Tree-to-String Translation with Non-Syntactic Phrases, Proceedings of the Ninth Workshop on Statistical Machine Translation, pdf
- Huck, Matthias and Hoang, Hieu and Koehn, Philipp (2014): Preference Grammars and Soft Syntactic Constraints for GHKM Syntax-based Statistical Machine Translation, Proceedings of SSST-8, Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation, pdf
Specific Linguistic Problems
- Cholakov, Kostadin and Kordoni, Valia (2014): Better Statistical Machine Translation through Linguistic Treatment of Phrasal Verbs, Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pdf
- Williams, Philip and Koehn, Philipp (2014): Using Feature Structures to Improve Verb Translation in English-to-German Statistical MT, Proceedings of the 3rd Workshop on Hybrid Approaches to Machine Translation (HyTra), pdf
- Fancellu, Federico and Webber, Bonnie (2014): Applying the semantics of negation to SMT through n-best list re-ranking, Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, pdf
- Discourse
- Compounds
- Pronouns
Word Alignment
- Songyot, Theerawat and Chiang, David (2014): Improving Word Alignment using Word Similarity, Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pdf
- Simion, Andrei and Collins, Michael and Stein, Cliff (2014): Some Experiments with a Convex IBM Model 2, Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, volume 2: Short Papers, pdf
- Gelling, Douwe and Cohn, Trevor (2014): Simple extensions and POS Tags for a reparameterised IBM Model 2, Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pdf
Decipherment
- Nuhn, Malte and Ney, Hermann (2014): EM Decipherment for Large Vocabularies, Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pdf
- Dou, Qing and Vaswani, Ashish and Knight, Kevin (2014): Beyond Parallel Data: Joint Word Alignment and Decipherment Improves Machine Translation, Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pdf
- Irvine, Ann and Callison-Burch, Chris (2014): Hallucinating Phrase Translations for Low Resource MT, Proceedings of the Eighteenth Conference on Computational Natural Language Learning, pdf
Fun Papers
- van Halteren, Hans (2008): Source Language Markers in EUROPARL Translations, Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008), pdf
- Gennadi Lembersky and Noam Ordan and Shuly Wintner (2013): Improving Statistical Machine Translation by Adapting Translation Models to Translationese, Computational Linguistics, pdf
- Graham, Yvette and Baldwin, Timothy and Moffat, Alistair and Zobel, Justin (2014): Is Machine Translation Getting Better over Time?, Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, pdf
Corpus Crawling
- Smith, Jason R. and Saint-Amand, Herve and Plamada, Magdalena and Koehn, Philipp and Callison-Burch, Chris and Lopez, Adam (2013): Dirt Cheap Web-Scale Parallel Text from the Common Crawl, Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pdf
- Uszkoreit, Jakob and Ponte, Jay and Popat, Ashok and Dubiner, Moshe (2010): Large Scale Parallel Document Mining for Machine Translation, Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010), pdf
- Antonova, Alexandra and Misyurev, Alexey (2011): Building a Web-Based Parallel Corpus and Filtering Out Machine-Translated Text, Proceedings of the 4th Workshop on Building and Using Comparable Corpora: Comparable Corpora and the Web, pdf
- Spencer Rarrick and Chris Quirk and Will Lewis (2011): MT Detection in Web-Scraped Parallel Corpora, Proceedings of the 13th Machine Translation Summit (MT Summit XIII), pdf
Metrics
Confidence Estimation