Large-Scale Discriminative Training
The current mix of generative models, ad hoc scoring functions, and discriminative tuning of a handful of weights is theoretically unappealing, so there has been a long-standing effort to train all the millions of parameters of a statistical machine translation model discriminatively.
Large Scale Discriminative Training is the main subject of 44 publications. 18 are discussed here.
Publications
Large-scale discriminative training methods that optimize millions of features over the entire training corpus have emerged recently.
Tillmann, Christoph and Zhang, Tong (2005):
A Localized Prediction Model for Statistical Machine Translation, Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL'05)
@InProceedings{tillmann-zhang:2005:ACL,
author = {Tillmann, Christoph and Zhang, Tong},
title = {A Localized Prediction Model for Statistical Machine Translation},
booktitle = {Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL'05)},
month = {June},
address = {Ann Arbor, Michigan},
publisher = {Association for Computational Linguistics},
pages = {557--564},
url = {
http://www.aclweb.org/anthology/P/P05/P05-1069},
year = 2005
}
Tillmann and Zhang (2005) add a binary feature for each phrase translation table entry and train feature weights using a stochastic gradient descent method. Kernel regression methods may be applied to the same task
Wang, Zhuoran and Shawe-Taylor, John and Szedmak, Sandor (2007):
Kernel Regression Based Machine Translation, Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Companion Volume, Short Papers
@InProceedings{wang-shawetaylor-szedmak:2007:ShortPapers,
author = {Wang, Zhuoran and Shawe-Taylor, John and Szedmak, Sandor},
title = {Kernel Regression Based Machine Translation},
booktitle = {Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Companion Volume, Short Papers},
month = {April},
address = {Rochester, New York},
publisher = {Association for Computational Linguistics},
pages = {185--188},
url = {
http://www.aclweb.org/anthology/N/N07/N07-2047},
year = 2007
}
(Wang et al., 2007;
Wang, Zhuoran and Shawe-Taylor, John (2008):
Kernel Regression Framework for Machine Translation: UCL System Description for WMT 2008 Shared Translation Task, Proceedings of the Third Workshop on Statistical Machine Translation
mentioned in Research Groups and Large Scale Discriminative Training
@InProceedings{wang-shawetaylor:2008:WMT,
author = {Wang, Zhuoran and Shawe-Taylor, John},
title = {Kernel Regression Framework for Machine Translation: {UCL} System Description for {WMT} 2008 Shared Translation Task},
booktitle = {Proceedings of the Third Workshop on Statistical Machine Translation},
month = {June},
address = {Columbus, Ohio},
publisher = {Association for Computational Linguistics},
pages = {155--158},
url = {
http://www.aclweb.org/anthology/W/W08/W08-0322},
year = 2008
}
Wang and Shawe-Taylor, 2008).
Benjamin Wellington and Joseph P. Turian and Chris Pike and I. Dan Melamed (2006):
Scalable Purely-Discriminative Training for Word and Tree Transducers, 5th Conference of the Association for Machine Translation in the Americas (AMTA)
@InProceedings{Wellington:2006:AMTA,
author = {Benjamin Wellington and Joseph P. Turian and Chris Pike and I. Dan Melamed},
title = {Scalable Purely-Discriminative Training for Word and Tree Transducers},
url = {
http://www.mt-archive.info/AMTA-2006-Wellington.pdf},
googlescholar = {5966566586009385723},
booktitle = {5th Conference of the Association for Machine Translation in the Americas (AMTA)},
month = {August},
address = {Boston, Massachusetts},
year = 2006
}
Wellington et al. (2006) apply discriminative training to a tree translation model. Large-scale discriminative training may also use the perceptron algorithm
Liang, Percy and Bouchard-Côté, Alexandre and Klein, Dan and Taskar, Ben (2006):
An End-to-End Discriminative Approach to Machine Translation, Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics
@InProceedings{liang-EtAl:2006:COLACL,
author = {Liang, Percy and Bouchard-C\^{o}t\'{e}, Alexandre and Klein, Dan and Taskar, Ben},
title = {An End-to-End Discriminative Approach to Machine Translation},
booktitle = {Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics},
month = {July},
address = {Sydney, Australia},
publisher = {Association for Computational Linguistics},
pages = {761--768},
url = {
http://www.aclweb.org/anthology/P/P06/P06-1096},
year = 2006
}
(Liang et al., 2006) or variations thereof
Tillmann, Christoph and Zhang, Tong (2006):
A Discriminative Global Training Algorithm for Statistical MT, Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics
@InProceedings{tillmann-zhang:2006:COLACL,
author = {Tillmann, Christoph and Zhang, Tong},
title = {A Discriminative Global Training Algorithm for Statistical MT},
booktitle = {Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics},
month = {July},
address = {Sydney, Australia},
publisher = {Association for Computational Linguistics},
pages = {721--728},
url = {
http://www.aclweb.org/anthology/P/P06/P06-1091},
year = 2006
}
(Tillmann and Zhang, 2006) to directly optimize on error metrics such as BLEU.
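As an illustration of the perceptron-style training mentioned above, the following is a minimal sketch and not the method of any cited paper. It assumes each sentence comes with an n-best list of candidates, each carrying a sparse feature dictionary and a sentence-level quality score (for example a smoothed BLEU value against the reference). The function name `perceptron_update` and the data layout are hypothetical.

```python
from collections import defaultdict


def perceptron_update(weights, nbest, learning_rate=1.0):
    """Apply one perceptron step for a single sentence.

    nbest is a list of (features, quality) pairs. features maps a feature
    name to its value; quality is an assumed sentence-level metric score.
    """
    def model_score(features):
        return sum(weights[name] * value for name, value in features.items())

    # Candidate the current model prefers.
    predicted = max(nbest, key=lambda cand: model_score(cand[0]))
    # Surrogate for the reference: best candidate according to the metric.
    oracle = max(nbest, key=lambda cand: cand[1])

    # Update only when the model's choice is worse than the oracle.
    if predicted[1] < oracle[1]:
        for name, value in oracle[0].items():
            weights[name] += learning_rate * value
        for name, value in predicted[0].items():
            weights[name] -= learning_rate * value
    return weights


weights = defaultdict(float)
nbest = [({"phrase:a": 1.0}, 0.2), ({"phrase:b": 1.0}, 0.9)]
perceptron_update(weights, nbest)
```

Using the metric-best n-best entry as the update target is one common workaround for references that the decoder cannot reach; other choices are possible.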
Abhishek Arun and Philipp Koehn (2007):
Online Learning Methods For Discriminative Training of Phrase Based Statistical Machine Translation, Proceedings of the MT Summit XI
@inproceedings{Arun:2007:MTSummit,
author = {Abhishek Arun and Philipp Koehn},
title = {Online Learning Methods For Discriminative Training of Phrase Based Statistical Machine Translation},
url = {
http://www.researchgate.net/publication/228345919\_Online\_learning\_methods\_for\_discriminative\_training\_of\_phrase\_based\_statistical\_machine\_translation/file/79e4150db363f73ea3.pdf},
googlescholar = {5937017335543240897},
booktitle = {Proceedings of the {MT} Summit XI},
year = 2007
}
Arun and Koehn (2007) compare MIRA and the Perceptron algorithm and point out some of the problems on the road to large-scale discriminative training. This approach has also been applied to a variant of the hierarchical phrase model
Watanabe, Taro and Suzuki, Jun and Tsukada, Hajime and Isozaki, Hideki (2007):
Online Large-Margin Training for Statistical Machine Translation, Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)
@InProceedings{watanabe-EtAl:2007:EMNLP-CoNLL2007,
author = {Watanabe, Taro and Suzuki, Jun and Tsukada, Hajime and Isozaki, Hideki},
title = {Online Large-Margin Training for Statistical Machine Translation},
booktitle = {Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)},
pages = {764--773},
url = {
http://www.aclweb.org/anthology/D/D07/D07-1080},
year = 2007
}
(Watanabe et al., 2007;
Taro Watanabe and Jun Suzuki and Katsuhito Sudoh and Hajime Tsukada and Hideki Isozaki (2007):
Larger Feature Set Approach for Machine Translation in IWSLT 2007, Proceedings of the International Workshop on Spoken Language Translation (IWSLT)
mentioned in Research Groups and Large Scale Discriminative Training
@inproceedings{Watanabe:2007:IWSLT,
author = {Taro Watanabe and Jun Suzuki and Katsuhito Sudoh and Hajime Tsukada and Hideki Isozaki},
title = {Larger Feature Set Approach for Machine Translation in {IWSLT} 2007},
url = {
http://20.210-193-52.unknown.qala.com.sg/archive/iwslt\_07/papers/slt7\_111.pdf},
googlescholar = {2974416785997723565},
booktitle = {Proceedings of the International Workshop on Spoken Language Translation (IWSLT)},
year = 2007
}
Watanabe et al., 2007b). The MIRA algorithm may also be used for an extended form of parameter tuning
Chiang, David and Marton, Yuval and Resnik, Philip (2008):
Online Large-Margin Training of Syntactic and Structural Translation Features, Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing
@InProceedings{chiang-marton-resnik:2008:EMNLP,
author = {Chiang, David and Marton, Yuval and Resnik, Philip},
title = {Online Large-Margin Training of Syntactic and Structural Translation Features},
booktitle = {Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing},
month = {October},
address = {Honolulu, Hawaii},
publisher = {Association for Computational Linguistics},
pages = {224--233},
url = {
http://www.aclweb.org/anthology/D08-1024},
year = 2008
}
(Chiang et al., 2008), allowing for the use of thousands of features
Chiang, David and Knight, Kevin and Wang, Wei (2009):
11,001 New Features for Statistical Machine Translation, Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
@InProceedings{chiang-knight-wang:2009:NAACLHLT09,
author = {Chiang, David and Knight, Kevin and Wang, Wei},
title = {11,001 New Features for Statistical Machine Translation},
booktitle = {Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics},
month = {June},
address = {Boulder, Colorado},
publisher = {Association for Computational Linguistics},
pages = {218--226},
url = {
http://www.aclweb.org/anthology/N/N09/N09-1025},
year = 2009
}
(Chiang et al., 2009), covering properties such as source and target syntax
Chiang, David (2010):
Learning to Translate with Source and Target Syntax, Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
@InProceedings{chiang:2010:ACL,
author = {Chiang, David},
title = {Learning to Translate with Source and Target Syntax},
booktitle = {Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics},
month = {July},
address = {Uppsala, Sweden},
publisher = {Association for Computational Linguistics},
pages = {1443--1452},
url = {
http://www.aclweb.org/anthology/P10-1146},
year = 2010
}
(Chiang, 2010), on a larger tuning set.
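To make the large-margin idea behind MIRA concrete, here is a simplified single-constraint sketch. It is not the batch or multi-constraint variant used in the cited work. It assumes two candidates per sentence: a "hope" candidate of high quality and a "fear" candidate that the model scores well but that has low quality. The names `mira_update` and `clip` are hypothetical.

```python
def mira_update(weights, hope, fear, clip=0.01):
    """One clipped large-margin update on a (hope, fear) pair.

    hope and fear are (features, quality) pairs with sparse feature dicts.
    The step is the smallest one that makes the model margin match the
    quality difference, capped at clip.
    """
    hope_feats, hope_quality = hope
    fear_feats, fear_quality = fear

    names = set(hope_feats) | set(fear_feats)
    diff = {n: hope_feats.get(n, 0.0) - fear_feats.get(n, 0.0) for n in names}

    margin = sum(weights.get(n, 0.0) * d for n, d in diff.items())
    loss = hope_quality - fear_quality
    norm_sq = sum(d * d for d in diff.values())

    violation = loss - margin
    if violation <= 0 or norm_sq == 0:
        return weights  # constraint already satisfied, or nothing to move

    step = min(clip, violation / norm_sq)
    for n, d in diff.items():
        weights[n] = weights.get(n, 0.0) + step * d
    return weights


weights = mira_update({}, ({"good": 1.0}, 0.75), ({"bad": 1.0}, 0.25), clip=1.0)
```

Unlike the perceptron, the step size adapts to how badly the margin is violated, and the cap keeps a single noisy sentence from dominating the weights.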
Blunsom, Phil and Cohn, Trevor and Osborne, Miles (2008):
A Discriminative Latent Variable Model for Statistical Machine Translation, Proceedings of ACL-08: HLT
@InProceedings{blunsom-cohn-osborne:2008:ACLMain,
author = {Blunsom, Phil and Cohn, Trevor and Osborne, Miles},
title = {A Discriminative Latent Variable Model for Statistical Machine Translation},
booktitle = {Proceedings of ACL-08: HLT},
month = {June},
address = {Columbus, Ohio},
publisher = {Association for Computational Linguistics},
pages = {200--208},
url = {
http://www.aclweb.org/anthology/P/P08/P08-1024},
year = 2008
}
Blunsom et al. (2008) argue that it is important to perform feature updates on all derivations of a translation, not just the most likely one, to address spurious ambiguity. A representative subset of translations may be acquired by sampling
Arun, Abhishek and Dyer, Chris and Haddow, Barry and Blunsom, Phil and Lopez, Adam and Koehn, Philipp (2009):
Monte Carlo inference and maximization for phrase-based translation, Proceedings of the Thirteenth Conference on Computational Natural Language Learning (CoNLL-2009)
@InProceedings{arun-EtAl:2009:CoNLL,
author = {Arun, Abhishek and Dyer, Chris and Haddow, Barry and Blunsom, Phil and Lopez, Adam and Koehn, Philipp},
title = {Monte Carlo inference and maximization for phrase-based translation},
booktitle = {Proceedings of the Thirteenth Conference on Computational Natural Language Learning (CoNLL-2009)},
month = {June},
address = {Boulder, Colorado},
publisher = {Association for Computational Linguistics},
pages = {102--110},
url = {
http://www.aclweb.org/anthology/W09-1114},
year = 2009
}
(Arun et al., 2009). This allows for a unified approach to Minimum Risk training and decoding
Arun, Abhishek and Haddow, Barry and Koehn, Philipp (2010):
A Unified Approach to Minimum Risk Training and Decoding, Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR
@InProceedings{arun-haddow-koehn:2010:WMT,
author = {Arun, Abhishek and Haddow, Barry and Koehn, Philipp},
title = {A Unified Approach to Minimum Risk Training and Decoding},
booktitle = {Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR},
month = {July},
address = {Uppsala, Sweden},
publisher = {Association for Computational Linguistics},
pages = {365--374},
url = {
http://www.aclweb.org/anthology/W10-1756},
year = 2010
}
(Arun et al., 2010). While
Arun et al. (2009) use Gibbs sampling, simpler methods such as SampleRank
Haddow, Barry and Arun, Abhishek and Koehn, Philipp (2011):
SampleRank Training for Phrase-Based Machine Translation, Proceedings of the Sixth Workshop on Statistical Machine Translation
@InProceedings{haddow-arun-koehn:2011:WMT,
author = {Haddow, Barry and Arun, Abhishek and Koehn, Philipp},
title = {SampleRank Training for Phrase-Based Machine Translation},
booktitle = {Proceedings of the Sixth Workshop on Statistical Machine Translation},
month = {July},
address = {Edinburgh, Scotland},
publisher = {Association for Computational Linguistics},
pages = {261--271},
url = {
http://www.aclweb.org/anthology/W11-2130},
year = 2011
}
(Haddow et al., 2011) may be used as well.
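The link between sampling, minimum risk training, and decoding can be sketched as follows. This is an illustrative toy, not the cited systems: it assumes a list of translations already drawn from the model distribution (the sampler itself is not shown) and uses unigram F1 as a stand-in for a real metric such as BLEU. All function names are hypothetical.

```python
from collections import Counter


def unigram_f1(hypothesis, reference):
    """Toy gain function: F1 over whitespace-separated tokens."""
    hyp, ref = Counter(hypothesis.split()), Counter(reference.split())
    overlap = sum((hyp & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def expected_gain(samples, reference, gain=unigram_f1):
    """Monte Carlo estimate of the expected gain used in risk-based training."""
    return sum(gain(s, reference) for s in samples) / len(samples)


def minimum_risk_decode(samples, gain=unigram_f1):
    """Pick the sample with the highest expected gain against all samples."""
    counts = Counter(samples)
    total = len(samples)

    def expected(candidate):
        return sum(n * gain(candidate, other) for other, n in counts.items()) / total

    return max(counts, key=expected)


samples = [
    "the house is small",
    "the house is small",
    "the home is small",
    "a house is tiny",
]
best = minimum_risk_decode(samples)
```

The same sample set serves both purposes: averaged against a reference it estimates the training objective, and averaged against itself it selects a consensus translation at decoding time.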
Machine translation may be framed as a structured prediction problem, which is an active strand of machine learning research.
Dakun Zhang and Le Sun and Wenbo Li (2008):
A Structured Prediction Approach for Statistical Machine Translation, Proceedings of the 3rd International Joint Conference on Natural Language Processing (IJCNLP)
@inproceedings{DakunZhang:2008:IJCNLP,
author = {Dakun Zhang and Le Sun and Wenbo Li},
title = {A Structured Prediction Approach for Statistical Machine Translation},
url = {
http://www.newdesign.aclweb.org/anthology/I/I08/I08-2087.pdf},
googlescholar = {9064924548519076618},
booktitle = {Proceedings of the 3rd International Joint Conference on Natural Language Processing (IJCNLP)},
year = 2008
}
Zhang et al. (2008) frame ITG decoding in such a way and propose a discriminative training method following the SEARN algorithm
Hal Daumé III and John Langford and Daniel Marcu (2006):
Search-based Structured Prediction, Submitted to the Machine Learning Journal
@article{daume06searn,
author = {Hal {Daum\'e III} and John Langford and Daniel Marcu},
title = {Search-based Structured Prediction},
url = {
http://www.umiacs.umd.edu/~hal/docs/daume09searn.pdf},
googlescholar = {979754630010929387},
journal = {Submitted to the Machine Learning Journal},
year = 2006
}
(Daumé III et al., 2006).
Benchmarks
Discussion
Related Topics
Discriminative training methods require the translation of the training corpus, which is also a requirement for generative training of word based models, phrase based models, and syntax based models.
New Publications
Tamchyna, Aleš and Fraser, Alexander and Bojar, Ondřej and Junczys-Dowmunt, Marcin (2016):
Target-Side Context for Discriminative Models in Statistical Machine Translation, Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
@InProceedings{tamchyna-EtAl:2016:P16-1,
author = {Tamchyna, Ale\v{s} and Fraser, Alexander and Bojar, Ond\v{r}ej and Junczys-Dowmunt, Marcin},
title = {Target-Side Context for Discriminative Models in Statistical Machine Translation},
booktitle = {Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
month = {August},
address = {Berlin, Germany},
publisher = {Association for Computational Linguistics},
pages = {1704--1714},
url = {
http://www.aclweb.org/anthology/P16-1161},
year = 2016
}
Tamchyna et al. (2016)
Braune, Fabienne and Fraser, Alexander and Daumé III, Hal and Tamchyna, Aleš (2016):
A Framework for Discriminative Rule Selection in Hierarchical Moses, Proceedings of the First Conference on Machine Translation
@InProceedings{braune-EtAl:2016:WMT,
author = {Braune, Fabienne and Fraser, Alexander and Daum\'{e} III, Hal and Tamchyna, Ale\v{s}},
title = {A Framework for Discriminative Rule Selection in Hierarchical Moses},
booktitle = {Proceedings of the First Conference on Machine Translation},
month = {August},
address = {Berlin, Germany},
publisher = {Association for Computational Linguistics},
pages = {92--101},
url = {
http://www.aclweb.org/anthology/W/W16/W16-2210},
year = 2016
}
Braune et al. (2016)
Wuebker, Joern and Muehr, Sebastian and Lehnen, Patrick and Peitz, Stephan and Ney, Hermann (2015):
A Comparison of Update Strategies for Large-Scale Maximum Expected BLEU Training, Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
@InProceedings{wuebker-EtAl:2015:NAACL-HLT,
author = {Wuebker, Joern and Muehr, Sebastian and Lehnen, Patrick and Peitz, Stephan and Ney, Hermann},
title = {A Comparison of Update Strategies for Large-Scale Maximum Expected BLEU Training},
booktitle = {Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies},
month = {May--June},
address = {Denver, Colorado},
publisher = {Association for Computational Linguistics},
pages = {1516--1526},
url = {
http://www.aclweb.org/anthology/N15-1175},
year = 2015
}
Wuebker et al. (2015)
Sokolov, Artem and Riezler, Stefan and Cohen, Shay B. (2015):
A Coactive Learning View of Online Structured Prediction in Statistical Machine Translation, Proceedings of the Nineteenth Conference on Computational Natural Language Learning
@InProceedings{sokolov-riezler-cohen:2015:CoNLL,
author = {Sokolov, Artem and Riezler, Stefan and Cohen, Shay B.},
title = {A Coactive Learning View of Online Structured Prediction in Statistical Machine Translation},
booktitle = {Proceedings of the Nineteenth Conference on Computational Natural Language Learning},
month = {July},
address = {Beijing, China},
publisher = {Association for Computational Linguistics},
pages = {1--11},
url = {
http://www.aclweb.org/anthology/K15-1001},
year = 2015
}
Sokolov et al. (2015)
Eidelman, Vladimir and Wu, Ke and Ture, Ferhan and Resnik, Philip and Lin, Jimmy (2013):
Towards Efficient Large-Scale Feature-Rich Statistical Machine Translation, Proceedings of the Eighth Workshop on Statistical Machine Translation
@InProceedings{eidelman-EtAl:2013:WMT,
author = {Eidelman, Vladimir and Wu, Ke and Ture, Ferhan and Resnik, Philip and Lin, Jimmy},
title = {Towards Efficient Large-Scale Feature-Rich Statistical Machine Translation},
booktitle = {Proceedings of the Eighth Workshop on Statistical Machine Translation},
month = {August},
address = {Sofia, Bulgaria},
publisher = {Association for Computational Linguistics},
pages = {128--133},
url = {
http://www.aclweb.org/anthology/W13-2214},
year = 2013
}
Eidelman et al. (2013)
Xingyi Song and Lucia Specia and Trevor Cohn (2014):
Data selection for discriminative training in statistical machine translation, Proceedings of 17th Annual conference of the European Association for Machine Translation
@inproceedings{eamt-2014-Song,
author = {Xingyi Song and Lucia Specia and Trevor Cohn},
title = {Data selection for discriminative training in statistical machine translation},
booktitle = {Proceedings of 17th Annual conference of the European Association for Machine Translation},
pages = {45--52},
url = {
http://www.mt-archive.info/10/EAMT-2014-Song.pdf},
location = {Dubrovnik, Croatia},
year = 2014
}
Song et al. (2014)
Avneesh Saluja and Ying Zhang (2014):
Online discriminative learning for machine translation with binary-valued feedback, Machine Translation
@article{MTJ:2014:Saluja,
author = {Avneesh Saluja and Ying Zhang},
title = {Online discriminative learning for machine translation with binary-valued feedback},
pages = {69--90},
journal = {Machine Translation},
volume = {28},
number = {2},
month = {October},
year = 2014
}
Saluja and Zhang (2014)
Green, Spence and Cer, Daniel and Manning, Christopher (2014):
An Empirical Comparison of Features and Tuning for Phrase-based Machine Translation, Proceedings of the Ninth Workshop on Statistical Machine Translation
mentioned in Parameter Tuning and Large Scale Discriminative Training
@InProceedings{green-cer-manning:2014:W14-332,
author = {Green, Spence and Cer, Daniel and Manning, Christopher},
title = {An Empirical Comparison of Features and Tuning for Phrase-based Machine Translation},
booktitle = {Proceedings of the Ninth Workshop on Statistical Machine Translation},
month = {June},
address = {Baltimore, Maryland, USA},
publisher = {Association for Computational Linguistics},
pages = {466--476},
url = {
http://www.aclweb.org/anthology/W14-3360},
year = 2014
}
Green et al. (2014)
Tan, Ming and Xia, Tian and Wang, Shaojun and Zhou, Bowen (2013):
A Corpus Level MIRA Tuning Strategy for Machine Translation, Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing
@InProceedings{tan-EtAl:2013:EMNLP,
author = {Tan, Ming and Xia, Tian and Wang, Shaojun and Zhou, Bowen},
title = {A Corpus Level {MIRA} Tuning Strategy for Machine Translation},
booktitle = {Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing},
month = {October},
address = {Seattle, Washington, USA},
publisher = {Association for Computational Linguistics},
pages = {851--856},
url = {
http://www.aclweb.org/anthology/D13-1083},
year = 2013
}
Tan et al. (2013)
Zhao, Kai and Huang, Liang and Mi, Haitao and Ittycheriah, Abe (2014):
Hierarchical MT Training using Max-Violation Perceptron, Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
@InProceedings{zhao-EtAl:2014:P14-2,
author = {Zhao, Kai and Huang, Liang and Mi, Haitao and Ittycheriah, Abe},
title = {Hierarchical {MT} Training using Max-Violation Perceptron},
booktitle = {Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)},
month = {June},
address = {Baltimore, Maryland},
publisher = {Association for Computational Linguistics},
pages = {785--790},
url = {
http://www.aclweb.org/anthology/P14-2127},
year = 2014
}
Zhao et al. (2014)
Auli, Michael and Galley, Michel and Gao, Jianfeng (2014):
Large-scale Expected BLEU Training of Phrase-based Reordering Models, Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)
@InProceedings{auli-galley-gao:2014:EMNLP2014,
author = {Auli, Michael and Galley, Michel and Gao, Jianfeng},
title = {Large-scale Expected BLEU Training of Phrase-based Reordering Models},
booktitle = {Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
month = {October},
address = {Doha, Qatar},
publisher = {Association for Computational Linguistics},
pages = {1250--1260},
url = {
http://www.aclweb.org/anthology/D14-1132},
year = 2014
}
Auli et al. (2014)
Simianer, Patrick and Riezler, Stefan (2013):
Multi-Task Learning for Improved Discriminative Training in SMT, Proceedings of the Eighth Workshop on Statistical Machine Translation
@InProceedings{simianer-riezler:2013:WMT,
author = {Simianer, Patrick and Riezler, Stefan},
title = {Multi-Task Learning for Improved Discriminative Training in {SMT}},
booktitle = {Proceedings of the Eighth Workshop on Statistical Machine Translation},
month = {August},
address = {Sofia, Bulgaria},
publisher = {Association for Computational Linguistics},
pages = {292--300},
url = {
http://www.aclweb.org/anthology/W13-2236},
year = 2013
}
Simianer and Riezler (2013)
Flanigan, Jeffrey and Dyer, Chris and Carbonell, Jaime (2013):
Large-Scale Discriminative Training for Statistical Machine Translation Using Held-Out Line Search, Proceedings of NAACL-HLT
@inproceedings{flanigan2013large,
author = {Flanigan, Jeffrey and Dyer, Chris and Carbonell, Jaime},
title = {Large-Scale Discriminative Training for Statistical Machine Translation Using Held-Out Line Search},
url = {
http://www.aclweb.org/anthology/N13-1025},
googlescholar = {11168103960488599596},
booktitle = {Proceedings of NAACL-HLT},
pages = {248--258},
year = 2013
}
Flanigan et al. (2013)
Cherry, Colin and Foster, George (2012):
Batch tuning strategies for statistical machine translation, Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
@inproceedings{cherry2012batch,
author = {Cherry, Colin and Foster, George},
title = {Batch tuning strategies for statistical machine translation},
url = {
http://www.aclweb.org/anthology/N/N12/N12-1047.pdf},
googlescholar = {13457139291854575466},
booktitle = {Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies},
pages = {427--436},
organization = {Association for Computational Linguistics},
year = 2012
}
Cherry and Foster (2012)
Gimpel, Kevin and Smith, Noah A (2012):
Structured ramp loss minimization for machine translation, Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
@inproceedings{gimpel2012structured,
author = {Gimpel, Kevin and Smith, Noah A},
title = {Structured ramp loss minimization for machine translation},
url = {
http://www.cs.cmu.edu/~nasmith/papers/gimpel+smith.naacl12.pdf},
googlescholar = {14584730824265315099},
booktitle = {Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies},
pages = {221--231},
organization = {Association for Computational Linguistics},
year = 2012
}
Gimpel and Smith (2012)
Chiang, David (2012):
Hope and fear for discriminative training of statistical translation models, The Journal of Machine Learning Research
@article{chiang2012hope,
author = {Chiang, David},
title = {Hope and fear for discriminative training of statistical translation models},
url = {
http://www.isi.edu/~chiang/papers/chiang-jmlr12.pdf},
googlescholar = {12857447296546216175},
journal = {The Journal of Machine Learning Research},
volume = {13},
pages = {1159--1187},
publisher = {JMLR.org},
year = 2012
}
Chiang (2012)
Green, Spence and Wang, Sida and Cer, Daniel and Manning, Christopher D (2013):
Fast and Adaptive Online Training of Feature-Rich Translation Models
@inproceedings{green2013fast,
author = {Green, Spence and Wang, Sida and Cer, Daniel and Manning, Christopher D},
title = {Fast and Adaptive Online Training of Feature-Rich Translation Models},
url = {
http://www-nlp.stanford.edu/~sidaw/home/\_media/papers:onlinemt.pdf},
googlescholar = {4291958085712942190},
organization = {ACL},
year = 2013
}
Green et al. (2013)
Flanigan, Jeffrey and Dyer, Chris and Carbonell, Jaime (2013):
Large-Scale Discriminative Training for Statistical Machine Translation Using Held-Out Line Search, Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
@InProceedings{flanigan-dyer-carbonell:2013:NAACL-HLT,
author = {Flanigan, Jeffrey and Dyer, Chris and Carbonell, Jaime},
title = {Large-Scale Discriminative Training for Statistical Machine Translation Using Held-Out Line Search},
booktitle = {Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies},
month = {June},
address = {Atlanta, Georgia},
publisher = {Association for Computational Linguistics},
pages = {248--258},
url = {
http://www.aclweb.org/anthology/N13-1025},
year = 2013
}
Flanigan et al. (2013)
Abhishek Arun and Barry Haddow and Philipp Koehn and Adam Lopez and Chris Dyer (2010):
Monte Carlo techniques for phrase-based translation, Machine Translation
@article{MTJ:2010:Arun,
author = {Abhishek Arun and Barry Haddow and Philipp Koehn and Adam Lopez and Chris Dyer},
title = {Monte {C}arlo techniques for phrase-based translation},
url = {
http://homepages.inf.ed.ac.uk/bhaddow/arun-mtsi-eps.pdf},
googlescholar = {4875145697102106083},
pages = {103--121},
journal = {Machine Translation},
volume = {24},
number = {2},
month = {June},
year = 2010
}
Arun et al. (2010)
Duan, Nan and Li, Mu and Zhou, Ming (2012):
Forced Derivation Tree based Model Training to Statistical Machine Translation, Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
@InProceedings{duan-li-zhou:2012:EMNLP-CoNLL,
author = {Duan, Nan and Li, Mu and Zhou, Ming},
title = {Forced Derivation Tree based Model Training to Statistical Machine Translation},
booktitle = {Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning},
month = {July},
address = {Jeju Island, Korea},
publisher = {Association for Computational Linguistics},
pages = {445--454},
url = {
http://www.aclweb.org/anthology/D12-1041},
year = 2012
}
Duan et al. (2012)
Simianer, Patrick and Riezler, Stefan and Dyer, Chris (2012):
Joint Feature Selection in Distributed Stochastic Learning for Large-Scale Discriminative Training in SMT, Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
@InProceedings{simianer-riezler-dyer:2012:ACL2012,
author = {Simianer, Patrick and Riezler, Stefan and Dyer, Chris},
title = {Joint Feature Selection in Distributed Stochastic Learning for Large-Scale Discriminative Training in SMT},
booktitle = {Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
month = {July},
address = {Jeju Island, Korea},
publisher = {Association for Computational Linguistics},
pages = {11--21},
url = {
http://www.aclweb.org/anthology/P12-1002},
year = 2012
}
Simianer et al. (2012)
Wuebker, Joern and Hwang, Mei-Yuh and Quirk, Chris (2012):
Leave-One-Out Phrase Model Training for Large-Scale Deployment, Proceedings of the Seventh Workshop on Statistical Machine Translation
@InProceedings{wuebker-hwang-quirk:2012:WMT,
author = {Wuebker, Joern and Hwang, Mei-Yuh and Quirk, Chris},
title = {Leave-One-Out Phrase Model Training for Large-Scale Deployment},
booktitle = {Proceedings of the Seventh Workshop on Statistical Machine Translation},
month = {June},
address = {Montreal, Canada},
publisher = {Association for Computational Linguistics},
pages = {457--464},
url = {
http://www.aclweb.org/anthology/W12-3158},
year = 2012
}
Wuebker et al. (2012)
Yuan Cao and Sanjeev Khudanpur (2012):
Sample Selection for Large-scale MT Discriminative Training, Proceedings of the Tenth Conference of the Association for Machine Translation in the Americas (AMTA)
@inproceedings{AMTA-2012-Cao,
author = {Yuan Cao and Sanjeev Khudanpur},
title = {Sample Selection for Large-scale {MT} Discriminative Training},
url = {
http://www.mt-archive.info/AMTA-2012-Cao.pdf},
booktitle = {Proceedings of the Tenth Conference of the Association for Machine Translation in the Americas (AMTA)},
location = {San Diego, California},
year = 2012
}
Cao and Khudanpur (2012)
Eva Hasler and Barry Haddow and Philipp Koehn (2012):
Sparse lexicalised features and topic adaptation for SMT, Proceedings of the seventh International Workshop on Spoken Language Translation (IWSLT)
@inproceedings{iwslt12:Hasler-2,
author = {Eva Hasler and Barry Haddow and Philipp Koehn},
title = {Sparse lexicalised features and topic adaptation for {SMT}},
url = {
http://www.mt-archive.info/IWSLT-2012-Hasler-2.pdf},
pages = {268--275},
booktitle = {Proceedings of the seventh International Workshop on Spoken Language Translation (IWSLT)},
location = {Hong Kong},
year = 2012
}
Hasler et al. (2012)
Li, Zhifei and Wang, Ziyuan and Eisner, Jason and Khudanpur, Sanjeev and Roark, Brian (2011):
Minimum Imputed-Risk: Unsupervised Discriminative Training for Machine Translation, Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing
@InProceedings{li-EtAl:2011:EMNLP1,
author = {Li, Zhifei and Wang, Ziyuan and Eisner, Jason and Khudanpur, Sanjeev and Roark, Brian},
title = {Minimum Imputed-Risk: Unsupervised Discriminative Training for Machine Translation},
booktitle = {Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing},
month = {July},
address = {Edinburgh, Scotland, UK.},
publisher = {Association for Computational Linguistics},
pages = {920--929},
url = {
http://www.aclweb.org/anthology/D11-1085},
year = 2011
}
Li et al. (2011)
Xiao, Xinyan and Liu, Yang and Liu, Qun and Lin, Shouxun (2011):
Fast Generation of Translation Forest for Large-Scale SMT Discriminative Training, Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing
@InProceedings{xiao-EtAl:2011:EMNLP,
author = {Xiao, Xinyan and Liu, Yang and Liu, Qun and Lin, Shouxun},
title = {Fast Generation of Translation Forest for Large-Scale {SMT} Discriminative Training},
booktitle = {Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing},
month = {July},
address = {Edinburgh, Scotland, UK.},
publisher = {Association for Computational Linguistics},
pages = {880--888},
url = {
http://www.aclweb.org/anthology/D11-1081},
year = 2011
}
Xiao et al. (2011)