Evaluation Campaigns

Evaluation campaigns play an important role in the field of statistical machine translation by showcasing new methods in an independent forum.

Evaluation Campaigns is the main subject of 48 publications. 22 are discussed here.


The first machine translation evaluation campaign in which both statistical and traditional rule-based machine translation systems participated was organized by ARPA in the early 1990s. White et al. (1994) present results and discuss in detail the experience with different evaluation strategies. The straightforward application of metrics for human translators proved difficult and was abandoned, along with measurements of the productivity of human-assisted translation, which hinges to a large degree on the quality of the support tools and the level of expertise of the translator. Ultimately, only a reading comprehension test with multiple-choice questions and adequacy and fluency judgments were used in the final evaluation. Hamon et al. (2007) discuss the evaluation of a speech-to-speech translation system.
Currently, the development of statistical machine translation systems is driven by translation competitions, most notably the annual competitions organized by DARPA on Arabic–English and Chinese–English in an open news domain (since 2001). The International Workshop on Spoken Language Translation (IWSLT) (UNKNOWN CITATION 'iwslt04:WO_tsujii'; Eck and Hori, 2005; Paul, 2006; Fordyce, 2007; Paul, 2008; Paul, 2009; Paul et al., 2010; Federico et al., 2011; Federico et al., 2012; Cettolo et al., 2013) is an annual competition on spoken language translation covering many languages, mostly using the TED Talks corpus (Cettolo et al., 2012). The WMT campaign is a competition on European languages, mostly using the European Parliament proceedings (Koehn and Monz, 2005; Koehn and Monz, 2006; Callison-Burch et al., 2007; Callison-Burch et al., 2008; Callison-Burch et al., 2009; Callison-Burch et al., 2010; Callison-Burch et al., 2011; Callison-Burch et al., 2012).
Within the CoSyne project, Toral et al. (2011) compare the leading online machine translation systems with their own statistical machine translation system on a number of automatic metrics.



New Publications

  • Niehues et al. (2018)
  • Cettolo et al. (2017)
  • Banchs et al. (2015)
  • Stanojević et al. (2015)
  • Stanojević et al. (2015)
  • Cettolo et al. (2015)
  • Tan (2016)
  • Bojar et al. (2015)
  • Guillou et al. (2016)
  • Jawaid et al. (2016)
  • Bojar et al. (2016)
  • Specia et al. (2016)
  • Bojar et al. (2016)
  • Cettolo et al. (2016)
  • Nakazawa et al. (2016)
  • Braslavski et al. (2013)
  • Bojar et al. (2013)
  • Macháček and Bojar (2014)
  • Bojar et al. (2014)
  • Graham et al. (2014)
  • Yang et al. (2014)
  • Wübker et al. (2014)
  • Durrani et al. (2014)
  • Nguyen et al. (2014)
  • Macháček and Bojar (2013)
  • Fujii et al. (2008)
  • Zhao et al. (2009)