Word Error Rate
Word error rate aligns a system's output to a human reference translation and counts the insertions, deletions and substitutions of words that separate the two.
Edit Rate Metrics is the main subject of 12 publications, 6 of which are discussed here.
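The alignment described above can be sketched as a word-level Levenshtein distance. The Python below is a minimal illustration, not the implementation from any of the publications listed on this page: the function name `word_error_rate`, whitespace tokenization, unit edit costs and normalization by reference length are all assumptions made for the example.

```python
def word_error_rate(hypothesis: str, reference: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumptions: tokens are whitespace-separated, and every insertion,
    deletion and substitution costs 1.
    """
    hyp = hypothesis.split()
    ref = reference.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the first i-1 reference words
    # and the first j hypothesis words (row 0: empty reference prefix).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i reference words against an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            missing_word = prev[j] + 1      # reference word absent from output
            extra_word = curr[j - 1] + 1    # output word absent from reference
            curr.append(min(substitution, missing_word, extra_word))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("the cat sat on mat", "the cat sat on the mat"))
```

Because the score is normalized by the reference length only, it can exceed 1 when the system output is much longer than the reference.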
Publications
Word error rate was first used for the evaluation of statistical machine translation by
Christoph Tillmann and Stephan Vogel and Hermann Ney and Alex Zubiaga (1997):
A DP-based Search Using Monotone Alignments in Statistical Translation, Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics (ACL)
@Inproceedings{Tillmann:1997,
author = {Christoph Tillmann and Stephan Vogel and Hermann Ney and Alex Zubiaga},
title = {A DP-based Search Using Monotone Alignments in Statistical Translation},
url = {http://acl.ldc.upenn.edu/P/P97/P97-1037.pdf},
googlescholar = {16581080566414205368},
booktitle = {Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics (ACL)},
year = 1997
}
Tillmann et al. (1997), who also introduce position-independent error rate. Allowing block movement
Gregor Leusch and Nicola Ueffing and Hermann Ney (2003):
A Novel String-to-string Distance Measure with Applications to Machine Translation Evaluation, Proceedings of the MT Summit IX
@inproceedings{Leusch:2003,
author = {Gregor Leusch and Nicola Ueffing and Hermann Ney},
title = {A Novel String-to-string Distance Measure with Applications to Machine Translation Evaluation},
url = {http://www.mt-archive.info/MTS-2003-Leusch.pdf},
googlescholar = {8123283591871021431},
booktitle = {Proceedings of the {MT} Summit IX},
year = 2003
}
(Leusch et al., 2003) leads to the definition of the CDER metric
Gregor Leusch and Nicola Ueffing and Hermann Ney (2006):
CDER: Efficient MT Evaluation Using Block Movements, Proceedings of the 11th Conference of the European Chapter of the Association for Computational Linguistics
@InProceedings{Leusch:2006:EACL,
author = {Gregor Leusch and Nicola Ueffing and Hermann Ney},
title = {{CDER}: Efficient {MT} Evaluation Using Block Movements},
url = {http://acl.ldc.upenn.edu/E/E06/E06-1031.pdf},
googlescholar = {7356263408428543494},
booktitle = {Proceedings of the 11th Conference of the European Chapter of the Association for Computational Linguistics},
month = {April},
address = {Trento, Italy},
year = 2006
}
(Leusch et al., 2006). TER allows for arbitrary block movements
Matthew Snover and Bonnie J. Dorr and Richard Schwartz and Linnea Micciulla and John Makhoul (2006):
A Study of Translation Edit Rate with Targeted Human Annotation, 5th Conference of the Association for Machine Translation in the Americas (AMTA)
@InProceedings{Snover:2006:AMTA,
author = {Matthew Snover and Bonnie J. Dorr and Richard Schwartz and Linnea Micciulla and John Makhoul},
title = {A Study of Translation Edit Rate with Targeted Human Annotation},
url = {http://mt-archive.info/AMTA-2006-Snover.pdf},
googlescholar = {1809540661740640949},
booktitle = {5th Conference of the Association for Machine Translation in the Americas (AMTA)},
month = {August},
address = {Boston, Massachusetts},
year = 2006
}
(Snover et al., 2006). MAXSIM uses an efficient polynomial-time matching algorithm that also takes lemma and part-of-speech tag matches into account
Chan, Yee Seng and Ng, Hwee Tou (2008):
MAXSIM: A Maximum Similarity Metric for Machine Translation Evaluation, Proceedings of ACL-08: HLT
@InProceedings{chan-ng:2008:ACLMain,
author = {Chan, Yee Seng and Ng, Hwee Tou},
title = {{MAXSIM}: A Maximum Similarity Metric for Machine Translation Evaluation},
booktitle = {Proceedings of ACL-08: HLT},
month = {June},
address = {Columbus, Ohio},
publisher = {Association for Computational Linguistics},
pages = {55--62},
url = {http://www.aclweb.org/anthology/P/P08/P08-1007},
year = 2008
}
(Chan and Ng, 2008).
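Position-independent error rate, mentioned above alongside Tillmann et al. (1997), drops the word-order constraint of word error rate. A minimal sketch in Python follows, using one common bag-of-words formulation. Whether this matches the original definition term for term is an assumption, as are the function name, whitespace tokenization and normalization by reference length.

```python
from collections import Counter


def position_independent_error_rate(hypothesis: str, reference: str) -> float:
    """Bag-of-words error rate: word order is ignored entirely.

    Assumed formulation: (max(|hyp|, |ref|) - matches) / |ref|, where
    matches is the size of the multiset intersection of the two sentences.
    """
    hyp = hypothesis.split()
    ref = reference.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Counter & Counter keeps, per word, the smaller of the two counts.
    matches = sum((Counter(hyp) & Counter(ref)).values())
    return (max(len(hyp), len(ref)) - matches) / len(ref)


if __name__ == "__main__":
    # Same words in a different order: no position-independent errors.
    print(position_independent_error_rate(
        "mat the on sat cat the", "the cat sat on the mat"))
```

A reordered but otherwise identical output scores 0 here, whereas a strictly monotone word-level alignment would charge for every displaced word. Block-movement metrics such as CDER and TER sit between these two extremes.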
The way automatic evaluation metrics work also depends on the tokenization of system output and reference translations
Gregor Leusch and Nicola Ueffing and David Vilar and Hermann Ney (2005):
Preprocessing and Normalization for Automatic Evaluation of Machine Translation, Proceedings of the ACL Workshop on Intrinsic and Evaluation Measures for Machine Translation
@InProceedings{Leusch:2005:WEval,
author = {Gregor Leusch and Nicola Ueffing and David Vilar and Hermann Ney},
title = {Preprocessing and Normalization for Automatic Evaluation of Machine Translation},
url = {http://acl.ldc.upenn.edu/W/W05/W05-0903.pdf},
googlescholar = {17490287786118333766},
booktitle = {Proceedings of the ACL Workshop on Intrinsic and Evaluation Measures for Machine Translation},
month = {June},
address = {Ann Arbor, Michigan},
publisher = {Association for Computational Linguistics},
pages = {17--24},
year = 2005
}
(Leusch et al., 2005).
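A small illustration of why tokenization matters: the same sentence pair can receive different word error rates depending on whether punctuation is split off before scoring. The Python sketch below is hypothetical and not taken from Leusch et al. (2005). The regular-expression tokenizer, the helper names and the example sentences are assumptions chosen for the demonstration.

```python
import re


def edit_distance(hyp, ref):
    """Unit-cost Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, ref_tok in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_tok in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,                          # missing token
                            curr[j - 1] + 1,                      # extra token
                            prev[j - 1] + (ref_tok != hyp_tok)))  # match or substitution
        prev = curr
    return prev[-1]


def wer(hyp_tokens, ref_tokens):
    """Edit distance normalized by the number of reference tokens."""
    return edit_distance(hyp_tokens, ref_tokens) / len(ref_tokens)


def split_punctuation(text):
    """Hypothetical tokenizer: runs of word characters, or single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


hypothesis = "the cat sat."
reference = "the cat sat ."

# Whitespace tokens only: "sat." matches neither "sat" nor ".".
raw_score = wer(hypothesis.split(), reference.split())

# Punctuation split off on both sides: the token sequences are identical.
normalized_score = wer(split_punctuation(hypothesis), split_punctuation(reference))

if __name__ == "__main__":
    print(raw_score, normalized_score)
```

Under whitespace tokenization this pair incurs two edits against four reference tokens, while consistent punctuation splitting removes both.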
New Publications
Popović, Maja (2016):
chrF deconstructed: beta parameters and n-gram weights, Proceedings of the First Conference on Machine Translation
@InProceedings{popovic:2016:WMT,
author = {Popovi\'{c}, Maja},
title = {chrF deconstructed: beta parameters and n-gram weights},
booktitle = {Proceedings of the First Conference on Machine Translation},
month = {August},
address = {Berlin, Germany},
publisher = {Association for Computational Linguistics},
pages = {499--504},
url = {http://www.aclweb.org/anthology/W/W16/W16-2341},
year = 2016
}
Popović (2016)
Wang, Weiyue and Peter, Jan-Thorsten and Rosendahl, Hendrik and Ney, Hermann (2016):
CharacTer: Translation Edit Rate on Character Level, Proceedings of the First Conference on Machine Translation
@InProceedings{wang-EtAl:2016:WMT,
author = {Wang, Weiyue and Peter, Jan-Thorsten and Rosendahl, Hendrik and Ney, Hermann},
title = {CharacTer: Translation Edit Rate on Character Level},
booktitle = {Proceedings of the First Conference on Machine Translation},
month = {August},
address = {Berlin, Germany},
publisher = {Association for Computational Linguistics},
pages = {505--510},
url = {http://www.aclweb.org/anthology/W/W16/W16-2342},
year = 2016
}
Wang et al. (2016)
Dreyer, Markus and Marcu, Daniel (2012):
HyTER: Meaning-Equivalent Semantics for Translation Evaluation, Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
@InProceedings{dreyer-marcu:2012:NAACL-HLT,
author = {Dreyer, Markus and Marcu, Daniel},
title = {HyTER: Meaning-Equivalent Semantics for Translation Evaluation},
booktitle = {Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies},
month = {June},
address = {Montr\'{e}al, Canada},
publisher = {Association for Computational Linguistics},
pages = {162--171},
url = {http://www.aclweb.org/anthology/N12-1017},
year = 2012
}
Dreyer and Marcu (2012)
Wang, Mengqiu and Manning, Christopher (2012):
SPEDE: Probabilistic Edit Distance Metrics for MT Evaluation, Proceedings of the Seventh Workshop on Statistical Machine Translation
@InProceedings{wang-manning:2012:WMT,
author = {Wang, Mengqiu and Manning, Christopher},
title = {SPEDE: Probabilistic Edit Distance Metrics for {MT} Evaluation},
booktitle = {Proceedings of the Seventh Workshop on Statistical Machine Translation},
month = {June},
address = {Montreal, Canada},
publisher = {Association for Computational Linguistics},
pages = {73--80},
url = {http://www.aclweb.org/anthology/W12-3107},
year = 2012
}
Wang and Manning (2012)
Gregor Leusch and Hermann Ney (2009):
Edit distances with block movements and error rate confidence estimates, Machine Translation
@article{MTJ:2009:Leusch,
author = {Gregor Leusch and Hermann Ney},
title = {Edit distances with block movements and error rate confidence estimates},
pages = {129--140},
journal = {Machine Translation},
volume = {23},
number = {2--3},
month = {September},
year = 2009
}
Leusch and Ney (2009)
Matthew G. Snover and Nitin Madnani and Bonnie Dorr and Richard Schwartz (2009):
TER-Plus: paraphrase, semantic, and alignment enhancements to Translation Edit Rate, Machine Translation
@article{MTJ:2009:Snover,
author = {Matthew G. Snover and Nitin Madnani and Bonnie Dorr and Richard Schwartz},
title = {TER-Plus: paraphrase, semantic, and alignment enhancements to Translation Edit Rate},
url = {http://web.jhu.edu/sebin/w/e/terplusdorr.pdf},
googlescholar = {6786121275459855088},
pages = {117--127},
journal = {Machine Translation},
volume = {23},
number = {2--3},
month = {September},
year = 2009
}
Snover et al. (2009)