Minimum Bayes Risk
Training and decoding in statistical machine translation operate with imperfect models, so we may not want to aim for the most likely translation but for the one that carries the minimum Bayes risk.
Minimum Bayes Risk is the main subject of 17 publications, 8 of which are discussed here.
Minimum Bayes risk decoding was initially introduced for n-best list re-ranking (Kumar and Byrne, 2004), and has been shown to be beneficial for many translation tasks (Ehling et al., 2007).
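The n-best variant can be sketched in a few lines: each hypothesis is scored by its expected similarity to the other hypotheses, weighted by their model posteriors, and the hypothesis with the highest expected similarity (i.e., lowest expected loss) is selected. The similarity function below is a simplified n-gram overlap standing in for the BLEU-based loss of Kumar and Byrne (2004); the function names and the toy n-best list are illustrative assumptions, not part of any published system.

```python
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams of a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def similarity(hyp, ref, max_n=4):
    """Simplified n-gram precision averaged over orders 1..max_n.
    A stand-in for a sentence-level BLEU-like gain function."""
    score = 0.0
    for n in range(1, max_n + 1):
        h, r = ngrams(hyp, n), ngrams(ref, n)
        overlap = sum((h & r).values())          # clipped n-gram matches
        score += overlap / max(sum(h.values()), 1)
    return score / max_n

def mbr_decode(nbest):
    """nbest: list of (tokens, model_score). Returns the hypothesis with
    the highest expected similarity to the rest of the list, i.e. the
    minimum-Bayes-risk translation under the (normalized) model distribution."""
    z = sum(p for _, p in nbest)                 # normalize posteriors
    best, best_gain = None, -1.0
    for hyp, _ in nbest:
        gain = sum(p / z * similarity(hyp, ref) for ref, p in nbest)
        if gain > best_gain:
            best, best_gain = hyp, gain
    return best
```

Note that the selected hypothesis need not be the most probable one: a translation that shares much material with many other high-probability hypotheses can overtake the single top-scoring candidate.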
Computing minimum Bayes risk over lattices or hypergraphs (i.e., the full search graph of the decoder) takes advantage of a larger pool of evidence (Tromble et al., 2008). Allauzen et al. (2010) present more efficient algorithms for computing the minimum Bayes risk translation from a lattice.
Optimizing for the expected error, rather than the actual error, may also be done in parameter tuning (Smith and Eisner, 2006).
Related to minimum Bayes risk is the use of n-gram posterior probabilities in re-ranking (Zens and Ney, 2006; Alabau et al., 2007; Alabau et al., 2007b).
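The core quantity in this line of work is the posterior probability that an n-gram occurs in the translation, accumulated over an n-best list; each hypothesis can then be scored by the posteriors of its own n-grams and that score used as one re-ranking feature. The sketch below is a minimal illustration under that reading, not the exact feature set of Zens and Ney (2006); all names are hypothetical.

```python
from collections import defaultdict

def ngram_posteriors(nbest, n):
    """Posterior probability that each n-gram occurs in the translation,
    estimated by summing normalized model scores over an n-best list.
    nbest: list of (tokens, model_score)."""
    z = sum(p for _, p in nbest)
    post = defaultdict(float)
    for hyp, p in nbest:
        # count each n-gram once per hypothesis (occurrence, not frequency)
        seen = {tuple(hyp[i:i + n]) for i in range(len(hyp) - n + 1)}
        for g in seen:
            post[g] += p / z
    return post

def posterior_score(hyp, post, n):
    """Average n-gram posterior of a hypothesis; usable as one
    re-ranking feature alongside the baseline model score."""
    grams = [tuple(hyp[i:i + n]) for i in range(len(hyp) - n + 1)]
    if not grams:
        return 0.0
    return sum(post[g] for g in grams) / len(grams)
```

As with minimum Bayes risk, this rewards hypotheses whose n-grams are supported by many other probable translations, but it does so through decomposed n-gram features rather than a sentence-level loss.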
- Duan et al. (2011)
- Duh et al. (2012)
- He and Deng (2012)
- Shimizu et al. (2012)
- consensus translation: Pauls et al. (2009)
- Blackwood et al. (2010)
- Li and Eisner (2009)
- Kumar et al. (2009)
- Li et al. (2009)