Morphology

Rich morphology increases vocabulary size and, on the target side, requires the enforcement of agreement constraints.

Morphology is the main subject of 51 publications. 17 are discussed here.

Publications

Statistical models for morphology as part of a statistical machine translation system have been developed for inflected languages (Nießen and Ney, 2001; Nießen and Ney, 2004). Morphological features may be replaced by pseudo-words (Goldwater and McClosky, 2005). Morphological annotation is especially useful for small training corpora (Popović et al., 2005). For highly agglutinative languages such as Arabic, the main focus is on splitting off affixes with various schemes (Habash and Sadat, 2006), or on combining such schemes (Sadat and Habash, 2006). A straightforward application of the morpheme splitting idea does not always lead to performance gains (Virpioja et al., 2007). Yang and Kirchhoff (2006) present a back-off method that resolves unknown words by increasingly aggressive morphological stemming and compound splitting. Denoual (2007) uses spelling similarity to find the translation of unknown words by analogy. Talbot and Osborne (2006) motivate similar work by reducing redundancy in the input language, again mostly by morphological stemming. Lemmatizing words may improve word alignment performance (Corston-Oliver and Gamon, 2004). In a finite-state approach, the frequency of stems in the corpus may guide the decision of when to split (Isbihani et al., 2006), potentially with the help of a small lexicon (Riesa and Yarowsky, 2006). Splitting off affixes may also be a viable strategy when translating into morphologically rich languages (El-Kahlout and Oflazer, 2006). Different morphological analyses may be encoded in a confusion network as input to the translation system to avoid hard choices (Dyer, 2007; Dyer, 2007b). This approach may be extended to lattice decoding and tree-based models (Dyer et al., 2008).
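
The idea of backing off from an unknown word through increasingly aggressive stemming can be illustrated with a small sketch. This is a toy illustration only, not the method of any of the cited papers: the vocabulary, the suffix lists, and the function name `back_off` are all hypothetical, and a real system would rely on a proper morphological analyzer and also handle compound splitting.

```python
from typing import AbstractSet, Optional

# Hypothetical toy vocabulary: surface forms assumed to be covered by the
# translation model.
KNOWN_VOCAB = frozenset({"house", "walk", "translate"})

# Hypothetical suffix lists, ordered from least to most aggressive stripping.
SUFFIX_LEVELS = (
    ("s",),
    ("ed", "ing", "es"),
    ("ation", "ions", "ers"),
)


def back_off(word: str, vocab: AbstractSet[str] = KNOWN_VOCAB) -> Optional[str]:
    """Return a known form for `word`, or None if no back-off level succeeds."""
    # Level 0: the word itself is known, so no stemming is needed.
    if word in vocab:
        return word
    # Try each stemming level in turn, from mild to aggressive.
    for suffixes in SUFFIX_LEVELS:
        for suffix in suffixes:
            if word.endswith(suffix) and len(word) > len(suffix):
                stem = word[: -len(suffix)]
                if stem in vocab:
                    return stem
    # No level produced a known form: the word stays unknown.
    return None


if __name__ == "__main__":
    for token in ("house", "houses", "walking", "xyzzy"):
        print(token, "->", back_off(token))
```

Ordering the levels from mild to aggressive means the least destructive analysis that yields a known form wins, which keeps as much of the original word's information as possible.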

New Publications

  • Salameh et al. (2014)
  • Daiber and Sima'an (2015)
  • Grönroos et al. (2015)
  • Grönroos et al. (2016)
  • Ramm and Fraser (2016)
  • Kunchukuttan and Bhattacharyya (2016)
  • Huck et al. (2017)
  • Salameh et al. (2015)
  • Eyigöz et al. (2013)
  • Cholakov and Kordoni (2014)
  • Williams and Koehn (2014)
  • Weller et al. (2013)
  • Hamdi et al. (2013)
  • Ramish et al. (2013)
  • Ramish et al. (2013)
  • Al-Haj and Lavie (2012)
  • Hasan et al. (2012)
  • Kholy and Habash (2012)
  • Wang et al. (2011)
  • Kholy and Habash (2012)
  • Singh and Habash (2012)
  • Stallard et al. (2012)
  • Carpuat (2009)
  • Fraser (2009)
  • Bisazza and Federico (2009)
  • Luong et al. (2010)
  • Hong et al. (2009)
  • Durgar El-Kahlout and Yvon (2010)
  • Mansour (2010)
  • Stymne et al. (2010)
  • Virpioja et al. (2010)
  • Cartoni (2009)
  • Mel'čuk and Wanner (2008)
  • Sinha (2007)
  • Lee (2004)
