Linguistic Annotation

Efforts to move neural machine translation towards models that are based on linguistic insight into language include adding linguistic annotation at the word level and modeling syntactic or semantic structure.

Linguistic Annotation is the main subject of 65 publications. 11 are discussed here.


Wu et al. (2012) propose using factored representations of words (lemma, stem, and part of speech), with each factor encoded as a one-hot vector, in the input to a recurrent neural network language model. Sennrich and Haddow (2016) use such representations in the input and output of neural machine translation models, demonstrating better translation quality. Aharoni and Goldberg (2017) encode syntax with special start-of-phrase and end-of-phrase tokens in a linearized sequence and use these on both the input and the output of traditional sequence-to-sequence models.
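To make the factored input concrete, the following minimal sketch builds a word vector by concatenating one one-hot vector per factor. The toy factor vocabularies and the names `FACTOR_VOCABS`, `one_hot`, and `encode_word` are hypothetical and chosen for illustration only; they are not taken from any of the cited papers.

```python
# Hypothetical toy vocabularies, one per factor (illustration only).
FACTOR_VOCABS = {
    "lemma": ["house", "be", "big"],
    "stem": ["hous", "be", "big"],
    "pos": ["NOUN", "VERB", "ADJ"],
}


def one_hot(value, vocab):
    """Return a one-hot list for `value`; all zeros if it is not in `vocab`."""
    return [1 if entry == value else 0 for entry in vocab]


def encode_word(factors):
    """Concatenate the one-hot vectors of all factors of a single word."""
    vector = []
    for name, vocab in FACTOR_VOCABS.items():
        vector.extend(one_hot(factors[name], vocab))
    return vector


# The surface form "houses" described by its three factors.
houses = {"lemma": "house", "stem": "hous", "pos": "NOUN"}
print(encode_word(houses))
```

Because each factor has its own small vocabulary, two inflected forms of the same lemma share most of their representation, which is the intuition behind using factors rather than surface forms alone.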
Hirschmann et al. (2016) propose to tackle the problem of translating German compounds by splitting them into their constituent words.
Huck et al. (2017) segment words based on morphological principles: separating prefixes and suffixes and splitting compounds, showing superior performance compared to the data-driven byte-pair encoding. Burlot et al. (2017) also detach morphemes from the lemma, but replace them with tags that indicate their morphological features. Tamchyna et al. (2017) use the same method for Czech, but with deterministic tags, avoiding a disambiguation post-editing step.
Nadejde et al. (2017) add syntactic CCG tags to each output word, thus encouraging the model to also produce proper syntactic structure alongside a fluent sequence of words.
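One way to attach a syntactic tag to each output word is to interleave tags and words in a single target sequence. The sketch below assumes such an interleaved format with each tag placed before its word; the tag order, the example tags, and the function names `interleave_tags` and `strip_tags` are assumptions made for illustration, not a description of the cited system.

```python
def interleave_tags(words, tags):
    """Build one target sequence in which every word is preceded by its tag."""
    if len(words) != len(tags):
        raise ValueError("each word needs exactly one tag")
    sequence = []
    for tag, word in zip(tags, words):
        sequence.extend([tag, word])
    return sequence


def strip_tags(sequence):
    """Recover the plain word sequence from an interleaved tag/word sequence."""
    return sequence[1::2]


words = ["Obama", "receives", "Netanyahu"]
tags = ["NP", "(S\\NP)/NP", "NP"]  # illustrative CCG-style categories

target = interleave_tags(words, tags)
print(target)
print(strip_tags(target))
```

At test time only the words are needed, so the tags can be dropped from the decoder output with a function like `strip_tags`.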
Pu et al. (2017) first train a word sense disambiguation model based on WordNet senses and their sense descriptions, and then use it to augment the input sentence with sense tags. Rios et al. (2017) also perform word sense disambiguation and enrich the input with sense embeddings and semantically related words from previous input text.
Ma et al. (2019) compute the distance between source words in the syntax tree and use this information when attending to words. The model predicts the syntactic distance while translating, and this prediction serves as a secondary training objective.
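As an illustration of what a distance between two words in a syntax tree can look like, the sketch below counts the edges on the path between two words. It assumes a dependency tree stored as a list of head indices, with `-1` marking the root; this representation, the example parse, and the function names are assumptions of the sketch, and the cited work may define the distance differently.

```python
def path_to_root(heads, index):
    """Return the list of nodes from `index` up to the root, inclusive."""
    path = [index]
    while heads[index] != -1:
        index = heads[index]
        path.append(index)
    return path


def tree_distance(heads, a, b):
    """Number of edges on the path between words `a` and `b` in the tree."""
    path_a = path_to_root(heads, a)
    depth_from_a = {node: depth for depth, node in enumerate(path_a)}
    # Walk up from b until we reach a node that is also an ancestor of a.
    for depth_from_b, node in enumerate(path_to_root(heads, b)):
        if node in depth_from_a:
            return depth_from_a[node] + depth_from_b
    raise ValueError("words are not in the same tree")


# Hypothetical parse of "the cat sat on the mat":
# the -> cat, cat -> sat, sat = root, on -> mat, the -> mat, mat -> sat
sentence = ["the", "cat", "sat", "on", "the", "mat"]
heads = [1, 2, -1, 5, 5, 2]

print(tree_distance(heads, 0, 1))  # "the" and "cat"
print(tree_distance(heads, 0, 5))  # "the" and "mat"
```

Words that are adjacent in the sentence can be far apart in the tree and vice versa, which is why such a distance carries information that word positions alone do not.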



New Publications

  • Currey and Heafield (2019)
  • Akoury et al. (2019)
  • Stanovsky et al. (2019)
  • Yang et al. (2019)
  • Oo et al. (2019)
  • Shapiro and Duh (2019)
  • Song et al. (2019)
  • Zhang et al. (2019)
  • Dalvi et al. (2017)
  • Le et al. (2017)
  • Watanabe et al. (2017)
  • Khadivi et al. (2017)
  • España-Bonet and Genabith (2017)
  • Sperber et al. (2017)
  • Wong et al. (2018)
  • Kuang et al. (2018)
  • Zhang et al. (2018)
  • Zaremoodi and Haffari (2018)
  • Zhang et al. (2018)
  • Passban et al. (2018)
  • Zhai et al. (2018)
  • Dhar et al. (2018)
  • Štajner and Popović (2018)
  • Beloucif and Wu (2018)
  • Wang et al. (2018)
  • Šoštarić et al. (2018)
  • Stojanovski and Fraser (2018)
  • Micher (2018)
  • Currey and Heafield (2018)
  • Conforti et al. (2018)
  • Vanmassenhove and Way (2018)
  • Marcheggiani et al. (2018)
  • Agrawal et al. (2018)
  • Bisk and Tran (2018)
  • Ma et al. (2018)
  • Voita et al. (2018)
  • Saunders et al. (2018)
  • Gū et al. (2018)
  • Currey and Heafield (2018)
  • Wang et al. (2018)
  • Vanmassenhove et al. (2018)
  • Wang et al. (2018)
  • Hashimoto and Tsuruoka (2017)
  • Bastings et al. (2017)

Linguistic Features

  • Eriguchi et al. (2016)
  • Martínez et al. (2016)
  • Zhang et al. (2016)
  • Yamagishi et al. (2016)

Syntactic Models

  • Chen et al. (2017)
  • Li et al. (2017)
  • Wu et al. (2017)
  • Eriguchi et al. (2017)

Chunk-Based Decoding

  • Zhou et al. (2017)
  • Ishiwatari et al. (2017)