Neural Components in Statistical Machine Translation

Early work on neural networks for machine translation, in particular, aimed at building neural components for use in traditional statistical machine translation systems.

Neural Components In Statistical Machine Translation is the main subject of 43 publications. 10 are discussed here.


Translation Models: By including aligned source words in the conditioning context, Devlin et al. (2014) enrich a feed-forward neural network language model with source context. Zhang et al. (2015) add a sentence embedding to the conditioning context of this model; the embeddings are learned using a variant of convolutional neural networks and mapped across languages. Meng et al. (2015) use a more complex convolutional neural network to encode the input sentence, one that uses gated layers and also incorporates information about the output context.
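
As a rough illustration of what "including aligned source words in the conditioning context" can look like, the sketch below assembles the input for predicting one output word: the preceding output words plus a window of source words centered on the aligned source word. The function name joint_context, the padding symbols, and the default sizes are assumptions made for this example, not details taken from the text above; the neural network that would consume this context is not shown.

```python
from typing import List

BOS = "<s>"    # hypothetical symbol for positions before the sentence start
PAD = "<pad>"  # hypothetical symbol for positions outside the source sentence


def joint_context(source: List[str], target: List[str], alignment: List[int],
                  i: int, history: int = 3, window: int = 2) -> List[str]:
    """Build the conditioning context for predicting target[i].

    alignment[i] is the index of the source word aligned to target[i].
    The context is `history` previous target words followed by
    2 * window + 1 source words centered on the aligned source word.
    """
    # Target-side history, padded on the left at the sentence start.
    hist = [target[k] if k >= 0 else BOS for k in range(i - history, i)]
    # Source-side window around the aligned word, padded at both edges.
    a = alignment[i]
    src = [source[k] if 0 <= k < len(source) else PAD
           for k in range(a - window, a + window + 1)]
    return hist + src


source = ["das", "haus", "ist", "klein"]
target = ["the", "house", "is", "small"]
alignment = [0, 1, 2, 3]  # toy one-to-one alignment
print(joint_context(source, target, alignment, i=1, history=2, window=1))
```

A plain neural language model would see only the first part of this list; the source window is what turns it into a translation model component.
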
Reordering Models: Lexicalized reordering models struggle with sparse data problems when conditioned on rich context. Li et al. (2014) show that a neural reordering model can be conditioned on the current and previous phrase pairs (encoded with a recursive neural network auto-encoder) to make the same classification decisions for orientation type.
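
The "orientation type" being classified can be made concrete with a small sketch. It uses the common three-way scheme of lexicalized reordering (monotone, swap, discontinuous), computed from the source spans of the previous and current phrase pairs. Whether the cited work uses exactly these three classes is an assumption here, and the function name orientation is hypothetical. A neural reordering model would predict these labels from phrase-pair representations; the code only shows how the labels are defined.

```python
from typing import Tuple

Span = Tuple[int, int]  # (start, end) source word indices, both inclusive


def orientation(prev_span: Span, cur_span: Span) -> str:
    """Orientation of the current phrase relative to the previous one.

    Phrases are taken in the order they are produced on the output side;
    the spans say which source words each phrase covers.
    """
    prev_start, prev_end = prev_span
    cur_start, cur_end = cur_span
    if cur_start == prev_end + 1:
        return "monotone"       # current phrase directly follows the previous one
    if cur_end == prev_start - 1:
        return "swap"           # current phrase directly precedes the previous one
    return "discontinuous"      # anything else


print(orientation((0, 1), (2, 3)))
print(orientation((2, 3), (0, 1)))
print(orientation((0, 0), (3, 4)))
```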

Pre-Ordering: Instead of handling reordering within the decoding process, we may pre-order the input sentence into output word order. Gispert et al. (2015) use an input dependency tree to learn a model that swaps child nodes and implement it using a feed-forward neural network. Barone and Attardi (2015) formulate a top-down left-to-right walk through the dependency tree and make reordering decisions at each node. They model this process with a recurrent neural network that includes past decisions in the conditioning context.
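
A minimal sketch of tree-based pre-ordering follows, under strong simplifying assumptions. The Node class, the preorder function, and the should_swap callback are all hypothetical names. The callback stands in for a learned classifier (a neural network in the work described above) and is replaced here by a hand-written rule, and the single left-to-right pass of adjacent swaps is a simplification rather than the search procedure of either cited paper.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    word: str
    label: str                                          # dependency label, e.g. "obj"
    left: List["Node"] = field(default_factory=list)    # children before the head word
    right: List["Node"] = field(default_factory=list)   # children after the head word


# Hypothetical stand-in for a learned swap classifier: given the labels of
# two adjacent items under the same head, decide whether to swap them.
SwapFn = Callable[[str, str], bool]


def preorder(node: Node, should_swap: SwapFn) -> List[str]:
    """Return the words under `node` in pre-ordered (output-like) order."""
    # Items under this node in source order: child subtrees and the head word.
    items = [(c.label, preorder(c, should_swap)) for c in node.left]
    items.append(("HEAD", [node.word]))
    items += [(c.label, preorder(c, should_swap)) for c in node.right]
    # One pass of adjacent swaps; an item may move right several times.
    for i in range(len(items) - 1):
        if should_swap(items[i][0], items[i + 1][0]):
            items[i], items[i + 1] = items[i + 1], items[i]
    return [w for _, words in items for w in words]


# Toy rule for a verb-final output language: move the head after its object.
def verb_final(a: str, b: str) -> bool:
    return (a, b) == ("HEAD", "obj")


tree = Node("read", "root",
            left=[Node("I", "nsubj")],
            right=[Node("books", "obj", left=[Node("old", "amod")])])
print(preorder(tree, verb_final))
```

Because swaps only move whole subtrees around their head, the modifier "old" stays attached to "books" when the object is moved in front of the verb.
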
N-Gram Translation Models: An alternative view of the phrase-based translation model is to break up phrase translations into minimal translation units and employ an n-gram model over these units, conditioning each minimal translation unit on the previous ones. Schwenk et al. (2007) treat each minimal translation unit as an atomic symbol and train a neural language model over it. Alternatively, Hu et al. (2014) represent the minimal translation units as bags of words, Wu et al. (2014) break them down even further into single input words, single output words, or single input-output word pairs, and Yu and Zhu (2015) use phrase embeddings learned with an auto-encoder.
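
To illustrate the "atomic symbol" view, the sketch below turns each minimal translation unit into a single token and estimates a bigram model over sequences of such tokens. It uses plain relative-frequency counts as a stand-in for the neural language model mentioned above. The "|||" separator, the function names, and the toy data are assumptions made for this example.

```python
from collections import Counter
from typing import Callable, List, Tuple

Unit = Tuple[List[str], List[str]]  # (source words, target words) of one unit


def mtu_symbols(units: List[Unit]) -> List[str]:
    """Map each minimal translation unit to one atomic symbol."""
    return ["{}|||{}".format(" ".join(src), " ".join(tgt)) for src, tgt in units]


def train_bigram(sequences: List[List[str]]) -> Callable[[str, str], float]:
    """Estimate p(cur | prev) over unit symbols by relative frequency."""
    context_counts: Counter = Counter()
    bigram_counts: Counter = Counter()
    for seq in sequences:
        padded = ["<s>"] + seq
        for prev, cur in zip(padded, padded[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1

    def prob(cur: str, prev: str) -> float:
        if context_counts[prev] == 0:
            return 0.0  # unseen context; no smoothing in this sketch
        return bigram_counts[(prev, cur)] / context_counts[prev]

    return prob


sentence_a = [(["das"], ["the"]), (["haus"], ["house"])]
sentence_b = [(["das"], ["the"]), (["auto"], ["car"])]
prob = train_bigram([mtu_symbols(sentence_a), mtu_symbols(sentence_b)])
print(prob("haus|||house", "das|||the"))
```

Treating whole units as symbols makes the vocabulary large and sparse, which is one motivation for the alternative representations listed above.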



New Publications

  • Durrani and Dalvi (2017)

Neural Models as Statistical Machine Translation Components

  • Wang et al. (2016)
  • Wang et al. (2017)
  • Sennrich (2015)
  • Zhang et al. (2016)
  • Peter et al. (2016)
  • Zhang et al. (2015)
  • Setiawan et al. (2015)
  • Lu et al. (2014)
  • Zhai et al. (2014)
  • Liu et al. (2013)
  • Liu et al. (2014)

Reordering Models

  • Kanouchi et al. (2016)
  • Cui et al. (2016)

Translation Models

  • Stahlberg et al. (2016)
  • Sundermeyer et al. (2014)
  • Schwenk (2012)
  • Le et al. (2012)
  • Addanki and Wu (2014)
  • Do et al. (2014)
  • Li et al. (2013)
  • Wu et al. (2014)
  • Auli et al. (2013)

Word Alignment

  • Legrand et al. (2016)
  • Sabet et al. (2016)
  • Tamura et al. (2014)
  • Yang et al. (2013)


  • Soricut and Och (2015)
  • Tran et al. (2014)

Topic Models

  • Cui et al. (2014)

Evaluation Metrics based on Neural Models

  • Guzmán et al. (2016)
  • Gupta et al. (2015)
  • Guzmán et al. (2015)