Neural Components in Statistical Machine Translation
Early work on neural networks for machine translation was aimed largely at building neural components for use in traditional statistical machine translation systems.
Neural Components In Statistical Machine Translation is the main subject of 43 publications. 10 are discussed here.
By including aligned source words in the conditioning context, Devlin et al. (2014)
enrich a feed-forward neural network language model with source context.
Zhang et al. (2015)
add a sentence embedding to the conditioning context of this model; the embeddings are learned using a variant of convolutional neural networks and by mapping them across languages. Meng et al. (2015)
encode the input sentence with a more complex convolutional neural network, which uses gated layers and also incorporates information about the output context.
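As a rough illustration of what "including aligned source words in the conditioning context" means, the sketch below assembles the input for one target word: its preceding target words plus a window of source words centred on the source word it is aligned to. The function name `joint_context`, the parameters `history` and `window`, and the padding symbols are assumptions made for this example, not details taken from the cited papers.

```python
def joint_context(source, target, affiliation, i, history=3, window=2):
    """Return (target history, source window) for target position i.

    affiliation[i] is the index of the source word aligned to target word i
    (assumed here to be given, one source index per target word).
    """
    # Target history: the `history` words before position i, padded with <s>.
    padded_tgt = ["<s>"] * history + list(target)
    tgt_hist = padded_tgt[i:i + history]

    # Source window: `window` words on each side of the aligned source word.
    centre = affiliation[i]
    padded_src = ["<s>"] * window + list(source) + ["</s>"] * window
    src_win = padded_src[centre:centre + 2 * window + 1]
    return tgt_hist, src_win


source = ["das", "Haus", "ist", "klein"]
target = ["the", "house", "is", "small"]
affiliation = [0, 1, 2, 3]  # toy one-to-one alignment

# Context used to predict "house" (target position 1).
print(joint_context(source, target, affiliation, 1))
```

In a model of this kind, both lists would be mapped to embeddings and concatenated as the input to the feed-forward network; that part is omitted here.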
Lexicalized reordering models struggle with sparse data problems when conditioned on rich context. Li et al. (2014)
show that a neural reordering model can be conditioned on the current and previous phrase pairs (encoded with a recursive neural network auto-encoder) to make the same classification decisions for orientation type.
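For readers unfamiliar with orientation types, the following sketch shows the classification target that such a model predicts, using the common three-way scheme (monotone, swap, discontinuous) from lexicalized reordering. It assumes phrases are given as inclusive source-side spans; the function name `orientation` is hypothetical, and the rule-based labelling here only produces the training labels, not the neural model itself.

```python
def orientation(prev_span, cur_span):
    """Label the orientation of the current phrase relative to the previous one.

    Spans are (start, end) source word positions, inclusive, listed in the
    order in which the phrases are produced on the target side.
    """
    prev_start, prev_end = prev_span
    cur_start, cur_end = cur_span
    if cur_start == prev_end + 1:
        return "monotone"       # current phrase directly follows the previous one
    if cur_end == prev_start - 1:
        return "swap"           # current phrase directly precedes the previous one
    return "discontinuous"      # anything else


print(orientation((0, 1), (2, 3)))
print(orientation((2, 3), (0, 1)))
print(orientation((0, 0), (3, 4)))
```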
Pre-Ordering: Instead of handling reordering within the decoding process, we may pre-order the input sentence into output word order.
Gispert et al. (2015)
use an input dependency tree to learn a model that swaps child nodes, and implement it using a feed-forward neural network. Barone and Attardi (2015)
formulate a top-down left-to-right walk through the dependency tree and make reordering decisions at each node. They model this process with a recurrent neural network that includes past decisions in the conditioning context.
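The general idea of tree-based pre-ordering can be sketched as a recursive walk that makes a local reordering decision at each node and then reads off the words. The example below is a deliberately simplified variant: it only decides whether a right dependent should move in front of its head, and the decision function `should_swap` is a hand-written toy rule standing in for a trained neural classifier. The `Node` class and all names are assumptions made for illustration and do not reproduce either cited model.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    word: str
    pos: str
    left: list = field(default_factory=list)   # dependents before the head
    right: list = field(default_factory=list)  # dependents after the head


def should_swap(head, child):
    # Toy stand-in for a learned classifier: move right dependents of a
    # verb in front of it (roughly SVO -> SOV order).
    return head.pos == "VERB"


def preorder(node):
    """Return the words under `node` in the pre-ordered sequence."""
    before = [w for child in node.left for w in preorder(child)]
    after = []
    for child in node.right:
        words = preorder(child)
        if should_swap(node, child):
            before.extend(words)   # child moves in front of the head
        else:
            after.extend(words)    # child stays behind the head
    return before + [node.word] + after


tree = Node("reads", "VERB",
            left=[Node("she", "PRON")],
            right=[Node("books", "NOUN", left=[Node("old", "ADJ")])])
print(preorder(tree))
```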
N-Gram Translation Models:
An alternative view of the phrase-based translation model is to break up phrase translations into minimal translation units and to employ an n-gram model over these units, conditioning each minimal translation unit on the previous ones. Schwenk et al. (2007)
treat each minimal translation unit as an atomic symbol and train a neural language model over it. Alternatively, Hu et al. (2014)
represent the minimal translation units as bags of words, Wu et al. (2014)
break them down even further into single input words, single output words, or single input-output word pairs, and Yu and Zhu (2015)
use phrase embeddings learned with an auto-encoder.
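To make the n-gram view concrete, the sketch below treats each minimal translation unit as an atomic (source, target) symbol and estimates a bigram model over sequences of such units. It is a count-based stand-in with maximum-likelihood estimates and no smoothing, not one of the neural models cited above; the function names and the toy corpus are assumptions made for this example.

```python
from collections import Counter

START = ("<s>", "<s>")  # artificial sentence-start unit


def train_bigram(sequences):
    """Count unit bigrams over sequences of (source, target) units."""
    bigrams, histories = Counter(), Counter()
    for seq in sequences:
        units = [START] + list(seq)
        for prev, cur in zip(units, units[1:]):
            bigrams[(prev, cur)] += 1
            histories[prev] += 1
    return bigrams, histories


def prob(bigrams, histories, prev, cur):
    """Maximum-likelihood estimate of p(cur | prev); 0.0 for unseen histories."""
    if histories[prev] == 0:
        return 0.0
    return bigrams[(prev, cur)] / histories[prev]


corpus = [
    [("das", "the"), ("Haus", "house")],
    [("das", "the"), ("Auto", "car")],
]
bigrams, histories = train_bigram(corpus)
print(prob(bigrams, histories, ("das", "the"), ("Haus", "house")))
```

A neural variant would replace the count tables with a network that embeds the previous units and predicts the next one, which helps with the sparsity that atomic unit symbols cause.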
Neural Models as Statistical Machine Translation Components
- Wang et al. (2016)
- Wang et al. (2017)
- Sennrich (2015)
- Zhang et al. (2016)
- Peter et al. (2016)
- Zhang et al. (2015)
- Setiawan et al. (2015)
- Lu et al. (2014)
- Zhai et al. (2014)
- Liu et al. (2013)
- Liu et al. (2014)
- Kanouchi et al. (2016)
- Cui et al. (2016)
- Stahlberg et al. (2016)
- Sundermeyer et al. (2014)
- Schwenk (2012)
- Le et al. (2012)
- Addanki and Wu (2014)
- Do et al. (2014)
- Li et al. (2013)
- Wu et al. (2014)
- Auli et al. (2013)
- Legrand et al. (2016)
- Sabet et al. (2016)
- Tamura et al. (2014)
- Yang et al. (2013)
- Soricut and Och (2015)
- Tran et al. (2014)
Evaluation Metrics based on Neural Models
- Guzmán et al. (2016)
- Gupta et al. (2015)
- Guzmán et al. (2015)