Training Step 7: Build reordering model

By default, only a distance-based reordering model is included in the final configuration. This model assigns a cost that is linear in the reordering distance. For instance, skipping over two words costs twice as much as skipping over one word.
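The linear cost can be illustrated with a minimal sketch. The function name, the position convention (0-based source word indices), and the weight parameter are assumptions made for illustration; this is not code from Moses itself.

```python
def distance_cost(prev_end: int, next_start: int, weight: float = 1.0) -> float:
    """Illustrative distance-based reordering cost (hypothetical helper).

    prev_end   -- source index of the last word of the previously translated phrase
    next_start -- source index of the first word of the phrase translated next
    weight     -- assumed scaling factor; the cost is linear in the distance
    """
    # Number of source words skipped; 0 when translation proceeds monotonically.
    skipped = abs(next_start - prev_end - 1)
    return weight * skipped


# Continuing directly after the previous phrase skips nothing.
print(distance_cost(prev_end=2, next_start=3))  # 0.0
# Skipping two words costs twice as much as skipping one.
print(distance_cost(prev_end=2, next_start=4))  # 1.0
print(distance_cost(prev_end=2, next_start=5))  # 2.0
```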

However, additional conditional reordering models, so-called lexicalized reordering models, may be built. There are three types of lexicalized reordering models in Moses, based on Koehn et al. (2005) and Galley and Manning (2008). The Koehn et al. model determines the orientation of two phrases based on word alignments at training time and on phrase alignments at decoding time. The other two models are based on Galley and Manning. The phrase-based model uses phrases at both training and decoding time, and the hierarchical model allows combinations of several phrases for determining the orientation.

A lexicalized reordering model is specified by a configuration string consisting of five parts, each of which accounts for a different aspect:

Any possible configuration of these five factors is allowed. It is always necessary to specify orientation and language. The other three factors use the default values indicated above if they are not specified. Some examples of possible models are:

and of course distance.

The reordering model(s) to use (and to build during the training process, if necessary) can be set with the switch -reordering, e.g.:

 -reordering distance
 -reordering msd-bidirectional-fe
 -reordering msd-bidirectional-fe,hier-mslr-bidirectional-fe
 -reordering distance,msd-bidirectional-fe,hier-mslr-bidirectional-fe

Note that the distance model is always included, so there is no need to specify it.

The number of features created by a lexicalized reordering model depends on the type of the model. If the flag allff is used, an msd model has three features, one each for the probability that the phrase is translated monotone, swapped, or discontinuous; an mslr model has four features; and a monotonicity or leftright model has two features. If a bidirectional model is used, the number of features doubles, with one set for each direction. If collapseff is used, there is one feature for each direction, regardless of which orientation types are used.
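The counting rules above can be sketched as a small helper. The function name and the parsing of the configuration string are assumptions made for illustration, not part of Moses; the sketch also assumes that a model without "bidirectional" in its string covers a single direction and that allff applies when collapseff is not given.

```python
# Features per direction under allff, as described in the text above.
ORIENTATION_FEATURES = {"msd": 3, "mslr": 4, "monotonicity": 2, "leftright": 2}


def count_features(config: str) -> int:
    """Return the number of features for a configuration string such as
    'msd-bidirectional-fe' (hypothetical helper, not a Moses function)."""
    parts = config.split("-")
    orientation = next((p for p in parts if p in ORIENTATION_FEATURES), None)
    if orientation is None:
        raise ValueError(f"no orientation type found in {config!r}")
    # Assumption: a single direction unless 'bidirectional' is specified.
    directions = 2 if "bidirectional" in parts else 1
    if "collapseff" in parts:
        # One feature per direction, regardless of the orientation type.
        return directions
    # Assumption: allff behaviour when collapseff is not specified.
    return ORIENTATION_FEATURES[orientation] * directions


print(count_features("msd-bidirectional-fe"))             # 6
print(count_features("hier-mslr-bidirectional-fe"))       # 8
print(count_features("msd-bidirectional-fe-collapseff"))  # 2
```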

There are also a number of other flags concerning the reordering models that can be given to train-model.perl: