By default, only a distance-based reordering model is included in the final configuration. This model assigns a cost that is linear in the reordering distance. For instance, skipping over two words costs twice as much as skipping over one word.
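As a rough illustration of the linear cost, the sketch below assumes the usual phrase-based formulation in which the distance is the number of source words skipped between the end of the previously translated phrase and the start of the next one. The function name and the index convention are hypothetical and are not taken from the Moses code.

```python
def distance_cost(prev_end: int, next_start: int) -> int:
    """Number of source words skipped between two consecutively translated phrases.

    prev_end   -- index of the last source word of the previously translated phrase
    next_start -- index of the first source word of the phrase translated next
    (Assumed convention, for illustration only.)
    """
    return abs(next_start - prev_end - 1)


# Monotone continuation: nothing is skipped.
print(distance_cost(prev_end=2, next_start=3))  # 0
# Skipping one word versus two words: the cost doubles.
print(distance_cost(prev_end=2, next_start=4))  # 1
print(distance_cost(prev_end=2, next_start=5))  # 2
```

In a decoder this raw distance would be multiplied by a tuned feature weight, so only the linear relationship matters here.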
However, additional conditional reordering models, so-called lexicalized reordering models, may be built. There are three types of lexicalized reordering models in Moses, based on Koehn et al. (2005) and Galley and Manning (2008). The Koehn et al. model determines the orientation of two phrases based on word alignments at training time, and based on phrase alignments at decoding time. The other two models are based on Galley and Manning. The phrase-based model uses phrases both at training and decoding time, and the hierarchical model allows combinations of several phrases for determining the orientation.
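The lexicalized reordering models are specified by a configuration string with five parts, each accounting for a different aspect of the model: model type, orientation, directionality, language, and collapsing of scores. The possible values for each part are listed below, in that order: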
wbe
- word-based extraction (but phrase-based at decoding). This is the original model in Moses. DEFAULT
phrase
- phrase-based model
hier
- hierarchical model
mslr
- Considers four different orientations: monotone, swap, discontinuous-left, discontinuous-right
msd
- Considers three different orientations: monotone, swap, discontinuous (the two discontinuous classes of the mslr model are merged into one class)
monotonicity
- Considers two different orientations: monotone or non-monotone (swap and discontinuous of the msd model are merged into the non-monotone class)
leftright
- Considers two different orientations: left or right (the four classes in the mslr model are merged into two classes: swap and discontinuous-left into left, and monotone and discontinuous-right into right)
backward
- determine orientation with respect to previous phrase DEFAULT
forward
- determine orientation with respect to following phrase
bidirectional
- use both backward and forward models
fe
- conditioned on both the source and target languages
f
- conditioned on the source language only
allff
- treat the scores as individual feature functions DEFAULT
collapseff
- collapse all scores in one direction into one feature function
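The orientation classes in the list above, and the way the coarser sets are merged from mslr, can be sketched as follows. The classification rule is an assumption for illustration: it compares the source span of the current phrase with that of the previous phrase (the backward direction), assumes the spans do not overlap, and uses hypothetical function names. The merge tables follow the descriptions in the list.

```python
def mslr_orientation(prev_span, cur_span):
    """Classify the current phrase relative to the previous one.

    Spans are inclusive (start, end) source-word index pairs and are
    assumed not to overlap (illustrative rule, not the Moses code).
    """
    prev_start, prev_end = prev_span
    cur_start, cur_end = cur_span
    if cur_start == prev_end + 1:
        return "monotone"             # directly to the right of the previous phrase
    if cur_end == prev_start - 1:
        return "swap"                 # directly to the left of the previous phrase
    if cur_start > prev_end:
        return "discontinuous-right"  # to the right, with a gap
    return "discontinuous-left"       # to the left, with a gap


# How each coarser orientation set merges the four mslr classes.
MERGES = {
    "msd": {"monotone": "monotone", "swap": "swap",
            "discontinuous-left": "discontinuous",
            "discontinuous-right": "discontinuous"},
    "monotonicity": {"monotone": "monotone", "swap": "non-monotone",
                     "discontinuous-left": "non-monotone",
                     "discontinuous-right": "non-monotone"},
    "leftright": {"monotone": "right", "swap": "left",
                  "discontinuous-left": "left",
                  "discontinuous-right": "right"},
}


def coarsen(mslr_class, scheme):
    """Map an mslr class to the matching class of a coarser scheme."""
    return MERGES[scheme][mslr_class]


o = mslr_orientation(prev_span=(3, 4), cur_span=(1, 2))
print(o)                            # swap
print(coarsen(o, "monotonicity"))   # non-monotone
print(coarsen(o, "leftright"))      # left
```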
Any combination of these five factors is allowed. It is always necessary to specify orientation and language. The other three factors use the default values indicated above if they are not specified. Some examples of possible models are:
msd-bidirectional-fe
(this model is commonly used; for instance, it is the model used in the WMT baselines)
wbe-msd-bidirectional-fe-allff
same model as above
mslr-f
wbe-backward-mslr-f-allff
same model as above
phrase-msd-bidirectional-fe
hier-mslr-bidirectional-fe
hier-leftright-forward-f-collapseff
and, of course, distance.
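To make the defaulting rules concrete, here is a minimal sketch of how a lexicalized configuration string could be expanded into its five factors. It is not the parser used by Moses: the function name, the group names, and the error handling are hypothetical, and the plain distance model is deliberately not handled.

```python
GROUPS = {
    "modeltype": ("wbe", "phrase", "hier"),
    "orientation": ("mslr", "msd", "monotonicity", "leftright"),
    "direction": ("backward", "forward", "bidirectional"),
    "language": ("fe", "f"),
    "collapsing": ("allff", "collapseff"),
}
DEFAULTS = {"modeltype": "wbe", "direction": "backward", "collapsing": "allff"}
REQUIRED = ("orientation", "language")


def parse_reordering_config(spec):
    """Expand a config string such as 'msd-bidirectional-fe' into all five factors."""
    config = {}
    for token in spec.split("-"):
        # Find the factor this token belongs to; the order of tokens does not matter.
        group = next((g for g, values in GROUPS.items() if token in values), None)
        if group is None:
            raise ValueError(f"unknown part {token!r} in {spec!r}")
        if group in config:
            raise ValueError(f"{group} given twice in {spec!r}")
        config[group] = token
    missing = [g for g in REQUIRED if g not in config]
    if missing:
        raise ValueError(f"{spec!r} must specify {' and '.join(missing)}")
    # Fill in defaults for the three optional factors.
    return {**DEFAULTS, **config}


short = parse_reordering_config("msd-bidirectional-fe")
full = parse_reordering_config("wbe-msd-bidirectional-fe-allff")
print(short == full)  # True
print(parse_reordering_config("mslr-f") ==
      parse_reordering_config("wbe-backward-mslr-f-allff"))  # True
```

The two comparisons mirror the "same model as above" pairs in the examples.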
The reordering model(s) to use (and to build during the training process, if necessary) can be set with the switch -reordering, e.g.:
-reordering distance
-reordering msd-bidirectional-fe
-reordering msd-bidirectional-fe,hier-mslr-bidirectional-fe
-reordering distance,msd-bidirectional-fe,hier-mslr-bidirectional-fe
Note that the distance model is always included, so there is no need to specify it.
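The effect of this rule on the value passed to -reordering can be illustrated with a small sketch. It only models the documented behaviour (a comma-separated list, with distance always present); the function name is hypothetical and this is not how train-model.perl is implemented.

```python
def reordering_models(switch_value):
    """Return the list of models implied by a -reordering value.

    The distance model is always included, whether or not it is listed.
    """
    listed = [m for m in switch_value.split(",") if m]
    return ["distance"] + [m for m in listed if m != "distance"]


print(reordering_models("msd-bidirectional-fe"))
# ['distance', 'msd-bidirectional-fe']
print(reordering_models("distance,msd-bidirectional-fe") ==
      reordering_models("msd-bidirectional-fe"))  # True
```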
The number of features that are created with a lexical reordering model depends on the type of the model. If the flag allff is used, an msd model has three features, one each for the probability that the phrase is translated monotone, swapped, or discontinuous; an mslr model has four features; and a monotonicity or leftright model has two features. If a bidirectional model is used, the number of features doubles, with one set for each direction. If collapseff is used, there is one feature for each direction, regardless of which orientation types are used.
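The counting rule can be written out as a short sketch. The function and argument names are hypothetical; the numbers follow directly from the description above.

```python
ORIENTATION_CLASSES = {"mslr": 4, "msd": 3, "monotonicity": 2, "leftright": 2}


def feature_count(orientation, direction="backward", collapsing="allff"):
    """Number of features created by one lexicalized reordering model."""
    directions = 2 if direction == "bidirectional" else 1
    # collapseff folds all orientation scores of one direction into a single feature.
    per_direction = 1 if collapsing == "collapseff" else ORIENTATION_CLASSES[orientation]
    return directions * per_direction


print(feature_count("msd", "bidirectional"))                  # 6  (msd-bidirectional-fe)
print(feature_count("mslr"))                                  # 4  (mslr-f)
print(feature_count("mslr", "bidirectional"))                 # 8  (hier-mslr-bidirectional-fe)
print(feature_count("leftright", "forward", "collapseff"))    # 1
print(feature_count("msd", "bidirectional", "collapseff"))    # 2
```

The language factor (fe or f) and the model type do not change the count, so they are left out of the signature.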
There are also a number of other flags that can be given to train-model.perl that concern the reordering models:
--reordering-smooth
- specifies the smoothing constant to be used for training lexicalized reordering models. If the letter u follows the constant, smoothing is based on actual counts. (default 0.5)
--max-lexical-reordering
- if this flag is used, the extract file will contain information for the mslr orientations for all three model types: wbe, phrase and hier. Otherwise, the extract file will contain only the minimum information needed for the reordering model config strings that are given.