machine translation

Word Lattices

A word lattice is a directed acyclic graph with a single start point and edges labeled with a word and weight. Unlike confusion networks which additionally impose the requirement that every path must pass through every node, word lattices can represent any finite set of strings (although this generality makes word lattices slightly less space-efficient than confusion networks). However, in general a word lattice can represent an exponential number of sentences in polynomial space. Here is an example lattice showing possible ways of decompounding some compound words in German:

Lattice Example

Moses can decode input represented as a word lattice, and, in most useful cases, do this far more efficiently than if each sentence encoded in the lattice were decoded serially. When Moses translates input encoded as a word lattice the translation it chooses maximizes the translation probability along any path in the input (but, to be clear, a single translation hypothesis in Moses corresponds to a single path through the input lattice).

How to represent lattice inputs

Lattices are encoded by ordering the nodes in a topological ordering (there may be more than one way to do this- in general, any one is as good as any other, but see the comments on -max-phrase-length below) and using this ordering to assign consecutive numerical IDs to the nodes. Then, proceeding in order through the nodes, each node lists its outgoing edges and any weights associated with them. For example, the above lattice can be written in the moses format (also called the Python lattice format -- PLF):

   ('einen', 1.0, 1),
   ('wettbewerbsbedingten', 0.5, 2),
   ('wettbewerbs', 0.25, 1),
   ('wettbewerb', 0.25, 1),
   ('bedingten', 1.0, 1),
   ('preissturz', 0.5, 2),
   ('preis', 0.5, 1),
   ('sturz', 1.0, 1),

The second number is the probability associated with an edge. The third number is distance between the start and end nodes of the edge, defined as the numerical ID of the end node minus the numerical ID of the start node. Note that the nodes must be numbered in topological order for the distance calculation.

Typically, one writes lattices this with no spaces, on a single line as follows:

 ((('einen',1.0,1),),(('wettbewerbsbedingten',0.5,2),('wettbewerbs',0.25,1), \
 ('wettbewerb',0.25, 1),),(('bedingten',1.0,1),),(('preissturz',0.5,2), \

Configuring moses to translate lattices

To indicate that moses will be reading lattices in PLF format, you need to specify -inputtype 2 on the command line or in the moses.ini configuration file. Additionally, it is necessary to specify the feature weight that will be used to incorporate arc probability (may not necessarily be a probability!) into the translation model. To do this, add -weight-i X where X is any real number.

In word lattices, the phrase length limit imposed by the -max-phrase-length parameter (default: 20) limits the difference between the indices of the start and the end node of a phrase. If your lattice contains long jumps, you may need to increase -max-phrase-length and/or renumber the nodes to make the jumps smaller.

Verifying PLF files with checkplf

The command moses-cmd/src/checkplf reads a PLF (lattice format) input file and verifies the format as well as producing statistics.

Here's an example running the application on buggy input:

 ./checkplf < tanaka.plf  
 Reading PLF from STDIN...
 Line 1: edge goes beyond goal node at column position 8, edge label = 'TANAKA'
   Goal node expected at position 12, but edge references a node at position 13

Here's an example running the application on good input:

 christopher-dyers-macbook:src redpony$ ./checkplf < ok.plf 
 Reading PLF from STDIN... 
 PLF format appears to be correct.
        Number of lattices: 1
     Total number of nodes: 7
     Total number of edges: 9
           Average density: 1.28571 edges/node
     Total number of paths: 4
   Average number of paths: 4


If you use Moses's lattice translation in your research, please cite the following paper:

Chris Dyer, Smaranda Muresan, and Philip Resnik. Generalizing Word Lattice Translation. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), July 2008.

Edit - History - Print
Page last modified on March 08, 2013, at 04:39 PM