A word lattice is a directed acyclic graph with a single start point and edges labeled with a word and weight. Unlike confusion networks which additionally impose the requirement that every path must pass through every node, word lattices can represent any finite set of strings (although this generality makes word lattices slightly less space-efficient than confusion networks). However, in general a word lattice can represent an exponential number of sentences in polynomial space. Here is an example lattice showing possible ways of decompounding some compound words in German:
Moses can decode input represented as a word lattice, and, in most useful cases, do this far more efficiently than if each sentence encoded in the lattice were decoded serially. When Moses translates input encoded as a word lattice the translation it chooses maximizes the translation probability along any path in the input (but, to be clear, a single translation hypothesis in Moses corresponds to a single path through the input lattice).
Lattices are encoded by ordering the nodes in a topological ordering (there may be more than one way to do this- in general, any one is as good as any other, but see the comments on -max-phrase-length
below) and using this ordering to assign consecutive numerical IDs to the nodes. Then, proceeding in order through the nodes, each node lists its outgoing edges and any weights associated with them. For example, the above lattice can be written in the moses format (also called the Python lattice format -- PLF):
( ( ('einen', 1.0, 1), ), ( ('wettbewerbsbedingten', 0.5, 2), ('wettbewerbs', 0.25, 1), ('wettbewerb', 0.25, 1), ), ( ('bedingten', 1.0, 1), ), ( ('preissturz', 0.5, 2), ('preis', 0.5, 1), ), ( ('sturz', 1.0, 1), ), )
The second number is the probability associated with an edge. The third number is distance between the start and end nodes of the edge, defined as the numerical ID of the end node minus the numerical ID of the start node. Note that the nodes must be numbered in topological order for the distance calculation.
Typically, one writes lattices this with no spaces, on a single line as follows:
((('einen',1.0,1),),(('wettbewerbsbedingten',0.5,2),('wettbewerbs',0.25,1), \ ('wettbewerb',0.25, 1),),(('bedingten',1.0,1),),(('preissturz',0.5,2), \ ('preis',0.5,1),),(('sturz',1.0,1),),)
To indicate that moses will be reading lattices in PLF format, you need to specify -inputtype 2
on the command line or in the moses.ini configuration file. Additionally, it is necessary to specify the feature weight that will be used to incorporate arc probability (may not necessarily be a probability!) into the translation model. To do this, add -weight-i X
where X
is any real number.
In word lattices, the phrase length limit imposed by the -max-phrase-length
parameter (default: 20) limits the difference between the indices of the start and the end node of a phrase. If your lattice contains long jumps, you may need to increase -max-phrase-length
and/or renumber the nodes to make the jumps smaller.
checkplf
The command moses-cmd/src/checkplf
reads a PLF (lattice format) input file and verifies the format as well as producing statistics.
Here's an example running the application on buggy input:
./checkplf < tanaka.plf Reading PLF from STDIN... Line 1: edge goes beyond goal node at column position 8, edge label = 'TANAKA' Goal node expected at position 12, but edge references a node at position 13
Here's an example running the application on good input:
christopher-dyers-macbook:src redpony$ ./checkplf < ok.plf Reading PLF from STDIN... PLF format appears to be correct. STATISTICS: Number of lattices: 1 Total number of nodes: 7 Total number of edges: 9 Average density: 1.28571 edges/node Total number of paths: 4 Average number of paths: 4
If you use Moses's lattice translation in your research, please cite the following paper:
Chris Dyer, Smaranda Muresan, and Philip Resnik. Generalizing Word Lattice Translation. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), July 2008.