Projects

Zones As Features For Phrase Based Decoding

Project leader: Colin Cherry

Desirable skills for participants: C++

Moses currently implements zones in the input text as hard re-ordering constraints: once the decoder enters a zone, everything inside it must be translated before the decoder can leave. This project would turn the same idea into features: instead of forbidding the decoder from leaving an annotated zone early, each interruption of a zone would incur a learned penalty. Adapters that create annotated zones (and therefore features) from the output of popular NLP tools, such as PTB constituency parsers, Stanford dependencies, or named entity recognizers, would be added to Moses as part of this project.
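To make the idea of an "interrupted" zone concrete, here is a minimal sketch in Python rather than Moses C++. It is not Moses code: the function name, the span representation (inclusive source word indices), and the choice to count one interruption each time the decoder leaves a partially translated zone are all assumptions made for illustration. A phrase that overlaps a zone is treated as being inside it.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # inclusive (start, end) source word indices


def count_zone_interruptions(zones: List[Span],
                             phrase_order: List[Span]) -> List[int]:
    """Count, per zone, how often translation leaves the zone before it is done.

    phrase_order lists the source spans of the phrases in the order the
    decoder translated them (hypothetical representation, not a Moses API).
    """
    counts = []
    for z_start, z_end in zones:
        zone_size = z_end - z_start + 1
        covered = 0        # number of zone words translated so far
        in_zone = False    # was the previous phrase inside this zone?
        interruptions = 0
        for p_start, p_end in phrase_order:
            overlap = min(p_end, z_end) - max(p_start, z_start) + 1
            if overlap > 0:
                covered += overlap
                in_zone = True
            else:
                # Leaving a zone that is only partially covered is one
                # interruption; a hard constraint would forbid this step.
                if in_zone and covered < zone_size:
                    interruptions += 1
                in_zone = False
        counts.append(interruptions)
    return counts


zone = [(1, 3)]
monotone = [(0, 0), (1, 2), (3, 3), (4, 4)]
reordered = [(0, 0), (1, 2), (4, 4), (3, 3)]
print(count_zone_interruptions(zone, monotone))
print(count_zone_interruptions(zone, reordered))
```

Under a hard constraint the second ordering would simply be disallowed; as a feature, its count would be multiplied by a tuned weight and added to the model score.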

This work is inspired by the success of previous attempts to create soft constraints from the output of syntactic parsers, such as:

Colin Cherry, Cohesive Phrase-Based Decoding for Statistical Machine Translation, in Proceedings of ACL: HLT, June 2008

Yuval Marton and Philip Resnik, Soft Syntactic Constraints for Hierarchical Phrased-Based Translation, in Proceedings of ACL: HLT, June 2008

The hope is that these soft constraints will have even greater impact in the presence of improved tuning technology (PRO/MIRA). This result would not be a surprise (read as: would probably not be publishable), as these sorts of features were tested in an early MIRA paper:

David Chiang, Yuval Marton and Philip Resnik, Online Large-Margin Training of Syntactic and Structural Translation Features, in Proceedings of EMNLP, October 2008

The primary goal is to provide authors of monolingual NLP tools with an easy way to test the impact of their automatic linguistic annotations on a well-defined problem such as SMT. A sub-goal is to compare various sources of annotated, unlikely-to-be-violated spans, such as dependency subtrees, constituency subtrees, syntactic chunks, or named entities. The penalty for leaving any such span could be learned based on attributes of the span (non-terminal labels for constituency trees, part-of-speech tags or arc labels for dependencies).
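As an illustration of what an adapter for constituency parses might do, the sketch below reads a PTB-style bracketed parse and emits labeled spans, then groups interruption counts by label so that each non-terminal can receive its own learned weight. Everything here is an assumption for illustration: the function names, the `ZoneBreak_` feature prefix, and the decision to skip single-word spans (which cannot be interrupted). The reader also assumes every bracket carries a label, so an unlabeled outer bracket would need to be stripped first.

```python
from collections import Counter
from typing import Dict, List, Tuple

LabeledSpan = Tuple[str, int, int]  # (label, start, end), inclusive word indices


def constituent_spans(ptb: str) -> List[LabeledSpan]:
    """Extract multi-word constituents from a bracketed parse string."""
    tokens = ptb.replace("(", " ( ").replace(")", " ) ").split()
    spans: List[LabeledSpan] = []
    stack: List[Tuple[str, int]] = []  # (label, index of first word)
    word_index = 0
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "(":
            # Assumes the token after "(" is always the node label.
            stack.append((tokens[i + 1], word_index))
            i += 2
        elif tok == ")":
            label, start = stack.pop()
            end = word_index - 1
            if end > start:  # single-word spans cannot be interrupted
                spans.append((label, start, end))
            i += 1
        else:
            word_index += 1  # a terminal (word)
            i += 1
    return spans


def label_features(spans: List[LabeledSpan],
                   interruptions: List[int]) -> Dict[str, int]:
    """Sum interruption counts per label into sparse features."""
    feats: Counter = Counter()
    for (label, _, _), n in zip(spans, interruptions):
        if n:
            feats["ZoneBreak_" + label] += n
    return dict(feats)


parse = "(S (NP (DT the) (NN cat)) (VP (VBD sat) (PP (IN on) (NP (DT the) (NN mat)))))"
spans = constituent_spans(parse)
print(spans)
# Suppose a decoder interrupted the first NP once and the PP twice.
print(label_features(spans, [1, 0, 2, 0, 0]))
```

Keying the features by label lets the tuner decide, for example, that breaking up an NP is costlier than breaking up a VP. The same scheme would apply to dependency subtrees with arc labels in place of non-terminals.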

Stretch goals could involve implementing the constraints / features for hierarchical decoding as well. We could also look into engineering more complex features that take into account details such as span length and nesting depth.
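One possible shape for such richer features is sketched below: each zone is described by a bucketed length and a capped nesting depth, and the resulting string serves as a sparse feature name. The bucket boundaries, the depth cap of 3, and the naming scheme are arbitrary choices for illustration, not part of Moses.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # inclusive (start, end) source word indices


def nesting_depths(spans: List[Span]) -> List[int]:
    """For each span, count the distinct-boundary spans that contain it."""
    depths = []
    for s, e in spans:
        depth = sum(
            1 for os, oe in spans
            if os <= s and e <= oe and (os, oe) != (s, e)
        )
        depths.append(depth)
    return depths


def length_bucket(length: int) -> str:
    """Coarse length buckets keep the number of features small."""
    if length <= 2:
        return "1-2"
    if length <= 4:
        return "3-4"
    return "5+"


def complex_feature_name(span: Span, depth: int) -> str:
    """Build a hypothetical feature name from span length and depth."""
    s, e = span
    return f"ZoneBreak_len{length_bucket(e - s + 1)}_depth{min(depth, 3)}"


spans = [(0, 5), (2, 5), (3, 5), (4, 5), (0, 1)]
for span, depth in zip(spans, nesting_depths(spans)):
    print(span, complex_feature_name(span, depth))
```

Bucketing and capping trade expressiveness for fewer parameters, which matters when the tuning set is small.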