Features To Model Word Span Of Non Terminals
We're using a Google Doc instead of this wiki:
https://docs.google.com/document/d/1YzGLQCOx9fy__nACOsq4qV-sf6QjFhUPKcqR8gVIKEE/edit
Project leader: Hieu Hoang
Desirable skills for participants: hierarchical MT, C++
Create features and/or hard constraints that model the number of words spanned by non-terminals in translation rules.

In the hierarchical phrase-based translation model, a non-terminal in a translation rule can span a variable number of words. This project aims to improve reordering by modelling the distribution of the number of words spanned by each non-terminal in each translation rule. This distribution can then be used as feature functions or to constrain where particular rules can be applied. The hierarchical extract program in Moses has partially implemented this project.

This project will be of interest to those who want an empirical understanding of the hierarchical translation model and reordering, which could lead to other avenues of research.
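As a rough illustration of the idea above, here is a minimal C++ sketch of a per-rule, per-non-terminal span-length distribution. It is not taken from Moses: the class name SpanLengthModel, its methods, the string rule identifier, the fixed maximum span and the add-one smoothing are all assumptions made for this example. LogProb shows how the distribution might be used as a feature function score, and IsAllowed shows one possible hard constraint (only allow span lengths observed during rule extraction).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch, not Moses code: collects how many words each
// non-terminal of each rule spanned during rule extraction.
class SpanLengthModel {
 public:
  // maxSpan is an assumed upper bound on the span length of a non-terminal.
  explicit SpanLengthModel(std::size_t maxSpan) : maxSpan_(maxSpan) {
    assert(maxSpan >= 1);
  }

  // Record one extracted instance of `rule` in which the non-terminal at
  // position `ntIndex` covered `length` words.
  void Observe(const std::string& rule, std::size_t ntIndex,
               std::size_t length) {
    assert(length >= 1 && length <= maxSpan_);
    std::vector<double>& hist = counts_[Key(rule, ntIndex)];
    if (hist.empty()) hist.assign(maxSpan_ + 1, 0.0);  // index 0 is unused
    hist[length] += 1.0;
  }

  // Feature score: add-one smoothed log P(length | rule, ntIndex).
  // A rule that was never observed gets a uniform distribution.
  double LogProb(const std::string& rule, std::size_t ntIndex,
                 std::size_t length) const {
    assert(length >= 1 && length <= maxSpan_);
    double count = 0.0;
    double total = 0.0;
    auto it = counts_.find(Key(rule, ntIndex));
    if (it != counts_.end()) {
      count = it->second[length];
      for (std::size_t l = 1; l <= maxSpan_; ++l) total += it->second[l];
    }
    return std::log((count + 1.0) / (total + static_cast<double>(maxSpan_)));
  }

  // Hard constraint: permit only span lengths seen at least once for this
  // non-terminal of this rule.
  bool IsAllowed(const std::string& rule, std::size_t ntIndex,
                 std::size_t length) const {
    if (length < 1 || length > maxSpan_) return false;
    auto it = counts_.find(Key(rule, ntIndex));
    return it != counts_.end() && it->second[length] > 0.0;
  }

 private:
  using Key = std::pair<std::string, std::size_t>;

  std::size_t maxSpan_;
  std::map<Key, std::vector<double>> counts_;
};
```

In a real decoder the score would presumably be computed when a rule is applied to a chart span, and the rule would be keyed by something more compact than a string. Whether to use the smoothed score, the hard constraint, or both is one of the questions the project is meant to explore.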