Projects

Integrating a Few Rules in the MT Pipeline

Project leader: Christian Buck

Desirable skills for participants: Perl/Python

Right now, the translations produced by Moses are based entirely on statistical models, as we would expect from an SMT system. There are, however, some cases where we might want to include rule-based guidance. These include the translation of markup (which usually should not be translated at all) and the translation of parts that are mostly reformatted, as is the case for dates and numbers. There is some support for supplying translation options through the XML capabilities of Moses, but this does not make it possible to sidestep, e.g., the language model, and it is limited in tracing marked-up words.
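To make the idea of "mostly reformatted" parts concrete, here is a minimal sketch of one such rule: rewriting English-style numbers with German separators. The pattern, the language pair, and the function name reformat_numbers are assumptions chosen for illustration. They are not part of Moses, and a real rule set would need to cover many more cases (dates, currencies, ordinals).

```python
import re

# Hypothetical pattern for English-style numbers such as "1,234.56" or "3.5":
# either comma-grouped thousands with optional decimals, or a plain decimal.
EN_NUMBER = re.compile(r"\d{1,3}(?:,\d{3})+(?:\.\d+)?|\d+\.\d+")

# Translation table that swaps "," and "." (English to German convention).
SWAP = str.maketrans(",.", ".,")


def reformat_numbers(text):
    """Rewrite every English-style number in text using German separators."""
    return EN_NUMBER.sub(lambda m: m.group(0).translate(SWAP), text)


print(reformat_numbers("It costs 1,234.56 euros, up 3.5 percent."))
# -> It costs 1.234,56 euros, up 3,5 percent.
```

A rule like this could run as a preprocessing step, with its output handed to the decoder as a fixed translation. How that hand-over is best done is exactly the open question of this project.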

Most of this work would be implemented as pre- and postprocessing steps, e.g. analyzing the alignment to restore markup. Thus, applicants are not required to have prior knowledge of the Moses codebase but should be familiar with some scripting language, e.g. Perl or Python.
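As an illustration of restoring markup from the alignment, the following sketch strips tags from a tokenized source sentence, remembers which words each tag pair enclosed, and re-inserts the tags around the aligned target words. It assumes that tags are separate tokens, that tag pairs are balanced and not nested, and that the word alignment is available as "i-j" source-target index pairs. All function names are hypothetical, and this is one possible approach rather than an existing Moses feature.

```python
import re
from collections import defaultdict

# Matches a single opening or closing tag token, e.g. "<b>" or "</b>".
TAG = re.compile(r"<(/?)(\w+)[^>]*>")


def strip_markup(tokens):
    """Return (words, spans); each span is (open_tag, close_tag, start, end)
    over word indices, with end exclusive."""
    words, spans, stack = [], [], []
    for tok in tokens:
        m = TAG.fullmatch(tok)
        if m is None:
            words.append(tok)
        elif m.group(1):  # closing tag: pair it with the latest opening tag
            open_tag, start = stack.pop()
            spans.append((open_tag, tok, start, len(words)))
        else:  # opening tag: remember the index of the next word
            stack.append((tok, len(words)))
    return words, spans


def parse_alignment(line):
    """Parse "0-2 1-1" style pairs into (source_index, target_index) tuples."""
    return [tuple(map(int, pair.split("-"))) for pair in line.split()]


def restore_markup(target, spans, alignment):
    """Wrap the target words aligned to each source span in its tag pair."""
    before, after = defaultdict(list), defaultdict(list)
    for open_tag, close_tag, start, end in spans:
        hits = [t for s, t in alignment if start <= s < end]
        if not hits:
            continue  # no aligned target word: this sketch drops the tags
        before[min(hits)].append(open_tag)
        after[max(hits)].insert(0, close_tag)
    out = []
    for i, word in enumerate(target):
        out.extend(before[i])
        out.append(word)
        out.extend(after[i])
    return out


source = ["click", "<b>", "here", "</b>", "now"]
words, spans = strip_markup(source)  # words go to the decoder without tags
target = ["jetzt", "hier", "klicken"]  # made-up decoder output
alignment = parse_alignment("0-2 1-1 2-0")  # made-up alignment
print(" ".join(restore_markup(target, spans, alignment)))
# -> jetzt <b> hier </b> klicken
```

When the aligned target words are not contiguous, the tag pair here simply encloses everything between the first and last aligned word. Whether that is the right choice, and what to do with unaligned or nested spans, is left open.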