Computer Aided Translation

Participants: Philipp Koehn, Logan Athan, Paola Valli, Rudolf Rosa, Sravana Reddy

Agenda for Monday: Attend lab, look at Caitra, read:

A process study of computer-aided translation, Philipp Koehn, Machine Translation Journal, 2009, volume 23, number 4, pages 241-263
Convergence of Translation Memory and Statistical Machine Translation, Philipp Koehn and Jean Senellart, AMTA Workshop on MT Research and the Translation Industry, 2010

We will meet Tuesday after the lectures

With the increasing use of machine translation as a tool for human translators, a number of ideas have emerged for providing more useful assistance than plain post-editing of machine translation output.

One interesting paradigm is interactive machine translation, where the translator types in the translation, while the system suggests the next words or phrases. We have a prototype implementation of this: Caitra.

The partial translation of the user (sometimes called a prefix) is matched against the search graph. If no exact match is found, the closest match according to string edit distance is returned.
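The matching step can be illustrated with a small sketch. This is not Caitra's actual code: the graph layout (a dictionary mapping each node to outgoing edges labeled with target phrases), the function names, and the toy sentences are all assumptions made for illustration, and the search simply enumerates every path, which is only feasible for a tiny graph.

```python
def edit_distance(a, b):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,              # delete x
                            curr[j - 1] + 1,          # insert y
                            prev[j - 1] + (x != y)))  # substitute or match
        prev = curr
    return prev[-1]


def closest_prefix_match(graph, start, user_prefix):
    """Return (distance, node, words) for the partial path closest to the prefix.

    graph is a hypothetical toy structure: node -> list of (next_node, phrase).
    Every partial path from `start` is scored, so this is exponential in the
    worst case; a real system would need a smarter search.
    """
    best = None
    stack = [(start, [])]
    while stack:
        node, words = stack.pop()
        dist = edit_distance(user_prefix, words)
        if best is None or dist < best[0]:
            best = (dist, node, words)
        for next_node, phrase in graph.get(node, []):
            stack.append((next_node, words + phrase.split()))
    return best


# Toy search graph (invented for this example).
toy_graph = {
    0: [(1, "the house"), (2, "the home")],
    1: [(3, "is small")],
    2: [(3, "is little")],
    3: [],
}

exact = closest_prefix_match(toy_graph, 0, "the home".split())
approx = closest_prefix_match(toy_graph, 0, "the house is tiny".split())
```

In this toy setup, the prefix "the home" is found exactly (distance 0), while "the house is tiny" has no exact match and falls back to the path "the house is small" at distance 1. The node reached by the best match is where the system would continue when suggesting the next words.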

Possible projects:

Matching with other error metrics, such as TER, to better account for reordering
Matching against the hypergraph of a chart decoder

Translation memories (TM) are a tool commonly used by translators. When translating material that is similar to previously translated data (for instance, a revision of a product manual), finding sentences that have been translated before, or previously translated sentences that are very close (called "fuzzy matches"), can be a useful starting point for a human translator.
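As a rough illustration of fuzzy matching, the sketch below scores a new source sentence against each source sentence in a translation memory. It assumes one common definition of the fuzzy match score, 1 minus the word-level edit distance divided by the length of the longer sentence. The function names, the 0.7 threshold, and the toy memory entries are assumptions for this example only.

```python
def edit_distance(a, b):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,              # delete x
                            curr[j - 1] + 1,          # insert y
                            prev[j - 1] + (x != y)))  # substitute or match
        prev = curr
    return prev[-1]


def fuzzy_match_score(source, tm_source):
    """Score in [0, 1]; 1.0 means the two sentences are identical."""
    s, t = source.split(), tm_source.split()
    longest = max(len(s), len(t))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(s, t) / longest


def best_fuzzy_match(source, memory, threshold=0.7):
    """Return (score, (tm_source, tm_target)) for the best entry, or None.

    memory is a list of (tm_source, tm_target) pairs; the threshold is an
    arbitrary value chosen for this sketch.
    """
    if not memory:
        return None
    best = max(memory, key=lambda entry: fuzzy_match_score(source, entry[0]))
    score = fuzzy_match_score(source, best[0])
    return (score, best) if score >= threshold else None


# Toy translation memory (invented data).
toy_memory = [
    ("press the red button to start the printer",
     "Drücken Sie die rote Taste, um den Drucker zu starten."),
    ("remove the paper tray",
     "Entfernen Sie das Papierfach."),
]

match = best_fuzzy_match("press the green button to start the printer", toy_memory)
no_match = best_fuzzy_match("completely unrelated sentence", toy_memory)
```

Here the new sentence differs from the first memory entry in one of eight words, giving a score of 0.875, so that entry's translation would be offered to the translator as a starting point. The unrelated sentence falls below the threshold and returns no match.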

Such fuzzy matches can also be reformulated as rules for a machine translation system. We have demonstrated in prior work that this is helpful, achieving better results than plain SMT or TM alone. This approach could be easily integrated into Moses as an MT Marathon project.

Other possible projects: confidence measures on sentence or word level.

Page last modified on September 06, 2011, at 07:08 AM