Link Deletion Word Alignment

Participant: Herve Saint-Amand

Traditional word alignments are not optimized for use with syntax-based translation models. Faulty alignment points often prevent the extraction of a large number of useful rules.

The following paper presents a relatively simple way of optimizing word alignments for syntax-based models:

Using Syntax to Improve Word Alignment Precision for Syntax-Based Machine Translation. Victoria Fossum, Kevin Knight, and Steven Abney. In Proceedings of ACL Statistical MT Workshop, 2008.

The idea is to start with the union of GIZA++ word alignments and train a perceptron classifier that deletes harmful alignment points. The classifier is trained on hand-aligned parallel corpora.
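
The idea above can be illustrated with a minimal sketch. It is not the implementation from the paper: the feature function (toy_features), the fan-out and distortion features, and all function names are hypothetical stand-ins chosen for illustration, and the paper's actual feature set is not reproduced here. The sketch also scores every link against the full union alignment in one pass, which is a simplification. Alignments are assumed to be sets of (source index, target index) pairs, for example parsed from GIZA++ output in both directions.

```python
from typing import Dict, List, Set, Tuple

Link = Tuple[int, int]          # (source word index, target word index)
Features = Dict[str, float]


def union_alignment(src2tgt: Set[Link], tgt2src: Set[Link]) -> Set[Link]:
    """Union of the two directional alignments (the starting point)."""
    return src2tgt | tgt2src


def toy_features(link: Link, links: Set[Link]) -> Features:
    """Hypothetical features for one link; placeholders, not the paper's."""
    i, j = link
    return {
        "bias": 1.0,
        # How many links share this source word / this target word.
        "src_fanout": float(sum(1 for a, _ in links if a == i)),
        "tgt_fanout": float(sum(1 for _, b in links if b == j)),
        # Distance from the diagonal.
        "distortion": float(abs(i - j)),
    }


def score(weights: Features, feats: Features) -> float:
    """Linear score; a positive score means 'delete this link'."""
    return sum(weights.get(name, 0.0) * value for name, value in feats.items())


def train_perceptron(
    examples: List[Tuple[Set[Link], Set[Link]]], epochs: int = 10
) -> Features:
    """Train on (union alignment, hand alignment) pairs.

    A link is a positive (delete) example when it is absent from the
    hand alignment. Standard perceptron update on each mistake.
    """
    weights: Features = {}
    for _ in range(epochs):
        for links, gold in examples:
            for link in sorted(links):
                feats = toy_features(link, links)
                should_delete = link not in gold
                predicted_delete = score(weights, feats) > 0
                if predicted_delete != should_delete:
                    sign = 1.0 if should_delete else -1.0
                    for name, value in feats.items():
                        weights[name] = weights.get(name, 0.0) + sign * value
    return weights


def prune(links: Set[Link], weights: Features) -> Set[Link]:
    """Keep only the links the classifier does not vote to delete."""
    return {l for l in links if score(weights, toy_features(l, links)) <= 0}
```

With trained weights, prune(union_alignment(a, b), weights) returns the reduced alignment that would then be passed on to rule extraction. In a real setup the features would need to capture the effect of each link on rule extraction, which the toy features here do not attempt.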

Page last modified on September 05, 2011, at 11:23 AM