Visualizing Machine Translation Search and Output

Participants: Philipp Koehn, Septina Dian Larasati, Valtezs Sics, Bushra Jawaid, Nathan Green, Michael Arcan, Firoj Alam

Agenda

  • Monday: Attend the lab today; we'll meet after the lab.
  • Tuesday: Try out the analysis options described in the experiment.perl documentation. We'll meet after lunch at 2pm in the lecture room to go over work to be done.

The typical research method in statistical machine translation is what Adam Lopez once called a "cargo cult". Somewhat motivated changes are made to the baseline system, the test set is re-translated, and if the BLEU score goes up, then the ritual song-and-dance of paper writing is performed.

However, it would be good to know more about what is going on inside the statistical machine translation system: why it arrived at a certain translation, what the main outstanding problems are, etc.

Among other efforts, there has been some work to integrate more visualization and analysis into the web-based interface of experiment.perl. But more can be done:

  • Visualization of the search graph of the phrase-based decoder. There is a rough prototype for the chart decoder (using HTML5 SVG), which could be extended.
  • Interactive search for "hope" translations, i.e., acceptable translations that are within reach of the decoder but were not scored highest.
  • Collection of various interesting statistics
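
To make the idea of a "hope" translation concrete, here is a minimal sketch in Python. It assumes the decoder's n-best list is available as (hypothesis, model score) pairs and that a reference translation exists. The function names, the data layout, and the unigram-overlap F1 used in place of BLEU are all illustrative assumptions, not part of experiment.perl or the decoder.

```python
from collections import Counter


def overlap_f1(hypothesis, reference):
    """Unigram F1 between two whitespace-tokenized strings.

    This is only a simple stand-in for a real metric such as BLEU.
    """
    hyp, ref = hypothesis.split(), reference.split()
    if not hyp or not ref:
        return 0.0
    # Clipped unigram matches: each reference word can be matched once.
    matches = sum((Counter(hyp) & Counter(ref)).values())
    if matches == 0:
        return 0.0
    precision = matches / len(hyp)
    recall = matches / len(ref)
    return 2 * precision * recall / (precision + recall)


def find_hope(nbest, reference):
    """Return (model_best, hope, hope_rank) for one sentence.

    nbest is a list of (hypothesis, model_score) pairs (hypothetical layout),
    where a higher model score means the decoder prefers that hypothesis.
    hope is the entry closest to the reference; hope_rank is its 0-based
    position in the model's own ranking.
    """
    ranked = sorted(nbest, key=lambda item: item[1], reverse=True)
    model_best = ranked[0]
    hope_rank, hope = max(
        enumerate(ranked),
        key=lambda pair: overlap_f1(pair[1][0], reference),
    )
    return model_best, hope, hope_rank


# Toy n-best list with made-up scores, for illustration only.
nbest = [
    ("the house is small", -4.2),
    ("the home is little", -3.9),
    ("a house small", -5.1),
]
reference = "the house is small"

model_best, hope, hope_rank = find_hope(nbest, reference)
print("model best:", model_best)
print("hope:      ", hope, "at rank", hope_rank)
```

A hope rank greater than zero marks a sentence where a better translation was within reach of the decoder but was outscored, which is the kind of case an interactive search would want to surface.
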
Page last modified on September 06, 2011, at 07:06 AM