Agenda for near-term work

(the person responsible and the deadline, if any, are given in parentheses)

  • [DONE] read from gzipped nbest files (Nicola, by Wednesday May 21)
  • [PARTIALLY DONE] add consistency check for feature and score files
    (Nicola, by Friday May 23)
  • modify mert-moses.pl and enhanced-mert.pl to interface with new mert
    (Barry, ???)
  • check correctness of optimization algorithm
    (Jean-Baptiste, ???)
  • add NIST score (???, ???)
  • performance evaluation (Nicola, by Friday June 14)

Improvement Of Minimum Error Rate Training

Developers

  • Nicola Bertoldi
  • Jean-Baptiste Fouet
  • Barry Haddow

Goal

  • improve Minimum Error Rate Training (MERT) to increase:
    • efficiency (speed, disk occupancy)
    • modularity
  • support:
    • distributed computation
    • new error measures
    • new optimization algorithms
    • reranking
  • implement:
    • new error measures
    • new optimization algorithms
  • rewrite in C++
  • documentation

Brainstorming

  • modification of the inner loop of mert moses
  • the optimization algorithm is independent of the error measure
  • store statistics and features in a binary format to speed up I/O
  • provide a text format for debugging
  • compare performance (results and speed) wrt old code
  • ...
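
The idea that the optimization algorithm is independent of the error measure can be sketched as follows. This is only an illustration in Python, not the C++ design of the project: the names one_best, corpus_score, random_search and toy_accuracy are hypothetical, and the toy error measure (a pair "correct, total" per hypothesis) is an assumption made to keep the example short. The optimizer only sees a callable that turns the statistics of the selected hypotheses into a score, so any error measure with that shape could be plugged in.

```python
import random
from typing import Callable, List, Sequence, Tuple

# One hypothesis = (feature values, error statistics). Hypothetical layout.
Hyp = Tuple[Sequence[float], Sequence[float]]
Scorer = Callable[[List[Sequence[float]]], float]


def one_best(nbest: List[Hyp], weights: Sequence[float]) -> Hyp:
    """Return the hypothesis with the highest weighted feature sum."""
    return max(nbest, key=lambda h: sum(w * f for w, f in zip(weights, h[0])))


def corpus_score(data: List[List[Hyp]], weights: Sequence[float],
                 scorer: Scorer) -> float:
    """Pick the 1-best of every nbest list and score their statistics."""
    chosen = [one_best(nbest, weights)[1] for nbest in data]
    return scorer(chosen)


def random_search(data: List[List[Hyp]], scorer: Scorer, dim: int,
                  trials: int = 100, seed: int = 0):
    """Dummy random search: sample weight vectors, keep the best one.

    The optimizer never looks inside the statistics; it only calls scorer.
    """
    rng = random.Random(seed)
    best_w, best_s = None, float("-inf")
    for _ in range(trials):
        w = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        s = corpus_score(data, w, scorer)
        if s > best_s:
            best_w, best_s = w, s
    return best_w, best_s


def toy_accuracy(stats: List[Sequence[float]]) -> float:
    """Hypothetical error measure: each statistics vector is (correct, total)."""
    correct = sum(s[0] for s in stats)
    total = sum(s[1] for s in stats)
    return correct / total if total else 0.0
```

Replacing toy_accuracy with another function of the accumulated statistics changes the error measure without touching random_search, and replacing random_search changes the optimizer without touching the error measure.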

Work to do

  • define new architecture
  • define new objects
  • define correlation between objects
  • define new formats for features and error statistics
  • implement more error measures: BLEU, NIST, WER, PER, METEOR(?), AER, ...
  • extract feature scores and statistics for many error measures at once
  • combine more error measures
  • implement more optimization algorithms: Simplex, Powell, Sampling, ..., dummy random search
  • Figure out why random search only works on debug build!
  • implement interfaces for more nbest formats (Moses, BTEC, ...) (if required)
  • optimization over a subset of features (not finished)
  • extract 1best given a set of feature weights
  • provide pointers between statistics and actual nbest texts
  • efficient binarization
  • add consistency check for files, ...
  • add support for reading gzipped files
  • modify interface with mert-moses.pl and enhanced-mert.pl
  • investigate METEOR support within the current interface
  • documentation
  • regression tests
  • evaluation of speed wrt old code
  • evaluation of error measures
  • evaluation of optimization algorithm

Work done

  • defined new architecture
  • defined new objects
  • defined correlation between objects
  • defined new formats for features and error statistics
  • created normalise.py, to perform nist-bleu normalisation of nbest file and references
  • implemented BLEU4 (multiple references, shortest/closest/average ref length)
    and PER (single reference)
  • implemented dummy random optimization
  • reading gzipped (text) files is now supported (they should have the .gz suffix)
  • efficient binarization

User guide

(More details will follow)

In trunk/mert/example there is a toy example

  • extraction of feature scores and the statistics of an error measure
 extractor --nbest NBEST
           --reference REF.0 REF.1 REF.2
           --sctype [BLEU4|PER]
           --ffile FEATSTAT.out
           --scfile SCORESTAT.out
           [--prev-ffile FEATSTAT.in]
           [--prev-scfile SCORESTAT.in]
           [--binary]
--binary: save data in a binary format
--prev-ffile: file with already computed feature scores
--prev-scfile: file with already computed error statistics
FEATSTAT.in and SCORESTAT.in can be in gzipped text, plain text, or binary format
NBEST can be in gzipped or plain text format
  • optimization of feature weights given the feature scores and the error statistics
 mert      --ffile FEATSTAT.in
           --scfile SCORESTAT.in
           -t Powell
FEATSTAT.in and SCORESTAT.in can be in gzipped text, plain text, or binary format
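
The two command lines above can be assembled programmatically, for instance when driving several extraction and optimization rounds from a script. The sketch below is in Python and only builds the argument lists; the helper names extractor_cmd and mert_cmd are hypothetical, and it assumes that the executables are called extractor and mert and are on the PATH. Only the options listed above are used, and references are passed as separate arguments, mirroring the usage line.

```python
from typing import List, Optional, Sequence

SCORE_TYPES = ("BLEU4", "PER")  # the values shown for --sctype above


def extractor_cmd(nbest: str, references: Sequence[str], sctype: str,
                  ffile: str, scfile: str,
                  prev_ffile: Optional[str] = None,
                  prev_scfile: Optional[str] = None,
                  binary: bool = False) -> List[str]:
    """Build the argument list for one extractor call."""
    if sctype not in SCORE_TYPES:
        raise ValueError(f"unsupported score type: {sctype}")
    if not references:
        raise ValueError("at least one reference file is required")
    cmd = ["extractor", "--nbest", nbest, "--reference", *references,
           "--sctype", sctype, "--ffile", ffile, "--scfile", scfile]
    # Optional inputs: data already computed in a previous round.
    if prev_ffile is not None:
        cmd += ["--prev-ffile", prev_ffile]
    if prev_scfile is not None:
        cmd += ["--prev-scfile", prev_scfile]
    if binary:
        cmd.append("--binary")
    return cmd


def mert_cmd(ffile: str, scfile: str, optimizer: str = "Powell") -> List[str]:
    """Build the argument list for one mert call."""
    return ["mert", "--ffile", ffile, "--scfile", scfile, "-t", optimizer]
```

Each list can then be handed to subprocess.run(cmd, check=True); keeping the construction separate from the execution makes the commands easy to log and to test.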

Formats

  • textual format for feature scores:
    • a header "FEATURES_BEGIN i N_i R f1 f2 ..fR" reporting:
      • a string identifying the type of the file "FEATURES_BEGIN"
      • the index "i" of the utterance the nbest list refers to
      • the size "N_i" of the nbest list
      • the number (R) of features
      • the list of the names of the features
    • one entry for each nbest hypothesis reporting the R feature values
    • a footer reporting the string "FEATURES_END" identifying the type of the file
 FEATURES_BEGIN 0 4 5 w_0 lm_0 lm_1 tm_0 tm_1
 0.1 2.1 3.1 0.4 -2.1
 2.3 3.1 1.1 -20.4 -2.0
 0.9 2.1 3.1 0.4 -2.1
 1.2 2.1 3.1 0.4 -2.1
 FEATURES_END
 FEATURES_BEGIN 1 3 5 w_0 lm_0 lm_1 tm_0 tm_1
 0.1 2.1 3.1 0.4 -2.1
 ....
 FEATURES_END
  • textual format for error scores:
    • a header "SCORES_BEGIN i N_i S name" reporting:
      • a string identifying the type of the file "SCORES_BEGIN"
      • the index "i" of the utterance the nbest list refers to
      • the size "N_i" of the nbest list
      • the number (S) of score statistics
      • the name of the score measure (BLEU, NIST, PER)
    • one entry for each nbest hypothesis reporting the S score statistics
    • a footer reporting the string "SCORES_END" identifying the type of the file
 SCORES_BEGIN 0 4 9 BLEU
 3 4 3 3 1 2 1 1 5
 2 5 1 4 1 3 1 2 6
 9 12 5 11 3 10 4 9 11
 6 21 4 20 3 19 2 18 20
 SCORES_END
 SCORES_BEGIN 1 3 9 BLEU
 3 4 3 3 1 2 1 1 5
 ...
 SCORES_END
  • keep the two files aligned
  • the header of the error scores file depends on the error measure
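
The textual formats described above can be read with a small parser. The following Python sketch is not the project's reader; the function names parse_blocks and check_alignment are hypothetical. It assumes that the header fields appear exactly in the order listed above, that all entry values are numeric, and that "aligned" means both files contain the same utterance indices in the same order with the same nbest sizes.

```python
from typing import Dict, Iterable, List


def parse_blocks(lines: Iterable[str], kind: str) -> List[Dict]:
    """Parse FEATURES or SCORES blocks from text lines.

    kind is "FEATURES" or "SCORES". Each block becomes a dict with the
    utterance index, the nbest size, the entry width (R or S), the names
    that follow in the header, and the numeric rows.
    """
    begin, end = f"{kind}_BEGIN", f"{kind}_END"
    blocks: List[Dict] = []
    current = None
    for raw in lines:
        tokens = raw.split()
        if not tokens:
            continue  # ignore empty lines
        if tokens[0] == begin:
            if current is not None:
                raise ValueError(f"nested {begin}")
            current = {"index": int(tokens[1]), "size": int(tokens[2]),
                       "width": int(tokens[3]), "names": tokens[4:],
                       "rows": []}
        elif tokens[0] == end:
            if current is None:
                raise ValueError(f"{end} without {begin}")
            if len(current["rows"]) != current["size"]:
                raise ValueError(f"utterance {current['index']}: expected "
                                 f"{current['size']} entries")
            blocks.append(current)
            current = None
        else:
            if current is None:
                raise ValueError("entry outside of a block")
            row = [float(t) for t in tokens]
            if len(row) != current["width"]:
                raise ValueError(f"utterance {current['index']}: expected "
                                 f"{current['width']} values per entry")
            current["rows"].append(row)
    if current is not None:
        raise ValueError(f"missing {end}")
    return blocks


def check_alignment(features: List[Dict], scores: List[Dict]) -> None:
    """Raise ValueError unless both files describe the same nbest lists."""
    if len(features) != len(scores):
        raise ValueError("different number of utterances")
    for f, s in zip(features, scores):
        if (f["index"], f["size"]) != (s["index"], s["size"]):
            raise ValueError(f"mismatch at utterance {f['index']}")
```

For example, parse_blocks(open("FEATSTAT.out"), "FEATURES") and parse_blocks(open("SCORESTAT.out"), "SCORES") can be passed to check_alignment before optimization; gzipped files could be opened with gzip.open(path, "rt") instead.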
Page last modified on August 05, 2008, at 12:43 PM