General info

For the labs, we can use the workstations kindly provided by the University of Trento in room 104.

To log in, please use the lab account you received for the MT Marathon. Please remember to log out of the machine every time you leave the room, and do not leave a workstation locked for long periods.

Machines are automatically switched off every evening at 7:30pm.

Software is pre-installed on the lab machines, mostly under /usr/local/smt_software/ (the patched GIZA is in a home directory):

  • GIZA (patched): /home/nicola.bertoldi_5/software/giza-pp/giza-pp-v1.0.5
  • IRSTLM: /usr/local/smt_software/irstlm/irstlm-r419
  • RANDLM: /usr/local/smt_software/randlm7/randlm-v0.2
  • SRILM: /usr/local/smt_software/srilm/srilm-1.5.10
  • MOSES: /usr/local/smt_software/moses/moses-r4163

Lab Material

When using EMS, set the following variables so that they point to the correct software:

  • moses-src-dir = /usr/local/smt_software/moses/moses-r4163
  • srilm-dir = /usr/local/smt_software/srilm/srilm-1.5.10/bin/i686
  • decoder = $moses-src-dir/bin/moses
  • ttable-binarizer = $moses-src-dir/bin/processPhraseTable
  • training-options = "-bin-dir=/home/nicola.bertoldi_5/software/giza-pp/giza-pp-v1.0.5/bin"

COLLINS-PARSER is NOT yet available for use with EMS. Please be careful with any settings that rely on syntactic parsing.

Day 1: EMS lab

Use the following commands to set up an experiment:

 cd
 mkdir experiment
 cd experiment
 cp /usr/local/smt_software/moses/moses-r4163/scripts/ems/example/config.toy .

Now edit the following settings in config.toy:

  • working-dir = /home/sci-mtm(YOUR-USER-ID)/experiment
  • moses-src-dir = /usr/local/smt_software/moses/moses-r4163
  • moses-script-dir = $moses-src-dir/scripts
  • srilm-dir = /usr/local/smt_software/srilm/srilm-1.5.10/bin/i686
  • decoder = $moses-src-dir/bin/moses
  • ttable-binarizer = $moses-src-dir/bin/processPhraseTable
  • training-options = "-bin-dir=/home/nicola.bertoldi_5/software/giza-pp/giza-pp-v1.0.5/bin"

You are now able to run the experiment:

 /usr/local/smt_software/moses/moses-r4163/scripts/ems/experiment.perl \
   -config config.toy -exec

See the Moses documentation for EMS for more details.

Day 2: Model 1

We discussed IBM Model 1 in the lecture today. In this lab you will implement the EM algorithm for IBM Model 1 in your favorite programming language. Here are some data sets to train on:

Your program should output two different things:

  • A table containing the word translation probabilities that were learned (note: think of an efficient data structure for such a sparse matrix)
  • The most likely alignment (the Viterbi alignment) for each sentence pair in the training data
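
As a rough illustration of both outputs, the sketch below stores the translation table as a Python dictionary keyed by (e, f) word pairs, so that only pairs that actually co-occur take up memory, and reads the Viterbi alignment off such a table. The probabilities and the function name `viterbi_alignment` are made up for illustration, and the sketch assumes no NULL word, in line with the lecture pseudo-code.

```python
# Sparse table: only word pairs seen together get an entry.
# Keys are (e, f) tuples, values are t(e|f). The numbers are invented.
t = {
    ("the", "das"): 0.7,
    ("house", "das"): 0.1,
    ("the", "Haus"): 0.2,
    ("house", "Haus"): 0.8,
}


def viterbi_alignment(e_sentence, f_sentence, t):
    """Return, for each word in e_sentence, the index of its best f word.

    Under Model 1 each e word is aligned independently of the others,
    so the most likely alignment picks argmax over f of t(e|f) per word.
    Pairs missing from the table are treated as probability 0.
    """
    alignment = []
    for e in e_sentence:
        best_j = max(
            range(len(f_sentence)),
            key=lambda j: t.get((e, f_sentence[j]), 0.0),
        )
        alignment.append(best_j)
    return alignment


print(viterbi_alignment(["the", "house"], ["das", "Haus"], t))
```

A nested dictionary (f mapping to a dictionary of e) would work equally well and makes the per-f normalisation in EM slightly more convenient; the flat layout is used here only because it is shorter.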

Pseudo-code of IBM Model 1 as presented in the lecture:

 initialize t(e|f) uniformly
 do until convergence
   set count(e|f) to 0 for all e,f
   set total(f) to 0 for all f
   for all sentence pairs (e_s,f_s)
     set total_s(e) = 0 for all e
     for all words e in e_s
       for all words f in f_s
         total_s(e) += t(e|f)
     for all words e in e_s
       for all words f in f_s
         count(e|f) += t(e|f) / total_s(e)
         total(f)   += t(e|f) / total_s(e)
   for all f
     for all e
       t(e|f) = count(e|f) / total(f)
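
The pseudo-code above could be turned into Python roughly as follows. This is only a sketch under a few assumptions: a fixed number of iterations stands in for a real convergence test, the corpus is a list of already tokenised sentence pairs, and the function name `train_model1` and the three-sentence toy corpus are illustrative, not part of the lab material.

```python
from collections import defaultdict


def train_model1(corpus, iterations=10):
    """Train t(e|f) with EM. corpus is a list of (e_tokens, f_tokens) pairs."""
    # Uniform initialisation, stored only for co-occurring word pairs.
    e_vocab = {e for e_s, _ in corpus for e in e_s}
    uniform = 1.0 / len(e_vocab)
    t = {}
    for e_s, f_s in corpus:
        for e in e_s:
            for f in f_s:
                t[(e, f)] = uniform

    # Assumption: a fixed iteration count replaces "do until convergence".
    for _ in range(iterations):
        count = defaultdict(float)  # count(e|f)
        total = defaultdict(float)  # total(f)
        for e_s, f_s in corpus:
            # Per-sentence normalisation term total_s(e).
            total_s = defaultdict(float)
            for e in e_s:
                for f in f_s:
                    total_s[e] += t[(e, f)]
            # Collect fractional counts.
            for e in e_s:
                for f in f_s:
                    c = t[(e, f)] / total_s[e]
                    count[(e, f)] += c
                    total[f] += c
        # Re-estimate the probabilities.
        for (e, f), c in count.items():
            t[(e, f)] = c / total[f]
    return t


# Hypothetical toy corpus, used only to exercise the function.
toy_corpus = [
    (["the", "house"], ["das", "Haus"]),
    (["the", "book"], ["das", "Buch"]),
    (["a", "book"], ["ein", "Buch"]),
]
table = train_model1(toy_corpus)
for (e, f), p in sorted(table.items()):
    print(f"t({e}|{f}) = {p:.4f}")
```

On real data it is probably worth replacing the fixed iteration count with a check on how much the table (or the corpus likelihood) changes between iterations.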

Day 3: Evaluation

See the slides by Maja Popovic.

Day 4: Hierarchical Models

 cp config.toy config.hierarchical

Change the following in config.hierarchical:

  1. decoder = $moses-src-dir/bin/moses_chart
  2. ttable-binarizer = "$moses-src-dir/bin/CreateOnDiskPt 1 1 5 100 2"
  3. Delete or comment out the line
         lexicalized-reordering = msd-bidirectional-fe
  4. Uncomment the line
         hierarchical-rule-set = true
  5. weight-config = $working-dir/weight_hiero.ini (note: this setting has changed)
  6. decoder-settings = "-search-algorithm 3 -cube-pruning-pop-limit 5000 -s 5000" (note: -search-algorithm is 3 instead of 1)

Copy the example weight file into your experiment directory:

 cp /usr/local/smt_software/moses/moses-r4163/scripts/ems/example/data/weight.ini weight_hiero.ini

Edit weight_hiero.ini:

  • remove the whole distortion block (including weights)
  • add one weight (value 0.2) to the "# translation model weights" block

You are now able to run the experiment:

 /usr/local/smt_software/moses/moses-r4163/scripts/ems/experiment.perl \
   -config config.hierarchical -exec

Page last modified on September 12, 2011, at 07:08 AM