WMT is now a conference, but the shared task remains.
New languages are Turkish (hard!) and Romanian (easy). We also did not build Finnish systems last year, and threw our systems together with Edinburgh's in a joint submission.
Currently, some baseline systems [25]-[36] are being built. The web interface to experiments is accessible on CLSP machines under
http://ndc06/gems/
If that does not work, you have to build a tunnel.
ssh -fNL 11111:ndc06:80 [username]@login.clsp.jhu.edu
and then access the web site with
http://localhost:11111/gems/
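The relationship between the tunnel command and the local URL can be sketched in a few lines of Python. This is only an illustration: the helper names tunnel_command and local_url and the user name jdoe are hypothetical, while the host names and ports are the ones given above.

```python
def tunnel_command(username, local_port=11111, remote_host="ndc06",
                   remote_port=80, gateway="login.clsp.jhu.edu"):
    """Build the ssh invocation that forwards local_port to remote_host:remote_port.

    -f puts ssh in the background, -N runs no remote command,
    -L sets up the local port forward.
    """
    forward = f"{local_port}:{remote_host}:{remote_port}"
    return ["ssh", "-fNL", forward, f"{username}@{gateway}"]


def local_url(local_port=11111, path="/gems/"):
    """URL to open in the browser once the tunnel is up."""
    return f"http://localhost:{local_port}{path}"


# "jdoe" is a placeholder user name, not a real account.
print(" ".join(tunnel_command("jdoe")))
print(local_url())
```

The local port in the forward specification and the port in the URL must be the same; if 11111 is already taken on your machine, change both. The argument list can be passed to subprocess.run to open the tunnel from a script.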
I spent a lot of time last year getting the training pipeline to run properly on the CLSP cluster.
The recommended usage is to call experiment.perl with the switch -cluster, so the scheduler submits jobs to the cluster.
You will see a number of specifications in the config files, such as:
[INTERPOLATED-LM] interpolate:qsub-settings = "-l 'arch=*64,mem_free=100G,ram_free=100G'"
These ensure that enough memory (and cores) is reserved for specific jobs.
The most challenging aspect is running the decoder on the massive models that are built for some of the language pairs. The recommended way to deal with that is to first copy the models to local disk and then start the decoder. This is done automatically with the cache-model specification in the config files.
To avoid copying too many instances of the model files onto too many machines, the machines used for decoding and tuning are restricted. For instance, run tuning only on the b0* machines:
tune:qsub-settings = "-l 'hostname=b0*,arch=*64,mem_free=50G,ram_free=50G' -pe smp 16"
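To make qsub-settings strings like the ones above easier to read, here is a minimal Python sketch that splits such a string into its resource requests (-l) and its parallel-environment request (-pe). It is only an illustration: the function name parse_qsub_settings and the host names b05 and a12 are hypothetical, and treating hostname=b0* as a shell-style wildcard is an assumption about how the scheduler matches the pattern.

```python
import shlex
from fnmatch import fnmatchcase


def parse_qsub_settings(settings):
    """Split a qsub-settings string into -l resources and the -pe request."""
    tokens = shlex.split(settings)  # honours the single quotes around the -l list
    parsed = {"resources": {}, "parallel_env": None, "slots": None}
    i = 0
    while i < len(tokens):
        if tokens[i] == "-l" and i + 1 < len(tokens):
            # Comma-separated key=value resource requests.
            for item in tokens[i + 1].split(","):
                key, _, value = item.partition("=")
                parsed["resources"][key] = value
            i += 2
        elif tokens[i] == "-pe" and i + 2 < len(tokens):
            # Parallel environment name followed by the number of slots.
            parsed["parallel_env"] = tokens[i + 1]
            parsed["slots"] = int(tokens[i + 2])
            i += 3
        else:
            i += 1
    return parsed


tune = parse_qsub_settings(
    "-l 'hostname=b0*,arch=*64,mem_free=50G,ram_free=50G' -pe smp 16"
)
print(tune["resources"])
print(tune["parallel_env"], tune["slots"])

# Hypothetical host names, checked against the hostname pattern.
for host in ["b05", "a12"]:
    print(host, fnmatchcase(host, tune["resources"]["hostname"]))
```

Read this way, the tuning line asks for a 64-bit b0* machine with 50G of free memory and 16 slots in the smp parallel environment.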
There are a bunch of things we could try:
The first baseline uses all official data. The second baseline adds Brown clustering (which gave an average gain of 0.5 BLEU last time around). This mostly matches the WMT 2015 system. A write-up of that system is here.
Language Pair | JHU in 2015 | Best in 2015 | Baseline | w/ Brown clusters | w/ CC LM | w/ both | ttl 100 | w/ nnjm | w/ all | Syntax |
English-Turkish | - | - | [36-1] 7.84 (1.049) | [36-3] 8.18 (1.040) +.34 | [36-2] 9.40 (1.044) +1.56 | [36-4] 8.85 (1.040) +1.01 | [36-5] 8.86 (1.041) +1.02 | - | - | - |
Turkish-English | - | - | [35-1] 14.03 (0.994) | [35-3] 14.30 (0.988) +.27 | [35-2] 13.91 (1.011) -.12 | [35-4] 14.12 (1.010) +.09 | [35-5] 14.19 (1.015) +.16 | - | - | [51-1] 15.47 (0.921) +1.44 |
English-Finnish | - | 15.5 (Abumatran) | [34-1] 11.88 (1.053) | [24-3] 12.59 (1.055) +.71 | [34-2] 12.15 (1.074) +.27 | [34-4] 12.85 (1.059) +.97 | [35-5] 12.82 (1.061) +.94 | - | - | - |
Finnish-English | - | 19.7 (UEDIN) | [33-1] 16.55 (0.985) | [34-3] 16.90 (0.981) +.35 | [33-2] 16.41 (0.990) -.14 | [33-4] 16.93 (0.998) +.38 | [33-5] 16.82 (1.004) +.27 | - | - | |
English-Romanian | - | - | [32-6] 23.36 (1.007) | [32-4] 24.60 (1.006) +1.24 | [32-3] 23.29 (1.039) -.07 | [32-5] 23.49 (0.967) +.13 | [32-7] 23.55 (0.970) +.19 | [46-6] 23.73 (1.010) +.37 | [32-8] 23.49 (0.962) +.13 | |
Romanian-English | - | - | [31-2] 31.95 (1.014) | [31-4] 32.53 (1.020) +.58 | [31-3] 32.47 (1.018) +.52 | [31-5] 32.80 (1.015) +.85 | [31-5] 32.80 (1.016) +.85 | [50-2] 32.03 (1.015) +.08 | [31-7] 32.80 (1.019) +.85 | [50-1] 27.04 (0.934) -4.91 |
English-Russian | [11-6] 24.53 (1.034) | 24.3 (UEDIN) | [30-1] 23.89 (1.037) | [30-3] 24.96 (1.033) +1.07 | [30-2] 23.89 (1.055) +.00 | [30-4] 24.87 (1.050) +.98 | [30-6] 25.12 (1.055) +1.23 | [53-1] 24.37 (1.038) +.48 | [30-7] 25.16 (1.048) +1.27 | - |
Russian-English | [10-6] 27.96 (0.973) | 27.9 (JHU) | [29-1] 27.54 (0.978) | [29-3] 28.25 (0.974) +.71 | [29-2] 28.08 (0.981) +.54 | [29-4] 28.22 (0.979) +.68 | [29-5] 28.28 (0.981) +.74 | [54-1] 27.81 (0.979) +.27 | [29-6] 28.65 (0.989) +1.11 | |
English-Czech | [7-5] 18.11 (1.044) | 18.8 (CharlesU) | [28-1] 18.24 (1.044) | [28-3] 19.19 (1.044) +.95 | [28-2] 18.77 (1.048) +.53 | [28-4] 19.55 (1.046) +1.31 | [28-5] 19.53 (1.048) +1.29 | - | - | - |
Czech-English | [6-5] 26.38 (0.985) | 26.2 (JHU) | [27-1] 27.04 (0.985) | [27-3] 27.68 (0.987) +.64 | [27-2] 27.68 (0.994) +.64 | [27-5] 28.08 (0.993) +1.04 | [27-6] 28.18 (0.994) +1.14 | - | - | - |
English-German | [5-5] 22.70 (1.039) | 24.9 (Montreal) | [26-2] 22.67 (1.035) | [26-4] 22.99 (1.035) +.32 | [26-3] 22.51 (1.056) -.16 | [26-5] 22.73 (1.055) +.06 | [26-6] 22.70 (1.057) +.03 | [47-3] 22.62 (1.036) -.05 | [26-7] 22.88 (1.057) +.21 | |
German-English | [4-5] 29.15 (0.985) | 29.3 (UEDIN) | [25-1] 29.03 (0.983) | [25-3] 29.64 (0.986) +.61 | [25-2] 29.63 (0.996) +.60 | [25-5] 29.90 (0.993) +.87 | [25-6] 29.96 (0.994) +.93 | [48-3] | [25-8] 30.01 (0.998) +.98 | |
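The cells in the table above follow a fixed pattern, and the signed number at the end is the BLEU difference to the Baseline column of the same row. The Python sketch below parses a cell and rechecks that difference. It rests on assumptions not stated on this page: the bracketed ID is read as setup and run number, and the value in parentheses is read as the length ratio. The names parse_cell and delta_is_consistent are hypothetical.

```python
import re
from decimal import Decimal

# "[36-3] 8.18 (1.040) +.34": ID, BLEU, ratio in parentheses, optional signed delta.
CELL = re.compile(
    r"\[(\d+)-(\d+)\]\s+(\d+\.\d+)\s+\((\d+\.\d+)\)(?:\s+([+-]\d*\.\d+))?"
)


def parse_cell(cell):
    """Return the fields of one table cell, or None for empty or '-' cells."""
    match = CELL.fullmatch(cell.strip())
    if match is None:
        return None
    setup, run, bleu, ratio, delta = match.groups()
    return {
        "setup": int(setup),
        "run": int(run),
        "bleu": Decimal(bleu),
        "ratio": Decimal(ratio),  # assumed to be the length ratio
        "delta": Decimal(delta) if delta is not None else None,
    }


def delta_is_consistent(baseline_cell, system_cell):
    """Check that the reported delta equals system BLEU minus baseline BLEU."""
    base, system = parse_cell(baseline_cell), parse_cell(system_cell)
    if base is None or system is None or system["delta"] is None:
        raise ValueError("both cells need a score, and the system cell a delta")
    return system["bleu"] - base["bleu"] == system["delta"]


# English-Turkish row: Baseline vs. Brown clusters, 8.18 - 7.84 = 0.34.
print(delta_is_consistent("[36-1] 7.84 (1.049)", "[36-3] 8.18 (1.040) +.34"))
```

Decimal is used instead of float so that two-digit scores subtract exactly and can be compared with ==.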
Language Pair | Submitted | UEDIN Phrase | Best |
English-Turkish | |||
Turkish-English | |||
English-Finnish | |||
Finnish-English | |||
English-Romanian | |||
Romanian-English | |||
English-Russian | |||
Russian-English | |||
English-Czech | |||
Czech-English | |||
English-German | |||
German-English |