experiment.perl, or Experiment Management System (EMS), for lack of a better name, is our experiment management and reporting system for machine translation experiments with Moses.
In order to run properly, EMS will require:
Experiment.perl is extremely simple to use:

1. Get experiment.perl (see SVN information below).
2. Create a new working directory for your experiment (a simple mkdir does it).
3. Set working-dir in the config file.
4. Run experiment.perl -config CONFIG from your experiment working directory. This displays a graph of the planned steps without running anything yet.
5. If the graph looks right, start the experiment with experiment.perl -config CONFIG -exec.
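A first session might then look like the following sketch; the directory name, the sample config file, and the editor are illustrative, not prescribed by EMS:

  > mkdir ~/my-experiment
  > cd ~/my-experiment
  > cp /path/to/ems/config/<some-sample-config> config   # sample configs live in ems/config
  > nano config                                          # set working-dir, corpus locations, ...
  > experiment.perl -config config                       # dry run: displays the step graph
  > experiment.perl -config config -exec                 # actually runs the steps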
If you survived this process, you may want to familiarize yourself with the parameters in the configuration file. If some of them are unclear, try to investigate and write a more informative description of the parameter into the config file.
Other options:

--no-graph: suppresses the display of the graph.
--continue RUN: continues the experiment RUN, which crashed earlier. Make sure that the crashed step and its output are deleted (this should be done automatically at some point, see TODO below). Also, make sure to specify the right config file (i.e., the one in the current run directory) when using --continue.
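For example, to re-start a crashed run without the graph display (the run number 13 and its stored config file are illustrative; the config for each run is kept in the steps directory, see below):

  > experiment.perl -config steps/config.13 -continue 13 -exec --no-graph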
The latest experiment.perl is available from the moses subversion repository:
mosesdecoder/
  trunk/
    scripts/
      ems/                the experiment management system
        experiment.perl   your entry point into the management system
        experiment.meta   template file defining how an experiment is run
        config/           sample self-documented config files go here
        support/          experiment.perl sub-scripts
        web/              the web interface for EMS
A useful command for seeing what has changed in experiment.perl since you last downloaded it is svn diff:

  > cd ems
  > svn diff --revision HEAD experiment.perl
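To actually update your copy to the latest version, a plain svn update does the job:

  > cd ems
  > svn update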
Right now, Hieu, Philipp and Josh are set to receive e-mail notification whenever a commit is made to the repository. If you want to be added, let us know. If there is enough interest, we will set up a mailing list for notifications.
Experiment.perl is an experiment management tool. You define an experiment in a configuration file, and experiment.perl figures out which steps need to be run and schedules them either as jobs on a cluster or runs them serially on a single machine.
An experimental run is broken up into several steps. Here is a typical example:
In this graph, each step is a small box. For each step, experiment.perl builds a script file that is either submitted to the cluster or run on the same machine. Note that some steps are quite involved, for instance tuning: on a cluster, the tuning script runs on the head node and submits jobs to the queue itself.
The main stages of running an experiment are:
The actual steps, their dependencies, and other salient information are to be found in the file experiment.meta. Think of experiment.meta as a "template" file.

Here are parts of the step descriptions for CORPUS:get-corpus and CORPUS:tokenize:
get-corpus
        in: get-corpus-script
        out: raw-stem
        [...]

tokenize
        in: raw-stem
        out: tokenized-stem
        [...]
Each step takes some input (in) and provides some output (out). This also establishes the dependencies between the steps. The step tokenize requires the input raw-stem, which is provided by the step get-corpus.
experiment.meta provides a generic template for steps and their interaction. For an actual experiment, a configuration file determines which steps need to be run. This configuration file is the one that is specified when invoking experiment.perl. It may contain, for instance, the following:
[CORPUS:europarl]

### raw corpus files (untokenized, but sentence aligned)
#
raw-stem = $europarl-v3/training/europarl-v3.fr-en
Here, the parallel corpus to be used is named europarl, and it is provided in raw text format in the location $europarl-v3/training/europarl-v3.fr-en (the variable $europarl-v3 is defined elsewhere in the config file). The effect of this specification is that the step get-corpus does not need to be run, since its output is given as a file. More on the configuration file in the next section.
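Conversely, if the raw corpus does not exist as a file yet, the config could instead supply the get-corpus-script input expected by the get-corpus step, which would then be run to produce raw-stem. The script path below is purely hypothetical, for illustration only:

[CORPUS:europarl]

### command that writes the raw corpus to its output (hypothetical)
#
get-corpus-script = $data-dir/fetch-europarl.sh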
Several types of information are specified in experiment.meta:

in and out: Establish the dependencies between steps; input may also be provided by files specified in the configuration.
default-name: Name of the file in which the output of the step will be stored.
template: Template for the command that is placed in the execution script for the step.
template-if: Potential command for the execution script. Only used if the first parameter exists.
error: experiment.perl detects if a step failed by scanning STDERR for key words such as killed, error, died, not found, and so on. Additional key words and phrases are provided with this parameter.
not-error: Declares default error key words as not indicating failures.
pass-unless: This step is executed only if the given parameter is defined; otherwise the step is passed (illustrated by a yellow box in the graph).
ignore-unless: If the given parameter is defined, this step is not executed. This overrides requirements of downstream steps.
rerun-on-change: If similar experiments are run, the output of steps may be reused if the input and parameter settings are the same. This specifies a number of parameters whose change disallows such reuse across runs.
parallelizable: When running on the cluster, this step may be parallelized (only if generic-parallelizer is set in the config file, typically to $edinburgh-script-dir/generic-parallelizer.perl).
qsub-script: If running on a cluster, this step is run on the head node and not submitted to the queue (because it submits jobs itself).
Here now is the full definition of the step CORPUS:tokenize:
tokenize
        in: raw-stem
        out: tokenized-stem
        default-name: corpus/tok
        pass-unless: input-tokenizer output-tokenizer
        template-if: input-tokenizer IN.$input-extension OUT.$input-extension
        template-if: output-tokenizer IN.$output-extension OUT.$output-extension
        parallelizable: yes
The step takes raw-stem and produces tokenized-stem. It is parallelizable with the generic parallelizer. The output is stored in the file corpus/tok. Note that the actual file name also contains the corpus name and the run number. Also, in this case, the parallel corpus is stored in two files, so the file names may be something like corpus/europarl.tok.1.fr and corpus/europarl.tok.1.en.
The step is only executed if either input-tokenizer or output-tokenizer is specified. The templates indicate what the command lines in the execution script for the step look like.
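As a rough illustration of how such a template is instantiated, assume input-tokenizer is set to the Moses tokenizer script and input-extension to fr. experiment.perl substitutes IN and OUT with the run-specific file names, so the execution script would contain a line along these lines (the tokenizer command and the exact form of the expansion are assumptions; file names follow the naming scheme above):

  # template-if: input-tokenizer IN.$input-extension OUT.$input-extension
  # might expand to something like:
  $moses-script-dir/tokenizer/tokenizer.perl corpus/europarl.1.fr corpus/europarl.tok.1.fr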
You need training data and a dev set. You can simply download them from the workshop shared task.
Typically, when setting up an experiment, you will take an existing configuration file and modify it. The config files are self-documenting and describe each possible parameter. Obviously, explaining each parameter would require explaining the entire training, tuning, testing, and evaluation process of the Moses statistical machine translation system, which goes beyond this short manual.
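As a sketch, the first thing to adapt is usually the general section at the top of the config file; the values below are illustrative:

[GENERAL]

### directory in which experiment is run
#
working-dir = /home/you/experiment

### extensions of the input and output side of the parallel corpus
#
input-extension = fr
output-extension = en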
For each step, experiment.perl creates an execution script and a set of bookkeeping files in the steps directory, for example:

        CORPUS_europarl_tokenize.1
        CORPUS_europarl_tokenize.1.DONE
        CORPUS_europarl_tokenize.1.INFO
        CORPUS_europarl_tokenize.1.STDERR
        CORPUS_europarl_tokenize.1.STDOUT
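A quick way to check on a step is to inspect these files directly, assuming the flat steps directory layout used in the examples on this page:

  > ls steps/CORPUS_europarl_tokenize.1.DONE      # exists once the step has finished
  > tail steps/CORPUS_europarl_tokenize.1.STDERR  # error output of the step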
Along with EMS comes a web interface that allows you to follow running and finished experiments. Put or link the web directory provided with EMS on a web server (LAMPP on Linux or MAMP on Mac does the trick). Make sure the web server user has the right write permissions on the web interface directory.
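On a typical Linux machine this can be as simple as the following; the paths and the web server user are illustrative:

  > ln -s /path/to/mosesdecoder/trunk/scripts/ems/web /var/www/ems
  > chown -R www-data /var/www/ems    # give the web server user write permission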
To add your experiments to this interface, add a line to the file
To add a description to each run, edit the file
When a step crashes, proceed as follows:

1. Find the error in the file with the .STDERR extension. (The exact format is NAMEOFPROCESS_step.numberofexperiment.STDERR, e.g. CORPUS_factorize.13.STDERR.) Display this file, find the error, and try to correct it. If there is no error in it, look at the previous steps; the error may have occurred earlier but not been detected.
2. Delete the .STDERR file and everything that has been produced by the crashed step. To find out what the crashed step has produced, you may need to look up where the output of this step is placed, by consulting experiment.meta.
3. Re-run the step by hand with nice sh NAMEOFPROCESS_step.numberofexperiment (e.g. nice sh CORPUS_factorize.13). If it crashes again, go to step 2.
4. Continue the experiment with nice experiment.perl -config steps/config.13 -continue 13. Let the graph be displayed and verify that the experiment indeed goes on from there. (This is not always shown by the graph.)
5. Run nice experiment.perl -config steps/config.13 -continue 13 -exec.
Tuning is treated by experiment.perl as one step. However, this step executes the mert-moses script, which performs multiple runs; each of these runs is split into pieces and parallelized (for more details read here).
Look at steps/TUNING_tune.num.STDERR and try to find at what point the error occurred. There may be an obvious error that you need to correct.
If there is no error in the .STDERR file and you have been running on a cluster, the error may be in one of the split parallelized runs used for multiple decoding. Try to find the last bunch of lines that look like:

Executing: qsub -l mem_free=0.5G -hard -b no -j yes -o folder/tuning/tmp.28/out.job15397-aa -e folder/tuning/tmp.28/err.job15397-aa -N mert15-aa folder/tuning/tmp.28/job15397-aa.bash >& folder/tuning/tmp.28/job15397-aa.log

The output of each such job goes to the file specified by the -o parameter, and the file that holds the submission errors (e.g. by the GRID queuing system) is specified by the -e parameter.
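Continuing the example above, the two files to inspect would be:

  > less folder/tuning/tmp.28/out.job15397-aa    # output of the decoding job
  > less folder/tuning/tmp.28/err.job15397-aa    # submission errors from the queuing system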
Once you have found and corrected the problem, open steps/TUNING_tune.num in a text editor. Find the line which starts with executing mert-moses.pl. There are two parameters you can add to resume the broken tuning:

--continue
--continue --skip-decoder, which also skips the first decoder run and will save you some time. That's optional, though.
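The edited line might then look roughly as follows; the path and the original arguments are abbreviated for illustration and stay exactly as they were, with the two flags appended at the end:

  .../mert-moses.pl <input> <references> <decoder> <moses.ini> ... --continue --skip-decoder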
Then re-run tuning with nice sh steps/TUNING_tune.num. Once it has finished successfully, delete steps/TUNING_tune.num.STDERR and let experiment.perl go on: experiment.perl -continue -config steps/config.num
-clean: to delete intermediate files and crashed steps.
-delete: to delete all files of one experiment, unless other non-deleted steps depend on them.
Bug in experiments with factors: the experiment will re-use the factorized (corpus, tuning, and test) sets of an older experiment, even if the input-factors specification has changed.
Here are some random requests and other futuristic features: