experiment.perl, or Experiment Management System (EMS), for lack of a better name, is our experiment management and reporting system for machine translation experiments with Moses.
In order to run properly, EMS requires a number of supporting tools to be installed.
Experiment.perl is extremely simple to use:

1. Get experiment.perl (see SVN information below).
2. Set the working directory working-dir in the config file.
3. Run experiment.perl -config CONFIG from your experiment working directory. This displays a graph of the steps that will be run.
4. Execute the experiment with experiment.perl -config CONFIG -exec.
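As a concrete illustration, a first session might look like the transcript below. The directory layout, the sample config file name, and the edit step are made up for this sketch; use whatever sample config ships with your checkout.

```
> mkdir ~/experiment ; cd ~/experiment
> cp /path/to/ems/config/sample config     # any sample config from config/
> $EDITOR config                           # set working-dir and data paths
> experiment.perl -config config           # shows the graph of steps to run
> experiment.perl -config config -exec     # actually executes the steps
```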
If you survived this process, you may want to familiarize yourself with the parameters in the configuration file. If some of them are unclear, try to investigate and write a more informative description of the parameter into the config.
--no-graph suppresses the display of the graph.

--continue RUN continues the experiment RUN, which crashed earlier. Make sure that the crashed step and its output are deleted (this should be done automatically at some point, see TODO below). Also, make sure to specify the right config file (i.e., the one in the current run directory) when using --continue.
experiment.perl is available from the Moses Subversion repository.
ems/              the experiment management system
experiment.perl   your entry point into the management system
experiment.meta   template file defining how an experiment is run
config/           sample self-documented config files go here
web/              the web interface for EMS
A useful command for seeing what has changed in
experiment.perl since you last downloaded it is
> cd ems
> svn diff --revision HEAD experiment.perl
Right now, Hieu, Philipp, and Josh are set to receive e-mail notification whenever a commit is made to the repository. If you want to be added, let us know. If there is enough interest, we will set up a mailing list for notifications.
Experiment.perl is an experiment management tool. You define an experiment in a configuration file, and experiment.perl figures out which steps need to be run and schedules them, either as jobs on a cluster or serially on a single machine.
An experimental run is broken up into several steps. Here is a typical example:
In this graph, each step is a small box. For each step, Experiment.perl builds a script file that is either submitted to the cluster or run on the same machine. Note that some steps are quite involved, for instance tuning: on a cluster, the tuning script runs on the head node and submits jobs to the queue itself.
The main stages of running an experiment are:
The actual steps, their dependencies, and other salient information are to be found in the file experiment.meta. Think of experiment.meta as a "template" file.
Here are parts of the step descriptions for get-corpus and tokenize:

get-corpus
	in: get-corpus-script
	out: raw-stem
	[...]
tokenize
	in: raw-stem
	out: tokenized-stem
	[...]
Each step takes some input (in) and provides some output (out). This also establishes the dependencies between the steps. The step tokenize requires the input raw-stem, which is provided by the step get-corpus.
experiment.meta provides a generic template for steps and their interaction. For an actual experiment, a configuration file determines which steps need to be run. This configuration file is the one that is specified when invoking experiment.perl. It may contain, for instance, the following:
[CORPUS:europarl]

### raw corpus files (untokenized, but sentence aligned)
#
raw-stem = $europarl-v3/training/europarl-v3.fr-en
Here, the parallel corpus to be used is named europarl, and it is provided in raw text format at the location $europarl-v3/training/europarl-v3.fr-en (the variable $europarl-v3 is defined elsewhere in the config file). The effect of this specification in the config file is that the step get-corpus does not need to be run, since its output is given as a file. More on the configuration file in the next section.
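For illustration, such a variable definition might look like the fragment below (the section placement and the path are made up for this sketch; adapt them to your own config):

```
[GENERAL]
# location of the downloaded corpus (hypothetical path)
europarl-v3 = /data/europarl-v3

[CORPUS:europarl]
raw-stem = $europarl-v3/training/europarl-v3.fr-en
```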
Several types of information are specified in experiment.meta:

in and out: Establish dependencies between steps; input may also be provided by files specified in the configuration.
default-name: Name of the file in which the output of the step will be stored.
template: Template for the command that is placed in the execution script for the step.
template-if: Potential command for the execution script; only used if the first parameter exists.
error: experiment.perl detects if a step failed by scanning STDERR for key words such as killed, error, died, not found, and so on. Additional key words and phrases are provided with this parameter.
not-error: Declares default error key words as not indicating failures.
pass-unless: This step is only executed if the given parameter is defined; otherwise, the step is passed (illustrated by a yellow box in the graph).
ignore-unless: If the given parameter is not defined, this step is not executed. This overrides requirements of downstream steps.
rerun-on-change: If similar experiments are run, the output of steps may be re-used if the input and parameter settings are the same. This specifies a number of parameters whose change disallows re-use from a different run.
parallelizable: When running on the cluster, this step may be parallelized (only if generic-parallelizer is set in the config file).
qsub-script: If running on a cluster, this step is run on the head node, and not submitted to the queue (because it submits jobs itself).
Here now is the full definition of the step tokenize:

tokenize
	in: raw-stem
	out: tokenized-stem
	default-name: corpus/tok
	pass-unless: input-tokenizer output-tokenizer
	template-if: input-tokenizer IN.$input-extension OUT.$input-extension
	template-if: output-tokenizer IN.$output-extension OUT.$output-extension
	parallelizable: yes
The step takes raw-stem and produces tokenized-stem. It is parallelizable with the generic parallelizer. The output is stored in the file corpus/tok. Note that the actual file name also contains the corpus name and the run number. Also, in this case, the parallel corpus is stored in two files, so there is one output file for each language side. The step is only executed if either input-tokenizer or output-tokenizer is specified. The templates indicate what the command lines in the execution script for the step look like.
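The template mechanism itself is easy to picture: the IN and OUT placeholders in a step's template are replaced with the actual file names of the current run. The following is a stand-alone sketch of that substitution, not EMS code; the file names are invented.

```shell
# Stand-alone illustration of template instantiation (not EMS code).
# A step template with IN/OUT placeholders ...
template='$input-tokenizer < IN > OUT'
in_file='corpus/europarl.1.fr'    # invented file names
out_file='corpus/tok.1.fr'
# ... becomes a concrete command line by substituting the placeholders.
cmd=$(printf '%s\n' "$template" | sed "s|IN|$in_file|; s|OUT|$out_file|")
echo "$cmd"
```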
You need the training data and a development set. You can simply download these from one of the workshop shared tasks.
Typically, when setting up an experiment, you will take an existing configuration file and modify it. The config files are self-documenting and describe each possible parameter. Obviously, explaining each parameter would require explaining the entire training, tuning, testing, and evaluation process of the Moses statistical machine translation system, which goes beyond this short manual.
Along with EMS comes a web interface that allows you to follow running and archived experiments. Put or link the web directory provided with EMS on a web server (LAMPP on Linux or MAMP on Mac does the trick). Make sure the web server user has the right write permissions on the web interface directory.
To add your experiments to this interface, add a line to the file
To add a description to each run, edit the file
1. Look at the error log of the crashed step, the file with the .STDERR extension (the exact format is, e.g., CORPUS_factorize.13.STDERR). Display this file, find the error, and try to correct it. If there is no error, look at the previous steps; the error may have occurred earlier but not been detected.
2. Delete the .STDERR file and everything that has been produced by the crashed step. To find out what has been produced by the crashed step, you may need to consult where the output of this step is placed, by looking at experiment.meta.
3. Re-run the step by hand with nice sh NAMEOFPROCESS_step.numberofexperiment (e.g., nice sh CORPUS_factorize.13). If it crashes again, go to step 2.
4. Continue the experiment with nice experiment.perl -config steps/config.13 -continue 13. Let the graph be displayed and verify that it indeed goes on from there. (This is not always shown by the graph.)
5. Execute with nice experiment.perl -config steps/config.13 -continue 13 -exec.
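Using the example step and run number from the text, the whole recovery cycle might look like this transcript (illustrative only, not literal output):

```
> less steps/CORPUS_factorize.13.STDERR     # inspect the error log
> rm steps/CORPUS_factorize.13.STDERR       # delete output of the crashed step
> nice sh CORPUS_factorize.13               # re-run the step by hand
> nice experiment.perl -config steps/config.13 -continue 13        # check graph
> nice experiment.perl -config steps/config.13 -continue 13 -exec  # resume
```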
Tuning is treated by experiment.perl as one step. However, this step executes the mert-moses.pl script, which performs multiple runs; each of these runs is split into pieces and parallelized (for more details, read here).
1. Look at steps/TUNING_tune.num.STDERR and try to find at what point the error occurred. There may be an obvious error that you need to correct.
2. If there is no error in the .STDERR file and you have been running on a cluster, the error may lie in one of the split parallelized runs used for multiple decoding. Try to find the last bunch of lines that look like:

Executing: qsub -l mem_free=0.5G -hard -b no -j yes -o folder/tuning/tmp.28/out.job15397-aa -e folder/tuning/tmp.28/err.job15397-aa -N mert15-aa folder/tuning/tmp.28/job15397-aa.bash >& folder/tuning/tmp.28/job15397-aa.log

The output of each such run is stored in the file given by the -o parameter, and the file that holds the submission errors (e.g., by the GRID queuing system) is specified by the -e parameter.
3. Open steps/TUNING_tune.num in a text editor. Find the line which starts with "Executing" and calls mert-moses.pl. There are two parameters you can add to resume the broken tuning: --continue, and --skip-decoder, which will save you some time (the latter is optional, though).
4. Re-run the tuning step by hand with nice sh steps/TUNING_tune.num.
5. When tuning finishes, delete steps/TUNING_tune.num.STDERR and let experiment.perl go on with experiment.perl -continue -config steps/config.num.
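Put together, a broken tuning run might be resumed as in the following transcript (the run number and file names are illustrative):

```
> less steps/TUNING_tune.13.STDERR     # find where tuning failed
> $EDITOR steps/TUNING_tune.13         # add --continue (and optionally
                                       # --skip-decoder) to mert-moses.pl
> nice sh steps/TUNING_tune.13         # re-run the tuning step by hand
> rm steps/TUNING_tune.13.STDERR
> experiment.perl -continue -config steps/config.13
```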
-clean to delete intermediate files and crashed steps

-delete to delete all files of one experiment, unless other non-deleted steps depend on them
Bug in experiments with factors: the experiment will re-use factorized sets (corpus, tuning, and test) from an older experiment, even if the input-factors specification has changed.
Here are some random requests and other futuristic features: