This section will show you how to install and build Moses, and how to use Moses to translate with some simple models. If you experience problems, then please check the support page. If you do not want to build Moses from source, then there are packages available for Windows and popular Linux distributions.
To compile with the bare minimum of features:
./bjam -j4
If you have compiled boost manually, then tell bjam where it is:
./bjam --with-boost=~/workspace/temp/boost_1_64_0 -j8
If you have compiled the cmph library manually:
./bjam --with-cmph=/Users/hieu/workspace/cmph-2.0
If you have compiled the xmlrpc-c library manually:
./bjam --with-xmlrpc-c=/Users/hieu/workspace/xmlrpc-c/xmlrpc-c-1.33.17
If you have compiled the IRSTLM library manually:
./bjam --with-irstlm=/Users/hieu/workspace/irstlm/irstlm-5.80.08/trunk
This is the exact command I (Hieu) used on Linux:
./bjam --with-boost=/home/s0565741/workspace/boost/boost_1_57_0 --with-cmph=/home/s0565741/workspace/cmph-2.0 --with-irstlm=/home/s0565741/workspace/irstlm-code --with-xmlrpc-c=/home/s0565741/workspace/xmlrpc-c/xmlrpc-c-1.33.17 -j12
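The flags above can be assembled by a small helper so that only dependencies you have actually built are passed to bjam. This is a minimal sketch: the variable names BOOST_DIR, CMPH_DIR, XMLRPC_DIR and IRSTLM_DIR are made up for this example and are not read by bjam itself, and paths containing spaces are not handled.

```shell
# Print a --with-* flag for every dependency directory that is set and exists.
# BOOST_DIR, CMPH_DIR, XMLRPC_DIR and IRSTLM_DIR are hypothetical variable
# names chosen for this sketch.
moses_bjam_flags() {
    flags=""
    if [ -n "${BOOST_DIR:-}" ] && [ -d "$BOOST_DIR" ]; then
        flags="$flags --with-boost=$BOOST_DIR"
    fi
    if [ -n "${CMPH_DIR:-}" ] && [ -d "$CMPH_DIR" ]; then
        flags="$flags --with-cmph=$CMPH_DIR"
    fi
    if [ -n "${XMLRPC_DIR:-}" ] && [ -d "$XMLRPC_DIR" ]; then
        flags="$flags --with-xmlrpc-c=$XMLRPC_DIR"
    fi
    if [ -n "${IRSTLM_DIR:-}" ] && [ -d "$IRSTLM_DIR" ]; then
        flags="$flags --with-irstlm=$IRSTLM_DIR"
    fi
    # Strip the leading space left by the first append.
    echo "${flags# }"
}
```

With the variables exported, the build could then be started with something like ./bjam $(moses_bjam_flags) -j4 from the Moses root directory.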
Boost 1.48 has a serious bug which breaks Moses compilation. Unfortunately, some Linux distributions (e.g. Ubuntu 12.04) ship broken versions of the Boost library. In these cases, you must download and compile Boost yourself.
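To find out which Boost version a system provides, you can read the version string from the Boost header boost/version.hpp. The sketch below assumes the header lives at /usr/include/boost/version.hpp, which varies between systems; the function names are made up for this example.

```shell
# Print the Boost library version recorded in a version.hpp header, e.g. "1_64".
# The default header path is an assumption; pass another path as the first argument.
boost_lib_version() {
    header="${1:-/usr/include/boost/version.hpp}"
    if [ ! -r "$header" ]; then
        echo "cannot read $header" >&2
        return 1
    fi
    sed -n 's/^#define BOOST_LIB_VERSION "\(.*\)"/\1/p' "$header"
}

# Succeed only for the version this page names as broken (Boost 1.48).
is_known_broken_boost() {
    [ "$1" = "1_48" ]
}
```

For example, is_known_broken_boost "$(boost_lib_version)" && echo "compile Boost yourself" prints the warning only when the system header reports 1_48.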
These are the exact commands I (Hieu) use to compile Boost:
wget https://dl.bintray.com/boostorg/release/1.64.0/source/boost_1_64_0.tar.gz
tar zxvf boost_1_64_0.tar.gz
cd boost_1_64_0/
./bootstrap.sh
./b2 -j4 --prefix=$PWD --libdir=$PWD/lib64 --layout=system link=static install || echo FAILURE
This creates the library files in the lib64 directory, NOT in the system directory, so you do not need to be a system administrator or root to run it. However, you will need to tell Moses where to find Boost, which is explained below.
Once Boost is installed, you can compile Moses. However, you must tell Moses where Boost is with the --with-boost flag. This is the exact command I use to compile Moses:
./bjam --with-boost=~/workspace/temp/boost_1_64_0 -j4
Moses requires a word alignment tool, such as GIZA++, MGIZA, or Fast Align.
I (Hieu) use MGIZA because it is multi-threaded and gives generally good results; however, I've also heard good things about Fast Align. You can find instructions to compile them here.
Moses includes the KenLM language model creation program, lmplz.
You can also create language models with IRSTLM and SRILM. Please read this if you want to compile IRSTLM. Language model toolkits perform two main tasks: training and querying. You can train a language model with any of them, produce an ARPA file, and query with a different one. To train a model, just call the relevant script.
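As an illustration of the train-then-query split, the sketch below wraps the two KenLM programs that are built alongside Moses: lmplz, which reads a tokenized corpus on standard input and writes an ARPA file, and build_binary, which converts an ARPA file to KenLM's binary format. MOSES_BIN and the function names are assumptions made for this example, not part of Moses.

```shell
# Assumed location of the Moses binaries; override MOSES_BIN if yours differ.
MOSES_BIN="${MOSES_BIN:-$HOME/mosesdecoder/bin}"

# Train an n-gram model of the given order and write it as an ARPA file.
# Usage: train_arpa ORDER CORPUS ARPA_OUT
train_arpa() {
    "$MOSES_BIN/lmplz" -o "$1" < "$2" > "$3"
}

# Convert an ARPA file to KenLM's binary format, which loads faster.
# Usage: binarize_lm ARPA_IN BINARY_OUT
binarize_lm() {
    "$MOSES_BIN/build_binary" "$1" "$2"
}
```

A call such as train_arpa 3 corpus.tok.en lm.arpa followed by binarize_lm lm.arpa lm.blm would produce a trigram model; the file names here are placeholders. Because the ARPA file is a plain-text interchange format, it can also be loaded by a different toolkit for querying.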
If you want to use SRILM or IRSTLM to query the language model, then they need to be linked with Moses. For IRSTLM, you first need to compile IRSTLM then use the --with-irstlm switch to compile Moses with IRSTLM. This is the exact command I used:
./bjam --with-irstlm=/home/s0565741/workspace/temp/irstlm-5.80.03 -j4
Personally, I only use IRSTLM as a query tool in this way if the LM n-gram order is over 7. In most situations, I use KenLM because it is multi-threaded and faster.
The primary development platform for Moses is Linux, and this is the recommended platform since you will find it easier to get support for it. However, Moses does work on other platforms:
Install the following packages using the command
su
apt-get install [package name]
Packages:
git subversion make libtool gcc g++ libboost-dev tcl-dev tk-dev zlib1g-dev libbz2-dev python-dev libicu-dev (Debian) libunistring-dev (Debian)
Install the following packages using the command
sudo apt-get install [package name]
Packages:
g++ git subversion automake libtool zlib1g-dev libicu-dev libboost-all-dev libbz2-dev liblzma-dev python-dev graphviz imagemagick make cmake libgoogle-perftools-dev (for tcmalloc) autoconf doxygen
Install the following packages using the command
su
yum install [package name]
Packages:
git subversion make automake cmake libtool gcc-c++ zlib-devel python-devel bzip2-devel boost-devel ImageMagick cpan expat-devel
In addition, you have to install some perl packages:
cpan XML::Twig
cpan Sort::Naturally
Mac OSX is widely used by Moses developers and everything should run fine. Installation is the same as for Linux.
Out of the box, Mac OSX lacks many programs that are critical to Moses, or ships different versions of standard GNU programs. For example, split, sort, and zcat are incompatible BSD versions rather than GNU versions.
Therefore, Moses has been tested on Mac OSX with MacPorts. Make sure you have this installed on your machine. Success has also been reported with a brew installation. Do note, however, that you will need to install xmlrpc-c independently, and then compile with bjam using the --with-xmlrpc-c=/usr/local flag (where /usr/local/ is the default location of the xmlrpc-c installation).
Recent versions of OSX ship the clang C/C++ compiler rather than gcc. When compiling with bjam, you must add the following:
./bjam toolset=clang
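If the same build script is shared between Linux and OSX machines, the toolset argument can be added conditionally. The sketch below relies on uname -s reporting Darwin on OSX; the function name is made up for this example.

```shell
# Print "toolset=clang" when the kernel name is Darwin (OSX), nothing otherwise.
# The optional first argument overrides the value of `uname -s`.
bjam_toolset_flag() {
    os="${1:-$(uname -s)}"
    if [ "$os" = "Darwin" ]; then
        echo "toolset=clang"
    fi
}
```

It could be used as ./bjam $(bjam_toolset_flag) -j4, which expands to the plain ./bjam -j4 on Linux.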
This is the exact command I (Hieu) use on OSX Yosemite:
./bjam --with-boost=/Users/hieu/workspace/boost/boost_1_59_0.clang/ --with-cmph=/Users/hieu/workspace/cmph-2.0 --with-xmlrpc-c=/Users/hieu/workspace/xmlrpc-c/xmlrpc-c-1.33.17 --with-irstlm=/Users/hieu/workspace/irstlm/irstlm-5.80.08/trunk --with-mm --with-probing-pt -j5 toolset=clang -q -d2
You also need to add this argument when manually compiling boost. This is the exact command I use:
./b2 -j8 --prefix=$PWD --libdir=$PWD/lib64 --layout=system link=static toolset=clang install || echo FAILURE
Moses can run on Windows 10 using the Ubuntu 16.04 subsystem, which is available within the Windows programs' features tab. More information here:
https://docs.microsoft.com/en-us/windows/wsl/install-win10
Thereafter, installation is exactly the same as for Ubuntu.
Download the sample models and extract them into your working directory:
cd ~/mosesdecoder
wget http://www.statmt.org/moses/download/sample-models.tgz
tar xzf sample-models.tgz
cd sample-models
Run the decoder:
cd ~/mosesdecoder/sample-models
~/mosesdecoder/bin/moses -f phrase-model/moses.ini < phrase-model/in > out
If everything worked out right, this should translate the sentence "das ist ein kleines haus" (in the file in) as "this is a small house" (in the file out).
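A quick way to confirm the result is to compare the first line of the output file with the expected translation. This is a minimal sketch; the function name is made up for this example, and surrounding whitespace is trimmed before comparing in case the output line carries extra spaces.

```shell
# Compare the first line of a decoder output file with an expected translation,
# ignoring leading and trailing whitespace.
# Usage: check_translation EXPECTED OUTPUT_FILE
check_translation() {
    expected="$1"
    outfile="$2"
    actual=$(head -n 1 "$outfile" | sed 's/^[[:space:]]*//; s/[[:space:]]*$//')
    if [ "$actual" = "$expected" ]; then
        echo "OK"
    else
        echo "MISMATCH: got '$actual'"
        return 1
    fi
}
```

From the sample-models directory, check_translation "this is a small house" out reports OK when the decoder produced the expected sentence.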
Note that the configuration file moses.ini in each directory is set to use the KenLM language model toolkit by default. If you prefer to use IRSTLM, then edit the language model entry in moses.ini, replacing KENLM with IRSTLM. You will also have to compile with ./bjam --with-irstlm, adding the full path of your IRSTLM installation.
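The edit can also be scripted. The sketch below writes a modified copy of the configuration rather than changing it in place, and it assumes that the language model entry is a line beginning with the word KENLM followed by a space; check your moses.ini before relying on it. The function name is made up for this example.

```shell
# Write a copy of a moses.ini in which lines starting with "KENLM " start
# with "IRSTLM " instead. The original file is left untouched.
# Usage: switch_lm_to_irstlm INPUT_INI OUTPUT_INI
switch_lm_to_irstlm() {
    sed 's/^KENLM /IRSTLM /' "$1" > "$2"
}
```

For example, switch_lm_to_irstlm phrase-model/moses.ini phrase-model/moses.irstlm.ini creates a second configuration file (the output name is a placeholder) that can be passed to the decoder with -f.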
Moses also supports SRILM and RandLM language models. See here for more details.
The chart decoder is part of the same executable as of version 3.0.
You can run the chart demos from the sample-models directory as follows:
~/mosesdecoder/bin/moses -f string-to-tree/moses.ini < string-to-tree/in > out.stt
~/mosesdecoder/bin/moses -f tree-to-tree/moses.ini < tree-to-tree/in.xml > out.ttt
The expected result of the string-to-tree demo is
this is a small house
Why not try to build a Baseline translation system with freely available data?
This is a list of options to bjam. On a system with Boost installed in a standard path, none should be required, but you may want additional functionality or control.
In addition to KenLM and ORLM (which are always compiled):
If your SRILM install is non-standard, use these options:
Regression test data can be downloaded from https://github.com/moses-smt/moses-regression-tests.
Default install locations under the install prefix: PREFIX/bin, PREFIX/lib, PREFIX/include, PREFIX/scripts.
By default, the build is multi-threaded, optimized, and statically linked.
There is a video showing you how to set up Moses with Eclipse.
How to compile Moses with Eclipse
Moses comes with Eclipse project files for some of the C++ executables. Currently, there are project files for
* moses (decoder)
* moses-cmd (decoder)
* extract
* extract-rules
* extract-ghkm
* server
* ...
The Eclipse build is used primarily for development and debugging. It is not optimized and doesn't have many of the options available in the bjam build.
The advantage of using Eclipse is that it offers code-completion, and a GUI debugging environment.
NB. A recent update of Mac OSX replaced g++ with clang. Eclipse doesn't yet fully function with clang. Therefore, you should not use the Eclipse build with any OSX version higher than 10.8 (Mountain Lion).
Follow these instructions to build with Eclipse:
* Use the version of Eclipse for C++. Works (at least) with Eclipse Kepler and Luna.
* Get the Moses source code
    git clone git@github.com:moses-smt/mosesdecoder.git
    cd mosesdecoder
* Create a softlink to Boost (and optionally to the XMLRPC-C lib if you want to compile the moses server) in the Moses root directory, e.g.
    ln -s ~/workspace/boost_x_xx_x boost
* Create a new Eclipse workspace. The workspace MUST be in
    contrib/other-builds/
  Eclipse should now be running.
* Import all the Moses Eclipse projects into the workspace.
    File >> Import >> Existing Projects into Workspace >> Select root directory: contrib/other-builds/ >> Finish
* Compile all projects.
    Project >> Build All
!! Easy Setup on Ubuntu (on other linux systems, you'll need to install packages that provide gcc, make, git, automake, libtool)
# Install required Ubuntu packages to build Moses and its dependencies:\\
sudo apt-get install build-essential git-core pkg-config automake libtool wget zlib1g-dev python-dev libbz2-dev
\\ For the regression tests, you'll also need \\
sudo apt-get install libsoap-lite-perl
\\ See below for additional packages that you'll need to actually run Moses (especially when you are using EMS).
# Clone Moses from the repository and cd into the directory for building Moses\\
git clone https://github.com/moses-smt/mosesdecoder.git
\\
cd mosesdecoder
# Run the following to install a recent version of Boost (the default version on your system might be too old), as well as cmph (for CompactPT), irstlm (language model from FBK, required to pass the regression tests), and xmlrpc-c (for moses server). By default, these will be installed in ./opt in your working directory:\\
make -f contrib/Makefiles/install-dependencies.gmake
# To compile moses, run \\
./compile.sh [additional options]
!!! Popular additional bjam options (called from within ./compile.sh and ./run-regtests.sh):
* --prefix=/destination/path --install-scripts \\
... to install Moses somewhere else on your system
* --with-mm \\
... to enable suffix array-based phrase tables
Note that you'll still need a word aligner; this is not built automatically.
!!! Running regression tests (Advanced; for Moses developers; normal users won't need this)
To compile and run the regression tests all in one go, run \\
./run-regtests.sh [additional options]
\\ Regression testing is only of interest for people who are actively making changes in the Moses codebase. If you are just using Moses to run MT experiments, there's no point in running regression tests, unless you want to check that your current version of Moses is working as expected. However, you can also check your version against the daily regression tests here.
If you run your own regression tests, Moses will sometimes fail them even when everything is working correctly. Different compilers produce slightly different executables, and their output can differ slightly because they make different rounding errors.