machine translation

Reference: All Decoder Parameters

  • -beam-threshold (-b): threshold for threshold pruning
  • -cache-path: ?
  • -config (-f): location of the configuration file
  • -constraint: Target sentence to produce
  • -cube-pruning-diversity (-cbd): How many hypotheses should be created for each coverage. (default = 0)
  • -cube-pruning-pop-limit (-cbp): How many hypotheses should be popped for each stack. (default = 1000)
  • -distortion: configurations for each factorized/lexicalized reordering model.
  • -distortion-file: source factors (0 if table independent of source), target factors, location of the factorized/lexicalized reordering tables
  • -distortion-limit (-dl): distortion (reordering) limit in maximum number of words (0 = monotone, -1 = unlimited)
  • -drop-unknown (-du): drop unknown words instead of copying them
  • -early-discarding-threshold (-edt): threshold for constructing hypotheses based on estimate cost
  • -factor-delimiter (-fd): specify a different factor delimiter than the default
  • -generation-file: location and properties of the generation table
  • -include-alignment-in-n-best: include word alignment in the n-best list. default is false
  • -input-factors: list of factors in the input
  • -input-file (-i): location of the input file to be translated
  • -inputtype: text (0), confusion network (1) or word lattice (2)
  • -labeled-n-best-list: print out labels for each weight type in n-best list. default is true
  • -lmodel-dub: dictionary upper bounds of language models
  • -lmodel-file: location and properties of the language models
  • -lmstats (-L): (1/0) compute LM backoff statistics for each translation hypothesis
  • -mapping: description of decoding steps
  • -max-partial-trans-opt: maximum number of partial translation options per input span (during mapping steps)
  • -max-phrase-length: maximum phrase length (default 20)
  • -max-trans-opt-per-coverage: maximum number of translation options per input span (after applying mapping steps)
  • -mbr-scale: scaling factor to convert the log-linear score to a probability in MBR decoding (default 1.0)
  • -mbr-size: number of translation candidates considered in MBR decoding (default 200)
  • -minimum-bayes-risk (-mbr): use minimum Bayes risk to determine best translation
  • -monotone-at-punctuation (-mp): do not reorder over punctuation
  • -n-best-factor: factor to compute the maximum number of contenders (=factor*nbest-size). value 0 means infinity, i.e. no threshold. default is 0
  • -n-best-list: file and size of n-best-list to be generated; specify - as the file in order to write to STDOUT
  • -output-factors: list of factors in the output
  • -output-search-graph (-osg): Output connected hypotheses of search into specified filename
  • -output-word-graph (-owg): Output stack info as word graph. Takes filename, 0=only hypos in stack, 1=stack + nbest hypos
  • -persistent-cache-size: maximum size of cache for translation options (default 10,000 input phrases)
  • -phrase-drop-allowed (-da): if present, allow dropping of source words
  • -print-alignment-info: Output word-to-word alignment into the log file. Word-to-word alignments are taken from the phrase table if any. Default is false
  • -print-alignment-info-in-n-best: Include word-to-word alignment in the n-best list. Word-to-word alignments are taken from the phrase table if any. Default is false
  • -recover-input-path (-r): (confusion net/word lattice only) - recover input path corresponding to the best translation
  • -report-all-factors: report all factors in output, not just first
  • -report-segmentation (-t): report phrase segmentation in the output
  • -search-algorithm: Which search algorithm to use. 0=normal stack, 1=cube pruning (default = 0)
  • -stack (-s): maximum stack size for histogram pruning
  • -stack-diversity (-sd): minimum number of hypotheses of each coverage in stack (default 0)
  • -time-out: seconds after which decoding is interrupted (-1=no time-out, default is -1)
  • -translation-details (-T): for each best translation hypothesis, print out details about what source spans were used, dropped
  • -translation-option-threshold (-tot): threshold for translation options relative to best for input phrase
  • -ttable-file: location and properties of the translation tables
  • -use-alignment-info: Use word-to-word alignment: actually it is only used to output the word-to-word alignment. Word-to-word alignments are taken from the phrase table if any. Default is false.
  • -use-persistent-cache: cache translation options across sentences (default true)
  • -verbose (-v): verbosity level of the logging
  • -weight-d (-d): weight(s) for distortion (reordering components)
  • -weight-e (-e): weight for word deletion
  • -weight-file (-wf): file containing labeled weights
  • -weight-generation (-g): weight(s) for generation components
  • -weight-i (-I): weight for word insertion
  • -weight-l (-lm): weight(s) for language models
  • -weight-t (-tm): weights for translation model components
  • -weight-w (-w): weight for word penalty
  • -xml-input (-xi): allows markup of input with desired translations and probabilities. values can be 'pass-through' (default), 'inclusive', 'exclusive', 'ignore'
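
As an illustration of how a few of the switches above combine, here is a minimal Python sketch that assembles a decoder command line. It only builds the argument list and does not run anything. The binary name `moses`, the helper name `build_decoder_command`, and all file paths are assumptions made for this example; the switches themselves and the file-then-size order for -n-best-list are taken from the list above.

```python
from typing import List, Optional, Tuple


def build_decoder_command(
    config: str,
    input_file: str,
    distortion_limit: Optional[int] = None,
    stack_size: Optional[int] = None,
    n_best: Optional[Tuple[str, int]] = None,
    drop_unknown: bool = False,
    binary: str = "moses",  # assumed executable name, not stated in the list
) -> List[str]:
    """Assemble an argument list from a handful of the documented switches."""
    cmd = [binary, "-config", config, "-input-file", input_file]
    if distortion_limit is not None:
        # 0 = monotone, -1 = unlimited
        cmd += ["-distortion-limit", str(distortion_limit)]
    if stack_size is not None:
        # maximum stack size for histogram pruning
        cmd += ["-stack", str(stack_size)]
    if n_best is not None:
        # file first, then size; "-" as the file writes to STDOUT
        path, size = n_best
        cmd += ["-n-best-list", path, str(size)]
    if drop_unknown:
        # switch without a value: drop unknown words instead of copying them
        cmd.append("-drop-unknown")
    return cmd


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(" ".join(build_decoder_command(
        "moses.ini", "input.txt",
        distortion_limit=6, stack_size=100, n_best=("-", 10),
    )))
```

Returning a list rather than a single string keeps each argument separate, so the result can be handed to `subprocess.run` without shell quoting.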
Page last modified on August 25, 2014, at 08:41 AM