Decoding with a LM Server

Monday, 25th -- first steps:

  • briefly checked the Moses decoding code from SVN trunk
  • located the LM code and discussed possible extensions
  • compiled lmserver, successful local testing with Moses+lmserver


  • Philipp Koehn
  • Mark Fishel
  • Christian Federmann

While large language models have been shown to be very useful, they easily outstrip available computing resources. Moses already includes a language model server implementation, but it is not well integrated with the decoding algorithm. Using the LM server adds overhead due to the latency of the TCP/IP requests.

It would therefore be preferable to make these requests in large batches instead of one at a time. This would require organizing the decoding algorithm around such packed requests. The algorithm would create hypotheses in two stages:

  1. First as partially resolved "temporary hypotheses" that are placed in temporary stacks.
  2. Once a large number of these are collected, they will be scored with the LM and finalized.
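
The two stages above can be sketched roughly as follows. This is only an illustration in Python, not Moses code: the names TempHypothesis, BatchingScorer and toy_lm are hypothetical, the toy scoring function stands in for a real lmserver round trip, and each hypothesis is assumed to need just one n-gram score.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

Ngram = Tuple[str, ...]


@dataclass
class TempHypothesis:
    """Stage 1: partially resolved hypothesis, LM score still pending."""
    words: Tuple[str, ...]
    tm_score: float   # score from the non-LM models (assumed already known)
    ngram: Ngram      # the n-gram whose LM score is still missing


@dataclass
class Hypothesis:
    """Stage 2: finalized hypothesis with the LM score added."""
    words: Tuple[str, ...]
    score: float


class BatchingScorer:
    """Collects temporary hypotheses and scores them in packed requests."""

    def __init__(self,
                 batch_lm_query: Callable[[Sequence[Ngram]], Dict[Ngram, float]],
                 batch_size: int) -> None:
        self.batch_lm_query = batch_lm_query
        self.batch_size = batch_size
        self.temp_stack: List[TempHypothesis] = []
        self.finalized: List[Hypothesis] = []
        self.round_trips = 0  # number of (simulated) server requests

    def add(self, hyp: TempHypothesis) -> None:
        self.temp_stack.append(hyp)
        if len(self.temp_stack) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        """Send one packed request for all pending n-grams, then finalize."""
        if not self.temp_stack:
            return
        unique = sorted({h.ngram for h in self.temp_stack})  # deduplicate
        scores = self.batch_lm_query(unique)
        self.round_trips += 1
        for h in self.temp_stack:
            self.finalized.append(Hypothesis(h.words, h.tm_score + scores[h.ngram]))
        self.temp_stack.clear()


def toy_lm(ngrams: Sequence[Ngram]) -> Dict[Ngram, float]:
    """Stand-in for the LM server; the scores are made up for the demo."""
    return {ng: -float(len(ng[-1])) for ng in ngrams}


scorer = BatchingScorer(toy_lm, batch_size=3)
for w in ["a", "bb", "ccc", "dddd"]:
    scorer.add(TempHypothesis(("the", w), -1.0, ("the", w)))
scorer.flush()  # score whatever is left in the temporary stack
```

In this sketch four hypotheses cost two round trips instead of four. A real decoder would also have to postpone pruning and recombination until after the flush, since the final scores are not known before then.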

There are a couple of ways to optimize this further, such as using a randomized LM to filter requests and using multi-threading to avoid wasting time waiting, but this would be beyond a short project.
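
To give an idea of what filtering requests could look like, here is a small Python sketch that uses a Bloom filter as a stand-in for a randomized LM. It is an assumption about how such a filter might be used, not something implemented in the project: BloomFilter and split_requests are hypothetical names, and the filter sizes are arbitrary. N-grams the filter reports as unseen are never sent to the server (the decoder could back off locally instead); false positives only cost an unnecessary request.

```python
import hashlib
from typing import Iterator, List, Sequence, Tuple

Ngram = Tuple[str, ...]


class BloomFilter:
    """Tiny Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for simplicity

    def _positions(self, item: str) -> Iterator[int]:
        # Derive several hash positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item: str) -> bool:
        return all(self.bits[pos] for pos in self._positions(item))


def split_requests(ngrams: Sequence[Ngram],
                   known: BloomFilter) -> Tuple[List[Ngram], List[Ngram]]:
    """Separate n-grams worth asking the server about from those to skip."""
    to_server = [ng for ng in ngrams if " ".join(ng) in known]
    skipped = [ng for ng in ngrams if " ".join(ng) not in known]
    return to_server, skipped


# Build the filter from the n-grams the (hypothetical) LM contains.
known = BloomFilter()
for ng in [("the", "house"), ("a", "house")]:
    known.add(" ".join(ng))

candidates = [("the", "house"), ("a", "house"), ("purple", "zebra")]
to_server, skipped = split_requests(candidates, known)
```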

Possible extensions:

  • non-blocking lmserver calls
  • multi-threading (can that really improve stuff?)
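
A non-blocking call could look roughly like the Python sketch below. The names slow_lm_query and decode_step_nonblocking are hypothetical, and the sleep only simulates network latency; the point is that the decoder submits a batch, does other work while the request is in flight, and collects the scores afterwards.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Sequence, Tuple

Ngram = Tuple[str, ...]


def slow_lm_query(ngrams: Sequence[Ngram]) -> Dict[Ngram, float]:
    """Stand-in for a blocking lmserver request (scores are made up)."""
    time.sleep(0.05)  # simulated TCP/IP latency
    return {ng: -float(len(ng)) for ng in ngrams}


def decode_step_nonblocking(batch: Sequence[Ngram],
                            other_work: Callable[[], int]):
    """Overlap the LM request with other decoder work."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(slow_lm_query, batch)  # request is now in flight
        work_result = other_work()                  # e.g. expand more hypotheses
        scores = future.result()                    # block only when scores are needed
    return scores, work_result


scores, work_result = decode_step_nonblocking(
    [("a", "b"), ("c",)],
    lambda: 42,  # placeholder for real decoder work
)
```

Whether this really helps depends on how much useful work the decoder can do without the pending scores.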
Page last modified on January 25, 2010, at 11:49 PM