Preprocessing was done with Monoses (https://github.com/artetxem/monoses).
Description of files:
If you use these splits of the data, please cite:
Marchisio, Kelly and Duh, Kevin and Koehn, Philipp: When Does Unsupervised Machine Translation Work? Proceedings of the Fifth Conference on Machine Translation (WMT), 2020.
@InProceedings{marchisio-duh-koehn:2020:WMT,
  author    = {Marchisio, Kelly and Duh, Kevin and Koehn, Philipp},
  title     = {When Does Unsupervised Machine Translation Work?},
  booktitle = {Proceedings of the Fifth Conference on Machine Translation},
  month     = {November},
  year      = {2020},
  publisher = {Association for Computational Linguistics},
}