POS and Chunk Based Pre-Reordering
Reordering patterns over part-of-speech tags used in a pre-reordering stage are typically trained from aligned data.
POS Chunk Prereordering is the main subject of 18 publications. 12 are discussed here.
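As a rough illustration of how reordering patterns over part-of-speech tags might be collected from aligned data, the following Python sketch counts source tag sequences whose aligned target positions are contiguous but not monotone. The function name `extract_patterns`, the one-to-one alignment format, and the `max_len` limit are assumptions made for this example; they are not taken from any of the publications listed below.

```python
from collections import Counter


def extract_patterns(tags, alignment, max_len=3):
    """Count POS reordering patterns in one word-aligned sentence pair.

    tags:      source-side POS tags, one per source word
    alignment: alignment[i] is the target position of source word i
               (assumption: a one-to-one alignment, for simplicity)
    Returns a Counter keyed by (tag sequence, permutation).
    """
    patterns = Counter()
    n = len(tags)
    for start in range(n):
        for length in range(2, max_len + 1):
            end = start + length
            if end > n:
                break
            targets = [alignment[i] for i in range(start, end)]
            # Keep only spans that map to a contiguous target span.
            if max(targets) - min(targets) != length - 1:
                continue
            # Source offsets listed in target order.
            order = tuple(sorted(range(length), key=lambda k: targets[k]))
            # Monotone spans carry no reordering information.
            if order != tuple(range(length)):
                patterns[(tuple(tags[start:end]), order)] += 1
    return patterns


# Hypothetical example: "la casa blanca" aligned to "the white house".
counts = extract_patterns(["DT", "NN", "JJ"], [0, 2, 1])
```

Summing such counts over a corpus and normalising them by how often each tag sequence occurs would give one possible estimate of rule weights.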
Publications
POS-based pre-reordering may convert the input sentence into a reordering graph
Josep M. Crego and José B. Mariño (2006):
Integration of POS-tag-based Source Reordering into SMT Decoding by an Extended Search Graph, 5th Conference of the Association for Machine Translation in the Americas (AMTA)
@InProceedings{Crego:2006:AMTA,
author = {Josep M. Crego and Jos\'{e} B. Mari{\~n}o},
title = {Integration of {POS}-tag-based Source Reordering into {SMT} Decoding by an Extended Search Graph},
url = {http://gps-tsc.upc.es/veu/research/pubs/download/Cre\_Int\_06.pdf},
googlescholar = {1891198961367566265},
booktitle = {5th Conference of the Association for Machine Translation in the Americas (AMTA)},
month = {August},
address = {Boston, Massachusetts},
year = 2006
}
(Crego and Mariño, 2006) or use a rescoring approach with the patterns as features
Boxing Chen and Mauro Cettolo and Marcello Federico (2006):
Reordering rules for phrase-based statistical machine translation, Proc. of the International Workshop on Spoken Language Translation
@inproceedings{Chen:2006b:IWSLT,
author = {Boxing Chen and Mauro Cettolo and Marcello Federico},
title = {Reordering rules for phrase-based statistical machine translation},
url = {http://20.210-193-52.unknown.qala.com.sg/archive/iwslt\_06/papers/slt6\_182.pdf},
googlescholar = {5144961835222561878},
month = {November},
booktitle = {Proc. of the International Workshop on Spoken Language Translation},
address = {Kyoto, Japan},
year = 2006
}
(Chen et al., 2006). The reordering rules may also be integrated into an otherwise monotone decoder
Tillmann, Christoph (2008):
A Rule-Driven Dynamic Programming Decoder for Statistical MT, Proceedings of the ACL-08: HLT Second Workshop on Syntax and Structure in Statistical Translation (SSST-2)
@InProceedings{tillmann:2008:SSST,
author = {Tillmann, Christoph},
title = {A Rule-Driven Dynamic Programming Decoder for Statistical {MT}},
booktitle = {Proceedings of the ACL-08:~HLT Second Workshop on Syntax and Structure in Statistical Translation (SSST-2)},
month = {June},
address = {Columbus, Ohio},
publisher = {Association for Computational Linguistics},
pages = {37--45},
url = {http://www.aclweb.org/anthology/W/W08/W08-0405},
year = 2008
}
(Tillmann, 2008). Such rules may also be used in a separate reordering model. They may be based on automatic word classes
Costa-jussà, Marta Ruiz and Fonollosa, José A. R. (2006):
Statistical Machine Reordering, Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing
@InProceedings{costajussa-fonollosa:2006:EMNLP,
author = {Costa-juss\`{a}, Marta Ruiz and Fonollosa, Jos\'{e} A. R.},
title = {Statistical Machine Reordering},
booktitle = {Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing},
month = {July},
address = {Sydney, Australia},
publisher = {Association for Computational Linguistics},
pages = {70--76},
url = {http://www.aclweb.org/anthology/W/W06/W06-1609},
year = 2006
}
(Costa-jussà and Fonollosa, 2006;
Crego, Josep M. and de Gispert, Adrià and Lambert, Patrik and Costa-jussà, Marta Ruiz and Khalilov, Maxim and Banchs, Rafael E. and Mariño, José B. and Fonollosa, José A. R. (2006):
N-gram-based SMT System Enhanced with Reordering Patterns, Proceedings on the Workshop on Statistical Machine Translation
@InProceedings{crego-EtAl:2006:WMT,
author = {Crego, Josep M. and de Gispert, Adri\`{a} and Lambert, Patrik and Costa-juss\`{a}, Marta Ruiz and Khalilov, Maxim and Banchs, Rafael E. and Mari{\~n}o, Jos\'{e} B. and Fonollosa, Jos\'{e} A. R.},
title = {N-gram-based {SMT} System Enhanced with Reordering Patterns},
booktitle = {Proceedings on the Workshop on Statistical Machine Translation},
month = {June},
address = {New York City},
publisher = {Association for Computational Linguistics},
pages = {162--165},
url = {http://www.aclweb.org/anthology/W/W06/W06-3125},
year = 2006
}
Crego et al., 2006), which were shown to outperform part-of-speech tags
Costa-jussà, Marta Ruiz and Fonollosa, José A. R. (2007):
Analysis of Statistical and Morphological Classes to Generate Weigthed Reordering Hypotheses on a Statistical Machine Translation System, Proceedings of the Second Workshop on Statistical Machine Translation
@InProceedings{rcostajussia-rfonollosa:2007:WMT,
author = {Costa-juss\`{a}, Marta Ruiz and Fonollosa, Jos\'{e} A. R.},
title = {Analysis of Statistical and Morphological Classes to Generate Weigthed Reordering Hypotheses on a Statistical Machine Translation System},
booktitle = {Proceedings of the Second Workshop on Statistical Machine Translation},
month = {June},
address = {Prague, Czech Republic},
publisher = {Association for Computational Linguistics},
pages = {171--176},
url = {http://www.aclweb.org/anthology/W/W07/W07-0221},
year = 2007
}
(Costa-jussà and Fonollosa, 2007), or they may be based on syntactic chunks
Zhang, Yuqi and Zens, Richard and Ney, Hermann (2007):
Chunk-Level Reordering of Source Language Sentences with Automatically Learned Rules for Statistical Machine Translation, Proceedings of SSST, NAACL-HLT 2007 / AMTA Workshop on Syntax and Structure in Statistical Translation
@InProceedings{zhang-zens-ney:2007:SSST,
author = {Zhang, Yuqi and Zens, Richard and Ney, Hermann},
title = {Chunk-Level Reordering of Source Language Sentences with Automatically Learned Rules for Statistical Machine Translation},
booktitle = {Proceedings of SSST, NAACL-HLT 2007 / AMTA Workshop on Syntax and Structure in Statistical Translation},
month = {April},
address = {Rochester, New York},
publisher = {Association for Computational Linguistics},
pages = {1--8},
url = {http://www.aclweb.org/anthology/W/W07/W07-0401},
year = 2007
}
(Zhang et al., 2007;
Yuqi Zhang and Richard Zens and Hermann Ney (2007):
Improved Chunk-level Reordering for Statistical Machine Translation, Proceedings of the International Workshop on Spoken Language Translation (IWSLT)
@inproceedings{Zhang:2007:IWSLT,
author = {Yuqi Zhang and Richard Zens and Hermann Ney},
title = {Improved Chunk-level Reordering for Statistical Machine Translation},
url = {http://20.210-193-52.unknown.qala.com.sg/archive/iwslt\_07/papers/slt7\_021.pdf},
googlescholar = {1877345433028572530},
booktitle = {Proceedings of the International Workshop on Spoken Language Translation (IWSLT)},
year = 2007
}
Zhang et al., 2007b;
Crego, Josep M. and Habash, Nizar (2008):
Using Shallow Syntax Information to Improve Word Alignment and Reordering for SMT, Proceedings of the Third Workshop on Statistical Machine Translation
@InProceedings{crego-habash:2008:WMT,
author = {Crego, Josep M. and Habash, Nizar},
title = {Using Shallow Syntax Information to Improve Word Alignment and Reordering for {SMT}},
booktitle = {Proceedings of the Third Workshop on Statistical Machine Translation},
month = {June},
address = {Columbus, Ohio},
publisher = {Association for Computational Linguistics},
pages = {53--61},
url = {http://www.aclweb.org/anthology/W/W08/W08-0307},
year = 2008
}
Crego and Habash, 2008). Scoring for rule applications may be encoded in the reordering graph, or done once the target word order is established, which allows for rewarding reorderings that result from phrase-internal reordering
Elming, Jakob (2008):
Syntactic Reordering Integrated with Phrase-Based SMT, Proceedings of the ACL-08: HLT Second Workshop on Syntax and Structure in Statistical Translation (SSST-2)
@InProceedings{elming:2008:SSST,
author = {Elming, Jakob},
title = {Syntactic Reordering Integrated with Phrase-Based {SMT}},
booktitle = {Proceedings of the ACL-08:~HLT Second Workshop on Syntax and Structure in Statistical Translation (SSST-2)},
month = {June},
address = {Columbus, Ohio},
publisher = {Association for Computational Linguistics},
pages = {46--54},
url = {http://www.aclweb.org/anthology/W/W08/W08-0406},
year = 2008
}
(Elming, 2008;
Elming, Jakob (2008):
Syntactic Reordering Integrated with Phrase-Based SMT, Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)
@InProceedings{elming:2008:PAPERS,
author = {Elming, Jakob},
title = {Syntactic Reordering Integrated with Phrase-Based {SMT}},
booktitle = {Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)},
month = {August},
address = {Manchester, UK},
publisher = {Coling 2008 Organizing Committee},
pages = {209--216},
url = {http://www.aclweb.org/anthology/C08-1027},
year = 2008
}
Elming, 2008b).
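To make the idea of a reordering graph with scored rule applications more concrete, here is a minimal Python sketch that enumerates the source word orders reachable by applying POS-pattern rules, together with the product of the weights of the rules used. The function name `reorderings`, the rule dictionary format, and the choice to give the monotone path a weight of 1.0 are all assumptions for illustration. The systems cited above store the alternatives compactly as a graph or lattice and let the decoder search it, rather than listing every path as this sketch does.

```python
def reorderings(words, tags, rules):
    """List alternative source orders licensed by POS-pattern rules.

    rules maps (tag sequence, permutation) to a weight, for example
    {(("NN", "JJ"), (1, 0)): 0.8}. Rule applications may not overlap.
    Returns a list of (word order, score) pairs; the score is the
    product of the weights of the applied rules (assumption: words
    left in place contribute a factor of 1.0).
    """
    def expand(pos):
        if pos == len(words):
            return [((), 1.0)]
        results = []
        # Monotone edge: keep the current word in place.
        for tail, score in expand(pos + 1):
            results.append(((words[pos],) + tail, score))
        # Reordering edges: one per rule whose tag pattern matches here.
        for (pattern, order), weight in rules.items():
            end = pos + len(pattern)
            if tuple(tags[pos:end]) == pattern:
                moved = tuple(words[pos + k] for k in order)
                for tail, score in expand(end):
                    results.append((moved + tail, score * weight))
        return results

    return expand(0)


# Hypothetical rule: swap a noun with a following adjective.
paths = reorderings(
    ["la", "casa", "blanca"],
    ["DT", "NN", "JJ"],
    {(("NN", "JJ"), (1, 0)): 0.8},
)
```

Exhaustive enumeration grows quickly with sentence length, which is one reason a shared graph representation is attractive in practice.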
Benchmarks
Discussion
Related Topics
New Publications
Li, Shuo and Wong, Derek F. and Chao, Lidia S. (2013):
Experiments with POS-based restructuring and alignment-based reordering for statistical machine translation, Proceedings of the Second Workshop on Hybrid Approaches to Translation
@InProceedings{li-wong-chao:2013:HyTra,
author = {Li, Shuo and Wong, Derek F. and Chao, Lidia S.},
title = {Experiments with POS-based restructuring and alignment-based reordering for statistical machine translation},
booktitle = {Proceedings of the Second Workshop on Hybrid Approaches to Translation},
month = {August},
address = {Sofia, Bulgaria},
publisher = {Association for Computational Linguistics},
pages = {82--87},
url = {http://www.aclweb.org/anthology/W13-2812},
year = 2013
}
Li et al. (2013)
Arianna Bisazza and Daniele Pighin and Marcello Federico (2012):
Chunk-lattices for verb reordering in Arabic-English statistical machine translation, Machine Translation
@article{MTJ:2012:Bisazza,
author = {Arianna Bisazza and Daniele Pighin and Marcello Federico},
title = {Chunk-lattices for verb reordering in {Arabic}-{English} statistical machine translation},
pages = {85--103},
journal = {Machine Translation},
volume = {26},
number = {1--2},
month = {March},
year = 2012
}
Bisazza et al. (2012)
Marta R. Costa-jussà and José A. R. Fonollosa (2008):
Computing multiple weighted reordering hypotheses for a phrase-based statistical machine translation system, Proceedings of the Eighth Conference of the Association for Machine Translation in the Americas (AMTA)
@inproceedings{amta08:Costa-jussa,
author = {Marta R. Costa-juss\`{a} and Jos\'{e} A. R. Fonollosa},
title = {Computing multiple weighted reordering hypotheses for a phrase-based statistical machine translation system},
pages = {82--88},
booktitle = {Proceedings of the Eighth Conference of the Association for Machine Translation in the Americas (AMTA)},
address = {Waikiki, Hawaii},
year = 2008
}
Costa-jussà and Fonollosa (2008)
Xianchao Wu and Katsuhito Sudoh and Kevin Duh and Hajime Tsukada and Masaaki Nagata (2011):
Extracting Pre-ordering Rules from Chunk-based Dependency Trees for Japanese-to-English Translation, Proceedings of the 13th Machine Translation Summit (MT Summit XIII)
@inproceedings{MTS-2011-Wu,
author = {Xianchao Wu and Katsuhito Sudoh and Kevin Duh and Hajime Tsukada and Masaaki Nagata},
title = {Extracting Pre-ordering Rules from Chunk-based Dependency Trees for {Japanese-to-English} Translation},
url = {http://www.mt-archive.info/MTS-2011-Wu.pdf},
pages = {300--307},
booktitle = {Proceedings of the 13th Machine Translation Summit (MT Summit XIII)},
publisher = {International Association for Machine Translation},
address = {Xiamen, China},
year = 2011
}
Wu et al. (2011)
Josep M. Crego and José B. Mariño and Adrià de Gispert (2005):
Reordered Search, and Tuple Unfolding for Ngram-based SMT, Proceedings of the Tenth Machine Translation Summit (MT Summit X)
@InProceedings{Crego:2005:MTS,
author = {Josep M. Crego and Jos\'{e} B. Mari{\~n}o and Adri\`{a} de Gispert},
title = {Reordered Search, and Tuple Unfolding for Ngram-based {SMT}},
booktitle = {Proceedings of the Tenth Machine Translation Summit (MT Summit X)},
month = {September},
address = {Phuket, Thailand},
year = 2005
}
Crego et al. (2005)
Stymne, Sara (2012):
Clustered Word Classes for Preordering in Statistical Machine Translation, Proceedings of the Joint Workshop on Unsupervised and Semi-Supervised Learning in NLP
@InProceedings{stymne:2012:ROBUS-UNSUP2012,
author = {Stymne, Sara},
title = {Clustered Word Classes for Preordering in Statistical Machine Translation},
booktitle = {Proceedings of the Joint Workshop on Unsupervised and Semi-Supervised Learning in NLP},
month = {April},
address = {Avignon, France},
publisher = {Association for Computational Linguistics},
pages = {28--34},
url = {http://www.aclweb.org/anthology/W12-0704},
year = 2012
}
Stymne (2012)
Andreas, Jacob and Habash, Nizar and Rambow, Owen (2011):
Fuzzy Syntactic Reordering for Phrase-based Statistical Machine Translation, Proceedings of the Sixth Workshop on Statistical Machine Translation
@InProceedings{andreas-habash-rambow:2011:WMT,
author = {Andreas, Jacob and Habash, Nizar and Rambow, Owen},
title = {Fuzzy Syntactic Reordering for Phrase-based Statistical Machine Translation},
booktitle = {Proceedings of the Sixth Workshop on Statistical Machine Translation},
month = {July},
address = {Edinburgh, Scotland},
publisher = {Association for Computational Linguistics},
pages = {227--236},
url = {http://www.aclweb.org/anthology/W11-2127},
year = 2011
}
Andreas et al. (2011)