Word Alignment Combination
System combination has been a very successful method for statistical machine translation, and the idea has also been applied to improve word alignments.
Word Alignment Combination is the main subject of 16 publications. 7 are discussed here.
Publications
Dan Tufiş and Radu Ion and Alexandru Ceauşu and Dan Ştefanescu (2006):
Improved Lexical Alignment by Combining Multiple Reified Alignments, Proceedings of the 11th Conference of the European Chapter of the Association for Computational Linguistics
@InProceedings{Tufis:2006:EACL,
author = {Dan Tufi{\c{s}} and Radu Ion and Alexandru Ceau{\c{s}}u and Dan {\c{S}}tefanescu},
title = {Improved Lexical Alignment by Combining Multiple Reified Alignments},
booktitle = {Proceedings of the 11th Conference of the European Chapter of the Association for Computational Linguistics},
month = {April},
address = {Trento, Italy},
year = 2006
}
Tufiş et al. (2006) combine different word aligners with heuristics.
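As a rough illustration of what a heuristic combination of aligner outputs can look like, the following minimal sketch merges several alignments by voting over links. It is not the method of Tufiş et al. (2006); the function name `combine_alignments`, the `min_votes` parameter, and the toy alignments are assumptions made for this example. Each alignment is assumed to be a set of (source index, target index) pairs.

```python
from collections import Counter
from typing import Iterable, Set, Tuple

# A link pairs a source word position with a target word position.
Link = Tuple[int, int]


def combine_alignments(alignments: Iterable[Set[Link]], min_votes: int) -> Set[Link]:
    """Keep every link proposed by at least `min_votes` of the input aligners."""
    # Each alignment is a set, so a link gets at most one vote per aligner.
    votes = Counter(link for alignment in alignments for link in alignment)
    return {link for link, count in votes.items() if count >= min_votes}


# Hypothetical outputs of three aligners for the same sentence pair.
aligner_a = {(0, 0), (1, 1), (2, 3)}
aligner_b = {(0, 0), (1, 2), (2, 3)}
aligner_c = {(0, 0), (1, 1), (3, 2)}
outputs = [aligner_a, aligner_b, aligner_c]

union = combine_alignments(outputs, min_votes=1)
majority = combine_alignments(outputs, min_votes=2)
intersection = combine_alignments(outputs, min_votes=len(outputs))
```

With `min_votes=1` the result is the union (high recall), with `min_votes` equal to the number of aligners it is the intersection (high precision), and values in between trade one off against the other.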
Schrader, Bettina (2006):
ATLAS -- A New Text Alignment Architecture, Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions
@InProceedings{schrader:2006:POS,
author = {Schrader, Bettina},
title = {ATLAS -- A New Text Alignment Architecture},
booktitle = {Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions},
month = {July},
address = {Sydney, Australia},
publisher = {Association for Computational Linguistics},
pages = {715--722},
url = {http://www.aclweb.org/anthology/P/P06/P06-2092},
year = 2006
}
Schrader (2006) combines statistical methods with manual rules.
Elming, Jakob and Habash, Nizar (2007):
Combination of Statistical Word Alignments Based on Multiple Preprocessing Schemes, Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Companion Volume, Short Papers
@InProceedings{elming-habash:2007:ShortPapers,
author = {Elming, Jakob and Habash, Nizar},
title = {Combination of Statistical Word Alignments Based on Multiple Preprocessing Schemes},
booktitle = {Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Companion Volume, Short Papers},
month = {April},
address = {Rochester, New York},
publisher = {Association for Computational Linguistics},
pages = {25--28},
url = {http://www.aclweb.org/anthology/N/N07/N07-2007},
year = 2007
}
Elming and Habash (2007) combine word aligners that operate under different morphological analyses of one of the languages, in their case Arabic.
Huang, Fei and Zhang, Ying and Vogel, Stephan (2005):
Mining Key Phrase Translations from Web Corpora, Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing
@InProceedings{huang-zhang-vogel:2005:HLTEMNLP,
author = {Huang, Fei and Zhang, Ying and Vogel, Stephan},
title = {Mining Key Phrase Translations from Web Corpora},
booktitle = {Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing},
month = {October},
address = {Vancouver, British Columbia, Canada},
publisher = {Association for Computational Linguistics},
pages = {483--490},
url = {http://www.aclweb.org/anthology/H/H05/H05-1061},
year = 2005
}
Huang et al. (2005) discuss the issue of interpolating word alignment models trained on data from multiple domains. Word alignment in a specific domain may also be improved with a dictionary obtained from a general domain
Hua Wu and Haifeng Wang (2004):
Improving domain-specific word alignment with a general bilingual corpus, Proceedings of the 6th Conference of the Association for Machine Translation in the Americas (AMTA 2004)
@inproceedings{wu:2004:AMTA,
author = {Hua Wu and Haifeng Wang},
title = {Improving domain-specific word alignment with a general bilingual corpus},
url = {http://ir.hit.edu.cn/~wanghaifeng/paper/AMTA04\_Alignment.pdf},
googlescholar = {9303484077530298148},
booktitle = {Proceedings of the 6th Conference of the Association for Machine Translation in the Americas (AMTA 2004)},
pages = {262--271},
year = 2004
}
(Wu and Wang, 2004). Combining different word aligners that stem from different methodologies may also improve performance
Necip Fazil Ayan and Bonnie J. Dorr and Nizar Habash (2004):
Multi-Align: combining linguistic and statistical techniques to improve alignments for adaptable MT, Proceedings of the 6th Conference of the Association for Machine Translation in the Americas (AMTA 2004)
@inproceedings{ayan:2004:AMTA,
author = {Necip Fazil Ayan and Bonnie J. Dorr and Nizar Habash},
title = {Multi-Align: combining linguistic and statistical techniques to improve alignments for adaptable {MT}},
url = {http://www.speech.sri.com/people/nfa/Publications/ayan-amta04-multialign.pdf},
googlescholar = {3599574756516325464},
booktitle = {Proceedings of the 6th Conference of the Association for Machine Translation in the Americas (AMTA 2004)},
pages = {17--26},
year = 2004
}
(Ayan et al., 2004). The log-linear modeling approach may be used to combine word alignment models with simpler models, for instance models based on part-of-speech tags
Liu, Yang and Liu, Qun and Lin, Shouxun (2005):
Log-Linear Models for Word Alignment, Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL'05)
@InProceedings{liu-liu-lin:2005:ACL,
author = {Liu, Yang and Liu, Qun and Lin, Shouxun},
title = {Log-Linear Models for Word Alignment},
booktitle = {Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL'05)},
month = {June},
address = {Ann Arbor, Michigan},
publisher = {Association for Computational Linguistics},
pages = {459--466},
url = {http://www.aclweb.org/anthology/P/P05/P05-1057},
year = 2005
}
(Liu et al., 2005).
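The general shape of such a log-linear combination can be written as follows. This is a sketch in assumed notation, not copied from the paper: $f$ is the source sentence, $e$ the target sentence, $a$ a candidate alignment, each $h_m$ a feature function (for instance the score of a statistical alignment model, or of a simpler model based on part-of-speech tags), and $\lambda_m$ its weight.

```latex
\[
  \Pr(a \mid f, e) =
  \frac{\exp\left( \sum_{m=1}^{M} \lambda_m h_m(a, f, e) \right)}
       {\sum_{a'} \exp\left( \sum_{m=1}^{M} \lambda_m h_m(a', f, e) \right)}
\]
```

Because the denominator does not depend on $a$, searching for the best alignment only requires maximizing the weighted feature sum in the numerator.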
Benchmarks
Discussion
Related Topics
System combination of output from different machine translation systems is a more complex problem, but has been shown to be very successful.
New Publications
Liu, Qun and Tu, Zhaopeng and Lin, Shouxun (2013):
A Novel Graph-based Compact Representation of Word Alignment, Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
@InProceedings{liu-tu-lin:2013:Short,
author = {Liu, Qun and Tu, Zhaopeng and Lin, Shouxun},
title = {A Novel Graph-based Compact Representation of Word Alignment},
booktitle = {Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)},
month = {August},
address = {Sofia, Bulgaria},
publisher = {Association for Computational Linguistics},
pages = {358--363},
url = {http://www.aclweb.org/anthology/P13-2064},
year = 2013
}
Liu et al. (2013)
Tu, Zhaopeng and Liu, Yang and Liu, Qun and Lin, Shouxun (2011):
Extracting Hierarchical Rules from a Weighted Alignment Matrix, Proceedings of 5th International Joint Conference on Natural Language Processing
@InProceedings{tu-EtAl:2011:IJCNLP-2011,
author = {Tu, Zhaopeng and Liu, Yang and Liu, Qun and Lin, Shouxun},
title = {Extracting Hierarchical Rules from a Weighted Alignment Matrix},
booktitle = {Proceedings of 5th International Joint Conference on Natural Language Processing},
month = {November},
address = {Chiang Mai, Thailand},
publisher = {Asian Federation of Natural Language Processing},
pages = {1294--1303},
url = {http://www.aclweb.org/anthology/I11-1145},
year = 2011
}
Tu et al. (2011)
Tu, Zhaopeng and Liu, Yang and He, Yifan and van Genabith, Josef and Liu, Qun and Lin, Shouxun (2012):
Combining Multiple Alignments to Improve Machine Translation, Proceedings of COLING 2012: Posters
@InProceedings{tu-EtAl:2012:POSTERS,
author = {Tu, Zhaopeng and Liu, Yang and He, Yifan and van Genabith, Josef and Liu, Qun and Lin, Shouxun},
title = {Combining Multiple Alignments to Improve Machine Translation},
booktitle = {Proceedings of COLING 2012: Posters},
month = {December},
address = {Mumbai, India},
publisher = {The COLING 2012 Organizing Committee},
pages = {1249--1260},
url = {http://www.aclweb.org/anthology/C12-2122},
year = 2012
}
Tu et al. (2012)
Pal, Santanu and Naskar, Sudip and Bandyopadhyay, Sivaji (2013):
A Hybrid Word Alignment Model for Phrase-Based Statistical Machine Translation, Proceedings of the Second Workshop on Hybrid Approaches to Translation
@InProceedings{pal-naskar-bandyopadhyay:2013:HyTra,
author = {Pal, Santanu and Naskar, Sudip and Bandyopadhyay, Sivaji},
title = {A Hybrid Word Alignment Model for Phrase-Based Statistical Machine Translation},
booktitle = {Proceedings of the Second Workshop on Hybrid Approaches to Translation},
month = {August},
address = {Sofia, Bulgaria},
publisher = {Association for Computational Linguistics},
pages = {94--101},
url = {http://www.aclweb.org/anthology/W13-2814},
year = 2013
}
Pal et al. (2013)
Xu, Jinxi and Rosti, Antti-Veikko (2010):
Combining Unsupervised and Supervised Alignments for MT: An Empirical Study, Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
@InProceedings{xu-rosti:2010:EMNLP,
author = {Xu, Jinxi and Rosti, Antti-Veikko},
title = {Combining Unsupervised and Supervised Alignments for {MT}: An Empirical Study},
booktitle = {Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing},
month = {October},
address = {Cambridge, MA},
publisher = {Association for Computational Linguistics},
pages = {667--673},
url = {http://www.aclweb.org/anthology/D/D10/D10-1065},
year = 2010
}
Xu and Rosti (2010)
Deng, Yonggang and Zhou, Bowen (2009):
Optimizing Word Alignment Combination For Phrase Table Training, Proceedings of the ACL-IJCNLP 2009 Conference Short Papers
@InProceedings{deng-zhou:2009:Short,
author = {Deng, Yonggang and Zhou, Bowen},
title = {Optimizing Word Alignment Combination For Phrase Table Training},
booktitle = {Proceedings of the ACL-IJCNLP 2009 Conference Short Papers},
month = {August},
address = {Suntec, Singapore},
publisher = {Association for Computational Linguistics},
pages = {229--232},
url = {http://www.aclweb.org/anthology/P/P09/P09-2058},
year = 2009
}
Deng and Zhou (2009)
DeNero, John and Macherey, Klaus (2011):
Model-Based Aligner Combination Using Dual Decomposition, Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies
@InProceedings{denero-macherey:2011:ACL-HLT2011,
author = {DeNero, John and Macherey, Klaus},
title = {Model-Based Aligner Combination Using Dual Decomposition},
booktitle = {Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies},
month = {June},
address = {Portland, Oregon, USA},
publisher = {Association for Computational Linguistics},
pages = {420--429},
url = {http://www.aclweb.org/anthology/P11-1043},
year = 2011
}
DeNero and Macherey (2011)
Xi, Ning and Tang, Guangchao and Li, Boyuan and Zhao, Yinggong (2011):
Word Alignment Combination over Multiple Word Segmentation, Proceedings of the ACL 2011 Student Session
@InProceedings{xi-EtAl:2011:SS,
author = {Xi, Ning and Tang, Guangchao and Li, Boyuan and Zhao, Yinggong},
title = {Word Alignment Combination over Multiple Word Segmentation},
booktitle = {Proceedings of the ACL 2011 Student Session},
month = {June},
address = {Portland, OR, USA},
publisher = {Association for Computational Linguistics},
pages = {1--5},
url = {http://www.aclweb.org/anthology/P11-3001},
year = 2011
}
Xi et al. (2011)
Tufiş, Dan and Ion, Radu and Ceauşu, Alexandru and Stefanescu, Dan (2005):
Combined Word Alignments, Proceedings of the ACL Workshop on Building and Using Parallel Texts
@InProceedings{tufis-EtAl:2005:WPT,
author = {Tufi{\c{s}}, Dan and Ion, Radu and Ceau{\c{s}}u, Alexandru and Stefanescu, Dan},
title = {Combined Word Alignments},
booktitle = {Proceedings of the ACL Workshop on Building and Using Parallel Texts},
month = {June},
address = {Ann Arbor, Michigan},
publisher = {Association for Computational Linguistics},
pages = {107--110},
url = {http://www.aclweb.org/anthology/W/W05/W05-0817},
year = 2005
}
Tufiş et al. (2005)