Discriminative Word Alignment
Viewed from a machine learning perspective, word alignment is an interesting structured prediction problem, with the added twist that only small amounts of supervised data but large amounts of unsupervised data are available.
Discriminative Word Alignment is the main subject of 22 publications. 17 are discussed here.
Publications
Statistical machine translation systems achieve better quality with manually labeled word alignments
Callison-Burch, Chris and Talbot, David and Osborne, Miles (2004):
Statistical Machine Translation with Word- and Sentence-Aligned Parallel Corpora, Proceedings of the 42nd Meeting of the Association for Computational Linguistics (ACL'04), Main Volume
@inproceedings{Callison-Burch:2004,
author = {Callison-Burch, Chris and Talbot, David and Osborne, Miles},
title = {Statistical Machine Translation with Word- and Sentence-Aligned Parallel Corpora},
url = {
http://acl.ldc.upenn.edu/acl2004/main/pdf/238\_pdf\_2-col.pdf},
booktitle = {Proceedings of the 42nd Meeting of the Association for Computational Linguistics (ACL'04), Main Volume},
month = {July},
address = {Barcelona, Spain},
pages = {175--182},
year = 2004
}
(Callison-Burch et al., 2004), but such data does not exist in large quantities. Discriminative word alignment methods typically gather statistics over a large unlabeled corpus, which may have been aligned with a baseline method such as the IBM models. These statistics form the basis for features whose weights are optimized by machine learning over a much smaller labeled corpus.
Fraser, Alexander and Marcu, Daniel (2007):
Getting the Structure Right for Word Alignment: LEAF, Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)
@InProceedings{fraser-marcu:2007:EMNLP-CoNLL2007,
author = {Fraser, Alexander and Marcu, Daniel},
title = {Getting the Structure Right for Word Alignment: {LEAF}},
booktitle = {Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)},
pages = {51--60},
url = {
http://www.aclweb.org/anthology/D/D07/D07-1006},
year = 2007
}
Fraser and Marcu (2007) extend their generative model, which allows many-to-many alignments, with a discriminative optimization step that uses small amounts of labeled data.
Discriminative approaches may use the perceptron algorithm
Moore, Robert C. (2005):
A Discriminative Framework for Bilingual Word Alignment, Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing
@InProceedings{moore:2005:HLTEMNLP,
author = {Moore, Robert C.},
title = {A Discriminative Framework for Bilingual Word Alignment},
booktitle = {Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing},
month = {October},
address = {Vancouver, British Columbia, Canada},
publisher = {Association for Computational Linguistics},
pages = {81--88},
url = {
http://www.aclweb.org/anthology/H/H05/H05-1011},
year = 2005
}
(Moore, 2005;
Moore, Robert C. and Yih, Wen-tau and Bode, Andreas (2006):
Improved Discriminative Bilingual Word Alignment, Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics
@InProceedings{moore-yih-bode:2006:COLACL,
author = {Moore, Robert C. and Yih, Wen-tau and Bode, Andreas},
title = {Improved Discriminative Bilingual Word Alignment},
booktitle = {Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics},
month = {July},
address = {Sydney, Australia},
publisher = {Association for Computational Linguistics},
pages = {513--520},
url = {
http://www.aclweb.org/anthology/P/P06/P06-1065},
year = 2006
}
Moore et al., 2006), maximum entropy models
Ittycheriah, Abraham and Roukos, Salim (2005):
A Maximum Entropy Word Aligner for Arabic-English Machine Translation, Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing
@InProceedings{ittycheriah-roukos:2005:HLTEMNLP,
author = {Ittycheriah, Abraham and Roukos, Salim},
title = {A Maximum Entropy Word Aligner for {Arabic-English} Machine Translation},
booktitle = {Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing},
month = {October},
address = {Vancouver, British Columbia, Canada},
publisher = {Association for Computational Linguistics},
pages = {89--96},
url = {
http://www.aclweb.org/anthology/H/H05/H05-1012},
year = 2005
}
(Ittycheriah and Roukos, 2005), neural networks
Ayan, Necip Fazil and Dorr, Bonnie J. and Monz, Christof (2005):
NeurAlign: Combining Word Alignments Using Neural Networks, Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing
@InProceedings{ayan-dorr-monz:2005:HLTEMNLP1,
author = {Ayan, Necip Fazil and Dorr, Bonnie J. and Monz, Christof},
title = {{NeurAlign}: Combining Word Alignments Using Neural Networks},
booktitle = {Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing},
month = {October},
address = {Vancouver, British Columbia, Canada},
publisher = {Association for Computational Linguistics},
pages = {65--72},
url = {
http://www.aclweb.org/anthology/H/H05/H05-1009},
year = 2005
}
(Ayan et al., 2005), max-margin methods
Taskar, Ben and Lacoste-Julien, Simon and Klein, Dan (2005):
A Discriminative Matching Approach to Word Alignment, Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing
@InProceedings{taskar-simon-dan:2005:HLTEMNLP,
author = {Taskar, Ben and Lacoste-Julien, Simon and Klein, Dan},
title = {A Discriminative Matching Approach to Word Alignment},
booktitle = {Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing},
month = {October},
address = {Vancouver, British Columbia, Canada},
publisher = {Association for Computational Linguistics},
pages = {73--80},
url = {
http://www.aclweb.org/anthology/H/H05/H05-1010},
year = 2005
}
(Taskar et al., 2005), boosting
Hua Wu and Haifeng Wang (2005):
Boosting Statistical Word Alignment, Proceedings of the Tenth Machine Translation Summit (MT Summit X)
@InProceedings{Wu:2005:MTS,
author = {Hua Wu and Haifeng Wang},
title = {Boosting Statistical Word Alignment},
url = {
http://mt-archive.info/MTS-2005-Wu-1.pdf},
googlescholar = {10352336439493088695},
booktitle = {Proceedings of the Tenth Machine Translation Summit (MT Summit X)},
month = {September},
address = {Phuket, Thailand},
year = 2005
}
(Wu and Wang, 2005;
Wu, Hua and Wang, Haifeng and Liu, Zhanyi (2006):
Boosting Statistical Word Alignment Using Labeled and Unlabeled Data, Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions
@InProceedings{wu-wang-liu:2006:POS,
author = {Wu, Hua and Wang, Haifeng and Liu, Zhanyi},
title = {Boosting Statistical Word Alignment Using Labeled and Unlabeled Data},
booktitle = {Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions},
month = {July},
address = {Sydney, Australia},
publisher = {Association for Computational Linguistics},
pages = {913--920},
url = {
http://www.aclweb.org/anthology/P/P06/P06-2117},
year = 2006
}
Wu et al., 2006), support vector machines
Cherry, Colin and Lin, Dekang (2006):
Soft Syntactic Constraints for Word Alignment through Discriminative Training, Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions
@InProceedings{cherry-lin:2006:POS,
author = {Cherry, Colin and Lin, Dekang},
title = {Soft Syntactic Constraints for Word Alignment through Discriminative Training},
booktitle = {Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions},
month = {July},
address = {Sydney, Australia},
publisher = {Association for Computational Linguistics},
pages = {105--112},
url = {
http://www.aclweb.org/anthology/P/P06/P06-2014},
year = 2006
}
(Cherry and Lin, 2006), conditional random fields
Blunsom, Phil and Cohn, Trevor (2006):
Discriminative Word Alignment with Conditional Random Fields, Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics
@InProceedings{blunsom-cohn:2006:COLACL,
author = {Blunsom, Phil and Cohn, Trevor},
title = {Discriminative Word Alignment with Conditional Random Fields},
booktitle = {Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics},
month = {July},
address = {Sydney, Australia},
publisher = {Association for Computational Linguistics},
pages = {65--72},
url = {
http://www.aclweb.org/anthology/P/P06/P06-1009},
year = 2006
}
(Blunsom and Cohn, 2006;
Niehues, Jan and Vogel, Stephan (2008):
Discriminative Word Alignment via Alignment Matrix Modeling, Proceedings of the Third Workshop on Statistical Machine Translation
@InProceedings{niehues-vogel:2008:WMT,
author = {Niehues, Jan and Vogel, Stephan},
title = {Discriminative Word Alignment via Alignment Matrix Modeling},
booktitle = {Proceedings of the Third Workshop on Statistical Machine Translation},
month = {June},
address = {Columbus, Ohio},
publisher = {Association for Computational Linguistics},
pages = {18--25},
url = {
http://www.aclweb.org/anthology/W/W08/W08-0303},
year = 2008
}
Niehues and Vogel, 2008) or MIRA
Venkatapathy, Sriram and Joshi, Aravind (2007):
Discriminative word alignment by learning the alignment structure and syntactic divergence between a language pair, Proceedings of SSST, NAACL-HLT 2007 / AMTA Workshop on Syntax and Structure in Statistical Translation
@InProceedings{venkatapathy-joshi:2007:SSST,
author = {Venkatapathy, Sriram and Joshi, Aravind},
title = {Discriminative word alignment by learning the alignment structure and syntactic divergence between a language pair},
booktitle = {Proceedings of SSST, NAACL-HLT 2007 / AMTA Workshop on Syntax and Structure in Statistical Translation},
month = {April},
address = {Rochester, New York},
publisher = {Association for Computational Linguistics},
pages = {49--56},
url = {
http://www.aclweb.org/anthology/W/W07/W07-0407},
year = 2007
}
(Venkatapathy and Joshi, 2007).
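As a rough illustration of this setup, the following sketch gathers Dice association scores from an unlabeled bitext and uses them as one feature of a linear model trained with perceptron updates on a few hand-aligned sentences. It is a simplification and not the method of any cited paper: every candidate link is classified independently, and the function names, the three features, and the zero decision threshold are all assumptions made for this example.

```python
from collections import Counter
from itertools import product


def dice_table(bitext):
    """Dice association scores from an unlabeled, sentence-aligned corpus."""
    src_count, tgt_count, pair_count = Counter(), Counter(), Counter()
    for src, tgt in bitext:
        src_types, tgt_types = set(src), set(tgt)
        src_count.update(src_types)
        tgt_count.update(tgt_types)
        pair_count.update(product(src_types, tgt_types))
    return {
        (s, t): 2.0 * c / (src_count[s] + tgt_count[t])
        for (s, t), c in pair_count.items()
    }


def features(src, tgt, i, j, dice):
    """Hypothetical features for a candidate link between src[i] and tgt[j]."""
    distortion = abs(i / len(src) - j / len(tgt))  # relative position gap
    return [1.0, dice.get((src[i], tgt[j]), 0.0), distortion]


def score(weights, feats):
    return sum(w * f for w, f in zip(weights, feats))


def train_perceptron(labeled, dice, epochs=10):
    """Perceptron updates over independently classified candidate links."""
    weights = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for src, tgt, gold in labeled:
            for i, j in product(range(len(src)), range(len(tgt))):
                feats = features(src, tgt, i, j, dice)
                predicted = score(weights, feats) > 0
                target = (i, j) in gold
                if predicted != target:
                    sign = 1.0 if target else -1.0
                    weights = [w + sign * f for w, f in zip(weights, feats)]
    return weights


def align(src, tgt, weights, dice):
    """Keep every candidate link whose score is positive."""
    return {
        (i, j)
        for i, j in product(range(len(src)), range(len(tgt)))
        if score(weights, features(src, tgt, i, j, dice)) > 0
    }
```

Scoring links in isolation keeps the sketch short, but it cannot express features that depend on several links at once, such as fertility; those require a search over whole alignments.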
Such methods allow the integration of features such as a more flexible fertility model and interactions between consecutive words
Lacoste-Julien, Simon and Taskar, Ben and Klein, Dan and Jordan, Michael I. (2006):
Word Alignment via Quadratic Assignment, Proceedings of the Human Language Technology Conference of the NAACL, Main Conference
@InProceedings{lacostejulien-EtAl:2006:HLT-NAACL06-Main,
author = {Lacoste-Julien, Simon and Taskar, Ben and Klein, Dan and Jordan, Michael I.},
title = {Word Alignment via Quadratic Assignment},
booktitle = {Proceedings of the Human Language Technology Conference of the NAACL, Main Conference},
month = {June},
address = {New York City, USA},
publisher = {Association for Computational Linguistics},
pages = {112--119},
url = {
http://www.aclweb.org/anthology/N/N06/N06-1015},
year = 2006
}
(Lacoste-Julien et al., 2006). Smaller parallel corpora in particular benefit from more attention to less frequent words
Yujie Zhang and Qun Liu and Qing Ma and Hitoshi Isahara (2005):
A Multi-aligner for Japanese-Chinese Parallel Corpora, Proceedings of the Tenth Machine Translation Summit (MT Summit X)
@InProceedings{Zhang:2005b:MTS,
author = {Yujie Zhang and Qun Liu and Qing Ma and Hitoshi Isahara},
title = {A Multi-aligner for {J}apanese-{C}hinese Parallel Corpora},
url = {
http://www.mt-archive.info/MTS-2005-Zhang-2.pdf},
googlescholar = {4005956388130000346},
booktitle = {Proceedings of the Tenth Machine Translation Summit (MT Summit X)},
month = {September},
address = {Phuket, Thailand},
year = 2005
}
(Zhang et al., 2005). Discriminative models also make it possible to add further features, such as ITG constraints
Wen-Han Chao and Zhou-Jun Li (2007):
Incorporating Constituent Structure Constraint into Discriminative Word Alignment, Proceedings of the MT Summit XI
@inproceedings{Chao:2007:MTSummit,
author = {Wen-Han Chao and Zhou-Jun Li},
title = {Incorporating Constituent Structure Constraint into Discriminative Word Alignment},
url = {
http://www.mt-archive.info/MTS-2007-Chao.pdf},
googlescholar = {4105768373208245828},
booktitle = {Proceedings of the {MT} Summit XI},
year = 2007
}
(Chao and Li, 2007).
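For one-to-one alignments, an ITG constraint can be checked directly: a permutation of target positions can be produced by a binary ITG with straight and inverted rules exactly when it avoids the patterns 2413 and 3142. The brute-force checker below is a sketch under that one-to-one assumption; the function name is hypothetical, and it is not the constraint formulation of Chao and Li (2007).

```python
from itertools import combinations

# Zero-based rank patterns corresponding to 2413 and 3142.
FORBIDDEN = {(1, 3, 0, 2), (2, 0, 3, 1)}


def satisfies_itg(perm):
    """perm[i] is the target position aligned to source position i."""
    for positions in combinations(range(len(perm)), 4):
        values = [perm[p] for p in positions]
        ordered = sorted(values)
        # Relative order of the four chosen values, as ranks 0..3.
        pattern = tuple(ordered.index(v) for v in values)
        if pattern in FORBIDDEN:
            return False
    return True
```

The check examines every four-element subsequence, so it is only practical for short sentences; it is meant to show the constraint, not to be efficient.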
Related to the discriminative approach, posterior methods use agreement in the n-best alignments to adjust alignment points
Kumar, Shankar and Byrne, William (2002):
Minimum Bayes-Risk Word Alignments of Bilingual Texts, Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP)
@inproceedings{Kumar:2002,
author = {Kumar, Shankar and Byrne, William},
title = {Minimum {B}ayes-Risk Word Alignments of Bilingual Texts},
url = {
http://acl.ldc.upenn.edu/acl2002/EMNLP/pdfs/EMNLP244.pdf},
googlescholar = {9528366540531076980},
booktitle = {Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP)},
month = {July},
address = {Philadelphia},
publisher = {Association for Computational Linguistics},
pages = {140--147},
year = 2002
}
(Kumar and Byrne, 2002).
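One simple way to picture agreement over n-best alignments is link-posterior thresholding, sketched below. The n-best list format, the function names, and the default threshold of 0.5 are assumptions for illustration; this is not the minimum Bayes-risk decoder of Kumar and Byrne (2002), which is defined in terms of alignment loss functions.

```python
from collections import defaultdict


def link_posteriors(nbest):
    """Estimate link posteriors from an n-best list.

    nbest is a list of (alignment, weight) pairs, where alignment is a set
    of (source_index, target_index) links and weight is non-negative.
    """
    total = sum(weight for _, weight in nbest)
    posteriors = defaultdict(float)
    for alignment, weight in nbest:
        for link in alignment:
            posteriors[link] += weight / total
    return dict(posteriors)


def consensus_alignment(nbest, threshold=0.5):
    """Keep the links whose estimated posterior exceeds the threshold."""
    return {
        link
        for link, posterior in link_posteriors(nbest).items()
        if posterior > threshold
    }
```

Raising the threshold trades recall for precision, since only links supported by most of the weighted n-best entries survive.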
Benchmarks
Discussion
Related Topics
New Publications
Nadi Tomeh and Alexandre Allauzen and François Yvon and Guillaume Wisniewski (2010):
Refining Word Alignment with Discriminative Training, Proceedings of the Ninth Conference of the Association for Machine Translation in the Americas
@inproceedings{AMTA-2010-Tomeh,
author = {Nadi Tomeh and Alexandre Allauzen and Fran{\c{c}}ois Yvon and Guillaume Wisniewski},
title = {Refining Word Alignment with Discriminative Training},
url = {
http://www.mt-archive.info/AMTA-2010-Tomeh.pdf},
booktitle = {Proceedings of the Ninth Conference of the Association for Machine Translation in the Americas},
location = {Denver, Colorado},
year = 2010
}
Tomeh et al. (2010)
Yang Liu and Qun Liu and Shouxun Lin (2010):
Discriminative Word Alignment by Linear Modeling, Computational Linguistics
@Article{CL:2010-3002,
author = {Yang Liu and Qun Liu and Shouxun Lin},
title = {Discriminative Word Alignment by Linear Modeling},
journal = {Computational Linguistics},
volume = {36},
number = {3},
url = {
http://aclweb.org/anthology-new/J/J10/J10-3002.pdf},
year = 2010
}
Liu et al. (2010)
Liu, Yang and Xia, Tian and Xiao, Xinyan and Liu, Qun (2009):
Weighted Alignment Matrices for Statistical Machine Translation, Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing
@InProceedings{liu-EtAl:2009:EMNLP3,
author = {Liu, Yang and Xia, Tian and Xiao, Xinyan and Liu, Qun},
title = {Weighted Alignment Matrices for Statistical Machine Translation},
booktitle = {Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing},
month = {August},
address = {Singapore},
publisher = {Association for Computational Linguistics},
pages = {1017--1026},
url = {
http://www.aclweb.org/anthology/D/D09/D09-1106},
year = 2009
}
Liu et al. (2009)
Setiawan, Hendra and Dyer, Chris and Resnik, Philip (2010):
Discriminative Word Alignment with a Function Word Reordering Model, Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
@InProceedings{setiawan-dyer-resnik:2010:EMNLP,
author = {Setiawan, Hendra and Dyer, Chris and Resnik, Philip},
title = {Discriminative Word Alignment with a Function Word Reordering Model},
booktitle = {Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing},
month = {October},
address = {Cambridge, MA},
publisher = {Association for Computational Linguistics},
pages = {534--544},
url = {
http://www.aclweb.org/anthology/D/D10/D10-1052},
year = 2010
}
Setiawan et al. (2010)
Dyer, Chris and Clark, Jonathan H. and Lavie, Alon and Smith, Noah A. (2011):
Unsupervised Word Alignment with Arbitrary Features, Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies
@InProceedings{dyer-EtAl:2011:ACL-HLT2011,
author = {Dyer, Chris and Clark, Jonathan H. and Lavie, Alon and Smith, Noah A.},
title = {Unsupervised Word Alignment with Arbitrary Features},
booktitle = {Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies},
month = {June},
address = {Portland, Oregon, USA},
publisher = {Association for Computational Linguistics},
pages = {409--419},
url = {
http://www.aclweb.org/anthology/P11-1042},
year = 2011
}
Dyer et al. (2011)