Training Data for Transliteration
Since ready-made transliteration examples typically do not exist, significant effort has gone into collecting such data.
Transliteration Training Data is the main subject of 43 publications. 18 are discussed here.
Publications
Training data may be collected from parallel corpora
Chun-Jen Lee and Jason S. Chang (2003):
Acquisition of English-Chinese Transliterated Word Pairs from Parallel-Aligned Texts using a Statistical Machine Transliteration Model, HLT-NAACL 2003 Workshop: Building and Using Parallel Texts: Data Driven Machine Translation and Beyond
@inproceedings{Lee:2003,
author = {Chun-Jen Lee and Jason S. Chang},
title = {Acquisition of {English-Chinese} Transliterated Word Pairs from Parallel-Aligned Texts using a Statistical Machine Transliteration Model},
url = {http://acl.ldc.upenn.edu/W/W03/W03-0317.pdf},
booktitle = {HLT-NAACL 2003 Workshop: Building and Using Parallel Texts: Data Driven Machine Translation and Beyond},
editor = {Rada Mihalcea and Ted Pedersen},
month = {May 31},
address = {Edmonton, Alberta, Canada},
publisher = {Association for Computational Linguistics},
year = 2003
}
(Lee and Chang, 2003;
Chun-Jen Lee and Jason S. Chang and Thomas C. Chuang (2004):
Alignment of bilingual named entities in parallel corpora using statistical model, Proceedings of the 6th Conference of the Association for Machine Translation in the Americas (AMTA 2004)
@inproceedings{lee:2004:AMTA,
author = {Chun-Jen Lee and Jason S. Chang and Thomas C. Chuang},
title = {Alignment of bilingual named entities in parallel corpora using statistical model},
booktitle = {Proceedings of the 6th Conference of the Association for Machine Translation in the Americas (AMTA 2004)},
pages = {144--153},
year = 2004
}
Lee et al., 2004), or by mining comparable data such as news streams
Klementiev, Alexandre and Roth, Dan (2006):
Weakly Supervised Named Entity Transliteration and Discovery from Multilingual Comparable Corpora, Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics
@InProceedings{klementiev-roth:2006:COLACL,
author = {Klementiev, Alexandre and Roth, Dan},
title = {Weakly Supervised Named Entity Transliteration and Discovery from Multilingual Comparable Corpora},
booktitle = {Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics},
month = {July},
address = {Sydney, Australia},
publisher = {Association for Computational Linguistics},
pages = {817--824},
url = {http://www.aclweb.org/anthology/P/P06/P06-1103},
year = 2006
}
(Klementiev and Roth, 2006;
Klementiev, Alexandre and Roth, Dan (2006):
Named Entity Transliteration and Discovery from Multilingual Comparable Corpora, Proceedings of the Human Language Technology Conference of the NAACL, Main Conference
@InProceedings{klementiev-roth:2006:HLT-NAACL06-Main,
author = {Klementiev, Alexandre and Roth, Dan},
title = {Named Entity Transliteration and Discovery from Multilingual Comparable Corpora},
booktitle = {Proceedings of the Human Language Technology Conference of the NAACL, Main Conference},
month = {June},
address = {New York City, USA},
publisher = {Association for Computational Linguistics},
pages = {82--88},
url = {http://www.aclweb.org/anthology/N/N06/N06-1011},
year = 2006
}
Klementiev and Roth, 2006b). Training data for transliteration may also be obtained from monolingual text where the spelling of a foreign name is followed by its native form in parentheses
Tracy Lin and Jian-Cheng Wu and Jason S. Chang (2004):
Extraction of name and transliteration in monolingual and parallel corpora, Proceedings of the 6th Conference of the Association for Machine Translation in the Americas (AMTA 2004)
@inproceedings{lin:2004:AMTA,
author = {Tracy Lin and Jian-Cheng Wu and Jason S. Chang},
title = {Extraction of name and transliteration in monolingual and parallel corpora},
booktitle = {Proceedings of the 6th Conference of the Association for Machine Translation in the Americas (AMTA 2004)},
pages = {177--186},
year = 2004
}
(Lin et al., 2004;
Chen, Conrad and Chen, Hsin-Hsi (2006):
A High-Accurate Chinese-English NE Backward Translation System Combining Both Lexical Information and Web Statistics, Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions
@InProceedings{chen-chen:2006:POS,
author = {Chen, Conrad and Chen, Hsin-Hsi},
title = {A High-Accurate {Chinese-English} {NE} Backward Translation System Combining Both Lexical Information and Web Statistics},
booktitle = {Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions},
month = {July},
address = {Sydney, Australia},
publisher = {Association for Computational Linguistics},
pages = {81--88},
url = {http://www.aclweb.org/anthology/P/P06/P06-2011},
year = 2006
}
Chen and Chen, 2006;
Lin, Dekang and Zhao, Shaojun and Van Durme, Benjamin and Paşca, Marius (2008):
Mining Parenthetical Translations from the Web by Word Alignment, Proceedings of ACL-08: HLT
@InProceedings{lin-EtAl:2008:ACLMain,
author = {Lin, Dekang and Zhao, Shaojun and Van Durme, Benjamin and Pa\c{s}ca, Marius},
title = {Mining Parenthetical Translations from the Web by Word Alignment},
booktitle = {Proceedings of ACL-08: HLT},
month = {June},
address = {Columbus, Ohio},
publisher = {Association for Computational Linguistics},
pages = {994--1002},
url = {http://www.aclweb.org/anthology/P/P08/P08-1113},
year = 2008
}
Lin et al., 2008), which is common, for instance, for unusual English names in Chinese text. Such acquisition may be improved by bootstrapping, that is, by iteratively extracting high-confidence pairs and improving the matching model
Sherif, Tarek and Kondrak, Grzegorz (2007):
Bootstrapping a Stochastic Transducer for Arabic-English Transliteration Extraction, Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics
@InProceedings{sherif-kondrak:2007:ACLMain1,
author = {Sherif, Tarek and Kondrak, Grzegorz},
title = {Bootstrapping a Stochastic Transducer for {A}rabic-{E}nglish Transliteration Extraction},
booktitle = {Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics},
month = {June},
address = {Prague, Czech Republic},
publisher = {Association for Computational Linguistics},
pages = {864--871},
url = {http://www.aclweb.org/anthology/P/P07/P07-1109},
year = 2007
}
(Sherif and Kondrak, 2007).
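The parenthetical pattern described above can be illustrated with a minimal sketch. It assumes Chinese text in which a candidate transliteration is immediately followed by the original Latin-script name in parentheses. The regular expression, the function name `extract_parenthetical_pairs`, and the `max_len` cutoff are illustrative assumptions and are not taken from any of the cited papers.

```python
import re

# Assumed pattern: a run of CJK characters, then a Latin-script name in
# ASCII or full-width parentheses.
PAREN_PAIR = re.compile(
    r"([\u4e00-\u9fff]+)\s*[(（]\s*([A-Za-z][A-Za-z .'\-]*[A-Za-z])\s*[)）]"
)


def extract_parenthetical_pairs(text, max_len=6):
    """Return (candidate transliteration, original name) pairs found in text."""
    pairs = []
    for match in PAREN_PAIR.finditer(text):
        chinese_run, latin_name = match.groups()
        # The left boundary of the transliteration is unknown, so keep only
        # the last max_len characters as a crude candidate (an assumption).
        pairs.append((chinese_run[-max_len:], latin_name))
    return pairs


sample = "加州前州长，施瓦辛格(Schwarzenegger)，出席了活动。"
print(extract_parenthetical_pairs(sample))
```

Such raw candidates are noisy, mainly because the left boundary of the transliterated name is not marked in the text. A matching model, possibly refined by bootstrapping, would be needed to filter them.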
Sproat, Richard and Tao, Tao and Zhai, ChengXiang (2006):
Named Entity Transliteration with Comparable Corpora, Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics
@InProceedings{sproat-tao-zhai:2006:COLACL,
author = {Sproat, Richard and Tao, Tao and Zhai, ChengXiang},
title = {Named Entity Transliteration with Comparable Corpora},
booktitle = {Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics},
month = {July},
address = {Sydney, Australia},
publisher = {Association for Computational Linguistics},
pages = {73--80},
url = {http://www.aclweb.org/anthology/P/P06/P06-1010},
year = 2006
}
Sproat et al. (2006) mine name transliterations from comparable corpora, also using phonetic correspondences.
Tao, Tao and Yoon, Su-Youn and Fister, Andrew and Sproat, Richard and Zhai, ChengXiang (2006):
Unsupervised Named Entity Transliteration Using Temporal and Phonetic Correlation, Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing
@InProceedings{tao-EtAl:2006:EMNLP,
author = {Tao, Tao and Yoon, Su-Youn and Fister, Andrew and Sproat, Richard and Zhai, ChengXiang},
title = {Unsupervised Named Entity Transliteration Using Temporal and Phonetic Correlation},
booktitle = {Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing},
month = {July},
address = {Sydney, Australia},
publisher = {Association for Computational Linguistics},
pages = {250--257},
url = {http://www.aclweb.org/anthology/W/W06/W06-1630},
year = 2006
}
Tao et al. (2006) additionally exploit the temporal distributions of name mentions, and
Yoon, Su-Youn and Kim, Kyoung-Young and Sproat, Richard (2007):
Multilingual Transliteration Using Feature based Phonetic Method, Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics
@InProceedings{yoon-kim-sproat:2007:ACLMain,
author = {Yoon, Su-Youn and Kim, Kyoung-Young and Sproat, Richard},
title = {Multilingual Transliteration Using Feature based Phonetic Method},
booktitle = {Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics},
month = {June},
address = {Prague, Czech Republic},
publisher = {Association for Computational Linguistics},
pages = {112--119},
url = {http://www.aclweb.org/anthology/P/P07/P07-1015},
year = 2007
}
Yoon et al. (2007) use a Winnow algorithm and a classifier to bootstrap the acquisition process.
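The idea of using temporal distributions of name mentions can be sketched as follows: a source name and its transliteration should be mentioned on roughly the same days in comparable news streams. The sketch below ranks candidates by the Pearson correlation of their per-day mention counts. The choice of Pearson correlation, the function names, and the toy counts are assumptions for illustration, not the scoring function of any cited paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equally long count series."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("series must be non-empty and of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series carries no temporal signal
    return cov / sqrt(var_x * var_y)


def rank_by_temporal_correlation(source_counts, candidate_counts):
    """Sort candidate names by correlation with the source name's series."""
    scored = [
        (name, pearson(source_counts, counts))
        for name, counts in candidate_counts.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical per-day mention counts over five days.
source = [0, 5, 9, 1, 0]
candidates = {"cand_a": [1, 6, 10, 2, 1], "cand_b": [4, 4, 3, 5, 4]}
ranking = rank_by_temporal_correlation(source, candidates)
print(ranking)
```

In practice such a temporal score would be combined with a phonetic similarity score, since many unrelated words also co-occur in time.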
Guihong Cao and Jianfeng Gao and Jian-Yun Nie (2007):
A System to Mine Large-Scale Bilingual Dictionaries from Monolingual Web Pages, Proceedings of the MT Summit XI
@inproceedings{Cao:2007:MTSummit,
author = {Guihong Cao and Jianfeng Gao and Jian-Yun Nie},
title = {A System to Mine Large-Scale Bilingual Dictionaries from Monolingual Web Pages},
url = {http://mt-archive.info/MTS-2007-Cao.pdf},
googlescholar = {15558836676960243186},
booktitle = {Proceedings of the {MT} Summit XI},
year = 2007
}
Cao et al. (2007) use various features in a perceptron classifier, including whether a Chinese character is a priori likely to be part of a transliteration. Large monolingual corpus resources such as the web are used for validation
Al-Onaizan, Yaser and Knight, Kevin (2002):
Machine Transliteration of Names in Arabic Texts, Proceedings of the Workshop on Computational Approaches to Semitic Languages
@inproceedings{Al-Onaizan:2002b,
author = {Al-Onaizan, Yaser and Knight, Kevin},
title = {Machine Transliteration of Names in {Arabic} Texts},
booktitle = {Proceedings of the Workshop on Computational Approaches to Semitic Languages},
month = {July},
address = {Philadelphia},
publisher = {Association for Computational Linguistics},
pages = {34--46},
year = 2002
}
(Al-Onaizan and Knight, 2002;
Yaser Al-Onaizan and Kevin Knight (2002):
Translating Named Entities Using Monolingual and Bilingual Resources, Proceedings of the 40th Annual Meeting of the Association of Computational Linguistics (ACL)
@Inproceedings{Al-Onaizan:2002,
author = {Yaser Al-Onaizan and Kevin Knight},
title = {Translating Named Entities Using Monolingual and Bilingual Resources},
url = {http://acl.ldc.upenn.edu/acl2002/MAIN/pdfs/Main209.pdf},
booktitle = {Proceedings of the 40th Annual Meeting of the Association of Computational Linguistics (ACL)},
year = 2002
}
Al-Onaizan and Knight, 2002b;
Qu, Yan and Grefenstette, Gregory (2004):
Finding Ideographic Representations of Japanese Names Written in Latin Script via Language Identification and Corpus Validation, Proceedings of the 42nd Meeting of the Association for Computational Linguistics (ACL'04), Main Volume
@inproceedings{Qu:2004,
author = {Qu, Yan and Grefenstette, Gregory},
title = {Finding Ideographic Representations of {J}apanese Names Written in Latin Script via Language Identification and Corpus Validation},
url = {http://acl.ldc.upenn.edu/acl2004/main/pdf/305\_pdf\_2-col.pdf},
googlescholar = {17933666699661624244},
booktitle = {Proceedings of the 42nd Meeting of the Association for Computational Linguistics (ACL'04), Main Volume},
month = {July},
address = {Barcelona, Spain},
pages = {183--190},
year = 2004
}
Qu and Grefenstette, 2004;
Kuo, Jin-Shea and Li, Haizhou and Yang, Ying-Kuei (2006):
Learning Transliteration Lexicons from the Web, Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics
@InProceedings{kuo-li-yang:2006:COLACL,
author = {Kuo, Jin-Shea and Li, Haizhou and Yang, Ying-Kuei},
title = {Learning Transliteration Lexicons from the Web},
booktitle = {Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics},
month = {July},
address = {Sydney, Australia},
publisher = {Association for Computational Linguistics},
pages = {1129--1136},
url = {http://www.aclweb.org/anthology/P/P06/P06-1142},
year = 2006
}
Kuo et al., 2006;
Yang, Fan and Zhao, Jun and Zou, Bo and Liu, Kang and Liu, Feifan (2008):
Chinese-English Backward Transliteration Assisted with Mining Monolingual Web Pages, Proceedings of ACL-08: HLT
@InProceedings{yang-EtAl:2008:ACLMain1,
author = {Yang, Fan and Zhao, Jun and Zou, Bo and Liu, Kang and Liu, Feifan},
title = {{Chinese}-{English} Backward Transliteration Assisted with Mining Monolingual Web Pages},
booktitle = {Proceedings of ACL-08: HLT},
month = {June},
address = {Columbus, Ohio},
publisher = {Association for Computational Linguistics},
pages = {541--549},
url = {http://www.aclweb.org/anthology/P/P08/P08-1062},
year = 2008
}
Yang et al., 2008). Of course, training data may also be manually created, possibly aided by an active learning component that suggests the most valuable new examples
Goldwasser, Dan and Roth, Dan (2008):
Active Sample Selection for Named Entity Transliteration, Proceedings of ACL-08: HLT, Short Papers
@InProceedings{goldwasser-roth:2008:ACLShort,
author = {Goldwasser, Dan and Roth, Dan},
title = {Active Sample Selection for Named Entity Transliteration},
booktitle = {Proceedings of ACL-08: HLT, Short Papers},
month = {June},
address = {Columbus, Ohio},
publisher = {Association for Computational Linguistics},
pages = {53--56},
url = {http://www.aclweb.org/anthology/P/P08/P08-2014},
year = 2008
}
(Goldwasser and Roth, 2008).
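How an active learning component might suggest valuable examples can be sketched with generic uncertainty sampling: the candidate pairs whose current model score is closest to 0.5 are sent to the annotator first. This is a standard active learning heuristic shown under stated assumptions, not necessarily the selection criterion of the cited paper. The function name, the score format, and the toy data are hypothetical.

```python
def select_for_annotation(scored_pairs, budget):
    """Pick the `budget` pairs whose scores are closest to 0.5.

    scored_pairs: list of ((source, target), probability) tuples, where the
    probability is the current model's belief that the pair is a
    transliteration (an assumed input format).
    """
    by_uncertainty = sorted(scored_pairs, key=lambda item: abs(item[1] - 0.5))
    return [pair for pair, _ in by_uncertainty[:budget]]


# Hypothetical candidate pairs with model scores.
pool = [
    (("src_a", "tgt_a"), 0.95),
    (("src_b", "tgt_b"), 0.52),
    (("src_c", "tgt_c"), 0.10),
    (("src_d", "tgt_d"), 0.40),
]
to_label = select_for_annotation(pool, budget=2)
print(to_label)
```

After the selected pairs are labeled, the model would be retrained and the remaining pool rescored before the next round.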
Benchmarks
Discussion
Related Topics
New Publications
You, Gae-won and Cha, Young-rok and Kim, Jinhan and Hwang, Seung-won (2013):
Enriching Entity Translation Discovery using Selective Temporality, Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
@InProceedings{you-EtAl:2013:Short,
author = {You, Gae-won and Cha, Young-rok and Kim, Jinhan and Hwang, Seung-won},
title = {Enriching Entity Translation Discovery using Selective Temporality},
booktitle = {Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)},
month = {August},
address = {Sofia, Bulgaria},
publisher = {Association for Computational Linguistics},
pages = {201--205},
url = {http://www.aclweb.org/anthology/P13-2036},
year = 2013
}
You et al. (2013)
Kunchukuttan, Anoop and Bhattacharyya, Pushpak (2015):
Data representation methods and use of mined corpora for Indian language transliteration, Proceedings of the Fifth Named Entity Workshop
@InProceedings{kunchukuttan-bhattacharyya:2015:NEWS2015,
author = {Kunchukuttan, Anoop and Bhattacharyya, Pushpak},
title = {Data representation methods and use of mined corpora for Indian language transliteration},
booktitle = {Proceedings of the Fifth Named Entity Workshop},
month = {July},
address = {Beijing, China},
publisher = {Association for Computational Linguistics},
pages = {78--82},
url = {http://www.aclweb.org/anthology/W15-3912},
year = 2015
}
Kunchukuttan and Bhattacharyya (2015)
Richardson, John and Nakazawa, Toshiaki and Kurohashi, Sadao (2013):
Robust Transliteration Mining from Comparable Corpora with Bilingual Topic Models, Proceedings of the Sixth International Joint Conference on Natural Language Processing
@InProceedings{richardson-nakazawa-kurohashi:2013:IJCNLP,
author = {Richardson, John and Nakazawa, Toshiaki and Kurohashi, Sadao},
title = {Robust Transliteration Mining from Comparable Corpora with Bilingual Topic Models},
booktitle = {Proceedings of the Sixth International Joint Conference on Natural Language Processing},
month = {October},
address = {Nagoya, Japan},
publisher = {Asian Federation of Natural Language Processing},
pages = {261--269},
url = {http://www.aclweb.org/anthology/I13-1030},
year = 2013
}
Richardson et al. (2013)
Yufeng Chen and Chengqing Zong and Keh-Yih Su (2013):
A Joint Model to Identify and Align Bilingual Named Entities, Computational Linguistics
@Article{CL:2013-2001,
author = {Yufeng Chen and Chengqing Zong and Keh-Yih Su},
title = {A Joint Model to Identify and Align Bilingual Named Entities},
journal = {Computational Linguistics},
volume = {39},
number = {2},
url = {http://aclweb.org/anthology-new/J/J13/J13-2001.pdf},
year = 2013
}
Chen et al. (2013)
El-Kahki, Ali and Darwish, Kareem and Abdul-Wahab, Mohamed and Taei, Ahmed (2012):
Transliteration Mining Using Large Training and Test Sets, Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
@InProceedings{elkahki-EtAl:2012:NAACL-HLT,
author = {El-Kahki, Ali and Darwish, Kareem and Abdul-Wahab, Mohamed and Taei, Ahmed},
title = {Transliteration Mining Using Large Training and Test Sets},
booktitle = {Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies},
month = {June},
address = {Montr\'{e}al, Canada},
publisher = {Association for Computational Linguistics},
pages = {243--252},
url = {http://www.aclweb.org/anthology/N12-1025},
year = 2012
}
El-Kahki et al. (2012)
Munro, Robert and Manning, Christopher D. (2012):
Accurate Unsupervised Joint Named-Entity Extraction from Unaligned Parallel Text, Proceedings of the 4th Named Entity Workshop (NEWS) 2012
@InProceedings{munro-manning:2012:NEWS2012,
author = {Munro, Robert and Manning, Christopher D.},
title = {Accurate Unsupervised Joint Named-Entity Extraction from Unaligned Parallel Text},
booktitle = {Proceedings of the 4th Named Entity Workshop (NEWS) 2012},
month = {July},
address = {Jeju, Korea},
publisher = {Association for Computational Linguistics},
pages = {21--29},
url = {http://www.aclweb.org/anthology/W12-4403},
year = 2012
}
Munro and Manning (2012)
Sajjad, Hassan and Fraser, Alexander and Schmid, Helmut (2012):
A Statistical Model for Unsupervised and Semi-supervised Transliteration Mining, Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
@InProceedings{sajjad-fraser-schmid:2012:ACL2012,
author = {Sajjad, Hassan and Fraser, Alexander and Schmid, Helmut},
title = {A Statistical Model for Unsupervised and Semi-supervised Transliteration Mining},
booktitle = {Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
month = {July},
address = {Jeju Island, Korea},
publisher = {Association for Computational Linguistics},
pages = {469--477},
url = {http://www.aclweb.org/anthology/P12-1049},
year = 2012
}
Sajjad et al. (2012)
Walid Aransa and Holger Schwenk and Loic Barrault (2012):
Semi-supervised transliteration mining from parallel and comparable corpora, Proceedings of the seventh International Workshop on Spoken Language Translation (IWSLT)
@inproceedings{iwslt12:Aransa,
author = {Walid Aransa and Holger Schwenk and Loic Barrault},
title = {Semi-supervised transliteration mining from parallel and comparable corpora},
url = {http://www.mt-archive.info/IWSLT-2012-Aransa.pdf},
pages = {185--192},
booktitle = {Proceedings of the seventh International Workshop on Spoken Language Translation (IWSLT)},
location = {Hong Kong},
year = 2012
}
Aransa et al. (2012)
Chang, Ming-Wei and Goldwasser, Dan and Roth, Dan and Tu, Yuancheng (2009):
Unsupervised Constraint Driven Learning For Transliteration Discovery, Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
@InProceedings{chang-EtAl:2009:NAACLHLT09,
author = {Chang, Ming-Wei and Goldwasser, Dan and Roth, Dan and Tu, Yuancheng},
title = {Unsupervised Constraint Driven Learning For Transliteration Discovery},
booktitle = {Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics},
month = {June},
address = {Boulder, Colorado},
publisher = {Association for Computational Linguistics},
pages = {299--307},
url = {http://www.aclweb.org/anthology/N/N09/N09-1034},
year = 2009
}
Chang et al. (2009)
Yang, Fan and Zhao, Jun and Liu, Kang (2009):
A Chinese-English Organization Name Translation System Using Heuristic Web Mining and Asymmetric Alignment, Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP
@InProceedings{yang-zhao-liu:2009:ACLIJCNLP,
author = {Yang, Fan and Zhao, Jun and Liu, Kang},
title = {A Chinese-English Organization Name Translation System Using Heuristic Web Mining and Asymmetric Alignment},
booktitle = {Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP},
month = {August},
address = {Suntec, Singapore},
publisher = {Association for Computational Linguistics},
pages = {387--395},
url = {http://www.aclweb.org/anthology/P/P09/P09-1044},
year = 2009
}
Yang et al. (2009)
You, Gae-won and Hwang, Seung-won and Song, Young-In and Jiang, Long and Nie, Zaiqing (2010):
Mining Name Translations from Entity Graph Mapping, Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
@InProceedings{you-EtAl:2010:EMNLP,
author = {You, Gae-won and Hwang, Seung-won and Song, Young-In and Jiang, Long and Nie, Zaiqing},
title = {Mining Name Translations from Entity Graph Mapping},
booktitle = {Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing},
month = {October},
address = {Cambridge, MA},
publisher = {Association for Computational Linguistics},
pages = {430--439},
url = {http://www.aclweb.org/anthology/D/D10/D10-1042},
year = 2010
}
You et al. (2010)
Ji, Heng (2009):
Mining Name Translations from Comparable Corpora by Creating Bilingual Information Networks, Proceedings of the 2nd Workshop on Building and Using Comparable Corpora: from Parallel to Non-parallel Corpora
@InProceedings{ji:2009:BUCC,
author = {Ji, Heng},
title = {Mining Name Translations from Comparable Corpora by Creating Bilingual Information Networks},
booktitle = {Proceedings of the 2nd Workshop on Building and Using Comparable Corpora: from Parallel to Non-parallel Corpora},
month = {August},
address = {Singapore},
publisher = {Association for Computational Linguistics},
pages = {34--37},
url = {http://www.aclweb.org/anthology/W/W09/W09-3107},
year = 2009
}
Ji (2009)
Chen, Yufeng and Zong, Chengqing and Su, Keh-Yih (2010):
On Jointly Recognizing and Aligning Bilingual Named Entities, Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
@InProceedings{chen-zong-su:2010:ACL,
author = {Chen, Yufeng and Zong, Chengqing and Su, Keh-Yih},
title = {On Jointly Recognizing and Aligning Bilingual Named Entities},
booktitle = {Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics},
month = {July},
address = {Uppsala, Sweden},
publisher = {Association for Computational Linguistics},
pages = {631--639},
url = {http://www.aclweb.org/anthology/P10-1065},
year = 2010
}
Chen et al. (2010)
Udupa, Raghavendra and Saravanan, K and Kumaran, A and Jagarlamudi, Jagadeesh (2009):
MINT: A Method for Effective and Scalable Mining of Named Entity Transliterations from Large Comparable Corpora, Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009)
@InProceedings{udupa-EtAl:2009:EACL,
author = {Udupa, Raghavendra and Saravanan, K and Kumaran, A and Jagarlamudi, Jagadeesh},
title = {{MINT}: A Method for Effective and Scalable Mining of Named Entity Transliterations from Large Comparable Corpora},
booktitle = {Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009)},
month = {March},
address = {Athens, Greece},
publisher = {Association for Computational Linguistics},
pages = {799--807},
url = {http://www.aclweb.org/anthology/E09-1091},
year = 2009
}
Udupa et al. (2009)
Kumaran, A and M. Khapra, Mitesh and Li, Haizhou (2010):
Report of NEWS 2010 Transliteration Mining Shared Task, Proceedings of the 2010 Named Entities Workshop
@InProceedings{kumaran-mkhapra-li:2010:NEWS1,
author = {Kumaran, A and M. Khapra, Mitesh and Li, Haizhou},
title = {Report of {NEWS} 2010 Transliteration Mining Shared Task},
booktitle = {Proceedings of the 2010 Named Entities Workshop},
month = {July},
address = {Uppsala, Sweden},
publisher = {Association for Computational Linguistics},
pages = {21--28},
url = {http://www.aclweb.org/anthology/W10-2403},
year = 2010
}
Kumaran et al. (2010)
Kumaran, A and M. Khapra, Mitesh and Li, Haizhou (2010):
Whitepaper of NEWS 2010 Shared Task on Transliteration Mining, Proceedings of the 2010 Named Entities Workshop
@InProceedings{kumaran-mkhapra-li:2010:NEWS2,
author = {Kumaran, A and M. Khapra, Mitesh and Li, Haizhou},
title = {Whitepaper of {NEWS} 2010 Shared Task on Transliteration Mining},
booktitle = {Proceedings of the 2010 Named Entities Workshop},
month = {July},
address = {Uppsala, Sweden},
publisher = {Association for Computational Linguistics},
pages = {29--38},
url = {http://www.aclweb.org/anthology/W10-2404},
year = 2010
}
Kumaran et al. (2010)
Li, Haizhou and Kumaran, A and Zhang, Min and Pervouchine, Vladimir (2010):
Report of NEWS 2010 Transliteration Generation Shared Task, Proceedings of the 2010 Named Entities Workshop
@InProceedings{li-EtAl:2010:NEWS1,
author = {Li, Haizhou and Kumaran, A and Zhang, Min and Pervouchine, Vladimir},
title = {Report of {NEWS} 2010 Transliteration Generation Shared Task},
booktitle = {Proceedings of the 2010 Named Entities Workshop},
month = {July},
address = {Uppsala, Sweden},
publisher = {Association for Computational Linguistics},
pages = {1--11},
url = {http://www.aclweb.org/anthology/W10-2401},
year = 2010
}
Li et al. (2010)
Li, Haizhou and Kumaran, A and Zhang, Min and Pervouchine, Vladimir (2010):
Whitepaper of NEWS 2010 Shared Task on Transliteration Generation, Proceedings of the 2010 Named Entities Workshop
@InProceedings{li-EtAl:2010:NEWS2,
author = {Li, Haizhou and Kumaran, A and Zhang, Min and Pervouchine, Vladimir},
title = {Whitepaper of {NEWS} 2010 Shared Task on Transliteration Generation},
booktitle = {Proceedings of the 2010 Named Entities Workshop},
month = {July},
address = {Uppsala, Sweden},
publisher = {Association for Computational Linguistics},
pages = {12--20},
url = {http://www.aclweb.org/anthology/W10-2402},
year = 2010
}
Li et al. (2010)
Sajjad, Hassan and Fraser, Alexander and Schmid, Helmut (2011):
An Algorithm for Unsupervised Transliteration Mining with an Application to Word Alignment, Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies
@InProceedings{sajjad-fraser-schmid:2011:ACL-HLT2011,
author = {Sajjad, Hassan and Fraser, Alexander and Schmid, Helmut},
title = {An Algorithm for Unsupervised Transliteration Mining with an Application to Word Alignment},
booktitle = {Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies},
month = {June},
address = {Portland, Oregon, USA},
publisher = {Association for Computational Linguistics},
pages = {430--439},
url = {http://www.aclweb.org/anthology/P11-1044},
year = 2011
}
Sajjad et al. (2011)
El Kahki, Ali and Darwish, Kareem and Saad El Din, Ahmed and Abd El-Wahab, Mohamed and Hefny, Ahmed and Ammar, Waleed (2011):
Improved Transliteration Mining Using Graph Reinforcement, Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing
@InProceedings{elkahki-EtAl:2011:EMNLP,
author = {El Kahki, Ali and Darwish, Kareem and Saad El Din, Ahmed and Abd El-Wahab, Mohamed and Hefny, Ahmed and Ammar, Waleed},
title = {Improved Transliteration Mining Using Graph Reinforcement},
booktitle = {Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing},
month = {July},
address = {Edinburgh, Scotland, UK.},
publisher = {Association for Computational Linguistics},
pages = {1384--1393},
url = {http://www.aclweb.org/anthology/D11-1128},
year = 2011
}
El Kahki et al. (2011)
Freeman, Andrew and Condon, Sherri and Ackerman, Christopher (2006):
Cross Linguistic Name Matching in English and Arabic, Proceedings of the Human Language Technology Conference of the NAACL, Main Conference
@InProceedings{freeman-condon-ackerman:2006:HLT-NAACL06-Main,
author = {Freeman, Andrew and Condon, Sherri and Ackerman, Christopher},
title = {Cross Linguistic Name Matching in {English} and {Arabic}},
booktitle = {Proceedings of the Human Language Technology Conference of the NAACL, Main Conference},
month = {June},
address = {New York City, USA},
publisher = {Association for Computational Linguistics},
pages = {471--478},
url = {http://www.aclweb.org/anthology/N/N06/N06-1060},
year = 2006
}
Freeman et al. (2006)
Wu, Jian-Cheng and Chang, Jason S. (2007):
Learning to Find English to Chinese Transliterations on the Web, Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)
@InProceedings{wu-chang:2007:EMNLP-CoNLL2007,
author = {Wu, Jian-Cheng and Chang, Jason S.},
title = {Learning to Find {E}nglish to {C}hinese Transliterations on the Web},
booktitle = {Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)},
pages = {996--1004},
url = {http://www.aclweb.org/anthology/D/D07/D07-1106},
year = 2007
}
Wu and Chang (2007)
Jong-Hoon Oh and Hitoshi Isahara (2008):
Hypothesis Selection in Machine Transliteration: A Web Mining Approach, Proceedings of the 3rd International Joint Conference on Natural Language Processing (IJCNLP)
@inproceedings{Oh:2008:IJCNLP,
author = {Jong-Hoon Oh and Hitoshi Isahara},
title = {Hypothesis Selection in Machine Transliteration: A Web Mining Approach},
url = {http://www.mt-archive.info/IJCNLP-2008-Oh.pdf},
googlescholar = {15247339783523356815},
booktitle = {Proceedings of the 3rd International Joint Conference on Natural Language Processing (IJCNLP)},
year = 2008
}
Oh and Isahara (2008)
Chengguo Jin and Seung-Hoon Na and Dong-Il Kim and Jong-Hyeok Lee (2008):
Automatic Extraction of English-Chinese Transliteration Pairs using Dynamic Window and Tokenizer, Proceedings of the Sixth SIGHAN Workshop on Chinese Language Processing
@inproceedings{Jin:2008:IJCNLP,
author = {Chengguo Jin and Seung-Hoon Na and Dong-Il Kim and Jong-Hyeok Lee},
title = {Automatic Extraction of {E}nglish-{C}hinese Transliteration Pairs using Dynamic Window and Tokenizer},
url = {http://oldsite.aclweb.org/anthology-new/I/I08/I08-4002.pdf},
googlescholar = {14103457912353076560},
booktitle = {Proceedings of the Sixth SIGHAN Workshop on {Chinese} Language Processing},
year = 2008
}
Jin et al. (2008)
Jin-Shea Kuo and Haizhou Li and Chih-Lung Lin (2008):
Mining Transliterations from Web Query Results: An Incremental Approach, Proceedings of the Sixth SIGHAN Workshop on Chinese Language Processing
@inproceedings{Kuo2:2008:IJCNLP,
author = {Jin-Shea Kuo and Haizhou Li and Chih-Lung Lin},
title = {Mining Transliterations from Web Query Results: An Incremental Approach},
url = {http://www.mt-archive.info/IJCNLP-2008-Kuo-2.pdf},
googlescholar = {14247836374749958932},
booktitle = {Proceedings of the Sixth SIGHAN Workshop on {Chinese} Language Processing},
year = 2008
}
Kuo et al. (2008)