String to Tree Models

The motivation for using linguistic syntax trees on the target side is to support grammatically coherent output and to ground restructuring in syntactic properties.

String To Tree is the main subject of 18 publications. 12 are discussed here.


Publications

String-to-tree models differ in the type of rules and linguistic annotation they use. Galley et al. (2004) build translation rules that map input phrases to output tree fragments. Contextually richer rules and rule probabilities learned with the EM algorithm may lead to better performance (Galley et al., 2006). Adjusting the parse trees so that rules can be extracted for all lexical matches may also be important, which requires introducing additional nonterminal symbols (Marcu et al., 2006) or rules with multiple head nodes (Liu et al., 2007). Instead of using standard Penn Treebank labels for nonterminals, relabeling the constituents may lead to the acquisition of better rules (Huang and Knight, 2006). Syntactic structure prohibits some phrase pairs from being learned as syntactic translation rules, which reduces coverage; this may be alleviated by adjusting the rule extraction algorithm (DeNeefe et al., 2007). DeNeefe et al. (2005) present an interactive tool for inspecting the workings of such syntactic translation models.
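
To make the idea of a rule that maps an input phrase to an output tree fragment concrete, here is a minimal sketch in Python. The `Var`, `Rule`, `substitute`, and `leaves` names are hypothetical helpers introduced only for illustration, the French-English rule is a toy example, and the representation is an assumption rather than the data structure of any of the cited systems.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass(frozen=True)
class Var:
    """A substitution site in a target tree fragment (hypothetical helper)."""
    index: int   # links the site to a variable on the source side
    label: str   # root label required of the subtree plugged in here


# A tree is a word (str), a variable, or a tuple (label, child, child, ...).
Tree = Union[str, Var, tuple]


@dataclass(frozen=True)
class Rule:
    source: Tuple[str, ...]  # source words mixed with variables such as "x0"
    target: Tree             # target-side tree fragment


def substitute(fragment: Tree, fillers: Dict[int, tuple]) -> Tree:
    """Replace every Var in the fragment with an already translated subtree."""
    if isinstance(fragment, Var):
        subtree = fillers[fragment.index]
        if subtree[0] != fragment.label:
            raise ValueError(f"expected {fragment.label}, got {subtree[0]}")
        return subtree
    if isinstance(fragment, str):
        return fragment
    label, *children = fragment
    return (label, *(substitute(child, fillers) for child in children))


def leaves(tree: Tree) -> List[str]:
    """Read the target string off the leaves of a complete tree."""
    if isinstance(tree, str):
        return [tree]
    return [word for child in tree[1:] for word in leaves(child)]


# Toy rule: "ne x0 pas" -> VP(AUX(does) RB(not) x0:VB)
rule = Rule(
    source=("ne", "x0", "pas"),
    target=("VP", ("AUX", "does"), ("RB", "not"), Var(0, "VB")),
)
tree = substitute(rule.target, {0: ("VB", "go")})
print(" ".join(leaves(tree)))
```

The label check in `substitute` is what distinguishes this from a plain hierarchical phrase rule: a subtree can only fill a site whose syntactic label it matches.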

Syntax-augmented models (Zollmann et al., 2006) overcome the restriction that the span of a rule must match syntactic constituent boundaries by merging constituent labels or otherwise adding new ones. Zollmann and Venugopal (2006) describe an efficient decoding algorithm for this approach.
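
The labeling idea can be sketched as follows. This is a simplified illustration, not the published procedure: the function name `samt_label`, the priority order of the cases, the fallback label `X`, and the assumption of exactly one label per span (no unary chains) are all choices made for this example.

```python
from typing import Dict, Tuple

Span = Tuple[int, int]  # half-open [start, end) over target words


def samt_label(span: Span, constituents: Dict[Span, str], sent_len: int) -> str:
    """Assign a label to a target span, including non-constituent spans."""
    start, end = span
    # Case 1: the span is itself a constituent.
    if span in constituents:
        return constituents[span]
    # Case 2: the span is two adjacent constituents, labeled X+Y.
    for mid in range(start + 1, end):
        left = constituents.get((start, mid))
        right = constituents.get((mid, end))
        if left and right:
            return f"{left}+{right}"
    # Case 3: a constituent X missing a constituent Y on its right, labeled X/Y.
    for far in range(end + 1, sent_len + 1):
        whole = constituents.get((start, far))
        missing = constituents.get((end, far))
        if whole and missing:
            return f"{whole}/{missing}"
    # Case 4: a constituent X missing a constituent Y on its left, labeled Y\X.
    for far in range(start - 1, -1, -1):
        whole = constituents.get((far, end))
        missing = constituents.get((far, start))
        if whole and missing:
            return f"{missing}\\{whole}"
    # Fallback: a generic label (assumed here to be "X").
    return "X"


# "the great wall" with a deliberately sparse parse for illustration.
parse = {(0, 3): "NP", (0, 1): "DT", (2, 3): "NN"}
print(samt_label((0, 2), parse, 3))  # "the great"
print(samt_label((1, 3), parse, 3))  # "great wall"
```

With these extended labels, phrase pairs such as "the great" receive a syntactic category even though they are not constituents, so they can still be used in rules.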

Almaghout et al. (2011) use simplified CCG tags, which specify only the context but not the resulting category, as syntactic labels in a string-to-tree model.

When translating into morphologically rich languages that exhibit more long-distance agreement, it may be better to encode morphological properties not in the grammar but in distinct agreement constraints that are checked at the appropriate level in the tree (Williams and Koehn, 2011).
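
One way to picture such agreement constraints is as unification over feature structures. The sketch below assumes flat feature dictionaries and hypothetical helper names (`unify`, `agrees`); the feature values for the German words are simplified for illustration, and the code is not the implementation used in the cited work.

```python
from typing import Dict, Optional

Features = Dict[str, str]  # e.g. {"case": "dat", "number": "sg"}


def unify(a: Features, b: Features) -> Optional[Features]:
    """Merge two flat feature structures; return None if any value clashes."""
    merged = dict(a)
    for key, value in b.items():
        if key in merged and merged[key] != value:
            return None
        merged[key] = value
    return merged


def agrees(*items: Features) -> bool:
    """Check that all feature structures can be unified into one."""
    result: Features = {}
    for item in items:
        unified = unify(result, item)
        if unified is None:
            return False
        result = unified
    return True


# Simplified, assumed feature values for a German noun phrase.
dem = {"case": "dat", "number": "sg", "gender": "masc"}
die = {"case": "nom", "number": "sg", "gender": "fem"}
mann = {"number": "sg", "gender": "masc"}  # case left unspecified

print(agrees(dem, mann))  # compatible determiner and noun
print(agrees(die, mann))  # gender clash
```

Because a missing feature is compatible with any value, the grammar labels do not need to enumerate every combination of case, number, and gender; the constraint is checked only at the tree node where the agreeing words come together.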

Benchmarks

Discussion

New Publications

Braune, Fabienne and Seemann, Nina and Fraser, Alexander (2015): Rule Selection with Soft Syntactic Features for String-to-Tree Statistical Machine Translation, Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing
@InProceedings{braune-seemann-fraser:2015:EMNLP,
author = {Braune, Fabienne and Seemann, Nina and Fraser, Alexander},
title = {Rule Selection with Soft Syntactic Features for String-to-Tree Statistical Machine Translation},
booktitle = {Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing},
month = {September},
address = {Lisbon, Portugal},
publisher = {Association for Computational Linguistics},
pages = {1095--1101},
url = {http://aclweb.org/anthology/D15-1129},
year = 2015
}
Braune et al. (2015)
Sennrich, Rico and Haddow, Barry (2015): A Joint Dependency Model of Morphological and Syntactic Structure for Statistical Machine Translation, Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing
@InProceedings{sennrich-haddow:2015:EMNLP,
author = {Sennrich, Rico and Haddow, Barry},
title = {A Joint Dependency Model of Morphological and Syntactic Structure for Statistical Machine Translation},
booktitle = {Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing},
month = {September},
address = {Lisbon, Portugal},
publisher = {Association for Computational Linguistics},
pages = {2081--2087},
url = {http://aclweb.org/anthology/D15-1248},
year = 2015
}
Sennrich and Haddow (2015)
Seemann, Nina and Braune, Fabienne and Maletti, Andreas (2015): String-to-Tree Multi Bottom-up Tree Transducers, Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
@InProceedings{seemann-braune-maletti:2015:ACL-IJCNLP,
author = {Seemann, Nina and Braune, Fabienne and Maletti, Andreas},
title = {String-to-Tree Multi Bottom-up Tree Transducers},
booktitle = {Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)},
month = {July},
address = {Beijing, China},
publisher = {Association for Computational Linguistics},
pages = {815--824},
url = {http://www.aclweb.org/anthology/P15-1079},
year = 2015
}
Seemann et al. (2015)
Hassan, Hany and Sima'an, Khalil and Way, Andy (2007): Supertagged Phrase-Based Statistical Machine Translation, Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics (mentioned in String To Tree and Syntactic Reranking)
@InProceedings{hassan-simaan-way:2007:ACLMain,
author = {Hassan, Hany and Sima'an, Khalil and Way, Andy},
title = {Supertagged Phrase-Based Statistical Machine Translation},
booktitle = {Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics},
month = {June},
address = {Prague, Czech Republic},
publisher = {Association for Computational Linguistics},
pages = {288--295},
url = {http://www.aclweb.org/anthology/P/P07/P07-1037},
year = 2007
}
Hassan et al. (2007)
Weese, Jonathan and Callison-Burch, Chris and Lopez, Adam (2012): Using Categorial Grammar to Label Translation Rules, Proceedings of the Seventh Workshop on Statistical Machine Translation
@InProceedings{weese-callisonburch-lopez:2012:WMT,
author = {Weese, Jonathan and Callison-Burch, Chris and Lopez, Adam},
title = {Using Categorial Grammar to Label Translation Rules},
booktitle = {Proceedings of the Seventh Workshop on Statistical Machine Translation},
month = {June},
address = {Montreal, Canada},
publisher = {Association for Computational Linguistics},
pages = {268--277},
url = {http://www.aclweb.org/anthology/W12-3132},
year = 2012
}
Weese et al. (2012)
Williams, Philip and Koehn, Philipp (2012): GHKM Rule Extraction and Scope-3 Parsing in Moses, Proceedings of the Seventh Workshop on Statistical Machine Translation
@InProceedings{williams-koehn:2012:WMT,
author = {Williams, Philip and Koehn, Philipp},
title = {GHKM Rule Extraction and Scope-3 Parsing in Moses},
booktitle = {Proceedings of the Seventh Workshop on Statistical Machine Translation},
month = {June},
address = {Montreal, Canada},
publisher = {Association for Computational Linguistics},
pages = {434--440},
url = {http://www.aclweb.org/anthology/W12-3155},
year = 2012
}
Williams and Koehn (2012)
DeNeefe, Steve and Knight, Kevin and Vogler, Heiko (2010): A Decoder for Probabilistic Synchronous Tree Insertion Grammars, Proceedings of the 2010 Workshop on Applications of Tree Automata in Natural Language Processing
@InProceedings{deneefe-knight-vogler:2010:ATANLP,
author = {DeNeefe, Steve and Knight, Kevin and Vogler, Heiko},
title = {A Decoder for Probabilistic Synchronous Tree Insertion Grammars},
booktitle = {Proceedings of the 2010 Workshop on Applications of Tree Automata in Natural Language Processing},
month = {July},
address = {Uppsala, Sweden},
publisher = {Association for Computational Linguistics},
pages = {10--18},
url = {http://www.aclweb.org/anthology/W10-2502},
year = 2010
}
DeNeefe et al. (2010)

MT Research Survey Wiki

A Comprehensive Survey of Neural and Statistical Machine Translation Research Publications
