Translation models commonly drop the distinction between surface forms of a word that occur at the beginning of a sentence (The), in the middle of a sentence (the), or in all-caps headlines (THE). This requires a pre-processing step that normalizes case and a post-processing step that restores the proper surface forms.
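The two steps can be illustrated with a minimal sketch. It assumes a simple most-frequent-form heuristic, which is one common way to truecase but is not necessarily the method used in the publications discussed here. The function names (train_truecaser, truecase, recase) and the toy corpus are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def train_truecaser(sentences):
    """Count surface forms per lowercased word, skipping sentence-initial tokens."""
    counts = defaultdict(Counter)
    for tokens in sentences:
        # Position 0 is capitalized regardless of the word, so it carries
        # no evidence about the word's usual casing.
        for token in tokens[1:]:
            counts[token.lower()][token] += 1
    # Keep the most frequent surface form for each lowercased word.
    return {lower: forms.most_common(1)[0][0] for lower, forms in counts.items()}


def truecase(tokens, model):
    """Pre-processing: map each token to its most frequent surface form.

    Words not seen in training are left unchanged (an assumption of this sketch).
    """
    return [model.get(token.lower(), token) for token in tokens]


def recase(tokens):
    """Post-processing: restore the sentence-initial capital letter."""
    if not tokens:
        return []
    first = tokens[0][:1].upper() + tokens[0][1:]
    return [first] + tokens[1:]


# Toy corpus of tokenized sentences (hypothetical data).
corpus = [
    ["The", "cat", "saw", "the", "dog"],
    ["Paris", "is", "the", "capital", "of", "France"],
    ["We", "visited", "Paris"],
]
model = train_truecaser(corpus)

headline = ["THE", "DOG", "VISITED", "PARIS"]
normalized = truecase(headline, model)
restored = recase(normalized)
```

In this sketch the all-caps headline is normalized so that "THE" becomes "the" while "PARIS" becomes "Paris", and recase then capitalizes only the first token of the output. Skipping sentence-initial tokens during training is a design choice: their capitalization reflects position rather than the word itself.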
Truecasing is the main subject of 3 publications. 3 are discussed here.
Topics in Data
Parallel Corpora | Comparable Corpora | Dictionaries | Corpus Cleaning | Sentence Alignment | Truecasing | Word Segmentation | Spelling Correction | Sparse Data | Pivot Languages | Domain Adaptation