Translation models commonly drop the distinction between surface forms of a word that differ only in case, whether the word occurs at the beginning of a sentence (The), in the middle of a sentence (the), or in an all-caps headline (THE). This requires a pre-processing step that normalizes case and a post-processing step that generates the proper surface forms.
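The two steps can be illustrated with a minimal sketch. It assumes a simple heuristic that is not taken from the text above: each word's preferred casing is its most frequent surface form outside sentence-initial position. The function names `train_truecaser`, `truecase`, and `detruecase` are hypothetical and chosen only for this illustration.

```python
from collections import Counter, defaultdict


def train_truecaser(sentences):
    """Learn a preferred casing per word from tokenized sentences.

    Assumption: sentence-initial tokens are skipped because their
    capitalization is positional and says nothing about the word itself.
    """
    counts = defaultdict(Counter)
    for tokens in sentences:
        for token in tokens[1:]:
            counts[token.lower()][token] += 1
    # Keep the most frequent surface form for each lowercased key.
    return {key: forms.most_common(1)[0][0] for key, forms in counts.items()}


def truecase(tokens, model):
    """Pre-processing: normalize sentence-initial and all-caps tokens."""
    out = []
    for i, token in enumerate(tokens):
        all_caps = token.isupper() and len(token) > 1
        if i == 0 or all_caps:
            # Unknown words are left unchanged rather than guessed.
            out.append(model.get(token.lower(), token))
        else:
            out.append(token)
    return out


def detruecase(tokens):
    """Post-processing: restore the sentence-initial capital."""
    if not tokens:
        return []
    first = tokens[0]
    return [first[:1].upper() + first[1:]] + list(tokens[1:])


corpus = [
    ["The", "cat", "saw", "the", "dog"],
    ["A", "dog", "saw", "Paris"],
    ["Paris", "is", "the", "capital"],
]
model = train_truecaser(corpus)
normalized = truecase(["THE", "DOG", "SAW", "PARIS"], model)
restored = detruecase(normalized)
```

In this sketch the name "Paris" keeps its capital because it is capitalized mid-sentence in the training data, while "The" and "THE" both collapse to "the". Restoring all-caps headlines is not handled here and would need extra information about the original input.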
Truecasing is the main subject of 3 publications, all 3 of which are discussed here.
Topics in Data: Parallel Corpora | Comparable Corpora | Dictionaries | Corpus Cleaning | Sentence Alignment | Truecasing | Word Segmentation | Spelling Correction | Sparse Data | Pivot Languages | Domain Adaptation