Suzy Howlett: Information Packaging for SMT
Early phase of PhD project, 2009. Supervisors: Mark Dras, Robert Dale
Machine translation (MT) is the task of automatically translating a written text from one human language to another. In statistical machine translation (SMT), this is accomplished by developing a probabilistic model of the translation process. Intuitively, linguistic information about the sentence should aid translation, but so far the addition of such information to the statistical model has not consistently proven useful. This project investigates how such information can be usefully incorporated into the system.
Information packaging refers to the speaker's choice between several possible realisations of the same information. In other words, it is the selection of the order and manner in which information is presented in a sentence. I hypothesised that information packaging is used to place emphasis on a particular part of the sentence. Based on this, I conducted a small survey to investigate whether native speakers prefer a translation in which emphasis falls on the same element as in the source sentence over a translation in which the emphasis falls elsewhere.
Survey: Effect of Information Packaging on Perceived Translation Quality
Survey questions are based on sentences from the freely available German–English Europarl training data from the 2009 Workshop on Statistical Machine Translation (WMT'09).
No formal publications have arisen from this survey.
After completing my analysis of the survey and exploring the theoretical linguistics work on information packaging, I concluded that it would be difficult to implement an adequate information packaging analysis for SMT within the time frame of the project. My project shifted direction to consider confidence in syntactic information. No further work is planned for this project.