As machine translation (MT) systems mature and are used for practical purposes, automatic quality estimation (QE) of their output becomes increasingly important. The talk will outline the work on MTQE and draw some conclusions. MTQE starts where the MT system has finished: it estimates the quality of an MT system's output, and to do so it has to go beyond what the MT system has already considered, which is usually quite a lot. Researchers have tried many kinds of information and techniques to get good quality estimates. Quality is something that even human beings do not always agree about. The degree of disagreement is remarkable, and it naturally makes the problem harder.
Like any NLP problem, MTQE has many aspects and is related to other problems. At present, it is at a stage where deployability is still a challenge. We discuss how, because MTQE is a typical NLP problem, the insights gained from it can often be easily transferred to other problems, and vice versa. MTQE requires both statistical techniques and linguistic knowledge in some form. Learning to do MTQE well may be a good starting point for those who are just stepping into the area of NLP.
SHORT BIO:
Anil Kumar Singh is a researcher (and currently a teacher) who has been working in the area of NLP for the last twelve years. He is a faculty member in the Department of Computer Science and Engineering at IIT (BHU), Varanasi, India. He has published on various topics in NLP and has organized a couple of research symposiums and workshops, as well as a couple of introductory workshops on NLP. He received his PhD in Computational Linguistics from IIIT, Hyderabad, India. He is the creator of Sanchay, a collection of tools and APIs familiar to some researchers in India. He spent one year as a post-doctoral researcher at LIMSI-CNRS, Orsay, France, where he worked on machine translation quality estimation (MTQE). That work resulted in some publications and in the development of a tool called Questimate. This tool implements the complete workflow for MTQE, from feature extraction and selection to machine learning (with the help of Weka as a library) to analysis and visualization of the results. He has been associated over the years with machine translation and related activities in several different capacities.