Generate-and-Test Models for Machine Translation

Subscribers: 344,000
Published on: 2016-08-11
Video Link: https://www.youtube.com/watch?v=VuF0J8RTYng
Duration: 1:21:25
Views: 74

I discuss translation as an optimization problem subject to three kinds of constraints: lexical, configurational, and constraints enforcing target-language well-formedness. Lexical constraints ensure that the lexical choices in the output are meaning-preserving; configurational constraints ensure that the relationships between source words and phrases (e.g., semantic roles and modifier-head relationships) are properly transformed in translation; and target-language well-formedness constraints ensure the grammaticality of the output. This constraint-based framework suggests a generate-and-test (discriminative) model of translation in which features sensitive to input and output structures are engineered by language and translation experts, and the feature weights are trained to maximize the conditional likelihood of a corpus of example translations. The specified features represent empirical hypotheses about what correlates (but not why) and thus encode domain-specific knowledge; the learned weights indicate the extent to which these hypotheses are confirmed or refuted.

To demonstrate the usefulness of the feature-based approach, I discuss the performance of two models. The first is a lexical translation model, evaluated by the word alignments it learns. Unlike previous unsupervised alignment models, the new model uses features that capture diverse lexical and alignment relationships, including morphological relatedness, orthographic similarity, and conventional co-occurrence statistics. Results from typologically diverse language pairs show that the generate-and-test model provides substantial performance benefits over state-of-the-art generative baselines. The second is an end-to-end translation model in which lexical, configurational, and well-formedness constraints are modeled explicitly. This model is substantially more compact than state-of-the-art translation models, yet performs significantly better on language pairs with substantial source-target word-order differences.
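The abstract does not spell out the model's exact parameterization, but a standard way to realize "generate-and-test with expert-engineered features and weights trained by conditional likelihood" is a log-linear model over candidate translations. The sketch below is only an illustration of that pattern, not the talk's actual system: the feature names and toy values are hypothetical. Each candidate hypothesis is scored by a weighted sum of its features, and the gradient of the conditional log-likelihood (observed features of the reference minus expected features under the model) is used to update the weights.

import math
from collections import defaultdict

def score(weights, features):
    """Linear score w . phi(source, candidate) for one candidate translation."""
    return sum(weights[name] * value for name, value in features.items())

def cond_loglik_and_grad(weights, candidates, gold_index):
    """candidates: list of feature dicts, one per generated hypothesis;
    gold_index: position of the reference translation in that list.
    Returns log p(gold | source) and its gradient w.r.t. the weights."""
    scores = [score(weights, phi) for phi in candidates]
    log_z = math.log(sum(math.exp(s) for s in scores))
    grad = defaultdict(float)
    # Observed features of the reference translation ...
    for name, value in candidates[gold_index].items():
        grad[name] += value
    # ... minus expected features under the model's current distribution.
    for s, phi in zip(scores, candidates):
        p = math.exp(s - log_z)
        for name, value in phi.items():
            grad[name] -= p * value
    return scores[gold_index] - log_z, grad

# Toy usage with two hypothetical features of the kind mentioned in the talk.
weights = defaultdict(float)
candidates = [
    {"orthographic_similarity": 0.9, "cooccurrence": 1.2},  # reference translation
    {"orthographic_similarity": 0.1, "cooccurrence": 0.8},  # competing hypothesis
]
loglik, grad = cond_loglik_and_grad(weights, candidates, gold_index=0)
for name, g in grad.items():
    weights[name] += 0.1 * g  # one gradient-ascent step toward the reference

In this formulation the engineered features carry the domain knowledge, while the learned weights express how strongly each feature's hypothesis is supported by the example translations, which matches the division of labor described above.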




Other Videos By Microsoft Research


2016-08-11 NUI for Scientists, Layerscape, Environment & Water OData, Microsoft Research OData
2016-08-11 Dealing with Quantifier Scope Ambiguity in Computational Linguistics
2016-08-11 The Golem Project: a Laboratory for the Construction of Service Robots
2016-08-11 Panel Session | Big Data on Campus: Addressing the Challenges and Opportunities Across Domains
2016-08-11 Intersection Workshop - Less painful high performance image processing
2016-08-11 Moving the Needle and Growing Women in Computing in Latin America
2016-08-11 Intersection Workshop - Learning to Constrain Deformable Surfaces
2016-08-11 Experiences in Software Engineering
2016-08-11 Intersection Workshop - Metric-Topologic dense real-time visual localisation and world mapping
2016-08-11 Intersection Workshop - Taking early vision off-grid: From discrete samples to continuous signals
2016-08-11 Generate-and-Test Models for Machine Translation
2016-08-11 Intersection Workshop - title TBC
2016-08-11 Intersection Workshop - Interaction for mobile content creation
2016-08-11 Session 2B: Computational Models and Applications
2016-08-11 On the Uplink MAC Performance of Drive Thru Internet
2016-08-11 NW-NLP 2012 Morning Talks
2016-08-11 Intersection Workshop - Exploring Shape Variations by 3D-Model Decomposition and Recombination
2016-08-11 Intersection Workshop - Shape grammars and recursive refinement - limitations and extensions
2016-08-11 Advancing Environmental Understanding: the Role of eScience
2016-08-11 Using Computer Vision for Graphics
2016-08-11 Plenary Session



Tags:
microsoft research