Language Technology Seminar Series

Department of Computer Science and Software Engineering
The University of Melbourne


Title: Discriminative Word Alignment for Statistical Machine Translation with Conditional Random Fields

Speaker: Phil Blunsom (University of Melbourne)

Location: ICT Building, Room 2.06

Date: 7 July 2006

Time: 1-2pm

Abstract:

In this seminar we present a novel approach for inducing word alignments from sentence-aligned data. We use a Conditional Random Field (CRF), a discriminative model, which is estimated on a small supervised training set. The CRF is conditioned on both the source and target texts, and thus allows the use of arbitrary and overlapping features over these data. Moreover, the CRF has efficient training and decoding procedures, both of which find globally optimal solutions.

We apply this alignment model to both the French-English and Romanian-English language pairs. We show how a large number of highly predictive features can be easily incorporated into the CRF, and demonstrate that even with only a few hundred word-aligned training sentences, our model improves over the current state of the art, with alignment error rates of 5.29 and 25.8 for the two tasks respectively.
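
For readers unfamiliar with the metric, the sketch below shows how alignment error rate (AER) is commonly computed from sure and possible gold links: AER = 1 - (|A ∩ S| + |A ∩ P|) / (|A| + |S|), where A is the predicted link set, S the sure links and P the possible links (with S contained in P). This is an illustration only: the announcement does not describe the evaluation setup used in the talk, and the function name, argument names and toy data are assumptions made for this example. Scores are often reported multiplied by 100.

```python
def alignment_error_rate(predicted, sure, possible):
    """Return AER = 1 - (|A & S| + |A & P|) / (|A| + |S|).

    Each argument is an iterable of (source_index, target_index) links.
    All names here are hypothetical, chosen for this illustration.
    """
    a = set(predicted)
    s = set(sure)
    p = set(possible) | s  # every sure link also counts as possible
    denominator = len(a) + len(s)
    if denominator == 0:
        return 0.0  # nothing predicted and nothing to find: no error
    return 1.0 - (len(a & s) + len(a & p)) / denominator


# Toy data (made up): three predicted links, two sure, one extra possible.
predicted_links = {(0, 0), (1, 1), (2, 2)}
sure_links = {(0, 0), (1, 1)}
possible_links = {(2, 1)}

aer = alignment_error_rate(predicted_links, sure_links, possible_links)
print(f"AER = {aer:.3f}")
```

A lower AER is better: a prediction that recovers every sure link and proposes nothing outside the possible set scores 0.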


Disclaimer: This page, its contents and style, are the responsibility of the author and do not necessarily represent the views, policies or opinions of The University of Melbourne.