Natural Language Processing University of Melbourne, 10-14 July 2006
Parsing algorithms commonly used in computational linguistics are built upon formal grammars that generate strings. This tutorial will motivate an approach using Tree-Adjoining Grammars (TAG) that goes from strings to trees. It will introduce the formal properties of TAG that make it attractive for computational linguistics, such as context-sensitivity expressed with trees, the handling of crossing dependencies, and the lexicalization of grammars.
We will see how the complex structural descriptions in TAG can be exploited in a corpus-based learning approach to parsing language. TAGs can be easily extracted from a Treebank, and Synchronous TAGs can be used for NLP applications such as language understanding and machine translation. We will cover a robust shallow parsing approach called SuperTagging and see how TAGs are used to define a simple model for statistical parsers that can obtain state-of-the-art accuracy.
TAG is closely related to active research topics such as tree automata, tree transducers, and synchronous grammars, and it is also related to statistical parsing approaches such as parse re-ranking, tree kernels, and DOP models.
Anoop Sarkar is an Assistant Professor in the School of Computing Science at Simon Fraser University. His research has focused on machine learning algorithms for parsing natural language with stochastic grammar formalisms, such as probabilistic context-free and tree-adjoining grammars, as well as the application of parsing to various NLP tasks.
Anoop received his PhD from the Department of Computer and Information Science at the University of Pennsylvania, with Prof. Aravind Joshi as his advisor. More details, including a full list of his publications, are available at http://www.cs.sfu.ca/~anoop.