Natural Language Processing University of Melbourne, 10-14 July 2006
Temporal information extraction from natural language poses many interesting challenges, but its potential applications are numerous, including the automatic construction of chronologies from news, medical narratives, accident reports, and similar sources. This tutorial will discuss methods for automatically building chronologies of events from natural language data, using information extraction techniques along with temporal reasoning. It will begin with an overview of theoretical work on tense, aspect, and event structure in natural language, as well as the fundamentals of temporal reasoning. It will then discuss the annotation of temporal and event expressions in corpora, including the TimeML and ACE (Automatic Content Extraction) annotation schemes, and survey a variety of methods for ordering events in time from natural language, including rule-based and machine-learning methods. It will also identify difficulties in automatically constructing chronologies of events in the genres above. Attendees can expect to learn about current methodologies and computational resources and the outstanding problems in the area, and to obtain follow-up pointers to the literature.
Inderjeet Mani is a Senior Principal Scientist at MITRE, a Research Affiliate at MIT (CSAIL), a Visiting Scholar at Brandeis (Computer Science), and an (outgoing) Associate Professor and Program Head in Computational Linguistics at Georgetown University. His research, funded by MITRE, NSF, DARPA, ARDA, and others, has included information extraction, automatic summarization, and bioinformatics. He has published three technical books, including The Language of Time (co-edited, Oxford University Press, 2005), Advances in Automatic Summarization (co-edited, MIT Press, 1999), and Automatic Summarization (John Benjamins, 2001), along with more than 60 technical papers. In temporal information extraction, he led the DARPA TIDES effort to develop the TIMEX2 annotation scheme for time expressions, which has been adopted by ACE, the TERN competition, and TimeML; his recent research on temporal ordering and anchoring in TimeML has been funded by the ARDA AQUAINT program. He has taught tutorials and given invited talks on his research at conferences and universities worldwide, and has served on the Editorial Board of Computational Linguistics (2002-4). He has reviewed for other journals, such as Natural Language Engineering, Journal of Artificial Intelligence Research, Bioinformatics, Information Retrieval, and Journal of the American Society for Information Science and Technology; for various book publishers and institutions, such as the National Science Foundation and the National Institutes of Health; and for conferences such as ACL, HLT, AAAI, IJCAI, LREC, FLAIRS, TIME, INLG, and ICON, as well as a wide variety of workshops, including the Document Understanding Conference (DUC).