COLING-2008 Workshop
MMIES-2: Multi-source, Multilingual Information Extraction
and Summarization

Manchester, 23 August, 2008

Held in conjunction with COLING-2008, the 22nd International Conference on Computational Linguistics
18-22 August, 2008

Theme

The objective of the 2nd MMIES Workshop: Multi-source, Multilingual Information Extraction and Summarization is to bring together researchers and practitioners in the areas of extraction, summarization, and other information access technologies to discuss recent approaches to multi-source and multi-lingual challenges. Approaches to coping with the idiosyncratic nature of the new Web 2.0 media are especially welcome, including mixed input, new jargon, ungrammatical and mixed-language input, and emotional discourse.

Organisers

Invited Speaker

Workshop Programme

9:30–10:30 Invited Talk:
Generating Image Captions using Topic-Focused Multi-document Summarization
Robert Gaizauskas
10:30–11:00 Coffee Break
Session 1: Named Entity and Lexical Resources for IE and Summarization
11:00–11:30 Learning to Match Names Across Languages
Inderjeet Mani, Alex Yeh and Sherri Condon
11:30–12:00 Automatic Construction of Nordic Domain Specific Dictionaries on Sparse Parallel Corpora
Sumithra Velupillai and Hercules Dalianis
12:00–12:30 Graph-Based Keyword Extraction for Single-Document Summarization
Marina Litvak and Mark Last
12:30–14:00 Lunch
Session 2: Multi-document Summarization
14:00–14:30 MultiSum: Query-Based Multi-Document Summarisation
Michael Rosner and Carl Camilleri
14:30–15:00 Mixed-Source Multi-Document Speech-to-Text Summarization
Ricardo Ribeiro and David Martins de Matos
15:00–15:30 Evaluating automatically generated user-focused multi-document summaries for geo-referenced images
Ahmet Aker and Robert Gaizauskas
15:30–16:00 Coffee Break
Session 3: Applications
16:00–16:30 Story tracking: linking similar news over time and across languages
Bruno Pouliquen, Olivier Deguernel and Ralf Steinberger
16:30–17:00 Automatic Annotation of Bibliographical References with Target Language
Harald Hammarström
17:00–17:30 Open Discussion

Call for Papers

Information extraction (IE) and text summarization (TS) are key technologies that aim to extract relevant information from texts and present it to the user in condensed form. The ongoing information explosion makes IE and TS particularly critical for successful functioning within the information society. These technologies, however, face new challenges with the adoption of the Web 2.0 paradigm (e.g. blogs, wikis) because of the inherently multi-source nature of such media. They must now deal not with isolated texts or single narratives, but with large-scale repositories, or sources, possibly in several languages, containing a multiplicity of views, opinions, or commentaries on particular topics, entities or events. There is thus a need to adapt existing techniques and/or develop new ones to deal with these new phenomena.

Recognising similar information across different sources and/or in different languages is of paramount importance in this multi-source, multi-lingual context. In information extraction, merging information from multiple sources can lead to increased accuracy relative to extraction from a single source. In text summarization, similar facts found across sources can inform sentence scoring algorithms. In question answering, the distribution of answers in similar contexts can inform answer ranking components.

Often, it is not the similarity of information that matters, but its complementary nature. In a multi-lingual context, information extraction and text summarization can provide solutions for cross-lingual access: key pieces of information can be extracted from different texts in one or many languages, merged, and then conveyed in many natural languages in concise form. Applications need to be able to cope with the idiosyncratic nature of the new Web 2.0 media: mixed input, new jargon, ungrammatical and mixed-language input, emotional discourse, etc. In this context, synthesizing or inferring opinions from multiple sources is a new and exciting challenge for NLP. On another level, profiling individuals who engage in the new social Web and identifying whether a particular opinion is appropriate or relevant in a given context are important topics to be addressed.

It is therefore important that the research community address the following issues:

Important Dates:

Call for papers  1 March
Paper submission deadline  12 May (extended)
Notification of acceptance of Papers  6 June
Camera-ready copy of papers due  1 July
Workshop  23 August

Programme Committee:


Previous MMIES Workshop, at RANLP-2007 in Borovets, Bulgaria

Paper Submission

Deadline for submission: 05 May 2008 (extended to 12 May 2008)
Papers must:

Please declare any conflicts of interest when submitting your papers. For guidelines, consult the ACL conflict of interest policy.

