COLING-2008 Workshop
MMIES-2: Multi-source, Multilingual Information Extraction and Summarization
Manchester, 23 August, 2008
Held in conjunction with COLING-2008
the 22nd International Conference on Computational Linguistics
18-22 August, 2008
Theme
The objective of the 2nd MMIES Workshop: Multi-source, Multilingual
Information Extraction and Summarization is to bring together researchers
and practitioners in the areas of extraction, summarization, and other
information access technologies, to discuss recent approaches to multi-source
and multi-lingual challenges. Approaches to coping with the idiosyncratic
nature of the new Web 2.0 media (mixed input, new jargon, ungrammatical and
mixed-language input, and emotional discourse) are especially welcome.
Organisers
Invited Speaker
Workshop Programme
| 9:30–10:30 | Invited Talk: Generating Image Captions using Topic-Focused Multi-document Summarization | Robert Gaizauskas |
| 10:30–11:00 | Coffee Break | |
| | Session 1: Named Entity and Lexical Resources for IE and Summarization | |
| 11:00–11:30 | Learning to Match Names Across Languages | Inderjeet Mani, Alex Yeh and Sherri Condon |
| 11:30–12:00 | Automatic Construction of Nordic Domain Specific Dictionaries on Sparse Parallel Corpora | Sumithra Velupillai and Hercules Dalianis |
| 12:00–12:30 | Graph-Based Keyword Extraction for Single-Document Summarization | Marina Litvak and Mark Last |
| 12:30–14:00 | Lunch | |
| | Session 2: Multi-document Summarization | |
| 14:00–14:30 | MultiSum: Query-Based Multi-Document Summarisation | Michael Rosner and Carl Camilleri |
| 14:30–15:00 | Mixed-Source Multi-Document Speech-to-Text Summarization | Ricardo Ribeiro and David Martins de Matos |
| 15:00–15:30 | Evaluating automatically generated user-focused multi-document summaries for geo-referenced images | Ahmet Aker and Robert Gaizauskas |
| 15:30–16:00 | Coffee Break | |
| | Session 3: Applications | |
| 16:00–16:30 | Story tracking: linking similar news over time and across languages | Bruno Pouliquen, Olivier Deguernel and Ralf Steinberger |
| 16:30–17:00 | Automatic Annotation of Bibliographical References with Target Language | Harald Hammarström |
| 17:00–17:30 | Open Discussion | |
Call for Papers
Information extraction (IE) and text summarization (TS) are key technologies
aiming at extracting relevant information from texts and presenting the
information to the user in condensed form. The on-going information explosion
makes IE and TS particularly critical for successful functioning within the
information society. These technologies, however, face new challenges with the
adoption of the Web 2.0 paradigm (e.g. blogs, wikis), whose media are
inherently multi-source. They must now deal not with isolated
texts or single narratives, but with large-scale repositories, or sources --
possibly in several languages -- containing a multiplicity of views, opinions,
or commentaries on particular topics, entities or events. There is thus a need
to adapt and/or develop new techniques to deal with these new phenomena.
Recognising similar information across different sources and/or in different
languages is of paramount importance in this multi-source, multi-lingual
context. In information extraction, merging information from multiple sources
can lead to increased accuracy relative to extraction from a single source. In
text summarization, similar facts found across sources can inform sentence
scoring algorithms. In question answering, the distribution of answers in
similar contexts can inform answer ranking components.
Often, it is not the similarity of information that matters, but its
complementary nature. In a multi-lingual context, information extraction and
text summarization can provide solutions for cross-lingual access: key pieces
of information can be extracted from different texts in one or many languages,
merged, and then conveyed in many natural languages in concise form.
Applications need to be able to cope with the idiosyncratic nature of the new
Web 2.0 media: mixed input, new jargon, ungrammatical and mixed-language input,
emotional discourse, etc. In this context, synthesizing or inferring opinions
from multiple sources is a new and exciting challenge for NLP. On another
level, profiling of individuals who engage in the new social Web, and
identifying whether a particular opinion is appropriate/relevant in a given
context are important topics to be addressed.
It is therefore important that the research community address the following
issues:
- What methods are appropriate to detect similar/complementary/contradictory
information? Are hand-crafted rules and knowledge-rich approaches suitable?
- What methods are available to tackle cross-document and cross-lingual
entity and event coreference?
- What machine learning approaches are most appropriate for this task --
supervised/unsupervised/semi-supervised? What types of corpora are required for
training and testing?
- What techniques are appropriate to synthesize condensed synopses of the
extracted information? What generation techniques are useful here? What kind
of techniques can be used to cross domains and languages?
- What techniques can improve opinion mining and sentiment analysis through
multi-document analysis? How do information extraction and opinion mining
connect?
- What tools exist for supporting multi-lingual/multi-source access to
information? What solutions exist beyond full document translation to produce
cross-lingual summaries?
Important Dates:
| Call for papers | | 1 March |
| Paper submission deadline | | Extended: 12 May |
| Notification of acceptance of Papers | | 6 June |
| Camera-ready copy of papers due | | 1 July |
| Workshop | | 23 August |
Programme Committee:
- Javier Artiles (UNED, Spain)
- Kalina Bontcheva (University of Sheffield, UK)
- Nathalie Colineau (CSIRO, Australia)
- Nigel Collier (NII, Japan)
- Hercules Dalianis (KTH/Stockholm University, Sweden)
- Thierry Declerck (DFKI, Germany)
- Michel Généreux (LIPN-CNRS, France)
- Julio Gonzalo (UNED, Spain)
- Brigitte Grau (LIMSI-CNRS, France)
- Ralph Grishman (New York University, USA)
- Kentaro Inui (NAIST, Japan)
- Min-Yen Kan (National University of Singapore, Singapore)
- Guy Lapalme (University of Montreal, Canada)
- Diana Maynard (University of Sheffield, UK)
- Jean-Luc Minel (Modyco-CNRS, France)
- Constantin Orasan (University of Wolverhampton, UK)
- Cecile Paris (CSIRO, Australia)
- Maria Teresa Pazienza (University of Rome ‘Tor Vergata’, Italy)
- Bruno Pouliquen (European Commission - Joint Research Centre, Italy)
- Patrick Saint-Dizier (IRIT-CNRS, France)
- Agnes Sandor (Xerox XRCE, France)
- Satoshi Sekine (New York University, USA)
- Ralf Steinberger (European Commission - Joint Research Centre, Italy)
- Stan Szpakowicz (University of Ottawa, Canada)
- Lucy Vanderwende (Microsoft Research, USA)
- José Luis Vicedo (Universidad de Alicante, Spain)
Previous MMIES Workshop, at RANLP-2007 in Borovets, Bulgaria
Paper Submission
Deadline for submission: 05 May 2008 (extended to 12 May 2008)
Papers must:
- be anonymous.
- be in Adobe/Acrobat PDF format.
- be at most 8 pages long (including data, tables, figures, and references).
- include a one-paragraph abstract of the work (about 200 words).
- conform to the Coling 2008 style guidelines.
- be submitted through the START submission web page.
Please declare any conflicts of interest when submitting your papers. For
guidelines, consult the ACL conflict of interest policy.