Keynote Speakers
-
Gerhard Weikum
Max-Planck Institute for Informatics, Germany
"Efficient Top-k Queries for XML Information Retrieval"
Abstract
Non-schematic XML data that comes from many different sources and
inevitably exhibits heterogeneous structures and annotations (i.e., XML
tags) cannot be adequately searched using database query languages like
XPath or XQuery. Often, queries return either too many or too few
results. Rather, the ranked-retrieval paradigm is called for, with
relaxable search conditions, various forms of similarity predicates on
tags and contents, and quantitative relevance scoring.
The talk discusses recent advances and open research issues for ranked
retrieval of XML data, and exemplifies them by the TopX search engine, a
prototype system developed at the Max-Planck Institute for Informatics.
TopX supports a probabilistic-IR scoring model for full-text content
conditions and tag-term combinations, path conditions for all XPath axes
as exact or relaxable constraints, and ontology-based relaxation of
terms and tag names as similarity conditions for ranked retrieval.
For speeding up top-k queries, various techniques are employed:
probabilistic models as efficient score predictors for a variant of the
threshold algorithm, judicious scheduling of sequential accesses for
scanning index lists and random accesses to compute full scores,
incremental merging of index lists for on-demand, self-tuning query
expansion, and a suite of specifically designed, precomputed indexes to
evaluate structural path conditions.
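
As a rough illustration of the sequential-access and random-access
interplay mentioned above, the following is a minimal sketch of the
classic threshold algorithm with summation as the aggregation function.
It is not the TopX variant described in the talk: it has no
probabilistic score prediction and no access scheduling. The function
name, the in-memory list layout, and the choice of summation are all
assumptions made for this example.

```python
import heapq


def threshold_algorithm(index_lists, k):
    """Return the k documents with the highest summed score.

    index_lists (hypothetical layout): one list per query condition,
    holding (doc_id, score) pairs sorted by descending score.
    A document missing from a list is assumed to score 0 there.
    """
    # Dictionaries stand in for random accesses by document id.
    lookups = [dict(lst) for lst in index_lists]
    seen = set()
    top = []  # min-heap of (total_score, doc_id), at most k entries
    max_depth = max((len(lst) for lst in index_lists), default=0)

    for depth in range(max_depth):
        # Upper bound on the total score of any document not yet seen.
        threshold = 0
        for lst in index_lists:
            if depth >= len(lst):
                continue  # an exhausted list contributes 0 to the bound
            doc_id, score = lst[depth]  # sequential access
            threshold += score
            if doc_id in seen:
                continue
            seen.add(doc_id)
            # Random accesses to compute the full score right away.
            total = sum(lookup.get(doc_id, 0) for lookup in lookups)
            if len(top) < k:
                heapq.heappush(top, (total, doc_id))
            elif total > top[0][0]:
                heapq.heapreplace(top, (total, doc_id))
        # Stop once the k-th best score reaches the bound.
        if len(top) == k and top[0][0] >= threshold:
            break

    return sorted(((d, s) for s, d in top), key=lambda pair: -pair[1])
```

Calling the function with two small score-sorted lists and k = 2 returns
the two best documents as (doc_id, total_score) pairs in descending
order of score. In a real engine the scan would usually stop well before
the lists are exhausted, which is where the savings come from.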
-
Andrei Broder
Yahoo! Research, USA
"From query based Information Retrieval to context driven Information Supply"
Abstract
In the past decade, Web search engines have evolved from a first
generation based on classic Information Retrieval scaled up to web size
and supporting only informational queries, to a second generation
supporting navigational queries using web-specific information
(primarily link analysis), and then to a third generation enabling
transactional and other "semantic" queries based on a variety of
technologies aimed at directly satisfying the unexpressed "user intent."
What is coming next? In this talk, we argue for the trend towards
context driven Information Supply, that is, the goal of Web IR will
widen to include the supply of relevant information without requiring
the user to make an explicit query. The concept of information supply
long predates information retrieval (consider newspapers, or even the
"Acta Diurna" of ancient Rome). What is new in the web framework is the
ability to supply relevant information specific to a given activity and
a given user, while the activity is being performed. A prime example is
the matching of ads to content being read; however, the information
supply paradigm is starting to appear in other contexts such as social
networks, e-commerce, and browsers.