Workshop Program
Friday, April 4
| 9:00 | - | 9:15 |
Opening |
| 9:15 | - | 10:00 |
Keynote Speaker
TRECVid & future of video search - (1.5M)
Wessel Kraaij (TNO, The Netherlands)
TRECVid is the leading annual benchmark conference on content-based
video indexing & retrieval techniques. In seven years, TRECVid
has evaluated various component tasks and attained a participation of
40+ research groups from North America, Europe, Asia and Australia.
TRECVid stimulates innovative research on video access by providing
a standardized testbed allowing comparison of approaches. An important
element determining the success of TRECVid is the fact that it
is driven by the research community itself.
Some of the component tasks that have been studied can be considered
as solved (such as shot boundary detection), others such as generic
concept indexing are still way behind the accuracy of textual indexing
techniques that internet users expect. In this talk we will contrast
state of the art video indexing and retrieval techniques with some
of the existing video search services that are deployed. An example is
the role and importance of manual annotation. Concept indexing techniques
typically depend on the availability and reliability of manually
annotated keyframes. Existing video search applications also rely on
manual content descriptions, but these usually apply to the full clip.
We will also assess the potential impact of TRECVid on end user
solutions for video search and in what way the current video search
solutions influence new tasks for TRECVid.
|
| 10:00 | - | 10:30 |
Coffee break |
| 10:30 | - | 11:40 |
Session on Multimedia
- Audio-video search within a corpus of news content in 6 languages
Julien Law-To (Exalead)
- Making visual content on the web searchable
Jan-Erik Solem (Polar Rose, Sweden)
Indexing images on the web using computer vision makes it possible
to search based on visual content in images. At Polar Rose we use
face recognition and visual descriptors to make this happen. In this
talk I will describe different ways to index visual content, look at some
big challenges and show some applications with millions of faces and
images.
|
| 11:40 | - | 14:00 |
Session on CHORUS projects (Multimedia)
- SEMEDIA Project -
(3.1M)
Roelof van Zwol (Yahoo! Research Barcelona, Spain)
Professional Media repositories and Intranets, like the Internet as a whole, present the problem
of how to find precise segments of audiovisual files among a sea of largely un-indexed,
heterogeneous data. SEMEDIA aims to create new methods, environments and widely usable
tools for media labelling, searching and retrieval from very large collections of heterogeneous
data, building on and extending research in media technologies, web semantics, AI, CBIR and
interface design.
SEMEDIA enables the rapid, semi-automatic annotation of large data populations at greatly
reduced cost, and the production of professional and consumer tools for near instantaneous data
search & retrieval from very large, distributed stores of mainly un-indexed audiovisual media.
SEMEDIA develops techniques to extract metadata from 'essence' in ways that allow the
automatic inference of high-level structural information from new, partly annotated media
content. It will create tools for:
- navigating intelligently and searching efficiently;
- summarising and clustering visually similar content; searching for 'scenes similar to...';
- finding scenes with a given character and all the associated files, geometry, shading, colour
model, speech characteristics, motion data etc;
- data architectures that support secure content tracking and multiple user access;
- user interfaces that allow fast browsing with user defined criteria and adaptive feedback that
recognises context and intent, and contributes to the annotation.
- MESH and RUSHES Projects -
(2.2M) - (439K)
Pedro Concejero (Telefónica Investigación y Desarrollo, Spain)
- SAPIR Project -
(4.9M)
Pavel Zezula (Masaryk University, Czech Republic)
Multimedia content retrieval is performed on corresponding descriptive
features. Though there are many ways to extract features, they
are typically compared by specific similarity (distance) measures, so
a query execution must be supported by efficient similarity search index
structures. MUFIN (Multi Feature Indexing Network) is a general
purpose prototype system with the following objectives: Extensibility -
performing (combined) similarity queries for arbitrary metric distance
measures; Scalability - by application of structured P2P networks the
system is able to scale up to the web dimension; Performance tuning
- by a suitable mapping of the logical peer structure to specific computer
network infrastructure, the query response time and throughput
can be adjusted. The properties will be demonstrated by an image
content-based retrieval over a dataset of 10 million images indexed by
five MPEG7 descriptors, running on a modest computer infrastructure.
A combination with the traditional text search will also be presented.
The work is supported by the SAPIR EU project.
- AIM@SHAPE Project -
(5.5M)
Francesco Robbiano (CNR-IMATI Genova, Italy)
- VITALAS Project -
(1.3M)
Arjen de Vries (CWI, The Netherlands)
- TRIPOD Project -
(984K)
Xin Fan (University of Sheffield, UK)
- Pharos Project: Using AV-RSS for Publishing Audiovisual Content Metadata
Oscar Celma (Universitat Pompeu Fabra)
Audiovisual content is not like traditional web content. The main differences
lie in the opaqueness of the content, the file size, the use of multiple encoding
schemes and the constrained licensing agreement often associated with it. This
raises many issues for an audiovisual search engine that is not directly operated
by content owners. One issue is the difficulty of obtaining remote access to a
high-quality copy of the content in order to automatically extract information
necessary for indexing. This requires time consuming licensing negotiations, a
high-bandwidth network connection and large storage capacity.
One way to address this problem is to allow content owners wishing to increase
the awareness of the availability of their contents to directly generate and
publish richly annotated metadata to search engines or to any content mediator
services. The published metadata can then be received by any services having
subscribed to them using a publication-subscription protocol.
The PHAROS European research project has developed such a protocol using an
extended XML schema for publishing metadata describing audiovisual content.
This AV-RSS schema ( AudioVisual RSS) is a publicly available format. It reuses
RSS for its simplicity and extends Media-RSS with a number of features including
content identification and time-indexed annotations.
AV-RSS allows the use of multiple content identifiers using different policies
or different registries such as ISAN. An audiovisual item can also be identified
using content fingerprints computed by different technologies. The AV-RSS schema
also defines different means by which content can be accessed by and presented
to end users (e.g. as streamable previews, downloadable content etc.) in different
query contexts, and it includes related copyright, license and acquisition information.
Part of the AV-RSS schema is dedicated to the temporal annotation of the visual
or audio track of an audiovisual item. This part reuses some principles of MPEG-7,
while largely simplifying it. A full MPEG-7 description can be accessed by including
it as a link to external content in the AV-RSS description of an item. Another
Media-RSS extension has also been defined in order to describe the scheduling of
content availability. Moreover, some Media-RSS information elements have been
redefined in order to add mandatory language and schema attributes.
The PHAROS project is currently experimenting with and validating the first
version of the AV-RSS format, and wishes to present it to a larger audience of
audiovisual content practitioners to collect remarks and feedback in order to
improve it in a second release planned for the end of this year.
|
| 14:00 | - | 15:30 |
Lunch |
| 15:30 | - | 16:30 |
Session on Specialized Search
- Learning to Rank Answers on Large Online QA Collections -
(100K)
Mihai Surdeanu (Fundacio Barcelona Media, Spain)
This work describes an answer ranking engine for non-factoid questions
built using a large online community-generated question-answer collection
(Yahoo! Answers). We show how such collections may be used to
effectively set up large supervised learning experiments. Furthermore,
we investigate a wide range of feature types, some exploiting NLP processors,
and demonstrate that using them in combination leads to a
very significant improvement in performance.
- Exploiting explicit and implicit semantics on the Web -
(2.1M)
Peter Mika (Yahoo! Research Barcelona, Spain)
In this presentation we will give an overview of the developments that
we believe will lead in the mid-term to the realization of Semantic
Search, i.e. search with the capabilities to understand the user's intent
and the Web's content at a much deeper, conceptual level than currently
possible. Semantic Search will require a combination of methods
from IR with recent results in Natural Language Processing and the Semantic
Web to tackle both implicit and explicit semantics on the Web.
The first project we will present combines entity extraction technology
and metadata to improve tagging and retrieval on the Wikipedia
corpus. Our second case study will show how dynamically collected
metadata can be used to enrich the interface of a search engine.
- Graph-based context-sensitive search
Aristides Gionis (Yahoo! Research Barcelona, Spain)
We consider the problem of query search in a hyper-linked document
collection. We extend the typical keyword search, by considering a
source document, which provides the context of a query. The task is to
rank other documents in Wikipedia with respect to their relevance to
the query terms given the source document. By attaching a context to
the query terms, the search results of a search initiated in a particular
page can be made more relevant.
If we consider the collection of documents as a directed graph G whose
nodes are the documents and whose edges are the links among them, the
above search problem maps to the problem of finding relevant target
nodes in the graph G when the query is initiated from a source node.
We suggest a number of techniques and features that extend the classical
query-search model so that the source document is taken into
account.
Our features take into account both the content of the context document
as well as its position in the graph G. Our experiments, made
using Wikipedia, indicate that the proposed method considerably improves
on the results obtained by a more traditional approach that does not
take the context into account.
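As a rough illustration of the idea in the abstract above, the following Python sketch combines a plain keyword score with the graph proximity of each candidate document to the source document. The toy corpus, the link graph, the weight `alpha`, and the function names (`bfs_distances`, `keyword_score`, `context_rank`) are all assumptions made for illustration; the abstract does not describe the actual features used by the authors.

```python
from collections import deque

def bfs_distances(graph, source):
    """Hop distance from source to every node reachable in a directed graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist

def keyword_score(text, query_terms):
    """Fraction of the query terms that occur in the document text."""
    words = set(text.lower().split())
    return sum(t.lower() in words for t in query_terms) / len(query_terms)

def context_rank(docs, graph, source, query_terms, alpha=0.5):
    """Rank documents by keyword match mixed with closeness to the source.

    alpha = 0 ignores the context; alpha = 1 ranks by proximity alone.
    Documents that match no query term are dropped.
    """
    dist = bfs_distances(graph, source)
    scored = []
    for doc_id, text in docs.items():
        if doc_id == source:
            continue
        kw = keyword_score(text, query_terms)
        if kw == 0:
            continue
        # Unreachable documents get zero proximity.
        proximity = 1.0 / (1 + dist[doc_id]) if doc_id in dist else 0.0
        scored.append(((1 - alpha) * kw + alpha * proximity, doc_id))
    # Highest score first; ties broken alphabetically by document id.
    return [d for _, d in sorted(scored, key=lambda s: (-s[0], s[1]))]

# Hypothetical toy collection: the query "jaguar" is ambiguous on its own.
docs = {
    "Jaguar_car": "jaguar is a british car maker",
    "Jaguar_cat": "the jaguar is a large cat of the americas",
    "Amazon_rainforest": "the amazon rainforest is home to many species",
    "Big_cat": "big cat species include lion tiger",
}
graph = {
    "Amazon_rainforest": ["Big_cat"],
    "Big_cat": ["Jaguar_cat"],
    "Jaguar_cat": ["Amazon_rainforest"],
    "Jaguar_car": [],
}

with_context = context_rank(docs, graph, "Amazon_rainforest", ["jaguar"])
without_context = context_rank(docs, graph, "Amazon_rainforest", ["jaguar"], alpha=0.0)
```

In this toy setting, a search started from the rainforest page ranks the animal ahead of the car maker, whereas with `alpha=0.0` the two pages tie on keyword score alone.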
|
| 16:30 | - | 18:00 |
Coffee break and Demos/Posters |
| 17:00 | - | 18:00 |
Hands-on session
Teaching User Interface Design using a web-based Usability Tool
Ernesto Arroyo (Fundacio Barcelona Media, Spain)
This workshop demonstrates how a user interface analysis tool can be
used to introduce design and interaction concepts. It presents MouseTrack
as a web logging system that tracks mouse movements on websites. This
system includes a visualization tool that displays the mouse
cursor path followed by website visitors. It helps web site administrators
run usability tests and analyze the collected data. Practitioners
can track any existing webpage by simply entering its URL.
|
| 20:00 | | |
Banquet
At the Roc de les Bruixes restaurant. One of the most highly regarded establishments in the entire Grandvalira domain, both for the quality of its creative cuisine with French, Andorran and Pyrenean touches, as well as for the views of the Canillo valley from its terrace. T. +376 890 696
Includes a bus from the hotel and a cable car ride climbing up to 2,000 m above sea level.
Dinner Menu (in Spanish).
|
Saturday, April 5
| 9:00 | - | 10:30 |
Session on Specialized Search
- Search and Recommendation: two sides of the same coin? -
(1.6M)
Xavier Amatriain (Telefónica Investigación y Desarrollo, Spain)
Recently the field of Recommender Systems has gained growing popularity
among the research community with new conferences such as the
ACM RecSys going into its second edition and established conferences
such as SIGKDD or SIGCHI focusing a great deal of attention on this
topic.
The Recommendation field started from a different background than
web search, namely Data Mining and HCI versus Information Retrieval.
While the goal of Recommendation Systems is to optimize a
fitness function between content and users by "discovering" hidden
relations in the data, Search Engines focus on "retrieving" pre-existing
data.
However, there are clear trends that point to both fields coming closer
together. On the one hand, web search is becoming more and more
personalized, highlighting the need for user profiling and collaborative
filtering. On the other hand, it is becoming clear that in many cases
search strategies are essential for the performance of Recommender
Systems.
As a result, some claim that search is just a "simpler form of recommendation",
where the fitness function to be optimized is that of a
generic average user (e.g. using algorithms such as PageRank). Obviously,
statements in the opposite direction can also be made.
In this talk we will assume that the audience is familiar with Web
Search systems and therefore we will focus on describing the basic techniques
and current research trends in Recommender Systems, highlighting
where and how they are similar or different.
At the end of the talk, we hope to convey the message that the "Future
of Web Search is in Recommendation", hoping that such a claim will
spark an interesting discussion and debate throughout the workshop.
- Time in Web Search -
(1.2M)
Omar Alonso (UC Davis/A9.com)
Time is an important dimension of any information space and can be
very useful in search applications. As Internet search engines keep
gathering new and diverse information sources, identifying relevant,
time-sensitive information becomes more important for users.
Temporal information is available in every Web page as temporal expressions
or in the form of metadata. Recognizing such information
and exploiting it for retrieval and presentation purposes are important
features that can significantly improve the functionality of search applications.
Current search applications do not take advantage of all
the time information available in Web pages to provide an alternative
user experience.
In this presentation, I explore how temporal information can be used
in search and retrieval and I outline some of the areas that can benefit
from exploiting such information.
- MyMobileSearch : Next generation search engine for mobile users -
(1.3M)
José Manuel Cantera Fonseca (Telefónica Investigación y Desarrollo, Spain)
In the coming years, the amount of Web content prepared for delivery
to mobile devices is expected to grow exponentially.
As a result, users will need to search for information while on the
move. However, mobile search introduces a series of challenges related
to providing the results most suitable for each delivery context.
The next generation of search engines will need to deal with mobile
web users' needs and, as a result, mobile devices will become the universal
access point to information anywhere, any time.
This talk will explain the technological research challenges around mobile
search and how we are facing them in MyMobileSearch, an internal
research project that aims to develop a search engine capable of combining
several search criteria related to the mobile context, such as location,
the user’s device and browser.
- Mobile Search: requirements for a personalised service
Ben Bratu (Motorola Labs., France)
Search has become one of the most important mechanisms for discovery
of content on PCs and wired internet devices. Many attempts have
been made to port internet applications and experiences from the PC
domain to mobile. However, the requirements of mobile users are significantly
different from PC users. Device limitations (e.g. keyboard,
display, computation power, limited battery, etc.) are the key factors
in restricting the use of search on mobiles.
At the same time, compared to PC search, mobile search tends
to be a service that is more time and context sensitive. The user
expects to get the most useful information given not only his
expressed interests but also his current context (e.g. location, current
activities, social surroundings, etc.).
Finally and most importantly, the mobile phone is a strictly personal
device. Very fine-grained individual information about the user can
be used not only to make the search service more efficient
but also to make mobile search one of the most personalized
services.
|
| 10:30 | - | 11:00 |
Coffee break |
| 11:00 | - | 12:00 |
Session on Multimedia
- Text-based Retrieval Models for Media Search
Vanessa Murdock (Yahoo! Research Barcelona, Spain)
Given a query from a user, we would like to return relevant media items,
where the media item is represented by a title and short description, or
perhaps a set of tags, generated by the owner. Typical user queries are
2 - 3 terms, and descriptions or tag sets are typically fewer than 15
terms per media item, making the distribution of terms in the retrieval
model especially sparse. We developed a family of retrieval models that
is especially suited to cope with the limited information provided. In
this talk I describe ongoing work in retrieving relevant videos, based
on textual information associated with videos and user queries.
- The Social Media Opportunity and Application for Landmark Search -
(13M)
Mor Naaman (Yahoo! Research Berkeley, USA)
Community-contributed collections of media on the web are becoming
a vast, rich resource for image and video on a long-tailed array
of topics. These multimedia resources present a new opportunity for
multimedia search and retrieval - but also pose new challenges. I
will describe some initial exploration into turning social media content
into a data source for image search. Using a combination of context-
and content-based tools, we generate representative sets of images for
location-driven features and landmarks, a common search task. To do
that, we use location and other metadata, as well as tags associated
with images, and the images' visual features. This approach can potentially
scale to provide better search and representation for every
landmark, worldwide.
- VISTO: VIsual STOryboard for Web Video Browsing -
(433K)
Marco Pellegrini (CNR IIT)
Web video browsing is rapidly becoming a very popular activity on
the Web, making the production of a concise video content
representation a real need. Currently, static video summary techniques
can be used to this end. Unfortunately, they require long processing
times and hence all the summaries are produced in advance without any
user customization. With an increasing number of videos and
highly heterogeneous users, this is a burden. In this talk we
describe VISTO, a summarization technique that produces customized
on-the-fly video storyboards. The mechanism uses a fast clustering
algorithm that selects the most representative frames using their HSV
color distribution and allows users to select the storyboard length and
the processing time. An objective and subjective evaluation shows
that the storyboards are produced with good quality and in a time
that allows on-the-fly usage.
Joint work with Marco Furini, Filippo Geraci, and Manuela Montangero.
A preliminary version appeared at the ACM CIVR International
Conference on Image and Video Retrieval 2007.
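To make the storyboard idea in the VISTO abstract more concrete, the following Python sketch picks representative frames from their HSV colour distributions and lets the caller choose the storyboard length. It is only a sketch under stated assumptions: the abstract does not name the clustering algorithm, so a greedy farthest-point-first selection is swapped in here, and the frame representation (lists of RGB pixels in the range 0 to 1), the bin count, and the function names are hypothetical.

```python
import colorsys

def hsv_histogram(frame, bins=4):
    """Normalised joint H/S/V histogram of a frame given as RGB pixels in [0, 1]."""
    hist = [0.0] * (bins ** 3)
    for r, g, b in frame:
        h, s, v = colorsys.rgb_to_hsv(r, g, b)
        # min(...) keeps a component equal to 1.0 inside the last bin.
        hi, si, vi = (min(int(c * bins), bins - 1) for c in (h, s, v))
        hist[(hi * bins + si) * bins + vi] += 1.0
    return [count / len(frame) for count in hist]

def l1_distance(a, b):
    """Sum of absolute differences between two histograms."""
    return sum(abs(x - y) for x, y in zip(a, b))

def storyboard(frames, length):
    """Return the indices of up to `length` mutually dissimilar frames.

    Greedy farthest-point-first selection (an assumption, see lead-in):
    start from the first frame, then repeatedly add the frame whose
    histogram is farthest from every frame already chosen.
    """
    hists = [hsv_histogram(f) for f in frames]
    chosen = [0]
    # Distance from each frame to its nearest chosen frame.
    nearest = [l1_distance(h, hists[0]) for h in hists]
    while len(chosen) < min(length, len(frames)):
        far = max(range(len(frames)), key=lambda i: nearest[i])
        if nearest[far] == 0:
            break  # every remaining frame duplicates a chosen one
        chosen.append(far)
        nearest = [min(d, l1_distance(h, hists[far]))
                   for d, h in zip(nearest, hists)]
    return sorted(chosen)

# Hypothetical five-frame "video": two red frames, two green, one blue.
red = [(1.0, 0.0, 0.0)] * 4
green = [(0.0, 1.0, 0.0)] * 4
blue = [(0.0, 0.0, 1.0)] * 4
frames = [red, red, green, green, blue]

short_board = storyboard(frames, 2)
full_board = storyboard(frames, 5)
```

A smaller `length` trades coverage for speed, since each selected frame costs one pass over the histograms; requesting more frames than there are distinct colour distributions simply stops early.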
|
| 12:00 | - | 12:45 |
Invited Speaker
The Future of Web Search
Usama Fayyad (CDO Yahoo!, USA)
|
| 12:45 | - | 14:00 |
Speakers' Corner / Discussion / Wrap up based on topics raised during the event
|
| 14:00 | - | 15:30 |
Lunch |
|