The Future of Web Search: Beyond Text
Andorra - April 4-5, 2008

Workshop Program

Friday, April 4

9:00-9:15 Opening
9:15-10:00 Keynote Speaker
TRECVid & future of video search - (1.5M)
Wessel Kraaij (TNO, The Netherlands)

TRECVid is the leading annual benchmark conference on content-based video indexing and retrieval techniques. In seven years, TRECVid has evaluated various component tasks and attracted the participation of 40+ research groups from North America, Europe, Asia and Australia. TRECVid stimulates innovative research on video access by providing a standardized testbed that allows approaches to be compared. An important factor in the success of TRECVid is that it is driven by the research community itself.

Some of the component tasks that have been studied can be considered solved (such as shot boundary detection); others, such as generic concept indexing, still lag far behind the accuracy of the textual indexing techniques that internet users expect. In this talk we will contrast state-of-the-art video indexing and retrieval techniques with some of the existing video search services that are deployed. An example is the role and importance of manual annotation. Concept indexing techniques typically depend on the availability and reliability of manually annotated keyframes. Existing video search applications also rely on manual content descriptions, but these usually apply to the full clip.

We will also assess the potential impact of TRECVid on end user solutions for video search and in what way the current video search solutions influence new tasks for TRECVid.
10:00-10:30 Coffee break
10:30-11:40 Session on Multimedia
  • Audio-video search within a corpus news contents in 6 languages
    Julien Law-To (Exalead)

  • Making visual content on the web searchable
    Jan-Erik Solem (Polar Rose, Sweden)

    Indexing images on the web using computer vision makes it possible to search based on the visual content of images. At Polar Rose we use face recognition and visual descriptors to make this happen. In this talk I will describe different ways to index visual content, look at some big challenges and show some applications with millions of faces and images.

11:40-14:00 Session on CHORUS projects (Multimedia)
  • SEMEDIA Project - (3.1M)
    Roelof van Zwol (Yahoo! Research Barcelona, Spain)

    Professional Media repositories and Intranets, like the Internet as a whole, present the problem of how to find precise segments of audiovisual files among a sea of largely un-indexed, heterogeneous data. SEMEDIA aims to create new methods, environments and widely usable tools for media labelling, searching and retrieval from very large collections of heterogeneous data, building on and extending research in media technologies, web semantics, AI, CBIR and interface design.

    SEMEDIA enables the rapid, semi-automatic annotation of large data populations at greatly reduced cost, and the production of professional and consumer tools for near instantaneous data search & retrieval from very large, distributed stores of mainly un-indexed audiovisual media. SEMEDIA develops techniques to extract metadata from 'essence' in ways that allow the automatic inference of high-level structural information from new, partly annotated media content. It will create tools for:
    • navigating intelligently and searching efficiently;
    • summarising and clustering visually similar content; searching for 'scenes similar to...';
    • finding scenes with a given character and all the associated files, geometry, shading, colour model, speech characteristics, motion data etc;
    • data architectures that support secure content tracking and multiple user access;
    • user interfaces that allow fast browsing with user defined criteria and adaptive feedback that recognises context and intent, and contributes to the annotation.


  • MESH and RUSHES Projects - (2.2M) - (439K)
    Pedro Concejero (Telefónica Investigación y Desarrollo, Spain)

  • SAPIR Project - (4.9M)
    Pavel Zezula (Masaryk University, Czech Republic)

    Multimedia content retrieval is performed on corresponding descriptive features. Though there are many ways to extract features, they are typically compared by specific similarity (distance) measures, so query execution must be supported by efficient similarity search index structures. MUFIN (Multi Feature Indexing Network) is a general purpose prototype system with the following objectives: Extensibility - performing (combined) similarity queries for arbitrary metric distance measures; Scalability - by applying structured P2P networks, the system is able to scale up to the web dimension; Performance tuning - by a suitable mapping of the logical peer structure to a specific computer network infrastructure, the query response time and throughput can be adjusted. These properties will be demonstrated by content-based image retrieval over a dataset of 10 million images indexed by five MPEG-7 descriptors, running on a modest computer infrastructure. A combination with traditional text search will also be presented. The work is supported by the SAPIR EU project.
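
    The role of a metric distance in similarity search can be illustrated with a minimal sketch. It is not MUFIN's implementation: the class name PivotIndex, the Euclidean distance and the toy two-dimensional vectors are assumptions made for illustration only. The sketch precomputes distances to a few pivot objects and uses the triangle inequality to discard candidates without computing their distance to the query.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    """Any metric works here; Euclidean distance is just an example."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class PivotIndex:
    """Hypothetical toy index: stores the distance from each object to each pivot."""

    def __init__(self, objects: List[Vector], pivots: List[Vector],
                 dist: Callable[[Vector, Vector], float] = euclidean) -> None:
        self.objects = list(objects)
        self.pivots = list(pivots)
        self.dist = dist
        self.table = [[dist(o, p) for p in self.pivots] for o in self.objects]
        self.computed = 0  # distance computations spent on the last query

    def range_query(self, query: Vector, radius: float) -> List[Tuple[float, Vector]]:
        """Return (distance, object) pairs with distance <= radius, nearest first."""
        q_to_pivots = [self.dist(query, p) for p in self.pivots]
        self.computed = 0
        hits = []
        for obj, row in zip(self.objects, self.table):
            # Triangle inequality: |d(q, p) - d(o, p)| <= d(q, o) for every pivot p,
            # so a gap larger than the radius proves the object cannot match.
            if any(abs(qp - op) > radius for qp, op in zip(q_to_pivots, row)):
                continue
            self.computed += 1
            d = self.dist(query, obj)
            if d <= radius:
                hits.append((d, obj))
        return sorted(hits)
```

    With expensive descriptors, the saving comes from the objects that are pruned before their distance to the query is ever computed; a distributed system would additionally partition the objects across peers.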

  • AIM@SHAPE Project - (5.5M)
    Francesco Robbiano (CNR-IMATI Genova, Italy)

  • VITALAS Project - (1.3M)
    Arjen de Vries (CWI, The Netherlands)

  • TRIPOD Project - (984K)
    Xin Fan (University of Sheffield, UK)

  • Pharos Project: Using AV-RSS for Publishing Audiovisual Content Metadata
    Oscar Celma (Universitat Pompeu Fabra)

    Audiovisual content is not like traditional web content. The main differences lie in the opaqueness of the content, the file size, the use of multiple encoding schemes and the restrictive licensing agreements often associated with it. This raises many issues for an audiovisual search engine that is not directly operated by content owners. One issue is the difficulty of obtaining remote access to a high-quality copy of the content in order to automatically extract the information necessary for indexing. This requires time-consuming licensing negotiations, a high-bandwidth network connection and large storage capacity.

    One way to address this problem is to allow content owners wishing to increase the awareness of the availability of their contents to directly generate and publish richly annotated metadata to search engines or to any content mediator services. The published metadata can then be received by any services having subscribed to them using a publication-subscription protocol.

    The PHAROS European research project has developed such a protocol using an extended XML schema for publishing metadata describing audiovisual content. This AV-RSS schema (AudioVisual RSS) is a publicly available format. It reuses RSS for its simplicity and extends Media-RSS with a number of features including content identification and time-indexed annotations.

    AV-RSS allows the use of multiple content identifiers using different policies or different registries such as ISAN. An audiovisual item can also be identified using content fingerprints computed by different technologies. The AV-RSS schema also defines different means by which content can be accessed by and presented to end users (e.g. as streamable previews, downloadable content etc.) in different query contexts, and it includes related copyright, license and acquisition information. Part of the AV-RSS schema is dedicated to the temporal annotation of the visual or audio track of an audiovisual item. This part reuses some principles of MPEG-7, while largely simplifying it. A full MPEG-7 description can be accessed by including it as a link to external content in the AV-RSS description of an item. Another Media-RSS extension has also been defined in order to describe the scheduling of content availability. Moreover, some Media-RSS information elements have been redefined in order to add mandatory language and schema attributes.

    The PHAROS project is currently experimenting with and validating the first version of the AV-RSS format, and wishes to present it to a larger audience of audiovisual content practitioners to collect remarks and feedback in order to improve it in a second release planned for the end of this year.

14:00-15:30 Lunch
15:30-16:30 Session on Specialized Search
  • Learning to Rank Answers on Large Online QA Collections - (100K)
    Mihai Surdeanu (Fundacio Barcelona Media, Spain)

    This work describes an answer ranking engine for non-factoid questions built using a large online community-generated question-answer collection (Yahoo! Answers). We show how such collections may be used to effectively set up large supervised learning experiments. Furthermore we investigate a wide range of feature types, some exploiting NLP processors, and demonstrate that using them in combination leads to a very significant improvement in performance.

  • Exploiting explicit and implicit semantics on the Web - (2.1M)
    Peter Mika (Yahoo! Research Barcelona, Spain)

    In this presentation we will give an overview of the developments that we believe will lead in the mid-term to the realization of Semantic Search, i.e. search with the capabilities to understand the user's intent and the Web's content at a much deeper, conceptual level than currently possible. Semantic Search will require a combination of methods from IR with recent results in Natural Language Processing and the Semantic Web to tackle both implicit and explicit semantics on the Web. The first project we will present combines entity extraction technology and metadata to improve tagging and retrieval on the Wikipedia corpus. Our second case study will show how dynamically collected metadata can be used to enrich the interface of a search engine.

  • Graph-based context-sensitive search
    Aristides Gionis (Yahoo! Research Barcelona, Spain)

    We consider the problem of query search in a hyper-linked document collection. We extend the typical keyword search, by considering a source document, which provides the context of a query. The task is to rank other documents in Wikipedia with respect to their relevance to the query terms given the source document. By attaching a context to the query terms, the search results of a search initiated in a particular page can be made more relevant.

    If we consider the collection of documents as a directed graph G with nodes the documents and edges the links among the documents, the above search problem maps to the problem of finding relevant target nodes in the graph G when the query is initiated from a source node. We suggest a number of techniques and features that extend the classical query-search model so that the source document is taken into account.

    Our features take into account both the content of the context document and its position in the graph G. Our experiments, conducted on Wikipedia, indicate that the proposed method considerably improves on the results obtained by a more traditional approach that does not take the context into account.
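
    The idea of ranking targets relative to a source node can be sketched as follows. This is an illustration under stated assumptions, not the features used in the talk: the function names, the keyword-overlap score, the hop-distance proximity and the mixing weight alpha are all hypothetical choices.

```python
from collections import deque
from typing import Dict, List, Set, Tuple


def hop_distances(graph: Dict[str, List[str]], source: str) -> Dict[str, int]:
    """Breadth-first search: number of links from the source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def keyword_score(query_terms: List[str], doc_terms: Set[str]) -> float:
    """Fraction of query terms that occur in the document."""
    return sum(1 for t in query_terms if t in doc_terms) / len(query_terms)


def context_rank(graph: Dict[str, List[str]], docs: Dict[str, Set[str]],
                 source: str, query_terms: List[str],
                 alpha: float = 0.5) -> List[Tuple[float, str]]:
    """Rank documents by keyword match mixed with link proximity to the source."""
    dist = hop_distances(graph, source)
    ranked = []
    for doc_id, terms in docs.items():
        if doc_id == source:
            continue
        text = keyword_score(query_terms, terms)
        if text == 0:
            continue  # keep only documents that match the query at all
        # Unreachable documents get zero proximity (an assumption of this sketch).
        proximity = 1.0 / (1 + dist[doc_id]) if doc_id in dist else 0.0
        ranked.append((alpha * text + (1 - alpha) * proximity, doc_id))
    ranked.sort(key=lambda pair: (-pair[0], pair[1]))
    return ranked
```

    The same query then yields different orderings depending on the page from which the search starts, which is the effect the abstract describes.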

16:30-18:00 Coffee break and Demos/Posters
17:00-18:00 Hands-on session
Teaching User Interface Design using a web-based Usability Tool
Ernesto Arroyo (Fundacio Barcelona Media, Spain)

This workshop demonstrates how a user interface analysis tool can be used to introduce design and interaction concepts. It presents MouseTrack, a web logging system that tracks mouse movements on websites. The system includes a visualization tool that displays the mouse cursor path followed by website visitors. It helps website administrators run usability tests and analyze the collected data. Practitioners can track any existing webpage by simply entering its URL.

Here is a link to the project website.
20:00   Banquet
At the Roc de les Bruixes restaurant. One of the most highly regarded establishments in the entire Grandvalira domain, both for the quality of its creative cuisine with French, Andorran and Pyrenean touches, as well as for the views of the Canillo valley from its terrace. T. +376 890 696

Includes a bus from the hotel and a cable car ride climbing up to 2,000 m above sea level.
Dinner Menu (in Spanish).


Saturday, April 5

9:00-10:30 Session on Specialized Search
  • Search and Recommendation: two sides of the same coin? - (1.6M)
    Xavier Amatriain (Telefónica Investigación y Desarrollo, Spain)

    Recently the field of Recommender Systems has gained growing popularity among the research community, with new conferences such as ACM RecSys going into its 2nd edition and established conferences such as SIGKDD or SIGCHI focusing a great deal of attention on this topic.

    The Recommendation field started from a different background than web search, namely Data Mining and HCI versus Information Retrieval. While the goal of Recommendation Systems is to optimize a fitness function between content and users by "discovering" hidden relations in the data, Search Engines focus on "retrieving" pre-existing data.

    However there are clear trends that point to both fields coming closer together. On the one hand, web search is becoming more and more personalized, highlighting the need for user profiling and collaborative filtering. On the other hand, it is becoming clear that in many cases search strategies are essential for the performance of Recommender Systems.

    As a result, some claim that search is just a "simpler form of recommendation", where the fitness function to be optimized is that of a generic average user (e.g. using algorithms such as PageRank). Obviously, statements in the opposite direction can also be made. In this talk we will assume that the audience is familiar with Web Search systems and therefore we will focus on describing the basic techniques and current research trends in Recommender Systems, highlighting where and how they are similar or different. At the end of the talk, we hope to convey the message that the "Future of Web Search is in Recommendation", in the expectation that such a claim will spark an interesting discussion and debate throughout the workshop.

  • Time in Web Search - (1.2M)
    Omar Alonso (UC Davis/A9.com)

    Time is an important dimension of any information space and can be very useful in search applications. As Internet search engines keep gathering new and diverse information sources, identifying relevant time-sensitive information becomes more important for users. Temporal information is available in every Web page as temporal expressions or in the form of metadata. Recognizing such information and exploiting it for retrieval and presentation purposes are important features that can significantly improve the functionality of search applications. Current search applications do not take advantage of all the time information available in Web pages to provide an alternative user experience.

    In this presentation, I explore how temporal information can be used in search and retrieval and I outline some of the areas that can benefit from exploiting such information.
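
    As a rough illustration of recognizing temporal expressions and using them for presentation, the sketch below extracts four-digit years from page text and groups pages into a timeline. It is an assumption-laden toy, not the approach presented in the talk: real temporal taggers handle far richer expressions, and the names extract_years and timeline are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Matches standalone four-digit years from 1900 to 2099 (a deliberate simplification).
YEAR = re.compile(r"\b(19\d{2}|20\d{2})\b")


def extract_years(text: str) -> List[int]:
    """Return the distinct years mentioned in the text, in ascending order."""
    return sorted({int(y) for y in YEAR.findall(text)})


def timeline(pages: Dict[str, str]) -> Dict[int, List[str]]:
    """Group page identifiers by the years their text mentions."""
    buckets = defaultdict(list)
    for page_id, text in pages.items():
        for year in extract_years(text):
            buckets[year].append(page_id)
    return dict(sorted(buckets.items()))
```

    A search interface could use such buckets to let users browse results along a time axis instead of a single ranked list.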

  • MyMobileSearch : Next generation search engine for mobile users - (1.3M)
    José Manuel Cantera Fonseca (Telefónica Investigación y Desarrollo, Spain)

    In the coming years, the amount of Web content prepared for delivery to mobile devices is expected to grow exponentially. As a result, users will need to search for information while on the move. However, mobile search introduces a series of challenges related to providing the results most suitable for each delivery context. The next generation of search engines will need to deal with mobile web users' needs and, as a result, mobile devices will become the universal access point to information anywhere, any time.

    This talk will explain the technological research challenges around mobile search and how we are facing them in MyMobileSearch, an internal research project aimed at developing a search engine capable of combining several search criteria related to the mobile context, such as location, the user’s device and browser.

  • Mobile Search: requirements for a personalised service
    Ben Bratu (Motorola Labs., France)

    Search has become one of the most important mechanisms for discovery of content on PCs and wired internet devices. Many attempts have been made to port internet applications and experiences from the PC domain to mobile. However, the requirements of mobile users are significantly different from those of PC users. Device limitations (e.g. keyboard, display, computation power, limited battery, etc.) are the key factors restricting the use of search on mobiles.

    At the same time, compared to PC search, mobile search tends to be a more time- and context-sensitive service. The user expects to get the most useful information given not only his expressed interests but also his current context (e.g. location, current activities, social surroundings, etc.).

    Finally, and most importantly, the mobile phone is a strictly personal device. Very fine-grained individual information about the user can be used not only to make the search service more efficient but also to make mobile search one of the most personalized services.

10:30-11:00 Coffee break
11:00-12:00 Session on Multimedia
  • Text-based Retrieval Models for Media Search
    Vanessa Murdock (Yahoo! Research Barcelona, Spain)

    Given a query from a user, we would like to return relevant media items, where the media item is represented by a title and short description, or perhaps a set of tags, generated by the owner. Typical user queries are 2 - 3 terms, and descriptions or tag sets are typically fewer than 15 terms per media item, making the distribution of terms in the retrieval model especially sparse. We developed a family of retrieval models that is especially suited to cope with the limited information provided. In this talk I describe ongoing work in retrieving relevant videos, based on textual information associated with videos and user queries.
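
    The abstract does not specify the retrieval models, so the sketch below swaps in a standard technique for sparse text: query-likelihood scoring with Dirichlet smoothing, which backs off to collection statistics when a short description lacks a query term. The function names, the toy descriptions and the value of mu are assumptions for illustration.

```python
import math
from collections import Counter
from typing import Dict, List


def dirichlet_score(query: List[str], doc: List[str], collection_counts: Counter,
                    collection_len: int, mu: float = 10.0) -> float:
    """Log query likelihood of a document with Dirichlet-smoothed term probabilities."""
    tf = Counter(doc)
    score = 0.0
    for term in query:
        p_coll = collection_counts.get(term, 0) / collection_len
        if p_coll == 0:
            continue  # term never seen in the collection: it cannot discriminate
        score += math.log((tf[term] + mu * p_coll) / (len(doc) + mu))
    return score


def rank(query: List[str], docs: Dict[str, List[str]], mu: float = 10.0) -> List[str]:
    """Return document identifiers ordered from most to least likely."""
    coll = Counter(term for terms in docs.values() for term in terms)
    total = sum(coll.values())
    scored = [(dirichlet_score(query, terms, coll, total, mu), doc_id)
              for doc_id, terms in docs.items()]
    return [doc_id for _, doc_id in sorted(scored, key=lambda p: (-p[0], p[1]))]
```

    A small mu suits descriptions of fewer than 15 terms; with long documents, larger values are commonly used.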

  • The Social Media Opportunity and Application for Landmark Search - (13M)
    Mor Naaman (Yahoo! Research Berkeley, USA)

    Community-contributed collections of media on the web are becoming a vast, rich resource for images and video on a long-tailed array of topics. These multimedia resources present a new opportunity for multimedia search and retrieval - but also pose new challenges. I will describe some initial exploration into turning social media content into a data source for image search. Using a combination of context- and content-based tools, we generate representative sets of images for location-driven features and landmarks, a common search task. To do that, we use location and other metadata, as well as tags associated with images, and the images' visual features. This approach can potentially scale to provide better search and representation for every landmark, worldwide.

  • VISTO: VIsual STOryboard for Web Video Browsing - (433K)
    Marco Pellegrini (CNR IIT)

    Web video browsing is rapidly becoming a very popular activity on the Web, making the production of concise video content representations a real need. Currently, static video summary techniques can be used for this purpose. Unfortunately, they require long processing times, so all summaries are produced in advance without any user customization. With an increasing number of videos and a highly heterogeneous user population, this is a burden. In this talk we describe VISTO, a summarization technique that produces customized on-the-fly video storyboards. The mechanism uses a fast clustering algorithm that selects the most representative frames using their HSV color distribution and allows users to select the storyboard length and the processing time. An objective and subjective evaluation shows that the storyboards are produced with good quality and in a time that allows on-the-fly usage.

    Joint work with Marco Furini, Filippo Geraci, and Manuela Montangero. A preliminary version appeared in ACM CIVR International Conference on Image and Video Retrieval 2007
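
    Selecting representative frames from color distributions can be sketched as below. This is not the VISTO algorithm: it is a simplified stand-in that uses hue-only histograms and a greedy farthest-point selection, with frames given as lists of RGB pixels. The names hue_histogram and storyboard, the bin count and the L1 distance are assumptions.

```python
import colorsys
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # RGB values in 0..255


def hue_histogram(frame: List[Pixel], bins: int = 8) -> List[float]:
    """Normalized histogram of pixel hues (saturation and value are ignored here)."""
    hist = [0.0] * bins
    for r, g, b in frame:
        h, _s, _v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        hist[min(int(h * bins), bins - 1)] += 1
    return [count / len(frame) for count in hist]


def l1(a: List[float], b: List[float]) -> float:
    """L1 distance between two histograms."""
    return sum(abs(x - y) for x, y in zip(a, b))


def storyboard(frames: List[List[Pixel]], length: int) -> List[int]:
    """Pick up to `length` frame indices that are mutually dissimilar in color."""
    if not frames or length <= 0:
        return []
    hists = [hue_histogram(f) for f in frames]
    chosen = [0]  # start from the first frame
    while len(chosen) < min(length, len(frames)):
        # Add the frame farthest from its nearest already-chosen frame.
        best = max((i for i in range(len(frames)) if i not in chosen),
                   key=lambda i: min(l1(hists[i], hists[c]) for c in chosen))
        chosen.append(best)
    return sorted(chosen)
```

    The length argument plays the role of the user-selected storyboard length; a production system would sample frames and use a faster clustering scheme to bound processing time.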

12:00-12:45 Invited Speaker
The Future of Web Search
Usama Fayyad (CDO Yahoo!, USA)
12:45-14:00 Speakers' Corner / Discussion / Wrap up based on topics raised during the event
14:00-15:30 Lunch