4th Workshop on the Future of Web Search: Semantic Search
Ibiza - April 17-18, 2009

Keynote Speakers

  • Ziv Bar-Yossef
    Google Haifa, Israel

    External Mining of Search Query Logs.

    Abstract: Search query logs are valuable data sources that are kept confidential by search engines, in order to protect their users' privacy. Some search engines disclose aggregate statistics about queries via services, like Google Trends and Google Trends for Websites. The information provided by these services, however, is obfuscated, non-repeatable, and partial. For example, statistics for medium to low volume queries and sites are not readily available.
    In this talk I will describe algorithms for "external" mining of search query logs. Our algorithms can be used to estimate the popularity of queries in the log and the number of impressions web sites receive from search results. The algorithms use only public search engine services, like the web search service and the query suggestion service, and thus do not require privileged access to confidential search engine data sources. In addition, the algorithms use modest resources, and hence can be used by anyone to gather statistics about any query and/or any web site.
    Our algorithms rely on tools from information retrieval (keyword extraction), statistics (importance sampling), and databases (tree volume estimation).
    The talk will be self-contained. Based on joint work with Maxim Gurevich.
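
To give a flavor of how importance sampling and tree volume estimation fit together, here is a minimal sketch of a classic random-walk estimator for the number of leaves in a tree. It is not the speakers' algorithm: the toy trie, the word list, and the function names (`build_trie`, `one_walk_estimate`, `estimate_leaf_count`) are all assumptions made for illustration. Each walk descends from the root by picking a child uniformly at random and multiplies the branching factors along the way; that product is the inverse of the probability of reaching the leaf, so its average is an unbiased estimate of the leaf count.

```python
import random


def build_trie(words):
    """Build a toy prefix tree; each distinct word ends in one "$" leaf."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node["$"] = {}  # end-of-word marker; an empty dict is a leaf
    return root


def one_walk_estimate(trie, rng):
    """Walk from the root to a random leaf.

    Returns the product of the branching factors seen on the way down,
    which equals 1 / P(reaching this leaf). Assumes a non-empty trie.
    """
    weight = 1
    node = trie
    while node:  # stop at a leaf (empty dict)
        children = list(node.values())
        weight *= len(children)
        node = rng.choice(children)
    return weight


def estimate_leaf_count(trie, walks, seed=0):
    """Average the per-walk weights to estimate the number of leaves."""
    rng = random.Random(seed)
    total = sum(one_walk_estimate(trie, rng) for _ in range(walks))
    return total / walks


if __name__ == "__main__":
    words = ["car", "cart", "cat", "do", "dog", "door", "zebra"]
    trie = build_trie(words)
    # The estimate should land close to the true count of 7 words.
    print(estimate_leaf_count(trie, walks=20000))
```

The appeal of this kind of estimator is that it only needs to ask "what are the children of this node?", which is the sort of question a public, prefix-based interface can answer. Note that individual walks can be far off; only the average over many walks is reliable.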


  • Giovanni Tummarello
    DERI, Ireland

    Scalable, Tolerant, Fair... ultimately useful: Web of Data processing for the benefit of Humans.

    Abstract: At the beginning of 2009, hundreds of millions of web locations are willing to provide structured data for integration and reuse. Despite this, killer applications showcasing the benefits and fulfilling the promise of the "Web of Data" have yet to be seen. A closer look at the data reveals that, in fairness, there are many reasons why information reuse is a deceptively complex task. Based on the research in Sindice.com, in this talk I'll present a series of "recipes" for Scalable, Tolerant and Fair Web Data. I will touch on aspects such as Web Data collection, lightweight reasoning, consolidation, indexing, and ranking, and finally demonstrate how these technologies can be leveraged in user-oriented applications.


  • Hugo Zaragoza
    Yahoo! Research, Spain

    Interacting with Semantically Annotated Collections.

    Abstract: Semantic annotations of text can be used today in a number of ways: to create richer interfaces to the information locked in document collections, to help users express their information needs, and to improve the relevance of the results returned by the search engine. I will give an overview of our recent work in these three areas, using example applications on online collections such as Wikipedia, financial news and Q&As.