Keynote Speakers
- Jeffrey Ullman
Stanford University, USA
"A Problem in Entity Resolution"
Abstract
"Entity Resolution" is the problem of matching records from one or several databases so that records representing the same "entity" (e.g., an individual) are matched and those that represent different entities are not matched. We shall discuss a real problem of matching name-address-phone records in the face of typos and other transformations. A key technique used for efficiency was locality-sensitive hashing, so we shall give an overview of this important technique and see how it applied to this particular form of data. Finally, we shall show a surprising technique for estimating the number of matches, even in situations where the differences between records are so large that it is impossible to be certain whether or not two particular records really represented the same entity.
Biography
Jeff Ullman is the Stanford W. Ascherman Professor of Computer Science (Emeritus). His interests include database theory, database integration, data mining, and education using the information infrastructure. He is an author or coauthor of 16 books and 170 technical publications. His honors include: Honorary doctorate, Free University of Brussels, 1975; Einstein Fellowship, Israeli Academy of Sciences, 1984; Guggenheim Fellowship, 1988–89; National Academy of Engineering, 1989; Honorary doctorate, University of Paris-Dauphine, 1992; Fellow of the Association for Computing Machinery, 1994; SIGMOD Contributions Award, 1996; Best Paper Award, SIGMOD, 1996; Karl V. Karlstrom Outstanding Educator Award, ACM, 1998; Knuth Prize, 2000.
- Renee Miller
University of Toronto, Canada
"New Challenges in Schema Mapping and Data Exchange"
Abstract
Heterogeneous datasets contain data that have been represented using
different data models, different structuring primitives, or different
modeling assumptions. Schema mappings, which represent the
relationship between schemas, are an important tool for managing and
sharing heterogeneous data. In this talk, I will give a brief overview
of work on creating schema mappings and their use in data exchange. I
will discuss the success of schema mapping and data exchange technology
in traditional enterprise integration applications, and then turn to some
of the challenges in using (and creating) mappings in peer data
sharing systems and large-scale web environments.
Biography
Renee J. Miller is a professor of computer science at the University
of Toronto and the Bell University Labs Chair of Information Systems.
She has received the US Presidential Early Career Award for Scientists
and Engineers (PECASE), an NSF CAREER Award, the Premier's Research
Excellence Award, and an IBM Faculty Award. She received her PhD in
Computer Science from the University of Wisconsin, Madison and
bachelor's degrees in Mathematics and Cognitive Science from MIT.