"Some Settling of Contents May Have Occurred"

Professor Michael W. Berry
Department of Computer Science, University of Tennessee at Knoxville


March 15 (Monday)
A common message found on the inside tab of many popular cereal boxes might be appended to many of the search engines available on the World-Wide Web. As with your favorite cereal, you assume that the contents at the bottom are never as good as those at the top. Getting the right content or context of information is certainly an important goal for any information retrieval (IR) system.

Several approaches to retrieving textual information depend on a lexical match between words in users' requests and keywords or indices assigned to documents. However, due to the tremendous diversity in the words used by authors and readers, such lexical-matching methods are necessarily incomplete and imprecise. Recently, vector space IR models based on matrix decompositions from linear algebra have been used to estimate the implicit higher-order semantic structure or association of terms (or keywords) with documents. Using low-rank approximations to large sparse "term-by-document" matrices, both terms and documents can be encoded for concept-matching with users' queries in high-dimensional vector spaces. When based upon the singular value decomposition (SVD), this approach is commonly referred to as Latent Semantic Indexing, or LSI, since the subspaces spanned by the approximate singular vectors represent important associative relationships between terms and documents that are not evident in individual documents.
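As a rough illustration of the idea (not the speaker's implementation), the following Python sketch builds a tiny, made-up term-by-document matrix, truncates its SVD to rank k = 2, and compares plain lexical matching with matching in the reduced space. The vocabulary, the matrix entries, the helper names (lsi_fit, fold_in_query, cosine_scores), and the choice to represent documents by the rows of V_k are all assumptions made for this example; a real system would use a large sparse matrix and a sparse SVD solver.

```python
import numpy as np

# Hypothetical toy data: rows are terms, columns are documents.
terms = ["car", "automobile", "engine", "flower", "petal"]
A = np.array([
    [1, 0, 1, 0],   # car
    [0, 1, 1, 0],   # automobile
    [1, 1, 0, 0],   # engine
    [0, 0, 0, 1],   # flower
    [0, 0, 0, 1],   # petal
], dtype=float)


def lsi_fit(A, k):
    """Return the rank-k truncated SVD factors (U_k, s_k, Vt_k) of A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]


def fold_in_query(q, U_k, s_k):
    """Project a term-space query vector into the k-dimensional space."""
    return (q @ U_k) / s_k


def cosine_scores(q_hat, Vt_k):
    """Cosine similarity between the projected query and each document."""
    docs = Vt_k.T                      # one row per document
    norms = np.linalg.norm(docs, axis=1) * np.linalg.norm(q_hat)
    norms = np.where(norms == 0, 1.0, norms)   # avoid division by zero
    return (docs @ q_hat) / norms


# Query containing only the term "car".
q = np.zeros(len(terms))
q[terms.index("car")] = 1.0

# Lexical matching: count of query terms appearing in each document.
lexical = A.T @ q

# Concept matching in the rank-2 space.
U_k, s_k, Vt_k = lsi_fit(A, k=2)
scores = cosine_scores(fold_in_query(q, U_k, s_k), Vt_k)

print("lexical overlap:", lexical)
print("LSI cosine     :", np.round(scores, 3))
```

In this toy setup, document 1 contains "automobile" and "engine" but not "car", so its lexical overlap with the query is zero. In the rank-2 space it lands on the same direction as the documents that do mention "car", while the unrelated "flower"/"petal" document stays orthogonal to the query.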

This talk will focus on the motivation and design of LSI-based IR models.

Professor Berry received his B.S. in Mathematics from the University of Georgia, his M.S. in Applied Mathematics from North Carolina State University, and his Ph.D. in Computer Science from the University of Illinois at Urbana-Champaign. While working on his Ph.D. at the University of Illinois, he was also a computer scientist at the university's Center for Supercomputing Research and Development. He joined the Department of Computer Science at the University of Tennessee in 1991. He has been teaching and working on many research projects in the fields of numerical linear algebra, computer science, information retrieval, computational ecology, parallel computing, and performance evaluation. His hobbies include swimming, jogging, and fishing. To learn more about Professor Berry, please see his home page and his curriculum vitae.