Charles R. Hildreth, Ph.D.
It is time to answer our original question: Are we moving in the right direction with online catalog design? The answer has to be, "only in small measure." Innovative design work on the user-system interface, including GUIs, has made many of the second-generation online catalogs far easier to use than the conventional, dial-up commercial database search systems after which they were modeled. There is little evidence, however, that the "in-house" university-based developers or commercial suppliers of the majority of installed, operational online catalogs are developing a new generation of OPACs based on the probabilistic or exploratory design models. "Third-generation" online catalogs are not yet generally available in the mainstream library system marketplace. Only a few of these more advanced catalogs have been developed, primarily as prototype or demonstration systems. The "Kids Catalog" developed by CARL Systems has an innovative GUI search and browse interface, but the underlying search engine employs conventional keyword, Boolean match and retrieval techniques.
To break out of the query-oriented, Boolean mind-set, we need to turn the conventional query-first-then-browse paradigm upside down. Searching by exploration, recognition, and discovery in a well-structured bibliographic space should be the primary search interface provided to information seekers, augmented by secondary query expansion methods and a choice of similarity operations. This paradigm shift will require that our overriding concern for bibliographic record content and structure give way to an equal or greater concern for the expanded content and structure of our bibliographic databases.
To move forward from the second-generation OPAC stage on which we seem to be stalled, a new design vision is needed that draws on both the probabilistic model and the browse/explore model. It is hoped that designers of online catalogs will begin to develop systems that combine the natural language, best-match, ranked-output approach with hypertext, exploratory search and navigation methods.
The World Wide Web has produced an expanded awareness of the advantages of hypertext lookup and retrieval. Although recently implemented by several of the commercial online search services, probabilistic retrieval techniques are not well understood by librarians and search specialists in the United States, who still favor the Boolean approach. The popularity of the Internet's WAIS (Wide Area Information Servers) probabilistic indexing and search approach seems to have waned. It is time for those who influence system designers to acquire a greater appreciation for a theoretically sound, feasible information retrieval approach that 1) permits and encourages expressive natural language search input longer than the 2-3 words typical of keyword OPAC queries, 2) employs flexible term-document matching algorithms that retrieve potentially relevant documents and rank them in order of their relevance to the user's query, and 3) provides a simple mechanism for the user to give relevance feedback on one or more of the records retrieved, which the system then uses to refine the search in progress and improve its output.
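The three capabilities just described can be illustrated with a small sketch. It is only one possible reading, under stated assumptions: TF-IDF term weighting with cosine similarity stands in for a "flexible term-document matching algorithm" (a vector-space substitute, not a true probabilistic model), and a Rocchio-style update stands in for relevance feedback. Neither technique is named in the text above, and the records, function names, and parameter values are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical toy records, invented purely for illustration.
RECORDS = {
    "r1": "introduction to online catalog design",
    "r2": "boolean searching in commercial database systems",
    "r3": "browsing and exploring library catalogs with hypertext",
    "r4": "ranking documents by relevance in information retrieval",
}

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def build_idf(records):
    """Inverse document frequency: rarer terms receive higher weights."""
    n = len(records)
    df = Counter()
    for text in records.values():
        df.update(set(tokenize(text)))
    return {term: math.log(1 + n / count) for term, count in df.items()}

def vectorize(text, idf):
    """Weighted term vector; terms absent from the collection are ignored."""
    tf = Counter(tokenize(text))
    return {t: c * idf[t] for t, c in tf.items() if t in idf}

def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank(query_vec, doc_vecs):
    """Best-match retrieval: every partially matching record, best first."""
    scored = [(rid, cosine(query_vec, vec)) for rid, vec in doc_vecs.items()]
    scored = [pair for pair in scored if pair[1] > 0]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))

def refine(query_vec, relevant_vecs, alpha=1.0, beta=0.75):
    """Rocchio-style feedback (an assumption of this sketch): move the
    query toward the average of the records the user marked relevant."""
    new_query = {t: alpha * w for t, w in query_vec.items()}
    for vec in relevant_vecs:
        for t, w in vec.items():
            new_query[t] = new_query.get(t, 0.0) + beta * w / len(relevant_vecs)
    return new_query

idf = build_idf(RECORDS)
doc_vecs = {rid: vectorize(text, idf) for rid, text in RECORDS.items()}

# 1) A natural language query, longer than two or three keywords.
query = vectorize("how do i find books about designing a library catalog", idf)

# 2) Ranked, best-match output; no Boolean operators are required.
first = rank(query, doc_vecs)

# 3) The user marks record r3 as relevant; the search is then refined.
second = rank(refine(query, [doc_vecs["r3"]]), doc_vecs)

print([rid for rid, _ in first])
print([rid for rid, _ in second])
```

In this toy collection the initial query matches only two records, each on a single term. Once the user marks r3 as relevant, the refined query takes on that record's vocabulary and r3 moves to the top of the ranking. A production system would add stemming, stopword handling, and a properly evaluated weighting scheme.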
The major functional improvements that we believe will define the next generation of online catalogs are listed below.
Innovation in design will be encouraged, as there are many ways to define and implement these features. Progress will almost certainly occur in incremental steps, but the third-generation online catalog will be a wholly new kind of retrieval system because it will be based on much more representative models of actual user information seeking behaviors.