Introduction to Subject Directories
Last session, Session 1, we indicated to you that
the tools available for finding material that is available through the public
internet are of several kinds:
- Using Smaller Directories, which are subdivided into either
  - Subject Guides
  - Reference Works
- Browsing the Web through the larger, less selective Directories of Web pages (also called Subject Directories)
- Searching the Web through Search Engines
We covered the third category, "Searching
the Web through Search Engines" in Session 1. This
session is devoted to the second category above, "Directories of Web pages"
which is also known to some users, more specifically, as "Subject Directories." In
a later session, we will be bringing up item 1: Reference Works, and what we call
Subject Guides.
General Subject Directories
Remember that a subject directory to publicly accessible web resources
differs from a search engine in two ways:
- Sites, not Pages: Subject
directories are lists of web sites; search engines lead
their users to web pages. In other words, Google
will return a list of web pages to your search request. Some of
those pages may, indeed, be top pages for a web site, but they don't
have to be. Google's engine doesn't care; it just determines that
something on that page matched your search request. Subject
directories are databases of entries for web sites, invariably taking
the user who clicks on an entry to the top page of a web site.
- Human Specialists, not Web Spiders or
Crawlers: Subject directories are composed of carefully selected web
sites that human specialists have recommended for inclusion in the directory.
Those two features delineate the differences
between the information that search engines and subject directories supply.
Subject directories are much, much smaller in their "holdings" (web site
entries) than search engines. While Google and
AllTheWeb talk in terms of billions of web pages indexed, even the
largest general subject directories talk in terms of thousands of web sites
covered. But, that is exactly what is so helpful about
subject directories: you use them to learn what specialists consider to be
the best web sites to go to for information. This is to say that
it may be better for you to find out what specialists-- experts about some topic
or area--consider to be the best sites to find good, reliable information on
that topic. In fact, librarians and
information specialists trust subject directories much, much more than
they trust the search engines as being likely to lead end-users who
don't know an area of knowledge to helpful information. You see, a
ten-year-old can get results from a search engine like Google.
But, "using" Google (getting it to return some
results to you, even though they may not be relevant) and searching it
with a good search strategy and with an understanding of
how to select appropriate results from what it returns to you--are two
completely different things. Subject directories,
though, are composed of only quality sites--sites purposefully
assessed and selected by subject experts as being the best on a
topic. Unfortunately, most web
searchers are blissfully ignorant of how to assess the worth of the web
pages they see, knowing nothing of selection criteria such as authority,
scope, and treatment, or how to use those criteria in deciding a page's
reliability, point-of-view, and accuracy. They can, however, rely
on subject directories whose listings are assembled by subject
specialists who know those selection criteria and know their knowledge
domain's excellent web sites very well.
Too, subject directories are useful as an
"entry strategy" into an unknown area, or even a partially unknown area.
You see, subject directories are built on hierarchical classification
schemes that allow users to learn, through an inspection of a topic's
subcategories, all of the "other" things that are known to exist "about"
a topic's subtopics. If you were interested in some aspect of
national politics, you would find it helpful to look in a subject
directory to see what those "other aspects" of national politics are,
and not go directly to what you think you should be
looking for. In other words, searching through a subject directory
is a learning experience, because you are given
information about other aspects of a topic that you hadn't thought about
or that just aren't in your own background.
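The "entry strategy" described above can be sketched in a few lines of Python. This is only an illustration, assuming a tiny made-up slice of a directory: the category names and the helper functions `subcategories` and `siblings` are hypothetical and do not come from any real subject directory.

```python
# Hypothetical slice of a subject directory's classification scheme.
# The category names are illustrative only, not taken from a real directory.
DIRECTORY = {
    "Government": {
        "Politics": {
            "Elections": {},
            "Political Parties": {},
            "Campaign Finance": {},
        },
        "Law": {},
    },
}


def subcategories(tree, path):
    """Return the sorted names of the subcategories found at the given path."""
    node = tree
    for name in path:
        node = node[name]  # walk down one level of the hierarchy
    return sorted(node)


def siblings(tree, path):
    """Return the other categories that share a parent with the last path element."""
    return [name for name in subcategories(tree, path[:-1]) if name != path[-1]]


# Someone who came looking only for "Elections" also learns what surrounds it.
print(siblings(DIRECTORY, ["Government", "Politics", "Elections"]))
```

The point of the sketch is the second function: browsing a hierarchy shows you the neighbors of the topic you started with, which a keyword search would not.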
The recent trade paperback on Google,
by two of its software engineers and a Stanford instructor, How to
Do Everything with Google, gives these possible uses of subject
directories (p. 255):
- Familiarize yourself with a topic
- Get suggestions for ways to narrow
your search
- Come up with ideas for query terms
- Grasp the scope of a given category
- Find categories associated with a
particular topic
- Find lists of items
To repeat, then, you should give serious
consideration to "learning" about a topic or concept or category of knowledge by
taking a look at how the managers of subject directory classification schemes
have "surrounded" what you think you need to find with other topics or aspects,
and you should see what they define as the subcategories of a topic or
classification category.
The Top General Subject Directories
The first general subject directories we list in Dr. Bob's
Searching / Browsing
page are
The three that we are going to mention to you
here in the session are Yahoo!, About.com, and the
Open Directory Project. We will talk about them in that
order too, because, as you might know, Yahoo! was the first, most
popular subject directory that "drove" the Internet for years in the 1990's.
1. Yahoo!
The major general subject directories
include the old standbys that have been around since "ancient" times
(in Web years, that means the middle 1990's!):
Yahoo!
Yahoo! has 15 major categories,
with innumerable subcategories beneath those 15.

Below is the result of a search through
the Yahoo! directory for information about a contemporary
scholar who has been making quite a stir in academe recently.
His name is Steven Pinker, the author of a number of
thought-provoking--and to some social scientists,
controversial--books:
The Blank Slate: The Modern Denial of Human Nature,
The Language Instinct : How the Mind Creates Language, and
How the Mind Works. As you might expect, the 10 sites
listed in Yahoo!'s Directory area are dramatically
fewer than the number of web pages that a search of Yahoo! would
return (Yahoo! says "about 34,800" web pages),
but these 10 sites have been selected by human beings
who are subject experts and who should therefore know where the good
sources of information about Steven Pinker are on the web.

Notice, too, what we are told at the top
of this page, the complete, category-by-subsequent-subcategory
classification of the page's location in Yahoo!'s
hierarchical classification:
Social Science > Linguistics and Human Languages > Linguists > Pinker, Steven
An understanding of what other topics might be under the subcategory
Linguists might be of interest to you or it may not, but you now
know that the Yahoo! directory classifies Pinker as a
linguist. And, of course, you know that the 10 sites given for
information about Pinker are likely to give you a good deal of
relevant information about who he is, what he has done, and what
others think of what he has done.
2. About.com
Another extremely
popular general subject directory of the Web is the
About.com directory.
About.com has been around since the middle 1990's (1997), having
started under another name (The Mining Company). Its
subject areas (topics) are called "guides," and it arranges its
guides into 23 categories that are called "channels":
The About network consists of hundreds of Guide sites
neatly organized into 23 channels. The sites cover more than
50,000 subjects with over 1 million links to the best resources on
the Net and the fastest-growing archive of high quality original
content. Topics range from pregnancy to cars, palm pilots to
painting, weight loss to video game strategies. No one has greater
depth and breadth than About.
A Brief History
In February of 1997, Scott Kurnit and a dedicated team launched
The Mining Company, the first information network to integrate
the Internet's most productive agent - people. The
company quickly grew in size and scale and in 1999 the company was
renamed About, to reflect its breadth of content, services
and ease of use. Today, About is visited by one in five
online users each month, making it one of the most popular
destinations on the Net.
3. Open Directory Project
Organized into 16 categories, the
Open Directory Project today has
"4,057,678 sites, 60,704 editors, and 544,142 categories." Probably the
most re-used directory service on the Web, the Open Directory
Project (ODP) supplies web site selections that also appear in, or
at least form the basis of, other web search
services, such as Google, AOL, DirectHit,
Hotbot, and Lycos.
Most web users probably
access ODP results through Google, the most frequently used search
engine for the Web. There, it is simply referred to in Google's
results as the Google Directory. However, here is what an Open
Directory Project result actually looks like:

Notice that this result had 7 levels in
its category-sub-category hierarchy:
Reference: Libraries: Library and
Information Science: Technical Services: Cataloguing:
Classification: Library of Congress
Also notice that the user is able to
search the directory for the location of material on a particular topic
in the search box at the top of the page shown.
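To make the idea of a category-by-subcategory path concrete, here is a minimal Python sketch that splits the two classification paths quoted in this session into their levels. The function name `parse_category_path` is our own invention for illustration; neither Yahoo! nor the Open Directory Project is known to offer such a function.

```python
def parse_category_path(path):
    """Split a directory classification path into its category levels.

    Accepts ">" (as in the Yahoo! example) or ":" (as in the ODP example)
    as the separator between levels.
    """
    separator = ">" if ">" in path else ":"
    return [level.strip() for level in path.split(separator) if level.strip()]


yahoo_path = "Social Science > Linguistics and Human Languages > Linguists > Pinker, Steven"
odp_path = (
    "Reference: Libraries: Library and Information Science: "
    "Technical Services: Cataloguing: Classification: Library of Congress"
)

yahoo_levels = parse_category_path(yahoo_path)
odp_levels = parse_category_path(odp_path)

# The parent category tells you how the directory classifies the entry.
print(yahoo_levels[-2])
# The number of levels is the depth of the entry in the hierarchy.
print(len(odp_levels))
```

Reading a path from left to right moves from the broadest category to the narrowest, so the next-to-last element is the immediate category under which the directory's editors filed the entry.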
Academic Subject Directories
You will notice on the
Searching / Browsing page a set
of 4 "academic" subject directories:
These directories are much, much smaller than the
wider-scoped general subject directories just talked about above. But
these 4 academic directories are absolutely wonderful sources of quality
information, even though their scope is narrower than the full, general subject
directories.
1. Librarians' Index to the Internet
All of them are
interesting products, such as the Librarians' Index to the
Internet. Begun in 1990 as a Berkeley reference
librarian's Gopher bookmarks file, it was migrated to the Berkeley
Public Library's web server in 1993 and renamed the Berkeley
Public Library Index to the Internet. In late 1996, it again
changed its format, and added a search engine and Library of
Congress Subject Headings. In
March 1997, it was moved to the Berkeley SunSITE where it is
hosted, courtesy of UC Berkeley SunSITE staff, and renamed the
Librarians' Index to the Internet.
Lii, as it is called by its
managers, is a very well selected set of web sites that are well
annotated in the Lii directory. As you can see below, again
you are able to either browse down through its classification
categories or search the site by a keyword or phrase in the search
box at the top.

2. Infomine
Another product of the California
higher educational system is
Infomine, a directory to over 115,000 "academically valuable
resources." It differs a bit in its menuing system, giving
its users a search box to enter its resource base:

You, the user, have a number of
features that you can search by, limit your results to, and browse
by.
Finding Quality Information: Evaluating Information on Web Sites
Before we finish this session, we must say again
that you must have a good sense of what information you are going to be willing
to accept as legitimate and/or appropriate for your purposes. In the
professional area, Library and Information Studies, a good deal of attention is
given to the process of selecting and acquiring resources for a library or some
other form of information center. Not just everything in print or
available via other media is appropriate for a collection of academic resources:
librarians and information specialists have to know what is acceptable and what
isn't . . . before they begin to acquire and hold materials in any collection.
Similarly, you, the user of Internet-based information, should view most of the
resources you get from a search engine search as an unselected, unintegrated
mass of materials, and you need to be alert enough to
judge which of those materials are acceptable and which are
unacceptable for scholarly and research purposes.
In this sense, the Internet is analogous to--not
a library, but instead--a huge flea market of resources. There are lots of
personal opinions out there on the Web, unverified and unsupported in a
scholarly sense. There are also lots of technically accurate sources of
information on the Internet, but sources that are not reflective or representative
of a complete picture of some topic or thing: you might visit an automobile
manufacturer's site to see what good things they can say about a particular
model of an automobile, but you would surely refrain from relying on that same
site as a source for the bad features of that same model of car! You
know what you should expect in the way of valid and reliable information, as you
understand the motives of the owners of the web site.
One excellent classification of web sites
according to their information quality is due to Alexander and Tate, the authors
of a book called
Web Wisdom, published a few years ago (March 1999). They suggest five
different categories of web sites:
- Advocacy pages
- Business / Marketing pages
- News pages
- Informational pages
- Personal pages
Janet Alexander and Marsha Ann Tate are
librarians at the Widener University library, Chester, Pennsylvania. Their
book is intended to offer their readers assistance in evaluating or establishing
information quality on the World Wide Web. They manage a companion web
site, Evaluating
Web Resources, from which the following five descriptions are taken.
If you follow the link associated with each of the five categories below, you
will find a series of criteria the authors recommend to you in assessing a
site's information quality :
1. An Advocacy Web Page
is one sponsored by an organization
attempting to influence public opinion (that is, one trying to sell ideas).
The URL address of the page frequently ends in .org (organization).
Examples: National Abortion and Reproductive Rights Action League, the
National Right to Life Committee, the Democratic Party, the Republican Party
2. A Business/Marketing Web Page
is one sponsored by a commercial enterprise
(usually it is a page trying to promote or sell products). The URL address
of the page frequently ends in .com (commercial). Examples: Adobe Systems,
Inc., the Coca Cola Company, and numerous other large and small companies
using the Web for business purposes.
3. A News Web Page
is one whose primary purpose is to provide
extremely current information. The URL address of the page usually ends in
.com (commercial). Examples: USA Today, Philadelphia Inquirer, CNN
4. An Informational Web Page
is one whose purpose is to present factual
information. The URL address frequently ends in .edu or .gov, as many of
these pages are sponsored by educational institutions or government
agencies. Examples: Dictionaries, thesauri, directories,
transportation schedules, calendars of events, statistical data, and other
factual information such as reports, presentations of research, or
information about a topic
5. A Personal Web Page
is one published by an individual who may or
may not be affiliated with a larger institution. Although the URL address of
the page may have a variety of endings (e.g. .com, .edu, etc.), a tilde (~) is
frequently embedded somewhere in the URL.
--Alexander, Janet E. and Tate, Marsha A.
Web Wisdom: How to Evaluate and Create Information Quality on the Web.
March 1999. Lawrence Erlbaum Assoc. Also see their web page, Evaluating Web Resources.
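The URL clues in the five descriptions above can be turned into a rough first guess in Python. This is a sketch only: the function `guess_page_type` and its labels are our own, the example.* addresses are placeholders, and the descriptions themselves say these endings apply "frequently" or "usually," not always. A guess like this is a starting point for applying the authors' criteria, never a substitute for them.

```python
from urllib.parse import urlparse

# Endings named in the five descriptions, mapped to the page type they suggest.
# ".com" is listed for both business/marketing pages and news pages.
LIKELY_TYPES = {
    ".org": "advocacy",
    ".com": "business/marketing or news",
    ".edu": "informational",
    ".gov": "informational",
}


def guess_page_type(url):
    """Return a rough first guess at a page's type from its URL alone."""
    parsed = urlparse(url)
    # A tilde in the path often marks an individual's personal page,
    # whatever the ending of the host name is.
    if "~" in parsed.path:
        return "personal"
    host = parsed.hostname or ""
    for ending, label in LIKELY_TYPES.items():
        if host.endswith(ending):
            return label
    return "unknown"


print(guess_page_type("http://www.example.edu/~jdoe/index.html"))
print(guess_page_type("http://www.example.org/"))
```

The tilde check comes first because a personal page hosted at an educational institution would otherwise be mistaken for an informational page sponsored by that institution.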