Rhetoric in the Age of the World Wide Web

Dianne L. Juby
University of Oklahoma, Department of English
juby@ou.edu


Abstract

The profusion of information on the World Wide Web presents new challenges for search engine designers. This paper suggests that the discipline of rhetoric provides valuable strategies for rethinking search engine possibilities. Information retrieval theorists have proposed a radically different indexing method based on argumentation rather than semantic networks for bibliographic and full text databases. Their proposals are expanded here to explore the potential benefits of an argumentational and rhetorical approach to Web searching.


Outlining the pros and cons of various World Wide Web search engines, Steinberg (1996) begins his critique of human indexer-assigned subject terms with an example of a classification problem faced by Yahoo. Members of the Messianic Jewish Alliance of America were distraught when Yahoo indexed their site under Society and Culture: Religion: Judaism. After protests from MJAA members, Yahoo placed the site under "Christianity," although this category also proved unsatisfactory. From this example, Steinberg launches into the argument that "we learn to automatically compensate for right-wing bias while reading The Wall Street Journal’s editorial page," and therefore we can similarly "learn to adjust for the perspective that Yahoo! embodies." Interestingly, Steinberg shifts the responsibility for dealing with bias from the classifiers to the audience, which supposedly seeks an objective reality lying behind the distortion. But is there really an objective reality lying behind either the MJAA’s categorical dilemma or a WSJ editorial? What if there were no perfect scheme, no potential classification of knowledge deserving of ontological status, but, rather, mere humans in various local situations making arguments? When we read a WSJ editorial, we know we are reading an argument, not a distorted or biased representation of "knowledge." Instead of "adjusting for bias," we would ourselves be situated humans finding, making, and reacting to arguments.

Could arguments as well as terms and concepts be indexed, and if so, could World Wide Web search engines effectively search for them? Kircz (1991) proposes that the argumentational structure of scientific articles holds the potential to improve information retrieval in bibliographic and full text databases. Sillince (1992a, 1992b) expands that suggestion to outline an arguer search program for all types of academic articles. Kircz and Sillince draw their argumentation theory from Chaim Perelman and Lucie Olbrechts-Tyteca’s The New Rhetoric (1958), a work which reclaimed informal reasoning, the classical realm of rhetoric, as a legitimate sphere of human activity and knowledge-making.

This paper explores a few rhetorical questions for expanding the proposals of Kircz and Sillince to Web search engines. If, as Sillince contends, academic articles should be searchable by argument, how much more do we need to rethink the Web’s search capabilities in this rhetorical direction? The arguments which Web users produce are infinitely more varied than those in academic articles. The Web expands the kinds of retrievable documents and files far beyond the limited types of discourse production in professional journals. To what extent are the vast numbers of documents on the Web making arguments? What assumptions do Web indexers and search engine designers make about the way language works, and how does the discipline of rhetoric speak to the problems that they identify? Would the ability to search for arguments on the Web add value to the information retrieved, helping searchers locate more relevant documents and fewer irrelevant ones, and assisting users in finding what they really have an interest in? Argumentation is interested discourse; it is situated discourse; argumentation is about the adherence of an audience to a position; it is about "the dialectical relationship between thought and action" (Perelman and Olbrechts-Tyteca, p. 509).

Representing Content

The New Rhetoric spins out its theory of argumentation between first and final sentences which seem to blame Descartes for three hundred years of the neglect of Reason in the human sciences. But the authors’ more contemporary concerns stem from the "brilliant development" of modern formal logic during the last century. Reacting against this decontextualizing logical empiricism which denies rationality to the vast majority of human endeavors, they seek primarily an answer to the question: "How do we reason about values?" Their concern is to reappropriate Reason and rehabilitate rhetoric for projects dependent on informal logic, e.g., value theory, law, ethics, and politics.

From Descartes and Thomas Sprat to the logical positivists, philosophers espousing scientific method and formal logic have rejected rhetoric as superfluous language designed to hoodwink an emotionally susceptible audience. The authors of The New Rhetoric write that "[t]he dissociation between form and substance, which has resulted in the dehumanization of the very notion of method, has also had the consequence of accentuating the irrational aspect of rhetoric" (p. 508). Substance, or content, becomes the privileged term, while form takes on a flavor of mere style, the fancy language which poets and literary types indulge, an appeal to the emotions, a verbal accessorizing of the real substance or content. The two elements are seen as completely separable: content exists independently of language and is objectively available through rational processes; rhetoric distorts this rational apprehension and communication of objective content. This understanding of content, communication, and rhetoric fuels the assumptions upon which search engines are designed. We have already seen one typical manifestation of this assumption that rhetoric clouds otherwise objective information in Steinberg’s accusations of the subjective bias of human indexers.

Steinberg makes a telling comment, which he doesn’t bother to examine at length, about the MJAA incident: "Yet, [Jerry Yang, one of Yahoo’s founders] knows the MJAA was pushed around because it didn’t have mainstream Judaism’s clout." Web indexing and searching are not problems of representing reality accurately, of seeking objective knowledge, or of adjusting for bias, but are problems of human beings engaged in discursive acts, constrained by politics and power.

There are problems of representation here, but in the political sense rather than in the sense of a word representing or naming reality. Yahoo’s, and the MJAA’s, problem was one of representation. How did the MJAA want to be represented? How did another group, in many ways a more powerful group, represent them? Blair (1990), an information retrieval theorist, argues that representation is precisely the issue in indexing. Claiming that the most pervasive characteristic of searching is the "indeterminacy inherent in the representation of documents," Blair contradicts information retrieval researchers who consider this indeterminacy to be a "minor irritant" and who assume a "perfectly rational" representation of documents, reducing complex problems of meaning and language use to simple data retrieval. Instead, he argues that these problems of representation are "not just due to sloppiness or irrationality, but are products of much more fundamental linguistic processes" (p. 23). Subject descriptions assigned by human indexers are problematic not because humans assign them, but because they "are used to represent what has been referred to as the ‘intellectual content’ of a document or similar kind of work. This is usually done by selecting a small number of subject terms which will represent what the document is ‘about’," the assumption being that "content-bearing" words unproblematically "bear content" (p. 156).

Blair, Kircz, and Sillince all cite indexer consistency studies which strongly indicate that the determination that a particular subject term should represent a given document is never as clear as this theory of representation would lead one to believe:

  1. If two groups of people construct thesauri in a particular subject area, the overlap of index terms will only be 60%.
  2. Two indexers using the same thesaurus on the same document use common index terms in only 30% of cases.
  3. The output from two experienced database searchers has only 40% overlap.
  4. Experts’ judgements of relevance concur in only 60% of cases. (Sillince, 1992b, p. 392)
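To make concrete what "overlap" between two indexers can mean, here is a minimal sketch in Python. It rests on assumptions that are not in the studies cited: the two term sets are invented for illustration, and overlap is measured as shared terms divided by all distinct terms used (the Jaccard measure), which may not be the measure those studies employed.

```python
def term_overlap(terms_a, terms_b):
    """Share of distinct index terms that both indexers assigned (Jaccard measure)."""
    a, b = set(terms_a), set(terms_b)
    if not a and not b:
        return 0.0  # no terms assigned by either indexer
    return len(a & b) / len(a | b)

# Hypothetical subject terms assigned to the same document by two indexers.
indexer_one = ["information retrieval", "indexing", "rhetoric", "argumentation"]
indexer_two = ["indexing", "search engines", "rhetoric", "world wide web"]

print(round(term_overlap(indexer_one, indexer_two), 2))  # 0.33
```

Two shared terms out of six distinct ones yield an overlap of about a third, even though each indexer could defend every choice as describing what the document is "about."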

Blair draws on Wittgenstein’s language theories to argue that information is not value-free; it exists within particular contexts, and, as Wittgenstein put it, "We don’t start from certain words, but from certain occasions or activities." Blair calls for an examination of the various situations, or Wittgenstein’s Forms of Life, and language games in which subject descriptions are used. Similarly, Sillince argues that current indexing methods which attempt to represent knowledge as a semantic network are out of date, having been developed when cognitive theory was based on formal logic. Semantic-based approaches assume that "information is value-free, that there is only one way of interpreting it, and that it is unidimensional in only consisting of ‘facts’ rather than possessing the attributes of intention, goal, activity, theory, evidence, and so on" (1992a, p. 388).

Keywords and Contexts

After portraying Yahoo’s reliance on human intelligence as a disgraceful fall into subjectivity, Steinberg discusses keyword searching, suggesting that this more automated approach avoids the problems of human-assigned categories. Returning to the MJAA example, sites about Messianic Judaism could be located by searching for the combination of keywords Jewish and Jesus. Here Steinberg faults keyword searching for its inability to provide contextual cues, although he doesn’t add what is obvious from an argumentational perspective, that is, that this particular keyword combination could potentially retrieve a great variety of sites representing positions very different from those of the MJAA. A combination of keywords, no matter how carefully chosen, does not an argument make.

Steinberg, however, persists in seeing indexing in terms of content representation: "The ‘problem’ of information retrieval can actually be nailed down to two issues: synonymy and homonymy." Keyword searches for film won’t retrieve sites that only use the synonymous movie; and they will retrieve homonymous but irrelevant sites concerning, for example, a film of oil. It is true that these will remain problems as long as search engine developers maintain the assumption that content is unproblematically "there" and that content is what humans search for. Even a more context-aware approach, for instance the Latent Semantic Indexing that Steinberg discusses, still assumes that concepts (determined by the words in a document surrounding, say, film) can determine more accurately what a document is "about." In other words, keyword searching could become sophisticated enough to search for concepts rather than single ambiguous words. But still the goal is to more adequately represent content. Semantic indexing can’t fill what Steinberg calls the need for something "in between" the subjectivism of the human categorizer and the contextless ambiguity of automated keyword searching.
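The synonymy and homonymy problems can be illustrated with a few lines of Python. This is only a sketch under stated assumptions: the three "sites" and their texts are invented, and the matching rule (exact whole-word match, ignoring case) is a simplification that does not describe any actual search engine.

```python
# Hypothetical mini-corpus; site names and texts are invented for illustration.
DOCS = {
    "cinema-review": "a review of a classic movie and its director",
    "film-festival": "the film festival opens with a documentary",
    "engine-care": "a thin film of oil protects the engine parts",
}

def keyword_search(term, docs):
    """Return ids of documents whose text contains `term` as a whole word."""
    term = term.lower()
    return sorted(
        doc_id for doc_id, text in docs.items()
        if term in text.lower().split()
    )

print(keyword_search("film", DOCS))  # ['engine-care', 'film-festival']
```

The search misses the review that uses only the word movie (synonymy) and retrieves the page about a film of oil (homonymy). Neither result says anything about what position, if any, a retrieved page argues.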

Rhetoric, I suggest, can provide that "in-between." But it cannot function in its full power (or at all) unless we manage to abandon our ingrained notion that language can fully and accurately represent, or accurately point to, an external reality. Steinberg’s final example about the problems of indexing fiction, "or anything that relies on metaphor or allegory," subscribes to the notion of rhetoric as mere style, ornamentation, play. The Hobbit, for instance, can’t be classified or indexed, according to Steinberg, because its meaning, its content, is completely subjective. Search technology, he writes, "can only work when the meaning of a document is directly correlated to the words it contains." Although he has just written an entire article arguing why this cannot happen (because of bias or lack of context), he still falls back on articulating the goal of representing content. A rhetorical approach to indexing, however, would argue that agonizing over synonymy and homonymy, or subjective distortions of what should be objectively identifiable content, is not the way to nail down the problem. A rhetorical approach would ask us to look through different lenses, see a different problem, and find a new solution.

Rhetorical Questions

Basing his proposed arguer-program on Perelman and Olbrechts-Tyteca’s The New Rhetoric, Sillince suggests that "[b]ecause of the argumentational nature of articles, the taxonomies revealed by rhetoric may be very different from those conventionally encountered by indexers" (1992a, p. 389). Kircz argues that "a new net has to be spanned over the document, this time not a set of semantic equivalents, but an argumentational or rhetorical network" (p. 355). If this is the case for academic and scientific articles in electronic databases, is it not even more necessary for the barrage of arguments, positions, and posturings, if you will, which we encounter on the Web? Kircz began with the argumentational structure of articles from scientific disciplines; Sillince broadened that understanding of argument to all scholarly disciplines. How much more, then, is a rhetorical perspective needed for the wild undisciplined nature of the Web?

Unfortunately, Web search engines still struggle with problems of categorization and keyword searching, which are semantic-based systems that cannot do, and do not even consider the possibility that they should be doing, what Sillince suggests; that is, they cannot represent a searcher’s search for a situated argument. In their conclusion, the authors of The New Rhetoric articulate the difference between the two views:

	The effect of restricting logic to the examination of the proofs termed ‘analytical’ by
	Aristotle . . . was to remove from the study of reasoning all reference to argumentation.
	We hope that our treatise may provoke a salutary reaction and that the mere fact of its
	having been written may for the future prevent the reduction of all the techniques of
	proof to formal logic and the habit of seeing nothing in reason except the faculty to
	calculate. (Perelman and Olbrechts-Tyteca, pp. 509-510)

Rather than continuing to invest in improving the calculating ability of present search engines, could we perhaps develop systems that allow us to search for arguments, for positions and adherences to arguments, for the alliances, liaisons and dissociations that situate documents? What arguments are being made in the oft-derided multitudes of junk on the Web? Do we classify materials as junk because we deem their content unsatisfactory, or because they fail to provide arguments that persuade us they are worth examining? Some group or individual, after all, found these materials compelling enough to invest labor in putting them on the Web. What arguments did they have for doing so?

Representing the "I"

Other popular writers see the search problem not only as one of content representation but also as one of the sheer magnitude of content that floods across our screens. We strike a few keys and are inundated with a profusion of content, a tiny portion of which we find useful for our own fleeting purposes. To assist in sifting through the flood, one projected direction for cyberspace searching would pre-package context rather than content and turn it into a salable commodity. Saffo (1994) argues that context has become the scarcest resource in cyberspace. The future, he projects, belongs to those who can help us navigate the flood of content by giving us ways to filter, sift, and sort through the bounty. His vision of this automated creature that will help us weed out and categorize is simply "point-of-view." He even predicts that context engines will be developed in which individuals with identifiable points of view--Walter Cronkite, Howard Stern, John Updike, Siskel and Ebert--will do our searching for us.

Whalen (1995), too, claims that the personification of intelligent agents is a possible future for searching the Net. His point-of-view knowbots include Rush Limbaugh and Ralph Nader; end users could decide whose eyes they wanted to view the Net through and subscribe to a point of view to filter their information overload for them. Saffo and Whalen recognize the urgent need for context to supplement content searching. However, the suggestion of using point-of-view agents replicates the same theory of language and representation that plagued content searching in the first place. The assumption that Rush Limbaugh, or anyone else, comprises a single, unified identity or subject position that can be adequately represented by the linguistic construction Rush Limbaugh has been questioned. As Chantal Mouffe (1988) argues, "we are in fact always multiple and contradictory subjects, inhabitants of a diversity of communities . . . constructed by a variety of discourses and precariously and temporarily sutured at the intersection of those subject positions" (p. 44). Rush Limbaugh, then, points to a temporary and fluid intersection of shifting and multiple positions comprising gender, race, class, sexual orientation, nationality, and so on. Which of those temporary positionalities will be searching for you? Rush’s politics or his gender, his religion or his sexual orientation, his social class or his consumer commitments? And each of those shifting positions is influenced, constrained, and bound by the particular rhetorical situation of the moment. Point of view, complicated in this way, is another version of Catherine Hobbs’s topos of positionality discussed in this volume.

Lanham (1993) articulates an understanding of rhetoric as an information system that does not function by delivering packets of content, but rather functions economically, by allocating emphasis and attention. Here attention is the scarce resource, but unlike point of view, attention connotes shifting interests and alliances, multiple positionalities that attend to one particularity in one situation, to another in a different situation. "Librarians of electronic information," Lanham writes, "find their job now a radically rhetorical one--they must consciously construct human-attention structures rather than assemble a collection of books according to commonly accepted rules" (p. 134).

McKeon (1970) defines rhetoric as "an art of invention and disposition: it is an art of communication between a speaker and his audience, and it is therefore an art of construction of the subject-matter of communication, that is, of anything whatever that can be an object of attention" (p. 108). McKeon’s definition asserts that subject matter (what I have been calling content) is constructed as an object of attention within an act of communication. Far from simply a vast supply of content needing to be coded for retrieval, the Web, for rhetoricians, is a space for the production of abundant textual objects that, having been constructed by humans in particular local situations, manifest particular positionalities and are intended to be objects of attention or persuasion. If the task of designing effective search engines for the undisciplined abundance of the Web forces us, as Lanham says, to find new rules, to think in terms of attention structures rather than content, then rhetoric should be the art by which we invent those new rules.




References



Blair, D.C. (1990). Language and Representation in Information Retrieval.
	Amsterdam: Elsevier Science Publishers.

Kircz, J.G. (1991). Rhetorical structure of scientific articles: the case for
	argumentational analysis in information retrieval. Journal of Documentation,
	47(4), 354-372.

Lanham, R.A. (1993). The Electronic Word: Democracy, Technology, and the Arts.
	Chicago: University of Chicago Press.

McKeon, R. (1970). "Philosophy of communications and the arts."
	In M. Backman (Ed.), Rhetoric: Essays in Invention and Discovery [1987]
	(pp. 95-120). Woodbridge, CT: Ox Bow Press.

Mouffe, C. (1988). "Radical democracy: modern or postmodern?"
	In A. Ross (Ed.), Universal Abandon? (pp. 31-45).
	Minneapolis: University of Minnesota Press.

Perelman, C. & Olbrechts-Tyteca, L. (1958). The New Rhetoric: A Treatise on Argumentation.
	Trans. J. Wilkinson and P. Weaver [1969]. Notre Dame: University of Notre Dame Press.

Saffo, P. (1994). It’s the context, stupid. [On-line]. Available:
	http://www.hotwired.com/wired/2.03/departments/idees.fortes/context/html

Sillince, J.A.A. (1992a). Argumentation-based indexing for information retrieval from learned
	articles. Journal of Documentation, 48(4), 387-405.

----- (1992b). Literature searching with unclear objectives: a new approach using
	argumentation. Online Review, 16(6), 391-409.

Steinberg, S. G. (1996). Seek and ye shall find (maybe). [On-line]. Available:
	http://www.hotwired.com/wired/4.05/features/indexweb.html

Whalen, J. (1995). Super searcher. [On-line]. Available:
	http://www.hotwired.com/wired/3.05/features/searcher.html



Updated: 30-Jun-97