Classification and Coherence

Marijke Rijsberman
marijke@interfacility.com

These thoughts were originally formulated in response to a mailing-list debate over whether "data" and "information" are the same thing.

I argue here that the typical information architecture challenge lies not in finding THE correct classification (the one the user already has in his or her mind) but in presenting AN intelligible classification (one the user can make sense of).


 
   

Classification
Asserting that "DATA=INFORMATION" is not at all like asserting that "SODIUM=CHEMICAL ELEMENT." In fact, it's much more like asserting that "AFRICAN AMERICAN=BLACK." Depends on who you ask, no? Creole, white, negro, colored, biracial, and multiracial are all valid alternatives for some individuals, some populations.

Since racial categories do not map to biology, there is no scientific way of deciding between them. No correct choice. So we choose one term or another on the basis of the implications of the classification scheme it bespeaks. This is of course generally true of socially constructed categories.

What's interesting here is that--in spite of individual differences in linguistic competence, (social) knowledge, and cultural background--people share the general ability to infer different classifications on the basis of partial representations. Not only that, they can hold a variety of them in mind simultaneously and assign each an appropriate meaning (without having to agree to the "truth" or "correctness" of any of them).

I can comprehend that different racial terms may all be applied to the same person and can assign appropriate meanings to each of those terms, regardless of my own political conviction that racial classifications (any racial classifications) do serious social harm. In a similar way, I can recognize that, to some people, data and information are identical; that information architects and communication professionals tend to distinguish them in one way; that database folks tend to distinguish them in another way--and that each of those classification schemes has its own uses.

Again, what strikes me as most significant is the basic ability to juggle categories and classifications in incredibly complex ways. While it's unfortunate that there is no one correct way of organizing content in a website, the good news is that one isn't necessary. People can handle all sorts of schemes, as long as they can make some inferences about a scheme on the basis of the part of it that they can see, and as long as they get some constructive feedback on their search attempts. (Most folks can go from chairs to furniture or from furniture to chairs with equal competence, it seems to me, if these operations are suggested by their particular context.)

 

 

"A coherence approach [...] views discourse as process. In other words, texts are viewed as dynamic expressions of menaing jointly negotiated by particular speakers and hearers located in socio-cultural space and time." Jennifer Coates, "The Negotiation of Coherence in Face-to-Face Interaction" in Coherence in Spontaneous Text (Amsterdam, 1995)
 

Coherence and Collaboration
The linguistic concept of "coherence" also describes the ability to construct and juggle mental representations based on communicative acts. It seems particularly relevant here and reinforces the notion that it's not the one "correct" classification one should pursue, but mechanisms that help users infer the coherence of the classification adopted.

Coherence is not an attribute of an utterance or text, but is defined as a mental phenomenon (based on conventional characteristics at the lexical, grammatical, generic, and social levels) that arises out of a collaboration between interlocutors. The latter part strikes me as the most meaningful in this context. Coherence equates to arriving at a similar mental representation of a particular communication as the result of a "cognitive negotiation." In speech (or on mailing lists, for that matter), this negotiation can be performed through a variety of "grounding" procedures that help people establish that they have the same mental representation of a particular communication and adjust it where necessary. (Of course, this doesn't always work--very well or at all.) Writing is usually more problematic, in that the negotiation must generally be accomplished without collaboration, by the writer's more or less successful internalization of the mental representations of the reader.

With respect to classifications of content in a website, the notion of coherence suggests that interaction and collaboration with the user are far more important than the "correctness" of the scheme used. It is not necessary to find the hierarchy or hierarchies that users themselves apply to the material. Rather, you need to supply a classification scheme that they are able to construct a mental representation of. It's got to have some sort of logic, and it's got to offer users some mechanism for recovering what that logic is. That is, it's got to give some feedback about itself--progressively reveal itself in response to prodding.

Library classification systems, whatever their particular brilliances, tend to be highly uncommunicative, affording employment to generations of reference librarians. In that respect, they are more like writing, similarly lacking in interactive capabilities. On the other hand, it seems that the interactivity of the Web offers some hope of incorporating, maybe not the practical intelligence of the reference librarian, but perhaps some of his or her responsiveness.

 
 
© Interfacility 2005. All Rights Reserved. 650-868-3432, marijke@interfacility.com