Information organization

Personal Information Architectures

Gene has an interesting post about personal information architectures, something he spoke about at the recent Future of IA Retreat. While the recent interest in social classification and folksonomy is a large reason to talk about personal information architecture, I think that Thomas Vander Wal has also been talking about the issue for a few years as the Personal Info Cloud.

Wilshire Metadata Conference 2004 Trip Report

Not all metadata are created equal, as I learned last year when I attended the Wilshire Metadata & DAMA International Conference in Orlando, FL. However, when I sat in their meetings and learned about this new aspect of metadata, I discovered that there are some similarities of concern: basically information organization, management, access, and retrieval.

If you come from the database modeling/administration world, I hear this is their equivalent to the IA Summit or CHI. The 2004 conference just concluded in Los Angeles. Their trip report is very informative, with enough information to get you to dig into new ways of thinking about information management.

Wilshire Metadata & DAMA International Conference 2004 Trip Report

Using KWIC and KWOC displays on A-Z indexes

Keyword in context (KWIC) and keyword out of context (KWOC) displays might be a useful way to make more of the items in an A-Z index findable without requiring too much human intervention using thesauri. This might benefit organizations that have a CIO handling the site's CMS, for example, but don't have an IA or other dedicated content person to work on creating alternative labels for pages. I haven't noticed IA articles on A-Z indexes that discuss the use of keyword in context, so I've posted some notes about some quick modifications my developer did for us to make our A-Z index work a little harder.
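
To illustrate the idea, here is a minimal Python sketch of how KWIC and KWOC entries could be generated from page titles. It is not the modification my developer made; the function names, the stop-word list, and the sample titles are all assumptions made up for the example.

```python
import re

# Hypothetical stop-word list; a real index would tune this to its content.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to"}


def kwic_entries(titles, stop_words=STOP_WORDS):
    """Build one entry per significant word in each title.

    Each entry is (sort key, left context, keyword, right context, full title),
    sorted by keyword and then by title.
    """
    entries = []
    for title in titles:
        words = title.split()
        for i, word in enumerate(words):
            key = re.sub(r"\W+", "", word).lower()
            if not key or key in stop_words:
                continue
            left = " ".join(words[:i])
            right = " ".join(words[i + 1:])
            entries.append((key, left, word, right, title))
    return sorted(entries, key=lambda e: (e[0], e[4]))


def format_kwic(entries, width=30):
    """KWIC: the keyword sits in a fixed column with its context on both sides."""
    return [
        f"{left[-width:]:>{width}}  {word}  {right[:width]}".rstrip()
        for _, left, word, right, _ in entries
    ]


def format_kwoc(entries):
    """KWOC: the keyword is pulled out as a heading, followed by the full title."""
    return [f"{key.upper()}: {title}" for key, _, _, _, title in entries]


# Made-up page titles, for illustration only.
titles = ["Employee Benefits Overview", "Benefits of Remote Work"]
entries = kwic_entries(titles)
print("\n".join(format_kwic(entries)))
print("\n".join(format_kwoc(entries)))
```

Every significant word becomes an entry point, so a page titled "Employee Benefits Overview" is findable under B, E, and O without anyone hand-writing alternative labels.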

A roadmap for enterprise weblog services

I recently presented a roadmap for providing enterprise information services related to weblogs (k-logs). This is in the realm of what I think Lou calls "Guerrilla IA" in his Enterprise Information Architecture talks. The presentation, given at Computers in Libraries, is aimed at Library/Information Services organizations in corporations, but is applicable elsewhere. It's really an untested discussion starter that proposes near term goals for supporting individuals doing bottom-up knowledge creation. It also discusses a mode of progress that aims at integration of many types of enterprise information in the long term. I'd be interested in getting feedback on these ideas, especially comments that point out weaknesses.

Managing the information glut

Dennis Berman's article in the Wall Street Journal, "Technology Has Us So Plugged Into Data, We Have Turned Off," talks about a phenomenon called "absent presence" or "surfer's voice". He refers to it as "...a habit of half-heartedly talking to someone on the telephone while simultaneously surfing the Web, reading e-mails, or trading instant messages." Because many of my meetings are conference calls, I frequently hear the person on the other end typing while I get the "uh huh" responses. Sometimes I have to direct specific questions to people, ones that require more than yes or no answers, in order to get their attention. Then I get, "I'm sorry, can you repeat the question?"

Related to computers, this article makes me think of two different problems. One is the ability to focus on singular tasks through to successful execution or completion. The other is how to get back to one of the many open tasks you have waiting for your attention. One of the ideas the article throws out is that of using software to help people regain their focus on singular tasks after going off on tangents -- responding to IM messages, etc. They suggest a simplistic solution in limiting extra information seeking sessions, e.g. web reading or news feed watching, to help make the information glut manageable. But it's hard to call all of that reading "extra" when some of it is business-related environmental scanning and simply beefing up your knowledge on topics of interest.

How can software help with this problem? One area of focus seems to be on using visualization to alter the desktop metaphor to some more meaningful UI that presents a stream of information. See Jef Raskin or David Gelernter on this topic. It's that idea of figuring out what you're working on that's interesting to me. I think of this problem in terms of how I keep track of "to do" items. With a list on paper of the prioritized tasks for the day, I can periodically check on how I'm meeting the day's goals. It's a high-level view of things I should be juggling with the goal of eventually finishing them one by one. In terms of a computer UI, I see Apple's Exposé as a step in the right direction towards helping users visualize what they're juggling at once. Apparently Microsoft may be considering ways for Longhorn to help users make sense of what they're juggling too.

With dozens of devices and applications beeping for your attention, is the only effective way to give business users better signal to noise to just tell them to tune out a little and reduce the number of things they try to watch? Or is there a far off concept for computer users that will make this watching of information and managing of individual processes more manageable -- a solution that is reasonable, usable, and won't be met with too much cultural resistance?

DiceLaRed: Visualization software

Phil Wolff pointed to DiceLaRed ("The Network Says"), a visualization application that helps users understand the flow of data across various sources. According to Phil, "DiceLaRed creatively blends news crawling + lexical analysis + data mining + data visualization + customization + alerting." He points to an example real-time graph on their home page that shows Spain's political parties by share of the current news cycle. In real time. Clicking on a wedge lets you dive into the news stream. More thoughts from Phil on how this tool might be used:

Apply this to your customers' weblogs, your industry magazines, and local newspapers for an environmental scan.

Apply this to job board postings. Understand labor market demand across the usual dimensions. Then stretch to discover new buzzwords and "terms of art". Can you say competitive analysis? How about strategic recruiting?

Apply this to medical discussion boards. Look for spikes in conversation about symptoms to detect outbreaks and public health problems. Look for swings in interest to retarget investment in health education and social programs.

We are much closer to a dashboard that helps us understand and respond, sooner and with more precision. Thank goodness.

Controlled Vocabularies: A Glosso-Thesaurus

The Fast, Leise, Steckel trio publish part four of their Boxes and Arrows series on Controlled Vocabularies. This latest installment is a glossary of terms used in controlled vocabularies. Appropriately enough, the glossary was created as a thesaurus.

Text Mining: Making Connections to Help People & Business

Great article in the NYTimes (free registration required) related to information retrieval, categorization/classification, and use.

"Digging for Nuggets of Wisdom"

Marti Hearst is quoted regarding information visualization, text mining, and such. Most of the focus was on retrieval in homogeneous content such as Medline. I liked the article because it provides an example of how people and businesses benefit from better IR tools in disciplines such as medicine.

Cataloguing Cultural Objects: Guide to Describing Cultural Works

The Visual Resources Association has recently published Cataloguing Cultural Objects (CCO) in the hope of developing guidelines or standards for describing and retrieving information about cultural works.

CCO provides guidelines for selecting, ordering, and formatting data used to populate catalog records. CCO is designed to promote good descriptive cataloging, shared documentation, and enhanced end-user access. Whether used locally to develop training manuals, or universally as a guide to building consistent cultural heritage documentation in a shared environment, CCO will contribute to improved documentation and enhanced access to cultural heritage information.

XML Presentation Syntax for OWL is released

From the World Wide Web Consortium home page:

The Web Ontology Working Group has released XML Presentation Syntax for the OWL Web Ontology Language (OWL) as a W3C Note. The Note suggests one possible XML presentation syntax and includes XML schemas for OWL Lite, OWL DL, and OWL Full.

• Read the XML Presentation Syntax note

• Find out more about Web ontologies

CIO Article on Auto/Semi-categorization software

The CIO article "Sleuthing out data" by Fred Hapgood features a couple of examples of how automatic and semi-automatic categorization helps businesses and reduces costs. There is a company list included if you're interested in this arena.


MetaMap is an interesting visualization of metadata initiatives.

With the exponential development of the World Wide Web, there are so many metadata initiatives, so many organisations involved, and so many new standards that it's hard to get our bearings in this new environment.

The problem is exacerbated by the fact that the names of most of these new standards are represented by acronyms. The MetaMap exists to help gather in one place information about these metadata initiatives, to try to show relationships among them, and to connect them with the various players involved in their creation and use.

The MetaMap takes the form of a subway map, using the metaphor of helping users navigate in "metaspace", the environment of metadata.

Thanks, Catalogablog (David Bigwood)

disinfojournal for February 2003

The second issue of disinformation is out. Especially interesting is "Don't trust your eyes - a laboratory study investigating consumer behavior on the net":

Responding pictures of secondhand goods or used vehicles, which are offered in the Internet e.g. with Ebay deceive frequently over the true quality of a commodity away. ...In our laboratory study which runs over a period of 3 months we logged the Internet purchase behavior of 859 persons with a customized XMosiac 10.5 browser. We can show in this study that during identical description of a product the preference was given to the article with a photo, in 87 percent of the cases. ... We can significantly show that a worse product with photo can be sold thus better than a better without photo.

This very clearly shows the power that information architects and web designers have to persuade visitors, which is what Andrew Chak and FutureNow (and I) have been saying for a while.

And, yes, as someone commented last time, disinfojournal is a bit strange, but that's what I think I like about it...

Vocabulary, taxonomy, thesaurus, ontology and meta-model

Woody Pidcock of the Boeing company gives an excellent overview of the differences between a vocabulary, a taxonomy, a thesaurus, an ontology, and a meta-model. He summarizes the differences as such:

    Bottom line: Taxonomies and Thesauri may relate terms in a controlled vocabulary via parent-child and associative relationships, but do not contain explicit grammar rules to constrain how to use controlled vocabulary terms to express (model) something meaningful within a domain of interest. A meta-model is an ontology used by modelers. People make commitments to use a specific controlled vocabulary or ontology for a domain of interest.

Thanks, Matt Webb.
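
As a rough illustration of the parent-child and associative relationships in that summary, here is a minimal Python sketch of a thesaurus-like structure. It is not taken from Pidcock's overview; the class names, method names, and sample terms are assumptions invented for the example. Note what it lacks: nothing here constrains how terms combine to model something in a domain, which is the gap an ontology fills.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """One controlled-vocabulary term and its links to other terms."""
    label: str
    broader: set = field(default_factory=set)   # parent terms
    narrower: set = field(default_factory=set)  # child terms
    related: set = field(default_factory=set)   # associative links


class Thesaurus:
    """Hypothetical container: a taxonomy would use only broader/narrower."""

    def __init__(self):
        self.terms = {}

    def add(self, label):
        # Return the existing term, or create it on first use.
        return self.terms.setdefault(label, Term(label))

    def add_parent_child(self, parent, child):
        self.add(parent).narrower.add(child)
        self.add(child).broader.add(parent)

    def add_related(self, a, b):
        # Associative relationships are symmetric.
        self.add(a).related.add(b)
        self.add(b).related.add(a)

    def ancestors(self, label):
        """Collect every broader term reachable from the given term."""
        seen = set()
        stack = list(self.terms[label].broader)
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.terms[current].broader)
        return seen


# Made-up terms, for illustration only.
thesaurus = Thesaurus()
thesaurus.add_parent_child("Vehicles", "Aircraft")
thesaurus.add_parent_child("Aircraft", "Jets")
thesaurus.add_related("Jets", "Airports")
print(sorted(thesaurus.ancestors("Jets")))
```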

Information Layers Model from Karl Fast

On SIGIA, Karl Fast proposed a rough five-layer model for information. The layers are content, metadata, semantic, representational, and interaction.

Librarians kick ass on the metadata and semantic layers. They suck on the representational and interaction layers.


disinformation, “the first international e-journal of disinformation on the net,” has launched, and the first issue is available online. From their home page:

There is obviously a huge lack of quality information on behavior, amount and usage regarding disinformation on the internet. As information has been increasingly invested with value, people have tried to manipulate, destroy, or acquire it in any way possible. Circumstances and instances cover a broad range of disinformation on the net or IP-based networks. The disinfojournal deals with topics in all areas of disinformation. This includes, but is not limited to library and information science, information technology, electronic publishing, database management, data mining, knowledge production, knowledge dissemination and of course malinformation and disinformation approached from sociological, psychological, philosophical, theoretical, technical, and applied perspectives.

The first issue includes "About 5 percent of your intranet information is malicious or wrong" and "The usage of forms and false data: a field study", among others.

Unfortunately, the only way to get the full text is via email (?); HTML and PDF abstracts are available online.

Reversible is a site that automatically links back to anyone who links to it. There are some implications for this on the Reversible about page.

It has elements of a blog, a directory, a wiki, and more. Definitely an interesting effort in bottom-up categorization, for one thing. And I'm not sure how I can link to a page that I'm interested in without also being included in that page... This is an issue, since pages act sort of like nodes in a hierarchy, and so linking to a page implies that my linking page is a member of that node.

That means that appropriate places to link would be,,, and

We'll see what kind of emergent patterns reveal themselves in a week or two.

Ideagraph - interesting project for semantic/RDF/topic map folks

Ideagraph is a "Personal Knowledge Manager" that is in early beta. It is intended to eventually be a commercial product, but is currently free to download.

Rashmi on recommender systems

Andrew pointed me to Rashmi's excellent discussion of findability and recommender systems on sigia-l.

It sure would be nice if the best of sigia-l was culled periodically. Scott Berkun does this from time to time. Maybe the signal to noise has gotten better on the list?
