The Montague Institute offers ten myths that need to be dispelled before embarking on a taxonomy project. They've got a *really* broad definition of taxonomy (think "classification system") but the myths are still useful to deflate before your client or boss goes taxonomy-happy.
Over at Boxes and Arrows, Karl Fast, Fred Leise, and Mike Steckel deliver a great "how-to" tutorial on creating controlled vocabularies. It's one thing to talk about how great CVs are; it's even better to show how to build them.
Donna Maurer shares her technique for evaluating classification schemes over at Boxes and Arrows. Ten minutes from twenty users means that it's pragmatic, and it addresses classification specifically, instead of being part of a prototype with other issues to evaluate. Here's what you need to do this kind of evaluation:
Looks great - thanks Donna!
David Riecks pointed me to his site on controlled vocabularies. David discusses the benefits of using CVs and offers a lot of examples of heavily-used controlled vocabularies and thesauri. Since David is a photographer, he also has a special interest in image indexing and devotes a special section to image databases and CVs.
Karl Fast, Fred Leise and Mike Steckel in Boxes and Arrows.
Karl Fast, Fred Leise and Mike Steckel have started a series of articles on Boxes and Arrows to make faceted classification and controlled vocabularies accessible to practicing IAs without LIS backgrounds. Look forward to it.
The Montague Institute gives us 10 taxonomy myths to dispel, so you can get past the hype and correctly grok how taxonomies will really work for you.
I am in a discussion with a programmer about ways to offer navigation using a poly-hierarchical arrangement of nodes. He brought up the concept of directed acyclic graphs (DAGs), which comes from mathematics. I learned from the Free Online Dictionary of Computing that a DAG is a directed graph that contains no cycles, i.e. if there is a route from node A to node B, then there is no route that loops back from B to A. I can see some applications benefiting from this structure, such as forward citation searching. I think I may not understand the concept entirely, but I am guessing that in an information environment, this means that you'd lose context the deeper you find yourself in a directed path. Or perhaps it simply means you navigate forward to point A from point B, and it has nothing to do with providing backward movement.
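To make the DAG idea a little more concrete, here is a minimal Python sketch. It assumes a made-up taxonomy stored as a mapping from each node to its children; the node names, `has_cycle`, and `parents_of` are all hypothetical and not anything from our actual system. It only shows that "acyclic" is a constraint on the child links, and that a node can still have more than one parent.

```python
from typing import Dict, List

# Hypothetical poly-hierarchy: each node maps to its child nodes.
# "Wireless pricing" sits under two parents, which a DAG allows.
CHILDREN: Dict[str, List[str]] = {
    "Topics": ["Telecom", "Pricing"],
    "Telecom": ["Wireless"],
    "Pricing": ["Wireless pricing"],
    "Wireless": ["Wireless pricing"],
    "Wireless pricing": [],
}


def has_cycle(children: Dict[str, List[str]]) -> bool:
    """Return True if following child links can lead back to a node."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    state = {node: WHITE for node in children}

    def visit(node: str) -> bool:
        state[node] = GREY
        for child in children.get(node, []):
            if state.get(child, WHITE) == GREY:
                return True  # child is already on the current path: a loop
            if state.get(child, WHITE) == WHITE and visit(child):
                return True
        state[node] = BLACK
        return False

    return any(state[node] == WHITE and visit(node) for node in children)


def parents_of(children: Dict[str, List[str]], node: str) -> List[str]:
    """Look up every parent of a node by scanning the child links."""
    return sorted(p for p, kids in children.items() if node in kids)


if __name__ == "__main__":
    print(has_cycle(CHILDREN))
    print(parents_of(CHILDREN, "Wireless pricing"))
```

If this sketch is right, the "no loop back" rule says nothing about the interface: a page can still offer a way up to any of its parents, because the parent lookup reads the same links in the other direction.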
The problem we're experiencing is that we have been dealing with a legacy of organizing by collections/products/services, which is reinforced in our site navigation. Oddly, we don't have problems post-coordinately displaying term combinations in database search results. Rather, in search results we display other terms from the subject taxonomy to narrow results by subject. The problem we have is with the legacy of hierarchical arrangements of access points organized by: collections, services, topics (this uses slices of the subject taxonomy). It's a very library-centric view that we've been trying to change, and if you ever worked in a library (corporate, private, special or public) you might know how difficult it is to create this type of change.
I've pointed out that the concept of surfacing more facets of index terms would be helpful for browsing. Jim Anderson at Rutgers helped me to buy into this idea while I was in library school, and before I knew much about the web, I advocated this idea in an image index I proposed in 1997. That naive and over-ambitious Filemaker Pro screen shows how I envisioned it. It's funny. Today, I'm wondering how we can support the display of polyhierarchical classifications such as our subject taxonomy and other database fields. We have some ideas floating around, but I feel like a toddler trying to topple an elephant.
Some follow-up. We're kicking around the idea of a) showing multiple breadcrumbs, and b) showing local navigation for one of the hierarchies where the node exists. With the local navigation, we're going to check where the user came from in order to determine which tree to show. If they came from a bookmark or an email (most of our pages are also linked to from email alerts) we will show nothing, unless the node has only one parent, in which case we will show that tree. This is the theory. We need to test, but we're interested in opinions. Have you done something like this in a better way?
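The rule above can be sketched roughly in Python. Everything here is an assumption made for illustration: the `PARENTS` data, the function names, and the simplification that "where the user came from" is either the name of an immediate parent node or `None` for a bookmark or email link.

```python
from typing import Dict, List, Optional

# Hypothetical data: each node lists its parent nodes.
PARENTS: Dict[str, List[str]] = {
    "Wireless pricing": ["Pricing", "Wireless"],
    "Wireless": ["Telecom"],
    "Pricing": ["Topics"],
    "Telecom": ["Topics"],
    "Topics": [],
}


def breadcrumbs(node: str, parents: Dict[str, List[str]]) -> List[List[str]]:
    """Return every root-to-node path, one breadcrumb trail per path."""
    ups = parents.get(node, [])
    if not ups:
        return [[node]]  # a root node is its own trail
    return [trail + [node] for up in ups for trail in breadcrumbs(up, parents)]


def local_nav_parent(
    node: str, referrer: Optional[str], parents: Dict[str, List[str]]
) -> Optional[str]:
    """Pick the parent whose tree to show as local navigation, if any."""
    ups = parents.get(node, [])
    if referrer in ups:
        return referrer  # the user arrived through this tree
    if len(ups) == 1:
        return ups[0]  # only one possible tree, so show it
    return None  # bookmark or email link with several parents: show nothing


if __name__ == "__main__":
    for trail in breadcrumbs("Wireless pricing", PARENTS):
        print(" > ".join(trail))
    print(local_nav_parent("Wireless pricing", "Wireless", PARENTS))
    print(local_nav_parent("Wireless pricing", None, PARENTS))
```

A real implementation would probably need to handle users arriving from a deeper ancestor or from search results, which this sketch ignores.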
One Source has announced that they've started selling their Global Business Taxonomy, a business information classification system. I've used the One Source Business Browser in the past and have been impressed with how they index company information and present company profiles. If you've ever compared Factiva's (née Dow Jones Interactive) company profiles to One Source you'll know what I mean.
Mike Lee points to and discusses the Delphi white paper, "Taxonomy & Content Classification" (1.3 MB PDF), which is apparently licensed to every vendor mentioned in the paper -- my office mate Dave (the taxonomy guy) has seen three differently branded versions of the paper. It's apparently a good summary of why you should employ a taxonomy in your CMS. Mike says it "sheds some light on the misconceptions on the definition of a taxonomy, describes the benefits of systematic content classification, and surveys the currently available technology tools". They apparently also give some kind of seminar, "Proving Ground for Taxonomy & Information Architecture", but when I looked at the
I blogged the newish B&N book browser earlier today. Can't remember what I said about it. Mainly that it reminds me of Flamenco and FacetMap, I think. Perhaps I said something about facet classification being surfaced on the UIs of big ecommerce sites or some stuff.
I just looked at Barnes & Noble's Book Browser feature, which offers a way to browse books by subject and type of literature. The browser start page shows headings categorized under the different major sections you might find in the book store -- Fiction, Non-Fiction, Business. Each major section has subsections that closely match what I've seen in B&N stores.
I've read in a few places that people don't think that there have been good implementations employing the concepts of Ranganathan. I don't agree with that. This is an example of how the business world is employing the concept of categories for browsing and refining. Are these facets? In a broad sense of the word, yes. Like the Flamenco interface, the Book Browser allows you to see terms surfaced from several facets and then iteratively select terms or drill down until a string is formed that describes the information you find.
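Here is a rough Python sketch of that select-and-refine loop. The catalog, the facet names, and the functions are invented for illustration; this is not how the Book Browser or Flamenco is actually built, just one simple way the idea could work.

```python
from collections import Counter
from typing import Dict, List

# Invented catalog: every record carries a value for each facet.
BOOKS: List[Dict[str, str]] = [
    {"title": "Book A", "section": "Fiction", "subject": "Mystery", "format": "Paperback"},
    {"title": "Book B", "section": "Fiction", "subject": "Mystery", "format": "Hardcover"},
    {"title": "Book C", "section": "Fiction", "subject": "Romance", "format": "Paperback"},
    {"title": "Book D", "section": "Business", "subject": "Management", "format": "Hardcover"},
]


def refine(items: List[Dict[str, str]], selections: Dict[str, str]) -> List[Dict[str, str]]:
    """Keep only the items that match every facet value selected so far."""
    return [
        item
        for item in items
        if all(item.get(facet) == value for facet, value in selections.items())
    ]


def facet_counts(items: List[Dict[str, str]], facet: str) -> Counter:
    """Count the values of one facet among the remaining items."""
    return Counter(item[facet] for item in items if facet in item)


if __name__ == "__main__":
    selections = {"section": "Fiction"}      # first click
    remaining = refine(BOOKS, selections)
    print(facet_counts(remaining, "subject"))  # terms offered for the next click
    selections["subject"] = "Mystery"        # second click
    print([book["title"] for book in refine(BOOKS, selections)])
```

Each click adds one term to `selections`, so the string of chosen terms ends up describing the items left on screen.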
Peterme, musing on how we'll make sense of information offered in context-aware mobile devices, discusses facet-based description as a solution. I logged some thoughts of my own on his site because he makes sense to me.
Victor jots down some thoughts about creating controlled vocabularies within the context of the design of a project he's working on. He discusses some real considerations and dependencies related to the development of a controlled vocabulary and implications for systems design. Here are some of my own thoughts/reactions, based on experience.
I've watched the controlled vocabularies of subject headings and company information grow within my organization (a corporate library services org.) over the last four years. The approach we've taken is sort of like a web services model or much like a vendor service, such as those where data aggregators provide indexed content with their own proprietary controlled vocabulary (e.g. Factiva). This seems to me to be a good model because it centralizes semantic tagging and creation of indexing terms in one place, while allowing enterprise use at different levels of granularity. When following this model, you're still confronted with the issues of knowledge representation when developing your terminology, but the system considerations are separated. The design of IR systems using indexes benefits from documenting scope, domain, documentary units, indexable matter, etc. prior to implementation. I have this great unpublished text by Jim Anderson that serves as a framework for such documentation.
Here's a short description of our approach, which has been top-down and bottom-up. Our people created our CVs starting with close relationships with business units to develop a set of subject headings and a company authority list. They iterated through these lists using the top-down approach, informing the list with their subject area expertise. Then they take the bottom-up approach and add/modify terms that reflect subject headings identified while doing the daily work of indexing (knowledge representation). For my org., this is a daily process since a team of indexers sifts through machine-filtered data and applies more granular indexing or alters machine-applied terms. As the telecom landscape changes or as our indexing needs require, terms are added to the vocabs. We have one person who manages/develops them, and a few additional subject area experts who work on development of new terms in new subject areas. User feedback informs changes along the way. The controlled vocabularies are offered up for use by disparate systems within our company to represent that corpus of indexed data, or slices of it, as desired.
As an IA, I generally work with our taxonomy specialists to create page inventories -- sort of like microscopic content inventories on steroids -- that specify combinations of index terms used to build content modules. As an example, I show a small piece of one of these inventories on my old and dated portfolio. This use of the term content inventory is not typical in our field, I know. What this really is, is a design document showing such things as rubrics of content modules with their associated labels, and database searches that use terms from a controlled vocabulary. Maybe I should present something on this process some day. It's really a hybrid IA and technical document, but it's a format my entire team uses on all data-dense sections of our site.
Incidentally, the taxonomy guys I'm talking about are presenting on this topic at an ARK seminar in NYC in November in case you're interested. They're really smart. Hopefully they will get to network a bit at this thing, because everyone in our group could get pink slips if the cost-cutting winds decide to blow in our direction.
My office mates, Dave Goessling and Raphael Lasar, are giving the presentation "Creating and implementing an effective taxonomy" at ARK Group's taxonomy seminar at Le Parker Meridien in New York, NY on 18-20 November 2002. A PDF with the rest of the program for the "Practical Taxonomies" seminar is available from ARK's conferences page. Other speakers include Amy Warner and knowledge managers from various financial institutions, government agencies, and other large corporations.
Is it me, or does anyone else find it interesting that everyone's so interested in Ranganathan lately? Seen in the news aggregator in the last few weeks:
I came across Amy Warner's article "A Taxonomy Primer" on her consulting site. Should be a helpful primer for people being introduced to the concepts associated with using thesauri.
Catalogablog is David Bigwood's weblog. I presume he's a cataloger since he's talking about MARC fields. He also discusses metadata more generally for you non-LIS types.
XFMLManager is a free authoring tool for hierarchical, faceted metadata. It is not yet available. We will also host the upcoming Hierarchical Faceted Metadata Authoring Experiment.