Ever since I was part of a failed taxonomy project in the ’90s, I’ve had my eye on ways to classify media types online. I swooned for user-generated tags and artful renditions of tag clouds. Enlisting users to categorize content seems like such a simple solution, a way of tapping into a large group of contributors. It’s also an entertaining window into what’s buzzing in the beehive mind. For example, the New York Times’ tag cloud shows us that searches for Imus are as big as searches for sex! Nothing else comes close. Do I care?
What use is that cloud to someone searching for things not in the cloud? It’s poorly designed and uninformative, and there are no Boolean operators or faceted classification to cut through the chaff.
When I tag my photos in Flickr, I wonder whether I’ve tagged them in a way that helps other people’s searches. There’s no guidance on the subject from the software, so it’s up to my vocabulary and my intuition about which tags are best. That leads to an overabundance of tags. If I upload a picture of my car, I’ll tag it car. To cover more bases, I’ll also tag it auto, automobile, Mercedes, Mercedes Benz, MBZ, E320, sedan, four door, 4 door, etc. There’s got to be a better way to tag a photo of my ride. I want a plugin that guides my tag choices, perhaps suggesting frequently used tags that describe my post.
Optimized tagging is a tough nut to crack. Just because a tag is used infrequently doesn’t mean that it is not significant. And just because a tag is frequently searched for doesn’t mean that it is relevant to what you’re searching for. The New York Times tag cloud is a case in point. It’s a reflection of what’s on the mind of the readership and not a measure of all the news that’s fit to print. The fact that sex is the most frequently searched term means nothing to me when I’m trying to find the Chinese sesame noodle recipe from last month’s Sunday magazine section.
The issue of folksonomy is critical when considering the Enterprise 2.0 flavor of web development. Time is money in the work world, and free-range tagging runs the risk of making search even less accurate. Serendipity is welcome when browsing photos on your own time, but can be a waste of time and money at work. The question becomes: how can top-down taxonomies and bottom-up tags work together to optimize search?
I don’t have an answer yet, but there are clues. Social Media Now: Tag You’re It, a recent post on the Social Media Club blog by Jason Chervokas, takes up the topic and points to some interesting research at Northwestern University. TagAssist is a proposed application that analyzes a collection of tags and makes suggestions to users as they tag their content. The software looks at tag frequency and eliminates near duplicates and synonyms, leaving those tags that are most useful. It’s a brilliant idea, but it only addresses user-generated tags, and TagAssist’s ability to optimize search remains unproven. At least it’s a start.
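To make the idea concrete, here is a rough Python sketch of what frequency-based tag suggestion with duplicate and synonym cleanup might look like. This is not TagAssist's actual algorithm. The synonym map, the function names `normalize` and `suggest_tags`, and the sample tags are all made up for illustration, and a real tool would need a far better source of synonyms than a hand-written dictionary.

```python
from collections import Counter

# Hypothetical synonym map: each variant points to one canonical tag.
SYNONYMS = {
    "auto": "car",
    "automobile": "car",
    "mbz": "mercedes",
    "mercedesbenz": "mercedes",
    "4door": "fourdoor",
}


def normalize(tag):
    """Collapse near duplicates and synonyms into one canonical key."""
    # Lowercase and drop everything that is not a letter or digit,
    # so "Mercedes Benz" and "mercedes-benz" become the same key.
    key = "".join(ch for ch in tag.lower() if ch.isalnum())
    return SYNONYMS.get(key, key)


def suggest_tags(candidate_tags, corpus_tags, limit=3):
    """Rank my candidate tags by how often they appear in a wider tag pool."""
    frequency = Counter(normalize(t) for t in corpus_tags)
    unique = []
    for tag in candidate_tags:
        key = normalize(tag)
        if key not in unique:
            unique.append(key)
    # Most frequent first; ties keep the order in which I typed them.
    ranked = sorted(unique, key=lambda k: -frequency[k])
    return ranked[:limit]


# Assumed sample data: tags other people have used, and my own pile of car tags.
corpus = ["car", "Car", "auto", "sedan", "Mercedes", "mercedes benz", "car", "4 door"]
mine = ["car", "auto", "automobile", "Mercedes", "Mercedes Benz", "MBZ",
        "E320", "sedan", "four door", "4 door"]
print(suggest_tags(mine, corpus))
```

With this sample data, my ten car tags shrink to three suggestions. Note that the sketch has the same weakness I described earlier: a rare but meaningful tag like E320 sinks to the bottom simply because nobody else has used it.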