Friday, January 07, 2005

Pro metadata will lose to folksonomy

Not only does Shirky nail it, but Cory homes in on the money grafs for us. This is clearly one of a class of problems where scaling issues overwhelm other factors and force solutions to be somehow distributed.

This is much like the situation in the early days of long-distance telephone service, when operators were needed to complete every call. Analyses at the time predicted that the service would fail because you were clearly going to need to hire so many operators that the system would collapse. The solution, in that case, was to effectively make everyone an operator by inventing direct-dial long distance and area codes. Of course, we've now reached the point where area codes are an anachronism and have little predictive value about where the phone in question exists in the physical universe.

Shirky: Pro metadata will lose to folksonomy. Cory Doctorow: Clay Shirky continues to just totally nail the questions of metadata, authority, and user-created content. Today's installment: why crappy, cheap, user-generated, uncontrolled metadata will win out over expensive, controlled, useful, professionally generated metadata:
Furthermore, users pollute controlled vocabularies, either because they misapply the words, or stretch them to uses the designers never imagined, or because the designers say "Oh, let's throw in an 'Other' category, as a fail-safe" which then balloons so far out of control that most of what gets filed gets filed in the junk drawer. Usenet blew up in exactly this fashion, where the 7 top-level controlled categories were extended to include an 8th, the 'alt.' hierarchy, which exploded and came to dwarf the entire, sanctioned corpus of groups.

The cost of finding your way through 60K photos tagged 'summer', when you can use other latent characteristics like 'who posted it?' and 'when did they post it?', is nothing compared to the cost of trying to design a controlled vocabulary and then force users to apply it evenly and universally.

This is something the 'well-designed metadata' crowd has never understood -- just because it's better to have well-designed metadata along one axis does not mean that it is better along all axes, and the axis of cost, in particular, will trump any other advantage as it grows larger. And the cost of tagging large systems rigorously is crippling, so fantasies of using controlled metadata in environments like Flickr are really fantasies of users suddenly deciding to become disciples of information architecture.



If you want to trace back to some of the items that launched this most recent discussion, here are some of the key links:

11:27:37 PM
50 book challenge results for 2004

I did manage to read just over 50 books during 2004. If I were feeling exceptionally compulsive, I could go back and complete reviews of all of them, but the value for that isn't clear.
I expect to manage a comparable reading load in 2005. In the interests of making this blog a better backup brain for myself, I plan on keeping better running notes on the nonfiction titles as I read. Whether I bother to write up details for my fiction reading remains to be seen.
One thing I plan to do over the next week or so is to identify the top five or so books that I found most valuable in 2004. For the record, these are the books I finished in 2004.
  1. Heinlein's For Us the Living - 50 Book Challenge
  2. Christensen's Innovator's Solution - 50 Book Challenge
  3. David Allen's Ready for Anything - 50 Book Challenge
  4. David Gerrold's The Man Who Folded Himself - 50 Book Challenge
  5. Dan Brown's Deception Point - 50 Book Challenge
  6. Dan Brown's Digital Fortress - 50 Book Challenge
  7. Richard Morgan's Altered Carbon - 50 Book Challenge
  8. John Brunner's Shockwave Rider - 50 Book Challenge
  9. Robert Wilson's Chronoliths - 50 Book Challenge
  10. Bruce Sterling's Zenith Angle - 50 Book Challenge
  11. John McPhee's Curve of Binding Energy - 50 Book Challenge
  12. Lawrence Lessig's Free Culture - 50 Book Challenge
  13. John Brunner's The Sheep Look Up - 50 Book Challenge
  14. Greg Iles's The Footprints of God - 50 Book Challenge
  15. Steven Johnson's Mind Wide Open - 50 Book Challenge
  16. Anderson, Poul - For Love and Glory
  17. Charles Stross's Iron Sunrise - 50 Book Challenge
  18. Brian Arkills's LDAP Directories Explained - 50 Book Challenge
  19. Eric Meyer's Cascading Style Sheets: The Definitive Guide, 2nd Ed - 50 Book Challenge
  20. Eric Meyer on CSS - 50 Book Challenge
  21. Gregory Dicum's Window Seat - 50 Book Challenge
  22. Todd Carter's Microsoft OneNote 2003 for Windows - 50 Book Challenge
  23. Dvorak and Pirillo's Online! The Book - 50 Book Challenge
  24. Charles Stross's Singularity Sky - 50 Book Challenge
  25. Elizabeth Moon's Trading in Danger - 50 Book Challenge
  26. Austin, Robert - Artful Making: What Managers Need to Know About How Artists Work
  27. Cadenhead, Rogers - Radio UserLand Kick Start
  28. Caldwell, Ian - The Rule of Four
  29. Carr, Nicholas G. - Does IT Matter? Information Technology and the Corrosion of Competitive Advantage
  30. Hammond, Grant T. - The Mind of War: John Boyd and American Security
  31. Kelly, Kevin - Cool Tools
  32. Lakoff, George - Don't Think of an Elephant: Know Your Values and Frame the Debate--The Essential Guide for Progressives
  33. Modesitt, L. E. - Archform: Beauty
  34. Moriarty, Chris - Spin State
  35. Ringo, John - Emerald Sea
  36. Ringo, John - There Will Be Dragons
  37. Ringo, John - Cally's War
  38. Stross, Charles - The Atrocity Archives
  39. Tharp, Twyla - The Creative Habit: Learn It and Use It for Life
  40. Wheaton, Wil - Just a Geek
  41. Wurman, Richard Saul - Information Anxiety 2
  42. Yamashita, Keith - Unstuck: A Tool for Yourself, Your Team, and Your World
  43. Zackheim, Sarah Parsons - Getting Your Book Published for Dummies
  44. Bok, Derek Curtis - Universities in the Marketplace: The Commercialization of Higher Education
  45. Camp, Jim - Start with NO...The Negotiating Tools that the Pros Don't Want You to Know
  46. Graham, Paul - Hackers and Painters: Big Ideas from the Computer Age
  47. Weber, David - The Shadow of Saganami (The Saganami Island)
  48. Brand, Stewart - How Buildings Learn: What Happens After They're Built
  49. Kawasaki, Guy - The Art Of The Start: The Time-Tested, Battle-Hardened Guide For Anyone Starting Anything
  50. Cussler, Clive - Black Wind
  51. Modesitt, L. E. - Flash
  52. Lamott, Anne - Bird by Bird: Some Instructions on Writing and Life
  53. Stross, Charles - Toast
10:49:38 PM