So I have the AO3 freeform canonicals in a nice JSON file, and now I am looking at them. And I am boggling.

There are 14400 tags total in this category, but the tail of infrequently used tags is very long indeed. Only 434 canonical freeform tags have been used more than 500 times each. Bring the cutoff down to 100 uses and you get 1589 tags, which is still a small number. The 500-use list includes junk items like Rating: NC17. No, really, that’s a tag in the AO3 archive that has appeared more than 500 times despite the existence of a completely separate fic rating field.
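Counts like these come from filtering the tag list by a usage cutoff. Here is a minimal sketch, assuming the JSON file is a list of objects with `name` and `uses` fields. Those field names, the helper names, and the sample data are placeholders for illustration, not the real file layout or real numbers.

```python
import json


def load_tags(path):
    """Load tag records from a JSON file.

    Assumed layout (hypothetical): [{"name": "...", "uses": 123}, ...]
    """
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def tags_used_more_than(tags, cutoff):
    """Return the tags whose use count is strictly greater than cutoff."""
    return [t for t in tags if t["uses"] > cutoff]


# Made-up sample data, only to show the shape of the calculation.
sample = [
    {"name": "tag-a", "uses": 900},
    {"name": "tag-b", "uses": 500},
    {"name": "tag-c", "uses": 120},
    {"name": "tag-d", "uses": 3},
]

over_500 = tags_used_more_than(sample, 500)
over_100 = tags_used_more_than(sample, 100)
print(len(sample), len(over_500), len(over_100))
```

With the real file you would call `load_tags` on it and pass the result in place of `sample`.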

So, what lessons do we learn? One, if you want your tags to be a browsing tool, not a search tool (consider the difference!), do not let your users invent them. Particularly not your writers, who have different mindsets than your readers. Readers are far more numerous and they interact with a fic archive far more often than writers do. Optimize for them. Also, if your search has deficiencies, tags will be used to prop them up. Really awesome search takes some burden off your tagging system.

Two, fandom-namespaced tags. Episode names, concepts, and characters should not be at the top level; they belong under the fandom they come from.
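One way to picture namespacing is a tag record that carries an optional fandom, so only truly general tags sit at the top level. This is a rough sketch under my own assumptions: the `Tag` class, its fields, and the `fandom/name` display format are hypothetical, not how AO3 stores anything.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tag:
    name: str
    fandom: Optional[str] = None  # None means a top-level, fandom-independent tag

    def qualified(self):
        """Display name, with the fandom as a prefix when there is one."""
        if self.fandom is None:
            return self.name
        return f"{self.fandom}/{self.name}"


def top_level(tags):
    """Return only the tags that are not scoped to a fandom."""
    return [t for t in tags if t.fandom is None]


# Placeholder examples, not real archive tags.
tags = [
    Tag("Hurt/Comfort"),
    Tag("Pilot Episode", fandom="Example Fandom"),
    Tag("Some Character", fandom="Example Fandom"),
]

print([t.qualified() for t in tags])
print([t.name for t in top_level(tags)])
```

Because the fandom is part of the record, two fandoms can each have their own "Pilot Episode" without colliding.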

Three, argh, use punctuation consistently.
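Inconsistent punctuation can at least be detected mechanically. The sketch below groups tags that are identical once case, spacing, and punctuation are ignored. The function names and the sample tags are made up for illustration, and the grouping rule is deliberately crude, so treat its output as candidates for a human to review rather than a list of true duplicates.

```python
import re
from collections import defaultdict


def punctuation_key(tag):
    """Reduce a tag to lowercase letters and digits only."""
    return re.sub(r"[\W_]+", "", tag.casefold())


def find_variants(tags):
    """Group tags that differ only in case, spacing, or punctuation."""
    groups = defaultdict(set)
    for tag in tags:
        groups[punctuation_key(tag)].add(tag)
    # Keep only the keys that have more than one spelling.
    return {key: sorted(names) for key, names in groups.items() if len(names) > 1}


# Made-up sample tags.
sample_tags = ["Rating: NC17", "Rating - NC-17", "rating nc17", "Fluff"]
variants = find_variants(sample_tags)
print(variants)
```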

Four, I cannot examine just the high-frequency tag list for inspiration. There are solid things that should be in any taxonomy in the less-frequently-used list, and junk tags in the popular list. Woes.

And in conclusion: What’s your favorite thing about space? Mine is space.

