Metadata masterclass: codes and keywords
Accurate subject codes and focused keywords are vital to a book’s discoverability. EDItEUR’s Graham Bell shares some advice for getting them right
Subject codes are a key part of the metadata that publishers should provide to their supply chain partners. Ultimately, this is what guides retailers’ merchandising, what drives consumer discovery via ‘browse by subject’, and what allows industry sales stats to be classified by book type and genre.

What codes should I use?

In the UK it is typical to provide BIC subject codes. These have been in use since the mid-1990s, with the most recent update in 2010. The scheme provides around 2,700 subject categories to choose from, some rather broad and general, and some highly specific. They are arranged hierarchically, and each has a short alphanumeric code: a typical crime novel might be classified as FF (Crime and mystery), and an industrial study of aircraft making as KNDV (Aviation manufacturing industry).
The full list of BIC subject codes is here, and an interactive category selection tool is here. For sales into the US, and for some global retailers based in the US, it is also important to supply BISAC codes. This is a separate scheme, more detailed in some respects but skewed towards American subjects. Information about the range of BISAC codes is here.

How many codes should a book have?

With both BIC and BISAC, it’s a common misconception that there must always be three. Or five. Or one. In truth, there is no magic number. If in classifying your book you find that the first subject code you assign gets it spot on, go no further. If you find your book covers ground that might require two codes, use two codes. As an example, a book about the architecture of railway stations might want two BISAC codes: ARC011000 (architecture of public, commercial and industrial buildings) and TRA004010 (transportation / railroads / history). On the other hand, if you find yourself assigning five codes to most of your books, that’s probably too many.
Another common misconception is that books need both a detailed code and a more general code. For example, a book about differential calculus is not about mathematics as a whole but about one very small part of it, so it should be assigned the BIC code PBKJ (Differential calculus and equations), but not PB (Mathematics). More general codes should be used only on books whose subject matter is also highly general. Adding extra, largely irrelevant codes doesn’t gain anything worthwhile. Don’t add them just to meet some mythical target.
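For anyone checking codes in bulk, the 'specific code, not its general parent' rule can be sketched in a few lines of Python. This is only an illustration, not an official BIC tool: the function name is made up for the example, and it assumes that a parent category's code is always a prefix of its children's codes, as in the PB and PBKJ example above.

```python
def drop_redundant_general_codes(codes):
    """Drop any code that is a strict prefix of another code on the same book.

    Assumption: a general category's code is a prefix of the codes beneath it
    in the hierarchy (PB is a prefix of PBKJ).
    """
    unique = list(dict.fromkeys(codes))  # remove exact duplicates, keep order
    return [
        code
        for code in unique
        if not any(other != code and other.startswith(code) for other in unique)
    ]


print(drop_redundant_general_codes(["PB", "PBKJ"]))  # ['PBKJ']
print(drop_redundant_general_codes(["FF", "KNDV"]))  # ['FF', 'KNDV']
```

A book that carries only a general code, such as PB on its own, is left untouched, since a general code is the right choice when the subject matter really is general.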

What's coming next?

Having said all this, IPG members should now be looking beyond BIC and BISAC. The BIC scheme has not been updated since 2010, and will not be developed further. The future for the UK book trade is a new(ish) scheme called Thema. On the surface Thema looks a lot like BIC: it works in the same way, and some of the subject codes are the same.
But Thema is a more sophisticated scheme, capable of more nuanced subject description while remaining fairly simple to use. It is also international and multilingual. The BIC scheme is frozen at version 2.1, and the best way to think about Thema in the UK is as 'BIC version 3'. Read more about Thema here, and find an interactive code picker here.
Thema is being adopted in an increasing number of countries across Europe, and at a London Book Fair event a few weeks ago, Amazon announced that the internal subject browse tree in its European stores is now Thema-based, so it will want Thema subject codes in the metadata it receives from publishers. Nielsen, BDS and Bowker can all receive Thema subject codes too, and they are already built into many of the software packages publishers use for managing their product metadata. Some of the largest UK-based publishers are already supplying Thema subject codes alongside legacy BIC and BISAC data; this isn’t as hard as it sounds, as mapping procedures are available to deal with converting backlist and mapping back to the older scheme for frontlist.

What about keywords?

Keywords are just as important. While subject codes drive ‘browse by subject’, keywords enhance the likelihood that a book will be found via organic search. They’re complementary. Not every consumer knows the book they want by title or author, and keywords provide another way to find it.
Many retailers, including Amazon, use the keywords that publishers provide, both for search purposes and to supplement structured subject category information. Other retailers ‘mine’ other metadata fields—the long description or table of contents, for example—and use words from the descriptive text as keywords.
Keywords should supplement, but not replace or repeat, data that is already in the more structured metadata fields. Try to think like a consumer. What will he or she search for? More importantly, what will he or she be searching for when your book provides the answers? There’s no point in adding ‘cute cats’ as a keyword when your book is about architecture; keywords featuring the names of architects or important buildings would be more appropriate. Your book might be ‘discovered’ less often, but those who do find it will actually be looking for books about architecture, and are thus more likely to buy.
In fiction, keywords should include the names of characters, locations and maybe even key plot points and narrative themes. In non-fiction, highly specific keywords work better in practice than overly generic terms: ‘Chartism’ or ‘communitarianism’ rather than ‘politics’, for instance. Do supply synonyms, but don’t bother to provide plurals or multiple forms of verbs. Feel free to supply lots of keywords. Up to 20 words or phrases is fine, though there are clearly diminishing returns as you supply more. And don’t ‘waste’ keywords by using words that are already in the title or author fields. If a purchaser searches for the author’s name, they will find the book based on the data in the author field, not because you have included the author’s name in keywords.
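Some of these rules of thumb are mechanical enough to check automatically before metadata goes out. The Python sketch below is a rough illustration only: the function names, the sample title and the sample author are all invented for the example, and it covers just three of the points above (no repeats, nothing that merely restates the title or author, and a cap of 20). It cannot judge whether a keyword is relevant or specific enough; that still needs a human thinking like a consumer.

```python
import re


def words(text):
    """Split text into lower-case words, ignoring punctuation."""
    return re.findall(r"\w+", text.lower())


def tidy_keywords(keywords, title, author, limit=20):
    """Return keywords minus repeats and minus those already covered
    by the title and author fields, capped at `limit` entries."""
    already_covered = set(words(title)) | set(words(author))
    seen = set()
    kept = []
    for keyword in keywords:
        parts = words(keyword)
        normalised = " ".join(parts)
        if not parts or normalised in seen:
            continue  # empty entry or case-insensitive repeat
        if all(part in already_covered for part in parts):
            continue  # every word is already in the title or author field
        seen.add(normalised)
        kept.append(keyword.strip())
    return kept[:limit]


# Hypothetical book, used only to show the behaviour.
print(tidy_keywords(
    ["St Pancras", "railway stations", "George Gilbert Scott",
     "st pancras", "Jane Smith", "Victorian architecture"],
    title="Railway Stations of Britain",
    author="Jane Smith",
))
# ['St Pancras', 'George Gilbert Scott', 'Victorian architecture']
```

A keyword that only partly overlaps the title, such as a phrase combining a title word with a new term, is deliberately kept, because the phrase as a whole may still match a search the title alone would not.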
BIC has great advice about choosing keywords here, and BISG has a paper that applies to the UK as well as the US here.
Graham Bell is executive director of EDItEUR

