Alphabet Soup: DOI

The DOI is not, in fact, an identifier of digital objects. It’s a digital identifier of objects – DOIs can be assigned to physical items. But they are most frequently assigned to digitally published journal articles.

The DOI is, like the ISBN, not a dumb number. There is a prefix and a suffix, separated by a slash. Most of the time, the prefix begins with the number “10”, followed by a period – this indicates that the identifier, while part of the Handle system, is specifically a DOI. After the period comes a number indicating who registered the DOI (similar to an ISBN publisher prefix). The suffix, following the slash, is the identifier of the article itself, and can be any alphanumeric string.
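That anatomy can be sketched in a few lines of Python. This is illustrative only – it splits a DOI string into its parts using `10.1000/182` (a commonly cited example DOI) and reads no meaning into the suffix, since suffixes are opaque strings chosen by the registrant:

```python
def split_doi(doi: str) -> dict:
    """Split a DOI into prefix, registrant code, and suffix.

    Sketch only: checks the basic shape, not whether the DOI
    actually resolves or is registered.
    """
    prefix, _, suffix = doi.partition("/")  # prefix never contains a slash
    if not prefix.startswith("10.") or not suffix:
        raise ValueError(f"not a well-formed DOI: {doi!r}")
    return {
        "prefix": prefix,          # e.g. '10.1000'
        "directory": "10",         # marks this Handle as a DOI
        "registrant": prefix[3:],  # who registered the DOI
        "suffix": suffix,          # the item identifier itself
    }

print(split_doi("10.1000/182"))
```

Note that only the first slash separates prefix from suffix – a suffix is free to contain slashes of its own.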

The main DOI registration agency is CrossRef, but there is also one for the entertainment industry called EIDR. Most book publishers who use DOIs register them with CrossRef, however – these are largely academic and scientific publishers who also publish journals.

More information about DOI can be found here.

How ONIX Came To Be

In 1997, superstore behemoth Barnes & Noble launched its website in competition with Amazon. (As a matter of disclosure: I began working there in 1998, directing the database that served both the web and the stores.) Borders, Hastings, Books-A-Million, and others soon followed. There were numerous start-ups dedicated to selling specific sorts of books – Varsity Books for textbooks, FatBrain for business books. Companies were acquired, rose up, shut down – it was absolutely chaotic.

In the midst of all this, the problems presented by back-office, transactional metadata (truncated titles, metadata in ALL CAPS) were abundantly clear to consumers – these websites were ugly, clunky, and not very enticing. Publishers noticed. Distributors noticed. Everyone saw an opportunity to increase sales.

ONIX stands for Online Information Exchange, and was developed as a joint effort by the Association of American Publishers (AAP) and EDItEUR (which originally stood for EDI-to-Europe, but which has evolved more broadly into a London-based standards body for the book industry). It was created to solve two problems: (1) consumers were now looking at this data, so it had to be more robust, descriptive, accurate, and reflective of what they needed to see, and (2) ANSI X12, as a US-based standard, was insufficient for international communications about books.

I was on the front lines back then – it was hotly competitive between Amazon, Borders, and Barnes & Noble. There were lawsuits, front-page news articles, insults, and shade thrown. It was nasty. But everybody could agree that the metadata was causing us all the same headaches. So there was a parley.

In 1998, in the conference room at AAP on 5th Avenue, the Big Seven publishers (yes, there were seven at the time), Barnes & Noble, Amazon, Ingram, Baker & Taylor, a handful of startups, and a number of other interested companies sat down at a large conference table and laid out the problems. It was the first time B&N and Amazon had allowed representatives to sit in the same room together – Cindy Cunningham and myself. (We later became good friends – largely due to this experience.) It was clear that we needed to present a unified front to persuade publishers to adopt this new standard that would benefit all of us. After two years of negotiations, ONIX 1.0 was published and its maintenance in the US was handed to the Book Industry Study Group, which created a metadata committee (now run by Richard Stark at B&N, who was one of my first hires there) to handle changes and fixes and additions to the code lists.

ONIX is not terribly sexy. But it allowed sexiness to happen. As we evolve through ONIX 3.0 and beyond, knowing how we got here will help us lay the infrastructure for whatever awaits the book industry as the Web itself advances.

Alphabet Soup: ISTC

The ISTC began development in the early 2000s as a way of collocating editions of textual works. It’s not technically a “Work ID” for books, though it was misperceived that way – in actuality, it’s an identifier of text strings. So (broadly speaking) the hardcover, paperback, and ebook editions of a book would all receive the same ISTC, but the French translation would not, because the text strings are different.

The ISTC is not necessarily assigned by a publisher, or a library, or a bookseller. It can be, but it doesn’t have to be. There is no “ownership” of the ISTC like there is with other identifiers. Anybody who wants to register a textual work – whether it’s an author, an agent, a publisher, or whoever – must submit a request to an ISTC registration agency with the metadata needed to distinguish that work from others. The registration agency determines whether or not that request qualifies for a new or an existing ISTC.

More information about ISTC can be found here.

Small Changes Afoot

IPG acquires InScribe Digital. Barnes & Noble acquires Adaptive Studios and is experimenting with POD for self-published titles. Ingram's been on an acquisitions binge since last December. Hachette acquired Perseus's publishing unit. And Mike Shatzkin has announced he is stepping down as program director of DBW.

Mike makes a good point – the dust raised by the drastic disruption brought on by digitization is settling. Pain points seem to have minimized. These days, the big news is in consolidation, iterative experimentation, and (dare I say it) infrastructure improvements. The acquisitions I mentioned are not earth-shattering but incremental – the Big Scary Days are safely in the rear-view mirror for the time being.

For those of us who thrive on disturbance, this can be a difficult time – casting about for The Next Big Thing (especially in summer, when the buzz of the tree locusts lulls us into either napping or impatience) is quite frustrating when all seems well in hand. Forcing disruption where none is naturally occurring is, of course, not terribly honest. One can always argue that publishing is complacent, that the book trade is blinkered, that if traditional publishers don't focus on something other than the next bestseller then they'll be blindsided by Pokémon Go or something like it.

And that's true.

But publishing continues. For all the disruption caused by ebooks, print sales are still strong. Audiobook sales continue their spectacular hockey-stick growth. Pikachu can exist side by side with adult coloring books and the Knopf frontlist.

So perhaps now is a time for contemplation. For digging ditches, and focusing on fortification. For refining processes and smoothing efficiencies. It's a luxury to be able to do so. I used to have a trainer who would hold me back from pushing myself during certain workouts. "It's okay if it's easy," he would say. "Enjoy it. Not everything has to be hard."

Alphabet Soup: ISAN

The ISAN was developed in the year 2000 and published by ISO in 2002. It identifies an audiovisual work – so it applies to films, TV shows, video games, etc. ISANs are used by film and television studios, services such as iTunes and HBO, and technology companies like Microsoft.

The first twelve hexadecimal digits of the ISAN form the “root” of the identifier. The root is assigned to the core work. The next four hex digits identify the episode or part (if there is one – if not, they are zeroes). Then comes a check character. The next eight hex digits identify the version of the work, and the final character is a second check character.
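A minimal sketch of that layout in Python, using a made-up ISAN string purely for illustration – note that this only verifies the shape of the identifier; it does not compute or validate the two check characters (which are defined by ISO 7064):

```python
import re

# Layout of a formatted ISAN:
#   12 hex digits (root) - 4 hex digits (episode/part)
#   - check char - 8 hex digits (version) - check char
ISAN_PATTERN = re.compile(
    r"^(?:ISAN )?"
    r"([0-9A-F]{4})-([0-9A-F]{4})-([0-9A-F]{4})"  # root
    r"-([0-9A-F]{4})"                             # episode or part
    r"-([0-9A-Z])"                                # first check character
    r"-([0-9A-F]{4})-([0-9A-F]{4})"               # version
    r"-([0-9A-Z])$"                               # second check character
)

def parse_isan(isan: str) -> dict:
    """Split a formatted ISAN into its structural parts (layout check only)."""
    m = ISAN_PATTERN.match(isan.upper())
    if not m:
        raise ValueError(f"not a well-formed ISAN: {isan!r}")
    g = m.groups()
    return {
        "root": "".join(g[0:3]),  # identifies the core work
        "episode": g[3],          # '0000' when there is no part
        "check1": g[4],
        "version": g[5] + g[6],   # identifies a specific version
        "check2": g[7],
    }

# Hypothetical ISAN; the check characters here are placeholders.
parts = parse_isan("ISAN 1881-66C7-3420-0000-7-9F3A-0245-U")
print(parts["root"], parts["episode"], parts["version"])
```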