About Numerical Gurus
Numerical Gurus LLC is a consulting company providing services to the advertising, media, entertainment, library, and book industries. Principal Laura Dawson (MLIS) has consulted for or worked in numerous organizations in these verticals, primarily focusing on solving problems related to data governance, taxonomies, ontologies, and managing the people who work in these areas. She blogs regularly at New Virago. She can be reached at ljndawson at gmail dot com.

What I Saw at the Evolution
Notes on a career in data governance
Fresh out of college, I started my career in publishing in 1987, as an editorial assistant at Doubleday. (I was really bad at it.)
From there I moved into bookselling, and then databases about books and bookselling, and then…well, a bunch of different companies, some of which don’t exist anymore, some of which have evolved way beyond their original value proposition. And I ran two consulting companies of my own – the final one in 2016, where I ran Metadata Boot Camps for publishers, and was an ISNI registration agent.
But in 2016, the landscape changed. Whether for political or other reasons, the consulting money dried up. I was introduced to someone at HBO who’d made the jump from Pearson, a big educational publisher. I worked at HBO for a good while, creating and maintaining controlled vocabularies for regions and languages, doing data forensics and a LOT of mapping between systems. And then, three years later in 2019, after HBO had been acquired by AT&T and changed beyond all recognition, I leapt at an opportunity at an ad-tech company, EDO, where I spent 5 years.
Advertising is WILD. A single ad can run for anywhere from 4 hours to 4 quarters. The pace in this industry is moment-by-moment. The company Slack had multiple dozens – bordering on a hundred – separate channels. I was getting notifications every two minutes, all day, every day. I worked from home and couldn’t even get up from my desk to make food for myself. And I did it for 5 years, happily, because I had a terrific team and a lot of good work to do.
All of which is to say, I’ve seen things you people wouldn’t believe.
The world sees publishing (and libraries) as a near-antiquated business, ideas mired in paper, dust ever-present, presided over by nearsighted intellectuals – a very narrow view, to be sure. In that view, the book world is barely one step evolved from typewriters. Or quill pens. You get the idea.
But what non-book people don’t know is that the book industry is built on standards. Look at what the industry has – so many tools to bring order to chaos. So many organizations to help, no matter what size your organization is. So much tech that’s been around for so long that underpins everything the book industry does. The ISBN was invented in the 1960s; BISAC was invented in the 1970s; ONIX was invented in the 1990s. And these standards WORK, no matter where or how a book is being sold.
Industries that rely on BILLIONS of dollars, that rely on MILLIONS of subscribers, don’t have this bedrock of standards – industries like broadcast or linear TV that have to air certain things at certain times per day within certain markets; industries like streaming TV or audio, that have to make content available 24/7 to whoever has access to it; industries like advertising, which forms the economic basis of how all these other businesses operate.
Now – there ARE standards. Many valiant efforts – and I’m not dismissing any of them. Quite the opposite – standards are great! Standards help!
The difference between these standards and those used by the book industry is this – none of them are required to do business. There is no overall agreement, outside of a few technical standards (like aspect ratio or lumen count). There is no ONIX, there is no ISBN, there is no BISAC. No controlled vocabularies or authority files to bring sense to the data – each organization, each startup, has its own database, its own identifiers, its own protocols. Standards are a “nice to have.”
In fact, the more time I spent in the world of television and advertising, the more amazed I was that anything made it to air at all.
I arrived at HBO a few years after they started streaming – first with HBOGo and then with HBONow. Initially, HBO built their own streaming service. The season 4 premiere of Game of Thrones brought it screaming to a halt – the service crashed, and it made worldwide news. It was 11 years ago – I don’t think people remember that Game of Thrones was the biggest show on TV at the time. The complaints were loud, frequent, and…a good portion of the team behind HBOGo was fired. HBO then licensed white-label streaming capabilities from Major League Baseball to form the basis of HBONow.
In the course of this transition from cable to streaming, it was discovered that the connection with the satellite to broadcast HBO to cable networks was…a single server located under the desk of an engineer on Long Island, who hadn’t been with the company for years.
I’m going to let you sit with that for a minute. Think of all the HBO shows you’ve watched on cable – Game of Thrones, Silicon Valley, Girls, Sex and the City, The Sopranos, The Wire, even Westworld and Treme and The Righteous Gemstones – and think about that one poor neglected server that allowed that to happen.
Another thing that I think a lot of people don’t realize is the importance of sports to TV and advertising. The NFL season brings in the most advertising money. The Super Bowl really is the Super Bowl of ad revenue. But as important as the NFL is, there’s so much more – college football, the NBA, baseball, hockey, NASCAR, golf, and even soccer – and the advertising dollars invested in these sports run into the billions.
Much more than scripted TV, much more than reality TV, much more than DIY shows, much more than any other programming – whether streaming or broadcast, whether audio or video – sports dictates how TV and advertising are run.
Sports is live. Anything can happen, and frequently does. Players get injured, rain delays happen, a raccoon runs out onto the field. When this kind of thing happens, it pushes back the programming that’s scheduled to air after the game. Again, it’s a miracle anything on TV goes smoothly.
And the data reflects that chaos. The chaos is coming from all angles – the live performance, the streaming wars, the regional rights management (when a game is blacked out, whether a show can air in the Republic of Georgia), the half-life of an advertisement (which can be less than 24 hours, but you still have to record that it happened)… For a data governance geek, there’s a ton of mess to clean up, all day every day. You’re on this endless treadmill where documentation and data management are outpaced by The Next Thing That’s Happening.
There is a crying need for standardization – but the very nature of the chaotic industry prevents most standards from being developed. There’s general agreement on things like codecs and bitrates – those kinds of technical standards abound and are embraced. But standards around contributors, identifiers of things, all the things that we in the book industry take for granted…those don’t exist. There is no ONIX. And there’s no time to develop it – the landscape is shifting too quickly. At EDO, we used to talk about building one area of the house while the other was burning down. That’s the reality.
For organizations in TV and advertising, chaos is an opportunity to claim dominance. If you can move fast and break enough things – if you can create mayhem and confusion – you can tune the noise, the static, to your signal.
Back to publishing – it’s been almost 20 years since the last onslaught of chaos in the book world. For a while, ebooks upended publishing, rattling a lot of people before being absorbed into publishing workflows and settling back into business as usual. Why? Because we had standards for print that could be extended and built on to handle specific ebook issues such as accessibility, ownership vs licensing, reflowable vs fixed layout. We had established forums in which to discuss these things as an industry. And those forums could also extend – to work with organizations such as W3C. So yes, there was upheaval – no mistake, and a few of us built consulting practices out of soothing these roiling waters – but ultimately standards won out, and ebooks were not the extinction event that was predicted when they first became popular.
Where will the next disruption come from?
I’m not a cheerleader for AI – I think it has the potential to do far more harm than good, and I’ve seen that in my job search.
But it’s true that AI is where the book world has an advantage over just about every other industry – starting with the ISBN, and then BISAC codes (per my lovely friend The Loud Poet, there is a LOT of work to be done to bring those up to par, but per me, having worked on the BISAC committee, there’s a good basis for improvement there), and then ONIX, and the many, many standards developments since then. Quite simply, compared to other types of media, metadata is something the book industry does very, very well.
So, despite the chaos and upheaval that AI can cause, I have a lot of faith that publishers and libraries can work with AI in ways that make sense – and they are much better prepared to handle the lack of nuance, and the brute-force and ham-fisted results that we’re seeing while AI is in its relative infancy. ONIX and BISAC have the contextual signposts (or the potential signposts!) that bots need to distinguish between, say, mercury the element vs mercury the planet.
As the industry has these discussions, and participates in pilots from, say, Shimmer.ai and other AI providers – as the book trade wrestles with the problems that mass computing capabilities present – it’s important to remember: this business is weirdly ahead of the game. Bookland has been here before. And however messy and chaotic and disputatious things get, the book world is doing a LOT better than Netflix. Or Apple Podcasts. Or Amazon Prime Video. The standards that the publishing industry has established so far will be essential as the world moves into a reality-bending future.
Data governance revolves around standardization. Media, entertainment, advertising – even as they regard publishing and libraries as being slow-moving and staid – can learn a LOT from the standards that the book industry has developed. Interoperability is crucial as chaotic environments proliferate.
You know who to call.