Choose Your Estimates: Ebook Sales Numbers Are “Known Unknowns”

You may have noticed that reports coming out of the BookInsights Conference staged in London by Nielsen last month included—as it was headlined by the Bookseller—“Self-Published Titles [Are] ‘22 Percent of UK E-book Market’.”

The reason that’s an eye-catcher is that, of course, we don’t know how many ebooks are sold in the two main digital markets, the UK and US. Amazon and other major online retailers don’t report their ebook sales. And without an ISBN registration on many titles, we’re simply unable to do more than estimate how many digital titles are out there and how many authors are writing them. Because it’s had a lot of coverage in its two years, the Author Earnings methodology (which uses Amazon sales and rankings) is better known to many authors than the more traditional evaluation approaches used by Nielsen.

So we asked our Nielsen contacts how—when they conduct their market surveys of consumers—they go about generating a percentage of overall ebook sales for a market like the UK without having Amazon sales figures. Nielsen director of research Jo Henry gave us this response, in which she’s describing a fairly arduous set of protocols:

“We ask respondents for the ISBN of the book they have purchased and, as they take the survey, this is matched against industry bibliographic databases (which do include some self-published ebooks) through an API lookup. If the respondent checks that what the search returns is the book they bought, then the survey automatically fills in the publisher, etc. If there is no ISBN, we ask for author and title, and the search is repeated. So if all works perfectly, when the raw survey data reaches us, we know who the publisher is and anything like CreateSpace [Amazon’s print-on-demand division] or similar organizations gets identified as self-published.

“If no match is made at either the ISBN or title level, the publisher field gets left blank, so once we have the raw data in-house we do an exhaustive manual cleansing process, which includes hand-coding the publisher to records that are missing them. At this point we don’t just search industry databases but also Amazon to identify as self-published anything that is a known self-publisher (CreateSpace, etc.) or, for example, [anything] where the author and the publisher name is the same. This process obviously relies on our industry knowledge to a large extent and won’t be 100 percent correct. But we’re satisfied it picks up the minimum amount of self-publishing—though we are probably missing some of the market!”
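For readers who think in code, the matching and hand-coding steps Henry describes can be roughly sketched in Python. This is only an illustration of the logic as quoted, not Nielsen's actual system: the record fields, the function names, the two lookup dictionaries standing in for the bibliographic databases, and the one-entry list of known self-publishers are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Assumption: an illustrative list. Only CreateSpace is named in the quote.
KNOWN_SELF_PUBLISHERS = {"createspace"}


@dataclass
class SurveyRecord:
    """One reported purchase (hypothetical field names)."""
    title: str
    author: str
    isbn: Optional[str] = None
    publisher: Optional[str] = None  # left blank when no match is made


def _norm(name: str) -> str:
    """Lowercase and collapse whitespace so names compare loosely."""
    return " ".join(name.lower().split())


def lookup_publisher(
    record: SurveyRecord,
    by_isbn: Dict[str, str],
    by_author_title: Dict[Tuple[str, str], str],
) -> Optional[str]:
    """Try the ISBN first, then fall back to author and title."""
    if record.isbn and record.isbn in by_isbn:
        return by_isbn[record.isbn]
    return by_author_title.get((_norm(record.author), _norm(record.title)))


def classify(record: SurveyRecord) -> str:
    """Label a record using the two signals mentioned in the quote."""
    if not record.publisher:
        return "unknown"
    publisher = _norm(record.publisher)
    # Signal 1: a known self-publishing service.
    # Signal 2: author and publisher names are the same.
    if publisher in KNOWN_SELF_PUBLISHERS or publisher == _norm(record.author):
        return "self-published"
    return "traditional"


def self_published_share(records: List[SurveyRecord]) -> float:
    """Share of all records labelled self-published.

    Unknown records stay in the denominator but not the numerator,
    so the result behaves like a floor rather than a best guess.
    """
    if not records:
        return 0.0
    hits = sum(1 for r in records if classify(r) == "self-published")
    return hits / len(records)
```

Note the design choice in the last function: a record whose publisher cannot be identified is never counted as self-published, which is one way a process like this would tend to capture the minimum and miss some of the market.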

Bottom line: Nielsen suggests that it may be erring on the side of underestimating self-published ebooks’ portion of the UK market, and we’re reminded that every research approach to this problem can only be approximate. We don’t know how many self-published titles and authors are out there. And we don’t know which estimation strategy gets us closer to those missing numbers. As then-Secretary of Defense Donald Rumsfeld said in 2002, there are “known knowns” and, surely, there are “unknown unknowns.” In terms of ebook sales, it’s the “known unknowns” we have to clear up if we’re to finally get a fix on what’s in the self-publishing marketplace and who’s writing it.