HarperCollins has struck a deal with Microsoft to license some of its nonfiction backlist for training an AI model that has not yet been announced. Authors can choose whether to opt in to this one-off deal. Payment is $5,000 per title, split 50-50 between the author and HarperCollins (so $2,500 each), for a nonexclusive license that lasts three years. (But does it even matter that the agreement is time limited? Once trained, always trained? What could change or develop in the AI industry in three years’ time?) Most writers responded on social media with anger at HarperCollins and disgust at the amount offered per title, although it’s unlikely any amount would be sufficient for authors who oppose generative AI or AI companies, or who believe training these models will lead to their replacement.
Author Daniel Kibblesmith (Santa’s Husband) helped break the news by sharing the entirety of his agent’s email about the deal. It reads in part, “We know that these terms have already been negotiated between HarperCollins and the tech company and has been agreed to by several hundred authors, so individual negotiation at this point isn’t possible. It’s a yes or no choice. We have done some negotiation to make the language of the contractual amendment more specific and limited.” Curiously, Kibblesmith’s book, Santa’s Husband, is not nonfiction but an illustrated children’s book about a gay Santa. Was this title included by mistake? Does the AI deal go further than nonfiction?
Michael Cader in Publishers Lunch reports (sub required) that specific guardrails have been put in place that “would limit the model to output of no more than 200 consecutive words and/or 5 percent of the book’s text across multiple outputs during a user session.”
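For readers who want to see what such a guardrail could mean in practice, here is a minimal sketch in Python. It is an illustration only, not how Microsoft or HarperCollins implement the limits. It assumes that "and/or" means either cap alone blocks an output, that the 5 percent share is measured in words, and that some other system has already counted how many consecutive words of the book an output reproduces. The `SessionQuota` class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SessionQuota:
    """Tracks verbatim book text released during one user session (hypothetical)."""

    book_word_count: int
    max_run: int = 200        # reported cap on consecutive words per output
    max_share: float = 0.05   # reported cap on share of the book per session
    words_used: int = 0       # verbatim words already released this session

    def allow(self, run_length: int) -> bool:
        """Return True and record the excerpt only if it fits both caps.

        Assumption: exceeding either cap is enough to block the output.
        """
        if run_length > self.max_run:
            return False
        if self.words_used + run_length > self.max_share * self.book_word_count:
            return False
        self.words_used += run_length
        return True


# Example: for an 80,000-word book, 5 percent is 4,000 words per session.
quota = SessionQuota(book_word_count=80_000)
first_excerpt_ok = quota.allow(200)   # fits both caps
too_long_ok = quota.allow(201)        # exceeds the consecutive-word cap
```

Under these assumptions, a user could receive at most twenty 200-word excerpts from an 80,000-word book in one session before the 5 percent cap blocks further verbatim output.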
HarperCollins is not the first publisher to strike a deal like this, although it is the first of the Big Five publishers to do so. (The others have been professional, academic, and scholarly publishers.) It’s only a matter of time before similar deals are struck with other major publishers and more authors are offered take-it-or-leave-it terms. I would not be surprised if this is the largest amount of money, on a title-by-title basis, that will be offered to the average (not name-brand) nonfiction author. That doesn’t mean authors should accept the deal, but dozens of media companies have already decided it’s better to take the money, have a hand in shaping the terms, and seize whatever advantages are on offer right now, while training material is clearly in demand. This interview with the head of The Atlantic is instructive in that regard.
For authors who opt out of the HarperCollins deal but want to explore other licensing opportunities, there’s Created by Humans (which has partnered with the Authors Guild), but no deal has yet been announced with AI companies. It seems unlikely or even impossible that the average author could negotiate their own bespoke deal in the foreseeable future; AI companies are currently signing deals with major companies that bring in as much data in aggregate as possible.
I don’t believe publishers are looking to displace authors or creative people in striking these deals, but, as was discussed in a Book Industry Study Group (BISG) panel last month, they have to consider their shareholders before turning down multimillion-dollar deals that boost their profits significantly. Paul Sweeting at RightsTech said, “As a publisher, you’re not really supposed to ignore those sorts of opportunities. It’s something that publishers of all kinds need to figure out. It’s really just too big to simply say no to. What do you get out of saying no?” But there remain many unknowns. Is this short-term decision-making that they will regret later? Only time will tell.
Another consideration, as Cader writes, is how this deal points to a “commercially valuable market for authors’ work in training LLMs—which could be essential in setting damages or negotiating settlements in the many pending lawsuits against AI companies for stealing authors’ work without permission. It’s also a significant rebuff to the current legal position of the AI giants, who claim that using the whole text of books for AI training is fair use under the law and does not require a license at all.”
For further detail:
- Publishers Weekly has more commentary and perspective, useful for all.
- The Authors Guild has posted a formal statement that every author should read to understand the rights in play, not least between publisher and author, in regard to AI training. It also says, “It is important to understand that the licensed use of books must replace AI companies’ current unlicensed and uncontrolled use. Moving to a regime of licensed AI use gives authors the power to say no or to insist on limits on output uses and be compensated. The current regime, which relies on fair use, means that authors have no ability to prevent use of their books by AI or control output uses.”

Jane Friedman has spent her entire career working in the publishing industry, with a focus on business reporting and author education. Established in 2015, her newsletter The Bottom Line provides nuanced market intelligence to thousands of authors and industry professionals; in 2023, she was named Publishing Commentator of the Year by Digital Book World.
Jane’s expertise regularly features in major media outlets such as The New York Times, The Atlantic, NPR, The Today Show, Wired, The Guardian, Fox News, and BBC. Her book, The Business of Being a Writer, Second Edition (The University of Chicago Press), is used as a classroom text by many writing and publishing degree programs. She reaches thousands through speaking engagements and workshops at diverse venues worldwide, including NYU’s Advanced Publishing Institute, Frankfurt Book Fair, and numerous MFA programs.



