Artificial Intelligence puts the smart into your smartphone, and it helps choose what movies you see and even who you date. It already helps to perform complex medical diagnoses and makes most of the trades on Wall Street, and it will soon be driving our cars. And now, new AI voice technologies have the potential to transform the audiobook industry.
These new technologies can replicate the human voice to create a listening experience that is virtually indistinguishable from the real thing, and they will enable publishers to create new, high-quality audio content without the cost and time constraints of traditional production methods.
The audiobook market is enjoying its eighth year of rapid growth across the globe and, according to the latest research from the Audio Publishers Association (APA), it’s new content that’s driving this growth. Last year, listeners consumed an average of eight audiobooks, up from six the previous year. Yet barriers to traditional audiobook production mean that only 6% of all books are currently available in audiobook format.
New, advanced Text-to-Speech (TTS) technologies are breaking down these barriers, cutting both the cost of production and the time to market by approximately half compared with traditional methods. That makes audiobook production more cost-effective for everyone, and much more accessible for small and mid-size publishers.
TTS technology is particularly suited to titles categorized as:
With these titles, the cost of studio production may not be justifiable. By removing the constraints of traditional production methods, publishers can produce more audiobooks quickly and cost-effectively.
DeepZen, a UK-based AI company, is a leader in the TTS field and produced the world’s first digitally narrated audiobook in 2019 (the darkly humorous The Reluctant Cannibals by Ian Flitcroft). In association with Ingram, it is launching a new Publisher Portal that is designed to make things simple and lets publishers conveniently manage all their audiobook projects in one place, with exclusive discounts available to Ingram partner publishers.
You might be asking yourself how AI audiobook production works and how realistic AI voices really sound. After all, even writers sometimes struggle to express the full range and experience of human emotions.
DeepZen’s technology was developed specifically for audiobooks and long-form content, and it’s benchmarked against human narration. Its AI voices are a world away from the robotic, monotone voice assistants we are all familiar with.
The technology incorporates AI voice, natural language processing, and next-generation algorithms. To put it simply, DeepZen has built a library of AI voices based on recordings of actors and narrators, who are fully paid for their work. This voice data is fed into machines, enabling them to ‘learn’ to speak, so that when they are given a new text they can ‘read’ it in the actor’s voice.
The digital voice ‘learns’ to express a wide range of emotions by processing examples of the narrator speaking, for example, in a ‘happy’ voice or an ‘angry’ voice. In the same way, it also ‘learns’ how to express elements of the human voice, such as pacing and intonation, that produce more realistic speech patterns.
Although it’s new, this technology has been tried and tested with very positive results.
In the last 12 months, DeepZen has:
Audiobooks produced by DeepZen are accepted by over 50 vendors globally, including Apple Books, Google Play, Kobo, Scribd, and Spotify.
The new Publisher Portal opens up a range of benefits to publishers by providing a high-quality, convenient, and cost-effective production service that converts text into audio in approximately half the time of traditional studio production, and at approximately half the cost.
DeepZen is providing a managed service that combines AI technology with human editing to ensure the high quality that listeners expect. The process couldn’t be easier:
It’s as simple as that. The whole process takes approximately three weeks, provided you meet the deadlines for pronunciations and quality control review.
Whether you’re excited or nervous about AI, there is no doubt that it offers real business benefits to publishers.