With thousands of new podcasts launching every month, content creators have a slew of software tools to help streamline production. Many are built on artificial intelligence (AI), which acts as a virtual audio assistant that improves podcast quality while letting creators focus on storytelling.
“Media production is now entering a phase where if you can dream it, it can happen,” says Jay LeBoeuf, head of business and corporate development at solution provider Descript. “And you no longer need to have an expensive studio or decades of training to accomplish your goals.”
The result has been a surge in podcasting, plus tremendous growth in podcast-like content, whether created by brands for promotion or event producers for on-demand consumption. Every piece of content needs to be produced and distributed, whether by audio professionals or enthusiasts learning the craft. Therefore, the more they can automate large swaths of audio production, the more they can concentrate on high-quality content. That’s a win-win for creators and consumers.
“A lot of podcasters aren’t even going through the big platforms now,” says Andy Taylor, formerly of BBC Radio and founder of Cardiff-based R&D consultancy Bwlb. “They’re going direct to their listeners, selling premium content, and having big success.”
AI Helps Quicken Workflows
Artificial intelligence — software that can automate tasks previously done by humans — holds the key to handling all this podcast content. Not only can AI speed up production, but it can also make podcasts sound better, which is important when generating the premium-quality content that discerning listeners seek out.
“AI basically helps take care of repetitive tasks to quicken the workflow of the podcaster,” explains Manos Chourdakis, research engineer at Nomono, which develops AI-based podcasting tools. “For example, with AI, you don’t have to listen to a whole podcast to find where someone said something wrong, then replace or remove it. You could do that yourself, but AI does it faster.”
Then there are chores that can only be accomplished with AI — at least at scale — such as removing noise or enhancing dialogue. “Good-quality dialogue enhancement would be impossible without AI,” Chourdakis says. “At least impossible in a reasonable timeframe using traditional tools.”
In fact, applications of AI in podcasting are as varied as production tasks. Some are built directly into podcast platforms. For example, when creators upload their podcasts to hosting platform Podcast.co, the system automatically “listens” to the audio files and normalizes sound levels.
“Any tool that can help reduce the mind-numbing bits of a job is a good thing,” says Mike Cunsolo, the platform’s co-founder. Cunsolo also runs Cue, a podcast production company working with corporate brands, and Matchmaker.fm, which connects podcast producers with guests. “You’ll always need that human expertise element, but soon machines could learn to understand what makes a podcast interesting and reduce time on task.”
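To make the idea of automatic level normalization concrete, here is a minimal Python sketch. It is not Podcast.co's actual method, which the article does not describe. It assumes audio has already been decoded into floating-point samples between -1.0 and 1.0, and it uses a simple RMS measurement, whereas production systems typically measure perceptual loudness (for example, in LUFS). The function names and the -20 dBFS target are illustrative assumptions.

```python
import math


def rms_dbfs(samples):
    """Return the RMS level of float samples (range -1.0..1.0) in dBFS."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")
    return 10.0 * math.log10(mean_square)


def normalize_rms(samples, target_dbfs=-20.0, peak_ceiling=1.0):
    """Scale samples toward target_dbfs (hypothetical default) without clipping."""
    current = rms_dbfs(samples)
    if current == float("-inf"):
        return list(samples)  # silence: nothing to scale
    # Convert the level difference in decibels to a linear gain factor.
    gain = 10.0 ** ((target_dbfs - current) / 20.0)
    peak = max(abs(s) for s in samples)
    # Limit the gain so the loudest sample stays at or below the ceiling.
    gain = min(gain, peak_ceiling / peak)
    return [s * gain for s in samples]
```

A quiet recording passed through `normalize_rms` comes back at the target level, while a recording that is already near full scale is raised only as far as the peak ceiling allows.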
For its part, Descript applies AI to many aspects of podcast engineering, including noise removal and echo control. One of the more “mind-numbing” chores Descript can handle is room tone.
“Sometimes producers need to insert digital silence into a podcast. Maybe between edits or to drag out the spacing between sentences,” says LeBoeuf. “But that sounds incredibly unnatural.” If producers didn’t capture room tone when a podcast was recorded, they may have to go back and get it. Or they can listen for it in the recording, copy-and-paste where needed, then edit the result to make it blend naturally.
Or computers can handle it. Descript’s AI-based room tone generator analyzes a recording, identifies the room tone, and automatically synthesizes it where it’s needed. Such technology not only avoids menial tasks, but it also allows for greater production flexibility.
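The manual copy-and-paste approach to room tone can be sketched in a few lines of Python. This is a deliberately naive illustration, not Descript's generator: it simply takes the quietest non-silent stretch of a recording as the room tone and loops it to fill a gap, whereas a real tool would synthesize new tone and blend the edges. It assumes audio is a plain list of float samples, and the function names and default frame length are hypothetical.

```python
def frame_energy(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(s * s for s in frame) / len(frame)


def find_room_tone(samples, frame_len):
    """Return the lowest-energy non-silent frame as a room tone estimate."""
    best = None
    best_energy = float("inf")
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = frame_energy(frame)
        # Skip pure digital silence, which carries no room tone.
        if 0.0 < energy < best_energy:
            best, best_energy = frame, energy
    return best


def fill_gap(samples, gap_start, gap_len, frame_len=2048):
    """Insert gap_len samples of looped room tone at index gap_start."""
    tone = find_room_tone(samples, frame_len)
    if tone is None:
        raise ValueError("no usable room tone found")
    # Repeat the tone frame as many times as needed to cover the gap.
    fill = [tone[i % len(tone)] for i in range(gap_len)]
    return samples[:gap_start] + fill + samples[gap_start:]
```

Looping a single frame tends to produce an audible repeating pattern over longer gaps, which is one reason synthesized room tone is the more natural-sounding option.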
“AI is going to allow us to use less expensive hardware, worse-sounding rooms, and noisier locations and still get good results,” says Nomono’s Chourdakis.
Emerging Applications of AI
AI also opens the door to innovation in podcasting — creating new solutions that raise the bar for podcasters and listeners. For example, the Epidemic Audio Reference (EAR) tool helps podcasters find copyright-free music based on songs they like.
“Say you’re looking for intro or outro music, and you’re thinking of a particular song, but it’s protected by copyright,” says Chourdakis. “The system uses AI under the hood to help you find something similar.”
At Bwlb, Taylor’s team developed Accordion, an AI-based solution that can take a podcast and reproduce it at various lengths.
“Every other part of our life is getting smarter — smart homes, smart refrigerators,” Taylor says. “People want more control and convenience from their podcast experience, too.”
When Taylor worked on documentaries for the BBC, he’d be asked for shorter versions to run on different platforms. The process was always manual. Accordion applies software algorithms to podcast content to intelligently create versions of different lengths. “It doesn’t speed anything up,” Taylor says, “but it gives the user control over the duration of the content without losing tone, structure, or listenability.”
No Advanced Audio Knowledge Needed
Ultimately, the more podcasters use AI tools, the better those tools become. In other words, the more data the tools ingest, the more they learn. Nomono’s dialogue enhancement algorithms are based on large datasets of voice recordings, some clean and intelligible and some less so, which teach the AI tools how to generate better sound.
In the future, AI tools will evolve to create a new genre of immersive, spatial podcasts. Nomono’s technology, for example, enables object-based audio production, which allows producers to position voices in a 3D soundscape or create dynamic versions that can be tailored to listeners. And while that sounds pretty advanced, it doesn’t have to be.
“Podcasters shouldn’t need advanced audio knowledge to produce high-quality audio,” says Chourdakis. “By automating some of these tasks, they can spend more time focusing on great storytelling.”
Brad Grimes is a long-time technology journalist and former communications director of the Audiovisual and Integrated Experience Association.
[Editor’s note: This is a contributed article. Streaming Media accepts contributor bylines based solely on their value to our readers.]