Artificial intelligence (AI) is finding uses in more and more fields. In recent months, generating images from natural-language text descriptions has become very popular, and now the time has come to move from image to sound, specifically the human voice.
AudioLM’s AI is capable of reproducing pitch, timbre, articulation, and even the pauses for a speaker’s breathing
One of the latest introductions from Google’s research division is AudioLM, an AI capable of generating high-quality audio from a recording of a human voice lasting just a few seconds. One of its distinctive features is that it does not require prior training on transcriptions, and the new “speech” it produces stays syntactically and semantically consistent with the original speaker.
Beyond faithfully reproducing the pitch, timbre, intensity and articulation of the starting voice, it can also add the speaker’s breath sounds and, of course, form meaningful sentences. AudioLM achieves this by analyzing semantic and acoustic tokens, with the former conditioning the latter.
These capabilities also allow AudioLM to convert text to speech, or to enable computer systems and intelligent assistants to generate synthetic speech. For now Google has not opened AudioLM to the public, but it is not the only AI specialized in this task.
One of the best-known uses of this technology is the “Obi-Wan Kenobi” series on the Disney+ streaming platform. The voice of the character Darth Vader was not recorded by James Earl Jones, the actor who has traditionally voiced him on the big screen. Instead, his voice was recreated and “cloned” by the Ukrainian company Respeecher.
Jones signed a contract allowing AI to process the archive of recordings of his voice, so that the studio could have new lines of dialogue in the voice of the Dark Lord of the Sith without the 91-year-old actor having to go into a recording studio.
Another Disney production, also from its Lucasfilm division, used the same technology to synthetically recreate the voice of Luke Skywalker in the series “The Book of Boba Fett”.