Imagine that someone with access to 15 seconds of your voice (something very easy to obtain, for example from a Story on your Instagram account or a video you have uploaded to your YouTube channel) could clone your voice precisely and make it say any phrase they want.
This may seem like science fiction, but it is now possible thanks to a new artificial intelligence presented by OpenAI, the company that created ChatGPT. It opens a whole new debate about the limits of AI and how it could be used to create deepfakes and fake audio clips that compromise people with phrases they never actually said, or that let someone sign up for services over the phone by imitating their voice.
For the moment the risks are considerable, which is why OpenAI has decided not to release Voice Engine, as this new AI is called, or open the technology to the general public, at least until it is clear how to limit its use and ensure it is applied in a way that does not put users at risk.
This is how OpenAI’s Voice Engine works
OpenAI developed Voice Engine back in 2022, but until now the company has been testing and refining it behind closed doors.
This AI can create very realistic, emotionally expressive voices from just a 15-second audio sample, closely imitating the speaker’s voice. From there, it can read out text even in languages other than that of the person whose voice was cloned, reproducing their tone and any filler words they tend to use.
According to OpenAI, this new technology, which has been trained on public voice databases, has great potential for uses such as reading assistance, simultaneous translation of content, and helping people with speech difficulties recover their voice.
However, it also requires strict usage policies that prohibit using a person’s or organization’s voice without their explicit consent or for any purpose they have not been informed about.
In fact, OpenAI has openly acknowledged that generating imitations of other people’s speech carries serious risks for now, which is why it is not yet making this technology available to everyone. At the moment, only a select few can try Voice Engine, just as only a few have access to Sora, OpenAI’s text-to-video tool, which was presented just a few weeks ago.
According to the company, until it has more information about how the tool is used, it will not be able to decide when to allow widespread use of this technology in a responsible manner.
In any case, this is not the only AI in the world that can imitate voices. Microsoft has one that can imitate the voices of deceased people, and Google also has an AI tool that converts text to speech with all kinds of voices.