Last year, OpenAI announced Voice Engine, an AI tool that could clone voices with just 15 seconds of audio. A year later, it’s still not available to the public. OpenAI hasn’t shared when, or if, it will launch.
Why the delay? The company might be worried about misuse. It could also be trying to avoid regulatory scrutiny. OpenAI has faced criticism before for releasing AI products too quickly. Some say it prioritizes flashy launches over safety.
In a statement, OpenAI said it is still testing Voice Engine with “trusted partners.” The company is learning from these tests to improve safety and usefulness. It claims the tool is helping with speech therapy, language learning, customer support, video game characters, and AI avatars.
A Rocky Start
Voice Engine powers voices in OpenAI’s text-to-speech API and ChatGPT’s Voice Mode. It can produce natural-sounding speech that closely mimics the original speaker, and it ships with built-in safeguards against misuse.
OpenAI originally planned to release it on March 7, 2024, with early access for a group of 100 developers. Those building socially beneficial applications or demonstrating responsible AI use would have gotten priority. Pricing was also set: $15 per million characters for standard voices and $30 per million for high-definition voices.
Then, at the last minute, OpenAI changed course. Instead of a public release, it gave access to just 10 developers, saying it wanted to hold discussions about ethical AI and gather feedback on responsible voice deployment before expanding access.
Years in Development, Yet No Launch
OpenAI started working on Voice Engine in 2022. In 2023, it showcased the tool to global policymakers to highlight both its potential and its risks. Even so, some companies are already using it. Livox, a startup that helps people with disabilities communicate, has tested Voice Engine.
Livox’s CEO, Carlos Pereira, found it impressive. But because the tool requires an internet connection, it wasn’t practical for many of Livox’s users, and he hopes OpenAI will develop an offline version. Despite the testing, OpenAI hasn’t shared a launch timeline, and Pereira hasn’t received updates about pricing or availability.
So far, Livox has used the tool for free.
Safety Concerns and AI Scams
In June 2024, OpenAI hinted at one reason for the delay: election security. AI-generated voices can be misused, and OpenAI wanted to prevent election-related abuse. The company says it built safeguards, including watermarking to track generated audio.
It also required developers to obtain consent from the original speaker and users to disclose that the voices are AI-generated. However, OpenAI hasn’t explained how it enforces these rules, probably because enforcement at scale is difficult, even for a company of its size.
OpenAI also suggested it might add voice authentication to verify speakers. Another idea was a “no-go” list to block voices that sound like famous people. But these are complex challenges and getting them wrong could damage OpenAI’s reputation.
AI voice cloning is already a major problem. In 2024, it became one of the fastest-growing scams: fraudsters used it to bypass security checks at banks, deepfaked voices of celebrities and politicians spread quickly online, and an AI-generated audio clip of Donald Trump Jr. appearing to endorse Russia over Ukraine recently went viral.
The problem: many people believed the audio was real before a disclaimer was issued.