Sesame, the AI startup behind the viral voice assistant Maya, has officially released its base AI model, CSM-1B. Now, developers worldwide can access it under an open-source Apache 2.0 license.
The release opens new opportunities for innovation, but it also raises ethical concerns: as AI-generated voice technology advances, so do the risks of misuse.
What Is CSM-1B?
CSM-1B is a 1-billion-parameter model that generates audio from text and voice inputs. The model uses residual vector quantization (RVQ) to encode audio as discrete tokens and reconstruct human-like speech from them. The same technique is used in Google’s SoundStream and Meta’s EnCodec.
Here’s how it works:
- The model analyzes text or audio input.
- RVQ converts the input into discrete audio tokens.
- The decoder rebuilds human-like speech.
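To make the RVQ step more concrete, here is a minimal Python sketch of residual vector quantization on a single small vector. It is an illustration under stated assumptions, not Sesame's code: the `CODEBOOKS` values and the function names are hypothetical, and real codecs such as SoundStream and EnCodec learn their codebooks from audio and apply them to the output of a neural encoder rather than to raw numbers.

```python
# Toy residual vector quantization (RVQ). All values here are made up for illustration.

# Two hypothetical codebooks: a coarse first stage and a finer second stage.
CODEBOOKS = [
    [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]],
    [[0.0, 0.0], [0.25, 0.0], [0.0, 0.25], [-0.25, -0.25]],
]


def nearest(codebook, vec):
    """Return the index of the codebook entry closest to vec (squared Euclidean distance)."""
    def dist(entry):
        return sum((a - b) ** 2 for a, b in zip(entry, vec))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))


def rvq_encode(vec, codebooks):
    """Quantize vec in stages; each stage encodes the residual left by the previous stage."""
    residual = list(vec)
    tokens = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        tokens.append(idx)
        # Subtract the chosen entry so the next stage only sees what is still unexplained.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return tokens


def rvq_decode(tokens, codebooks):
    """Rebuild an approximation by summing the chosen entry from every stage."""
    out = [0.0] * len(codebooks[0][0])
    for idx, codebook in zip(tokens, codebooks):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


if __name__ == "__main__":
    tokens = rvq_encode([1.2, 1.0], CODEBOOKS)
    print(tokens)                         # one discrete token per stage
    print(rvq_decode(tokens, CODEBOOKS))  # approximate reconstruction
```

With these toy codebooks, the input `[1.2, 1.0]` becomes the tokens `[1, 1]` and decodes to `[1.25, 1.0]`. The first stage captures the coarse shape and the second refines it, which is why adding stages generally improves fidelity.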
Sesame built CSM-1B on Meta’s Llama model and then added an audio decoder. While the model is powerful, it has limitations. It isn’t fine-tuned for specific voices. It also performs poorly on non-English languages; what little capacity it has there comes from data contamination in the training set.
No Built-in Safeguards
The most controversial aspect of CSM-1B is its lack of security measures. Unlike other AI voice cloning companies, Sesame hasn’t added strict safeguards. Instead, it relies on an honor system. The company advises users not to misuse the technology, which means no voice impersonation, fake news, or malicious activities.
This approach to “safeguarding” has raised red flags. Consumer Reports recently warned that many AI voice cloning tools lack protections against fraud, and CSM-1B is no exception.
When tested on Hugging Face, the model cloned a voice in under a minute. From there, generating speech on sensitive topics, like elections or propaganda, was simple. This raises an important question: “How can AI-generated voices be regulated to prevent misuse?”
Recently, an audio clip of a cloned Donald Trump Jr. voice appearing to support Russia over Ukraine went viral. By the time it was identified as AI-generated, significant damage had already been done. Without effective safeguards, incidents like this will continue unchecked.
Sesame’s Bigger Vision
Despite these concerns, Sesame is moving full speed ahead. The company was co-founded by Oculus co-creator Brendan Iribe and has secured funding from Andreessen Horowitz, Spark Capital, and Matrix Partners. While Maya and Miles are its biggest successes so far, the company has bigger plans.
Sesame is now developing AI-powered smart glasses designed for all-day wear. Details are scarce, but the company is reportedly targeting the wearable AI market. That could put it in direct competition with Meta and Apple.