OpenAI Introduces Advanced Voice AI for ChatGPT Premium Subscribers
OpenAI has launched an advanced voice feature for its ChatGPT platform that enables real-time audio interaction. Powered by the GPT-4o model, the feature promises hyper-realistic spoken responses, letting users converse with ChatGPT without delays and even interrupt it mid-sentence.
The alpha version of Advanced Voice Mode is initially available to a select group of ChatGPT Plus subscribers, with a rollout to all Plus users planned for the fall. This cautious release follows the controversy over the feature’s first demonstration in May.
During that May showcase, one of the voices, named “Sky,” drew significant attention for its striking resemblance to actress Scarlett Johansson. Johansson said she had declined OpenAI’s request to use her voice; the company denied imitating her likeness but chose to pull the voice anyway, underscoring the legal complexities surrounding AI and celebrity likeness rights.
To limit misuse, OpenAI has restricted the system to four preset voices created with paid voice actors, a measure intended to prevent deceptive deepfakes and keep the AI from impersonating specific individuals or public figures.
OpenAI announced on X, formerly Twitter, that it tested GPT-4o’s voice capabilities with more than 100 external testers across 45 languages. The company said the model is trained to speak only in the four preset voices and that it has built systems to block audio output that deviates from them, along with guardrails that reject requests for violent content.
OpenAI has also added filters that block requests to generate music or other copyrighted audio, a move that likely stems from recent legal actions against AI companies over alleged copyright infringement.
The new voice feature in ChatGPT aims to provide users with more realistic and seamless conversations while addressing the legal and ethical challenges posed by advanced AI technologies.