The generative artificial intelligence (AI) space continues to heat up as OpenAI unveils GPT-4V, a vision-capable model, and new multimodal conversational modes for its ChatGPT system.
With the new upgrades, announced on Sep. 25, users will be able to engage ChatGPT in spoken conversations. The models powering ChatGPT, GPT-3.5 and GPT-4, can now understand plain-language spoken queries and respond in one of five different voices.
ChatGPT can now see, hear, and speak. Rolling out over next two weeks, Plus users will be able to have voice conversations with ChatGPT (iOS & Android) and to include images in conversations (all platforms). https://t.co/uNZjgbR5Bm pic.twitter.com/paG0hMshXb
According to a blog post from OpenAI, this new multimodal interface will allow users to interact with ChatGPT in novel ways.
The upgraded version of ChatGPT will roll out to Plus and Enterprise users on mobile platforms in the next two weeks, with follow-on access for developers and other users “soon after.”
ChatGPT’s multimodal upgrade comes hot on the heels of the launch of DALL-E 3, OpenAI’s most advanced image generation system.
According to OpenAI, DALL-E 3 also integrates natural language processing, allowing users to converse with the model to fine-tune results, and it integrates with ChatGPT so users can get help crafting image prompts.
In other AI news, OpenAI competitor Anthropic announced a partnership with Amazon on Sep. 25. As Cointelegraph reported, Amazon will invest up to $4 billion in a deal that includes cloud services and hardware access. In return, Anthropic says it will provide enhanced support for Amazon Bedrock, Amazon’s foundation model service, along with “secure model customization and fine-tuning for businesses.”