OpenAI on Monday announced its latest artificial intelligence large language model that it says will be easier and more intuitive to use.
The new model, called GPT-4o, is an update to the company’s previous GPT-4 model, which launched just over a year ago. The model will be available to non-paying users, meaning anyone will have access to OpenAI’s most advanced technology through ChatGPT.
Based on the company’s Monday demonstration, GPT-4o will effectively turn ChatGPT into a digital personal assistant that can engage in real-time, spoken conversations. It will also be able to interact using text, voice and so-called vision, meaning it can view screenshots, photos, documents or charts uploaded by users and have a conversation about them.
OpenAI Chief Technology Officer Mira Murati said ChatGPT will now also have memory capabilities, meaning it can learn from previous conversations with users, and can do real-time translation.
“This is the first time that we are really making a huge step forward when it comes to the ease of use,” Murati said during the live demo from the company’s San Francisco headquarters. “This interaction becomes much more natural and far, far easier.”
The new release comes as OpenAI seeks to stay ahead of the growing competition in the AI arms race. Rivals including Google and Meta have been working to build increasingly powerful large language models that power chatbots and can be used to bring AI technology to various other products.
The OpenAI event came one day ahead of Google’s annual I/O developer conference, at which it’s expected to announce updates to its Gemini AI model. Like the new GPT-4o, Google’s Gemini is multimodal, meaning it can interpret and generate text, images and audio. OpenAI’s update also comes ahead of expected AI announcements from Apple at its Worldwide Developers Conference next month, which could include new ways of incorporating AI into the next iPhone or iOS releases.
Meanwhile, the latest GPT release could be a boon to Microsoft, which has invested billions of dollars into OpenAI to embed its AI technology into Microsoft’s own products.
OpenAI executives demonstrated a spoken conversation with ChatGPT to get real-time instructions for solving a math problem, to tell a bedtime story and to get coding advice. ChatGPT was able to speak in a natural, human-sounding voice, as well as a robot voice — and even sang part of one response. The tool was also able to look at an image of a chart and discuss it.
They also showed the model detecting users’ emotions; in one instance, it listened to the executive’s breathing and encouraged him to calm down.
“You’re not a vacuum cleaner!” the female voice of ChatGPT jokingly told the staff member.
And it was able to have a conversation in multiple languages by translating and responding automatically.
Murati said that OpenAI will launch a ChatGPT desktop app with the GPT-4o capabilities, giving users another platform to interact with the company’s technology.
The updated technology and features are set to roll out to ChatGPT in the coming months.