OpenAI’s latest upgrade essentially lets users livestream with ChatGPT

2024-05-14 15:55:14

ChatGPT creator OpenAI has announced its latest AI model, GPT-4o, a chattier, more humanlike AI chatbot, which can interpret a user’s audio and video and respond in real time.

A series of demos released by the firm shows GPT-4 Omni helping potential users with things like interview preparation — by making sure they look presentable for the interview — as well as calling a customer service agent to get a replacement iPhone.

Other demos show it can share dad jokes, translate a bilingual conversation in real time, be the judge of a rock-paper-scissors match between two users, and respond with sarcasm when asked. One demo even shows how ChatGPT reacts to being introduced to the user’s puppy for the first time.

"Well hello, Bowser! Aren't you just the most adorable little thing?" the chatbot exclaimed.

Say hello to GPT-4o, our new flagship model which can reason across audio, vision, and text in real time: https://t.co/MYHZB79UqN

Text and image input rolling out today in API and ChatGPT with voice and video in the coming weeks. pic.twitter.com/uuthKZyzYx
— OpenAI (@OpenAI) May 13, 2024

“It feels like AI from the movies; and it’s still a bit surprising to me that it’s real,” said the firm’s CEO, Sam Altman, in a May 13 blog post.

“Getting to human-level response times and expressiveness turns out to be a big change.”

A text and image-only input version was launched on May 13, with the full version set to roll out in the coming weeks, OpenAI said in a recent X post.

GPT-4o will be available to both paid and free ChatGPT users and will be accessible through OpenAI’s API.

OpenAI said the “o” in GPT-4o stands for “omni,” and that the model marks a step toward more natural human-computer interaction.

Introducing GPT-4o, our new model which can reason across text, audio, and video in real time.

It's extremely versatile, fun to play with, and is a step towards a much more natural form of human-computer interaction (and even human-computer-computer interaction): pic.twitter.com/VLG7TJ1JQx
— Greg Brockman (@gdb) May 13, 2024

GPT-4o’s ability to process any combination of text, audio and image input at the same time is a considerable advance over OpenAI’s earlier AI tools, such as GPT-4, which often “loses a lot of information” when forced to multitask.

OpenAI said “GPT-4o is especially better at vision and audio understanding compared to existing models,” which even includes picking up on a user’s emotions and breathing patterns.

It is also “much faster” and “50% cheaper” than GPT-4 Turbo in OpenAI’s API.

The new AI tool can respond to audio inputs in as little as 232 milliseconds, with an average time of 320 milliseconds, OpenAI claims, which it says is similar to human response times in an ordinary conversation.
