Insight Into OpenAI’s GPT-4o: Everything We Know So Far


On May 13, 2024, OpenAI announced GPT-4o, a significant update to the large language model (LLM) behind ChatGPT, which is used by over 100 million people.

The new features, rolling out in the next few weeks, will provide voice and video options for all users, regardless of whether they are using the free or paid version of ChatGPT. The biggest advantage is that speaking to GPT-4o and showing it things through the camera makes interactions feel far more natural than typing alone.

OpenAI told viewers in the livestream that the changes are aimed at reducing “friction” between “people and machines” and “making AI accessible to everyone.”

In an impressive demonstration, chief technology officer Mira Murati and ChatGPT developers conducted real-time conversations with ChatGPT, including a request for a bedtime story.

At the request of OpenAI researcher Mark Chen, GPT-4o even made jokes in different voice tones – from playful to dramatic to singing.

Video capabilities, real-time voice communication, and simulated emotions were presented during the voice demonstration.

Key findings

  • OpenAI’s GPT-4o introduces voice and video features, allowing users to speak to the model and show it their surroundings through the camera.
  • The update aims to reduce the barrier between humans and machines by leveraging advanced AI capabilities to create more natural and seamless interactions.
  • GPT-4o can conduct real-time conversations, respond to multiple speakers at once, and even simulate emotions, adding depth and variety to interactions.
  • The upgrade includes quality and speed improvements in over 50 languages, as well as a desktop version for Mac users.
  • OpenAI recognizes the challenges associated with the potential misuse of real-time audio and video capabilities and emphasizes that it will work responsibly with stakeholders to address them.
  • GPT-4o will be rolled out gradually over the coming weeks, including a desktop app that will initially be available for Mac.

Using the video feature, ChatGPT conversed with the engineers in real time, solving mathematical equations written on paper and held in front of a phone camera while chatting playfully.

OpenAI says that the features rolling out in the next few weeks are intended to improve quality and speed in over 50 languages “to make this experience accessible to as many people as possible.”

The upgrade also includes a desktop version that will be released for Mac users on May 13, 2024, and will be available to paid users.

Different areas of application for GPT-4o

  • The team exchanged ideas on how university lecturers can provide their students with tools that support learning, whether through interactive learning materials, automated feedback systems, or personalized learning paths.
  • Similarly, podcasters can use GPT-4o’s new features to create content for their listeners that goes beyond just text. For example, they could create podcasts with interactive elements or respond to listener requests to create a personalized listening experience.
  • In addition, it was discussed how real-time data can be used in different areas of work, be it market research, customer support, or the analysis of real-time events, to make informed decisions and optimize processes.

OpenAI explains that GPT-4o (the ‘o’ stands for ‘Omni’) can respond to audio input in as little as 232 milliseconds, with an average of 320 milliseconds – similar to human response time in a conversation.

GPT-4o is available for free

While the features will also be available to free users, OpenAI emphasizes that Plus users will not be disadvantaged, as their message limits will be up to five times higher.

The changes will also affect the application programming interface (API): according to OpenAI, GPT-4o in the API is twice as fast and 50% cheaper than GPT-4 Turbo.
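To make the announced API figures concrete, the following minimal sketch applies "twice as fast" and "50% cheaper" to a workload. The baseline cost and latency are hypothetical values chosen purely for illustration, not OpenAI prices or measurements, and the function name `project_gpt4o` is our own.

```python
def project_gpt4o(baseline_cost_usd: float,
                  baseline_latency_s: float,
                  cost_reduction: float = 0.50,
                  speedup: float = 2.0) -> tuple[float, float]:
    """Apply the announced relative changes to a baseline workload.

    cost_reduction=0.50 models "50% cheaper"; speedup=2.0 models "twice as fast".
    Both defaults come from the announcement; the baseline is supplied by the caller.
    """
    new_cost = baseline_cost_usd * (1.0 - cost_reduction)
    new_latency = baseline_latency_s / speedup
    return new_cost, new_latency


# Hypothetical baseline: $200 per month in API spend, 4-second responses.
cost, latency = project_gpt4o(200.0, 4.0)
print(f"cost: ${cost:.2f}, latency: {latency:.1f}s")
# prints "cost: $100.00, latency: 2.0s"
```

In other words, a workload that keeps the same volume of requests would see its bill halved and its response time halved, assuming the announced figures hold for that workload.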

What particularly impressed us about the voice and video features was that all three presenters spoke to ChatGPT at the same time – the artificial intelligence was able to successfully identify all speakers and respond to each of them.

Some users on X, formerly Twitter, compared the new variant of ChatGPT to the movie “Her,” in which an omniscient AI was indistinguishable from a human personality. A real-time translation between Italian and English was also presented during the demonstration, prompted by a user question on X.

OpenAI ChatGPT 4o launch

OpenAI emphasized that “GPT-4o brings new challenges in dealing with real-time audio and video functionality in terms of misuse. We continue to work with various stakeholders to explore how we can best integrate these technologies into the world.”

As a result, the features will be rolled out gradually over the coming weeks while maintaining security precautions.

OpenAI commented in a blog post:

“Over the last two years, we have made significant efforts to achieve efficiency improvements at every level of the system.

As a first step forward in this development process, we can make a GPT-4 level model much more widely accessible. GPT-4o’s capabilities will be rolled out iteratively (with expanded Red Team access starting today).

GPT-4o’s text and image capabilities are rolling out to ChatGPT today (May 13, 2024). We make GPT-4o available in the free version and for Plus users with message limits up to 5 times higher. In the coming weeks, we will be rolling out a new version of Voice Mode with GPT-4o in alpha within ChatGPT Plus.”

OpenAI chose the date of the update cleverly: the announcement came one day before the Google I/O developer conference, which was expected to be AI-heavy.

Our conclusion

In summary, the launch of GPT-4o is a real milestone in AI development. This wider availability of AI opens up a lot of exciting possibilities – but also some challenges.

As the technological boundaries are pushed further and interactions between humans and machines grow more complex, we should not lose sight of the challenges that come with this expanded use, particularly regarding ethics and data protection.

It is critical that we ensure our innovations are consistent with our ethical values and humanity.

We are facing an exciting journey into the future of AI: we want to fully exploit its potential while upholding ethical standards and principles. It remains to be seen what that future will look like and what impact it will have on our daily lives.
