
ChatGPT already listens and speaks. Soon it may see as well

Image: ChatGPT meets a dog. Credit: OpenAI

ChatGPT’s Advanced Voice Mode, which allows users to converse with the chatbot in real time, could soon gain the gift of sight, according to code discovered in the platform’s latest beta build. While OpenAI has not yet confirmed a release date for the new feature, code spotted by Android Authority in the ChatGPT v1.2024.317 beta build suggests that the so-called “live camera” could arrive imminently.

OpenAI first showed off Advanced Voice Mode’s vision capabilities for ChatGPT in May, when the feature launched in alpha. In a demo posted at the time, the system identified a dog through the phone’s camera feed, recognized the dog from past interactions, spotted the dog’s ball, and connected the dog to the ball (i.e., playing fetch).

Dog meets GPT-4o

The feature was an immediate hit with alpha testers. X user Manuel Sainsily used it to great effect, answering verbal questions about his new kitten based on the camera’s video feed.


Trying #ChatGPT’s new Advanced Voice Mode that just got released in Alpha. It feels like face-timing a super knowledgeable friend, which in this case was super helpful — reassuring us with our new kitten. It can answer questions in real-time and use the camera as input too! pic.twitter.com/Xx0HCAc4To

— Manuel Sainsily (@ManuVision) July 30, 2024

Advanced Voice Mode was subsequently released in beta to Plus and Enterprise subscribers in September, albeit without its additional visual capabilities. Of course, that didn’t stop users from going wild testing the feature’s vocal limits. Advanced Voice “offers more natural, real-time conversations, allows you to interrupt anytime, and senses and responds to your emotions,” according to the company.

The addition of digital eyes would certainly set Advanced Voice Mode apart from OpenAI’s primary competitors, Google and Meta, both of which have introduced conversational features of their own in recent months.

Gemini Live may be able to speak more than 40 languages, but it cannot see the world around itself (at least until Project Astra gets off the ground) — nor can Meta’s Natural Voice Interactions, which debuted at the Connect 2024 event in September, use camera inputs.

OpenAI also announced today that Advanced Voice Mode is now available for paid ChatGPT Plus accounts on desktop. Previously exclusive to mobile, the feature can now be accessed from your laptop or PC as well.

Andrew Tarantola
Former Digital Trends Contributor
OpenAI nixes its o3 model release, will replace it with ‘GPT-5’

OpenAI CEO Sam Altman announced via an X post Wednesday that the company's o3 model is being effectively sidelined in favor of a "simplified" GPT-5 that will be released in the coming months.

https://x.com/sama/status/1889755723078443244

Read more
Sam Altman thinks GPT-5 will be smarter than him — but what does that mean?

Sam Altman took part in a panel discussion at Technische Universität Berlin last week, where he predicted that GPT-5 would be smarter than him, or, more accurately, that he wouldn't be smarter than GPT-5.

He also did a bit with the audience, asking who considered themselves smarter than GPT-4, and who thought they would also be smarter than GPT-5.
"I don’t think I’m going to be smarter than GPT-5. And I don’t feel sad about it because I think it just means that we’ll be able to use it to do incredible things. And you know like we want more science to get done. We want more, we want to enable researchers to do things they couldn’t do before. This is the history of, this is like the long history of humanity."
The whole thing seemed rather prepared, especially since he forced it into a response to a fairly unrelated question. The host asked about his expectations when partnering with research organizations, and he replied "Uh... There are many reasons I am excited about AI. ...The single thing I'm most excited about is what this is going to do for scientific discovery."

Read more
OpenAI’s ChatGPT Search is now free to use without a login

ChatGPT is becoming more accessible to the masses. Its ChatGPT Search feature is now available without having to log in to the popular chatbot. Parent company OpenAI has also confirmed that ChatGPT Search will be free to use; the feature works similarly to a search engine.

When accessing ChatGPT’s web address, you will see ChatGPT Search front and center, with a message saying “What can I help you with?” You can immediately input your query into the text box. At the bottom of the text box are options labeled “Search” and “Reason.” The Search option lets you use the page without logging in, while selecting the Reason option will prompt you to log in or sign up to access ChatGPT.

Read more