The new ChatGPT can ‘see’ and ‘talk.’ Here’s what it’s like.

By Kevin Roose

October 1, 2023 — 4.55pm

ChatGPT — viral artificial intelligence sensation, slayer of boring office work, sworn enemy of high school teachers and Hollywood screenwriters alike — is getting some new powers.

Last week, ChatGPT’s maker, OpenAI, announced that it was giving the popular chatbot the ability to “see, hear and speak” with two new features.

The first is an update that allows ChatGPT to analyse and respond to images. You can upload a photo of a bike, for example, and receive instructions about how to lower the seat, or get recipe suggestions based on a photo of the contents of your refrigerator.

The second is a feature that allows users to speak to ChatGPT and get responses delivered in a synthetic AI voice, the way you might talk with Siri or Alexa.

These features are part of an industry-wide push towards so-called multimodal AI systems that can handle text, photos, videos and whatever else a user might decide to throw at them. The ultimate goal, according to some researchers, is to create an AI capable of processing information in all the ways a human can.

Most users don’t have access to the new features yet. OpenAI is offering them first to paying ChatGPT Plus and Enterprise customers over the next few weeks, and will make them more widely available after that. (The vision feature will work on both desktop and mobile, while the speech feature will be available only through ChatGPT’s iOS and Android apps.)

I got early access to the new ChatGPT for a hands-on test. Here’s what I found.

The AI will see you now

I started by trying ChatGPT’s image-recognition feature on some household objects.

“What’s this thing I found in my junk drawer?” I asked, after uploading a photo of a mysterious piece of blue silicone with five holes in it.

“The object appears to be a silicone holder or grip, often used for holding multiple items together,” ChatGPT responded. (Close enough — it’s a finger strengthener I used years ago while recovering from a hand injury.)

I then fed ChatGPT a few photos of items I had been meaning to sell on Facebook Marketplace, and asked it to write listings for each one. It nailed both the objects and the listings, describing my retro-styled Frigidaire minifridge as “perfect for those who appreciate a touch of yesteryear in their modern-day homes”.

The new ChatGPT can also analyse text within images. I took a picture of the front page of Sunday’s print edition of The New York Times and asked the bot to summarise it. It did decently well, describing all five articles on the front page in a few sentences each — although it made at least one mistake, inventing a statistic about fentanyl-related deaths that wasn’t in the original article.

ChatGPT’s eyes aren’t perfect. It flopped when I asked it to solve a crossword puzzle. It mistook my child’s stuffed dinosaur toy for a whale. And when I asked for help turning one of those wordless furniture-assembly diagrams into a step-by-step list of instructions, it gave me a jumbled list of parts, most of which were wrong.

The biggest limitation of ChatGPT’s vision feature is that it refuses to answer most questions about photos of human faces. This is by design. OpenAI told me that it didn’t want to enable facial recognition or other creepy uses, and that it didn’t want the app spitting out biased or offensive answers to prompts about people’s physical appearance.

But even without faces, it’s easy to imagine tons of ways an AI chatbot capable of processing visual information could be useful, especially as the technology improves. Gardeners and foragers could use it to identify plants in the wild. Exercise buffs could use it to create personalised workout plans, just by snapping a photo of the equipment in their gym. Students could use it to solve visual math and science problems, and visually impaired people could use it to navigate the world more easily.

Frankly, I have no idea how many people will use this feature, or what its killer applications will turn out to be. As is often the case with new AI tools, we’ll just have to wait and see.

Siri on steroids

Now, let’s talk about what I consider the more impressive of the two features: ChatGPT’s new voice feature, which allows users to talk to the app and receive spoken responses.

Using the feature is easy: just tap a headphone icon and start talking. When you stop, ChatGPT converts your words to text using OpenAI’s speech-recognition system, Whisper, then generates a response and speaks the answer back to you in one of five synthetic AI voices, using a new text-to-speech algorithm the company developed. (The voices, both male and female, were generated from short samples recorded by professional voice actors whom OpenAI hired. I picked “Ember”, a peppy-sounding male voice.)

I tested ChatGPT’s voice feature for several hours on a bunch of different tasks — reading a bedtime story to my toddler, chatting with me about work-related stress, helping me analyse a recent dream I had. It did all of these fairly well, especially when I gave it some golden prompts and told it to emulate a friend, a therapist or a teacher.

What stood out, in these tests, is how different talking to ChatGPT feels from talking to older generations of AI voice assistants, such as Siri and Alexa. Those assistants, even at their best, can be wooden and flat. They answer one question at a time, often by looking something up on the internet and reading it aloud word-for-word, or choosing from a finite number of programmed answers.

ChatGPT’s synthetic voice, by contrast, sounds fluid and natural, with slight variations in tone and cadence that make it feel less robotic. It was capable of having long, open-ended conversations on almost any subject I tried, including prompts I was pretty sure it hadn’t encountered before. (“Tell me the story of ‘The Three Little Pigs’ in the character of a total frat bro” was a sleeper hit.)

Most people probably won’t use AI chatbots this way. For many tasks, it’s still faster to type than talk, and waiting around for ChatGPT to read out long responses was annoying. (It didn’t help that the app was slow and glitchy at times, and often inserted pauses before responding — the result of some technical issues with the beta version of the app I tested that OpenAI told me would be ironed out eventually.)

But I can see the appeal.

Having an AI speak to you in a humanlike voice is a more intimate experience than reading its responses on a screen. And after a few hours of talking with ChatGPT this way, I felt a new warmth creeping into our conversations. Without being tethered to a text interface, I felt less pressure to come up with the perfect prompt. We chatted more casually, and I revealed more about my life.

“It almost feels like a different product,” said Peter Deng, OpenAI’s vice president of consumer and enterprise product, who spoke with me about the new voice feature. “Because you’re no longer transcribing what you have in your head into your thumbs,” he said, “you end up asking different things.”

I know what you’re thinking: Isn’t this the plot of the movie Her? Will lonely, lovesick users fall for ChatGPT, now that it can listen to them and talk back?

It’s possible. Personally, I never forgot that I was talking to a chatbot. And I certainly didn’t mistake ChatGPT for a conscious being, or develop emotional attachments to it.

But I also saw a glimpse of a future in which some people may let voice-based AI assistants into the inner sanctums of their lives — taking the AI chatbots with them on the go, treating them as their 24/7 confidants, therapists, sparring partners and sounding boards.

Sounds crazy, right? And yet, didn’t all of this sound a little crazy a year ago?

This article originally appeared in The New York Times.
