ChatGPT can now see, hear and speak: OpenAI

The voice feature is set to roll out on iOS and Android devices, while the image-based feature will be available for all platforms

OpenAI on Monday announced upcoming ChatGPT features that have drawn wide attention. Simply put, the generative AI chatbot will be able to speak, hear, and see. It is getting new voice and image capabilities, so users can talk to ChatGPT, hear its answers spoken aloud, and include images in their conversations. The features are set to roll out to Plus users over the next two weeks.

ChatGPT can now see, hear, and speak. Rolling out over next two weeks, Plus users will be able to have voice conversations with ChatGPT (iOS & Android) and to include images in conversations (all platforms). https://t.co/uNZjgbR5Bm pic.twitter.com/paG0hMshXb
— OpenAI (@OpenAI) September 25, 2023

In a subsequent post on X, OpenAI also shared a video explaining how the voice feature works, stating, “Use your voice to engage in a back-and-forth conversation with ChatGPT. Speak with it on the go, request a bedtime story, or settle a dinner table debate.” The voice feature is set to roll out on iOS and Android devices, while the image-based feature will be available on all platforms.

How to have ChatGPT talk to you?

In the mobile app, head to Settings, then ‘New Features’, and opt in to voice conversations. Then tap the headphone button in the top right corner of the home screen and choose your preferred voice from the five available.

The feature is powered by a new text-to-speech model that can generate human-like audio from just text and a few seconds of sample speech. OpenAI collaborated with professional voice actors to create each of the voices, and it uses Whisper, its open-source speech recognition system, to transcribe the user's spoken words into text.

The new image feature:

Tap the photo button to capture or choose an image; if you are on iOS or Android, tap the plus button first. You can also include multiple images or use the ‘drawing tool’ to point the bot to a specific part of an image.

The image comprehension feature is driven by the combined capabilities of multimodal GPT-3.5 and GPT-4, ensuring a rich understanding of various image types, including photographs, screenshots, and documents containing both text and visuals.

While these new features are captivating, they also raise security and privacy concerns. In its announcement, OpenAI underscores its commitment to safeguarding user privacy, stating, “We’ve also taken technical measures to significantly limit ChatGPT’s ability to analyse and make direct statements about people since ChatGPT is not always accurate and these systems should respect individuals’ privacy.”

Disclaimer: The views expressed in this article are those of the author and do not necessarily reflect the views of ET Edge Insights, its management, or its members
