Hear Audio and Speech
Can accept spoken or audio input for understanding.
This page tracks AI tools that can currently hear audio and speech. Inclusion here requires verified functionality, not just marketing claims. As of May 2026, 5 products support this capability, and support has changed in recent months.
Definition
What counts
- microphone input in conversation
- understanding spoken prompts
- multimodal listening modes
What doesn't count
- text-only chat
- outputting audio without taking spoken input
Product Availability (5 products)
ChatGPT
Advanced Voice Mode
Advanced Voice Mode enables natural, real-time conversations with GPT-4o. Available on **Windows, iOS, Android, and web**. **macOS desktop voice was retired January 15, 2026**; Mac users should use the web version instead.
Access: paid
| Plan | Available | Notes |
|---|---|---|
| Free | β | |
| Go | β | |
| Plus | β | |
| Pro | β | |
| Team | β | |
| Enterprise | β | |
Surfaces: Windows, iOS, Android, web, API
Verified: 2026-03-15
Microsoft Copilot
Copilot Voice
Copilot Voice is **available on all tiers including free**. You can speak to Copilot on desktop, mobile, and web.
Access: free
| Plan | Available | Notes |
|---|---|---|
| Free | Yes | |
| Copilot Pro | Yes | |
Surfaces: Windows, macOS, iOS, Android, web
Verified: 2026-03-22
Gemini
Gemini Live (Voice Mode)
Gemini Live enables natural voice conversations with Gemini. **Basic voice is free; full Gemini Live with interruption and multi-turn requires AI Pro ($19.99/mo).** Currently mobile only.
Access: paid
| Plan | Available | Notes |
|---|---|---|
| Free | β | |
| AI Pro | β | |
Surfaces: iOS, Android, API
Verified: 2026-02-28
Project Astra
Project Astra brings real-time vision to Gemini: share your camera or screen and get contextual AI assistance. **Now free for everyone** on iOS and Android. It can remember context for 10+ minutes and integrates with Google Search, Maps, and more.
Access: free
| Plan | Available | Notes |
|---|---|---|
| Free | Yes | |
| AI Pro | Yes | |
Surfaces: iOS, Android
Verified: 2026-02-28
Grok
Grok Voice Mode
Grok Voice Mode allows natural voice conversations with Grok. **Now free for all users** on iOS, Android, and web (grok.com) with 11 voice modes. A TTS API with 5 voice personalities is available for developers. Voice conversations support attachments and photos (Mar 2026).
Access: free
| Plan | Available | Notes |
|---|---|---|
| Free | Yes | |
| Premium | Yes | |
| Premium+ | Yes | |
| SuperGrok | Yes | |
Surfaces: iOS, Android, web, API
Verified: 2026-03-21
Perplexity
Voice Mode
Perplexity Voice allows hands-free search on mobile. It is **available on all tiers including free**; just tap the microphone in the app.
Access: free
| Plan | Available | Notes |
|---|---|---|
| Free | Yes | |
| Pro | Yes | |
| Max | Yes | |
Surfaces: Windows, macOS, iOS, Android, web
Verified: 2026-04-08
Also Known As
voice input, voice mode, speech recognition, talk to AI, audio input, voice chat