How Desktop Live Captions Work on Windows, Chrome, and Mac

May 26, 2026 Hiroki Tsukiyama

Your computer can already show you text for the audio playing through your speakers. Windows, Chrome, and macOS all offer built-in live caption features that generate text from audio as it plays: system-wide on Windows and macOS, and for browser audio in Chrome. These captions can make meetings, videos, and presentations more accessible without installing additional software.

But there is an important distinction that causes frequent confusion: live captions and live translation are not the same thing. Captions show you text in the language being spoken. Translation converts that text into another language. Your operating system provides one of these. It may not provide the other.

This article explains how desktop live captions work on Windows, Chrome, and Mac, where the line between captions and translation falls, and what to use when you need more than same-language text.

What Live Captions Actually Do

Live captions use speech recognition to convert audio into text in real time. The system listens to whatever audio is playing through your computer and displays the recognized text on screen.

The key word is “recognized.” The text appears in the same language as the spoken audio. If someone is speaking English, the captions are in English. If someone is speaking Japanese, the captions are in Japanese.

Captions do not translate. They transcribe.

This distinction matters because many users assume that “live captions” means “live translated captions.” When they turn on captions expecting to see English text for a Japanese speaker, they are disappointed. The feature they actually need is translation, which is a separate capability.

Windows Live Captions

Windows 11 (version 22H2 and later) includes a built-in live captions feature that works with any audio playing on your system.

How to Turn It On

  1. Open Settings (press Windows key + I).
  2. Go to Accessibility and then Captions.
  3. Toggle Live captions to On.

Alternatively, you can use the keyboard shortcut Windows key + Ctrl + L to toggle live captions on and off.

When live captions are active, a caption bar appears at the top of your screen. The bar shows the transcribed text for whatever audio is currently playing. You can move the bar to the top or bottom of the screen, or float it as a separate window.

What It Captions

Windows live captions work with any audio source: meetings in any browser or application, videos playing locally or online, music with lyrics, and system sounds. If the audio is playing through your computer, the caption engine can process it.

Language Support

Windows live captions support captioning in several languages, but the feature was originally English-only and has been expanding over time. Check the current Windows accessibility documentation for the latest list of supported caption languages.

Source: Windows live captions FAQ

Translation Capability

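On most PCs, Windows live captions focus on transcription, not translation. Microsoft has added live translation to live captions on Copilot+ PCs, but on other hardware the caption feature is same-language only, and you need a separate translation tool for live translation of spoken audio. Check the Windows live captions FAQ for current hardware and language requirements.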

Chrome Live Captions

Google Chrome includes a built-in caption feature that generates text for audio playing in the browser.

How to Turn It On

  1. Open Chrome Settings.
  2. Go to Accessibility.
  3. Toggle Live Caption to On.

When enabled, Chrome shows a caption overlay at the bottom of the browser window whenever audio plays in a tab. The captions follow the tab, so each browser tab has its own caption stream.

What It Captions

Chrome captions work specifically with audio playing within the Chrome browser. This includes video calls running in browser tabs (like Google Meet in Chrome), YouTube videos, webinars, and any other web-based audio.

Chrome captions do not process audio from desktop applications. If you are running Zoom as a desktop app, Chrome captions will not pick up that audio. If you are running a meeting in a Chrome tab, they will.

Language Support

Chrome live captions support several languages for transcription. Google has been expanding language support over time. The feature originally launched in English and has grown from there.

Source: Chrome live caption

Translation Capability

Chrome does not translate captions by default. The captions appear in the language being spoken. However, Chrome does offer a separate translation feature for web pages, and some Google services like Google Meet offer translated captions as a meeting-specific feature. These are distinct from Chrome’s general live caption capability.

macOS Live Captions

macOS includes a live captions feature on Macs with Apple silicon. Apple lists the feature as available on Mac computers with Apple silicon, so Intel-based Macs are not supported.

How to Turn It On

  1. Open System Settings.
  2. Go to Accessibility.
  3. Toggle Live Captions to On.

When active, macOS displays a caption bar that shows transcribed text for system audio. The bar can be positioned at the bottom of the screen or moved as needed.

What It Captions

macOS live captions work with system-wide audio, similar to Windows. This means they capture audio from any application: meetings, videos, music, and system sounds.
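The capture scopes described so far can be summarized as a small lookup. This is only an illustrative sketch: the labels `chrome_tab`, `other_browser_tab`, and `desktop_app` and the function `can_caption` are hypothetical names chosen for this example, not part of any operating system or browser API.

```python
# Which audio each built-in caption feature can hear, per this article.
# All names here are illustrative; no OS or browser API is being called.
SYSTEM_WIDE = {"windows", "macos"}   # caption any audio playing on the computer
BROWSER_ONLY = {"chrome"}            # caption only audio playing inside Chrome


def can_caption(feature: str, audio_origin: str) -> bool:
    """Return True if `feature` can caption audio coming from `audio_origin`.

    audio_origin is one of the hypothetical labels
    'chrome_tab', 'other_browser_tab', or 'desktop_app'.
    """
    if feature in SYSTEM_WIDE:
        return True
    if feature in BROWSER_ONLY:
        return audio_origin == "chrome_tab"
    raise ValueError(f"unknown caption feature: {feature}")


if __name__ == "__main__":
    # A meeting in the Zoom desktop app: system-wide captions can hear it,
    # Chrome's captions cannot. The same meeting in a Chrome tab works in both.
    for feature in ("windows", "macos", "chrome"):
        print(feature, can_caption(feature, "desktop_app"),
              can_caption(feature, "chrome_tab"))
```

The practical consequence is the one noted for Chrome: if a meeting runs in a desktop app, use the operating system's captions rather than the browser's.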

Language Support

macOS live captions are available in a limited set of languages. The feature was initially English-only and has expanded gradually. Check Apple’s current accessibility documentation for the latest supported languages.

Source: Mac live captions

Translation Capability

macOS does not include built-in live translation as part of the caption feature. The captions are same-language only. For real-time translation, you need a separate translation application.

Captions vs Translation: The Key Differences

Understanding the difference between captions and translation is essential for choosing the right tool.

Live Captions

  • Convert spoken audio into text in the same language
  • Built into Windows, Chrome, and macOS at no additional cost
  • Work with system-wide audio (Windows, macOS) or with browser audio only (Chrome)
  • Useful for accessibility, noisy environments, and following along when audio quality is poor
  • Do not help if you do not understand the language being spoken

Translated Captions

  • Convert spoken audio into text in a different language
  • Available in some meeting platforms (Zoom, Teams, Google Meet) with plan restrictions
  • May be available in third-party tools
  • Help you follow meetings and content in languages you do not speak

Live Translation

  • Converts spoken audio into translated text or speech in real time
  • Available through dedicated translation apps
  • Goes beyond captions by providing translation, not just transcription
  • The tool you need when you are in a multilingual meeting and do not speak the language being spoken

When Built-In Captions Are Enough

Built-in live captions are valuable in several scenarios, even without translation:

Accessibility. If you are hard of hearing or in an environment where you cannot use speakers, captions let you follow along with any audio content.

Noisy environments. Working from a busy office, a cafe, or an airport? Captions help you follow meeting content when background noise makes it hard to hear.

Quiet environments. When you need to keep your volume low, captions ensure you do not miss anything.

Language learning. Seeing the text of a language you are learning helps reinforce vocabulary and comprehension.

Note-taking support. Even if you understand the spoken language perfectly, having a text reference makes it easier to review key points later.

When You Need More Than Captions

Built-in captions are not enough when:

You do not speak the language. Captions in a language you cannot read do not help. You need translation.

You are in a multilingual meeting. If participants switch between languages, same-language captions only help for the parts spoken in a language you can read.

You need a written record in your language. If you are participating in a meeting conducted in another language and need documentation in your language, captions do not provide that.
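The criteria in the last two sections can be sketched as a small decision helper. The function name `recommend_tool`, its parameters, and the language codes are assumptions made for illustration; the logic simply restates the three cases above.

```python
from typing import Iterable, Optional


def recommend_tool(
    spoken_languages: Iterable[str],
    languages_you_read: Iterable[str],
    record_language: Optional[str] = None,
) -> str:
    """Return 'captions' or 'translation' for a meeting (illustrative only)."""
    spoken = set(spoken_languages)
    readable = set(languages_you_read)

    # Captions in a language you cannot read do not help. This also covers
    # multilingual meetings where only some of the languages are readable.
    if not spoken <= readable:
        return "translation"

    # A written record in your language needs translation whenever any part
    # of the meeting is held in a different language.
    if record_language is not None and spoken != {record_language}:
        return "translation"

    return "captions"


if __name__ == "__main__":
    print(recommend_tool(["en"], ["en"]))
    print(recommend_tool(["ja"], ["en"]))
    print(recommend_tool(["ja"], ["en", "ja"], record_language="en"))
```

The third call shows the record-keeping case: you can read Japanese, so captions would let you follow along, but an English record still requires translation.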

Getting Live Translation on Your Desktop

When you need real-time translation of meeting audio, desktop translation apps fill the gap between what your operating system provides (captions) and what you actually need (translation).

These apps work similarly to built-in caption features in that they capture system audio. The difference is that instead of simply transcribing the audio, they also translate it into your chosen language. The translated text appears on your screen alongside the meeting.
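Conceptually, the difference is one extra step after speech recognition. The sketch below assumes a recognize-then-translate design; actual apps may be built differently. Every name in it (`caption`, `translated_caption`, `toy_recognize`, `toy_translate`, `TOY_PHRASEBOOK`) is a hypothetical stand-in, and the "audio" is just UTF-8 text so that the example runs without a real speech engine.

```python
from typing import Callable

Recognizer = Callable[[bytes], str]      # audio chunk -> same-language text
Translator = Callable[[str, str], str]   # (text, target language) -> translated text


def caption(audio_chunk: bytes, recognize: Recognizer) -> str:
    """Built-in captions: speech recognition only, same language out."""
    return recognize(audio_chunk)


def translated_caption(
    audio_chunk: bytes,
    recognize: Recognizer,
    translate: Translator,
    target_language: str,
) -> str:
    """Translation app: recognize first, then translate the recognized text."""
    return translate(recognize(audio_chunk), target_language)


# Toy stand-ins so the sketch runs; real engines would replace both.
def toy_recognize(audio_chunk: bytes) -> str:
    # Pretend the "audio" is UTF-8 text of what was said.
    return audio_chunk.decode("utf-8")


TOY_PHRASEBOOK = {("こんにちは", "en"): "Hello"}


def toy_translate(text: str, target_language: str) -> str:
    # Fall back to the original text when the phrase is not in the toy table.
    return TOY_PHRASEBOOK.get((text, target_language), text)


if __name__ == "__main__":
    audio = "こんにちは".encode("utf-8")
    print(caption(audio, toy_recognize))
    print(translated_caption(audio, toy_recognize, toy_translate, "en"))
```

The first call returns the Japanese text unchanged, which is what a caption feature gives you. The second returns the English rendering, which is what a translation app adds.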

Desktop translation apps offer several advantages:

Platform independence. They work with any meeting platform because they process system audio, not platform-specific data.

Language flexibility. They support multiple language pairs, often more than what meeting platforms offer.

No host dependency. Unlike platform translated captions, desktop apps do not depend on the host’s subscription plan.

Privacy. The translation tool runs on your machine and processes audio you already receive, without joining the meeting as a separate participant.

Combining Captions and Translation

The most effective setup for multilingual professionals is to run both built-in captions and a desktop translation app simultaneously. Here is why this combination works well.

Built-in captions give you a same-language transcript of the meeting audio. This is useful for confirming specific words, catching proper nouns, and following technical terminology that might not translate cleanly. Because the captions stay in the original language, they show the words as recognized, without a translation step that could change them.

A desktop translation app running alongside the captions gives you the translated text in your preferred language. This helps you follow the meaning of the discussion even when the original-language captions do not make sense to you.

Having both streams available means you can cross-reference. If the translated text seems off, you can glance at the original-language captions to see what was actually said. If you recognize a technical term in the original language, you can confirm it against the translation. This dual-stream approach reduces the chance of misunderstanding.

On most monitors, you can position the built-in caption bar at the top of the screen and the translation app window at the bottom, keeping both visible without blocking the meeting video. On smaller screens, you may need to choose one or the other depending on which is more important for the current portion of the meeting.

Practical Setup Recommendations

For the most comprehensive multilingual meeting experience on your desktop:

  1. Enable built-in captions for your operating system. These provide a baseline text reference for all audio on your computer.

  2. Add a desktop translation app for meetings where you need translation. This gives you translated text for audio in languages you do not speak.

  3. Use a good headset. Both captions and translation depend on clear audio. A quality headset keeps your microphone from picking up speaker output and gives other participants, and any tool transcribing your voice, a cleaner signal.

  4. Test your setup before important meetings. Verify that captions are working, that your translation app captures the right audio, and that the text display does not obstruct your meeting view.

  5. Adjust caption positioning. Both Windows and macOS let you move the caption bar. Position it where you can read it without losing visual contact with the meeting participants.

  6. Use a wired internet connection when possible. Stable bandwidth reduces audio dropouts, which in turn improves both caption and translation accuracy. A wired Ethernet connection is more reliable than Wi-Fi for important meetings.

  7. Close unnecessary applications. Caption engines and translation apps that work with system audio hear everything your computer plays, so sound from other programs gets mixed in with the meeting and can degrade recognition. Close music players, extra browser tabs, and other audio-producing applications before joining an important multilingual meeting.

The Bottom Line

Windows, Chrome, and macOS all provide free, built-in live captions that transcribe spoken audio into text. These are genuinely useful accessibility features that make it easier to follow any audio content on your computer. They are worth enabling.

But captions are not translation. If your work involves multilingual meetings, you need a translation tool in addition to built-in captions. A desktop translation app that works with system audio provides the translation layer that your operating system’s caption feature typically does not include.

The best setup for multilingual professionals is both: built-in captions for same-language transcription and a desktop translation app for cross-language understanding. Together, they help you follow meetings across languages and platforms, within the languages each tool supports.
