text to speech google assistant voice

How to Use Google's Text-to-Speech Feature on Android

Search the Settings app for Select to Speak to read text aloud with Google's TTS feature

What to Know

Open the Settings app and go to Accessibility > Select to Speak.
Tap the toggle to turn it on, then tap Allow or OK to confirm permissions.
Open any app, tap the Select to Speak shortcut, then tap an item to read it aloud. Tap Stop to end playback.

This article explains how to use the Google text-to-speech feature on Android so that you can have texts read out loud. It includes information on managing the language and voice used for reading text aloud. Instructions apply to Android 7 and up.

How to Use Google Text-to-Speech on Android

Several accessibility features are built into Android. If you want to hear text read aloud to you, use Select to Speak.

Swipe down from the top of the phone, then tap the gear icon to open the Settings app.

Tap Accessibility.

Tap Select to Speak.

If you don't see Select to Speak, tap Installed services to find it.

Tap the Select to Speak toggle switch to turn it on. On some phones, this is called Select to Speak shortcut.

Tap Allow or OK to confirm the permissions your phone needs to turn on this feature.

Open any app and tap the Select to Speak icon from the side of the screen.

Tap the Play icon to have your phone read everything on the screen, starting at the top. If you only want some text read aloud, trigger Select to Speak by tapping the floating icon, then tap the text.

Tap the left arrow next to the Play button to see more playback options.

Tap Stop to end playback.

Use TalkBack on your Android if you want spoken feedback as you use your device.

How to Manage Android Text-to-Speech Voices and Options

Android gives you some control over the language and voice used to read text aloud via Select to Speak. It's easy to change the language, accent, pitch, or speed of the synthesized text voice.

Go to Settings > General management > Language and input. On some devices, go to Settings > Languages instead.

Tap Text-to-speech or Text-to-speech output.

In the menu that appears, adjust the Speech rate and Pitch until the voice sounds the way you want.

To change the language, tap Language, then choose the language you want to hear when text is read aloud.

Use Select to Speak With Google Lens to Translate Written Words

Another way you can use this text-to-speech functionality is while translating languages. Google Lens is great for this. Just point the camera at some text you don't understand and it'll be translated into your language. Select to Speak can then read that aloud.

Frequently Asked Questions

To turn off text-to-speech, go to Settings > Accessibility > Select to Speak and tap the toggle switch to turn it Off.

The Android text-to-speech feature works in the Google Docs app, but on a computer, you must download the Screen Reader extension for Chrome. Then, go to Tools > Accessibility settings > Turn on Screen Reader Support > OK, highlight the text, and select Accessibility > Speak > Speak selection.

To use voice typing in Google Docs, place your cursor in the document where you want to begin typing, then select Tools > Voice Typing. Alternatively, use the keyboard shortcut Ctrl + Shift + S (Windows) or Command + Shift + S (Mac).

How to use Google Assistant voice typing with Gboard

Published on June 16, 2023

If you’ve got a Pixel 6 or later, one of the perks Google offers is dictation using Gboard and Google Assistant, a combo that enables more elaborate voice commands. Here’s how to enable and control Gboard voice typing, plus info on supported languages.

QUICK ANSWER

To enable Gboard voice typing, open any app you can type in, then tap a field where you can enter text. At the top of the keyboard, tap Settings (the gear icon), then Voice typing. Toggle Assistant voice typing if it's not on already. Whenever you want to use the feature, tap and hold the keyboard's mic icon or say "Hey Google, type."

How to turn on Google Assistant voice typing on a Pixel device

In most cases the feature should be on by default, the exception being if you’ve installed multiple languages on your Pixel. In that situation you may need to switch to a supported language (see below) using the globe icon on your keyboard.

Regardless of how many languages you’re using, here’s how you can make sure Gboard voice typing is enabled:

Open any app that supports typing, such as Messages.
Tap on a field that supports text entry — this is just to pop up the keyboard.
Towards the top of the keyboard, tap Settings (the gear icon), then Voice typing.
Toggle the Assistant voice typing switch if it isn’t already on.

How to use Assistant voice typing

Starting and stopping voice typing

Whenever the keyboard is active, start dictation by tapping and holding the mic icon or saying “Hey Google, type.” Double-tap the mic icon to keep listening for multiple messages. Either way, you’ll see a glowing mic icon when Assistant can hear you — say whatever it is you want to be typed.

When you’re done, tap the mic icon again or say “Stop.”

Using Assistant commands

The advantage of dictating with Assistant is that you can control it with voice commands while still entering text.

These are the most essential commands:

“Delete last word.”
“[Name] emoji.”
“Clear” deletes the last sentence.
“Clear all” starts dictation from scratch.
“Next” moves to the next open field if you’re filling out a form.
“Send” delivers a chat message.

Tap the Info (question mark) button during voice typing to learn other phrases.
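To make the effect of these commands concrete, here is a minimal Python sketch of a text buffer that reacts to a few of them. It is a toy model, not how Gboard or Assistant is implemented: the class name `DictationBuffer`, the `hear` method, and the sentence-splitting rule are all assumptions made for illustration, and the "[Name] emoji" and "Next" commands are left out.

```python
import re


class DictationBuffer:
    """Toy text buffer that reacts to a few spoken editing commands (hypothetical model)."""

    def __init__(self):
        self.text = ""
        self.sent = []  # messages "delivered" by the Send command

    def hear(self, phrase: str) -> None:
        command = phrase.strip().rstrip(".").lower()
        if command == "delete last word":
            # Drop the final whitespace-separated word.
            self.text = " ".join(self.text.split()[:-1])
        elif command == "clear":
            # Drop the last sentence and keep everything before it.
            sentences = re.findall(r"[^.!?]+[.!?]?", self.text)
            self.text = "".join(sentences[:-1]).strip()
        elif command == "clear all":
            self.text = ""
        elif command == "send":
            self.sent.append(self.text)
            self.text = ""
        else:
            # Anything that is not a known command is treated as dictated text.
            self.text = (self.text + " " + phrase.strip()).strip()


buf = DictationBuffer()
for phrase in ["See you soon.", "Bring snacks please.",
               "Delete last word", "Clear", "Send"]:
    buf.hear(phrase)
print(buf.sent)
```

In this run, "Delete last word" removes "please.", "Clear" removes the unfinished second sentence, and "Send" delivers what is left.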

Editing text

Google Assistant can make mistakes, just like humans. When you catch one, tap on the relevant word, then speak or type out what you meant to say. If a word is still being misinterpreted, you can spell it out or select onscreen suggestions.

Managing automatic punctuation

Normally, Assistant inserts punctuation itself based on the tone and cadence of your voice. If there are too many errors, you can turn this off through the keyboard’s Settings > Voice typing > Auto punctuation toggle and use typing to insert your own punctuation.

Which languages does Google Assistant voice typing support?

Per Google, you can only use Assistant voice typing in English, French, German, Italian, Japanese, or Spanish. You’ll need to add one of these to Gboard, and make sure both your keyboard and the rest of Android are switched to the same language.
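The two conditions above (a supported language, and keyboard and system set to the same one) can be sketched as a small check. This is only an illustration: the function name, the use of locale tags such as `en-US`, and the choice to compare just the primary language part of the tag are assumptions, not documented Gboard behavior.

```python
# Languages listed above, keyed by the primary part of a locale tag (assumed format).
SUPPORTED_LANGUAGES = {
    "en": "English",
    "fr": "French",
    "de": "German",
    "it": "Italian",
    "ja": "Japanese",
    "es": "Spanish",
}


def primary_language(locale_tag: str) -> str:
    """Return the language part of a tag like 'en-US' or 'en_US'."""
    return locale_tag.replace("_", "-").split("-")[0].lower()


def can_use_assistant_voice_typing(keyboard_locale: str, system_locale: str) -> bool:
    """Hypothetical check: same language on both sides, and that language is supported."""
    keyboard = primary_language(keyboard_locale)
    system = primary_language(system_locale)
    return keyboard == system and keyboard in SUPPORTED_LANGUAGES


print(can_use_assistant_voice_typing("fr-FR", "en-US"))
print(can_use_assistant_voice_typing("ja-JP", "ja-JP"))
```

The first call fails the check because the keyboard and system languages differ; the second passes because both are Japanese.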

System Intelligence is both an app and a core component of Android. If you’re still using Android 12, you should upgrade to Android 13 or later if possible, or update System Intelligence through the Google Play Store.

You can only download these updates over Wi-Fi, so try connecting to a (trusted) Wi-Fi network before launching voice typing again.

Pocket-lint

How to access Gemini Live, Google's new lifelike voice assistant

Key Takeaways

Gemini Live is Google's new lifelike voice assistant, available to Android users with a Gemini Advanced subscription.
You can access it through the Gemini app, and it's generally faster than text chat.
It's not as useful as it could be because it doesn't currently connect to other Google services, but Gemini Live does bring Google closer to a true next-generation voice assistant.

Among the announcements Google packed into the official launch of the Pixel 9, its demo of Gemini Live is by far the most impressive. Not just because it was a dramatic step up from the old days of call and response with Google Assistant, but also because the company decided to start rolling it out to customers the same day as the event. There's a good chance you can access Google's lifelike Gemini Live right now.

The company originally teased the new voice mode at Google I/O 2024, as a similar, if less natural, answer to the updated voice mode OpenAI demoed earlier in 2024. Right now, Gemini Live is dedicated to conversation: it doesn't provide visual answers, and it can't access other Google services like Gmail or YouTube Music in the same way that normal Gemini can. But it's pretty clear Google thinks it could be a hit, something you talk to with your Pixel Buds, or even from a new Nest speaker down the road.

Whether or not you'll find it useful requires trying it, and luckily Google has made it relatively simple to start talking to Gemini Live right now. Here's how to access Google's new voice assistant on your phone.

How to talk to Gemini Live

Google's Gemini Live feature is currently available in English, and can be used only by people with an Android phone, the Gemini app, and a Gemini Advanced subscription. Gemini Advanced uses Google's more powerful model and is designed to have a larger context window, meaning it can handle analyzing and working with larger quantities of data (the PDFs, images, and text you drag into its chat window).

Currently, the only way to subscribe to Gemini Advanced is through the Google One AI Premium plan, which costs $19.99 per month, and comes with other benefits like 2TB of storage and access to Gemini and other premium features in Docs, Sheets, and Slides.

Provided you already have a Google account, you can sign up for the Google One AI Premium plan online or in the Google One app on your Android phone.

Access voice chat through the Gemini app

Once you're subscribed, you can access Gemini Live inside the Gemini app. Here's how to use it:

Download or open the Gemini app.
Tap on the Gemini Live icon in the bottom right (a waveform with a sparkle) to start Gemini Live.
Agree to Google's terms and conditions by tapping "Okay."
Select a voice by swiping through Gemini Live's 10 options and start chatting.

You're able to ask Gemini questions like you normally would and can interrupt responses just by speaking or tapping the screen. Tapping on the red "X" at the bottom ends the Gemini Live chat and dumps you back into a Gemini text chat covering what you just talked about. Tapping the Hold button turns off your microphone in case you don't want Gemini to respond. If you can get comfortable interrupting some of Gemini's more verbose responses, Gemini Live can be quicker and easier than normal text chat.

Google has a glimmer of a next-generation voice assistant

It just needs to connect to more apps.

The bottom two buttons in the Gemini Live feature.

That doesn't mean Gemini Live is perfect. Even if it feels more natural to talk to, the information it provides isn't any less machine-generated to be inoffensive. The Gemini Live voices sound different, but there's no personality to contend with. For right now, that doesn't make it more useful than more general text-based chats, especially because Google hasn't added a way for Gemini Live to access or control other apps and services, unlike Gemini.

But in Google's hands, it suddenly feels like we're closer to the Siri or Google Assistant that were originally imagined. A computer interface that you can talk to normally to get things done. Whether or not Google ever fully gets there, Gemini Live makes it feel like it's possible.

If you'd prefer to use ChatGPT, the AI's existing Voice Mode is slower, but fairly similar to Gemini Live. And if you're an iPhone user, Pocket-lint's hands-on preview of Apple Intelligence has some good insights into how Apple is currently approaching AI.

Q: Will Gemini Live be available on iOS?

Yes. You can currently access Gemini through the Google app on iOS and, according to Google's blog announcing Gemini Live, "in the coming weeks [Gemini Live] will expand to iOS and more languages."

How to use Google text-to-speech on your Android phone to hear text instead of reading it

You can use Google's text-to-speech feature to do things like help you hear grammatical oddities in your text or documents.
Before you can use it, however, you'll have to enable the feature on your phone.
Here's what you need to do to enable and use Google text-to-speech on your Android device.

Speech-to-text is a popular productivity hack that many use to more quickly and easily create written sentences.

Its counterpart, text-to-speech, can help with productivity too, albeit in a different way: By hearing the text read back to you in a robotic voice, you may be able to catch skipped words, grammar mistakes, and awkward phrasing.

Here's what you need to know to start using text-to-speech on your Android:

How to enable Google text-to-speech

1. Go into your device's settings.

2. Tap "Accessibility."

3. Depending on your device, you may need to tap "Vision."

4. Choose "Select to speak."

5. Toggle the feature on and confirm by tapping "Ok" in the pop-up window.

Depending on your device, you will either see a circle pop-up with the text-to-speech icon, or it will appear in the lower-right corner of your screen.

How to use Google text-to-speech

Once you've set up the feature and you've navigated to a bit of text you want to have read back to you, here's what you'll need to do:

1. Tap the text-to-speech icon — you'll see a red stop button appear, with a greyed-out play button next to it.

2. Tap to select the text you want read back to you. Drag your finger across the screen if there is more than one section, or press the play button to have everything on the screen read back to you, including button commands.

3. Tap the play button to begin the text-to-speech playback.

If you tap the caret to the side of the icon, you'll also see options to pause the read-back, or go back or forward.

What is Gemini Live? Google’s new AI voice chat tool explained

Google has announced the rollout of Gemini Live, a new voice chat tool enabling users to have human-like conversations with the new artificially intelligent assistant.

The new feature is rolling out today for people who subscribe to the Gemini Advanced tier, but Google also says it is trying to bring the new Gemini features to Pixel, Samsung, and other Android phones in the next few weeks.

The feature was originally previewed at Google I/O in May and arrives in time to rival OpenAI’s controversial ChatGPT Advanced Voice Mode.

During the Made by Google event, where the company is about to launch the Pixel 9 series of handsets, Google said Gemini Live will allow users to have in-depth conversations, as well as “emotionally expressive” voice chats on mobile devices.

Google even says it’ll be possible to interrupt Gemini in mid-flow to ask follow-up questions. The speech will quickly adapt to your interruptions.

“Gemini Live is a mobile conversational experience that lets you have free-flowing conversations with Gemini,” Google writes in a blog post.

“Want to brainstorm potential jobs that are well-suited to your skillset or degree? Go Live with Gemini and ask about them. You can even interrupt mid-response to dive deeper on a particular point, or pause a conversation and come back to it later. It’s like having a sidekick in your pocket who you can chat with about new ideas or practice with for an important conversation.”

Google says the feature will be available hands free, and you’ll also be able to keep chatting with Gemini when your phone is locked. So it’s a lot like a regular phone call in that respect. There are ten unique voices to choose from, with Google previewing them today during the Made by Google keynote.

Gemini Live will also be compatible with the Pixel Buds Pro 2, which are launching today too.

Chris Smith is a freelance technology journalist for a host of UK tech publications, including Trusted Reviews. He's based in South Florida, USA. …

How-To Geek

How to Modify Google Text-to-Speech Voices

While Google focuses on the Assistant, Android owners shouldn't forget about the Text-to-Speech (TTS) accessibility feature. It'll convert text from your Android apps, but you might need to modify it to get the speech to sound the way you want it.

Modifying Text-to-Speech voices is easily done from the Android accessibility settings menu. You can change the speed and pitch of your chosen voice, as well as the voice engine you use.

Google Text-to-Speech is the default voice engine and is pre-installed on most Android devices. If your Android device doesn't have it installed, you can download the Google Text-to-Speech app from the Google Play Store.

Changing Speech Rate and Pitch

Android will use default settings for Google Text-to-Speech, but you might need to change the speed and pitch of the Text-to-Speech voice to make it easier for you to understand.

Changing the TTS speech rate and pitch requires you to get into the Google accessibility settings menu. The steps for this might vary slightly, depending on your version of Android and your device manufacturer.

To open the Android accessibility menu, go to Android's "Settings" menu. You can get to this by swiping down on your display to access your notification shade and tapping the gear icon in the top right, or by launching the "Settings" app from within your apps drawer.

In the "Settings" menu, tap the "Accessibility" option.

Samsung device owners will have two extra steps here. Tap "Screen Reader" and then "Settings." Other Android owners can go straight to the next step.

Select "Text-to-Speech" or "Text-to-Speech Output," depending on your Android device.

From here, you'll be able to change your Text-to-Speech settings.

Changing Speech Rate

Speech rate is the speed your Text-to-Speech voice will speak at. If your TTS engine is too fast (or too slow), the speech could sound distorted or hard to understand.

If you've followed the steps above, you should see a slider under the heading "Speech Rate" in the "Text-to-Speech" menu. With your finger, slide this right or left to raise or lower the rate you're seeking.

Press the "Listen to an Example" button to test your new speech rate. Samsung owners will have a "Play" button, so tap that instead.

Changing Pitch

If you feel the Text-to-Speech engine is too high (or low) pitched, you can change this by following the same process as changing your speech rate.

As above, in your "Text-to-Speech" settings menu, adjust the "Pitch" slider to the pitch you like.

Once you're ready, press "Listen to an Example" or "Play" (depending on your device) to try the new pitch.

Continue this process until you're happy with both your speech rate and pitch settings, or tap "Reset" to return to your default TTS settings.
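For readers who think in code, the adjust, test, and reset loop above can be modeled with a small Python sketch. Android's developer-facing `TextToSpeech` class treats speech rate and pitch as multipliers where 1.0 is normal, and the sketch borrows that convention. Everything else here is hypothetical: the `TtsSettings` class, its method names, and the 0.25 to 4.0 limits are illustration only and are not taken from the Settings app.

```python
from dataclasses import dataclass

DEFAULT_VALUE = 1.0           # 1.0 means "normal" rate or pitch
MIN_VALUE, MAX_VALUE = 0.25, 4.0  # hypothetical slider limits


def clamp(value: float) -> float:
    """Keep a slider value inside the assumed limits."""
    return max(MIN_VALUE, min(MAX_VALUE, value))


@dataclass
class TtsSettings:
    speech_rate: float = DEFAULT_VALUE
    pitch: float = DEFAULT_VALUE

    def set_speech_rate(self, value: float) -> None:
        self.speech_rate = clamp(value)

    def set_pitch(self, value: float) -> None:
        self.pitch = clamp(value)

    def reset(self) -> None:
        """Mirror the Reset button: return both values to their defaults."""
        self.speech_rate = DEFAULT_VALUE
        self.pitch = DEFAULT_VALUE


settings = TtsSettings()
settings.set_speech_rate(1.5)   # a little faster than normal
settings.set_pitch(0.8)         # a little lower than normal
print(settings)
settings.reset()
print(settings)
```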

Choosing Text-to-Speech Tone

Not only can you change the pitch and rate of your TTS speech engine, but you can also change the tone of the voice. Some language packs included with the default Google Text-to-Speech engine have different voices that sound either male or female.

Similarly, the Samsung Text-to-Speech engine included with Samsung devices has a varied selection of gendered voices for you to use.

If you're using the Google Text-to-Speech engine, tap the gear menu button in the "Text-to-Speech Output" settings menu, next to the "Google Text-to-Speech Engine" option.

If you're on a Samsung device, you'll only have one gear icon in the "Text-to-Speech Settings" menu, so tap that instead.

In the "Google TTS Options" menu, tap the "Install Voice Data" option.

Tap your chosen regional language. For example, if you're from the U.S., you might want to choose "English (United States)."

You'll see various voices listed and numbered, from "Voice I" onwards. Tap on each one to hear what it sounds like. You'll need to make sure your device isn't muted.

With the "English (United Kingdom)" language pack, "Voice I" is female, while "Voice II" is male, and the voices continue to alternate in this pattern. Tap on the tone you're happy with as your final choice.

Your choice will be saved automatically, although if you've selected a language different from your device's default, you will also need to switch the Text-to-Speech language to match.

Switching Languages

If you need to switch languages, you can easily do this from the "Text-to-Speech" settings menu. You might want to do this if you've chosen a different language in your TTS engine than your system default language.

You should see an option for "Language" in your "Text-to-Speech" settings menu. Tap this to open the menu.

Choose your language from the list by tapping it.

You can confirm the change in language by pressing the "Listen to an Example" or "Play" button to test it.

Changing Text-to-Speech Engines

If the Google TTS language isn't suitable for you, you can install alternatives. Samsung devices, for instance, will come with their own Samsung Text-to-Speech engine, which your device will default to.

Installing Third-Party Text-to-Speech Engines

Alternative third-party Text-to-Speech engines are also available. These can be installed from the Google Play Store, or you can install them manually. Example TTS engines you could install include Acapela and eSpeak TTS, although others are available.

Once installed from the Google Play Store, these third-party TTS engines will appear in your Text-to-Speech settings.

Changing Text-to-Speech Engine

If you've installed a new Text-to-Speech engine and you want to change it, go to the "Text-to-Speech" settings menu.

At the top, you should see a list of your available TTS engines. If you have a Samsung device, you might need to tap the "Preferred Engine" option to see your list.

Tap on your preferred engine, whether it's Google Text-to-Speech or a third-party alternative.

With your new TTS engine selected, tap "Listen to an Example" or "Play" (depending on your device) to test it.

For most users, the default Google or Samsung Text-to-Speech engines will offer the best sounding speech generation, but third-party options could work better for other languages where the default engine isn't suitable.

Once your engine and languages are selected, you're free to use it with any Android app that supports it.

Google Upgrades Text-to-Speech Voices on Android

Google Voices

The new models will be used for all 421 voices in 67 languages offered through Speech Services by Google. The upgrade will happen on the backend, so developers using text-to-speech and other voice services won’t have to change anything. Google will include the new voices automatically when users download the latest update for any 64-bit Android device from the Google Play Store. Plenty of native Google apps will deploy the new voices too, including Google Maps and Google Translate.

“We are upgrading the Speech Services by Google speech engine in a big way, providing clearer, more natural voice,” Google staff software engineer Rakesh Iyer and group product manager Leland Rechis explained in a blog post. “We’ve seen a significant side by side quality increase with this change, particularly in respects to clarity and naturalness. With this upgrade we will also be changing the default voice in en-US to one that is built using fresher speaker data, which alongside our new stack, results in a drastic improvement.”

Speech Clarity

Google’s ongoing improvements to its speech synthesis models are crucial as more and more companies look for advanced human speech synthesis as part of their software for both consumers and enterprise clients. Companies are eager to boast of improvements and new features, as exemplified recently by Nvidia’s upgraded Riva synthetic speech engine, WellSaid Labs’ upgraded voice models, and Neosapience rolling out an AI-powered tool allowing users to write out the emotion they want virtual actors to use when speaking. Reading websites, directions, or novels all require realistic sounding voices, and Google isn’t likely to forget that as it pushes out similar upgrades to its Speech Services in the future.

Android Police

Google Assistant gets a new voice as Gemini AI comes to Google Home

Google is introducing new devices and AI models to enhance the smart home experience for users starting this year.
Gemini AI will allow Nest Cameras to provide detailed event descriptions, while a Help Me Create feature simplifies automations.
New voices for Google Assistant will offer a more conversational experience with better understanding and follow-up capabilities.

Google appears to be revamping its smart home portfolio, with two new devices unveiled on the same day. The Mountain View, California-based company introduced a new streaming device, called the Google TV Streamer, alongside the 4th-Gen Nest Learning Thermostat. Both new devices are available for pre-order starting today, with the thermostat officially available starting August 20 and the Google TV Streamer on September 24.

Now, as part of a wider smart home push, Google has announced that it is using its advanced Gemini AI models to redefine how users interact with smart home devices in their homes, supercharging the tech giant's Nest Cameras, automations within the Google Home app, and "a variety of new voices" for the Google Assistant.

In a new blog post, the tech giant announced that its Nest Camera should soon be able to see beyond the obvious. With Gemini, the camera's feed can be analyzed to provide detailed descriptions of events happening in front of your Nest Camera. "Your Nest cameras will go from understanding a narrow set of specific things (i.e., motion, people, packages, etc.) to being able to more broadly understand what it sees and hears, and then surface what’s most important," wrote Google in its blog post.

So, for example, if you have a camera pointing towards your backyard, and your dog happens to be causing mischief there, the camera might pick it up as an animal event. Going forward (later this year), with Gemini integration, your Google Home app will be able to offer context into what the camera detected, likely specifying that the "dog was digging in the garden."

Such granular control should help users search for certain events within the Google Home app, like "Did the kids leave their bikes in the driveway?" The app will then sift through the feed to find the event in recent days.

It's worth noting that users would need a Nest Aware subscription to take advantage of the Gemini context feature.

Creating automations has never been easier

Creating automations isn't difficult, but it is tedious. Google is introducing a new Help me create feature within the Google Home app, which, as the name suggests, lets you create automations by describing what you want in plain language. In an example shared by the tech giant, seen in the GIF above, users would be able to use prompts like "Help the kids remember to put their bikes away when they get home from school." The tool can then create an automation and describe what will happen for you to review. In this case, it described the automation as "When someone gets home between 3:30 p.m. and 5:00 p.m. on weekdays, turn on the garage light, and broadcast a message."

The automation likely employs three smart home devices: a smart camera to detect when someone is home, smart speakers to announce/broadcast the reminder, and smart lights in the garage to help the kids put their bikes away.
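To make the logic of that example concrete, here is a minimal Python sketch of a rule that fires only for weekday arrivals inside the quoted time window. It does not use the Google Home automation format or any Google API; the class name `ArrivalRule`, its fields, and the action strings are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, time


@dataclass
class ArrivalRule:
    """Hypothetical rule: run actions when someone arrives in a weekday time window."""
    start: time
    end: time
    actions: tuple

    def matches(self, arrival: datetime) -> bool:
        is_weekday = arrival.weekday() < 5  # Monday is 0, Friday is 4
        in_window = self.start <= arrival.time() <= self.end
        return is_weekday and in_window

    def run(self, arrival: datetime) -> list:
        # Return the actions to perform, or nothing if the trigger does not match.
        return list(self.actions) if self.matches(arrival) else []


# The window and actions mirror the description quoted in the article.
bike_reminder = ArrivalRule(
    start=time(15, 30),
    end=time(17, 0),
    actions=("turn on garage light", "broadcast reminder message"),
)

print(bike_reminder.run(datetime(2024, 8, 14, 16, 0)))  # a Wednesday afternoon
print(bike_reminder.run(datetime(2024, 8, 17, 16, 0)))  # a Saturday afternoon
```

The first call returns both actions and the second returns an empty list, because the Saturday arrival fails the weekday check even though the time is inside the window.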

Help me create and Gemini context for events will roll out to a limited number of Nest Aware subscribers in Public Preview later this year. The tech giant said that it will expand the features over time but wants to take it slow to get it right.

A more natural Google Assistant

In its blog post, Google said that users can expect to experiment with "a variety of new voices" later this year, built to make "interacting with your devices feel more conversational."

The tech giant said that it is improving the assistant on your Nest speakers and displays with Gemini tech, which should soon be able to better understand you, allowing you to put forward your queries in a more natural way. Elsewhere, users would also be able to ask follow-up questions without having to offer context about their previous query, offering an overall more natural dialogue with the assistant.

WaveNet launches in the Google Assistant

Aäron van den Oord, Tom Walters

Just over a year ago we presented WaveNet , a new deep neural network for generating raw audio waveforms that is capable of producing better and more realistic-sounding speech than existing techniques. At that time, the model was a research prototype and was too computationally intensive to work in consumer products.

But over the last 12 months we have worked hard to significantly improve both the speed and quality of our model and today we are proud to announce that an updated version of WaveNet is being used to generate the Google Assistant voices for US English and Japanese across all platforms.

Using the new WaveNet model results in a range of more natural sounding voices for the Assistant.

Audio samples: US English voice I, US English voice II, US English third-party voice, Japanese voice.

To understand why WaveNet improves on the current state of the art, it is useful to understand how text-to-speech (TTS) - or speech synthesis - systems work today.

The majority of these are based on so-called concatenative TTS , which uses a large database of high-quality recordings, collected from a single voice actor over many hours. These recordings are split into tiny chunks that can then be combined - or concatenated - to form complete utterances as needed. However, these systems can result in unnatural sounding voices and are also difficult to modify because a whole new database needs to be recorded each time a set of changes, such as new emotions or intonations, are needed.

To overcome some of these problems, an alternative model known as parametric TTS is sometimes used. This does away with the need for concatenating sounds by using a series of rules and parameters about grammar and mouth movements to guide a computer-generated voice. Although cheaper and quicker, this method creates less natural sounding voices.

WaveNet takes a totally different approach. In the original paper we described a deep generative model that can create individual waveforms from scratch, one sample at a time, with 16,000 samples per second and seamless transitions between individual sounds.

The structure of the convolutional neural network that underpins the original WaveNet model

It was built using a convolutional neural network , which was trained on a large dataset of speech samples. During this training phase, the network determined the underlying structure of the speech, such as which tones followed each other and what waveforms were realistic (and which were not). The trained network then synthesised a voice one sample at a time, with each generated sample taking into account the properties of the previous sample. The resulting voice contained natural intonation and other features such as lip smacks. Its “accent” depended on the voices it had trained on, opening up the possibility of creating any number of unique voices from blended datasets. As with all text-to-speech systems, WaveNet used a text input to tell it which words it should generate in response to a query.
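The sample-by-sample generation described above can be illustrated with a toy autoregressive loop. This is not WaveNet: the function `toy_next_sample` is a hypothetical stand-in for the trained network, and it simply continues a 440 Hz tone from the two previous samples with a little noise. Only the 16,000 samples-per-second rate comes from the text.

```python
import math
import random

SAMPLE_RATE = 16_000  # samples per second, as quoted for the original model


def toy_next_sample(history, rng):
    """Stand-in for the trained network (hypothetical, not WaveNet).

    Predicts the next sample from the two most recent samples using a
    second-order recurrence that traces out a 440 Hz sine wave.
    """
    w = 2 * math.pi * 440 / SAMPLE_RATE
    if len(history) < 2:
        return math.sin(w * len(history))  # seed the first two samples
    predicted = 2 * math.cos(w) * history[-1] - history[-2]
    noisy = predicted + rng.gauss(0, 0.001)
    return max(-1.0, min(1.0, noisy))  # keep the sample in the valid range


def generate(seconds, seed=0):
    """Build a waveform one sample at a time, feeding each output back in."""
    rng = random.Random(seed)
    samples = []
    for _ in range(round(seconds * SAMPLE_RATE)):
        samples.append(toy_next_sample(samples, rng))
    return samples


audio = generate(0.01)
print(len(audio))  # 160 samples for 10 ms of audio
```

The loop structure is the point: every new sample depends on what was already generated, which is why producing 16,000 of them per second of speech was expensive when each step required a full network evaluation.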

Building up sound waves at such high-fidelity using the original model was computationally expensive, meaning WaveNet showed promise but was not something we could deploy in the real world. But over the last 12 months our teams have worked hard to develop a new model that is capable of more quickly generating waveforms. It is also now capable of running at scale and is the first product to launch on Google’s latest TPU cloud infrastructure .

The WaveNet team will now turn their focus to preparing a publication detailing the research behind the new model, but the results speak for themselves. The new, improved WaveNet model still generates a raw waveform but at speeds 1,000 times faster than the original model, meaning it requires just 50 milliseconds to create one second of speech. In fact, the model is not just quicker, but also higher-fidelity, capable of creating waveforms with 24,000 samples a second. We have also increased the resolution of each sample from 8 bits to 16 bits, the same resolution used in compact discs.
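The figures in that paragraph imply a few numbers that are easy to check. The short Python sketch below works through them; the bit-rate helper assumes uncompressed single-channel audio, which the text does not state, and the variable names are illustrative only.

```python
# Figures quoted in the text above.
SPEEDUP = 1_000               # new model versus the original
NEW_MS_PER_AUDIO_SECOND = 50  # time to create one second of speech

# Implied cost of the original model for one second of speech, in seconds.
old_seconds_per_audio_second = NEW_MS_PER_AUDIO_SECOND * SPEEDUP / 1000

# How many times faster than real time the new model runs.
times_faster_than_real_time = 1000 / NEW_MS_PER_AUDIO_SECOND


def raw_bitrate(samples_per_second, bits_per_sample):
    """Uncompressed bit rate, assuming a single (mono) channel."""
    return samples_per_second * bits_per_sample


original_bps = raw_bitrate(16_000, 8)
updated_bps = raw_bitrate(24_000, 16)

print(old_seconds_per_audio_second)  # 50.0
print(times_faster_than_real_time)   # 20.0
print(2 ** 8, 2 ** 16)               # 256 65536 amplitude levels per sample
print(original_bps, updated_bps)     # 128000 384000 bits per second
```

In other words, the original model would have needed roughly 50 seconds per second of speech, while the updated one runs about 20 times faster than real time and produces three times as much raw audio data per second.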

This makes the new model more natural sounding according to tests with human listeners. For example, the new US English voice I gets a mean-opinion-score (MOS) of 4.347 on a scale of 1-5, where even human speech is rated at just 4.667.
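A mean opinion score is the average of listener ratings on the 1-5 scale. The sketch below shows that calculation on invented ratings (they are hypothetical, not data from the tests mentioned above) and then compares the two scores that the text does report.

```python
from statistics import mean


def mean_opinion_score(ratings):
    """Average a list of listener ratings, each on the 1-5 scale."""
    if any(not 1 <= r <= 5 for r in ratings):
        raise ValueError("ratings must be between 1 and 5")
    return mean(ratings)


# Invented ratings, for illustration only.
hypothetical_ratings = [5, 4, 4, 5, 4]
print(mean_opinion_score(hypothetical_ratings))  # 4.4

# Scores reported in the text above.
WAVENET_MOS = 4.347
HUMAN_MOS = 4.667

gap = round(HUMAN_MOS - WAVENET_MOS, 3)
ratio = WAVENET_MOS / HUMAN_MOS
print(gap)              # 0.32
print(round(ratio, 3))  # 0.931
```

On these reported numbers, the new voice scores about 93% of the rating given to human speech, a gap of 0.32 points.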

The new model also retains the flexibility of the original WaveNet, allowing us to make better use of large amounts of data during the training phase. Specifically, we can train the network using data from multiple voices. This can then be used to generate high-quality, nuanced voices even where there is little training data available for the desired output voice.

We believe this is just the start for WaveNet and we are excited by the possibilities that the power of a voice interface could now unlock for all the world's languages.

Google Unveils Gemini Live Voice Assistant to Rival ChatGPT Voice Mode

Google's new AI-powered voice assistant offers advanced conversational capabilities and app integrations

August 15, 2024

Google has unveiled Gemini Live , a conversational voice assistant that’s set to rival OpenAI ’s Voice Mode .

Available through the Gemini app on Android and iOS, the new Live feature allows users to interact with the AI using their voice.

Powered by Google’s Gemini 1.5 Flash model , the Live feature can answer questions across a variety of generated voices, 10 in total. Users can ask the chatbot to manage their shopping lists or summarize incoming emails.

“With Gemini, we’re reimagining what it means for a personal assistant to be truly helpful,” said Sissie Hsiao, Google’s general manager for Gemini experiences and Google Assistant. “Gemini is evolving to provide AI-powered mobile assistance that will offer a new level of help — all while being more natural, conversational and intuitive.”

Google’s answer to ChatGPT Voice Mode lets users talk to the chatbot when moving to a different app and even when their phone is locked, allowing for interactions to occur as if they were taking a regular phone call.

Gemini Live is currently available in English to Gemini Advanced subscribers on Android phones, before coming to iOS and more languages in the coming weeks.

Gemini Advanced offers a free trial for the first month, with a subscription cost of $20 per month thereafter.

In addition to the new voice functionality, subscribers have access to the Gemini 1.5 Pro model, and its mammoth input length, as well as more storage, access to Gemini in Workspace applications and the ability to upload files for the chatbot to interact with.

Live is getting further extensions — including interoperability with other Google apps such as YouTube Music, where the chatbot can create playlists from voice prompts.

Also in the works is calendar support, allowing the chatbot to interact with a user’s calendar app to set reminders about upcoming events.

New features are expected in the coming weeks.

“Because Gemini has built deep integrations for Android, it can do more than just read the screen,” Hsiao wrote in a blog post. “ It can interact with many of the apps you already use. For example, you can drag and drop images that Gemini generates directly into apps like Gmail and Google Messages.”

In addition to new functionality, Google plans to improve the speed and quality of Live responses. The underlying 1.5 Flash model was unveiled at this year’s Google I/O event and despite being smaller than the flagship 1.5 Pro model, it still boasts the same hefty context window, meaning it can handle huge data inputs.

Gemini Live comes as OpenAI steps up its improvements to ChatGPT’s audio feature, with the new GPT-4o greatly improving the chatbot’s voice functionality.

OpenAI recently began rolling out the newly revamped ChatGPT Voice Mode , though it’s currently locked to a small group of ChatGPT Plus subscribers.

Some might say Google is simply copying ChatGPT’s Voice Mode, but the search company has been working on something similar for some time.

Gemini Live is a glimpse of what its researchers have been working on, with a conversational agent being teased at I/O back in May under the tagline Project Astra .

About the Author

Ben Wodecki

Ben Wodecki is the Jr. Editor of AI Business, covering a wide range of AI content. Ben joined the team in March 2021 as assistant editor and was promoted to Jr. Editor. He has written for The New Statesman, Intellectual Property Magazine, and The Telegraph India, among others. He holds an MSc in Digital Journalism from Middlesex University.

Use Google Assistant to type with your voice

You can dictate text through your voice with Assistant voice typing on Gboard.

Punctuation is automatically added as you speak.
You can type on your keyboard even if the mic is still on.

On Pixel 8+ and coming soon to Pixel 7, if you use multiple languages, Assistant voice typing can now automatically detect your spoken language seamlessly.

In the new “Language tag,” you can find the language detected as you speak.
To manually switch languages, switch to the relevant language keyboard.

What you need

To use Assistant voice typing, you must have:

A Pixel 6 or later, including Fold
Android 12 or later
Google Assistant turned on

Turn Assistant voice typing on or off

On your phone or tablet, open any app that you can type with, like Messages or Gmail.
Tap where you can enter text.

Turn Assistant voice typing on or off.

Tip: Based on words you've corrected in Gboard, Assistant voice typing improves just for you. You can change your voice typing personalization at any time .

Use voice commands

Important: Assistant voice typing is on by default unless you use multiple languages on your device. You can turn Assistant voice typing off at any time.

Say the text you want to type. If your microphone is still on, the microphone icon continues to glow.
To delete the last word: Say "Delete last word."
To delete the last sentence: Say "Clear."
To clear the text: Say "Clear all."
To send a message: Say "Send."
To fill out the next open field in a form: Say "Next."
To add an emoji: Say the name of the emoji, like "Smiley emoji."
To stop voice typing: Say "Stop."
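The command list above behaves like a small dispatcher: some utterances are treated as editing commands and everything else is typed as text. The Python sketch below models that idea on a plain string buffer. It is an illustration only, not how Gboard or Assistant is implemented; the class `DictationBuffer` and its attributes are hypothetical, and the "Next" and emoji commands are left out for brevity.

```python
import re


class DictationBuffer:
    """Hypothetical model of a text box controlled by spoken commands."""

    def __init__(self):
        self.text = ""
        self.sent = []
        self.listening = True

    def hear(self, utterance):
        if not self.listening:
            return
        command = utterance.strip().rstrip(".").lower()
        if command == "delete last word":
            self.text = " ".join(self.text.split()[:-1])
        elif command == "clear":
            # Drop the last sentence, whether or not it is finished.
            sentences = re.findall(r"[^.!?]+[.!?]?", self.text)
            self.text = "".join(sentences[:-1]).strip()
        elif command == "clear all":
            self.text = ""
        elif command == "send":
            self.sent.append(self.text)
            self.text = ""
        elif command == "stop":
            self.listening = False
        else:
            # Anything that is not a command is typed as dictated text.
            self.text = (self.text + " " + utterance.strip()).strip()


buf = DictationBuffer()
for spoken in ["Running late.", "See you soon.", "Delete last word", "Clear", "Send"]:
    buf.hear(spoken)
print(buf.sent)  # ['Running late.']
```

In this run, "Delete last word" removes "soon.", "Clear" removes the unfinished sentence "See you", and "Send" delivers what remains.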

Important: If you’re using "Fix it" for the first time, in the pop-up, tap Continue . You can use "Fix it" on most apps.

Eligibility requirements:

Pixel 8 or 8 Pro
English only
United States
Network connectivity

You can check and correct any typed, pasted, or voice dictated text in your text box for writing errors like typos, grammar, and punctuation.

On your device, open the app where you'll input your text.
Say the text that you want to type.
To apply corrections to your text, say “Fix it” or tap on the suggestion that's shown when an error is detected.
“More fixes” is shown after you say or tap Fix it .
Optional: To remove the correction, say or tap Undo .

Use multiple languages with Assistant voice typing

Turn Assistant voice typing on. Learn how to turn Assistant voice typing on .

English, French, German, Italian, Japanese, and Spanish are supported by Assistant voice typing.

For supported locales, voice typing is on by default, and you can switch between languages automatically. The language you’re speaking will be detected by Assistant.

Edit text with voice commands

If Assistant misheard a word, tap on the word to select it.
To correct a misheard word, speak, type, or spell out a correction, or tap one of the suggested alternatives.

Turn off automated punctuation

On your phone or tablet, open any app that you can type with, like Messages or Gmail.
Turn off Auto punctuation .

Manage voice typing personalization

Fix issues with Assistant voice typing on Gboard

Make sure your phone is connected to Wi-Fi.
Restart your phone.
Charge your phone overnight.

Turn on & connect

Open your device's Settings app.

Learn how to connect to Wi-Fi networks on your Android device.

Assistant voice typing is available in:

French
German
Japanese

Switch languages on Gboard

On your Android phone or tablet, open any app that you can type with, like Gmail or Keep.
Touch and hold the space bar to switch languages.

Switch languages on your device

Drag your language to the top of the list.
Learn how to check your Android version.
Learn how to update your Android apps.

Related resources

Translate as you type
Type with your voice
Get word suggestions & fix mistakes
Change your keyboard theme, sound, or vibration

A smarter phone number

A Voice number works on smartphones and the web so you can place and receive calls from anywhere

Save time, stay connected

From simple navigation to voicemail transcription, Voice makes it easier than ever to save time while staying connected

Image showing Google Voice's calls page and voicemail page.

Take control of your calls

Forward calls to any device and have spam calls silently blocked. With Voice, you decide who can reach you and when.

Image showing Google Voice's spam page, showing a list of calls that were marked as spam

How to try Google's new Gemini Live AI assistant for free

The current generation of mobile voice assistants -- with our simple requests often requiring multiple attempts to be understood -- surely leaves room for improvement. If you're ready for an upgraded experience, voice assistants supported by generative AI may be the solution, and there are some you can try today.

On Tuesday at Made by Google , Google finally released Gemini Live, its advanced mobile conversational experience that enables users to have free-flowing conversations with an AI assistant, or -- as Google describes it -- "a sidekick in your pocket."

The Live experience is supposed to mimic a conversation with a human. As a result, it can be interrupted, carry on multi-turn conversations, and even resume a prior conversation.

Users can also continue talking to it in the background or while their phone is locked, much like an ordinary phone call with a friend or relative. There are 10 voices users can pick from, all resembling human voices with different intonations, sounds, and more. You can listen to the video below:

We’re introducing Gemini Live, a more natural way to interact with Gemini. You can now have a free-flowing conversation, and even interrupt or change topics just like you might on a regular phone call. Available to Gemini Advanced subscribers. #MadeByGoogle pic.twitter.com/eNjlNKubsv — Google (@Google) August 13, 2024

Sound too good to be true? If you want to try it out for yourself, you can -- for free!

How to access

Gemini Live is rolling out to Android users subscribed to Gemini Advanced, which is offered as part of the Google One AI Premium Plan that costs $20 monthly. If you are an iPhone user, no worries, Google says Gemini Live will expand to iOS in the coming weeks.

The $20 per month cost -- which is on par with other premium AI plans like ChatGPT Plus and Claude Pro -- includes other perks besides Gemini Advanced, such as Gemini in Gmail and Docs, 2 TB of storage, and unlimited Magic Editor in Google Photos. Still, if you want to try it before committing to buy it, there are two ways you can do so for free.

The first (and easiest) way to try Gemini Live for free is a one-month free trial of the Google One AI Premium Plan.

To get started, visit the Google One AI Premium Plan webpage , click "Try it for one month," and sign in.

You will be asked to enter payment information because -- once the trial ends -- you start getting charged. To avoid paying, I suggest setting a reminder on your phone to end the subscription before the trial ends.

If you are invested in Google's AI offerings and want to have the full experience, a second option is to purchase the newly released Pixel 9 Pro, which includes access to Gemini Advanced at no additional cost for the first year, a $240 value.

OpenAI offers a similar experience with its new and improved Voice Mode . However, this is rolling out in alpha to a small group of ChatGPT Plus users. Therefore, if you want a guaranteed way to access this experience, the Google One AI Premium Plan is likely your best bet -- at least for now.

Gemini Live, Google’s answer to ChatGPT’s Advanced Voice Mode, launches

Gemini Live, Google’s answer to the recently launched (in limited alpha) Advanced Voice Mode for OpenAI’s ChatGPT, is rolling out on Tuesday, months after it was first announced at Google’s I/O 2024 developer conference. The rollout was announced at Google’s Made by Google 2024 event.

Gemini Live lets users have “in-depth” voice chats with Gemini, Google’s generative AI-powered chatbot , on their smartphones. Thanks to an enhanced speech engine that delivers what Google claims is more consistent, emotionally expressive and realistic multi-turn dialogue, people can interrupt Gemini while the chatbot’s speaking to ask follow-up questions, and it’ll adapt to their speech patterns in real time.

Here’s how Google describes it in a blog post: “With Gemini Live [via the Gemini app ], you can talk to Gemini and choose from [10 new] natural-sounding voices it can respond with. You can even speak at your own pace or interrupt mid-response with clarifying questions, just like you would in any conversation.”

Gemini Live is hands-free if you want it to be. You can keep speaking with the Gemini app in the background or when your phone’s locked, and conversations can be paused and resumed at any time.

So how might this be useful? Google gives the example of rehearsing for a job interview — a bit of an ironic scenario , but OK. Gemini Live can practice with you, Google says, giving speaking tips and suggesting skills to highlight when speaking with a hiring manager (or AI, as the case may be ).

One advantage Gemini Live might have over ChatGPT’s Advanced Voice Mode is a better memory. The generative AI models underpinning Live, Gemini 1.5 Pro and Gemini 1.5 Flash, have a longer-than-average “context window,” meaning they can take in and reason over a lot of data (theoretically hours of back-and-forth conversations) before crafting a response.

“Live uses our Gemini Advanced models that we have adapted to be more conversational,” a Google spokesperson told TechCrunch via email. “The model’s large context window is utilized when users have long conversations with Live.”

We’ll have to see how well this all works in practice, of course. If OpenAI’s setbacks with Advanced Voice Mode are any indication, rarely do demos translate seamlessly to the real world.

On that subject, Gemini Live doesn’t have one of the capabilities Google showcased at I/O just yet: multimodal input. Back in May, Google released pre-recorded videos showing Gemini Live seeing and responding to users’ surroundings via photos and footage captured by their phones’ cameras — for example, naming a part on a broken bicycle or explaining what a portion of code on a computer screen does.

Multimodal input will arrive “later this year,” Google said, declining to provide specifics. Also later this year, Live will expand to additional languages and to iOS via the Google app; it’s only available in English for the time being.

Gemini Live, like Advanced Voice Mode, isn’t free. It’s exclusive to Gemini Advanced, a more sophisticated version of Gemini that’s gated behind the Google One AI Premium Plan , priced at $20 per month.

Other new Gemini features on the way are free, though.

Android users can soon (in the coming weeks) bring up Gemini’s overlay on top of any app they’re using to ask questions about what’s on the screen (e.g., a YouTube video) by holding their phone’s power button or saying, “Hey Google.” Gemini will be able to generate images (but still not images of people , unfortunately) directly from the overlay — images that can be dragged and dropped into apps like Gmail and Google Messages.

Gemini is also gaining new integrations with Google services (or “extensions,” as the company prefers to call them) both on mobile and the web. In the coming weeks, Gemini will be able to take more actions with Google Calendar, Keep, Tasks, YouTube Music and Utilities, the apps that control on-device features like timers and alarms, media controls, the flashlight, volume, Wi-Fi, Bluetooth and so on.

In a blog post, Google gives a few ideas of how people might take advantage. Sounds nifty, assuming it all works reliably:

Ask Gemini to “make a playlist of songs that remind me of the late ’90s.”
Snap a photo of a concert flier and ask Gemini if you’re free that day — and even set a reminder to buy tickets.
Have Gemini dig out a recipe from Gmail and ask it to add the ingredients to your shopping list in Keep.

Lastly, starting later this week, Gemini will be available on Android tablets.

Google Pixel

Google Gemini’s voice chat mode is here

Gemini Advanced subscribers can use Gemini Live for conversational voice chat.

By Wes Davis, a weekend editor who covers the latest in tech and entertainment. He has written news, reviews, and more as a tech journalist since 2020.


Google is rolling out a new voice chat mode for Gemini, called Gemini Live, the company announced at its Pixel 9 event today. Available for Gemini Advanced subscribers, it works a lot like ChatGPT’s voice chat feature, with multiple voices to choose from and the ability to speak conversationally, even to the point of interrupting it without tapping a button.

Google says that conversations with Gemini Live can be “free-flowing,” so you can do things like interrupt an answer mid-sentence or pause the conversation and come back to it later. Gemini Live will also work in the background or when your phone is locked. Google first announced that Gemini Live was coming during its I/O developer conference earlier this year, where it also said Gemini Live would be able to interpret video in real time.

Google also has 10 new Gemini voices for users to pick from, with names like Ursa and Dipper. The feature has started rolling out today, in English only, for Android devices. The company says it will come to iOS and get more languages “in the coming weeks.”

In addition to Gemini Live, Google announced other features for its AI assistant, including new extensions coming later on, for apps like Keep, Tasks, Utilities, and YouTube Music. Gemini is also gaining awareness of the context of your screen, similar to AI features Apple announced at WWDC this year. After users tap “Ask about this screen” or “Ask about this video,” Google says Gemini can give you information, including pulling out details like destinations from travel videos to add to Google Maps.


Cloud Text-to-Speech API

Supported voices and languages

Text-to-Speech provides the following voices. The list includes Neural2, Studio, Standard, and WaveNet voices. Studio, Neural2, and WaveNet voices are higher-quality voices with different pricing; in the list, Studio voices have the voice type 'Studio', Neural2 and WaveNet voices have the voice type 'Premium', and the voice name identifies the model (for example, en-US-Neural2-A or en-US-Wavenet-A).

To use these voices to create synthetic speech, see how to create synthetic voice audio.
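As a rough illustration of how a voice from the list is selected, the sketch below builds the JSON body for the v1 REST `text:synthesize` method of the Cloud Text-to-Speech API. It only constructs and prints the request; authentication and the HTTP call itself are omitted. The helper name `build_synthesize_request` and the sample text are illustrative assumptions, not something this page defines.

```python
import json


def build_synthesize_request(text, language_code, voice_name, audio_encoding="MP3"):
    """Return the request body for a text:synthesize call as a plain dict.

    Hypothetical helper for illustration only; it does not send anything.
    """
    return {
        "input": {"text": text},
        "voice": {"languageCode": language_code, "name": voice_name},
        "audioConfig": {"audioEncoding": audio_encoding},
    }


# The language code and voice name come straight from one row of the voice list.
body = build_synthesize_request("Hello from Text-to-Speech", "en-GB", "en-GB-Standard-A")
print(json.dumps(body, indent=2))
```

The language code and voice name are passed together because each voice name belongs to exactly one language code in the list.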

Language	Voice type	Language code	Voice name	SSML Gender
Afrikaans (South Africa)	Standard	af-ZA	af-ZA-Standard-A	FEMALE
Arabic	Standard	ar-XA	ar-XA-Standard-A	FEMALE
Arabic	Standard	ar-XA	ar-XA-Standard-B	MALE
Arabic	Standard	ar-XA	ar-XA-Standard-C	MALE
Arabic	Standard	ar-XA	ar-XA-Standard-D	FEMALE
Arabic	Premium	ar-XA	ar-XA-Wavenet-A	FEMALE
Arabic	Premium	ar-XA	ar-XA-Wavenet-B	MALE
Arabic	Premium	ar-XA	ar-XA-Wavenet-C	MALE
Arabic	Premium	ar-XA	ar-XA-Wavenet-D	FEMALE
Basque (Spain)	Standard	eu-ES	eu-ES-Standard-A	FEMALE
Bengali (India)	Standard	bn-IN	bn-IN-Standard-A	FEMALE
Bengali (India)	Standard	bn-IN	bn-IN-Standard-B	MALE
Bengali (India)	Standard	bn-IN	bn-IN-Standard-C	FEMALE
Bengali (India)	Standard	bn-IN	bn-IN-Standard-D	MALE
Bengali (India)	Premium	bn-IN	bn-IN-Wavenet-A	FEMALE
Bengali (India)	Premium	bn-IN	bn-IN-Wavenet-B	MALE
Bengali (India)	Premium	bn-IN	bn-IN-Wavenet-C	FEMALE
Bengali (India)	Premium	bn-IN	bn-IN-Wavenet-D	MALE
Bulgarian (Bulgaria)	Standard	bg-BG	bg-BG-Standard-A	FEMALE
Catalan (Spain)	Standard	ca-ES	ca-ES-Standard-A	FEMALE
Chinese (Hong Kong)	Standard	yue-HK	yue-HK-Standard-A	FEMALE
Chinese (Hong Kong)	Standard	yue-HK	yue-HK-Standard-B	MALE
Chinese (Hong Kong)	Standard	yue-HK	yue-HK-Standard-C	FEMALE
Chinese (Hong Kong)	Standard	yue-HK	yue-HK-Standard-D	MALE
Czech (Czech Republic)	Standard	cs-CZ	cs-CZ-Standard-A	FEMALE
Czech (Czech Republic)	Premium	cs-CZ	cs-CZ-Wavenet-A	FEMALE
Danish (Denmark)	Premium	da-DK	da-DK-Neural2-D	FEMALE
Danish (Denmark)	Standard	da-DK	da-DK-Standard-A	FEMALE
Danish (Denmark)	Standard	da-DK	da-DK-Standard-C	MALE
Danish (Denmark)	Standard	da-DK	da-DK-Standard-D	FEMALE
Danish (Denmark)	Standard	da-DK	da-DK-Standard-E	FEMALE
Danish (Denmark)	Premium	da-DK	da-DK-Wavenet-A	FEMALE
Danish (Denmark)	Premium	da-DK	da-DK-Wavenet-C	MALE
Danish (Denmark)	Premium	da-DK	da-DK-Wavenet-D	FEMALE
Danish (Denmark)	Premium	da-DK	da-DK-Wavenet-E	FEMALE
Dutch (Belgium)	Standard	nl-BE	nl-BE-Standard-A	FEMALE
Dutch (Belgium)	Standard	nl-BE	nl-BE-Standard-B	MALE
Dutch (Belgium)	Premium	nl-BE	nl-BE-Wavenet-A	FEMALE
Dutch (Belgium)	Premium	nl-BE	nl-BE-Wavenet-B	MALE
Dutch (Netherlands)	Standard	nl-NL	nl-NL-Standard-A	FEMALE
Dutch (Netherlands)	Standard	nl-NL	nl-NL-Standard-B	MALE
Dutch (Netherlands)	Standard	nl-NL	nl-NL-Standard-C	MALE
Dutch (Netherlands)	Standard	nl-NL	nl-NL-Standard-D	FEMALE
Dutch (Netherlands)	Standard	nl-NL	nl-NL-Standard-E	FEMALE
Dutch (Netherlands)	Premium	nl-NL	nl-NL-Wavenet-A	FEMALE
Dutch (Netherlands)	Premium	nl-NL	nl-NL-Wavenet-B	MALE
Dutch (Netherlands)	Premium	nl-NL	nl-NL-Wavenet-C	MALE
Dutch (Netherlands)	Premium	nl-NL	nl-NL-Wavenet-D	FEMALE
Dutch (Netherlands)	Premium	nl-NL	nl-NL-Wavenet-E	FEMALE
English (Australia)	Premium	en-AU	en-AU-Neural2-A	FEMALE
English (Australia)	Premium	en-AU	en-AU-Neural2-B	MALE
English (Australia)	Premium	en-AU	en-AU-Neural2-C	FEMALE
English (Australia)	Premium	en-AU	en-AU-Neural2-D	MALE
English (Australia)	Premium	en-AU	en-AU-News-E	FEMALE
English (Australia)	Premium	en-AU	en-AU-News-F	FEMALE
English (Australia)	Premium	en-AU	en-AU-News-G	MALE
English (Australia)	Premium	en-AU	en-AU-Polyglot-1	MALE
English (Australia)	Standard	en-AU	en-AU-Standard-A	FEMALE
English (Australia)	Standard	en-AU	en-AU-Standard-B	MALE
English (Australia)	Standard	en-AU	en-AU-Standard-C	FEMALE
English (Australia)	Standard	en-AU	en-AU-Standard-D	MALE
English (Australia)	Premium	en-AU	en-AU-Wavenet-A	FEMALE
English (Australia)	Premium	en-AU	en-AU-Wavenet-B	MALE
English (Australia)	Premium	en-AU	en-AU-Wavenet-C	FEMALE
English (Australia)	Premium	en-AU	en-AU-Wavenet-D	MALE
English (India)	Premium	en-IN	en-IN-Neural2-A	FEMALE
English (India)	Premium	en-IN	en-IN-Neural2-B	MALE
English (India)	Premium	en-IN	en-IN-Neural2-C	MALE
English (India)	Premium	en-IN	en-IN-Neural2-D	FEMALE
English (India)	Standard	en-IN	en-IN-Standard-A	FEMALE
English (India)	Standard	en-IN	en-IN-Standard-B	MALE
English (India)	Standard	en-IN	en-IN-Standard-C	MALE
English (India)	Standard	en-IN	en-IN-Standard-D	FEMALE
English (India)	Premium	en-IN	en-IN-Wavenet-A	FEMALE
English (India)	Premium	en-IN	en-IN-Wavenet-B	MALE
English (India)	Premium	en-IN	en-IN-Wavenet-C	MALE
English (India)	Premium	en-IN	en-IN-Wavenet-D	FEMALE
English (UK)	Premium	en-GB	en-GB-Neural2-A	FEMALE
English (UK)	Premium	en-GB	en-GB-Neural2-B	MALE
English (UK)	Premium	en-GB	en-GB-Neural2-C	FEMALE
English (UK)	Premium	en-GB	en-GB-Neural2-D	MALE
English (UK)	Premium	en-GB	en-GB-Neural2-F	FEMALE
English (UK)	Premium	en-GB	en-GB-News-G	FEMALE
English (UK)	Premium	en-GB	en-GB-News-H	FEMALE
English (UK)	Premium	en-GB	en-GB-News-I	FEMALE
English (UK)	Premium	en-GB	en-GB-News-J	MALE
English (UK)	Premium	en-GB	en-GB-News-K	MALE
English (UK)	Premium	en-GB	en-GB-News-L	MALE
English (UK)	Premium	en-GB	en-GB-News-M	MALE
English (UK)	Standard	en-GB	en-GB-Standard-A	FEMALE
English (UK)	Standard	en-GB	en-GB-Standard-B	MALE
English (UK)	Standard	en-GB	en-GB-Standard-C	FEMALE
English (UK)	Standard	en-GB	en-GB-Standard-D	MALE
English (UK)	Standard	en-GB	en-GB-Standard-F	FEMALE
English (UK)	Studio	en-GB	en-GB-Studio-B	MALE
English (UK)	Studio	en-GB	en-GB-Studio-C	FEMALE
English (UK)	Premium	en-GB	en-GB-Wavenet-A	FEMALE
English (UK)	Premium	en-GB	en-GB-Wavenet-B	MALE
English (UK)	Premium	en-GB	en-GB-Wavenet-C	FEMALE
English (UK)	Premium	en-GB	en-GB-Wavenet-D	MALE
English (UK)	Premium	en-GB	en-GB-Wavenet-F	FEMALE
English (US)	Premium	en-US	en-US-Casual-K	MALE
English (US)	Premium	en-US	en-US-Journey-D	MALE
English (US)	Premium	en-US	en-US-Journey-F	FEMALE
English (US)	Premium	en-US	en-US-Journey-O	FEMALE
English (US)	Premium	en-US	en-US-Neural2-A	MALE
English (US)	Premium	en-US	en-US-Neural2-C	FEMALE
English (US)	Premium	en-US	en-US-Neural2-D	MALE
English (US)	Premium	en-US	en-US-Neural2-E	FEMALE
English (US)	Premium	en-US	en-US-Neural2-F	FEMALE
English (US)	Premium	en-US	en-US-Neural2-G	FEMALE
English (US)	Premium	en-US	en-US-Neural2-H	FEMALE
English (US)	Premium	en-US	en-US-Neural2-I	MALE
English (US)	Premium	en-US	en-US-Neural2-J	MALE
English (US)	Premium	en-US	en-US-News-K	FEMALE
English (US)	Premium	en-US	en-US-News-L	FEMALE
English (US)	Premium	en-US	en-US-News-N	MALE
English (US)	Premium	en-US	en-US-Polyglot-1	MALE
English (US)	Standard	en-US	en-US-Standard-A	MALE
English (US)	Standard	en-US	en-US-Standard-B	MALE
English (US)	Standard	en-US	en-US-Standard-C	FEMALE
English (US)	Standard	en-US	en-US-Standard-D	MALE
English (US)	Standard	en-US	en-US-Standard-E	FEMALE
English (US)	Standard	en-US	en-US-Standard-F	FEMALE
English (US)	Standard	en-US	en-US-Standard-G	FEMALE
English (US)	Standard	en-US	en-US-Standard-H	FEMALE
English (US)	Standard	en-US	en-US-Standard-I	MALE
English (US)	Standard	en-US	en-US-Standard-J	MALE
English (US)	Studio	en-US	en-US-Studio-O	FEMALE
English (US)	Studio	en-US	en-US-Studio-Q	MALE
English (US)	Premium	en-US	en-US-Wavenet-A	MALE
English (US)	Premium	en-US	en-US-Wavenet-B	MALE
English (US)	Premium	en-US	en-US-Wavenet-C	FEMALE
English (US)	Premium	en-US	en-US-Wavenet-D	MALE
English (US)	Premium	en-US	en-US-Wavenet-E	FEMALE
English (US)	Premium	en-US	en-US-Wavenet-F	FEMALE
English (US)	Premium	en-US	en-US-Wavenet-G	FEMALE
English (US)	Premium	en-US	en-US-Wavenet-H	FEMALE
English (US)	Premium	en-US	en-US-Wavenet-I	MALE
English (US)	Premium	en-US	en-US-Wavenet-J	MALE
Filipino (Philippines)	Standard	fil-PH	fil-PH-Standard-A	FEMALE
Filipino (Philippines)	Standard	fil-PH	fil-PH-Standard-B	FEMALE
Filipino (Philippines)	Standard	fil-PH	fil-PH-Standard-C	MALE
Filipino (Philippines)	Standard	fil-PH	fil-PH-Standard-D	MALE
Filipino (Philippines)	Premium	fil-PH	fil-PH-Wavenet-A	FEMALE
Filipino (Philippines)	Premium	fil-PH	fil-PH-Wavenet-B	FEMALE
Filipino (Philippines)	Premium	fil-PH	fil-PH-Wavenet-C	MALE
Filipino (Philippines)	Premium	fil-PH	fil-PH-Wavenet-D	MALE
Filipino (Philippines)	Premium	fil-PH	fil-ph-Neural2-A	FEMALE
Filipino (Philippines)	Premium	fil-PH	fil-ph-Neural2-D	MALE
Finnish (Finland)	Standard	fi-FI	fi-FI-Standard-A	FEMALE
Finnish (Finland)	Premium	fi-FI	fi-FI-Wavenet-A	FEMALE
French (Canada)	Premium	fr-CA	fr-CA-Neural2-A	FEMALE
French (Canada)	Premium	fr-CA	fr-CA-Neural2-B	MALE
French (Canada)	Premium	fr-CA	fr-CA-Neural2-C	FEMALE
French (Canada)	Premium	fr-CA	fr-CA-Neural2-D	MALE
French (Canada)	Standard	fr-CA	fr-CA-Standard-A	FEMALE
French (Canada)	Standard	fr-CA	fr-CA-Standard-B	MALE
French (Canada)	Standard	fr-CA	fr-CA-Standard-C	FEMALE
French (Canada)	Standard	fr-CA	fr-CA-Standard-D	MALE
French (Canada)	Premium	fr-CA	fr-CA-Wavenet-A	FEMALE
French (Canada)	Premium	fr-CA	fr-CA-Wavenet-B	MALE
French (Canada)	Premium	fr-CA	fr-CA-Wavenet-C	FEMALE
French (Canada)	Premium	fr-CA	fr-CA-Wavenet-D	MALE
French (France)	Premium	fr-FR	fr-FR-Neural2-A	FEMALE
French (France)	Premium	fr-FR	fr-FR-Neural2-B	MALE
French (France)	Premium	fr-FR	fr-FR-Neural2-C	FEMALE
French (France)	Premium	fr-FR	fr-FR-Neural2-D	MALE
French (France)	Premium	fr-FR	fr-FR-Neural2-E	FEMALE
French (France)	Premium	fr-FR	fr-FR-Polyglot-1	MALE
French (France)	Standard	fr-FR	fr-FR-Standard-A	FEMALE
French (France)	Standard	fr-FR	fr-FR-Standard-B	MALE
French (France)	Standard	fr-FR	fr-FR-Standard-C	FEMALE
French (France)	Standard	fr-FR	fr-FR-Standard-D	MALE
French (France)	Standard	fr-FR	fr-FR-Standard-E	FEMALE
French (France)	Studio	fr-FR	fr-FR-Studio-A	FEMALE
French (France)	Studio	fr-FR	fr-FR-Studio-D	MALE
French (France)	Premium	fr-FR	fr-FR-Wavenet-A	FEMALE
French (France)	Premium	fr-FR	fr-FR-Wavenet-B	MALE
French (France)	Premium	fr-FR	fr-FR-Wavenet-C	FEMALE
French (France)	Premium	fr-FR	fr-FR-Wavenet-D	MALE
French (France)	Premium	fr-FR	fr-FR-Wavenet-E	FEMALE
Galician (Spain)	Standard	gl-ES	gl-ES-Standard-A	FEMALE
German (Germany)	Premium	de-DE	de-DE-Neural2-A	FEMALE
German (Germany)	Premium	de-DE	de-DE-Neural2-B	MALE
German (Germany)	Premium	de-DE	de-DE-Neural2-C	FEMALE
German (Germany)	Premium	de-DE	de-DE-Neural2-D	MALE
German (Germany)	Premium	de-DE	de-DE-Neural2-F	FEMALE
German (Germany)	Premium	de-DE	de-DE-Polyglot-1	MALE
German (Germany)	Standard	de-DE	de-DE-Standard-A	FEMALE
German (Germany)	Standard	de-DE	de-DE-Standard-B	MALE
German (Germany)	Standard	de-DE	de-DE-Standard-C	FEMALE
German (Germany)	Standard	de-DE	de-DE-Standard-D	MALE
German (Germany)	Standard	de-DE	de-DE-Standard-E	MALE
German (Germany)	Standard	de-DE	de-DE-Standard-F	FEMALE
German (Germany)	Studio	de-DE	de-DE-Studio-B	MALE
German (Germany)	Studio	de-DE	de-DE-Studio-C	FEMALE
German (Germany)	Premium	de-DE	de-DE-Wavenet-A	FEMALE
German (Germany)	Premium	de-DE	de-DE-Wavenet-B	MALE
German (Germany)	Premium	de-DE	de-DE-Wavenet-C	FEMALE
German (Germany)	Premium	de-DE	de-DE-Wavenet-D	MALE
German (Germany)	Premium	de-DE	de-DE-Wavenet-E	MALE
German (Germany)	Premium	de-DE	de-DE-Wavenet-F	FEMALE
Greek (Greece)	Standard	el-GR	el-GR-Standard-A	FEMALE
Greek (Greece)	Premium	el-GR	el-GR-Wavenet-A	FEMALE
Gujarati (India)	Standard	gu-IN	gu-IN-Standard-A	FEMALE
Gujarati (India)	Standard	gu-IN	gu-IN-Standard-B	MALE
Gujarati (India)	Standard	gu-IN	gu-IN-Standard-C	FEMALE
Gujarati (India)	Standard	gu-IN	gu-IN-Standard-D	MALE
Gujarati (India)	Premium	gu-IN	gu-IN-Wavenet-A	FEMALE
Gujarati (India)	Premium	gu-IN	gu-IN-Wavenet-B	MALE
Gujarati (India)	Premium	gu-IN	gu-IN-Wavenet-C	FEMALE
Gujarati (India)	Premium	gu-IN	gu-IN-Wavenet-D	MALE
Hebrew (Israel)	Standard	he-IL	he-IL-Standard-A	FEMALE
Hebrew (Israel)	Standard	he-IL	he-IL-Standard-B	MALE
Hebrew (Israel)	Standard	he-IL	he-IL-Standard-C	FEMALE
Hebrew (Israel)	Standard	he-IL	he-IL-Standard-D	MALE
Hebrew (Israel)	Premium	he-IL	he-IL-Wavenet-A	FEMALE
Hebrew (Israel)	Premium	he-IL	he-IL-Wavenet-B	MALE
Hebrew (Israel)	Premium	he-IL	he-IL-Wavenet-C	FEMALE
Hebrew (Israel)	Premium	he-IL	he-IL-Wavenet-D	MALE
Hindi (India)	Premium	hi-IN	hi-IN-Neural2-A	FEMALE
Hindi (India)	Premium	hi-IN	hi-IN-Neural2-B	MALE
Hindi (India)	Premium	hi-IN	hi-IN-Neural2-C	MALE
Hindi (India)	Premium	hi-IN	hi-IN-Neural2-D	FEMALE
Hindi (India)	Standard	hi-IN	hi-IN-Standard-A	FEMALE
Hindi (India)	Standard	hi-IN	hi-IN-Standard-B	MALE
Hindi (India)	Standard	hi-IN	hi-IN-Standard-C	MALE
Hindi (India)	Standard	hi-IN	hi-IN-Standard-D	FEMALE
Hindi (India)	Premium	hi-IN	hi-IN-Wavenet-A	FEMALE
Hindi (India)	Premium	hi-IN	hi-IN-Wavenet-B	MALE
Hindi (India)	Premium	hi-IN	hi-IN-Wavenet-C	MALE
Hindi (India)	Premium	hi-IN	hi-IN-Wavenet-D	FEMALE
Hungarian (Hungary)	Standard	hu-HU	hu-HU-Standard-A	FEMALE
Hungarian (Hungary)	Premium	hu-HU	hu-HU-Wavenet-A	FEMALE
Icelandic (Iceland)	Standard	is-IS	is-IS-Standard-A	FEMALE
Indonesian (Indonesia)	Standard	id-ID	id-ID-Standard-A	FEMALE
Indonesian (Indonesia)	Standard	id-ID	id-ID-Standard-B	MALE
Indonesian (Indonesia)	Standard	id-ID	id-ID-Standard-C	MALE
Indonesian (Indonesia)	Standard	id-ID	id-ID-Standard-D	FEMALE
Indonesian (Indonesia)	Premium	id-ID	id-ID-Wavenet-A	FEMALE
Indonesian (Indonesia)	Premium	id-ID	id-ID-Wavenet-B	MALE
Indonesian (Indonesia)	Premium	id-ID	id-ID-Wavenet-C	MALE
Indonesian (Indonesia)	Premium	id-ID	id-ID-Wavenet-D	FEMALE
Italian (Italy)	Premium	it-IT	it-IT-Neural2-A	FEMALE
Italian (Italy)	Premium	it-IT	it-IT-Neural2-C	MALE
Italian (Italy)	Standard	it-IT	it-IT-Standard-A	FEMALE
Italian (Italy)	Standard	it-IT	it-IT-Standard-B	FEMALE
Italian (Italy)	Standard	it-IT	it-IT-Standard-C	MALE
Italian (Italy)	Standard	it-IT	it-IT-Standard-D	MALE
Italian (Italy)	Premium	it-IT	it-IT-Wavenet-A	FEMALE
Italian (Italy)	Premium	it-IT	it-IT-Wavenet-B	FEMALE
Italian (Italy)	Premium	it-IT	it-IT-Wavenet-C	MALE
Italian (Italy)	Premium	it-IT	it-IT-Wavenet-D	MALE
Japanese (Japan)	Premium	ja-JP	ja-JP-Neural2-B	FEMALE
Japanese (Japan)	Premium	ja-JP	ja-JP-Neural2-C	MALE
Japanese (Japan)	Premium	ja-JP	ja-JP-Neural2-D	MALE
Japanese (Japan)	Standard	ja-JP	ja-JP-Standard-A	FEMALE
Japanese (Japan)	Standard	ja-JP	ja-JP-Standard-B	FEMALE
Japanese (Japan)	Standard	ja-JP	ja-JP-Standard-C	MALE
Japanese (Japan)	Standard	ja-JP	ja-JP-Standard-D	MALE
Japanese (Japan)	Premium	ja-JP	ja-JP-Wavenet-A	FEMALE
Japanese (Japan)	Premium	ja-JP	ja-JP-Wavenet-B	FEMALE
Japanese (Japan)	Premium	ja-JP	ja-JP-Wavenet-C	MALE
Japanese (Japan)	Premium	ja-JP	ja-JP-Wavenet-D	MALE
Kannada (India)	Standard	kn-IN	kn-IN-Standard-A	FEMALE
Kannada (India)	Standard	kn-IN	kn-IN-Standard-B	MALE
Kannada (India)	Standard	kn-IN	kn-IN-Standard-C	FEMALE
Kannada (India)	Standard	kn-IN	kn-IN-Standard-D	MALE
Kannada (India)	Premium	kn-IN	kn-IN-Wavenet-A	FEMALE
Kannada (India)	Premium	kn-IN	kn-IN-Wavenet-B	MALE
Kannada (India)	Premium	kn-IN	kn-IN-Wavenet-C	FEMALE
Kannada (India)	Premium	kn-IN	kn-IN-Wavenet-D	MALE
Korean (South Korea)	Premium	ko-KR	ko-KR-Neural2-A	FEMALE
Korean (South Korea)	Premium	ko-KR	ko-KR-Neural2-B	FEMALE
Korean (South Korea)	Premium	ko-KR	ko-KR-Neural2-C	MALE
Korean (South Korea)	Standard	ko-KR	ko-KR-Standard-A	FEMALE
Korean (South Korea)	Standard	ko-KR	ko-KR-Standard-B	FEMALE
Korean (South Korea)	Standard	ko-KR	ko-KR-Standard-C	MALE
Korean (South Korea)	Standard	ko-KR	ko-KR-Standard-D	MALE
Korean (South Korea)	Premium	ko-KR	ko-KR-Wavenet-A	FEMALE
Korean (South Korea)	Premium	ko-KR	ko-KR-Wavenet-B	FEMALE
Korean (South Korea)	Premium	ko-KR	ko-KR-Wavenet-C	MALE
Korean (South Korea)	Premium	ko-KR	ko-KR-Wavenet-D	MALE
Latvian (Latvia)	Standard	lv-LV	lv-LV-Standard-A	MALE
Lithuanian (Lithuania)	Standard	lt-LT	lt-LT-Standard-A	MALE
Malay (Malaysia)	Standard	ms-MY	ms-MY-Standard-A	FEMALE
Malay (Malaysia)	Standard	ms-MY	ms-MY-Standard-B	MALE
Malay (Malaysia)	Standard	ms-MY	ms-MY-Standard-C	FEMALE
Malay (Malaysia)	Standard	ms-MY	ms-MY-Standard-D	MALE
Malay (Malaysia)	Premium	ms-MY	ms-MY-Wavenet-A	FEMALE
Malay (Malaysia)	Premium	ms-MY	ms-MY-Wavenet-B	MALE
Malay (Malaysia)	Premium	ms-MY	ms-MY-Wavenet-C	FEMALE
Malay (Malaysia)	Premium	ms-MY	ms-MY-Wavenet-D	MALE
Malayalam (India)	Standard	ml-IN	ml-IN-Standard-A	FEMALE
Malayalam (India)	Standard	ml-IN	ml-IN-Standard-B	MALE
Malayalam (India)	Standard	ml-IN	ml-IN-Standard-C	FEMALE
Malayalam (India)	Standard	ml-IN	ml-IN-Standard-D	MALE
Malayalam (India)	Premium	ml-IN	ml-IN-Wavenet-A	FEMALE
Malayalam (India)	Premium	ml-IN	ml-IN-Wavenet-B	MALE
Malayalam (India)	Premium	ml-IN	ml-IN-Wavenet-C	FEMALE
Malayalam (India)	Premium	ml-IN	ml-IN-Wavenet-D	MALE
Mandarin Chinese	Standard	cmn-CN	cmn-CN-Standard-A	FEMALE
Mandarin Chinese	Standard	cmn-CN	cmn-CN-Standard-B	MALE
Mandarin Chinese	Standard	cmn-CN	cmn-CN-Standard-C	MALE
Mandarin Chinese	Standard	cmn-CN	cmn-CN-Standard-D	FEMALE
Mandarin Chinese	Premium	cmn-CN	cmn-CN-Wavenet-A	FEMALE
Mandarin Chinese	Premium	cmn-CN	cmn-CN-Wavenet-B	MALE
Mandarin Chinese	Premium	cmn-CN	cmn-CN-Wavenet-C	MALE
Mandarin Chinese	Premium	cmn-CN	cmn-CN-Wavenet-D	FEMALE
Mandarin Chinese	Standard	cmn-TW	cmn-TW-Standard-A	FEMALE
Mandarin Chinese	Standard	cmn-TW	cmn-TW-Standard-B	MALE
Mandarin Chinese	Standard	cmn-TW	cmn-TW-Standard-C	MALE
Mandarin Chinese	Premium	cmn-TW	cmn-TW-Wavenet-A	FEMALE
Mandarin Chinese	Premium	cmn-TW	cmn-TW-Wavenet-B	MALE
Mandarin Chinese	Premium	cmn-TW	cmn-TW-Wavenet-C	MALE
Marathi (India)	Standard	mr-IN	mr-IN-Standard-A	FEMALE
Marathi (India)	Standard	mr-IN	mr-IN-Standard-B	MALE
Marathi (India)	Standard	mr-IN	mr-IN-Standard-C	FEMALE
Marathi (India)	Premium	mr-IN	mr-IN-Wavenet-A	FEMALE
Marathi (India)	Premium	mr-IN	mr-IN-Wavenet-B	MALE
Marathi (India)	Premium	mr-IN	mr-IN-Wavenet-C	FEMALE
Norwegian (Norway)	Standard	nb-NO	nb-NO-Standard-A	FEMALE
Norwegian (Norway)	Standard	nb-NO	nb-NO-Standard-B	MALE
Norwegian (Norway)	Standard	nb-NO	nb-NO-Standard-C	FEMALE
Norwegian (Norway)	Standard	nb-NO	nb-NO-Standard-D	MALE
Norwegian (Norway)	Standard	nb-NO	nb-NO-Standard-E	FEMALE
Norwegian (Norway)	Premium	nb-NO	nb-NO-Wavenet-A	FEMALE
Norwegian (Norway)	Premium	nb-NO	nb-NO-Wavenet-B	MALE
Norwegian (Norway)	Premium	nb-NO	nb-NO-Wavenet-C	FEMALE
Norwegian (Norway)	Premium	nb-NO	nb-NO-Wavenet-D	MALE
Norwegian (Norway)	Premium	nb-NO	nb-NO-Wavenet-E	FEMALE
Polish (Poland)	Standard	pl-PL	pl-PL-Standard-A	FEMALE
Polish (Poland)	Standard	pl-PL	pl-PL-Standard-B	MALE
Polish (Poland)	Standard	pl-PL	pl-PL-Standard-C	MALE
Polish (Poland)	Standard	pl-PL	pl-PL-Standard-D	FEMALE
Polish (Poland)	Standard	pl-PL	pl-PL-Standard-E	FEMALE
Polish (Poland)	Premium	pl-PL	pl-PL-Wavenet-A	FEMALE
Polish (Poland)	Premium	pl-PL	pl-PL-Wavenet-B	MALE
Polish (Poland)	Premium	pl-PL	pl-PL-Wavenet-C	MALE
Polish (Poland)	Premium	pl-PL	pl-PL-Wavenet-D	FEMALE
Polish (Poland)	Premium	pl-PL	pl-PL-Wavenet-E	FEMALE
Portuguese (Brazil)	Premium	pt-BR	pt-BR-Neural2-A	FEMALE
Portuguese (Brazil)	Premium	pt-BR	pt-BR-Neural2-B	MALE
Portuguese (Brazil)	Premium	pt-BR	pt-BR-Neural2-C	FEMALE
Portuguese (Brazil)	Standard	pt-BR	pt-BR-Standard-A	FEMALE
Portuguese (Brazil)	Standard	pt-BR	pt-BR-Standard-B	MALE
Portuguese (Brazil)	Standard	pt-BR	pt-BR-Standard-C	FEMALE
Portuguese (Brazil)	Studio	pt-BR	pt-BR-Studio-B	MALE
Portuguese (Brazil)	Studio	pt-BR	pt-BR-Studio-C	FEMALE
Portuguese (Brazil)	Premium	pt-BR	pt-BR-Wavenet-A	FEMALE
Portuguese (Brazil)	Premium	pt-BR	pt-BR-Wavenet-B	MALE
Portuguese (Brazil)	Premium	pt-BR	pt-BR-Wavenet-C	FEMALE
Portuguese (Portugal)	Standard	pt-PT	pt-PT-Standard-A	FEMALE
Portuguese (Portugal)	Standard	pt-PT	pt-PT-Standard-B	MALE
Portuguese (Portugal)	Standard	pt-PT	pt-PT-Standard-C	MALE
Portuguese (Portugal)	Standard	pt-PT	pt-PT-Standard-D	FEMALE
Portuguese (Portugal)	Premium	pt-PT	pt-PT-Wavenet-A	FEMALE
Portuguese (Portugal)	Premium	pt-PT	pt-PT-Wavenet-B	MALE
Portuguese (Portugal)	Premium	pt-PT	pt-PT-Wavenet-C	MALE
Portuguese (Portugal)	Premium	pt-PT	pt-PT-Wavenet-D	FEMALE
Punjabi (India)	Standard	pa-IN	pa-IN-Standard-A	FEMALE
Punjabi (India)	Standard	pa-IN	pa-IN-Standard-B	MALE
Punjabi (India)	Standard	pa-IN	pa-IN-Standard-C	FEMALE
Punjabi (India)	Standard	pa-IN	pa-IN-Standard-D	MALE
Punjabi (India)	Premium	pa-IN	pa-IN-Wavenet-A	FEMALE
Punjabi (India)	Premium	pa-IN	pa-IN-Wavenet-B	MALE
Punjabi (India)	Premium	pa-IN	pa-IN-Wavenet-C	FEMALE
Punjabi (India)	Premium	pa-IN	pa-IN-Wavenet-D	MALE
Romanian (Romania)	Standard	ro-RO	ro-RO-Standard-A	FEMALE
Romanian (Romania)	Premium	ro-RO	ro-RO-Wavenet-A	FEMALE
Russian (Russia)	Standard	ru-RU	ru-RU-Standard-A	FEMALE
Russian (Russia)	Standard	ru-RU	ru-RU-Standard-B	MALE
Russian (Russia)	Standard	ru-RU	ru-RU-Standard-C	FEMALE
Russian (Russia)	Standard	ru-RU	ru-RU-Standard-D	MALE
Russian (Russia)	Standard	ru-RU	ru-RU-Standard-E	FEMALE
Russian (Russia)	Premium	ru-RU	ru-RU-Wavenet-A	FEMALE
Russian (Russia)	Premium	ru-RU	ru-RU-Wavenet-B	MALE
Russian (Russia)	Premium	ru-RU	ru-RU-Wavenet-C	FEMALE
Russian (Russia)	Premium	ru-RU	ru-RU-Wavenet-D	MALE
Russian (Russia)	Premium	ru-RU	ru-RU-Wavenet-E	FEMALE
Serbian (Cyrillic)	Standard	sr-RS	sr-RS-Standard-A	FEMALE
Slovak (Slovakia)	Standard	sk-SK	sk-SK-Standard-A	FEMALE
Slovak (Slovakia)	Premium	sk-SK	sk-SK-Wavenet-A	FEMALE
Spanish (Spain)	Premium	es-ES	es-ES-Neural2-A	FEMALE
Spanish (Spain)	Premium	es-ES	es-ES-Neural2-B	MALE
Spanish (Spain)	Premium	es-ES	es-ES-Neural2-C	FEMALE
Spanish (Spain)	Premium	es-ES	es-ES-Neural2-D	FEMALE
Spanish (Spain)	Premium	es-ES	es-ES-Neural2-E	FEMALE
Spanish (Spain)	Premium	es-ES	es-ES-Neural2-F	MALE
Spanish (Spain)	Premium	es-ES	es-ES-Polyglot-1	MALE
Spanish (Spain)	Standard	es-ES	es-ES-Standard-A	FEMALE
Spanish (Spain)	Standard	es-ES	es-ES-Standard-B	MALE
Spanish (Spain)	Standard	es-ES	es-ES-Standard-C	FEMALE
Spanish (Spain)	Standard	es-ES	es-ES-Standard-D	FEMALE
Spanish (Spain)	Studio	es-ES	es-ES-Studio-C	FEMALE
Spanish (Spain)	Studio	es-ES	es-ES-Studio-F	MALE
Spanish (Spain)	Premium	es-ES	es-ES-Wavenet-B	MALE
Spanish (Spain)	Premium	es-ES	es-ES-Wavenet-C	FEMALE
Spanish (Spain)	Premium	es-ES	es-ES-Wavenet-D	FEMALE
Spanish (US)	Premium	es-US	es-US-Neural2-A	FEMALE
Spanish (US)	Premium	es-US	es-US-Neural2-B	MALE
Spanish (US)	Premium	es-US	es-US-Neural2-C	MALE
Spanish (US)	Premium	es-US	es-US-News-D	MALE
Spanish (US)	Premium	es-US	es-US-News-E	MALE
Spanish (US)	Premium	es-US	es-US-News-F	FEMALE
Spanish (US)	Premium	es-US	es-US-News-G	FEMALE
Spanish (US)	Premium	es-US	es-US-Polyglot-1	MALE
Spanish (US)	Standard	es-US	es-US-Standard-A	FEMALE
Spanish (US)	Standard	es-US	es-US-Standard-B	MALE
Spanish (US)	Standard	es-US	es-US-Standard-C	MALE
Spanish (US)	Studio	es-US	es-US-Studio-B	MALE
Spanish (US)	Premium	es-US	es-US-Wavenet-A	FEMALE
Spanish (US)	Premium	es-US	es-US-Wavenet-B	MALE
Spanish (US)	Premium	es-US	es-US-Wavenet-C	MALE
Swedish (Sweden)	Standard	sv-SE	sv-SE-Standard-A	FEMALE
Swedish (Sweden)	Standard	sv-SE	sv-SE-Standard-B	FEMALE
Swedish (Sweden)	Standard	sv-SE	sv-SE-Standard-C	FEMALE
Swedish (Sweden)	Standard	sv-SE	sv-SE-Standard-D	MALE
Swedish (Sweden)	Standard	sv-SE	sv-SE-Standard-E	MALE
Swedish (Sweden)	Premium	sv-SE	sv-SE-Wavenet-A	FEMALE
Swedish (Sweden)	Premium	sv-SE	sv-SE-Wavenet-B	FEMALE
Swedish (Sweden)	Premium	sv-SE	sv-SE-Wavenet-C	MALE
Swedish (Sweden)	Premium	sv-SE	sv-SE-Wavenet-D	FEMALE
Swedish (Sweden)	Premium	sv-SE	sv-SE-Wavenet-E	MALE
Tamil (India)	Standard	ta-IN	ta-IN-Standard-A	FEMALE
Tamil (India)	Standard	ta-IN	ta-IN-Standard-B	MALE
Tamil (India)	Standard	ta-IN	ta-IN-Standard-C	FEMALE
Tamil (India)	Standard	ta-IN	ta-IN-Standard-D	MALE
Tamil (India)	Premium	ta-IN	ta-IN-Wavenet-A	FEMALE
Tamil (India)	Premium	ta-IN	ta-IN-Wavenet-B	MALE
Tamil (India)	Premium	ta-IN	ta-IN-Wavenet-C	FEMALE
Tamil (India)	Premium	ta-IN	ta-IN-Wavenet-D	MALE
Telugu (India)	Standard	te-IN	te-IN-Standard-A	FEMALE
Telugu (India)	Standard	te-IN	te-IN-Standard-B	MALE
Thai (Thailand)	Premium	th-TH	th-TH-Neural2-C	FEMALE
Thai (Thailand)	Standard	th-TH	th-TH-Standard-A	FEMALE
Turkish (Turkey)	Standard	tr-TR	tr-TR-Standard-A	FEMALE
Turkish (Turkey)	Standard	tr-TR	tr-TR-Standard-B	MALE
Turkish (Turkey)	Standard	tr-TR	tr-TR-Standard-C	FEMALE
Turkish (Turkey)	Standard	tr-TR	tr-TR-Standard-D	FEMALE
Turkish (Turkey)	Standard	tr-TR	tr-TR-Standard-E	MALE
Turkish (Turkey)	Premium	tr-TR	tr-TR-Wavenet-A	FEMALE
Turkish (Turkey)	Premium	tr-TR	tr-TR-Wavenet-B	MALE
Turkish (Turkey)	Premium	tr-TR	tr-TR-Wavenet-C	FEMALE
Turkish (Turkey)	Premium	tr-TR	tr-TR-Wavenet-D	FEMALE
Turkish (Turkey)	Premium	tr-TR	tr-TR-Wavenet-E	MALE
Ukrainian (Ukraine)	Standard	uk-UA	uk-UA-Standard-A	FEMALE
Ukrainian (Ukraine)	Premium	uk-UA	uk-UA-Wavenet-A	FEMALE
Vietnamese (Vietnam)	Premium	vi-VN	vi-VN-Neural2-A	FEMALE
Vietnamese (Vietnam)	Premium	vi-VN	vi-VN-Neural2-D	MALE
Vietnamese (Vietnam)	Standard	vi-VN	vi-VN-Standard-A	FEMALE
Vietnamese (Vietnam)	Standard	vi-VN	vi-VN-Standard-B	MALE
Vietnamese (Vietnam)	Standard	vi-VN	vi-VN-Standard-C	FEMALE
Vietnamese (Vietnam)	Standard	vi-VN	vi-VN-Standard-D	MALE
Vietnamese (Vietnam)	Premium	vi-VN	vi-VN-Wavenet-A	FEMALE
Vietnamese (Vietnam)	Premium	vi-VN	vi-VN-Wavenet-B	MALE
Vietnamese (Vietnam)	Premium	vi-VN	vi-VN-Wavenet-C	FEMALE
Vietnamese (Vietnam)	Premium	vi-VN	vi-VN-Wavenet-D	MALE
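Because the table is plain tab-separated text, it can be filtered with a few lines of Python. The following is a minimal sketch that assumes the five columns shown in the header row; the `Voice` class, the `family` property, and `parse_voice_rows` are hypothetical names introduced here, and only four sample rows from the table are included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Voice:
    language: str
    voice_type: str
    language_code: str
    name: str
    ssml_gender: str

    @property
    def family(self):
        # Voice names look like "<lang>-<region>-<family>-<id>", e.g. da-DK-Neural2-D.
        return self.name.split("-")[2]


def parse_voice_rows(tsv_text):
    """Parse tab-separated rows (without the header) into Voice records."""
    voices = []
    for line in tsv_text.strip().splitlines():
        fields = [field.strip() for field in line.split("\t")]
        if len(fields) == 5:  # skip anything that is not a complete row
            voices.append(Voice(*fields))
    return voices


# A few rows copied from the table above.
SAMPLE_ROWS = (
    "Danish (Denmark)\tPremium\tda-DK\tda-DK-Neural2-D\tFEMALE\n"
    "Danish (Denmark)\tStandard\tda-DK\tda-DK-Standard-C\tMALE\n"
    "English (UK)\tStudio\ten-GB\ten-GB-Studio-B\tMALE\n"
    "Thai (Thailand)\tStandard\tth-TH\tth-TH-Standard-A\tFEMALE\n"
)

voices = parse_voice_rows(SAMPLE_ROWS)
danish_male = [v.name for v in voices if v.language_code == "da-DK" and v.ssml_gender == "MALE"]
print(danish_male)
```

The same filter works on the full table if its rows are pasted in place of `SAMPLE_ROWS`.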

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2024-08-09 UTC.

Google’s AI surprise: Gemini Live speaks like a human, taking on ChatGPT Advanced Voice Mode


Google sometimes feels like it’s playing catchup in the generative AI race to rivals such as Meta, OpenAI, Anthropic and Mistral — but not anymore.

Today, the company leapfrogged most others by announcing Gemini Live, a new voice mode for its AI model Gemini through the Gemini mobile app, which allows users to speak to the model in plain, conversational language and even interrupt it and have it respond back with the AI’s own humanlike voice and cadence. Or as Google put it in a post on X: “You can now have a free-flowing conversation, and even interrupt or change topics just like you might on a regular phone call.”

We’re introducing Gemini Live, a more natural way to interact with Gemini. You can now have a free-flowing conversation, and even interrupt or change topics just like you might on a regular phone call. Available to Gemini Advanced subscribers. #MadeByGoogle pic.twitter.com/eNjlNKubsv — Google (@Google) August 13, 2024

If that sounds familiar, it’s because OpenAI in May demoed its own “Advanced Voice Mode” for ChatGPT, which it openly compared to the talking AI operating system from the movie Her, only to delay the feature and begin rolling it out selectively to alpha participants late last month.

Gemini Live is now available in English on the Google Gemini app for Android devices through a Gemini Advanced subscription ($19.99 USD per month), with an iOS version and support for more languages to follow in the coming weeks.

In other words: even though OpenAI showed off a similar feature first, Google is set to make its version available to a much wider potential audience (more than 3 billion active users on Android and 2.2 billion iOS devices) much sooner than ChatGPT’s Advanced Voice Mode.

Yet OpenAI may have delayed ChatGPT Advanced Voice Mode partly because its own internal “red-teaming,” or controlled adversarial security testing, showed that the voice mode sometimes engaged in odd, disconcerting, and even potentially dangerous behavior, such as mimicking the user’s own voice without consent, which could be used for fraud or other malicious purposes.

How is Google addressing the potential harms caused by this type of tech? We don’t really know yet, but VentureBeat reached out to the company to ask and will update when we hear back.

What is Gemini Live good for?

Google pitches Gemini Live as offering free-flowing, natural conversation that’s good for brainstorming ideas, preparing for important conversations, or simply chatting casually about “various topics.” Gemini Live is designed to respond and adapt in real-time.

Additionally, this feature can operate hands-free, allowing users to continue their interactions even when their device is locked or running other apps in the background.

Google further announced that the Gemini AI model is now fully integrated into the Android user experience, providing more context-aware assistance tailored to the device.

Users can access Gemini by long-pressing the power button or saying, “Hey Google.” This integration allows Gemini to interact with the content on the screen, such as providing details about a YouTube video or generating a list of restaurants from a travel vlog to add directly into Google Maps.

In a blog post, Sissie Hsiao, Vice President and General Manager of Gemini Experiences and Google Assistant, emphasized that the evolution of AI has led to a reimagining of what it means for a personal assistant to be truly helpful. With these new updates, Gemini is set to offer a more intuitive and conversational experience, making it a reliable sidekick for complex tasks.

Python Voice Assistant Tutorial

This Python voice assistant tutorial series is designed to teach you how to create a Python voice assistant using the Google Text-to-Speech module as well as the speech recognition module.

What You’ll Learn

This series is packed full of valuable information. You will learn and understand the following after this tutorial:

How to use the Google Text-to-Speech module (gTTS)
How to use the speech recognition module
How to automate tasks with Python
How to scrape information off the web with web scrapers

Prerequisites

This is NOT a beginner tutorial, and I will not be teaching Python syntax.

Experience With Python 3 Syntax
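As a rough idea of how the two modules named above fit together, here is a minimal sketch, not the tutorial's own code. It assumes the third-party packages gTTS and SpeechRecognition are installed (microphone input also needs PyAudio). The reply rules in `choose_reply` and the file name `reply.mp3` are hypothetical, chosen only for illustration.

```python
from datetime import datetime


def choose_reply(command: str) -> str:
    """Map recognized speech to a reply (hypothetical rules for illustration)."""
    command = command.lower().strip()
    if "your name" in command:
        return "I am a demo assistant."
    if "time" in command:
        return "It is " + datetime.now().strftime("%H:%M")
    return "Sorry, I did not understand that."


def speak(text: str, path: str = "reply.mp3") -> str:
    """Synthesize text to an MP3 file with gTTS and return the file path."""
    from gtts import gTTS  # third-party: pip install gTTS (needs internet access)
    gTTS(text=text, lang="en").save(path)
    return path


def listen() -> str:
    """Record one phrase from the microphone and return the recognized text."""
    import speech_recognition as sr  # third-party: pip install SpeechRecognition
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        audio = recognizer.listen(source)
    try:
        return recognizer.recognize_google(audio)
    except (sr.UnknownValueError, sr.RequestError):
        # Speech was unintelligible or the recognition service was unreachable.
        return ""


def run_once() -> str:
    """One listen -> decide -> speak cycle; returns the path of the spoken reply."""
    return speak(choose_reply(listen()))
```

Calling `run_once()` records a phrase, picks a reply, and writes the spoken answer to an MP3 file. Playing that file back is left to whatever audio player is available.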

Use google assistant as Text-to-speech output

Is this possible? The Google Assistant voice is much more natural than the embedded Android solution.

Why ChatGPT’s Speech to Text Is the Best I’ve Ever Used

Quick Links

ChatGPT Is Better Than Google’s Voice Typing
What Makes ChatGPT’s Speech-to-Text Function So Good
Note-Taking Using ChatGPT on My Phone
Voice Typing Using ChatGPT on My Desktop

Key Takeaways

ChatGPT’s speech-to-text is superior to Google’s, eliminating the need to say punctuation out loud.
OpenAI's Whisper neural network powers ChatGPT's highly accurate transcription, although it is not yet integrated with keyboard apps.
Use ChatGPT effortlessly on Android, iPhone, macOS, and soon on Windows for efficient note-taking and transcription.

You have to speak it to believe it; ChatGPT’s fantastic speech-to-text function, that is. It’s proved to be far smoother and more precise than some of the most established voice-to-text apps.

Google’s voice typing is a tool I’ve used on and off for years. It comes with the SwiftKey keyboard app and Google’s own Gboard keyboard for mobile phones. It was good for a time—above average, in fact—but not anymore. ChatGPT has leaped ahead of the competition, and the results are slick.

If you’ve ever used Google’s voice typing, you will know how awkward it is to say “comma” or “period” out loud each time you want to add punctuation to your text. In ChatGPT, there’s no need. You can speak as naturally as if you’re having a chat with your friend, and it will effortlessly add punctuation where you would expect it to go.

This makes a huge difference. Take this sentence, for example: “I want to go to the supermarket and buy apples, oranges, watermelon, pears, and cherries.” To dictate it using Google’s voice typing, you would need to say “...apples comma oranges comma watermelon comma pears comma and cherries.” Repeating the word “comma” four times is clunky and unnatural.
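To make the contrast concrete, here is a small sketch of what dictation with spoken punctuation amounts to: the recognizer hears the word and swaps it for a symbol. This is an illustration only, not how Google's voice typing is actually implemented. The `SPOKEN_PUNCTUATION` table and the function name are hypothetical.

```python
import re

# Hypothetical table of spoken words and the symbols they stand for.
SPOKEN_PUNCTUATION = {
    "comma": ",",
    "period": ".",
    "question mark": "?",
}


def apply_spoken_punctuation(dictated: str) -> str:
    """Replace spoken punctuation words with their symbols."""
    text = dictated
    for word, symbol in SPOKEN_PUNCTUATION.items():
        # Swallow the whitespace before the spoken word so the symbol
        # attaches to the preceding word: "apples comma" -> "apples,".
        pattern = rf"\s+{re.escape(word)}\b"
        text = re.sub(pattern, symbol, text, flags=re.IGNORECASE)
    return text


dictated = "apples comma oranges comma watermelon comma pears comma and cherries"
print(apply_spoken_punctuation(dictated))
# prints: apples, oranges, watermelon, pears, and cherries
```

A model that infers punctuation from the audio itself skips this step entirely, which is why the speaker never has to say the word "comma".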

ChatGPT does an incredible job of converting speech to text thanks to Whisper, OpenAI's speech recognition neural network. OpenAI released it as an open-source model aimed at people wanting to develop this technology into useful applications. That brings us to a key sticking point: ChatGPT’s speech-to-text function is not yet integrated into something like a voice typing keyboard.
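Because the model is open source, developers can run it outside ChatGPT. The following is a minimal sketch, assuming the third-party `openai-whisper` package and ffmpeg are installed. The file name `memo.mp3` is hypothetical, and the optional `model` parameter is added here only so that the function can be exercised without downloading model weights.

```python
def transcribe(audio_path: str, model=None) -> str:
    """Return the transcript of an audio file as plain text."""
    if model is None:
        # Lazy import: only needed when no model object is passed in.
        import whisper  # third-party: pip install openai-whisper
        model = whisper.load_model("base")  # "base" is one of the published model sizes
    # transcribe() returns a dict; the full transcript is under the "text" key.
    result = model.transcribe(audio_path)
    return result["text"].strip()
```

Calling `transcribe("memo.mp3")` would load the model on first use and return punctuated text, which could then be pasted into a note-taking app or any other tool.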

Despite this, I have begun using it all the time in my workflow. Even though Google’s voice typing is easily accessible from my keyboard, I end up wasting a lot of time fixing its mistakes. At one point, I started speaking in short fragments—think robocalls and computerized speech—to help it pick up my speech better.

That’s why I am happily using ChatGPT’s speech-to-text with a small workaround. In the end, it’s going to save me far more time and effort, besides letting me talk naturally.

ChatGPT is available on Android, iPhone, and macOS (M1 and later).

Those using Windows computers can expect a desktop app for ChatGPT in late 2024.

I write notes for my articles using pen and paper. This is, ironically, a very low-tech approach for a tech writer! While I enjoy it, eventually I need to get those words into a digital format if they are going to be of any use to me.

I prefer to transfer my ideas to a note-taking app. Google Keep, for example, is good because it automatically syncs your notes online and between devices. Or there’s Obsidian, my new favorite way to organize my thoughts into long-term storage. In the long run, it’s best to aim for a note-taking app that works on any device for added convenience.

My process is simple. Open ChatGPT and hit the microphone button, then start speaking. After that, press stop to convert the audio to text. Finally, copy the text and paste it into a note-taking app.

At my desktop, I follow the same process. The app looks nearly identical to the mobile version, so you simply need to press the microphone button to start recording, then press the tick button when you’re done. After this, you can copy the text to where it needs to go, such as a Word document or an email.

ChatGPT macOS app audio recording window

Sometimes it’s good to have a record of your transcription directly in ChatGPT. In that case, you can add the line, “Do not comment:” immediately before the transcribed text, then hit enter to add it to ChatGPT’s conversation feed. This stops ChatGPT from replying with a long-winded answer, with the added benefit of maintaining a record of your transcriptions.

There are plenty of things you can do with ChatGPT besides converting speech to text, making it a nice multipurpose app to have on hand.

It won’t be long before this speech-to-text AI model makes its way into voice typing apps or transcription tools. Until then, you can use ChatGPT to produce clean and accurate transcriptions for spots of note-taking, brainstorming, or dictation.

IMAGES

Google Speech/Voice to Text in Android Studio Tutorial (Kotlin)
How to use Google Assistant's new text-to-speech feature on Android
Google Assistant's Text-to-Speech Feature on Android Is Live
How to Use Google's 6 Voice Assistant On Android iOS Smartphone
Use Google Cloud Speech-to-Text in Home Assistant
Google Launches New Text-to-Speech Cloud Service

COMMENTS

Use Google Assistant to type with your voice
Important: The text you speak stays on your device and isn't sent to Google servers except when you use the "Fix it" feature.
Types of voices
WaveNet voices. The Text-to-Speech API also offers a group of premium voices generated using a WaveNet model, the same technology used to produce speech for Google Assistant, Google Search, and Google Translate. WaveNet technology provides more than just a series of synthetic voices: it represents a new way of creating synthetic speech.
How to Use Google's Text-to-Speech Feature on Android
How to Manage Android Text-to-Speech Voices and Options Android gives you some control over the language and voice used to read text aloud via Select to Speak. It's easy to change the language, accent, pitch, or speed of the synthesized text voice.
How to use Google Assistant voice typing with Gboard
With recent Pixels, you can use the combination of Gboard and Assistant to speed up dictation. Find out how to enable the feature.
Choose the voice of your Google Assistant
On your Android phone or tablet, open the Google app. At the bottom right, tap More > Settings > Voice > Language. Choose a language. Important: If your Google Assistant reads text messages in the wrong language, remove English as a secondary language. Then change your Google Assistant's voice language to match your Google Assistant's language.
Speech Recognition & Synthesis
To use Google Speech-to-Text functionality on your Android device, go to Settings > Apps & notifications > Default apps > Assist App. Select Speech Recognition and Synthesis from Google as your preferred voice input engine. Google Text-to-Speech functionality. Speech Services powers applications to read the text on your screen aloud.
Gemini Live: How to access Google's new voice assistant
Gemini Live, Google's new voice mode for its AI assistant, is rolling out to Android users, but you'll need the right subscription to access it.
How to Use Google Text-to-Speech on an Android Phone
You can use Google text-to-speech on an Android phone to help you hear text instead of reading it, and catch grammatical oddities in your own writing.
What is Gemini Live? Google's new AI voice chat tool explained
Google has announced the rollout of Gemini Live, a new voice chat tool enabling users to have human-like conversations with the artificially intelligent assistant. The new feature is rolling out ...
Introducing Cloud Text-to-Speech powered by DeepMind ...
Many Google products (e.g., the Google Assistant, Search, Maps) come with built-in high-quality text-to-speech synthesis that produces natural sounding speech. Developers have been telling us they'd like to add text-to-speech to their own applications, so today we're bringing this technology to Google Cloud Platform with Cloud Text-to-Speech.
How to Modify Google Text-to-Speech Voices
While Google focuses on the Assistant, Android owners shouldn't forget about the Text-to-Speech (TTS) accessibility feature. It'll convert text from your Android apps, but you might need to modify it to get the speech to sound the way you want it. Modifying Text-to-Speech voices is easily done from the Android accessibility settings menu.
Google Upgrades Text-to-Speech Voices on Android
Google is rolling out an enhanced set of models for its Speech Services by Google to make the Android app with text-to-speech voices easier to understand and more human-like. The improved clarity and more qualitative "human" improvements can be heard in the comparison below.
Google Assistant gets a new voice as Gemini AI comes to Google Home
New voices for Google Assistant will offer a more conversational experience with better understanding and follow-up capabilities. Google appears to be revamping its smart home portfolio, with two ...
WaveNet launches in the Google Assistant
To understand why WaveNet improves on the current state of the art, it is useful to understand how text-to-speech (TTS) - or speech synthesis - systems work today. The majority of these are based on so-called concatenative TTS, which uses a large database of high-quality recordings, collected from a single voice actor over many ...
Google Unveils Gemini Live Voice Assistant to Rival ChatGPT Voice Mode
Google has unveiled Gemini Live, a conversational voice assistant that's set to rival OpenAI's Voice Mode. Available through the Gemini app on Android and iOS, the new Live feature allows users to interact with the AI using their voice. Powered by Google's Gemini 1.5 Flash model, the Live ...
Is Gemini Live worth $20? Not while Google's best digital assistant is free
Google's new Gemini Live hyper-realistic text-to-speech model is parked behind a $20 paywall. Is it worth it?
Use Google Assistant to type with your voice
To activate Assistant voice typing, open any app that you can type with and tap on the Keyboard mic . To keep the microphone on and send multiple messages in a row: Double tap Keyboard mic . To turn the microphone off: Tap Keyboard mic , close the keyboard, or say "Stop." Say the text you want to type.
Android How To: Make Text To Speech Match Google Assistant Voice
What is going on everyone? Leon checkin' in & we are at it again with another video! In today's video we'll be demonstrating how to make Text To Speech Settin...
Google Voice
Google Voice lets you make and receive calls, texts, and voicemails with one number. Learn how to set up, use, and manage your Google Voice account.
How to try Google's new Gemini Live AI assistant for free
On Tuesday at Made by Google, Google finally released Gemini Live, its advanced mobile conversational experience that enables users to have free-flowing conversations with an AI assistant, or ...
Google Assistant's voice is changing
This week, Google announced plans to update the voice assistant using Gemini, to make it more natural and conversational - but after hearing it, I'm not completely sold.
What is Google Read Aloud, what does it do, and how does it work?
Google Assistant's built-in feature is great for hands (and eyes) free reading, accessibility, and custom read-aloud speed. Here's how it works.
Gemini Live, Google's answer to ChatGPT's Advanced Voice Mode, launches
Google's answer to ChatGPT's Advanced Voice Mode, Gemini Live, is rolling out months after it was first announced.
Google Gemini's voice chat mode is here
Google is rolling out a new voice chat mode for Gemini, called Gemini Live, the company announced at its Pixel 9 event today. Available for Gemini Advanced subscribers, it works a lot like ChatGPT ...
Supported voices and languages
Supported voices and languages. Text-to-Speech provides the following voices. The list includes Neural2, Studio, Standard, and WaveNet voices. Studio, Neural2 and WaveNet voices are higher quality voices with different pricing; in the list, they have the voice type 'Neural2', 'Studio' or 'WaveNet'. To use these voices to create synthetic speech ...
Google's new Gemini Live rivals ChatGPT Advanced Voice Mode
Even though OpenAI showed off a similar feature first, Google is set to make it more available to a much wider potential audience sooner.
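For readers who want the more natural voices from their own code, several entries above point to Cloud Text-to-Speech and its WaveNet voices. The sketch below only builds a request body for the service's REST `text:synthesize` method and decodes the audio from a response. Authentication and the HTTP call are deliberately left out. The voice name `en-US-Wavenet-D` is an example that should be checked against the supported voices list, and the function names are hypothetical.

```python
import base64
import json

# REST endpoint for the synthesize method (the request itself is not sent here).
SYNTHESIZE_URL = "https://texttospeech.googleapis.com/v1/text:synthesize"


def build_request(text: str,
                  voice_name: str = "en-US-Wavenet-D",  # example voice, an assumption
                  language_code: str = "en-US",
                  speaking_rate: float = 1.0,
                  pitch: float = 0.0) -> str:
    """Return the JSON body for a synthesize request."""
    body = {
        "input": {"text": text},
        "voice": {"languageCode": language_code, "name": voice_name},
        "audioConfig": {
            "audioEncoding": "MP3",
            "speakingRate": speaking_rate,  # 1.0 is normal speed
            "pitch": pitch,                 # 0.0 is the voice's default pitch
        },
    }
    return json.dumps(body)


def extract_audio(response_json: str) -> bytes:
    """Decode the base64 audio carried in the response's audioContent field."""
    return base64.b64decode(json.loads(response_json)["audioContent"])
```

The body would be sent as a POST to `SYNTHESIZE_URL` with valid Google Cloud credentials. The bytes returned by `extract_audio` can be written straight to an `.mp3` file.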
