The major challenge with Google Cloud Speech-to-Text is that not every call is clear. Our representative may be in a quiet environment, but the client can be anywhere, so we need to manage background noise on every call, and audio clarity remains a challenge. Noise sometimes affects speaker diarization, causing speech to be attributed to the wrong speaker. These issues force us to run an AI over the complete Speech-to-Text output to correct it. The output can also contain grammatical errors, especially since many of our clients speak Spanish; sometimes the transcription comes back in English when it should be in Spanish, and the AI can correct that as long as we provide the right context. A crucial update would be built-in autocorrection of the output, because irrelevant words occasionally appear. In our integration, we post-process the output with an AI to fix grammar mistakes and other issues. If that capability were built into Google Cloud Speech-to-Text, it would remove the need for a third-party process, and we could simply call the API without formatting or correcting the output separately.
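For context on the diarization and language issues described above, the Speech-to-Text v1 REST API does expose knobs for both: a `diarizationConfig` block for speaker separation and `alternativeLanguageCodes` for mixed English/Spanish audio. The following is a minimal sketch of a `speech:recognize` request body using those documented fields; the bucket URI, speaker counts, and encoding values are illustrative assumptions, not the reviewer's actual settings.

```python
import json

# Sketch of a v1 speech:recognize request body. The GCS URI is a
# placeholder; speaker counts and encoding are illustrative.
request_body = {
    "config": {
        "encoding": "LINEAR16",
        "sampleRateHertz": 8000,
        "languageCode": "en-US",
        # Let the recognizer fall back to Spanish when the caller speaks it.
        "alternativeLanguageCodes": ["es-US"],
        # Ask for speaker labels so each recognized word carries a speakerTag.
        "diarizationConfig": {
            "enableSpeakerDiarization": True,
            "minSpeakerCount": 2,  # representative + client
            "maxSpeakerCount": 2,
        },
    },
    "audio": {"uri": "gs://example-bucket/call.wav"},
}

print(json.dumps(request_body, indent=2))
```

This body would be POSTed to `https://speech.googleapis.com/v1/speech:recognize` with an OAuth token; diarized output then tags each word in the response with a `speakerTag`, which is what downstream correction logic would consume.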
The tool's telephony model does not produce accurate results. With the telephony model, we want to transcribe phone calls live, but this is not possible in our Google module since it does not support the Cloud Translation V2 API. For pre-recorded calls it does give you the transcript, but in some cases, if the call is put on hold for five, ten, or fifteen minutes, whatever we spoke about afterward does not get recorded, which is an issue.
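For reference, the telephony model the reviewer mentions is selected through the `model` and `useEnhanced` fields of the v1 recognition config. This is a hedged sketch of such a config for 8 kHz phone audio; the encoding and sample rate are illustrative assumptions, and the streaming wrapper shows only the request shape, not a live session.

```python
# Sketch of a recognition config selecting the phone-call model.
# "model" and "useEnhanced" are documented v1 fields; the encoding
# and sample rate are illustrative assumptions for telephony audio.
telephony_config = {
    "encoding": "MULAW",
    "sampleRateHertz": 8000,
    "languageCode": "en-US",
    "model": "phone_call",  # model variant tuned for telephone audio
    "useEnhanced": True,    # opt in to the enhanced phone_call model
}

# For live transcription, the same config rides inside a streaming
# request with interim (partial) results enabled.
streaming_request = {
    "streamingConfig": {"config": telephony_config, "interimResults": True}
}
print(streaming_request["streamingConfig"]["config"]["model"])
```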
Google Cloud Speech-to-Text's price could be improved. Its trial experience could also be improved by adding extra minutes to the trial version.
Director of Research and Regulatory Affairs at SafetySpect Inc
Real User
May 30, 2023
One thing I find is that I often use specialized terms, and the solution doesn't know them. I'm not sure there's an easy way around this, so I usually have to delete the text and correct it manually. It would be good if I could just tell it to change the spelling, and it would be smart enough to figure that out.
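A partial workaround for unrecognized specialized terms is the API's speech adaptation feature: `speechContexts` phrase hints bias recognition toward domain vocabulary. This is a sketch using the documented v1 fields; the phrases and boost value are hypothetical examples, not the reviewer's actual terms.

```python
# Sketch: biasing recognition toward domain vocabulary with phrase hints.
# The phrases and boost value below are hypothetical illustrations.
adapted_config = {
    "languageCode": "en-US",
    "speechContexts": [
        {
            # Specialized terms the recognizer would otherwise mishear.
            "phrases": ["hyperspectral", "fluorescence imaging", "regulatory affairs"],
            "boost": 15.0,  # how strongly to favor these phrases
        }
    ],
}
print(adapted_config["speechContexts"][0]["phrases"])
```

This does not teach the model new spellings on the fly, but it reduces how often known domain terms come back wrong in the first place.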
The multilanguage support for the chatbot needs to be better. If translation were incorporated natively into the chatbot, you could build a chatbot in English and automatically have the same chatbot in all other languages, since Google's translation services are good. The translation is not bad for technical content. If they offered a multilanguage chatbot, it would be ideal.
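Pending native chatbot translation, a common stopgap is to route each bot reply through Cloud Translation v2 yourself. This is a sketch of the v2 request body (POSTed to `https://translation.googleapis.com/language/translate/v2` with an API key); the reply text and language pair are illustrative assumptions.

```python
# Sketch of a Cloud Translation v2 request body for translating a
# chatbot reply. The text and language pair are illustrative.
translate_request = {
    "q": "How can I help you today?",  # bot reply authored in English
    "source": "en",
    "target": "es",
    "format": "text",  # plain text, not HTML
}
print(translate_request["target"])
```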
Google Speech-to-Text enables developers to convert audio to text by applying powerful neural network models in an easy-to-use API. The API recognizes 120 languages and variants to support your global user base. You can enable voice command-and-control, transcribe audio from call centers, and more. It can process real-time streaming or prerecorded audio, using Google’s machine learning technology.
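The easy-to-use API the description refers to can be made concrete with the smallest possible batch request: a config plus inline base64 audio. The audio bytes below are fake placeholder data, and the config values are illustrative, so this shows only the request shape.

```python
import base64

# Minimal sketch of a batch speech:recognize request with inline audio.
# The audio bytes are fake placeholder data, not a real recording.
fake_audio = b"\x00\x01" * 4
recognize_request = {
    "config": {
        "encoding": "LINEAR16",
        "sampleRateHertz": 16000,
        "languageCode": "en-US",
    },
    # Inline audio must be base64-encoded in the JSON body.
    "audio": {"content": base64.b64encode(fake_audio).decode("ascii")},
}
print(recognize_request["config"]["languageCode"])
```

Prerecorded audio uses this batch shape; real-time use swaps it for the streaming variant of the same config.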