What is our primary use case?
I can speak to my experience with SQL Server. We don't use the reporting services within SQL Server; we use it for heavy-duty engineering inside an AI engine.
We don't use SQL Server Data Warehouse or SQL Server Management. We just use SQL Server extensively. We use Office, but in terms of engineering products, the major use right now is SQL Server.
We use TensorFlow, which is different from TensorLeap. We use Google Cloud extensively; we use their AI, STT, and TTS capabilities.
We use these two products to talk to our users, and we have an AI meaning engine on the back end. Once we get the speech, we can tell what it means.
Accuracy with Text-to-Speech is important. If you want to talk to your computer, using Google is the way to do it, but it's not without problems. Natural intonation and pacing in communications have gotten hugely better over the last two or three years.
We're following Rosalind Picard's research at MIT on these kinds of issues, and we plan to implement it. This applies to both Text-to-Speech and Speech-to-Text because it's a conversation.
What is most valuable?
When we tested Google Cloud Speech-to-Text a few years ago, it was better than the equivalent IBM products because it was more accurate. However, it's not entirely accurate, so we have to correct for those errors in our AI software.
Google Cloud Speech-to-Text is 100 out of 100 when it works, and when it doesn't work, which is fairly often, it gets a zero. It doesn't fail gracefully; it fails in an unexpected way.
What needs improvement?
Google Cloud Speech-to-Text is not entirely accurate, so we have to correct for those errors in our AI software. It relies on neural networks, and in our experience that stochastic processing is only 70% to 75% accurate. It gets things wrong too often, and since I personally work with this, I don't appreciate that. However, it seems to be the best option currently.
We have to write our own improvements because their tools to improve transcription accuracy in our domain aren't very powerful.
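As a rough illustration of this kind of downstream correction, the sketch below snaps near-miss words in a transcript onto a domain vocabulary. It is only an assumption of how such a step might look, not the engine described in this review: the `DOMAIN_TERMS` list, the `correct_transcript` function, and the 0.8 cutoff are all hypothetical, and the fuzzy matching comes from Python's standard `difflib` module rather than from any Google tool.

```python
import difflib

# Hypothetical domain vocabulary; a real system would load its own terms.
DOMAIN_TERMS = ["tensorflow", "intonation", "utterance", "transcription"]


def correct_transcript(text, vocabulary=DOMAIN_TERMS, cutoff=0.8):
    """Replace words that closely resemble a domain term with that term.

    Words with no sufficiently close match are left unchanged.
    """
    corrected = []
    for word in text.split():
        # get_close_matches returns the best candidates above the cutoff.
        matches = difflib.get_close_matches(
            word.lower(), vocabulary, n=1, cutoff=cutoff
        )
        corrected.append(matches[0] if matches else word)
    return " ".join(corrected)


if __name__ == "__main__":
    # "utterence" stands in for a misrecognized word from the recognizer.
    print(correct_transcript("the utterence was clear"))
```

A string-similarity pass like this only catches spelling-level misses; correcting by meaning, as described above, would need considerably more than this.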
The timestamp technology for recognized words is inadequate, so we don't use it. We understand words based on their meaning, and we have a whole AI engine that does that, which is one of our differentiators from a product standpoint.
We didn't use the custom voice creation feature; we just use their voices, which are fine for our purposes.
For how long have I used the solution?
I have dealt with Google Cloud Speech-to-Text for five years.
What was my experience with deployment of the solution?
On development machines, it works great out of the box, which is terrific and one of the reasons we chose them. However, in the cloud, we encounter numerous errors.
How are customer service and support?
The support for Google Cloud Speech-to-Text is inadequate. We deal with Google's support during development, and there is significant finger-pointing over who owns a problem. Moving our systems to Google Cloud while maintaining the same performance as on our development machines is problematic.
I cannot give their support a score: when issues arise and our developers report unintelligible error messages, we get no effective help from Google. Compared with the support we receive from Amazon, Google's is substantially inferior.
Without considering support, the product rating would be closer to ten, but including support changes everything significantly.
Which solution did I use previously and why did I switch?
We evaluated IBM and Nuance as alternative options. The main differences between these solutions come down to accuracy. It doesn't matter what you do with the utterance if you don't recognize it. These other products didn't recognize speech as effectively as Google Cloud Speech-to-Text does.
What was our ROI?
Pricing is not our priority. Google's prices were competitive, but our primary concern is functionality, and currently the competitors don't work well enough for us to consider them.
What's my experience with pricing, setup cost, and licensing?
Regarding pricing and licensing for Google Cloud Speech-to-Text, we had no other viable choices, so we cannot effectively judge whether it is well priced or badly priced.
Which other solutions did I evaluate?
We evaluated IBM and Nuance as alternative options.
What other advice do I have?
Complex conversations require more work. We measure the effectiveness of automated customer interactions using abandonment rates and internal metrics we've developed for how well conversational utterances are understood and for the quality of user interaction. Currently, the performance is disappointingly low.
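For readers unfamiliar with the abandonment-rate metric, here is a minimal sketch of one common way to compute it. The session records, the `completed` field, and the `abandonment_rate` function are hypothetical and shown only for illustration; the internal metrics mentioned above are not described in this review and are not reproduced here.

```python
def abandonment_rate(sessions):
    """Return the fraction of sessions that ended before the task completed."""
    if not sessions:
        return 0.0  # Avoid dividing by zero when there is no data yet.
    abandoned = sum(1 for s in sessions if not s["completed"])
    return abandoned / len(sessions)


# Hypothetical session log: each record marks whether the user finished.
sessions = [
    {"id": 1, "completed": True},
    {"id": 2, "completed": False},
    {"id": 3, "completed": False},
    {"id": 4, "completed": True},
]

if __name__ == "__main__":
    print(f"Abandonment rate: {abandonment_rate(sessions):.0%}")
```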
We haven't implemented multi-language capability yet, though we're planning to try it as it's a significant advantage for our products.
We use Google Cloud rather than AWS Marketplace for these products. Google Cloud Text-to-Speech sounds incredibly natural, which is impressive. The improvement in naturalness is satisfying, though the support needs significant improvement.
This product is certainly suitable for enterprise environments.
The overall rating for this solution is 8 out of 10.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Google