Amazon Polly vs Microsoft Azure Speech Service comparison

Amazon Polly (Amazon Web Services): 3,965 views | 2,710 comparisons | 100% willing to recommend
Microsoft Azure Speech Service (Microsoft): 2,222 views | 1,894 comparisons | 100% willing to recommend
Comparison Buyer's Guide
Executive Summary
Updated on Mar 6, 2024

We compared Amazon Polly and Microsoft Azure Speech Service based on our users' reviews across several parameters.

Amazon Polly is praised for its natural-sounding voices, flexible parameters, and excellent customer service, while Microsoft Azure Speech Service is known for accurate speech recognition, high-quality text-to-speech, and seamless integration with other Azure services. Amazon Polly users appreciate the pay-as-you-go pricing model, while Azure Speech Service users value the reasonable pricing and flexibility. Users have suggested improvements for both services, such as enhancing pronunciation accuracy and better support for non-English languages for Amazon Polly, and improving accuracy and comprehension of complex phrases for Microsoft Azure Speech Service. Despite these suggestions for enhancements, both services have provided positive returns on investment for users, with Amazon Polly enhancing text-to-speech capabilities and increasing efficiency, and Microsoft Azure Speech Service improving productivity and customer experience.

Features: Amazon Polly is applauded for its natural-sounding voices, pronunciation accuracy, flexible parameter adjustment, and easy integration. In contrast, Microsoft Azure Speech Service excels in accurate speech recognition, high-quality text-to-speech conversion, seamless integration with Azure services, and impressive natural and human-like output.

Pricing and ROI: Users describe the setup cost for both Amazon Polly and Microsoft Azure Speech Service as straightforward and hassle-free. They find Amazon Polly's pricing reasonable and appreciate the flexible pay-as-you-go model, and they consider Microsoft Azure Speech Service's pricing competitive and its licensing flexible. Users report that both services have had a positive impact on return on investment. Amazon Polly is praised as a cost-effective solution with high-quality voice output, while Microsoft Azure Speech Service is commended for its seamless integration, accurate transcription, and effective voice recognition.

Room for Improvement: Amazon Polly could enhance its pronunciation accuracy and address audio distortion issues, while also improving the setup process and expanding the variety of voices and languages available. On the other hand, Microsoft Azure Speech Service users suggest enhancing accuracy and comprehension of complex phrases, better support for non-English languages, and adding real-time analysis functionality. They also mention the need for an improved pricing model for more flexibility.

Deployment and customer support: Reported deployment times varied. Some users spent a week on both deployment and setup with Amazon Polly, while others mentioned several months for deployment and a week for setup with Microsoft Azure Speech Service. Customers have expressed overall satisfaction with the customer service and support from both Amazon Polly and Microsoft Azure Speech Service. Representatives from both platforms are described as highly responsive and knowledgeable, providing prompt and effective assistance.

The summary above is based on 1 interview we conducted recently with Amazon Polly and Microsoft Azure Speech Service users. To access the review's full transcripts, download our report.

Featured Review
Raed Gharzeddine
Quotes From Members
We asked business professionals to review the solutions they use.
Here are some excerpts of what they said:
Pros
"Amazon Polly is useful because it's helpful to hear the words on top of it when I can't take in information in a general way. Sometimes, it's very taxing if I'm trying to read cases. They have the neural voices, and they're so realistic. You don't even know that a person is not reading to you, making things much better. I know that they do have the ability to provide you with your own lexicon that's personal to you. I like that you can adjust the pitch and the speed of the voice because some people talk way too fast. Or if you're reading, I read slowly, so that's always helpful. One of the functions that I find helpful is that when reading material on the web, it's like it has its own browser. You go to the URL, and you don't have to read the whole thing, and you can stick the cursor on the place where you want it to start. Then if you want it to skip over something, you put it somewhere else, and that's ideal for reading case law because you skip around a lot. You don't really read it from start to finish. It helps if someone's going to read all those citations because they definitely want to be able to skip that."

"Useful text-to-speech and speech-to-text features."

Cons
"The price could be better. I wish it weren't so expensive to do because it's really cool. I would love to see them have lexicon packages of them like, this is for lawyers, this is for accountants, and it's going to have a lot of things in it. I also think they could do a better job at showing use cases other than telemarketing or contact center stuff like bots that are very commercial. I know that's where the money is, but it's such a huge hole that's missing for people with disabilities that are even worse than mine. Some people cannot see or hear at all, but they're not just cognitively impaired."

"Lacks a voice recording option."

Pricing and Cost Advice
  • "The price could be better. Neural voices are so realistic, and I want to say that they have it so that you can try to tell where the voice is coming from or something like that. But if I have more than one, it's so expensive to have to listen to a bunch of cases on my phone and have the neural voice read to me. It really wouldn't be worth it. It'd be paying probably more than what I make in the case. Right now, I'm on the free tier, and I think the number of minutes that you get is reasonable as long as you're not doing this all the time and you're using it judiciously. I have some credits that I think I can use, but I don't know how fast they'll go through."

    Information Not Available
    Questions from the Community
    Top Answer: Useful text-to-speech and speech-to-text features.
    Top Answer: There is an open source version but once you choose to deploy, they charge a per-minute fee for speech-to-text, and per number of words for text-to-speech. It's quite an expensive product.
    Top Answer: An additional feature I'd like to see would be the option for voice recording. It would be helpful for us to have that possibility.
    Ranking
    Metric                      Amazon Polly    Microsoft Azure Speech Service
    Ranking                     2nd             3rd
    Views                       3,965           2,222
    Comparisons                 2,710           1,894
    Reviews                     0               1
    Average Words per Review    0               282
    Rating                      N/A             8.0
    Also Known As
    Azure Speech Service, MS Azure Speech Service
    Overview

    Amazon Polly is a service that turns text into lifelike speech, allowing you to create applications that talk, and build entirely new categories of speech-enabled products. Polly's Text-to-Speech (TTS) service uses advanced deep learning technologies to synthesize natural sounding human speech. With dozens of lifelike voices across a broad set of languages, you can build speech-enabled applications that work in many different countries.

    In addition to Standard TTS voices, Amazon Polly offers Neural Text-to-Speech (NTTS) voices that deliver advanced improvements in speech quality through a new machine learning approach. Polly’s Neural TTS technology also supports two speaking styles that allow you to better match the delivery style of the speaker to the application: a Newscaster reading style that is tailored to news narration use cases, and a Conversational speaking style that is ideal for two-way communication like telephony applications.

    Finally, Amazon Polly Brand Voice can create a custom voice for your organization. This is a custom engagement where you will work with the Amazon Polly team to build an NTTS voice for the exclusive use of your organization.

    With Microsoft Azure Speech Service, you can easily add real-time speech-to-text capabilities to your applications for scenarios like voice commands, conversation transcription, and call center log analysis.

    Tailor your speech recognition models to adapt to users’ speaking styles, expressions, and unique vocabularies, and to accommodate background noises, accents, and voice patterns.

    Build smart apps and services that speak to users naturally with the Text to Speech service. Convert text to audio in near real time, and adjust the speed of speech, pitch, volume, and more.
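
Adjustments to speaking speed, pitch, and volume of the kind described here are commonly expressed in SSML (Speech Synthesis Markup Language, a W3C standard) through the `<prosody>` element. The following is a minimal Python sketch that only builds the markup string; it does not call either service. The function name `build_prosody_ssml` is hypothetical, and the bare `<speak>` wrapper is a simplification: each service documents its own required wrapper attributes and which prosody values a given voice supports.

```python
from xml.sax.saxutils import escape, quoteattr


def build_prosody_ssml(text, rate="medium", pitch="medium", volume="medium"):
    """Wrap text in an SSML <prosody> element (hypothetical helper).

    rate, pitch, and volume take SSML prosody values such as "slow",
    "high", or "loud". Support for each attribute varies by service
    and voice, so check the provider's SSML reference before use.
    """
    # quoteattr adds the surrounding quotes and escapes the attribute value.
    attrs = (
        f"rate={quoteattr(rate)} "
        f"pitch={quoteattr(pitch)} "
        f"volume={quoteattr(volume)}"
    )
    # escape() protects characters such as "&" and "<" in the spoken text.
    return f"<speak><prosody {attrs}>{escape(text)}</prosody></speak>"


if __name__ == "__main__":
    # Slow the reading down, as a reviewer who reads slowly might prefer.
    print(build_prosody_ssml("Smith & Jones v. Acme", rate="slow"))
```

The resulting string would then be passed to a service's synthesis call as SSML input rather than plain text, after adding whatever wrapper attributes that service requires.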

    Give your application a one-of-a-kind, recognizable brand voice using custom voice models. Simply record and upload training data, and the service will create a unique voice font tuned to your recording.

    Sample Customers
    Amazon Polly: GoAnimate, Duolingo, Bandwidth
    Microsoft Azure Speech Service: KPMG
    Top Industries
    VISITORS READING REVIEWS (Amazon Polly)
    Computer Software Company: 16%
    Financial Services Firm: 9%
    University: 8%
    Manufacturing Company: 7%
    VISITORS READING REVIEWS (Microsoft Azure Speech Service)
    Computer Software Company: 17%
    Financial Services Firm: 10%
    Manufacturing Company: 9%
    Government: 7%
    Company Size
    VISITORS READING REVIEWS (Amazon Polly)
    Small Business: 30%
    Midsize Enterprise: 16%
    Large Enterprise: 54%
    VISITORS READING REVIEWS (Microsoft Azure Speech Service)
    Small Business: 25%
    Midsize Enterprise: 14%
    Large Enterprise: 61%

    Amazon Polly is ranked 2nd in Text-To-Speech Services while Microsoft Azure Speech Service is ranked 3rd in Text-To-Speech Services with 1 review. Amazon Polly is rated 7.0, while Microsoft Azure Speech Service is rated 8.0. The top reviewer of Amazon Polly writes "A text to spoken audio solution with a realistic neural voice feature, but the price could be better". On the other hand, the top reviewer of Microsoft Azure Speech Service writes "Very useful and helpful text-to-speech and speech-to-text features". Amazon Polly is most compared with Google Cloud Text-to-Speech and IBM Watson Text To Speech, whereas Microsoft Azure Speech Service is most compared with Google Cloud Speech-to-Text, Amazon Transcribe, Google Cloud Text-to-Speech and IBM Watson Speech To Text.

    We monitor all Text-To-Speech Services reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.