

Google Cloud Dataflow and Amazon MSK compete in data processing and streaming solutions. Google Cloud Dataflow is noted for high scalability and ease of use, while Amazon MSK is recognized for robust real-time streaming features and reliability.
Features: Google Cloud Dataflow integrates smoothly with Google services, offers excellent scalability, and supports complex data processing. Amazon MSK provides real-time Kafka streaming, integrates easily with AWS services, and ensures strong security.
Room for Improvement: Google Cloud Dataflow could improve its real-time streaming capabilities and offer more granular control settings. Amazon MSK could enhance user-friendliness for non-AWS users and provide more straightforward deployment options. Both could benefit from additional third-party integrations.
Ease of Deployment and Customer Service: Google Cloud Dataflow is known for easy deployment within the Google Cloud ecosystem and responsive customer support. Amazon MSK offers detailed configuration options and a reliable support network, focusing on more controlled setups.
Pricing and ROI: Google Cloud Dataflow's transparent pricing allows flexible cost management, yielding strong ROI. Amazon MSK, while having higher initial costs, offers substantial ROI via efficient real-time data streaming. Dataflow's cost-effectiveness contrasts with MSK's investment in real-time streaming capabilities.
They can manage most of our queries, and for what they cannot manage, they guide us through the process of finding out.
Amazon's support is excellent.
The fact that no interaction is needed shows their great support since I don't face issues.
Google's support team is good at resolving issues, especially with large data.
Whenever we have issues, we can consult with Google.
The functionality for scaling comes out of the box and is very effective.
As a B2B enterprise client, our clientele consists of large-ticket clients but a small number of users.
Google Cloud Dataflow has auto-scaling capabilities, allowing me to add different machine types based on pace and requirements.
As a team lead, I'm responsible for handling five to six applications, but Google Cloud Dataflow seems to handle our use case effectively.
Google Cloud Dataflow can handle large data processing for real-time streaming workloads as they grow, making it a good fit for our business.
It doesn't require any maintenance on my end yet, as I haven't had any issues.
I have not encountered any issues with the performance of Dataflow, as it is stable and backed by Google services.
The job we built has not failed once over six to seven months.
The automatic scaling feature helps maintain stability.
The increase in cloud costs by 50% to 60% does not justify the savings.
The only issue with Amazon MSK that we are facing is the configurations.
I had to remove and drop all the clusters and recreate them again, which is complicated in a production environment.
Outside of Google Cloud Platform, it is difficult for others to use, and it may need to be promoted as a technology in its own right.
I feel they could introduce a feature that, when we have data in the tables, automatically creates a unique persona of the user, so we do not have to do that manually.
Dealing with a huge volume of data causes failure due to array size.
Once we started using Kafka, our cloud costs rose by 50% to 60%.
We use the Kafka M5 Large instance type, so our cost depends on the instances we run, plus storage and data transfer costs.
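The cost components named in this review (instances, storage, and data transfer) can be combined into a rough monthly estimate. The sketch below is only illustrative: every rate and quantity in it is a hypothetical placeholder, not an actual AWS price, and the names `MskCostInputs`, `monthly_cost`, and `increase_pct` are invented for this example.

```python
from dataclasses import dataclass

HOURS_PER_MONTH = 730  # common approximation for an average month


@dataclass
class MskCostInputs:
    """All values are hypothetical placeholders, not real AWS prices."""
    brokers: int
    broker_hourly_usd: float
    storage_gb_per_broker: float
    storage_usd_per_gb_month: float
    transfer_gb: float
    transfer_usd_per_gb: float


def monthly_cost(c: MskCostInputs) -> dict:
    """Split a monthly estimate into instance, storage, and transfer parts."""
    instance = c.brokers * c.broker_hourly_usd * HOURS_PER_MONTH
    storage = c.brokers * c.storage_gb_per_broker * c.storage_usd_per_gb_month
    transfer = c.transfer_gb * c.transfer_usd_per_gb
    return {
        "instance": round(instance, 2),
        "storage": round(storage, 2),
        "transfer": round(transfer, 2),
        "total": round(instance + storage + transfer, 2),
    }


def increase_pct(before: float, after: float) -> float:
    """Relative increase of a bill, in percent."""
    return round((after - before) / before * 100, 2)


# Hypothetical three-broker cluster.
estimate = monthly_cost(MskCostInputs(
    brokers=3,
    broker_hourly_usd=0.20,
    storage_gb_per_broker=1000,
    storage_usd_per_gb_month=0.10,
    transfer_gb=500,
    transfer_usd_per_gb=0.02,
))
print(estimate)
print(increase_pct(before=1000.0, after=1550.0))
```

Separating the three components makes it easier to see which one drives a rise in the overall bill, such as the 50% to 60% increase reported above.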
It is part of a package received from Google, and they are not charging us too high.
The scalability and usability are quite remarkable.
The best feature of Amazon MSK is its excellent real-time analytics.
Amazon MSK is basically Kafka in the cloud, and when you need to create a cluster of Kafka brokers, Amazon MSK helps with that by automatically creating all the brokers according to the configuration you provide.
It supports multiple programming languages such as Java and Python, enabling flexibility without the need to learn something new.
The integration within Google Cloud Platform is very good.
Google Cloud Dataflow's features for event stream processing allow us to gain various insights like detecting real-time alerts.
| Product | Mindshare (%) |
|---|---|
| Amazon MSK | 4.3% |
| Google Cloud Dataflow | 3.7% |
| Other | 92.0% |

| Company Size | Count |
|---|---|
| Small Business | 4 |
| Midsize Enterprise | 7 |
| Large Enterprise | 4 |

| Company Size | Count |
|---|---|
| Small Business | 3 |
| Midsize Enterprise | 2 |
| Large Enterprise | 11 |
Amazon MSK offers seamless AWS integration, simplifying development and operation. It supports efficient data streaming and ensures cost-effective scalability without additional setup needs.
Amazon MSK stands out for its effortless creation, deployment, and access to new features without complex VPC configurations. Automating scalability, it demands minimal intervention, making it ideal for high-volume workflows. Developers benefit from real-time analytics, event sourcing, and log ingestion, aiding in dashboard maintenance and user log tracking. However, integration challenges exist as some face inflexibility, intricate configurations, and plugin development difficulties. Schema validation, connector variety, and complex update processes lead some to seek alternatives. Noteworthy for order data streaming, transaction tracking in retail and banking, and other real-time data applications, Amazon MSK remains attractive despite high cost concerns.
In retail and banking, Amazon MSK facilitates order data streaming and transaction tracking. Its support for CDC pipelines, high-volume data management, and asynchronous processes makes it well suited to integrating systems, streaming IoT data, and managing dashboard flows. Integration and configuration challenges persist, leading some users to explore other options in certain contexts.
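Reviewers describe Amazon MSK as creating all Kafka brokers according to the configuration you provide. As a minimal sketch of what such a configuration might look like, the helper below builds a request dictionary in the shape accepted by the `create_cluster` operation of the boto3 `kafka` client. The function name `build_msk_cluster_request`, the subnet and security group IDs, the Kafka version, and the volume size are all assumptions chosen for illustration. The sketch does not call AWS.

```python
def build_msk_cluster_request(
    cluster_name,
    kafka_version,
    subnets,
    security_groups,
    brokers_per_subnet=1,
    instance_type="kafka.m5.large",
    volume_size_gib=100,
):
    """Build a cluster request dict (hypothetical helper, no AWS call).

    MSK requires the broker count to be a multiple of the number of
    client subnets, so the count is derived from the subnet list.
    """
    if not subnets:
        raise ValueError("at least one client subnet is required")
    if brokers_per_subnet < 1:
        raise ValueError("brokers_per_subnet must be at least 1")

    return {
        "ClusterName": cluster_name,
        "KafkaVersion": kafka_version,
        "NumberOfBrokerNodes": len(subnets) * brokers_per_subnet,
        "BrokerNodeGroupInfo": {
            "InstanceType": instance_type,
            "ClientSubnets": list(subnets),
            "SecurityGroups": list(security_groups),
            "StorageInfo": {"EbsStorageInfo": {"VolumeSize": volume_size_gib}},
        },
    }


# Placeholder IDs and version, for illustration only.
request = build_msk_cluster_request(
    cluster_name="orders-stream",
    kafka_version="3.6.0",
    subnets=["subnet-aaa", "subnet-bbb", "subnet-ccc"],
    security_groups=["sg-example"],
)
print(request["NumberOfBrokerNodes"])

# With boto3 installed and credentials configured, the dict could be
# passed on as: boto3.client("kafka").create_cluster(**request)
```

Deriving the broker count from the subnet list avoids one class of configuration error. Reviewers note that fixing a misconfigured cluster can mean dropping and recreating it.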
Google Cloud Dataflow provides scalable batch and streaming data processing with Apache Beam integration, supporting Python and Java. It's designed for efficient data transformations, analytics, and machine learning, featuring cost-effective serverless operations.
Google Cloud Dataflow is a robust tool for handling large-scale data processing tasks with flexibility in processing batch and streaming workloads. It integrates seamlessly with other Google Cloud services like Pub/Sub for real-time messaging and BigQuery for advanced analytics. The platform supports a wide array of data transformation and preparation needs, making it suitable for complex data workflows and machine learning applications. Despite its advantages, users have noted challenges such as incomplete error logs, longer job startup times, and some limitations in the Python SDK.
Organizations, especially in retail and eCommerce, use Google Cloud Dataflow for batch job execution, data transformation, and event stream processing. It helps them build distributed data pipelines for extensive analytics tasks, supporting large-scale data-driven decisions.
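Event stream processing of the kind described here, such as detecting real-time alerts, usually groups events into time windows and flags windows that cross a threshold. The following is a plain-Python sketch of that idea only. It does not use the Dataflow or Apache Beam API, it processes a finite list rather than an unbounded stream, and the function name, event keys, window size, and threshold are all hypothetical.

```python
from collections import defaultdict


def windowed_alerts(events, window_seconds, threshold):
    """Count events per (fixed window, key) and return those at or above threshold.

    events: iterable of (timestamp_seconds, key) pairs.
    Returns a sorted list of (window_start, key, count) tuples.
    """
    counts = defaultdict(int)
    for timestamp, key in events:
        # Assign each event to the fixed window that contains its timestamp.
        window_start = int(timestamp // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return sorted(
        (window_start, key, count)
        for (window_start, key), count in counts.items()
        if count >= threshold
    )


# Hypothetical events: three failed logins in the first minute.
events = [
    (0, "login_failed"),
    (10, "login_failed"),
    (59, "login_failed"),
    (61, "login_failed"),
    (65, "checkout"),
]
alerts = windowed_alerts(events, window_seconds=60, threshold=3)
print(alerts)  # [(0, 'login_failed', 3)]
```

In a managed service, the windowing, scaling, and handling of late data would be done by the pipeline runner rather than by hand-written loops like this one.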