Buyer's Guide
Streaming Analytics
July 2022
Get our free report covering VMware, Confluent, Databricks, and other competitors of Cloudera DataFlow. Updated: July 2022.
619,967 professionals have used our research since 2012.

Read reviews of Cloudera DataFlow alternatives and competitors

Partner / Head of Data & Analytics at a computer software company with 11-50 employees
Real User
Top 10
Gives us low latency for fast, real-time data, with useful alerts for live data processing
Pros and Cons
  • "The top feature of Apache Flink is its low latency for fast, real-time data. Another great feature is the real-time indicators and alerts which make a big difference when it comes to data processing and analysis."
  • "One way to improve Flink would be to enhance integration between different ecosystems. For example, there could be more integration with other big data vendors and platforms, similar in scope to how Apache Flink works with Cloudera. Apache Flink is part of the same ecosystem as Cloudera, and for batch processing it's actually very useful, but for real-time processing there could be more development with regard to big data capabilities across the various ecosystems out there."

What is our primary use case?

We use Apache Flink to monitor the network consumption for mobile data in fast, real-time data architectures in Mexico. The projects we get from clients are typically quite large, and there are around 100 users using Apache Flink currently.

For maintenance and deployment, we split our team into two squads: one takes care of the data architecture and the other handles the data analysis technology. Each squad has three members.

What is most valuable?

The top feature of Apache Flink is its low latency for fast, real-time data. Another great feature is the real-time indicators and alerts which make a big difference when it comes to data processing and analysis.
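
To make the alerting idea concrete, here is a minimal sketch in plain Python (it does not use Flink's API) of what a real-time usage alert might compute: events are grouped into fixed, non-overlapping time windows per subscriber, and any window whose total exceeds a threshold raises an alert. The event fields, the 60-second window, and the threshold are assumptions chosen for illustration; the review does not describe the actual job.

```python
from collections import defaultdict
from typing import Iterable, List, NamedTuple


class UsageEvent(NamedTuple):
    # Hypothetical event shape; the real schema is not given in the review.
    subscriber_id: str
    timestamp_s: int   # event time, in seconds
    bytes_used: int


class Alert(NamedTuple):
    subscriber_id: str
    window_start_s: int
    total_bytes: int


def tumbling_window_alerts(
    events: Iterable[UsageEvent],
    window_s: int = 60,                # assumed window size
    threshold_bytes: int = 1_000_000,  # assumed alert threshold
) -> List[Alert]:
    """Sum usage per subscriber per fixed window and flag heavy windows."""
    totals = defaultdict(int)
    for e in events:
        # Align each event to the start of the window that contains it.
        window_start = e.timestamp_s - (e.timestamp_s % window_s)
        totals[(e.subscriber_id, window_start)] += e.bytes_used
    # Keep only the windows whose total exceeds the threshold.
    return sorted(
        Alert(sub, start, total)
        for (sub, start), total in totals.items()
        if total > threshold_bytes
    )
```

This version processes a finished list of events in one pass. A streaming engine would keep the same per-window totals as state and emit each alert as soon as a window closes, which is where the low latency comes from.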

What needs improvement?

One way to improve Flink would be to enhance integration between different ecosystems. For example, there could be more integration with other big data vendors and platforms, similar in scope to how Apache Flink works with Cloudera. Apache Flink is part of the same ecosystem as Cloudera, and for batch processing it's actually very useful, but for real-time processing there could be more development with regard to big data capabilities across the various ecosystems out there.

I am also looking for more options for what can be implemented in plain containers rather than in Kubernetes. I think our architecture would benefit greatly from more options in this area.

Finally, it's a challenge to find people with the right skills for Flink. Plenty of people know what could be done better in big data systems, but very few have Flink experience.

For how long have I used the solution?

I've been using Apache Flink for about one year.

What do I think about the stability of the solution?

We have not really had many issues. 

What do I think about the scalability of the solution?

Scaling Apache Flink is easily done because we use Kubernetes and containers.

How are customer service and technical support?

I can't comment on Apache Flink's technical support, but I feel that the documentation is complete and adequate for our needs when configuring the system or solving technical issues.

How was the initial setup?

The setup was complex. The most challenging part was working out how to achieve real-time low latency, failover, and high availability within our container and Kubernetes architecture. The configuration was not simple, and it took about a month to get fully set up.

What about the implementation team?

We have two squads in our company that manage the implementation. One squad takes care of the data architecture and the other squad handles the data analysis technology.

What's my experience with pricing, setup cost, and licensing?

Apache Flink is open source, so we pay no licensing fees to use the software.

Which other solutions did I evaluate?

Our clients had previously compared Apache Flink with Apache Spark and Apache Spark Streaming. The main advantage of Flink in that comparison is that it handles complex processing better.

What other advice do I have?

My advice to others when using Apache Flink is to hire good people to manage it. When you have the right team, it's very easy to operate and scale big data platforms.

I would rate Apache Flink a nine out of ten.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: My company has a business relationship with this vendor other than being a customer: Partner