


Thanks to improvements both in how we run processes on our side and in Apache NiFi itself, we have reduced the time commitment to the point where we rarely need to interact with Apache NiFi except for minor queue-clearance tasks, allowing it to run smoothly.
It supports not just ETL but also ELT, allowing us to save significant time.
There may be a return on investment, given the technology and how easily we moved our workloads onto Apache NiFi from our previous system.
The customer support is really good, and they are helpful whenever concerns are posted, responding immediately.
Customer support for Apache NiFi has been excellent, with minimal response times whenever we raise cases that cannot be directly addressed by logs.
I would rate the customer support of Apache NiFi a 10 on a scale of 1 to 10.
I have received support via newsgroups or guidance on specific discussions, which is what I would expect in an open-source situation.
I would rate the technical support of Apache Spark an eight because when we had questions, we found solutions, and it was straightforward.
There is a big communication gap due to lack of understanding of local scenarios and language barriers.
They've managed to answer all my questions and provide help in a timely manner.
The support on critical issues depends on the level of subscription that you have with Microsoft itself.
Depending on the workload we process, it remains stable since at the end of the day, it is just used as an orchestration tool that triggers the job while the heavy lifting is done on Spark servers.
Scaling up is fairly straightforward, provided you manage configurations effectively.
Based on the workload, more nodes can be added to make a bigger cluster, which enhances the cluster whenever needed.
Maintenance requires a couple of people; however, it's not a full-time endeavor.
This is crucial for applications demanding constant monitoring, such as healthcare or financial services.
Azure Stream Analytics is scalable, and I would rate it seven out of ten.
I have seen Apache NiFi crashing at times, which is one of the issues we have faced in production.
Apache NiFi is stable in most cases.
Apache Spark resolves many problems in the MapReduce solution and Hadoop, such as the inability to run effective Python or machine learning algorithms.
Without a doubt, we have had some crashes because each situation is different, and while the prototype in my environment is stable, we do not know everything at other customer sites.
They require significant effort and fine-tuning to function effectively.
For example, Azure Stream Analytics processes more data every second, which is why it's recommended for real-time streaming.
Apache NiFi should have APIs or connectors that can connect seamlessly to other external entities, whether in the cloud or on-premises, creating a plug-and-play mechanism.
The history of processed files should be more readable so that not only the centralized teams managing Apache NiFi but also application folks who are new to the platform can read how a specific document is traversing through Apache NiFi.
The initial error did not indicate it was related to memory or size limitations but appeared as a parsing error or something similar.
Various tools like Informatica, TIBCO, or Talend offer specific aspects, but licensing can be costly.
I find that I really lack the technical depth to make any recommendations for future updates of Apache Spark.
A cost comparison between products is also not straightforward.
There's setup time required to get it integrated with different services such as Power BI, so it's not a straight out-of-the-box configuration.
Azure Stream Analytics currently allows some degree of code writing, which could be simplified with low-code or no-code platforms to enhance performance.
The pricing in Italy is considered a little bit high, but the product is worth it.
Choosing between pay-as-you-go or enterprise models can affect pricing, and depending on data volume, charges might increase substantially.
From my point of view, it should be cheaper now, considering the years since its release.
We sell the data analytics value and operational value to customers, focusing on productivity and efficiency from the cloud.
Apache NiFi has positively impacted my organization by definitely bridging the gap between the on-premises and cloud interaction until we find a solution to open the firewall for cloud components to directly interact with on-premises services.
Development has improved, with the reduction in time spent being the main benefit: before, we needed a matter of days to create the ingestion flows, but now it takes only a couple of hours to configure them.
The ease of use in Apache NiFi has helped my team because anyone can learn how to use it in a short amount of time, so we were able to get a lot of work done.
Not all solutions can make this data fast enough to be used, except for solutions such as Apache Spark Structured Streaming.
The most important part is that everything can be connected, and the data exchange across overseas connections is fast and reliable.
The solution is beneficial in that it provides a base-level long-held understanding of the framework that is not variant day by day, which is very helpful in my prototyping activity as an architect trying to assess Apache Spark, Great Expectations, and Vault-based solutions versus those proposed by clients like TIBCO or Informatica.
It's very accurate and uses existing technologies in terms of writing queries, utilizing standard query languages such as SQL, Spark, and others to provide information.
Azure Stream Analytics reads from any real-time stream; it's designed for processing millions of records every millisecond.
It is quite easy for my technicians to understand, and the learning curve is not steep.

| Product | Mindshare (%) |
|---|---|
| Apache NiFi | 7.7% |
| AWS Lambda | 15.3% |
| Amazon EC2 | 13.7% |
| Other | 63.3% |

| Product | Mindshare (%) |
|---|---|
| Apache Spark | 13.9% |
| Cloudera Distribution for Hadoop | 14.7% |
| HPE Data Fabric | 10.2% |
| Other | 61.2% |

| Product | Mindshare (%) |
|---|---|
| Azure Stream Analytics | 6.8% |
| Apache Flink | 8.2% |
| Databricks | 7.9% |
| Other | 77.1% |


| Company Size | Count |
|---|---|
| Small Business | 5 |
| Midsize Enterprise | 1 |
| Large Enterprise | 18 |

| Company Size | Count |
|---|---|
| Small Business | 28 |
| Midsize Enterprise | 16 |
| Large Enterprise | 33 |

| Company Size | Count |
|---|---|
| Small Business | 8 |
| Midsize Enterprise | 3 |
| Large Enterprise | 18 |

Apache NiFi offers a flexible platform for data orchestration, transformation, and ingestion, catering to both low-code and high-code customization needs. It streamlines data movement with a powerful visual interface and robust scalability, facilitating seamless integration with diverse data sources.
With Apache NiFi's drag-and-drop capabilities and extensive built-in processors, users can easily simplify complex workflows. Its open-source framework promises cost savings and increased productivity, enabling efficient pipeline development and real-time data handling. While it's valued for data integration and external tool compatibility, there's a need for improvements in logging clarity, local development integration, and cloud-native features.
In industries like finance, healthcare, and logistics, Apache NiFi is often implemented for data orchestration and transformation tasks, enhancing workflows through integration with tools like Spark and Elasticsearch. It supports data migration and ETL processes, enabling seamless management of large-scale data operations across systems.
Apache Spark is a leading open-source processing tool known for scalability and speed in managing large datasets. It supports both real-time and batch processing and is widely used for building data pipelines, machine learning applications, and analytics.
Apache Spark's strengths lie in its ability to process large data volumes efficiently through real-time and batch capabilities. With in-memory computation, it ensures fast data processing and significant performance gains. Its wide range of APIs, including those for machine learning, SQL, and analytics, makes it versatile in handling complex data operations. While popular for ease of use and fault tolerance, Spark's management, debugging, and user-friendliness could benefit from improvements. Better GUIs, integration with BI tools, and enhanced monitoring are desired, alongside shuffling optimization and compatibility with more programming languages.
Organizations use Apache Spark predominantly for in-memory data processing, enabling seamless integration with big data frameworks. It's applied in security analytics and predictive modeling, and it helps facilitate secure data transmissions in AI deployments. Industries leverage Spark's speed for sentiment analysis, data integration, and efficient ETL transformations.
Azure Stream Analytics offers real-time data processing with seamless IoT hub integration and user-friendly setup. It efficiently manages data streams and supports Azure services, SQL Server, and Cosmos DB.
Azure Stream Analytics specializes in real-time data analytics, easily integrating with Microsoft technologies. It enables swift deployment, monitoring, and high-performance data streaming. Though praised for its powerful SQL language and machine learning capabilities, users face challenges with historical analysis, pricing clarity, debugging, and data connection outside Azure. Limited real-time data joining, query customization, and complex data handling are noted alongside needs for improved technical support, job monitoring, and trial periods.
Azure Stream Analytics is leveraged in industries for real-time IoT data processing, predictive analytics, and accident prevention in logistics. It supports telemetry data processing for applications like predictive maintenance and integrates with Power BI for enhanced data visualization, aligning with Azure's IoT infrastructure.