Apache NiFi Reviews

Name: Apache NiFi
Brand: Apache
Rating: 3.9 (22 reviews)

Vendor: Apache

95% willing to recommend


What is Apache NiFi?

Apache NiFi offers a flexible platform for data orchestration, transformation, and ingestion, catering to both low and high-code customization needs. It streamlines data movement with a powerful visual interface and robust scalability, facilitating seamless integration with diverse data sources.


Apache NiFi is ranked #5 among Compute Service solutions. PeerSpot users give Apache NiFi an average rating of 7.8 out of 10. It is most commonly compared to Apache Spark. Apache NiFi is popular among large enterprises, which account for 55% of users researching this solution on PeerSpot. The top industry researching this solution is manufacturing, accounting for 14% of all views.


Featured Apache NiFi reviews

Yamini Vadlamudi

architect with 51-200 employees

Improvements can be made to the UI. From the deployment perspective, Git configurations are available in Apache NiFi 2.0 and later versions, such as 2.6. Before 2.0, templates had to be created and stored in Apache NiFi Registry. However, templates still need to be imported and exported manually when moving from one environment to another. Even in 2.0 versions, although GitHub configurations are available, how they will function needs to be evaluated. Seamless CI/CD deployment is tricky with Apache NiFi: obtaining the proper approvals, moving a flow to another environment, and applying the proper RBAC controls are all challenging. These are areas that could be improved. Documentation is adequate; the only pain point is the deployment aspect.


BJ Tan

Data Engineer at The Kudelski Group

I believe Apache NiFi could be improved with easier, out-of-the-box provided monitoring solutions. While Apache NiFi has an API that generates logs, it would be beneficial to have simpler access to that data saved historically. It would assist in easily retrieving data for historical analysis and storing it elsewhere without the hassle of setting up APIs and delving into documentation. Just having a more streamlined approach to collecting this data would be greatly advantageous. I would suggest continuous improvements regarding the custom developer-built processors, as many times the errors that arise are not useful. We often seem to struggle with a combination of implementing our own error handling or analyzing logs, as the information does not always align or proves unhelpful. Continuous enhancement in this area would be wonderful, so we do not need to decipher which error is more accurate or which report gets us nearer to the actual problem. For instance, I encountered a situation where flow files would not process; they were retried but returned to the queue before the Python processor due to ambiguous errors. It eventually turned out that the issue was the flow files' size being too large for the Python processor, which we only discovered by splitting the flow files, at which point the issue resolved. The initial error did not indicate it was related to memory or size limitations but appeared as a parsing error or something similar.


reviewer2785560

Cloud Data Architect at a healthcare company with 10,001+ employees

To train or onboard new team members to use Apache NiFi, the code has been modularized and processes have been documented really well, so when new team members are onboarded, they are asked to review that documentation to understand the processes and the modules that have been created. In a couple of days, once they go through all that material, they are up to speed. In terms of flexibility and ease of use, Apache NiFi is more open compared to other ETL tools that have been used, such as Informatica and Teradata. It is open source with many contributors and can handle various data sources, including log data, structured, unstructured, and semi-structured data, unlike traditional ETL tools. Tools such as Prometheus and Grafana are sometimes used to keep an eye on Apache NiFi server, and DataDog is also used along with it. Scaling Apache NiFi workloads is managed through auto-scaling. To handle data security and compliance when using Apache NiFi, LDAP authentication is utilized, all clusters and nodes are kerberized, and single sign-on is used to authenticate. In transit, SSL encryption is used, and at rest, AES encryption is used, which is more than enough for the needs. Apache NiFi is kept up to date by keeping an eye on new features that have been released, discussing them internally to assess if they need to be incorporated into development. If there are any gaps in the current version, an upgrade to the new version will be attempted. Apache NiFi is a pretty good tool that meets most ETL needs, and in terms of performance and security, it is really good. After using it for quite some time without any issues, it is recommended as the number one tool for ETL. The overall review rating for Apache NiFi is 8 out of 10.


Apache NiFi mindshare

As of June 2026, the mindshare of Apache NiFi in the Compute Service category stands at 7.7%, down from 8.5% the previous year, according to calculations based on PeerSpot user engagement data.

Compute Service Mindshare Distribution
Product	Mindshare (%)
Apache NiFi	7.7%
AWS Lambda	15.3%
Amazon EC2	13.7%
Other	63.3%


PeerResearch reports based on Apache NiFi reviews

Type	Title	Date
Category	Compute Service	Jun 23, 2026
Product	Reviews, tips, and advice from real users	Jun 23, 2026
Comparison	Apache NiFi vs AWS Lambda	Jun 23, 2026
Comparison	Apache NiFi vs Amazon EC2	Jun 23, 2026
Comparison	Apache NiFi vs AWS Fargate	Jun 23, 2026

Valuable Features

Apache NiFi offers a remarkable web-based UI, simplifying data flow management. Users appreciate the drag-and-drop functionality, Java extensibility, built-in provenance capturing, and extensive integration capabilities. Its ability to handle real-time data and scale efficiently is valued. NiFi's flexibility in creating and maintaining workflows, custom transformations, and connectors enhances development speed. The platform supports diverse data sources, promoting smooth transitions between environments while maintaining reliability and reducing development time for many organizations.

"It's very handy, I like building the independent flow, an automated flow where you can build a flow from source to destination, then do the transformation in between."
"NiFi works on data and file levels, streamlining real-time data processes."
"It is really good when it comes to dealing with pipelines."

Room for Improvement

Apache NiFi requires enhancements in integration with development environments, cloud-native capabilities, user interface design, and scalability. Users encounter stability issues and intricate operations, necessitating improvements in logging, error handling, and tutorial availability. Integrating better with tools like Kafka, bolstering security features, and streamlining deployment processes are sought. Enhanced monitoring and alert systems, access to metadata, and simplified configurations for migrations are desired. There is a call for expanded support for binary data and custom processors.

"Sometimes, when I run Apache NiFi, processes crash without any clue, which might relate to the logging system."
"The product is stable for simple tasks, like using databases that are not distributed. However, for distributed environments like Hadoop or HBase, some vulnerabilities exist."
"The overall stability of this solution could be improved."

ROI

Organizations find value in Apache NiFi due to its open-source nature, which has led to significant time savings. Teams have experienced increased efficiency, reducing intervention time and handling queues more smoothly. While one notes no measurable financial gain, others report considerable time-saving benefits and successful workload transitions. Although technology debt is noted by some, the general feedback highlights Apache NiFi's positive impact on development and support for ETL/ELT processes.

Pricing

Enterprise buyers find Apache NiFi highly affordable due to its open-source nature, offering free setup and usage. Maintenance costs are minimal, covering only server operations. While some report cost considerations when integrating with other platforms like Cloudera, as a standalone tool, Apache NiFi provides excellent pricing value. Users highlight significant time savings and efficiency gains, reinforcing its cost-effectiveness despite regional pricing variations.

"I used the tool's free version."
"The solution is open-source."
"We use the free version of Apache NiFi."

Popular Use Cases

Organizations use Apache NiFi for diverse data integration tasks, including collecting data from different sources, orchestrating ingestion processes, building data pipelines, and managing ETL workflows. It supports data migration, real-time and batch data ingestion, and transformation. Apache NiFi integrates well with technologies like Kafka, Spark, and cloud services, providing robust monitoring and metrics capabilities. Users leverage its graphical UI for creating data flows, simplifying data handling across various platforms from Oracle to AWS.

Service and Support

Apache NiFi's support relies heavily on community and documentation due to its open-source nature. Some find official help slow and minimal, but useful online resources and an active community offset this. Cloudera users benefit from better service. Teams often resolve issues independently after initial learning. Advanced users rarely seek support and find documentation sufficient. While some rate customer service highly, others cite delays.

Deployment

Apache NiFi's initial setup is mostly straightforward. Installation is easy, with flexibility in changing port settings. For on-premises deployment, it's uncomplicated, while cloud setups may require more effort. Documentation is clear, enabling users to proceed independently. Single-node installations are quick, but multi-node clusters may take additional time for configuration. Users found it resource-intensive on personal laptops but effective on proper servers. Experiences with cloud services, such as AWS, have been smooth.

Scalability

Apache NiFi is considered scalable but presents mixed feedback. Some report seamless scalability and efficient resource management, often when used with Kubernetes. In contrast, others highlight challenges such as limited scalability in specific versions and difficulties scaling down. Organizations frequently expand usage, with scaling done by adding nodes or larger machines. While some experience smooth scalability on multiple nodes, others note constraints like memory and warn about issues compared to other solutions.

Stability

Apache NiFi is generally stable when used correctly, although some users experience occasional issues with memory and process stability, particularly in distributed environments. Many rate its stability positively, mentioning it runs well in development and QA. Some users noted crashes and logging gaps. Proper resource and process management can enhance stability, with ratings typically ranging from seven to eight out of ten.

These insights are based on the in-depth reviews provided by peers to help you make a better buying decision.


Review data by company size

By reviewers
Company Size	Count
Small Business	5
Midsize Enterprise	1
Large Enterprise	15


By visitors reading reviews
Company Size	Count
Small Business	73
Midsize Enterprise	32
Large Enterprise	127


Top industries

By visitors reading reviews
Industry	Share of views
Manufacturing Company	14%
Financial Services Firm	13%

Other industries listed, without percentages: Computer Software Company, Comms Service Provider, Marketing Services Firm, Retailer, Government, Outsourcing Company, University, Media Company, Healthcare Company, Construction Company, Educational Organization, Insurance Company, Hospitality Company, Real Estate/Law Firm, Logistics Company, Legal Firm, Consumer Goods Company, Wholesaler/Distributor, Engineering Company, Non Profit, Recreational Facilities/Services Company, Renewables & Environment Company


Learn more about Apache NiFi

With Apache NiFi's drag-and-drop capabilities and extensive built-in processors, users can easily simplify complex workflows. Its open-source framework promises cost savings and increased productivity, enabling efficient pipeline development and real-time data handling. While it's valued for data integration and external tool compatibility, there's a need for improvements in logging clarity, local development integration, and cloud-native features.

What are the key features of Apache NiFi?

Ease of Use: A visual interface with drag-and-drop capabilities simplifies operation.
Scalability: Supports both small and large-scale data operations effectively.
Integration: Connects with a wide array of data sources and external tools.
Customization: Offers both low-code and high-code development options.
Real-Time Data Handling: Ensures timely processing and transformation of data.

What benefits can users expect from Apache NiFi?

Cost Savings: Open-source nature helps reduce financial investments.
Productivity: Streamlined processes enhance efficiency and workflow management.
Data Handling: Efficiently manages batch and real-time data for better decision-making.
Transformation Options: Comprehensive tools for flexible data transformation.

In industries like finance, healthcare, and logistics, Apache NiFi is often implemented for data orchestration and transformation tasks, enhancing workflows through integration with tools like Spark and Elasticsearch. It supports data migration and ETL processes, enabling seamless management of large-scale data operations across systems.

Apache NiFi customers

Macquarie Telecom Group, Dovestech, Slovak Telekom, Looker, Hastings Group

Product Categories

Compute Service

Popular Comparisons

Apache Spark vs Apache NiFi

AWS Lambda vs Apache NiFi

Zadara vs Apache NiFi

Amazon EC2 vs Apache NiFi

AWS Fargate vs Apache NiFi

AWS Batch vs Apache NiFi

Amazon EC2 Auto Scaling vs Apache NiFi

Amazon Virtual Private Cloud vs Apache NiFi

Oracle Compute Cloud Service vs Apache NiFi

Apache Storm vs Apache NiFi


Apache NiFi Reviews Summary
Author info	Rating	Review Summary
architect with 51-200 employees	4.0	I've used Apache NiFi for six years as our core data ingestion tool due to its no-code interface, scalability, and ease of debugging, though improvements in deployment and CI/CD integration would enhance its usability.
Data Engineer at The Kudelski Group	4.0	I use Apache NiFi daily for integrating and transforming data into Elasticsearch, appreciating its flexibility, Python integration, and cost-efficiency, though improved error handling and built-in monitoring would enhance its usability.
Cloud Data Architect at a healthcare company with 10,001+ employees	4.0	I've found Apache NiFi highly effective for ETL tasks, with great scalability, connector support, and performance. It simplifies development and boosts productivity, though its UI and documentation could improve. Overall, it's my preferred ETL solution.
Team Lead Technical Specialist (Production Support) at a recreational facilities/services company with 11-50 employees	3.5	I've been using Apache NiFi for ETL and ELT workflows, mainly before ingesting data into Snowflake, and while it offers strong integration and standardization benefits, its logging and UI need improvements for better usability.
Manager at a tech company with 10,001+ employees	4.0	I've used Apache NiFi mainly for orchestration between on-premises and cloud systems; its low-code interface saves development time, though it needs better AI integration, stability improvements, and enhanced connectivity with other tools.
Data Consultant at a comms service provider with 201-500 employees	2.5	I've found Apache NiFi easy to use for data ingestion, especially with limited coding skills, but it lacks scalability, version control, and creates tech debt, making it costly and outdated despite saving time in development.
Partner at a tech vendor with 10,001+ employees	4.5	I've used Apache NiFi for five years for real-time data ingestion into AWS Redshift, appreciating its drag-and-drop simplicity, stability, and scalability, which sped up projects and cut costs by 30%, though ROI metrics weren't tracked.
Senior Data Engineer at a tech vendor with 10,001+ employees	4.5	I've used Apache NiFi mainly for data ingestion into AWS and found it simple, scalable, and highly efficient, significantly reducing development time. It's plug-and-play, stable, and works well without major drawbacks or needed improvements.
Senior Developer at Infosys	4.0	I find NiFi an effective ETL tool for real-time data and transformations, especially on-premises. I appreciate its data privacy and scalability. However, JSON processing and record-level tasks need improvement, and it has vulnerabilities in distributed environments.
Engineering Lead- Cloud and Platform Architecture at a financial services firm with 1,001-5,000 employees	4.0	As a DevOps engineer, I find Apache NiFi valuable for file transfers and transformations due to its robust monitoring, extensive processor library, and integration features. However, it lacks SSO integration, complicating access control at the enterprise level.

Yamini Vadlamudi

architect with 51-200 employees

Dec 21, 2025

Unified data flows have simplified large-scale ingestion and have improved SLA reliability

What is our primary use case?

Apache NiFi is used to fetch data from different sources and ingest it into different destinations. The entire platform depends on Apache NiFi for data transformation and data movement. Multiple sources, such as flat files, Oracle, and Postgres, are connected, and the main databases where data is stored are Postgres and Apache Druid. The UI connects directly to those databases to ensure the data is available at all times. This platform has been built for many enterprise platforms.

Sources include Oracle and files in S3. Apache NiFi flows use NiFi processors to fetch data from a particular source such as Oracle, which holds business and customer data. Every day there are many incidents, changes, and logins to the system. Data must be fetched daily in two ways: in batch and through near real-time streaming. The batch job fetches data twice a day, and the data is processed after fetching. Data from the Oracle database arrives in Avro format and is converted to JSON, and a mapping then assigns fields according to whatever destination is required. The Postgres schema is mapped using the EvaluateJsonPath processor: each JSON field received is mapped to the corresponding field of the database schema. Records are then ingested into Postgres. About 5 million records are received almost daily, and multiple flows process them. Each flow has its own schedule, which can be every five minutes or twice daily.
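The field-mapping step described above can be illustrated outside NiFi with a minimal Python sketch. It is not the reviewer's flow and not NiFi's EvaluateJsonPath implementation: the column names, the sample record, and the simplified `$.a.b` path syntax are all assumptions made for illustration.

```python
import json
from typing import Any

# Hypothetical mapping from destination column to a simple "$.a.b" path.
FIELD_MAP = {
    "incident_id": "$.id",
    "customer_name": "$.customer.name",
    "status": "$.state.current",
}


def resolve(record: dict, path: str) -> Any:
    """Resolve a dotted "$.a.b" path; return None when a key is missing."""
    node: Any = record
    for key in path.removeprefix("$.").split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def to_row(raw_json: str, field_map: dict = FIELD_MAP) -> dict:
    """Flatten one JSON record into a column -> value dict ready for insertion."""
    record = json.loads(raw_json)
    return {column: resolve(record, path) for column, path in field_map.items()}


# Hypothetical record, shaped like JSON converted from an Avro row.
sample = '{"id": 42, "customer": {"name": "Acme"}, "state": {"current": "open"}}'
row = to_row(sample)
print(row)
```

Keeping the mapping in one dictionary means a schema change touches only `FIELD_MAP`, which is roughly the convenience the reviewer attributes to configuring the mapping in a processor rather than in hand-written code.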

Another process retrieves cost data from AWS and Azure. Every day, billing files are received in S3, or in Azure Blob Storage if Azure is being used. The data is retrieved and transformed, with modifications made based on the schema. The data is then sent to Apache Druid using the REST API.

Apache NiFi is the heart of data ingestion for the entire platform. All data movement happens through Apache NiFi, which is the single data ingestion tool in the platform. Once data lands in its destination, multiple transformations, queries, or other operations are performed and the results are shown in the UI. Apache NiFi is the main gateway for all data, whatever the source. It is a no-code platform where multiple flows can be created, which reduces the time spent writing code and integrating different connectors. Connectors are already available in Apache NiFi, so only the flows need to be created, which makes the process easy and stable.

What is most valuable?

Apache NiFi offers multiple features. One feature is that it is a no-code platform, so specific code does not need to be written. Direct connectors are available for whatever integrations are needed. If it is Azure, AWS, GCP, or a normal RDBMS, or if connecting to fetch files from a server where files are landing, different types of connectors are available. Integration is simple, flows are created, and the process can run in minutes. Another feature is that if anything needs to be changed or if moving from one flow to a different environment or server is necessary, it is easy to do. Templates can be created, downloaded, and reused as many times as needed. Particular flows can be stored and used multiple times. Email notifications can be configured to trigger if any failures happen, which allows developer teams or the support team to take immediate action. One of the most important features is that if billions and billions of records are coming daily, Apache NiFi ensures that by increasing the configurations and thresholds, that data can be transported, which makes operations easier and allows more things to be accomplished while ensuring SLAs are met.

The most reliable features are the building functionality, the availability of connectors, and the scalability that Apache NiFi offers. These are the features that can be relied on. For example, if billions of records need to be sent from one source to another destination, such as Oracle to Postgres or from a file to an RDBMS, writing a simple Python code requires writing the code and changing the schema. If schema changes occur, it is not easy, as all the code must be reviewed and updated, which takes considerable time for debugging if issues arise. Parallel running of the Python program, multithreading, and all these concepts must be managed, and it is not feasible. The script must be run somewhere and cron must be set up. All these features are bundled and provided in Apache NiFi, which makes operations much easier.

Apache NiFi has tremendously reduced the effort of maintaining multiple different connectors, and it makes the transformation and ingestion of billions of records much easier. It is easy for developers to maintain, which makes the entire data platform scalable and reliable. All data ingestion happens through a single outlet, so if any issue occurs, Apache NiFi is the only place that needs to be checked; without it, issues would have to be chased in multiple places and debugging would become a nightmare. It makes the platform far more reliable and useful for customers, without breaching any SLAs. Ingestion happens on time and everything functions properly.

What needs improvement?

Documentation is adequate, but the only pain point is the deployment aspect.

For how long have I used the solution?

Apache NiFi has been utilized as the main data ingestion tool for around six years and continues to be used currently.

What do I think about the stability of the solution?

There are no stability issues.

What do I think about the scalability of the solution?

There are no scalability issues.

What other advice do I have?

The first benefit is the time saved by moving to Apache NiFi. Writing a Python program with its own connectors generally takes a developer one to three weeks to create, debug, and deploy. With Apache NiFi, that time was reduced by almost half: a flow can be created in two days to a week, whereas a single flow in Python took almost one to one and a half weeks.

In terms of debugging, errors can be debugged easily in Apache NiFi. A bulletin board shows the exact issue, which can be traced back to see where it went wrong, so an issue can typically be debugged in 30 to 50 minutes. With Python, when a single program spans multiple files and multiple jobs, it is very difficult to see exactly where the issue is, because the flow must be traced from the start and then debugged line by line. With Apache NiFi, the location of the issue is identified immediately, and attention can be focused on that particular processor.

SLAs have also improved. If an SLA requires a job to complete within 30 minutes, a job written in a general-purpose programming language needs refinement and optimization for very large transformations; ingesting 5 billion records every day, for example, is not feasible with a normal Python program. Apache NiFi provides the infrastructure to process very large volumes, such as billions of records, in minutes, so SLAs have never been missed. If additional processing capacity is needed, threads can be increased and a few configuration settings in Apache NiFi adjusted to stay within the SLA. These are the measurable benefits that have been observed.

If an organization is looking for a unified tool to ingest data at very large scale and wants both batch and real-time streaming with one unified data tool, Apache NiFi is recommended. If a code-free approach is preferred and spending much time writing code or debugging code should be avoided, Apache NiFi is an excellent choice where data flows can be created simply through clicks instead of using other particular tools. This review has been given a rating of 8 out of 10.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure

BJ Tan

Data Engineer at The Kudelski Group

Dec 6, 2025

Daily workflows have integrated diverse logs and have delivered flexible data orchestration

What is our primary use case?

I have been using Apache NiFi virtually daily, as it is part of my main responsibility in my current role.

My main use case for Apache NiFi involves integrating various data sources and performing transformations to load them into mostly our NoSQL database, Elasticsearch, but sometimes into other databases as well.

For integrating and transforming data, we receive a lot of logs generated with our AWS services that the company wants to collect, particularly for our security team to review those logs and ensure they can conduct their security checks and reviews to confirm there is no abnormal behavior. We use Apache NiFi to capture those logs sent to many S3 buckets, collect those logs, decompress them with Apache NiFi, perform any necessary transformations, and send them to Elasticsearch so that end users, often from the network team or security team, can then use Elasticsearch and Kibana for data analysis.
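As a rough illustration of the decompress, transform, and load steps described above, here is a minimal Python sketch that turns a compressed log file into an Elasticsearch bulk request body. It is not the reviewer's pipeline: the gzip and JSON-lines formats, the `eventTime` field, and the `aws-logs` index name are assumptions, and the payload is only built, not sent.

```python
import gzip
import json


def to_bulk_payload(compressed: bytes, index: str) -> str:
    """Decompress gzip'd JSON-lines logs and build an Elasticsearch _bulk body."""
    lines = []
    for raw in gzip.decompress(compressed).decode("utf-8").splitlines():
        if not raw.strip():
            continue  # skip blank lines
        event = json.loads(raw)
        # Example transformation: rename a hypothetical timestamp field.
        if "eventTime" in event:
            event["@timestamp"] = event.pop("eventTime")
        # Each document is preceded by an action line naming the target index.
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(event))
    # The bulk API expects newline-delimited JSON that ends with a newline.
    return "\n".join(lines) + "\n" if lines else ""


# Hypothetical sample standing in for a log object fetched from S3.
sample = gzip.compress(
    b'{"eventTime": "2025-01-01T00:00:00Z", "action": "login"}\n'
    b'{"action": "logout"}\n'
)
payload = to_bulk_payload(sample, index="aws-logs")
print(payload)
```

In a NiFi flow, each of these steps would typically be a separate processor; the sketch only shows the shape of the data at each stage.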

My advice for others considering Apache NiFi is that you can use it on-premises if you are willing to, and it offers great customizability. While it is specifically designed for streaming data, it can also accommodate batch data. Moreover, it is useful for various unconventional solutions, such as email notifications, which shows its flexibility in data orchestration, ETL, and other applications.

What is most valuable?

Apache NiFi offers great flexibility whether you want to be a low-code or a high-code user, especially if you are a Python or Java developer. The latest versions of Apache NiFi added custom-built processors, so you can write your own processors in Python or Java instead of relying only on the great selection of out-of-the-box processors, which can already do almost anything. If you are willing to put together a complex web of processors, you can do almost any data transformation you want, but the ability to build your own processors has been a huge benefit, both for what Apache NiFi is specifically made to do and for more unconventional solutions, such as an email notification system. This kind of use existed even before custom processors: you could run scripts, including Python, through the ExecuteScript processor, but it now works with the latest version of Python rather than just a Jython-style hybrid. Those are some of the best things that it offers.

The flexibility of Apache NiFi has helped me in my daily work, especially because instead of utilizing a bunch of Apache NiFi processors, which we do use for most of our processes, it can be much easier to combine transformation logic within Python processors since the majority of our team prefers Python programming as our choice of language. This integration allows us to put it all in one place. We can integrate Apache NiFi with our Python processors that we host on a Git repository, which integrates very well, and we can manage the same scripts and make changes efficiently. It is great coming from a Python developer mindset shared amongst the team.

Apache NiFi has positively impacted my organization as it continually improves functionality and throughput with each iteration over the past three years. One of the big tradeoffs with open source is that how well it functions is largely dependent on the user, but that means you can adapt it to whatever custom use case you have. We have been able to consolidate several different authentication methods through just Microsoft, and Apache NiFi has been helpful in facilitating that. Additionally, due to its many ways of extracting data from different sources, we can develop specific solutions ourselves, allowing us to integrate various data sources. Thanks to the open-source customizability, we can adapt Apache NiFi to our built cluster, which has numerous benefits, particularly since we are managing many of our processes. This approach saves us significant costs compared to moving to something more managed or on the cloud, as managing open-source technologies ourselves ultimately reduces expenses.

Regarding cost savings, I do not have a strict idea of how much we have saved since the company was already using Apache NiFi when I joined, but I am certain comparisons have been made against other ETL or data orchestration tools that are popular among different cloud providers such as AWS or Azure. The cost savings must be significant, particularly given that we are handling terabytes and petabytes of data daily, trying to find software that allows this in an affordable manner. It is clear that substantial savings exist, as long as we manage our own clusters and bugs effectively. The tradeoff with managed services is that they handle much of this, ensuring uninterrupted service, but these come at a cost. Conversely, with open-source software management, we incur no costs as we handle everything ourselves.

What needs improvement?

I would suggest continuous improvements regarding the custom developer-built processors, as many times the errors that arise are not useful. We often seem to struggle with a combination of implementing our own error handling or analyzing logs, as the information does not always align or proves unhelpful. Continuous enhancement in this area would be wonderful, so we do not need to decipher which error is more accurate or which report gets us nearer to the actual problem. For instance, I encountered a situation where flow files would not process; they were retried but returned to the queue before the Python processor due to ambiguous errors. It eventually turned out that the issue was the flow files' size being too large for the Python processor, which we only discovered by splitting the flow files, at which point the issue resolved. The initial error did not indicate it was related to memory or size limitations but appeared as a parsing error or something similar.
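One way to avoid the ambiguity described above is to check content size explicitly before processing and to report the real cause. The sketch below is plain Python, not the NiFi processor API: the function names, the `error.reason` and `error.detail` attribute names, and the size limit are hypothetical, and it assumes newline-delimited content.

```python
def guarded_transform(content: bytes, max_bytes: int) -> tuple[str, dict]:
    """Return (relationship, attributes), naming the real cause on failure."""
    if len(content) > max_bytes:
        return "failure", {
            "error.reason": "content_too_large",
            "error.detail": f"{len(content)} bytes > limit {max_bytes}",
        }
    return "success", {}


def split_by_size(content: bytes, max_bytes: int) -> list[bytes]:
    """Split newline-delimited content into chunks of at most max_bytes."""
    chunks, current, size = [], [], 0
    for line in content.splitlines(keepends=True):
        if len(line) > max_bytes:
            # A single record that cannot fit is reported explicitly.
            raise ValueError(
                f"single record of {len(line)} bytes exceeds limit of {max_bytes} bytes"
            )
        if size + len(line) > max_bytes and current:
            chunks.append(b"".join(current))
            current, size = [], 0
        current.append(line)
        size += len(line)
    if current:
        chunks.append(b"".join(current))
    return chunks


content = b"a\n" * 5  # 10 bytes of sample data
status = guarded_transform(content, max_bytes=4)
chunks = split_by_size(content, max_bytes=4)
print(status)
print(chunks)
```

Routing oversized content to a failure path with an explicit reason, or splitting it first, makes the size limit visible instead of surfacing as an unrelated parsing error.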

For how long have I used the solution?

I have been working in my current field for about three years.

What do I think about the stability of the solution?

Apache NiFi is now more stable than before.

What do I think about the scalability of the solution?

Apache NiFi's scalability is good. You can scale it up as long as you have the machines and servers available. If you have room for more instances, scaling up is fairly straightforward, provided you manage configurations effectively.

How are customer service and support?

Apache NiFi's customer support is good.

What was our ROI?

I have definitely seen a return on investment through time savings. Working with Apache NiFi allows us to manage it more efficiently, transitioning from spending hours or days resolving issues to requiring much less intervention now. Thanks to improvements on both our side in how we run processes and enhancements to Apache NiFi, we have reduced the time commitment to almost not needing to interact with Apache NiFi except for minor queue-clearance tasks, allowing it to run smoothly. At this point, we have certainly saved hundreds of hours.

What other advice do I have?

The customizability of Apache NiFi helps even with unique use cases, as I mentioned before, given that Apache NiFi can be used in this capacity. While there are better applications or software options available, when you are trying to keep it simple and finding ways to utilize a couple of processors for a unique solution, you can do that in Apache NiFi. For example, we have several notification-type pipelines we have built in Apache NiFi, such as reading from a SQL database to identify users who have not completed training and then sending them an email reminder to complete that training. We have that running regularly, week by week. Another instance involves a processing data flow that scans for specific data found in logs, which triggers an email notification to the relevant team letting them know that a unique identifier has appeared, allowing them to handle the situation.
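
To make the training-reminder pipeline above more concrete, here is a minimal Python sketch of the same logic outside NiFi: query a database for users with open training and build one reminder email per user. The table layout, column names, and sender address are assumptions made for illustration; the review does not describe the actual schema or the processors used.

```python
import sqlite3
from email.message import EmailMessage
from typing import List, Tuple


def pending_users(conn: sqlite3.Connection) -> List[Tuple[str, str]]:
    """Return (name, email) for users whose training is not completed."""
    return conn.execute(
        "SELECT name, email FROM users WHERE training_completed = 0 ORDER BY name"
    ).fetchall()


def build_reminders(rows: List[Tuple[str, str]], sender: str) -> List[EmailMessage]:
    """Build one reminder message per pending user (sending is left out)."""
    messages = []
    for name, email in rows:
        msg = EmailMessage()
        msg["From"] = sender
        msg["To"] = email
        msg["Subject"] = "Reminder: please complete your training"
        msg.set_content(f"Hi {name},\n\nOur records show your training is still open.")
        messages.append(msg)
    return messages


# Demo with an in-memory database and hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, email TEXT, training_completed INTEGER)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [("Alice", "alice@example.com", 1), ("Bob", "bob@example.com", 0)],
)
reminders = build_reminders(pending_users(conn), "training@example.com")
```

In a scheduled flow, the query step would run weekly and each resulting message would be handed to whatever mail step the flow uses.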

I encountered some odd cases, such as increasing the concurrent threads on a processor: this should work similarly to running several copies of the processor, yet actual throughput varies. It seems that distributing the work across several processor copies yields better throughput than just increasing the concurrent threads on one processor, which has been odd but is a workaround we had to adopt to boost throughput. Resolving such quirks could elevate the rating further.

I rate Apache NiFi an eight out of ten. I choose eight because, as open-source software, there is always room for improvement, but the tradeoff between learning how to use the software and the savings it provides, along with its customizability, ranks it pretty high. It is effective for what it does and continues to improve, so it could score higher if there are significant enhancements in custom-built processors and ongoing improvements in functionality.

Which deployment model are you using for this solution?

On-premises

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)

reviewer2785560

Cloud Data Architect at a healthcare company with 10,001+ employees

Dec 9, 2025

Data flows have transformed and now support reusable ETL pipelines across diverse sources

What is our primary use case?

Apache NiFi is used mainly for ETL to get data from multiple sources and then load it into a data lake. For example, data from Ab Initio is extracted and then loaded to an S3 bucket. From the S3 bucket, that data is read again and then loaded into other data layers. Different data layers, such as raw and raw silver, all use Apache NiFi.

How has it helped my organization?

Apache NiFi has positively impacted the organization by making development really easy, allowing efficient design and development, and enabling code reuse which has reduced the development effort. Integration with Git is also really good for sharing code across teams. A reduction in development effort of about 30% has been observed.

What is most valuable?

The best features Apache NiFi offers include the ability to connect to any type of source, which is a big advantage since most connectors are already available and do not need to be created. For transformation, it offers a wide variety of transformations that can be used across big data and any kind of data, including JSON-formatted data.

The connectors used most often are those for the Oracle database, which is the main focus, followed by those for log data; both have greatly benefited the team.

What needs improvement?

Improvements in the user interface to make it easier to use would be beneficial, and adding more security features would make Apache NiFi more secure and robust. Documentation and support could also be enhanced, as most support is usually received from users rather than from the product owners.

For how long have I used the solution?

Apache NiFi has been used for more than five years.

What do I think about the stability of the solution?

Apache NiFi is stable.

What do I think about the scalability of the solution?

The performance of Apache NiFi is really good. Based on the workload, more nodes can be added to make a bigger cluster, which enhances the cluster whenever needed. Apache NiFi's scalability is good and it is auto-scalable, which is pretty impressive.

How are customer service and support?

Initially, customer support was contacted more often, but after understanding Apache NiFi better, not many issues have been faced. The customer support is really good, and they are helpful whenever concerns are posted, responding immediately.

Which solution did I use previously and why did I switch?

Before Apache NiFi, StreamSets was used, and it was unsatisfactory as it consumed all the server resources and did not release them. After finding Apache NiFi, it became a preferred solution.

What was our ROI?

Apache NiFi provides a good return on investment, although specific cost details are not known since that is not an area of involvement. Feedback has been given to stakeholders that Apache NiFi is beneficial in development and does great work, indicating a positive return on investment.

Which other solutions did I evaluate?

Before choosing Apache NiFi, other options such as StreamSets, Teradata, Informatica Big Data version, and Glue were evaluated, but Glue was not chosen due to its high cost.

What other advice do I have?

In terms of flexibility and ease of use, Apache NiFi is more open compared to other ETL tools that have been used, such as Informatica and Teradata. It is open source with many contributors and can handle various data sources, including log data, structured, unstructured, and semi-structured data, unlike traditional ETL tools.

Tools such as Prometheus and Grafana are sometimes used to keep an eye on the Apache NiFi server, and Datadog is also used alongside them. Scaling Apache NiFi workloads is managed through auto-scaling.

To handle data security and compliance when using Apache NiFi, LDAP authentication is utilized, all clusters and nodes are kerberized, and single sign-on is used to authenticate. In transit, SSL encryption is used, and at rest, AES encryption is used, which is more than enough for the needs.

Apache NiFi is kept up to date by keeping an eye on new features that have been released, discussing them internally to assess if they need to be incorporated into development. If there are any gaps in the current version, an upgrade to the new version will be attempted.

Apache NiFi is a pretty good tool that meets most ETL needs, and in terms of performance and security, it is really good. After using it for quite some time without any issues, it is recommended as the number one tool for ETL. The overall review rating for Apache NiFi is 8 out of 10.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)

Nishant Khandelwal

Team Lead Technical Specialist (Production Support) at a recreational facilities/services company with 11-50 employees

Dec 14, 2025

Standardized data pipelines have streamlined ETL workflows but still need clearer logs and UI

What is our primary use case?

My main use case for Apache NiFi is to use it for basic ETL pipelines before ingesting data into the data lake. For example, if we want to ingest anything into Snowflake, before actually moving it to Snowflake, we do basic data massaging and transformations through Apache NiFi. This may include consuming a file, converting that file from one format to another, breaking it into chunks, and then pushing it to Snowflake.

A specific example is that we consume all caller data through Apache NiFi by connecting to Kafka topics and consuming those messages. We then convert those messages into JSON format and push those messages to Snowflake by running Snowflake procedures, which eventually ingest the data directly into Snowflake by reading it through an S3 bucket. We also push the actual JSON messages to S3 buckets through Apache NiFi itself.
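
The kind of basic massaging described above (consume a file, convert its format, break it into chunks) can be sketched in Python as follows. This is only an illustration under assumed inputs: it converts CSV text to JSON Lines in fixed-size batches, and the column names in the usage note are hypothetical rather than taken from the reviewer's data.

```python
import csv
import io
import json
from typing import Iterator, List


def csv_to_json_batches(csv_text: str, batch_size: int) -> Iterator[str]:
    """Convert CSV text to JSON Lines, yielding one string per batch of rows."""
    reader = csv.DictReader(io.StringIO(csv_text))
    batch: List[str] = []
    for row in reader:
        # Each CSV row becomes one JSON object; values stay as strings.
        batch.append(json.dumps(row, sort_keys=True))
        if len(batch) == batch_size:
            yield "\n".join(batch)
            batch = []
    if batch:
        yield "\n".join(batch)
```

For example, `list(csv_to_json_batches("caller_id,duration\n1,30\n2,45\n3,10\n", 2))` produces two batches, the first holding two JSON lines and the second holding one; each batch could then be written as a separate object to a staging location.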

What is most valuable?

The best features Apache NiFi offers include its integration capability; we have use cases in which we use Apache NiFi to integrate with and consume from multiple sources. It can connect to any APIs, NAS drives, and databases, and it can consume streaming data. We can connect to Kafka topics and queues as well through Apache NiFi. This gives us flexibility in consuming from multiple sources. We can also easily transfer information from one block to another. We also have control over the controller services, which we can maintain separately so that we do not have to repeat connections and create multiple connections in different process groups.

Apache NiFi has positively impacted our organization by significantly helping us streamline our processes. Earlier, each team was creating their own ETL pipelines with no standard being followed. Apache NiFi gave us the opportunity to streamline that process. Each team can request access to Apache NiFi and be onboarded separately based on their needs.

We have specifically standardized the pipeline design so that no one can deviate from it. For example, everyone is supposed to use specific variables when enabling the alerting system. We have a different tool called Moogsoft that everyone can onboard to, using the InvokeHTTP processor of Apache NiFi to send their alerts to a centralized system which can generate incidents. No one is now working in silos, and there is a specific pattern being followed by everyone. Furthermore, everyone must have Apache NiFi configured so that they can consume from a specific source but eventually push the data to our strategic cloud partner, AWS, first into an AWS bucket which is common for all, and from that bucket, subsequent processing will be done in Snowflake.

The standard followed since Apache NiFi was introduced has allowed others to copy and paste successfully created pipelines, saving a lot of time in our overall software development life cycle by reducing redundancy and effort.

What needs improvement?

A couple of improvements would help Apache NiFi. The first is better logging: sometimes the logs and events do not make sense when you look at them, so my strong recommendation is improved logging in Apache NiFi. Secondly, when looking at the file states, the history of processed files should be more readable so that not only the centralized teams managing Apache NiFi but also application folks who are new to the platform can read how a specific document is traversing through Apache NiFi.

For how long have I used the solution?

I have been using Apache NiFi for almost four and a half years now.

What do I think about the scalability of the solution?

Apache NiFi's scalability is great; we can easily scale it without encountering any challenges.

How are customer service and support?

Customer support for Apache NiFi has been excellent, with minimal response times whenever we raise cases that cannot be directly addressed by logs. The support team has consistently provided great assistance with processor failures and helped us create ad-hoc processors as needed.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

We did not use any previous solution before Apache NiFi. Apache NiFi has significantly helped us improve our processes and streamline them, while we are also using SnapLogic for integration purposes in parallel.

I am not aware of any other solutions that were explored before onboarding Apache NiFi. Being from the analytics team, we were earlier using Tableau Prep for ETL and transformations and Informatica as well, but I am unsure if anything else was explored in the organization prior to using Apache NiFi.

What was our ROI?

Apache NiFi provides huge relief for all teams with similar use cases for ETL purposes, and it supports not just ETL but also ELT, allowing us to save significant time.

What's my experience with pricing, setup cost, and licensing?

I cannot comment on the pricing, setup cost, and licensing for Apache NiFi, but I can say there was significant time saved across the development life cycle due to reusable pipelines offered by Apache NiFi.

What other advice do I have?

The reason I rate Apache NiFi a seven is that most development folks are afraid to start using Apache NiFi because, to begin with, it does not always directly make sense. For example, there are other integration tools such as SnapLogic where you can simply search for a specific processor, but that is not the case with Apache NiFi. You need some basic understanding of which processors are there before you can fetch what you need. Better UI design should allow newcomers to search using relevant keywords, such as API, to retrieve appropriate processors, which currently is not happening, requiring some understanding of Apache NiFi. The other challenge I mentioned is having better logging, especially for processor-related logs to help newcomers navigate effectively.

I think Apache NiFi is a great tool that one should definitely explore. It is essential to perform basic checks regarding requirements, but if someone is looking for ETL and ELT functionalities needing to connect to CSVs, JSONs, Excels, and databases, they can onboard to Apache NiFi. It offers great connectivity for consuming or pushing data through queues and cloud workloads. Overall, it is an excellent product, especially for basic data massaging and processing before pushing data to Snowflake or creating reports. I rate Apache NiFi a seven out of ten.

Which deployment model are you using for this solution?

On-premises

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)

Debashish Dhar

Manager at a tech company with 10,001+ employees

Dec 3, 2025

Low-code orchestration has bridged on-prem and cloud workflows and now streamlines data sharing

What is our primary use case?

My main use case for Apache NiFi is orchestration; I kickstart the job and then pass on the handler to Databricks or AWS to run the ETL pipeline.

A specific example of a workflow where Apache NiFi plays a key role is when there is an on-premises Hadoop system and a cloud component. Because of the company's policy regarding firewalls, the data cannot be directly moved through services such as Kinesis or DMS integrating directly with on-premises resources. Apache NiFi works as an orchestrator and a middle tool to get the parameters to trigger the job and then pass on the handler to the cloud services. Because of the firewall, Apache NiFi comes into the picture. Another use case for Apache NiFi is once the data is created in S3; I can extract a subset of the data and send it via SFTP to outside recipients.

I have another scenario regarding my main use case with Apache NiFi; there is a use case for synthetic data, and we are using synthetic data generative AI software to synthesize data in the cloud environment. Now for users who are not on the cloud and want to access the synthetic data, Apache NiFi is used to pull the data back from the cloud.

What is most valuable?

The best features Apache NiFi offers include drag and drop capabilities; I would not say it is no-code, but it is pretty low-code.

The drag-and-drop interface of Apache NiFi has helped my team by cutting down on development time, allowing us to focus more on the mapping part and testing part of it.

In terms of features, if a job requires a predetermined mapping for the metadata, Apache NiFi comes in handy to map for the metadata, which is needed before an external file can be validated and pushed onto the cloud for processing. This is better than traditional systems where we would have to sit down and map individual attributes and their data types to create that metadata interface.

Apache NiFi has positively impacted my organization by definitely bridging the gap between the on-premises and cloud interaction until we find a solution to open the firewall for cloud components to directly interact with on-premises services.

What needs improvement?

Regarding improvements in Apache NiFi, there is scope; it should gear more towards an AI mechanism, especially for metadata generation. If I want to integrate Python code or embed Python code within the Apache NiFi parameters or workflows, it should come with AI integration that can assist in generating code based on user requirements. Gearing more towards a no-code mechanism will really enhance Apache NiFi's productivity, as customers would want that.

About needed improvements, I think integration with other tools would really help in the age of AI. Apache NiFi should have APIs or connectors that can connect seamlessly to other external entities, whether in the cloud or on-premises, creating a plug-and-play mechanism.

For how long have I used the solution?

I have been using Apache NiFi for roughly one and a half years.

What do I think about the stability of the solution?

The biggest challenges I have faced while using Apache NiFi revolve around reliability; I have seen Apache NiFi crashing at times, which is one of the issues we have faced in production. It does not happen often, but there have been instances where the NiFi nodes crashed, and sometimes we encountered issues retrieving data where it was unable to establish connectivity.

To troubleshoot or resolve the Apache NiFi crashes and data retrieval issues, my team primarily replicates the same scenario in development to see where the issue lies. On one occasion, the failure was linked to an authentication mechanism change at the enterprise level.

In my experience, Apache NiFi is stable.

What do I think about the scalability of the solution?

Apache NiFi's scalability has not been an issue yet; I would say it is pretty stable. Depending on the workload we process, it remains stable since at the end of the day, it is just used as an orchestration tool that triggers the job while the heavy lifting is done on Spark servers.

How are customer service and support?

I have not needed to reach out for help regarding customer support for Apache NiFi.

Which solution did I use previously and why did I switch?

I previously used Airflow before switching to Apache NiFi.

I switched from Airflow to Apache NiFi because Airflow is still not widely used within the organization; my experience with Airflow has been with other companies, not with Mastercard.

What about the implementation team?

We have a dedicated team that handles version upgrades and maintenance for Apache NiFi.

What was our ROI?

I would say there has been time saved to an extent, but not in terms of resources when evaluating return on investment.

Which other solutions did I evaluate?

I was not involved in evaluating other options before choosing Apache NiFi.

What other advice do I have?

On a scale of one to ten, I would rate Apache NiFi an eight. I chose eight out of ten because there is scope for improvement; in this age, folks want more automation, which is why I kept it at eight. My advice for others considering using Apache NiFi is that it is a good tool, but there are also other options to consider before making a decision. I gave this review an overall rating of eight out of ten.

Which deployment model are you using for this solution?

On-premises

reviewer2769915

Data Consultant at a comms service provider with 201-500 employees

Dec 9, 2025

Managing data ingestion has created tech debt but still accelerates multi-step streaming work

What is our primary use case?

I use Apache NiFi for data ingestion from multiple sources into our data lake.

How has it helped my organization?

I can't point to many positive impacts apart from ease of use for simpler use cases. The rest is mostly tech debt and negative impacts.

What is most valuable?

What stood out the most about Apache NiFi was its ease of use when it comes to smaller datasets or when you have multiple streaming data sets that need to be ingested in multiple steps. When you don't have a team with a lot of coding background, Apache NiFi comes in very handy.

The ease of use in Apache NiFi has helped my team because anyone can learn how to use it in a short amount of time, so we were able to get a lot of work done. You don't have to build anything from scratch as you do with Spark or any other tool sets because you already have blocks created within Apache NiFi, so you save time. In terms of development time, Apache NiFi saves a considerable amount.

What needs improvement?

There are a lot of challenges with Apache NiFi. I believe it is very limited when it comes to scalability and version control. Various improvements are needed for Apache NiFi; the way the tool is structured needs to improve.

Having version control and metadata access is essential. If Apache NiFi compute could scale with workload through ephemeral scaling, that could help; but there is not much we can extract from the Apache NiFi API. It is a huge pain when it comes to reading Apache NiFi configuration metadata. Apache NiFi needs to be faster, and having version control, metadata access, and some improvements would be beneficial. However, it is a good solution with room for improvement.

For how long have I used the solution?

I have been using Apache NiFi for most of my career, but over the past two years, I have used it quite a lot.

What do I think about the stability of the solution?

Apache NiFi is stable in most cases. Downtime with Apache NiFi is not unheard of, so we do run into downtimes every now and then, but for the most part, it is stable.

What do I think about the scalability of the solution?

Scalability for Apache NiFi is a problem; it does not scale as well as Spark-based solutions. Scalability is one of the reasons downtime is still a thing on Apache NiFi.

How are customer service and support?

I have had to reach out for help with Apache NiFi, but since I get Apache NiFi from Cloudera, they have their premium support, so it is not that big an issue.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

We used an older version of Apache NiFi before Cloudera; we chose to continue with Apache NiFi because we were coming from Apache NiFi.

How was the initial setup?

Complex

What about the implementation team?

We host it on AWS and get it from Cloudera.

What was our ROI?

I haven't seen a return on investment. There may be return on investment based on the technology and easily moving our workloads onto Apache NiFi from our previous system. However, the tech debt that is created from Apache NiFi is not a good addition. It saves time on development, but then you have to cater to some custom solutions.

What's my experience with pricing, setup cost, and licensing?

It is expensive with little to show for it, but I personally don't deal with billing specifics.

Which other solutions did I evaluate?

We evaluated using Spark for all of our ingestions before choosing Apache NiFi, and at that point, we had to build a framework entirely from scratch. That is why we didn't go there, and we went for Apache NiFi.

What other advice do I have?

I believe the processing and cost consumption are actually not that good when working with Apache NiFi. Apache NiFi is one of the most expensive tools we use. I do believe it is an obsolete technology and the world is moving away from it. I have given this review a rating of five out of ten.

Which deployment model are you using for this solution?

Private Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Other

reviewer2784384

Partner at a tech vendor with 10,001+ employees

Dec 3, 2025

Data workflows have accelerated project delivery and reduced costs for analytics teams

What is our primary use case?

Apache NiFi is used for real-time and batch ingestion on data warehouse platforms. For example, Apache NiFi ingests all analytics from the e-commerce website into the data warehouse in the AWS Redshift database.

How has it helped my organization?

Apache NiFi has helped the organization by speeding up projects, which resulted in cost savings. Specifically, a 30% reduction in cost was observed.

What is most valuable?

The best feature of Apache NiFi is its simplicity, because it is a drag-and-drop tool. That simplicity speeds up the whole implementation process. Apache NiFi is also used to speed up projects so that more projects can be taken on in less time.

What needs improvement?

Apache NiFi is a good product as it is currently.

For how long have I used the solution?

Apache NiFi has been used for five years across different projects.

What do I think about the stability of the solution?

Apache NiFi is stable.

What do I think about the scalability of the solution?

The scalability of Apache NiFi is good because it is simple to scale up the resources.

How are customer service and support?

The customer support for Apache NiFi is fine. I would rate the customer support of Apache NiFi a 10 on a scale of 1 to 10.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

A custom solution implemented with Python was used before switching to Apache NiFi. The decision was made to switch from the custom Python solution to Apache NiFi to simplify deployment.

How was the initial setup?

Apache NiFi was purchased through the AWS Marketplace.

What was our ROI?

A return on investment has not been observed, and it is not possible to share these metrics.

What's my experience with pricing, setup cost, and licensing?

The experience with pricing, setup cost, and licensing was fine, as the integration with the AWS Marketplace was very good. The pricing in Italy is considered a little bit high, but the product is worth it.

Which other solutions did I evaluate?

Other options were not evaluated before choosing Apache NiFi.

What other advice do I have?

Apache NiFi receives a rating of 9 out of 10, chosen because of the documentation and the support of the product. The advice for others looking into using Apache NiFi is to test the solution with a POC and then move to production quickly.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)

reviewer2784747

Senior Data Engineer at a tech vendor with 10,001+ employees

Dec 4, 2025

Plug-and-play data flows have transformed how our team builds and runs large daily ingestions

What is our primary use case?

I have been using Apache NiFi for a couple of years now. My experience with Apache NiFi is that I have used it for a customer of ours who used that system to orchestrate ingestion processes from different sources into AWS. My main use case for Apache NiFi is data ingestion, so connecting to sources like APIs or different data lakes to get the data into S3. I manage more than 50 ingestion flows currently; some of them are quite small, dealing with a few megabytes a day, while others are multiple gigabytes a day and those are run daily, so we have a daily refresh of the sources.

What is most valuable?

Apache NiFi's best features include its plug-and-play solution, which means you don't need a lot of insight or knowledge to use it. The simplicity and plug-and-play approach of Apache NiFi has helped our team by allowing our customer to scale and build different ingestion flows, which previously needed additional development effort because custom solutions were required; now we just plug and play.

The features of Apache NiFi, including the integration capabilities to connect to different sources and monitoring with all the observability features, work really well for our team in obtaining the needed information. Apache NiFi has positively impacted our organization by making us a lot more productive; I would say development has significantly improved.

Development has improved with a reduction in time spent being the main benefit; before we needed a matter of days to create the ingestion flows, but now it only takes a couple of hours to configure.

What needs improvement?

I don't have any frustrations about how Apache NiFi can be improved; I'm pretty happy. I don't think there are needed improvements for Apache NiFi, particularly regarding the user interface, documentation, or performance.

For how long have I used the solution?

I have been working in my current field for five years now.

What do I think about the stability of the solution?

Apache NiFi is indeed stable.

What do I think about the scalability of the solution?

Apache NiFi's scalability works for us.

How are customer service and support?

I haven't interacted with their customer support.

Which solution did I use previously and why did I switch?

I have evaluated other options before choosing Apache NiFi, and I have used other options as well.

How was the initial setup?

Apache NiFi is deployed in our organization using public cloud infrastructure. I purchased Apache NiFi through the AWS Marketplace. There are no improvements needed for Apache NiFi deployment on AWS, as the experience with that was pretty smooth.

What about the implementation team?

I don't manage pricing, setup cost, or licensing for that customer, but I think they are pretty satisfied with it as well.

What other advice do I have?

My advice for others looking into using Apache NiFi is that it's a good solution. I would rate this review 9 out of 10.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)

Bharghava Raghavendra Beesa

Senior Developer at Infosys

Jan 21, 2025

The tool enables effective data transformation and integration

What is our primary use case?

I use NiFi as a tool for ETL, which stands for extract, transform, and load. It is particularly effective for integration methodologies.

The tool is useful for designing ETL pipelines and is an open-source product. Data is often stored in different forms and locations. If I want to integrate and transform it, NiFi can help load data from one place to another while making transformations.

I can handle stream or batch data and identify various data types on different platforms. NiFi can integrate with tools like Slack and perform required transformations before loading to the desired downstream.

It is primarily a pipeline-building tool with a graphical UI; however, I can also write custom JARs for specific functions. NiFi is an open-source tool effective for data migration and transformations, helping improve data quality from various sources.

What is most valuable?

NiFi works on data and file levels, streamlining real-time data processes. It is highly effective for handling real-time data by working with APIs for immediate and continuous data extraction. For real-time data tasks, this front-end UI-based tool is superior to back-end platforms.

What needs improvement?

There are some areas for improvement, particularly with record-level tasks that take a bit of time. The quality of JSON data processing could be improved, as JSON workloads require manual conversions without a specific process.

Enhancing features related to alerting would be helpful, including mobile alerts for pipeline issues. Integration with mobile devices for error alerts would simplify information delivery.

What do I think about the stability of the solution?

The product is stable for simple tasks, like using databases that are not distributed. However, for distributed environments like Hadoop or HBase, some vulnerabilities exist. While these are not major issues, they should not be ignored.

What do I think about the scalability of the solution?

Scaling works well, allowing cluster expansion. However, I have never encountered very large clusters, so it's uncertain how well it supports extensive scaling.

How was the initial setup?

The initial setup is fast, especially for communication stabilization. Although the product is open source, it functions as a cluster. For single-node environments, installation is simple. For company-wide or enterprise-level clusters, the initial stages may present issues with authentication and access. Stabilization, such as port communication, may not be immediately effective.

What other advice do I have?

I recommend the product for its data privacy features. It allows secure data handling because the data is stored on my nodes. However, a skilled technician is necessary due to the reliance on Java, especially for back-end operations and error debugging.

Enterprise versions may offer easier troubleshooting. As an open-source solution, good support is crucial.

I rate the overall product as eight out of ten.

Which deployment model are you using for this solution?

On-premises

Arjun Pandey

Engineering Lead- Cloud and Platform Architecture at a financial services firm with 1,001-5,000 employees

Oct 25, 2023

Good monitoring, metrics capabilities and provides ability to design processors with a single click

What is our primary use case?

As a DevOps engineer, my day-to-day task is to move files from one location to another, doing some transformation along the way. For example, I might pull messages from Kafka and put them into S3 buckets. Or I might move data from a GCS bucket to another location.

NiFi is really good for this because it has very good monitoring and metrics capabilities. When I design a pipeline in NiFi, I can see how much data is being processed, where it is at each stage, and what the total throughput is.

I can see all the metrics related to the complete pipeline. So, I personally like it very much.

What is most valuable?

The good thing about Apache NiFi is that it has a concept called a flow file, and there's something called a flow file processor. The processor is the building block of your entire job. They have close to 500 processors for each purpose.

For example, for reading from Kafka, NiFi has a processor called "ConsumeKafka". To write to S3, it has a processor called "PutS3Object". Now, if I read from Kafka in my own application, I'd need to ensure the library I'm using tracks my messages. I'd also need to handle any failures by rereading messages and ensuring acknowledgment. But all this complexity is already handled by the NiFi processors.
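To make that bookkeeping concrete, here is a minimal sketch of what a hand-written copy job has to do: track a committed offset, retry failed writes, and acknowledge a message only after the sink has accepted it. It does not use a real Kafka or S3 client. `InMemoryLog`, `copy_with_ack`, and `flaky_write` are hypothetical names invented for illustration, and this is not how NiFi implements its processors internally.

```python
from typing import Callable, List


class InMemoryLog:
    """Hypothetical stand-in for a message log: an ordered list plus a committed offset."""

    def __init__(self, messages: List[str]) -> None:
        self.messages = messages
        self.committed = 0  # index of the next message that has not been acknowledged

    def poll(self) -> List[str]:
        # Always read from the last committed offset, so unacknowledged
        # messages are delivered again after a failure.
        return self.messages[self.committed:]

    def commit(self, count: int) -> None:
        self.committed += count


def copy_with_ack(log: InMemoryLog, write: Callable[[str], None], max_attempts: int = 3) -> int:
    """Write each pending message to the sink and acknowledge it only after the write succeeds.

    Returns the number of messages delivered.
    """
    delivered = 0
    for message in log.poll():
        for attempt in range(1, max_attempts + 1):
            try:
                write(message)
                break
            except IOError:
                if attempt == max_attempts:
                    # Give up; the offset stays uncommitted so a later run re-reads the message.
                    raise
        log.commit(1)
        delivered += 1
    return delivered


if __name__ == "__main__":
    sink: List[str] = []
    failures = {"b": 1}  # make the first write of "b" fail once

    def flaky_write(message: str) -> None:
        if failures.get(message, 0) > 0:
            failures[message] -= 1
            raise IOError("simulated sink outage")
        sink.append(message)

    log = InMemoryLog(["a", "b", "c"])
    print(copy_with_ack(log, flaky_write), sink, log.committed)
```

Committing after the write gives at-least-once delivery: a crash between the write and the commit causes a duplicate rather than a lost message, which is the trade-off I would otherwise have to reason about myself.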

They have around 500 processors, with a community investing significant effort into developing them. I can configure a processor with a single click, export the entire workflow, and import it elsewhere. The exported format is ready to use, so NiFi is immediately set up.

It's also distributed in nature so that I can scale it across nodes based on the workload. These nodes share their state. If one node goes down during processing, that data might be lost, but any subsequent data is safe. Such occurrences are rare.

In essence, if you want a quick solution, Apache NiFi is a strong contender. There are other solutions like Airflow and some paid pipeline options.

Airflow is open-source but can be complicated. For ETL or ELT solutions, there are pricier options. But if I need a pipeline that I can monitor step by step, Apache NiFi is a good choice. It integrates with Prometheus metrics, allowing me to embed them in my workflow.

There's also a processor for integration with Slack, and I can receive notifications when the workflow is completed or fails.

Another feature I appreciate is "back pressure," which NiFi handles automatically. It maintains its own queue and addresses back-pressure issues. If, for instance, a downstream component isn't fast enough, items are held in a queue, managed internally by NiFi's back-pressure mechanism.
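The idea can be sketched as a small simulation, assuming a single queue between a fast source and a slower sink: once the queue reaches a configured object count, the source is simply not scheduled until the sink has drained it below that count. `Connection`, `run`, and every parameter name here are hypothetical and only model the behavior; this is not NiFi code.

```python
from collections import deque
from typing import Deque, Dict


class Connection:
    """Hypothetical model of a queue between two processors with an object-count threshold."""

    def __init__(self, object_threshold: int) -> None:
        self.object_threshold = object_threshold
        self.queue: Deque[int] = deque()

    def back_pressure_applied(self) -> bool:
        return len(self.queue) >= self.object_threshold


def run(ticks: int, produce_per_tick: int, consume_per_tick: int, object_threshold: int) -> Dict[str, int]:
    """Simulate a fast source feeding a slower sink through one connection."""
    conn = Connection(object_threshold)
    produced = consumed = paused_ticks = peak = 0
    for _tick in range(ticks):
        # The source only runs while the connection is below its threshold.
        if conn.back_pressure_applied():
            paused_ticks += 1
        else:
            for _ in range(produce_per_tick):
                conn.queue.append(produced)
                produced += 1
        peak = max(peak, len(conn.queue))
        # The sink drains what it can on every tick.
        for _ in range(min(consume_per_tick, len(conn.queue))):
            conn.queue.popleft()
            consumed += 1
    return {
        "produced": produced,
        "consumed": consumed,
        "queued": len(conn.queue),
        "peak": peak,
        "paused_ticks": paused_ticks,
    }


if __name__ == "__main__":
    print(run(ticks=10, produce_per_tick=5, consume_per_tick=2, object_threshold=10))
```

In this sketch the threshold is a soft limit: a source that starts a tick just below the threshold can still push the queue past it, but the queue never grows without bound and nothing is dropped.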

What needs improvement?

There is room for improvement in integration with SSO. In my experience, NiFi does not have any integration with SSO, and if I want to set up some kind of role-based access control across the organization, that is not possible.

So I have to create a separate username and password and then share it with each individual team. That is a pain point at the enterprise level.

For how long have I used the solution?

I have been using it for one and a half years.

What do I think about the stability of the solution?

I would rate the stability a seven out of ten because there are a lot of processes that need to be implemented.

What do I think about the scalability of the solution?

It's scalable. It can easily scale across multiple nodes. It also handles workload distribution internally: the workers coordinate with each other and share the workload. So, it's pretty good in terms of scalability.

How was the initial setup?

The initial setup is very easy, especially for users who are familiar with ETL or ELT.

NiFi is one of the easiest tools on the market to learn and use. It is also a quick-win solution, which is good for first-time users who are developing data pipelines for ETL. NiFi makes it easy to track and trace the status of your pipelines, so you can be sure that they are working properly.

What other advice do I have?

If I were to advise someone, I would ask the user what endpoints they want to touch. If I want to read something from Kafka and I want to put this thing on the S3 bucket, what is the alternative I have?

I have Kafka Connect, where I can connect to Kafka and put the data into an S3 bucket. Is this scalable? No. Can I monitor it? No.

We can't monitor it. We can't scale it. It's going to be a complete black box. Only a person who knows Kafka Connect, or Kafka, can understand what is happening there while using Kafka Connect. By comparison, with NiFi I literally don't need to understand what Kafka is.

I know, "Okay, this is Kafka. These are the endpoints, and this is the URL I have to point to." That's it. My job is done. I can create a complete flow pipeline within, let's say, thirty minutes or so without having any prior knowledge. I can read, I can Google it, and I can just implement it.

For people who are new to big data technologies like Kafka and BigQuery, I would give this solution an eight out of ten.

Let's say you need to build a solution to read from Kafka and write to an S3 bucket. You could use Kafka Connect, but if your requirements change and you need to start reading from a database instead, Kafka Connect will not work. With Apache NiFi, you can easily modify your flow pipeline to start reading from the database instead.

Which deployment model are you using for this solution?

Public Cloud

Title	Rating	Mindshare	Recommending	Interviews
Apache Spark	4.2	8.5%	90%	69
AWS Lambda	4.3	15.3%	94%	92
