We performed a comparison between Palantir Foundry and StreamSets based on real PeerSpot user reviews.
Find out in this report how the two Data Integration solutions compare in terms of features, pricing, service and support, easy of deployment, and ROI."The security is also excellent. It's highly granular, so the admins have a high degree of control, and there are many levels of security. That worked well. You won't have an EDC unless you put everything onto the platform because it is its own isolated thing."
"Palantir Foundry is a robust platform that has really strong plugin connectors and provides features for real-time integration."
"Live video sessions enhance the available documentation and allow you to ask questions directly."
"The solution offers very good end-to-end capabilities."
"The virtualization tool is useful."
"It's scalable."
"It is easy to map out a workflow and run trigger-based scripts without having to deploy to another server."
"The ease of use is my favorite feature. We're able to build different models and projects or combine different projects to build one use case."
"It's very easy to integrate. It integrates with Snowflake, AWS, Google Cloud, and Azure. It's very helpful for DevOps, DataOps, and data engineering because it provides a comprehensive solution, and it's not complicated."
"The ability to have a good bifurcation rate and fewer mistakes is valuable."
"Important features include that it comprises lots of functionality to connect data from various sources through connector availability, scheduling pipelines at any time, and integration with third-party and security solutions for encryption."
"Also, the intuitive canvas for designing all the streams in the pipeline, along with the simplicity of the entire product are very big pluses for me. The software is very simple and straightforward. That is something that is needed right now."
"The scheduling within the data engineering pipeline is very much appreciated, and it has a wide range of connectors for connecting to any data sources like SQL Server, AWS, Azure, etc. We have used it with Kafka, Hadoop, and Azure Data Factory Datasets. Connecting to these systems with StreamSets is very easy."
"The entire user interface is very simple and the simplicity of creating pipelines is something that I like very much about it. The design experience is very smooth."
"StreamSets’ data drift resilience has reduced the time it takes us to fix data drift breakages. For example, in our previous Hadoop scenario, when we were creating the Sqoop-based processes to move data from source to destinations, we were getting the job done. That took approximately an hour to an hour and a half when we did it with Hadoop. However, with the StreamSets, since it works on a data collector-based mechanism, it completes the same process in 15 minutes of time. Therefore, it has saved us around 45 minutes per data pipeline or table that we migrate. Thus, it reduced the data transfer, including the drift part, by 45 minutes."
"The best thing about StreamSets is its plugins, which are very useful and work well with almost every data source. It's also easy to use, especially if you're comfortable with SQL. You can customize it to do what you need. Many other tools have started to use features similar to those introduced by StreamSets, like automated workflows that are easy to set up."
"If you want to create new models on specific data sets, computing that is quite costly."
"Compared to other hyperscalers, Palantir Foundry is complex and not so user-intuitive."
"The data lineage was challenging. It's hard to track data from the sources as it moves through stages. Informatica EDC can easily capture and report it because it talks to the metadata. This is generated across those various staging points."
"There is not a wide user base for the solution's online documentation so it is sometimes difficult to find answers."
"The frontend capabilities of Palantir Foundry could be improved."
"The solution could use more online documentation for new users."
"Some error messages can be very cryptic."
"Difficult to receive data from external sources."
"Sometimes, it is not clear at first how to set up nodes. A site with an explanation of how each node works would be very helpful."
"We've seen a couple of cases where it appears to have a memory leak or a similar problem."
"One thing that I would like to add is the ability to manually enter data. The way the solution currently works is we don't have the option to manually change the data at any point in time. Being able to do that will allow us to do everything that we want to do with our data. Sometimes, we need to manually manipulate the data to make it more accurate in case our prior bifurcation filters are not good. If we have the option to manually enter the data or make the exact iterations on the data set, that would be a good thing."
"Currently, we can only use the query to read data from SAP HANA. What we would like to see, as soon as possible, is the ability to read from multiple tables from SAP HANA. That would be a really good thing that we could use immediately. For example, if you have 100 tables in SQL Server or Oracle, then you could just point it to the schema or the 100 tables and ingestion information. However, you can't do that in SAP HANA since StreamSets currently is lacking in this. They do not have a multi-table feature for SAP HANA. Therefore, a multi-table origin for SAP HANA would be helpful."
"I would like to see it integrate with other kinds of platforms, other than Java. We're going to have a lot of applications using .NET and other languages or frameworks. StreamSets is very helpful for the old Java platform but it's hard to integrate with the other platforms and frameworks."
"Using ETL pipelines is a bit complicated and requires some technical aid."
"We often faced problems, especially with SAP ERP. We struggled because many columns weren't integers or primary keys, which StreamSets couldn't handle. We had to restructure our data tables, which was painful. Also, pipeline failures were common, and data drifting wasn't addressed, which made things worse. Licensing was another issue we encountered."
"Sometimes, when we have large amounts of data that is very efficiently stored in Hadoop or Kafka, it is not very efficient to run it through StreamSets, due to the lack of efficiency or the resources that StreamSets is using."
Palantir Foundry is ranked 11th in Data Integration with 13 reviews while StreamSets is ranked 8th in Data Integration with 24 reviews. Palantir Foundry is rated 7.6, while StreamSets is rated 8.4. The top reviewer of Palantir Foundry writes "The data visualization is fantastic and the security is excellent". On the other hand, the top reviewer of StreamSets writes "We no longer need to hire highly skilled data engineers to create and monitor data pipelines". Palantir Foundry is most compared with Azure Data Factory, Palantir Gotham, SAP Data Services, AWS Glue and Alteryx Designer, whereas StreamSets is most compared with Fivetran, Azure Data Factory, Informatica PowerCenter, SSIS and Oracle GoldenGate. See our Palantir Foundry vs. StreamSets report.
See our list of best Data Integration vendors and best Cloud Data Integration vendors.
We monitor all Data Integration reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.