IBM Cloud Pak for Data vs Palantir Foundry comparison

StreamSets: 5,743 views | 3,566 comparisons
IBM: 4,166 views | 2,478 comparisons
Palantir: 16,769 views | 10,858 comparisons
Comparison Buyer's Guide
Executive Summary

We performed a comparison between IBM Cloud Pak for Data and Palantir Foundry based on real PeerSpot user reviews.

Find out in this report how the two Data Integration Tools solutions compare in terms of features, pricing, service and support, ease of deployment, and ROI.
To learn more, read our detailed IBM Cloud Pak for Data vs. Palantir Foundry Report (Updated: November 2022).
653,757 professionals have used our research since 2012.
Quotes From Members
We asked business professionals to review the solutions they use.
Here are some excerpts of what they said:
Pros
"StreamSets data drift feature gives us an alert upfront so we know that the data can be ingested. Whatever the schema or data type changes, it lands automatically into the data lake without any intervention from us, but then that information is crucial to fix for downstream pipelines, which process the data into models, like Tableau and Power BI models. This is actually very useful for us. We are already seeing benefits. Our pipelines used to break when there were data drift changes, then we needed to spend about a week fixing it. Right now, we are saving one to two weeks. Though, it depends on the complexity of the pipeline, we are definitely seeing a lot of time being saved."
"In StreamSets, everything is in one place."
"StreamSets’ data drift resilience has reduced the time it takes us to fix data drift breakages. For example, in our previous Hadoop scenario, when we were creating the Sqoop-based processes to move data from source to destinations, we were getting the job done. That took approximately an hour to an hour and a half when we did it with Hadoop. However, with the StreamSets, since it works on a data collector-based mechanism, it completes the same process in 15 minutes of time. Therefore, it has saved us around 45 minutes per data pipeline or table that we migrate. Thus, it reduced the data transfer, including the drift part, by 45 minutes."
"I have used Data Collector, Transformer, and Control Hub products from StreamSets. What I really like about these products is that they're very user-friendly. People who are not from a technological or core development background find it easy to get started and build data pipelines and connect to the databases. They would be comfortable like any technical person within a couple of weeks."
"It is a very powerful, modern data analytics solution, in which you can integrate a large volume of data from different sources. It integrates all of the data and you can design, create, and monitor pipelines according to your requirements. It is an all-in-one day data ops solution."

More StreamSets Pros →

"What I found most helpful in IBM Cloud Pak for Data is containerization, which means it's easy to shift and leave in terms of moving to other clouds. That's an advantage of IBM Cloud Pak for Data."
"The most valuable features of IBM Cloud Pak for Data are the Watson Studio, where we can initiate more groups and write code. Additionally, Watson Machine Learning is available with many other services, such as APIs which you can plug the machine learning models."
"One of Cloud Pak's best features is the Watson Knowledge Catalog, which helps you implement data governance."

More IBM Cloud Pak for Data Pros →

"The solution offers very good end-to-end capabilities."
"The security is also excellent. It's highly granular, so the admins have a high degree of control, and there are many levels of security. That worked well. You won't have an EDC unless you put everything onto the platform because it is its own isolated thing."

More Palantir Foundry Pros →

Cons
"Currently, we can only use the query to read data from SAP HANA. What we would like to see, as soon as possible, is the ability to read from multiple tables from SAP HANA. That would be a really good thing that we could use immediately. For example, if you have 100 tables in SQL Server or Oracle, then you could just point it to the schema or the 100 tables and ingestion information. However, you can't do that in SAP HANA since StreamSets currently is lacking in this. They do not have a multi-table feature for SAP HANA. Therefore, a multi-table origin for SAP HANA would be helpful."
"The logging mechanism could be improved. If I am working on a pipeline, then create a job out of it and it is running, it will generate constant logs. So, the logging mechanism could be simplified. Now, it is a bit difficult to understand and filter the logs. It takes some time."
"If you use JDBC Lookup, for example, it generally takes a long time to process data."
"Sometimes, when we have large amounts of data that is very efficiently stored in Hadoop or Kafka, it is not very efficient to run it through StreamSets, due to the lack of efficiency or the resources that StreamSets is using."
"We create pipelines or jobs in StreamSets Control Hub. It is a great feature, but if there is a way to have a folder structure or organize the pipelines and jobs in Control Hub, it would be great. I submitted a ticket for this some time back."

More StreamSets Cons →

"One challenge I'm facing with IBM Cloud Pak for Data is native features have been decommissioned, such as XML input and output. Too many changes have been made, and my company has around one hundred thousand mappings, so my team has been putting more effort into alternative ways to do things. Another area for improvement in IBM Cloud Pak for Data is that it's more complicated to shift from on-premise to the cloud. Other vendors provide secure agents that easily connect with your existing setup. Still, with IBM Cloud Pak for Data, you have to perform connection migration steps, upgrade to the latest version, etc., which makes it more complicated, especially as my company has XML-based mappings. Still, the XML input and output capabilities of IBM Cloud Pak for Data have been discontinued, so I'd like IBM to bring that back."
"One thing that bugs me is how much infrastructure Cloud Pak requires for the initial deployment. It doesn't allow you to start small. The smallest permitted deployment is too big. It's a huge problem that prevents us from implementing the solution in many scenarios."
"There is a solution that is part of IBM Cloud Pak for Data called Watson OpenScale. It is used to monitor the deployed models for the quality and fairness of the results. This is one area that needs a lot of improvement."

More IBM Cloud Pak for Data Cons →

"The workflow could be improved."
"The data lineage was challenging. It's hard to track data from the sources as it moves through stages. Informatica EDC can easily capture and report it because it talks to the metadata. This is generated across those various staging points."

More Palantir Foundry Cons →

Pricing and Cost Advice
  • "StreamSets Data Collector is open source. One can utilize the StreamSets Data Collector, but the Control Hub is the main repository where all the jobs are present. Everything happens in Control Hub."
  • "It has a CPU core-based licensing, which works for us and is quite good."
  • "There are different versions of the product. One is the corporate license version, and the other one is the open-source or free version. I have been using the corporate license version, but they have recently launched a new open-source version so that anybody can create an account and use it. The licensing cost varies from customer to customer. I don't have a lot of input on that. It is taken care of by PMO, and they seem fine with its pricing model. It is being used enterprise-wide. They seem to have got a good deal for StreamSets."
  • "The pricing is good, but not the best. They have some customized plans you can opt for."
  • More StreamSets Pricing and Cost Advice →

  • "I don't have the exact licensing cost for IBM Cloud Pak for Data, as my company is still finalizing requirements, including monthly, yearly, and three-year licensing fees. Still, on a scale of one to five, I'd rate it a three because, compared to other vendors, it's more complicated."
  • More IBM Cloud Pak for Data Pricing and Cost Advice →

    Information Not Available
    Questions from the Community
    Top Answer: It is really easy to set up and the interface is easy to use.
    Top Answer: We've seen a couple of cases where it appears to have a memory leak or a similar problem. It grows for a bit and then…
    Top Answer: We typically use it to transport our Oracle raw datasets up to Microsoft Azure, and then into SQL databases there.
    Top Answer: The most valuable features are data virtualization and reporting.
    Top Answer: I think that this product is too expensive for smaller companies.
    Top Answer: The utilization of system resources is high. The technical support could be a little better. Having a "lite" version for…
    Top Answer: The solution offers very good end-to-end capabilities.
    Top Answer: The workflow could be improved. Although it works rather seamlessly, the workflow is too complicated sometimes. Maybe they…
    Top Answer: This solution is used more for the analytics available on the platform. The main use was for a COVID-19 White House…
    Also Known As
    Cloud Pak for Data
    Overview

    StreamSets offers an end-to-end data integration platform to build, run, monitor and manage smart data pipelines that deliver continuous data for DataOps, and power the modern data ecosystem and hybrid integration.

    Only StreamSets provides a single design experience for all design patterns for 10x greater developer productivity; smart data pipelines that are resilient to change for 80% fewer breakages; and a single pane of glass for managing and monitoring all pipelines across hybrid and cloud architectures to eliminate blind spots and control gaps.

    With StreamSets, you can deliver the continuous data that drives the connected enterprise.

    IBM Cloud Pak® for Data is a fully integrated data and AI platform that modernizes how businesses collect, organize and analyze data to infuse AI throughout their organizations. Cloud-native by design, the platform unifies market-leading services spanning the entire analytics lifecycle. From data management, DataOps and governance to business analytics and automated AI, IBM Cloud Pak for Data helps eliminate the need for costly, and often competing, point solutions while providing the information architecture you need to implement AI successfully.

    Building on the streamlined hybrid-cloud foundation of Red Hat® OpenShift®, IBM Cloud Pak for Data takes advantage of the underlying resource and infrastructure optimization and management. The solution fully supports multicloud environments such as Amazon Web Services (AWS), Azure, Google Cloud, IBM Cloud™ and private cloud deployments. Find out how IBM Cloud Pak for Data can lower your total cost of ownership and accelerate innovation.

    Palantir Foundry is an enterprise data management platform offering comprehensive tooling for working with big data. Because it is an operating system made for modern enterprises, it is a highly available and continuously updated platform.

    Palantir Foundry is a fully managed SaaS platform that spans from cloud hosting and data integration to flexible analytics, visualization, model-building, operational decision-making, and decision capture. It equips technical and non-technical users to make data-driven operational decisions.

    Palantir Foundry includes tools to integrate data of any scale, format, or structure, and also has granular, flexible access controls for individual datasets. In addition, it has an open, modular architecture with multiple RESTful APIs, native applications for developing machine learning and artificial intelligence, sophisticated data science applications for users of all technical abilities, and much more.

    Palantir Foundry Features

    The most valuable Palantir Foundry features include:

    Security, flexibility, interoperability, easy deployment, built-in role classification, purpose-based access controls, interoperable architecture, model integration, AI modeling tools, ontology, custom workflows, team-specific applications, self-serve analytics, lineage system, operational application building, 200+ data connectors, data versioning, change management framework, decision orchestration, and custom dashboard and report building tools.

    With Palantir Foundry You Can:

    • Accelerate development & operations
    • Collaborate globally
    • Save time on security
    • Reduce costs

    Palantir Foundry Benefits

    Some of the many Palantir Foundry benefits include:

    • Modular: Because Palantir Foundry is modular, it can integrate with existing infrastructure.

    • Granular governance: Robust data governance controls help ensure seamless compliance, auditability, and security.

    • Secure data sharing: The security infrastructure allows you to share data securely, within organizations and with chosen partners.

    • Rapid integration: Palantir Foundry can rapidly integrate models and data from data platforms into an operational ontology, and can also fuel existing data platforms.

    • Stand-alone deployment: Palantir Foundry can be deployed standalone and can provide all necessary capabilities for data connection, data integration, and model building.

    • Prevent lock-in: The data and model integration suites are designed to prevent lock-in, and assume that tools will be interoperating with Palantir Foundry as an organization grows.

    • No SQL necessary: Users do not have to use SQL queries or employ engineers to write strings in order to search petabytes of data. With Palantir Foundry, natural language is used to query data and results are returned in real-time.

    • Excellent data integration: Palantir Foundry provides secure, scalable, and resilient integration of all data sources.

    • Availability: The Palantir platform uses DNS routing, gateways and elastic load balancing to ensure availability. Dedicated accounts and virtual resources are used on a per customer basis.

    • Multiple metrics types: The Palantir platform provides hundreds of metrics. Some include average session length, drop-off rate, number of logins per user, number of accounts, and search speed.

    Reviews from Real Users

    PeerSpot users like Palantir Foundry because it has many advantages:

    “It is user-friendly, good automation, and allows you to do a better job of data governance.” - Associate, Inhouse Consulting at a pharma/biotech company

    “Works seamlessly with good end-to-end capabilities and the capability to scale.” - Wallace H., Sr. Director at a tech services company




    Sample Customers
    Availity, BT Group, Humana, Deluxe, GSK, RingCentral, IBM, Shell, SamTrans, State of Ohio, TalentFulfilled, TechBridge
    Qatar Development Bank, GuideWell, Skanderborg Music Festival
    Merck KGaA, Airbus, Ferrari, United States Intelligence Community, United States Department of Defense
    Top Industries
    VISITORS READING REVIEWS
    Financial Services Firm: 17%
    Computer Software Company: 14%
    Manufacturing Company: 7%
    Insurance Company: 7%
    VISITORS READING REVIEWS
    Computer Software Company: 19%
    Financial Services Firm: 14%
    Government: 10%
    Comms Service Provider: 10%
    VISITORS READING REVIEWS
    Computer Software Company: 15%
    Financial Services Firm: 10%
    Comms Service Provider: 8%
    Government: 8%
    Company Size
    REVIEWERS
    Small Business: 14%
    Midsize Enterprise: 29%
    Large Enterprise: 57%
    VISITORS READING REVIEWS
    Small Business: 15%
    Midsize Enterprise: 11%
    Large Enterprise: 74%
    VISITORS READING REVIEWS
    Small Business: 14%
    Midsize Enterprise: 11%
    Large Enterprise: 75%
    REVIEWERS
    Small Business: 13%
    Midsize Enterprise: 50%
    Large Enterprise: 38%
    VISITORS READING REVIEWS
    Small Business: 16%
    Midsize Enterprise: 11%
    Large Enterprise: 72%

    IBM Cloud Pak for Data is ranked 29th in Data Integration Tools with 3 reviews, while Palantir Foundry is ranked 15th in Data Integration Tools with 2 reviews. IBM Cloud Pak for Data is rated 8.0, while Palantir Foundry is rated 8.6. The top reviewer of IBM Cloud Pak for Data writes "Plenty of features, multiple services available, but machine learning could improve". On the other hand, the top reviewer of Palantir Foundry writes "The data visualization is fantastic and the security is excellent". IBM Cloud Pak for Data is most compared with Azure Data Factory, IBM InfoSphere DataStage, IBM InfoSphere Information Server, Denodo and Talend Data Fabric, whereas Palantir Foundry is most compared with Azure Data Factory, Palantir Gotham, SAP Data Services, Alteryx Designer and Oracle Data Integrator (ODI).


    We monitor all Data Integration Tools reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.