IBM Cloud Pak for Data vs Pentaho Data Integration and Analytics comparison

Comparison Buyer's Guide
Executive Summary

We performed a comparison between IBM Cloud Pak for Data and Pentaho Data Integration and Analytics based on real PeerSpot user reviews.

Find out in this report how the two Data Integration solutions compare in terms of features, pricing, service and support, ease of deployment, and ROI.
To learn more, read our detailed IBM Cloud Pak for Data vs. Pentaho Data Integration and Analytics Report (Updated: May 2024).
771,212 professionals have used our research since 2012.
Featured Review
Quotes From Members
We asked business professionals to review the solutions they use.
Here are some excerpts of what they said:
Pros
  • "Cloud Pak's most valuable features are IBM MQ, IBM App Connect, IBM API Connect, and ISPF."
  • "The most valuable feature of IBM Cloud Pak for Data is the Modeler flows. The ability to develop models using a graphical approach and the capability to connect to various sources, as well as the data virtualization capabilities, allow me to easily access and utilize data that is dispersed across different sources."
  • "Its data preparation capabilities are highly valuable."
  • "One of Cloud Pak's best features is the Watson Knowledge Catalog, which helps you implement data governance."
  • "You can model the data there, connect the data models with the business processes and create data lineage processes."
  • "It is a scalable solution, and we have had no issues with its scalability in our company. I rate the solution's scalability a nine out of ten."
  • "The most valuable features of IBM Cloud Pak for Data are the Watson Studio, where we can initiate more groups and write code. Additionally, Watson Machine Learning is available with many other services, such as APIs which you can plug the machine learning models."
  • "The most valuable features are data virtualization and reporting."

More IBM Cloud Pak for Data Pros →

  • "The abstraction is quite good."
  • "One of the most valuable features is the ability to create many API integrations. I'm always working with advertising agents and using Facebook and Instagram to do campaigns. We use Pentaho to get the results from these campaigns and to create dashboards to analyze the results."
  • "It makes it pretty simple to do some fairly complicated things. Both I and some of our other BI developers have made stabs at using, for example, SQL Server Integration Services, and we found them a little bit frustrating compared to Data Integration. So, its ease of use is right up there."
  • "The fact that it's a low-code solution is valuable. It's good for more junior people who may not be as experienced with programming."
  • "Provides a good open source option."
  • "The way it has improved our product is by giving our users the ability to do ad hoc reports, which is very important to our users. We can do predictive analysis on trends coming in for contracts, which is what our product does. The product helps users decide which way to go based on the predictive analysis done by Pentaho. Pentaho is not doing predictions, but reporting on the predictions that our product is doing. This is a big part of our product."
  • "It's very simple compared to other products out there."
  • "The amount of data that it loads and processes is good."

More Pentaho Data Integration and Analytics Pros →

Cons
  • "One thing that bugs me is how much infrastructure Cloud Pak requires for the initial deployment. It doesn't allow you to start small. The smallest permitted deployment is too big. It's a huge problem that prevents us from implementing the solution in many scenarios."
  • "There is a solution that is part of IBM Cloud Pak for Data called Watson OpenScale. It is used to monitor the deployed models for the quality and fairness of the results. This is one area that needs a lot of improvement."
  • "The tool depends on the control plane, an OpenShift container platform utilized as an orchestration layer...So, we have communicated this issue to IBM and asked if it is feasible to adapt the solution to work on a Kubernetes platform that we support."
  • "Cloud Pak would be improved with integration with cloud service providers like Cloudera."
  • "The solution's user experience is an area that has room for improvement."
  • "The product is trying to be more maturity in terms of connectors. That, I believe, is an area where Cloud Pak can improve."
  • "The solution could have more connectors."
  • "The product must improve its performance."

More IBM Cloud Pak for Data Cons →

  • "I would like to see more improvements with AS400 DB2."
  • "I would like to see improvement when it comes to integrating structured data with text data or anything that is unstructured. Sometimes we get all kinds of different files that we need to integrate into the warehouse."
  • "Parallel execution could be better in Pentaho. It's very simple but I don't think it works well."
  • "The web interface is rusty, and the biggest problem with Pentaho is debugging and troubleshooting. It isn't easy to build the pipeline incrementally. At least in our case, it's hard to find a way to execute step by step in the debugging mode."
  • "I have been facing some difficulties when working with large datasets. It seems that when there is a large amount of data, I experience memory errors."
  • "I was not happy with the Pentaho Report Designer because of the way it was set up. There was a zone and, under it, another zone, and under that another one, and under that another one. There were a lot of levels and places inside the report, and it was a little bit complicated. You have to search all these different places using a mouse, clicking everywhere... each report is coded in a binary file... You cannot search with a text search tool..."
  • "I would like to see improvements made for real-time data processing."
  • "I work with different databases. I would like to work with more connectors to new databases, e.g., DynamoDB and MariaDB, and new cloud solutions, e.g., AWS, Azure, and GCP. If they had these connectors, that would be great. They could improve by building new connectors. If you have native connections to different databases, then you can make instructions more efficient and in a more natural way. You don't have to write any scripts to use that connector."

More Pentaho Data Integration and Analytics Cons →

Pricing and Cost Advice
  • "I think that this product is too expensive for smaller companies."
  • "I don't have the exact licensing cost for IBM Cloud Pak for Data, as my company is still finalizing requirements, including monthly, yearly, and three-year licensing fees. Still, on a scale of one to five, I'd rate it a three because, compared to other vendors, it's more complicated."
  • "Cloud Pak's cost is a little high."
  • "IBM Cloud Pak for Data is expensive. If we include the training time and the machine learning, it's expensive. The cost of the execution is more reasonable."
  • "For the licensing of the solution, there is a yearly payment that needs to be made. Also, since it is expensive, cost-wise, I rate the solution an eight or nine out of ten."
  • "It's quite expensive."
  • "The solution is expensive."
  • More IBM Cloud Pak for Data Pricing and Cost Advice →

  • "There is a good open source option (Community Edition)."
  • "The price of the regular version is not reasonable and it should be lower."
  • "Sometimes we provide the licenses or the customer can procure their own licenses. Previously, we had an enterprise license. Currently, we are on a community license as this is adequate for our needs."
  • "It does seem a bit expensive compared to the serverless product offering. Tools, such as Server Integration Services, are "almost" free with a database engine. It is comparable to products like Alteryx, which is also very expensive."
  • "I think Lumada's price is fair compared to some of the others, like BusinessObjects, which is was the other thing that I used at my previous job. BusinessObject's price was more reasonable before SAP acquired it. They jacked the price up significantly. Oracle's OBIEE tool was also prohibitively expensive."
  • "When we first started with it, it was much cheaper. It has gone up drastically, especially since Hitachi bought out Pentaho."
  • "The cost of these types of solutions are expensive. So, we really appreciate what we get for our money. Though, we don't think of the solution as a top-of-the-line solution or anything like that."
  • "The pricing has been pretty good. I'm used to using everything open-source or freeware-based. I understand that organizations need to make sure that the solutions are secure, and that's basically where I hit a roadblock in my current organization. They needed to ensure that we had a license and we had a secure way of accessing it so that no outside parties could get access to our data, but in terms of pricing, considering how much other teams are spending on cloud solutions or even their existing solutions, its price point is pretty good. At this time, there are no additional costs. We just have the licensing fees."
  • More Pentaho Data Integration and Analytics Pricing and Cost Advice →

    Questions from the Community
    Top Answer: DataStage allows me to connect to different data sources.
    Top Answer: The product must improve its performance. We see typical cloud-related issues in the solution. IBM can still focus more on keeping the performance up and keeping it 100% available all the time.
    Top Answer: Hi Rajneesh yes here is the feature comparison between the community and enterprise edition :…
    Top Answer: In my opinion, the reporting side of this tool needs serious improvements. In my previous company, we worked with Hitachi Lumada Data Integration and while it does a good job for what it’s worth, it…
    Top Answer: My company has used this product to transform data from databases, CSV files, and flat files. It really does a good job. We were most satisfied with the results in terms of how many people could use…
    Ranking
    IBM Cloud Pak for Data
      • Ranking: 17th out of 101 in Data Integration
      • Views: 4,032
      • Comparisons: 2,639
      • Reviews: 9
      • Average Words per Review: 500
      • Rating: 8.4
    Pentaho Data Integration and Analytics
      • Ranking: 15th out of 101 in Data Integration
      • Views: 3,247
      • Comparisons: 1,075
      • Reviews: 10
      • Average Words per Review: 1,105
      • Rating: 7.5
    Also Known As
      • IBM Cloud Pak for Data: Cloud Pak for Data
      • Pentaho Data Integration and Analytics: Hitachi Lumada Data Integration, Kettle, Pentaho Data Integration
    Overview

    IBM Cloud Pak® for Data is a fully integrated data and AI platform that modernizes how businesses collect, organize, and analyze data to infuse AI throughout their organizations. Cloud-native by design, the platform unifies market-leading services spanning the entire analytics lifecycle. From data management, DataOps, and governance to business analytics and automated AI, IBM Cloud Pak for Data helps eliminate the need for costly, and often competing, point solutions while providing the information architecture you need to implement AI successfully.

    Building on the streamlined hybrid-cloud foundation of Red Hat® OpenShift®, IBM Cloud Pak for Data takes advantage of the underlying resource and infrastructure optimization and management. The solution fully supports multicloud environments such as Amazon Web Services (AWS), Azure, Google Cloud, IBM Cloud™ and private cloud deployments. Find out how IBM Cloud Pak for Data can lower your total cost of ownership and accelerate innovation.

    Pentaho Data Integration stands as a versatile platform designed to cater to the data integration and analytics needs of organizations, regardless of their size. This powerful solution is the go-to choice for businesses seeking to seamlessly integrate data from diverse sources, including databases, files, and applications. Pentaho Data Integration facilitates the essential tasks of cleaning and transforming data, ensuring it's primed for meaningful analysis. With a wide array of tools for data mining, machine learning, and statistical analysis, Pentaho Data Integration empowers organizations to glean valuable insights from their data. What sets Pentaho Data Integration apart is its maturity and a vibrant community of users and developers, making it a reliable and cost-effective option. Pentaho Data Integration offers a range of features, including a comprehensive ETL toolkit, data cleaning and transformation capabilities, robust data analysis tools, and seamless deployment options for data integration and analytics solutions, making it a go-to solution for organizations seeking to harness the power of their data.

    Sample Customers
      • IBM Cloud Pak for Data: Qatar Development Bank, GuideWell, Skanderborg Music Festival
      • Pentaho Data Integration and Analytics: 66Controls, Providential Revenue Agency of Río Negro, NOAA Information Systems, Swiss Real Estate Institute
    Top Industries
    VISITORS READING REVIEWS
      • Financial Services Firm: 26%
      • Computer Software Company: 11%
      • Manufacturing Company: 8%
      • Government: 8%
    REVIEWERS
      • Healthcare Company: 19%
      • Financial Services Firm: 19%
      • Comms Service Provider: 11%
      • Manufacturing Company: 11%
    VISITORS READING REVIEWS
      • Financial Services Firm: 19%
      • Computer Software Company: 14%
      • Comms Service Provider: 12%
      • Government: 7%
    Company Size
    IBM Cloud Pak for Data
    REVIEWERS
      • Small Business: 46%
      • Large Enterprise: 54%
    VISITORS READING REVIEWS
      • Small Business: 17%
      • Midsize Enterprise: 8%
      • Large Enterprise: 76%
    Pentaho Data Integration and Analytics
    REVIEWERS
      • Small Business: 27%
      • Midsize Enterprise: 31%
      • Large Enterprise: 42%
    VISITORS READING REVIEWS
      • Small Business: 21%
      • Midsize Enterprise: 11%
      • Large Enterprise: 68%

    IBM Cloud Pak for Data is ranked 17th in Data Integration with 11 reviews, while Pentaho Data Integration and Analytics is ranked 15th in Data Integration with 48 reviews. Both solutions are rated 8.0. The top reviewer of IBM Cloud Pak for Data writes "A scalable data analytics and digital transformation tool that provides useful features and integrations". The top reviewer of Pentaho Data Integration and Analytics writes "It's flexible and can do almost anything I want it to do". IBM Cloud Pak for Data is most compared with IBM InfoSphere DataStage, Azure Data Factory, Informatica Cloud Data Integration, Palantir Foundry, and Denodo, whereas Pentaho Data Integration and Analytics is most compared with SSIS, Azure Data Factory, Talend Open Studio, Oracle Data Integrator (ODI), and AWS Glue.


    We monitor all Data Integration reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.