

Pentaho Data Integration and Analytics and IBM Cloud Pak for Data compete in the data integration and analytics sector. Pentaho is particularly noted for its speed and adaptability, while IBM stands out for its comprehensive capabilities and AI-driven insights. Pentaho seems to have the upper hand due to its cost-effectiveness and adaptability.
Features: Pentaho Data Integration offers an intuitive system that easily integrates with various data sources, excellent data transformation features, and a customizable interface. It facilitates rapid development and effective visualization of diverse data formats. IBM Cloud Pak for Data provides robust data preparation, integration, and governance features along with the versatility of the Watson suite for AI insights, making it suitable for large enterprises seeking comprehensive solutions.
Room for Improvement: Pentaho could improve performance in real-time processing, debugging feedback, and version compatibility. It lacks native support for some modern databases and big data platforms. IBM could benefit from simplified deployment options, more connectors, improved interface responsiveness, and reduced complexity and cost to be more appealing to smaller organizations.
Ease of Deployment and Customer Service: Pentaho Data Integration excels in on-premises and hybrid cloud deployments, supported by an active community and online resources, though official support can be slow. IBM Cloud Pak for Data is strong in public and hybrid cloud settings, with robust technical support, but users often find deployment complex. User satisfaction on support responsiveness varies between the products.
Pricing and ROI: Pentaho offers a free Community Edition and an affordable Enterprise Edition, providing significant value and reducing entry barriers for ETL development. Its open-source nature allows for notable savings. IBM Cloud Pak for Data, while offering advanced features, is seen as expensive and geared towards larger enterprises with substantial budgets.
We have been able to drive a responsible, transparent, and explainable AI workflow to operationalize AI, mitigate risk, and meet regulatory compliance requirements easily.
It is easy to collect, organize, and analyze data no matter where it is, so we are able to make data-driven decisions.
I have seen a return on investment; my team was able to stay extremely small even though we had a lot of data integrations with many companies.
I can testify to the return on investment with metrics regarding time saved; we have increased our efficiency by about 20 to 30 percent due to the swift migration processes facilitated by the tool.
I have noticed a return on investment with Pentaho Data Integration and Analytics in terms of time savings and staff reduction.
Cloud Pak is a complicated system, and it's often difficult to find the right resource in IBM to help with specific issues.
The customer support for IBM Cloud Pak for Data is great and responsive.
I would rate IBM's support at about a seven or eight out of ten because we have good support coverage owing to our long association with IBM.
24/7 assistance is available for the Enterprise Edition.
They take the time to understand our business requirements and offer appropriate recommendations.
Communication with the vendor is challenging.
I have not noticed any downtime or lagging, especially when dealing with large data, so it is very scalable.
IBM Cloud Pak for Data's scalability is very good; it can be used by any size of organization.
It can be scaled well until you reach a point where you need to perform a lot of operations, and the issue arises when it runs out of memory to handle some data.
Its ability to scale horizontally in cloud-native architectures or for massive real-time processing is limited.
Pentaho Data Integration handles larger datasets better.
Performance issues arise due to reliance on a flowchart-based mechanism instead of scripts, which can lead to longer execution times.
I find that version 3.1 is the most stable version I have ever used.
It's pretty stable; however, it struggles when dealing with smaller amounts of data.
Setting up the hybrid and multi-cloud environments is a long job and it takes time.
IBM Cloud Pak for Data can be improved because processing speeds are sometimes slow.
To improve IBM Cloud Pak for Data, I suggest more out-of-the-box integration.
We should also explore more effective partitioning for parallel processing and fine-tuning database connections to reduce load times and improve ETL speed.
Pentaho Data Integration and Analytics can be improved in how it works with different environments, specifically by making it possible to write my variables only once and change their values for different environments such as production or development.
Pentaho Data Integration and Analytics could add real-time processing and automatic alerting, with alerts or automatic notifications when a job fails or when certain data doesn't meet certain rules.
The setup cost is very expensive.
Regarding my experience with pricing, setup cost, and licensing, for a small organization, the price might be relatively high, but for huge enterprises such as ours, the price is relatively affordable.
The list price is high, but the flexibility in pricing is adequate.
I use the community version of Pentaho Data Integration and Analytics, and I do not incur additional costs.
The setup cost was minimal, and the pricing experience was pretty good.
The company covered it and they had no problem paying for it because they saw that it was cost-effective in terms of performance afterwards.
From there, I can work my way into a more granular level, applying all of that information on top of my actual data to understand what my data looks like, where it came from, and where it went wrong, managing it throughout the cycle.
The benefits of choosing IBM Cognos, in addition to saving on cost, include having institutional knowledge about maintaining this infrastructure and enough people who have developed on Cognos in the past, which creates comfort in its use.
We have been able to save approximately 80 percent of our time. We are not doing data analysis manually, so this relieves our data department of dealing with data.
Pentaho Data Integration and Analytics has positively impacted my organization because it meant we didn't have to write a lot of custom API back-end processing logic; it did the majority of that heavy lifting for us.
It automates the data workflow, including extraction, cleansing, and loading into warehouses for BI reporting purposes, while also removing duplicates, validating data, and standardizing formats, enabling real-time decision-making.
Pentaho Data Integration and Analytics has positively impacted my organization because it is easier to use, and my knowledge about this work facilitates the translation from the source to my final system.

| Product | Mindshare (%) |
|---|---|
| Pentaho Data Integration and Analytics | 1.6% |
| IBM Cloud Pak for Data | 1.3% |
| Other | 97.1% |


| Company Size | Count |
|---|---|
| Small Business | 8 |
| Large Enterprise | 15 |

| Company Size | Count |
|---|---|
| Small Business | 18 |
| Midsize Enterprise | 17 |
| Large Enterprise | 31 |

IBM Cloud Pak® for Data is a fully integrated data and AI platform that modernizes how businesses collect, organize, and analyze data to infuse AI throughout their organizations. Cloud-native by design, the platform unifies market-leading services spanning the entire analytics lifecycle. From data management, DataOps, and governance to business analytics and automated AI, IBM Cloud Pak for Data helps eliminate the need for costly, and often competing, point solutions while providing the information architecture you need to implement AI successfully.
Building on the streamlined hybrid-cloud foundation of Red Hat® OpenShift®, IBM Cloud Pak for Data takes advantage of the underlying resource and infrastructure optimization and management. The solution fully supports multicloud environments such as Amazon Web Services (AWS), Azure, Google Cloud, IBM Cloud™ and private cloud deployments. Find out how IBM Cloud Pak for Data can lower your total cost of ownership and accelerate innovation.
Pentaho Data Integration is a versatile platform designed to meet the data integration and analytics needs of organizations of any size. It is a go-to choice for businesses seeking to integrate data from diverse sources, including databases, files, and applications. Pentaho Data Integration handles the essential tasks of cleaning and transforming data so that it is ready for meaningful analysis. With a wide array of tools for data mining, machine learning, and statistical analysis, it helps organizations glean valuable insights from their data. What sets Pentaho Data Integration apart is its maturity and its vibrant community of users and developers, which make it a reliable and cost-effective option. Its features include a comprehensive ETL toolkit, data cleaning and transformation capabilities, robust data analysis tools, and deployment options for data integration and analytics solutions.