We performed a comparison between IBM InfoSphere DataStage and Pentaho Data Integration and Analytics based on real PeerSpot user reviews.
Find out in this report how the two Data Integration solutions compare in terms of features, pricing, service and support, ease of deployment, and ROI.
"We are mostly using transmission rules. It has a lot of functions and logic related to transmission. It is a user-friendly tool with in-built functions."
"The most valuable feature is the ability to transfer information via notes."
"The most valuable feature is the product's versatility to inject data."
"We can view what we want to do. We can transform data and put them on tables."
"I am impressed with the tool's ETL tracing."
"The most valuable feature is the data integration for data warehousing."
"It works with multiple servers and offers high availability."
"When we have needed help from the IBM team, they were helpful. Our company is a premium partner so we get fast responses."
"The fact that it enables us to leverage metadata to automate data pipeline templates and reuse them is definitely one of the features that we like the best. The metadata injection is helpful because it reduces the need to create and maintain additional ETLs. If we didn't have that feature, we would have lots of duplicated ETLs that we would have to create and maintain. The data pipeline templates have definitely been helpful when looking at productivity and costs."
"The way it has improved our product is by giving our users the ability to do ad hoc reports, which is very important to our users. We can do predictive analysis on trends coming in for contracts, which is what our product does. The product helps users decide which way to go based on the predictive analysis done by Pentaho. Pentaho is not doing predictions, but reporting on the predictions that our product is doing. This is a big part of our product."
"The fact that it's a low-code solution is valuable. It's good for more junior people who may not be as experienced with programming."
"It's my understanding that the product can scale."
"We also haven't had to create any custom Java code. Almost everywhere it's SQL, so it's done in the pipeline and the configuration. That means you can offload the work to people who, while they are not less experienced, are less technical when it comes to logic."
"The abstraction is quite good."
"Lumada has allowed us to interact with our employees more effectively and compensate them properly. One of the cool things is that we use it to generate commissions for our salespeople and bonuses for our warehouse people. It allows us to get information out to them in a timely fashion. We can also see where they're at and how they're doing."
"The graphical nature of the development interface is most useful because we've got people with quite mixed skills in the team. We've got some very junior, apprentice-level people, and we've got support analysts who don't have an IT background. It allows us to have quite complicated data flows and embed logic in them. Rather than having to troll through lines and lines of code and try and work out what it's doing, you get a visual representation, which makes it quite easy for people with mixed skills to support and maintain the product. That's one side of it."
"The interface needs work to be more user-friendly."
"Working with some of the big data components is good, but I can see improvements are needed."
"The initial setup can be complex."
"Currently lacking virtualization ability."
"The troubleshooting guide is very bad."
"It would be great if they can include some basic version of data quality checking features."
"The template mapping could be easier."
"In the future, I would like to see more integration with cloud technologies."
"I'm still in the very recent stage concerning Pentaho Data Integration, but it can't really handle what I describe as 'extreme data processing', i.e., when there is a huge amount of data to process. That is one area where Pentaho is still lacking."
"Since Hitachi took over, I don't feel that the documentation is as good within the solution. It used to have very good help built right in."
"There is not a data quality or MDM solution in the Pentaho DI suite."
"I could not connect to our Hadoop environment in an easy and flexible way, and it was important to scale our data warehouse."
"Although it is a low-code solution with a graphical interface, often the error messages that you get are of the type that a developer would be happy with. You get a big stack of red text and Java errors displayed on the screen, and less technical people can get intimidated by that. It can be a bit intimidating to get a wall of red error messages displayed. Other graphical tools that are focused at the power user level provide a much more user-friendly experience in dealing with your exceptions and guiding the user into where they've made the mistake."
"The web interface is rusty, and the biggest problem with Pentaho is debugging and troubleshooting. It isn't easy to build the pipeline incrementally. At least in our case, it's hard to find a way to execute step by step in the debugging mode."
"If you develop it on MacBook, it'll be quite a hassle."
"If you're working with a larger data set, I'm not so sure it would be the best solution. The larger things got the slower it was."
IBM InfoSphere DataStage is ranked 7th in Data Integration with 36 reviews, while Pentaho Data Integration and Analytics is ranked 15th in Data Integration with 48 reviews. IBM InfoSphere DataStage is rated 7.8, while Pentaho Data Integration and Analytics is rated 8.0. The top reviewer of IBM InfoSphere DataStage writes "User-friendly with a lot of functions for transmission rules, but has slow performance and not suitable for a huge volume of data". On the other hand, the top reviewer of Pentaho Data Integration and Analytics writes "It's flexible and can do almost anything I want it to do". IBM InfoSphere DataStage is most compared with IBM Cloud Pak for Data, SSIS, Azure Data Factory, Talend Open Studio, and Informatica PowerCenter, whereas Pentaho Data Integration and Analytics is most compared with Azure Data Factory, SSIS, Talend Open Studio, AWS Glue, and Collibra Catalog.
We monitor all Data Integration reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.