IBM InfoSphere DataStage vs Talend Data Management Platform comparison

Comparison Buyer's Guide
Executive Summary

We performed a comparison between IBM InfoSphere DataStage and Talend Data Management Platform based on real PeerSpot user reviews.

Find out in this report how the two Data Integration Tools solutions compare in terms of features, pricing, service and support, ease of deployment, and ROI.
To learn more, read our detailed IBM InfoSphere DataStage vs. Talend Data Management Platform Report (Updated: November 2022).
653,757 professionals have used our research since 2012.
Quotes From Members
We asked business professionals to review the solutions they use.
Here are some excerpts of what they said:
Pros
  • "StreamSets’ data drift resilience has reduced the time it takes us to fix data drift breakages. For example, in our previous Hadoop scenario, when we were creating the Sqoop-based processes to move data from source to destinations, we were getting the job done. That took approximately an hour to an hour and a half when we did it with Hadoop. However, with the StreamSets, since it works on a data collector-based mechanism, it completes the same process in 15 minutes of time. Therefore, it has saved us around 45 minutes per data pipeline or table that we migrate. Thus, it reduced the data transfer, including the drift part, by 45 minutes."
  • "StreamSets data drift feature gives us an alert upfront so we know that the data can be ingested. Whatever the schema or data type changes, it lands automatically into the data lake without any intervention from us, but then that information is crucial to fix for downstream pipelines, which process the data into models, like Tableau and Power BI models. This is actually very useful for us. We are already seeing benefits. Our pipelines used to break when there were data drift changes, then we needed to spend about a week fixing it. Right now, we are saving one to two weeks. Though, it depends on the complexity of the pipeline, we are definitely seeing a lot of time being saved."
  • "It is a very powerful, modern data analytics solution, in which you can integrate a large volume of data from different sources. It integrates all of the data and you can design, create, and monitor pipelines according to your requirements. It is an all-in-one day data ops solution."
  • "In StreamSets, everything is in one place."
  • "I have used Data Collector, Transformer, and Control Hub products from StreamSets. What I really like about these products is that they're very user-friendly. People who are not from a technological or core development background find it easy to get started and build data pipelines and connect to the databases. They would be comfortable like any technical person within a couple of weeks."

More StreamSets Pros →

  • "The best feature of IBM InfoSphere DataStage for me was that it was very much user-friendly. The solution didn't require that much raw coding because most of its features were drag and drop, plus it had a large number of functionalities."
  • "We are mostly using transmission rules. It has a lot of functions and logic related to transmission. It is a user-friendly tool with in-built functions."
  • "It is quite useful and powerful."
  • "As a data integration platform, it is easy to use. It is quite robust and useful for volumetric analysis when you have huge volumes of data. We have tested it for up to ten million rows, and it is robust enough to process ten million rows internally with its parallel processing. Its error logging mechanism is far simpler and easier to understand than other data integration tools. The newer version of InfoSphere has the data catalog and IDC lineage. They are helpful in the easy traceability of columns and tables."
  • "The performance optimization is quite good in DataStage. It provides parallelism and pipelining mechanisms"
  • "We like the flexibility of modeling."
  • "When we have needed help from the IBM team, they were helpful. Our company is a premium partner so we get fast responses."
  • "Offers great flexibility."

More IBM InfoSphere DataStage Pros →

  • "The most valuable feature is the data loading and scripting language"
  • "I like everything about this product, but the biggest thing is the ease of use."

More Talend Data Management Platform Pros →

Cons
  • "If you use JDBC Lookup, for example, it generally takes a long time to process data."
  • "The logging mechanism could be improved. If I am working on a pipeline, then create a job out of it and it is running, it will generate constant logs. So, the logging mechanism could be simplified. Now, it is a bit difficult to understand and filter the logs. It takes some time."
  • "Currently, we can only use the query to read data from SAP HANA. What we would like to see, as soon as possible, is the ability to read from multiple tables from SAP HANA. That would be a really good thing that we could use immediately. For example, if you have 100 tables in SQL Server or Oracle, then you could just point it to the schema or the 100 tables and ingestion information. However, you can't do that in SAP HANA since StreamSets currently is lacking in this. They do not have a multi-table feature for SAP HANA. Therefore, a multi-table origin for SAP HANA would be helpful."
  • "We create pipelines or jobs in StreamSets Control Hub. It is a great feature, but if there is a way to have a folder structure or organize the pipelines and jobs in Control Hub, it would be great. I submitted a ticket for this some time back."
  • "Sometimes, when we have large amounts of data that is very efficiently stored in Hadoop or Kafka, it is not very efficient to run it through StreamSets, due to the lack of efficiency or the resources that StreamSets is using."

More StreamSets Cons →

  • "The error messaging needs to be improved."
  • "In the future, I would like to see more integration with cloud technologies."
  • "The initial setup could be more straightforward."
  • "What needs improvement in IBM InfoSphere DataStage is its pricing. The pricing for the solution is higher than its competitors, so a lot of the clients my company has worked with prefer other tools over IBM InfoSphere DataStage because of the high price tag. Another area for improvement in the solution stems from a lot of new types of databases, for example, databases in the cloud and big data have become available, and IBM InfoSphere DataStage is working on various connectors for different data sources, but that still isn't up-to-date, meaning that some connectors are missing for modern data sources. The latest version of IBM InfoSphere DataStage also has a complex architecture, so my team faced frequent outages and that should be improved as well."
  • "It would be useful to provide support for Python, AR, and Java."
  • "Its documentation is not up to the mark. While building APIs, we had a lot of problems trying to get around it because it is not very user-friendly. We tried to get hold of API documentation, but the documentation is not very well thought out. It should be more structured and elaborate. In terms of additional features, I would like to see good reporting on performance and performance-tuning recommendations that can be based on AI. I would also like to see better data profiling information being reported on InfoSphere."
  • "It doesn't have any big data connections. It would be good to have them because most of the systems are moving towards big data. There should also be a user-friendly way to interact with the cloud. Its loading process is very slow. It takes a lot of time for around 5 or 6 million records, and we are not able to provide real-time data to the vendors due to this delay. Its performance needs to be improved. It is also like a legacy system. It is not updated much. In higher versions, they only do small changes. We would like to have new features and new technologies."
  • "Currently lacking virtualization ability."

More IBM InfoSphere DataStage Cons →

  • "I've had some issues with bugs causing crashes, especially when making changes to the system or with the monthly upgrades to Studio they've introduced."
  • "I think they should drive toward AI and machine learning. They could include a machine-learning algorithm for the deduplication."

More Talend Data Management Platform Cons →

Pricing and Cost Advice
  • "StreamSets Data Collector is open source. One can utilize the StreamSets Data Collector, but the Control Hub is the main repository where all the jobs are present. Everything happens in Control Hub."
  • "It has a CPU core-based licensing, which works for us and is quite good."
  • "There are different versions of the product. One is the corporate license version, and the other one is the open-source or free version. I have been using the corporate license version, but they have recently launched a new open-source version so that anybody can create an account and use it. The licensing cost varies from customer to customer. I don't have a lot of input on that. It is taken care of by PMO, and they seem fine with its pricing model. It is being used enterprise-wide. They seem to have got a good deal for StreamSets."
  • "The pricing is good, but not the best. They have some customized plans you can opt for."
  • More StreamSets Pricing and Cost Advice →

  • "It's very expensive."
  • "Our internal team takes care of group licensing and cost. We don't have individual licenses. We have group licensing at the company level. Usually, IBM doesn't charge anything separately on the licensing side. For storage and everything else, we are paying around $6,000 per month, which is not very high. It includes Linux data storage, execution, and licensing. They're charging $40 for one-hour execution. Based on that, we are spending around $2,000 on the production environment and $1,000 on the lower environment for testing and development-side executions. For the mainframe, we are using the Db2 mainframe database, and we are spending around $1,000 on the Db2 mainframe database as well. All this comes out to be around $6,000. We, however, would like to have some cost reduction."
  • "The price is expensive but there are no licensing fees."
  • "It is quite expensive."
  • "It's quite expensive."
  • "I have no information on the exact pricing for IBM InfoSphere DataStage because the solution is usually procured by the clients my company works with, though the pricing is higher compared to other solutions, so many clients choose to go with a different solution rather than IBM InfoSphere DataStage."
  • More IBM InfoSphere DataStage Pricing and Cost Advice →

  • "The price is on a per-user basis. It's a little more expensive than other tools. There aren't any additional costs beyond the standard licensing fee."
  • More Talend Data Management Platform Pricing and Cost Advice →

    Questions from the Community
    Top Answer: It is really easy to set up and the interface is easy to use.
    Top Answer: We've seen a couple of cases where it appears to have a memory leak or a similar problem. It grows for a bit and then…
    Top Answer: We typically use it to transport our Oracle raw datasets up to Microsoft Azure, and then into SQL databases there.
    Top Answer: When we have needed help from the IBM team, they were helpful. Our company is a premium partner so we get fast…
    Top Answer: I have no information on the exact pricing for IBM InfoSphere DataStage because the solution is usually procured by the…
    Top Answer: Their web interface is good but the on-prem sites are outdated. The solution could also be improved if they could…
    Top Answer: What is the OLAP that you are using? Hosted in Cloud or on-premise? The target DB should have its tool to extract…
    Also Known As
    Talend Integration Suite
    Overview

    StreamSets offers an end-to-end data integration platform to build, run, monitor and manage smart data pipelines that deliver continuous data for DataOps, and power the modern data ecosystem and hybrid integration.

    Only StreamSets provides a single design experience for all design patterns for 10x greater developer productivity; smart data pipelines that are resilient to change for 80% fewer breakages; and a single pane of glass for managing and monitoring all pipelines across hybrid and cloud architectures to eliminate blind spots and control gaps.

    With StreamSets, you can deliver the continuous data that drives the connected enterprise.

    IBM InfoSphere DataStage is a data integration tool for designing, developing, and running jobs that move and transform data for organizations of different sizes. The product works by integrating data across multiple systems through a high-performance parallel framework. It supports extended metadata management, enterprise connectivity, and integration of all types of data.

    The solution is the data integration component of IBM InfoSphere Information Server, providing a graphical framework for moving data from source systems to target systems. IBM InfoSphere DataStage can deliver data to data warehouses, data marts, operational data sources, and other enterprise applications. The tool works with various types of patterns - extract, transform and load (ETL), and extract, load, and transform (ELT). The scalability of the platform is achieved by using parallel processing and enterprise connectivity.
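
The difference between the two patterns named above can be shown with a minimal Python sketch. It is not DataStage code: the in-memory SQLite database stands in for the target system, and the table names, sample rows, and function names are all invented for illustration. In ETL the aggregation happens before loading; in ELT the raw rows are loaded first and the target database does the aggregation.

```python
import sqlite3

# Hypothetical source rows (customer, amount as text), invented for illustration.
SOURCE_ROWS = [("alice", "10.50"), ("bob", "3.25"), ("alice", "4.50")]


def run_etl(conn):
    """ETL: transform in the integration layer, then load the finished result."""
    totals = {}
    for customer, amount in SOURCE_ROWS:
        totals[customer] = totals.get(customer, 0.0) + float(amount)
    conn.execute("CREATE TABLE etl_totals (customer TEXT, total REAL)")
    conn.executemany("INSERT INTO etl_totals VALUES (?, ?)", sorted(totals.items()))


def run_elt(conn):
    """ELT: load the raw rows first, then transform inside the target database."""
    conn.execute("CREATE TABLE raw_orders (customer TEXT, amount TEXT)")
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?)", SOURCE_ROWS)
    conn.execute(
        "CREATE TABLE elt_totals AS "
        "SELECT customer, SUM(CAST(amount AS REAL)) AS total "
        "FROM raw_orders GROUP BY customer"
    )


def compare_patterns():
    """Run both patterns against one in-memory target and return both results."""
    conn = sqlite3.connect(":memory:")
    run_etl(conn)
    run_elt(conn)
    query = "SELECT customer, total FROM {} ORDER BY customer"
    etl = conn.execute(query.format("etl_totals")).fetchall()
    elt = conn.execute(query.format("elt_totals")).fetchall()
    conn.close()
    return etl, elt
```

Both functions produce the same totals; what changes is where the transformation work runs, which is the usual basis for choosing between the two patterns.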

    The solution has various versions, catering to different types of companies, which include the Server Edition, the Enterprise Edition, and the MVS Edition. Depending on which version a company has bought, different goals can be achieved. They include the following:

    • Designing data flows to extract information from multiple sources, transform the data, and deliver it to target databases or applications.

    • Delivery of relevant and accurate data through direct connections to enterprise applications.

    • Reduction of development time and improvement of consistency through prebuilt functions.

    • Utilization of InfoSphere Information Server tools for accelerating the project delivery cycle.

    IBM InfoSphere DataStage can be deployed in various ways, including:

    • As a service: The tool can be accessed through a subscription model, where its capabilities are a part of IBM DataStage on IBM Cloud Pak for Data as a Service. This option offers full management on IBM Cloud.

    • On premises or in any cloud: The two editions - IBM DataStage Enterprise and IBM DataStage Enterprise Plus - can run workloads on premises or in any cloud when added to IBM DataStage on IBM Cloud Pak for Data as a Service.

    • On premises: The basic jobs of the tool can be run on premises using IBM DataStage.

    IBM InfoSphere DataStage Features

    The tool has various features through which users can integrate and utilize their data effectively. The components of IBM InfoSphere DataStage include:

    • AI services: The tool offers services such as data science, event messaging, data warehousing, and data virtualization. It accelerates processes through artificial intelligence (AI) and offers a connection with IBM Cloud Paks - the cloud-native insight platform of the solution.

    • Parallel engine: Through this feature, ETL performance can be optimized to process data at scale. This is achieved through a parallel engine and load balancing, which together maximize throughput.

    • Metadata support: This feature of the product uses the IBM Watson Knowledge Catalog to protect companies' sensitive data and monitor who can access it and at what levels.

    • Automated delivery pipelines: IBM InfoSphere DataStage reduces costs by automating continuous integration and delivery of pipelines.

    • Prebuilt connectors: The feature for prebuilt connectivity and stages allows users to move data between multiple cloud sources and data warehouses, including IBM native products.

    • IBM DataStage Flow Designer: This feature offers assistance through machine learning design. The product offers its clients a user-friendly interface which facilitates the work process.

    • IBM InfoSphere QualityStage: The tool provides a feature that automatically resolves data quality issues and increases the reliability of the delivered data.

    • Automated failure detection: Through this feature, companies can reduce infrastructure management efforts, relying on the automated detection that the tool offers.

    • Distributed data processing: Cloud runtimes can be executed remotely through this feature while maintaining data sovereignty and decreasing costs.
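
As a rough illustration of the partitioned processing that the parallel engine bullet above refers to, the following Python sketch splits rows into partitions, runs the same transform on each partition in a worker, and merges the results. It is a simplified assumption about how partition parallelism works in general, not IBM's implementation; the function names and the doubling transform are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(rows, num_partitions):
    """Split (key, value) rows into partitions by key modulo the partition count."""
    parts = [[] for _ in range(num_partitions)]
    for key, value in rows:
        parts[key % num_partitions].append((key, value))
    return parts


def transform(part):
    """Per-partition transform stage: here it simply doubles each value."""
    return [(key, value * 2) for key, value in part]


def run_partitioned(rows, num_partitions=4):
    """Apply the transform to every partition concurrently, then merge and sort."""
    parts = partition(rows, num_partitions)
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        results = list(pool.map(transform, parts))
    merged = [row for part in results for row in part]
    return sorted(merged)
```

Threads are used here only to keep the sketch short and self-contained. A real engine would distribute partitions across processes or nodes, and the partitioning key would be chosen to keep the partitions evenly sized.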

    IBM InfoSphere DataStage Benefits

    This solution offers many benefits for the companies that utilize it for data integration. Some of these benefits include:

    • Increased speed of workload execution due to better balancing and a parallel engine.

    • Reduction of data movement costs through integrations and seamless design of jobs.

    • Modernization of data integration by extending the capabilities of companies' data.

    • Delivery of reliable data through IBM Cloud Pak for Data.

    • Utilization of a drag-and-drop interface which assists in the delivery of data without the need for code.

    • Effective data manipulation allows data to be merged before being mapped and transformed.

    • Easier access for users to their data through visual maps of the process and the delivered data.

    Reviews from Real Users

    A data/solution architect at a computer software company says the product is robust, easy to use, has a simple error logging mechanism, and works very well for huge volumes of data.

    Tirthankar Roy Chowdhury, team leader at Tata Consultancy Services, feels the tool is user-friendly with a lot of functionalities, and doesn't require much coding because of its drag-and-drop features.

    Talend offers robust data integration in an open and scalable architecture to maximize its value to your business. Simple, graphical tools and wizards get you up and running quickly with native code generation that streamlines development.
    Sample Customers
    Availity, BT Group, Humana, Deluxe, GSK, RingCentral, IBM, Shell, SamTrans, State of Ohio, TalentFulfilled, TechBridge
    Dubai Statistics Center, Etisalat Egypt
    Accent Group, Acticall, Allianz Global Investors Distributors LLC, Car & Boat Media, Cdiscount.com
    Top Industries
    VISITORS READING REVIEWS
    Financial Services Firm: 17%
    Computer Software Company: 14%
    Manufacturing Company: 7%
    Insurance Company: 7%
    REVIEWERS
    Computer Software Company: 70%
    Aerospace/Defense Firm: 10%
    Healthcare Company: 10%
    Financial Services Firm: 10%
    VISITORS READING REVIEWS
    Financial Services Firm: 20%
    Computer Software Company: 17%
    Comms Service Provider: 9%
    Manufacturing Company: 7%
    VISITORS READING REVIEWS
    Computer Software Company: 21%
    Energy/Utilities Company: 10%
    Financial Services Firm: 9%
    Comms Service Provider: 9%
    Company Size
    REVIEWERS
    Small Business: 14%
    Midsize Enterprise: 29%
    Large Enterprise: 57%
    VISITORS READING REVIEWS
    Small Business: 15%
    Midsize Enterprise: 11%
    Large Enterprise: 74%
    REVIEWERS
    Small Business: 47%
    Midsize Enterprise: 3%
    Large Enterprise: 50%
    VISITORS READING REVIEWS
    Small Business: 14%
    Midsize Enterprise: 10%
    Large Enterprise: 75%
    REVIEWERS
    Small Business: 38%
    Midsize Enterprise: 23%
    Large Enterprise: 38%
    VISITORS READING REVIEWS
    Small Business: 20%
    Midsize Enterprise: 12%
    Large Enterprise: 68%

    IBM InfoSphere DataStage is ranked 9th in Data Integration Tools with 10 reviews while Talend Data Management Platform is ranked 18th in Data Integration Tools with 2 reviews. IBM InfoSphere DataStage is rated 7.8, while Talend Data Management Platform is rated 9.0. The top reviewer of IBM InfoSphere DataStage writes "Robust, easy to use, has a simple error logging mechanism, and works very well for huge volumes of data". On the other hand, the top reviewer of Talend Data Management Platform writes "Allows us to capture large volumes of data from multiple sources and load it to the data lake". IBM InfoSphere DataStage is most compared with SSIS, Talend Open Studio, AWS Glue, Azure Data Factory and Matillion ETL, whereas Talend Data Management Platform is most compared with Talend Data Fabric, SAP Data Services, Talend Open Studio, Oracle Data Integrator (ODI) and Qlik Replicate. See our IBM InfoSphere DataStage vs. Talend Data Management Platform report.


    We monitor all Data Integration Tools reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.