IBM InfoSphere DataStage Overview

IBM InfoSphere DataStage is the #9 ranked solution among top Data Integration Tools. PeerSpot users give IBM InfoSphere DataStage an average rating of 7.8 out of 10. IBM InfoSphere DataStage is most commonly compared to SSIS. It is popular among the large enterprise segment, which accounts for 75% of users researching this solution on PeerSpot. The top industry researching this solution is financial services, accounting for 20% of all views.
IBM InfoSphere DataStage Buyer's Guide

Download the IBM InfoSphere DataStage Buyer's Guide including reviews and more. Updated: November 2022

What is IBM InfoSphere DataStage?

IBM InfoSphere DataStage is a high-quality data integration tool that aims to design, develop, and run jobs that move and transform data for organizations of different sizes. The product works by integrating data across multiple systems through a high-performance parallel framework. It supports extended metadata management, enterprise connectivity, and integration of all types of data.

The solution is the data integration component of IBM InfoSphere Information Server, providing a graphical framework for moving data from source systems to target systems. IBM InfoSphere DataStage can deliver data to data warehouses, data marts, operational data sources, and other enterprise applications. The tool works with various types of patterns - extract, transform and load (ETL), and extract, load, and transform (ELT). The scalability of the platform is achieved by using parallel processing and enterprise connectivity.
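The two patterns differ mainly in where the transformation runs. The following is a minimal Python sketch of that difference, not DataStage's own API: it assumes an in-memory SQLite database as a stand-in target, and the table names, column names, and sample rows are hypothetical.

```python
import sqlite3

# Hypothetical source rows: (customer name, amount as text).
SOURCE_ROWS = [(" alice ", "10.50"), ("BOB", "3"), ("carol", "7.25")]


def run_etl(rows):
    """ETL: transform in the integration layer, then load the clean rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (customer TEXT, amount REAL)")
    cleaned = [(name.strip().lower(), float(amount)) for name, amount in rows]
    conn.executemany("INSERT INTO sales VALUES (?, ?)", cleaned)
    return conn.execute(
        "SELECT customer, amount FROM sales ORDER BY customer"
    ).fetchall()


def run_elt(rows):
    """ELT: load the raw rows first, then transform inside the target with SQL."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE raw_sales (customer TEXT, amount TEXT)")
    conn.executemany("INSERT INTO raw_sales VALUES (?, ?)", rows)
    conn.execute(
        "CREATE TABLE sales AS "
        "SELECT lower(trim(customer)) AS customer, "
        "CAST(amount AS REAL) AS amount FROM raw_sales"
    )
    return conn.execute(
        "SELECT customer, amount FROM sales ORDER BY customer"
    ).fetchall()
```

Both functions end with the same cleaned table. The choice between them usually comes down to whether the integration engine or the target database is better placed to do the heavy lifting.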

The solution has various versions, catering to different types of companies, which include the Server Edition, the Enterprise Edition, and the MVS Edition. Depending on which version a company has bought, different goals can be achieved. They include the following:

  • Designing data flows to extract information from multiple sources, transform the data, and deliver it to target databases or applications.

  • Delivery of relevant and accurate data through direct connections to enterprise applications.

  • Reduction of development time and improvement of consistency through prebuilt functions.

  • Utilization of InfoSphere Information Server tools for accelerating the project delivery cycle.

IBM InfoSphere DataStage can be deployed in various ways, including:

  • As a service: The tool can be accessed through a subscription model, where its capabilities are a part of IBM DataStage on IBM Cloud Pak for Data as a Service. This option offers full management on IBM Cloud.

  • On premises or in any cloud: The two editions - IBM DataStage Enterprise and IBM DataStage Enterprise Plus - can run workloads on premises or in any cloud when added to IBM DataStage on IBM Cloud Pak for Data as a Service.

  • On premises: The basic jobs of the tool can be run on premises using IBM DataStage.

IBM InfoSphere DataStage Features

The tool has various features through which users can integrate and utilize their data effectively. The components of IBM InfoSphere DataStage include:

  • AI services: The tool offers services such as data science, event messaging, data warehousing, and data virtualization. It accelerates processes through artificial intelligence (AI) and offers a connection with IBM Cloud Paks - the cloud-native insight platform of the solution.

  • Parallel engine: Through this feature, ETL performance can be optimized to process data at scale. This is achieved through the parallel engine and load balancing, which together maximize throughput.

  • Metadata support: This feature of the product uses the IBM Watson Knowledge Catalog to protect companies' sensitive data and monitor who can access it and at what levels.

  • Automated delivery pipelines: IBM InfoSphere DataStage reduces costs by automating continuous integration and delivery of pipelines.

  • Prebuilt connectors: The feature for prebuilt connectivity and stages allows users to move data between multiple cloud sources and data warehouses, including IBM native products.

  • IBM DataStage Flow Designer: This feature offers assistance through machine learning design. The product offers its clients a user-friendly interface which facilitates the work process.

  • IBM InfoSphere QualityStage: The tool provides a feature that automatically resolves data quality issues and increases the reliability of the delivered data.

  • Automated failure detection: Through this feature, companies can reduce infrastructure management efforts, relying on the automated detection that the tool offers.

  • Distributed data processing: Cloud runtimes can be executed remotely through this feature while maintaining data sovereignty and decreasing costs.

IBM InfoSphere DataStage Benefits

This solution offers many benefits for the companies that utilize it for data integration. Some of these benefits include:

  • Increased speed of workload execution due to better balancing and a parallel engine.

  • Reduction of data movement costs through integrations and seamless design of jobs.

  • Modernization of data integration by extending the capabilities of companies' data.

  • Delivery of reliable data through IBM Cloud Pak for Data.

  • Utilization of a drag-and-drop interface which assists in the delivery of data without the need for code.

  • Effective data manipulation that allows data to be merged before being mapped and transformed.

  • Easier user access to data through visual maps of the process and the delivered data.

Reviews from Real Users

A data/solution architect at a computer software company says the product is robust, easy to use, has a simple error logging mechanism, and works very well for huge volumes of data.

Tirthankar Roy Chowdhury, team leader at Tata Consultancy Services, feels the tool is user-friendly with a lot of functionalities, and doesn't require much coding because of its drag-and-drop features.

IBM InfoSphere DataStage Customers

Dubai Statistics Center, Etisalat Egypt

IBM InfoSphere DataStage Pricing Advice

What users are saying about IBM InfoSphere DataStage pricing:
  • "It is quite expensive."
  • "I have no information on the exact pricing for IBM InfoSphere DataStage because the solution is usually procured by the clients my company works with, though the pricing is higher compared to other solutions, so many clients choose to go with a different solution rather than IBM InfoSphere DataStage."
  • "Our internal team takes care of group licensing and cost. We don't have individual licenses. We have group licensing at the company level. Usually, IBM doesn't charge anything separately on the licensing side. For storage and everything else, we are paying around $6,000 per month, which is not very high. It includes Linux data storage, execution, and licensing. They're charging $40 for one-hour execution. Based on that, we are spending around $2,000 on the production environment and $1,000 on the lower environment for testing and development-side executions. For the mainframe, we are using the Db2 mainframe database, and we are spending around $1,000 on the Db2 mainframe database as well. All this comes out to be around $6,000. We, however, would like to have some cost reduction."
  • "It's quite expensive."
  • "The price is expensive but there are no licensing fees."
IBM InfoSphere DataStage Reviews

    Data/Solution Architect at a computer software company with 51-200 employees
    Real User
    Robust, easy to use, has a simple error logging mechanism, and works very well for huge volumes of data
    Pros and Cons
    • "As a data integration platform, it is easy to use. It is quite robust and useful for volumetric analysis when you have huge volumes of data. We have tested it for up to ten million rows, and it is robust enough to process ten million rows internally with its parallel processing. Its error logging mechanism is far simpler and easier to understand than other data integration tools. The newer version of InfoSphere has the data catalog and IDC lineage. They are helpful in the easy traceability of columns and tables."
    • "Its documentation is not up to the mark. While building APIs, we had a lot of problems trying to get around it because it is not very user-friendly. We tried to get hold of API documentation, but the documentation is not very well thought out. It should be more structured and elaborate. In terms of additional features, I would like to see good reporting on performance and performance-tuning recommendations that can be based on AI. I would also like to see better data profiling information being reported on InfoSphere."

    What is our primary use case?

    We use it for creating a pattern for data integration with our data vault. We have also used it for creating APIs.

    What is most valuable?

    As a data integration platform, it is easy to use. It is quite robust and useful for volumetric analysis when you have huge volumes of data. We have tested it for up to ten million rows, and it is robust enough to process ten million rows internally with its parallel processing. 

    Its error logging mechanism is far simpler and easier to understand than other data integration tools.

    The newer version of InfoSphere has the data catalog and IDC lineage. They are helpful in the easy traceability of columns and tables.

    What needs improvement?

    Its documentation is not up to the mark. While building APIs, we had a lot of problems trying to get around it because it is not very user-friendly. We tried to get hold of API documentation, but the documentation is not very well thought out. It should be more structured and elaborate.

    In terms of additional features, I would like to see good reporting on performance and performance-tuning recommendations that can be based on AI. I would also like to see better data profiling information being reported on InfoSphere.

    For how long have I used the solution?

    It was DataStage previously, and then it became InfoSphere. I have used DataStage for ten years and InfoSphere for one year.

    Buyer's Guide
    IBM InfoSphere DataStage
    November 2022
    Learn what your peers think about IBM InfoSphere DataStage. Get advice and tips from experienced pros sharing their opinions. Updated: November 2022.
    656,474 professionals have used our research since 2012.

    What do I think about the stability of the solution?

    It is quite stable. In the newer components of InfoSphere, you have a mapping tool called FastTrack and a metadata generator, which can have issues from time to time, but they get resolved.

    What do I think about the scalability of the solution?

    It is not that easy to scale on-premises. I have worked on the ones deployed on Windows or Unix, and scalability is often dependent on whether you can add more CPUs or boxes. On the cloud, it would have been easier to scale. However, the current version can only be deployed on Windows or Unix.

    How are customer service and support?

    I have not been in touch with them recently. Earlier, I was in touch with their technical support and raised tickets because some strange errors, such as phantom errors, were being logged in the error log and made no sense. We used to get in touch with their support team to understand these.

    Which solution did I use previously and why did I switch?

    I have used Informatica and SAS CA. IBM InfoSphere has the highest cost of licensing as compared to others. It is not very widely used, and it is very difficult to find people who have this sort of knowledge. 

    The newer version of Informatica is on the cloud and is much more user-friendly than InfoSphere because it provides profiling information in nice graphs and charts. It also provides a lot of templates. For example, if I want to build a whole dimensional kind of structure, Informatica has a template. I just need to use that template. So, the ease of use is far better in Informatica, and it has everything that InfoSphere has. The only thing is that Informatica comes in bundles. That's the reason sometimes organizations don't go for it. For example, the data integration is a separate section, and the data quality is a separate section. They have separate pricing.

    How was the initial setup?

    The initial setup is quite simple. It didn't take more than half an hour to set it up on my laptop.

    What about the implementation team?

    I implemented it myself. In terms of maintenance, a particular version might not require any maintenance. There could be bug fixes and minor versions going in for some versions.

    What's my experience with pricing, setup cost, and licensing?

    It is quite expensive.

    What other advice do I have?

    I would recommend this solution for large-scale implementation where you need a complex transformation and data integration to happen according to a structured format, either a data vault or a dimension model. It is suitable for big companies because of the cost. It is a very valuable platform for data in large volumes. For small volumes, you have other open-source tools that can do the same thing for you.

    I am part of a consultancy, and I have deployed this product for companies. We have five to eight developers. Because InfoSphere is a licensed product, and its licenses cost a lot, there are not many InfoSphere developers.

    I would rate IBM InfoSphere DataStage an eight out of ten.

    Which deployment model are you using for this solution?

    On-premises
    Disclosure: My company has a business relationship with this vendor other than being a customer: Partner
    Tirthankar Roy Chowdhury - PeerSpot reviewer
    Team Lead at Tata Consultancy Services
    Real User
    Top 5 Leaderboard
    User-friendly with a lot of functionalities, and doesn't require much coding because of its drag-and-drop features
    Pros and Cons
    • "The best feature of IBM InfoSphere DataStage for me was that it was very much user-friendly. The solution didn't require that much raw coding because most of its features were drag and drop, plus it had a large number of functionalities."
    • "What needs improvement in IBM InfoSphere DataStage is its pricing. The pricing for the solution is higher than its competitors, so a lot of the clients my company has worked with prefer other tools over IBM InfoSphere DataStage because of the high price tag. Another area for improvement in the solution stems from a lot of new types of databases, for example, databases in the cloud and big data have become available, and IBM InfoSphere DataStage is working on various connectors for different data sources, but that still isn't up-to-date, meaning that some connectors are missing for modern data sources. The latest version of IBM InfoSphere DataStage also has a complex architecture, so my team faced frequent outages and that should be improved as well."

    What is our primary use case?

    IBM InfoSphere DataStage was mostly used for ETL (extract, transform, and load) and data integration purposes, including some data quality use cases. My team used the solution to extract data from various sources, do some business transformations, and then load the data into a target database or generate files.

    What is most valuable?

    The best feature of IBM InfoSphere DataStage for me was that it was very much user-friendly. The solution didn't require that much raw coding because most of its features were drag and drop, plus it had a large number of functionalities.

    What needs improvement?

    What needs improvement in IBM InfoSphere DataStage is its pricing. The pricing for the solution is higher than its competitors, so a lot of the clients my company has worked with prefer other tools over IBM InfoSphere DataStage because of the high price tag.

    Another area for improvement in the solution stems from a lot of new types of databases, for example, databases in the cloud and big data have become available, and IBM InfoSphere DataStage is working on various connectors for different data sources, but that still isn't up-to-date, meaning that some connectors are missing for modern data sources.

    The latest version of IBM InfoSphere DataStage also has a complex architecture, so my team faced frequent outages and that should be improved as well.

    For how long have I used the solution?

    I've been working with IBM InfoSphere DataStage for more than seven years.

    What do I think about the stability of the solution?

    IBM InfoSphere DataStage is a stable product and it's been in the market for quite some time, but in its latest version, there's been some instability caused by the new features introduced in the solution. The architecture was changed a lot and that was causing issues and frequent outages that my company had to go back to IBM for troubleshooting. My team didn't face issues in the earlier version of IBM InfoSphere DataStage. It was the latest version that had instability issues.

    What do I think about the scalability of the solution?

    IBM InfoSphere DataStage is a very scalable product.

    How are customer service and support?

    IBM InfoSphere DataStage has pretty good technical support. With the new version, though, particularly the new architecture and the microservice concept, it sometimes takes a while even for the IBM team to figure out what's wrong. Once that's been figured out, the team comes up with a solution or a patch.

    How was the initial setup?

    Setting up IBM InfoSphere DataStage was easy.

    How long the deployment takes would depend on certain factors, but it usually takes just two to three hours.

    What's my experience with pricing, setup cost, and licensing?

    I have no information on the exact pricing for IBM InfoSphere DataStage because the solution is usually procured by the clients my company works with, though the pricing is higher compared to other solutions, so many clients choose to go with a different solution rather than IBM InfoSphere DataStage.

    What other advice do I have?

    The last version of IBM InfoSphere DataStage which I've worked with was version 11.7.

    I work for an IT service company that works with multiple clients on multiple projects, so close to two hundred people use IBM InfoSphere DataStage for various clients.

    Per project, on average, three people take care of IBM InfoSphere DataStage deployment, maintenance, and support-related activities.

    My advice to people looking into implementing IBM InfoSphere DataStage is that it's a very good product. A lot of similar products have come up nowadays, but this product has a pretty good reputation as it's been in the market for quite a while. I do think other products such as Talend, Informatica PowerCenter, and Informatica Data Quality are better than IBM InfoSphere DataStage.

    My rating for IBM InfoSphere DataStage is eight out of ten.

    My company has a partnership with IBM.

    Which deployment model are you using for this solution?

    On-premises
    Disclosure: My company has a business relationship with this vendor other than being a customer: Partner
    DataStage at a healthcare company with 10,001+ employees
    Real User
    User-friendly with a lot of functions for transmission rules, but has slow performance and not suitable for a huge volume of data
    Pros and Cons
    • "We are mostly using transmission rules. It has a lot of functions and logic related to transmission. It is a user-friendly tool with in-built functions."
    • "It doesn't have any big data connections. It would be good to have them because most of the systems are moving towards big data. There should also be a user-friendly way to interact with the cloud. Its loading process is very slow. It takes a lot of time for around 5 or 6 million records, and we are not able to provide real-time data to the vendors due to this delay. Its performance needs to be improved. It is also like a legacy system. It is not updated much. In higher versions, they only do small changes. We would like to have new features and new technologies."

    What is our primary use case?

    We are supporting a healthcare domain vendor located in the US. We get data from various domains, such as health insurance. We have member data, provider data, and consumer data. We also have client-related stuff and broker-related commission data. 

    We get the data from these domains, and after receiving it, we apply the transformation rules, such as joins. We also standardize the data by formatting and validating fields, such as formatting the date field and doing date and time validations. We also do other normal transformations with some business logic. After applying all this, we send the data to the business.
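The kinds of rules described above (a join, date standardization, and field validation) can be sketched in a few lines of plain Python. This is only an illustration of the logic, not how a DataStage job is built; the feeds, field names, and accepted date formats are all hypothetical.

```python
from datetime import datetime

# Hypothetical member and provider feeds; field names are illustrative only.
MEMBERS = [
    {"member_id": "M1", "provider_id": "P1", "enrolled": "03/15/2021"},
    {"member_id": "M2", "provider_id": "P9", "enrolled": "2021-13-01"},
]
PROVIDERS = {"P1": "Northside Clinic"}


def standardize_date(value):
    """Return the date as YYYY-MM-DD, or None if it matches no known format."""
    for fmt in ("%m/%d/%Y", "%Y-%m-%d"):
        try:
            return datetime.strptime(value, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return None


def transform(members, providers):
    """Join members to providers, standardize dates, and split out rejects."""
    accepted, rejected = [], []
    for row in members:
        provider = providers.get(row["provider_id"])  # lookup-style join
        enrolled = standardize_date(row["enrolled"])  # format and validate
        if provider is None or enrolled is None:
            rejected.append(row)
        else:
            accepted.append(
                {
                    "member_id": row["member_id"],
                    "provider": provider,
                    "enrolled": enrolled,
                }
            )
    return accepted, rejected
```

In this sample, member M1 is accepted with its date rewritten to 2021-03-15, while M2 is rejected because its provider is unknown and its date has an invalid month.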

    What is most valuable?

    We are mostly using transmission rules. It has a lot of functions and logic related to transmission. It is a user-friendly tool with in-built functions.

    What needs improvement?

    It doesn't have any big data connections. It would be good to have them because most of the systems are moving towards big data. There should also be a user-friendly way to interact with the cloud. 

    Its loading process is very slow. It takes a lot of time for around 5 or 6 million records, and we are not able to provide real-time data to the vendors due to this delay. Its performance needs to be improved.

    It is also like a legacy system. It is not updated much. In higher versions, they only do small changes. We would like to have new features and new technologies.

    For how long have I used the solution?

    I have been using this solution for around 15 years.

    What do I think about the scalability of the solution?

    It is easy to scale. In my project, six or seven people are using this solution, but in my company, we have around 15 to 16 projects.

    How are customer service and technical support?

    We have an internal admin team for support. If they are not able to solve an issue, they raise a ticket with the IBM team. In the last ten years, we had to contact IBM only two to three times. Our internal team is able to handle most of the issues.

    How was the initial setup?

    Its initial setup has moderate complexity. It required some coordination with the vendor because their system also needs to be ready. We also get maintenance support from them.

    What's my experience with pricing, setup cost, and licensing?

    Our internal team takes care of group licensing and cost. We don't have individual licenses. We have group licensing at the company level. Usually, IBM doesn't charge anything separately on the licensing side.

    For storage and everything else, we are paying around $6,000 per month, which is not very high. It includes Linux data storage, execution, and licensing. They're charging $40 for one-hour execution. Based on that, we are spending around $2,000 on the production environment and $1,000 on the lower environment for testing and development-side executions. For the mainframe, we are using the Db2 mainframe database, and we are spending around $1,000 on the Db2 mainframe database as well. All this comes out to be around $6,000. We, however, would like to have some cost reduction.
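The figures quoted above can be checked with simple arithmetic. The sketch below assumes the production and lower-environment amounts are pure hourly execution charges at the quoted $40 rate, which the review implies but does not state outright; the variable names are illustrative.

```python
HOURLY_RATE = 40        # dollars per execution hour, as quoted
MONTHLY_TOTAL = 6_000   # approximate monthly spend, as quoted

# Amounts the reviewer itemizes, in dollars per month.
ITEMIZED = {"production": 2_000, "lower": 1_000, "db2_mainframe": 1_000}


def execution_hours(spend, rate=HOURLY_RATE):
    """Hours of execution a given spend buys at the hourly rate."""
    return spend / rate


itemized_total = sum(ITEMIZED.values())
unitemized = MONTHLY_TOTAL - itemized_total  # portion the review does not break down

print(execution_hours(ITEMIZED["production"]))  # production hours per month
print(execution_hours(ITEMIZED["lower"]))       # test/development hours per month
print(itemized_total, unitemized)
```

Under that assumption, the itemized amounts come to $4,000, covering about 50 production hours and 25 test and development hours, which leaves roughly $2,000 of the monthly total that the review does not break down.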

    What other advice do I have?

    DataStage is a good tool for the ETL platform, but it is not suitable for a huge volume of data. It works well for low to medium volume of data. I would advise others to do a feasibility study and evaluate available options in the market in terms of features and cost.

    I would rate IBM InfoSphere DataStage a seven out of ten.

    Which deployment model are you using for this solution?

    On-premises
    Disclosure: I am a real user, and this review is based on my own experience and opinions.
    KirillSlivchikov - PeerSpot reviewer
    Owner at 7Spring Consult
    Real User
    Top 10
    Reliable, simple to install, and useful
    Pros and Cons
    • "It is quite useful and powerful."
    • "It would be useful to provide support for Python, R, and Java."

    What is our primary use case?

    I am a consultant. I provide product information for our clients.

    What is most valuable?

    IBM InfoSphere DataStage is a good product.

    It is quite useful and powerful.

    What needs improvement?

    From a practice point of view, solutions such as IBM InfoSphere DataStage and Oracle Data Integrator are losing ground, whereas open-source solutions are becoming increasingly powerful.

    For example, we are currently working hard on several examples, and in a few years, open-source solutions will take the lead in the market. They will be used by large enterprises.

    Clients are looking for open-source solutions more and more.

    It would be useful to provide support for Python, R, and Java.

    For how long have I used the solution?

    I have more than 22 years of experience with many different products. 

    It has been three to four years that we have been using IBM InfoSphere DataStage.

    What do I think about the stability of the solution?

    I have no issues with the stability of IBM InfoSphere DataStage.

    How are customer service and support?

    Clients are quite dependent on support from the vendor. For example, if you want to activate a new feature on the product, you must create a ticket. You have no information on when it will be implemented, and the vendor does not know either, because they have a stream of tickets that are completed according to the priority given to each ticket.

    Which solution did I use previously and why did I switch?

    I am a consultant. I have different projects with different platforms. We are constantly going back and forth to different solutions for different projects.

    I have had clients who have used Amazon Redshift.

    Over the years, my clients have used many different products. For example, they use IBM Landscape and we use IBM InfoSphere.

    How was the initial setup?

    The initial setup was straightforward. We did not have issues.

    What's my experience with pricing, setup cost, and licensing?

    Comparable solutions will have common disadvantages, which is the total cost of the project.

    It's quite expensive.

    Which other solutions did I evaluate?

    From time to time, I evaluate different products for my clients.

    What other advice do I have?

    We have had different projects with three or four clients. The average term per project has been between nine months and one year.

    If you are working with an open-source solution or another solution, you can implement some features by yourself. For example, in the case of Amazon, which has AWS Lambda, you can easily write your code in Python or Java, and it will orchestrate it. You can easily create features yourself, which gives you more ability to make your solution run quicker and eliminates the dependence on the vendor.

    I would rate IBM InfoSphere DataStage an eight out of ten.

    Disclosure: I am a real user, and this review is based on my own experience and opinions.
    ArturKowalczyk - PeerSpot reviewer
    Technology Innovation Leader at Netrix S.A.
    Real User
    Top 5 Leaderboard
    Flexible with good connectivity and good modeling
    Pros and Cons
    • "We like the flexibility of modeling."
    • "The error messaging needs to be improved."

    What is our primary use case?

    The product is primarily used for intense data transformation. It is part of the risk management dataflow and sources data from the data warehouse on the SAP Sybase platform.

    What is most valuable?

    The connectivity with the databases and the speed and flexibility of modeling are excellent. We like the flexibility of modeling.

    The solution is stable.

    It can scale.

    What needs improvement?

    We'd like better integration with source control and error and diagnostic information. The error messaging needs to be improved. 

    The solution is a bit complicated. 

    For how long have I used the solution?

    I've been using the solution for four years. 

    What do I think about the stability of the solution?

    It's stable. It's reliable. There are no bugs or glitches. It doesn't crash or freeze.

    What do I think about the scalability of the solution?

    We can scale the solution as needed. 

    There are about 50 users on the solution right now. 

    How are customer service and support?

    While technical support may have been used, I have never personally dealt with them.

    Which solution did I use previously and why did I switch?

    I've used SSIS as well and find this product to be more difficult to set up.

    How was the initial setup?

    The initial setup can be challenging. It's harder to set up than, for example, SSIS.

    I'm not sure how long it took to set up, as it was already in place when I joined the team. However, I would say it took a week to deploy.

    We have five people on hand that can handle deployment and maintenance tasks. They are all engineers. 

    What about the implementation team?

    The initial setup can be handled in-house. 

    What's my experience with pricing, setup cost, and licensing?

    The licensing we have is permanent. 

    What other advice do I have?

    I'd recommend the product to others. 

    I'd rate it a nine out of ten. We've been pleased with its capabilities overall. 

    Which deployment model are you using for this solution?

    On-premises
    Disclosure: I am a real user, and this review is based on my own experience and opinions.
    Manager at a consultancy with 1,001-5,000 employees
    Real User
    Top 20
    Robust and scalable but the initial setup is not straightforward and the price is high

    What is our primary use case?

    We are a solution provider and this is one of the products that we implement for our clients.

    This solution has an end-to-end process used for data integration.

    What is most valuable?

    It's a robust solution.

    What needs improvement?

    The initial setup could be more straightforward.

    For how long have I used the solution?

    We have been providing IBM InfoSphere DataStage for one year.

    What do I think about the stability of the solution?

    I believe this solution is stable. We have not received any feedback from our clients.

    What do I think about the scalability of the solution?

    To my understanding, this solution is scalable.

    We have several customers who are currently using it.

    How are customer service and technical support?

    I have not contacted technical support.

    Which solution did I use previously and why did I switch?

    We have a long list of different providers such as Informatica, IBM, Oracle, Microsoft SSIS, Pentaho, and Talend.

    How was the initial setup?

    The installation was not straightforward, and I would rate it as medium complexity.

    What about the implementation team?

    The installation required assistance from an expert from IBM.

    What's my experience with pricing, setup cost, and licensing?

    The price is high, but there are no licensing fees.

    What other advice do I have?

    Informatica provides a cloud-based deployment but we only work with the on-premises version. This is a product that I can recommend.

    I would rate this solution a six out of ten.

    Which deployment model are you using for this solution?

    On-premises
    Disclosure: My company has a business relationship with this vendor other than being a customer: Partner
    Utkarsh Shrivastava - PeerSpot reviewer
    ETL/Solution Architect at Crux
    Real User
    Top 20
    Good performance optimization and useful for ETL purposes when we're building data warehouses or data marts
    Pros and Cons
    • "The performance optimization is quite good in DataStage. It provides parallelism and pipelining mechanisms"
    • "In the future, I would like to see more integration with cloud technologies."

    What is our primary use case?

    The primary use case is ETL when we're building data warehouses or data marts. We use DataStage to get data from disparate sources, transform it, and then load it into the data warehouse, database, or data mart.

    This solution used to be on-premises, but they've recently come out with a hybrid offering.

    What is most valuable?

    The performance optimization is quite good in DataStage. It provides parallelism and pipelining mechanisms. I have not found those in Informatica or Talend.

    What needs improvement?

    As a product, it needs to be more stable. It's a legacy product, so even though it's high-performing, it's not very stable compared to other products like Informatica or Talend. The UI also looks dated.

    In the future, I would like to see more integration with cloud technologies. Technical support could be improved.

    For how long have I used the solution?

    I've worked with DataStage for about 9 years.

    What do I think about the stability of the solution?

    The stability could be better.

    What do I think about the scalability of the solution?

    It's scalable.

    How are customer service and support?

    I would rate technical support 6 out of 10.

    How was the initial setup?

    For the on-prem solution, it was moderately complex. I'm not sure about the hybrid version.

    What other advice do I have?

    I would rate this solution 8 out of 10.

    Which deployment model are you using for this solution?

    Hybrid Cloud
    Disclosure: I am a real user, and this review is based on my own experience and opinions.
    Head of IT Integration & Finance Transformation at a financial services firm with 5,001-10,000 employees
    Real User
    Top 5
    Great flexibility with full delivery integration
    Pros and Cons
    • "Offers great flexibility."
    • "Currently lacking virtualization ability."

    What is our primary use case?

    The most powerful use of this solution is full delivery integration. Aside from the ETL aspect, it provides data lineage. We are customers of IBM InfoSphere, and I'm head of IT integration and finance transformation.

    What is most valuable?

    The key feature for me is the flexibility this solution offers. 

    What needs improvement?

    The solution is currently lacking virtualization ability. If they were to include it, it would be a good evolution of this framework. I'd also like to see an improvement in support, with more customer-friendly and knowledgeable support staff.

    For how long have I used the solution?

    I've been using this solution for over 20 years. 

    What do I think about the scalability of the solution?

    Scalability is a question of licensing, but it can definitely be done.

    How are customer service and technical support?

    Technical support could be better: more knowledgeable and more customer-friendly.

    How was the initial setup?

    We outsourced deployment to IBM and didn't have any issues. Implementation took about two weeks. We have between 20 and 30 users in the company and our maintenance is carried out by IBM.

    What's my experience with pricing, setup cost, and licensing?

    We pay an annual licensing fee. 

    Which other solutions did I evaluate?

    Yes, we evaluated other tools, and this tool is competing internally with our PySQL development. It's a battle between code and local.

    What other advice do I have?

    I rate this solution an eight out of 10. 

    Which deployment model are you using for this solution?

    On-premises
    Disclosure: I am a real user, and this review is based on my own experience and opinions.
    Buyer's Guide
    Download our free IBM InfoSphere DataStage Report and get advice and tips from experienced pros sharing their opinions.
    Updated: November 2022