ArturKowalczyk - PeerSpot reviewer
Technology Innovation Leader at Netrix S.A.
Real User
Top 5 Leaderboard
Flexible with good connectivity and good modeling
Pros and Cons
  • "We like the flexibility of modeling."
  • "The error messaging needs to be improved."

What is our primary use case?

The product is primarily used for intensive data transformation. It is part of our risk management data flow and sources data from the data warehouse on the SAP Sybase platform.

What is most valuable?

The connectivity with the databases and the speed and flexibility of modeling are excellent. We like the flexibility of modeling.

The solution is stable.

It can scale.

What needs improvement?

We'd like better integration with source control and better error and diagnostic information. The error messaging needs to be improved.

The solution is a bit complicated. 

For how long have I used the solution?

I've been using the solution for four years. 

Buyer's Guide
IBM InfoSphere DataStage
March 2024
Learn what your peers think about IBM InfoSphere DataStage. Get advice and tips from experienced pros sharing their opinions. Updated: March 2024.
768,886 professionals have used our research since 2012.

What do I think about the stability of the solution?

It's stable. It's reliable. There are no bugs or glitches. It doesn't crash or freeze.

What do I think about the scalability of the solution?

We can scale the solution as needed. 

There are about 50 users on the solution right now. 

How are customer service and support?

While technical support may have been used, I have never personally dealt with them.

Which solution did I use previously and why did I switch?

I've used SSIS as well and find this product to be more difficult to set up.

How was the initial setup?

The initial setup can be challenging. It's harder to set up than, for example, SSIS.

I'm not sure how long it took to set up, as it was already in place when I joined the team. However, I would say it took a week to deploy.

We have five people on hand that can handle deployment and maintenance tasks. They are all engineers. 

What about the implementation team?

The initial setup can be handled in-house. 

What's my experience with pricing, setup cost, and licensing?

The licensing we have is permanent. 

What other advice do I have?

I'd recommend the product to others. 

I'd rate it a nine out of ten. We've been pleased with its capabilities overall. 

Which deployment model are you using for this solution?

On-premises
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Partner at Avydium Data LLC
User
Its parallel processing capability allows you to go through extremely large data sets in no time at all
Pros and Cons
  • "Highly customizable: Allowing you to handle multiple data latencies (scheduled batch, on-demand, and real-time) in the same job."
  • "Working with some of the big data components is good, but I can see improvements are needed."

What is our primary use case?

Complex data integration projects which require integration from multiple data sources.

How has it helped my organization?

I have worked on many implementations using DataStage. All of the projects that I worked on have been successful. This is due mainly to strict discipline around best practices and to following a set of standards and templates designed to reduce complexity and improve automation, including a strong reference architecture.

What is most valuable?

  • Its parallel processing capability allows you to go through extremely large data sets in no time at all, if you do your job right. 
  • Highly customizable: Allowing you to handle multiple data latencies (scheduled batch, on-demand, and real-time) in the same job. 
  • High scalability: Start small and go big with the same job. You just need to adjust the configuration file, no need to recompile.
  • Strong metadata management: Business, technical, and process metadata can all be managed from a single place.
  • Ease of integration with other tool sets: Easily supports APIs (or build your own) to support data streaming (or batched) from other systems.
  • Data Quality Management from within the tool: Supporting data sampling, including profiling of data, directly from the development canvas.

What needs improvement?

High cost of ownership: They could take a page from open source software, such as Talend.

Working with some of the big data components is good, but I can see improvements are needed, such as native support for Spark and HBase.

For how long have I used the solution?

More than five years.

What do I think about the stability of the solution?

No issues.

What do I think about the scalability of the solution?

No issues.

How are customer service and technical support?

Support is always good.

Which solution did I use previously and why did I switch?

Have used quite a few ETL tools in my job.

  • Ab Initio: Even pricier, but has a highly competent ETL tool. It is complete, but hard to use. 
  • Informatica: Not as flexible and does not support the same level of complexity in its maps.
  • Talend: It is a good tool suite, extensive, but can be cumbersome to cite all its pieces.
  • ODI: For the Oracle centric world.
  • SSIS: Weak when compared to any of the above tool sets.

How was the initial setup?

It depends on the type of environment being installed. I have seen initial setups range from fairly simple to overly complex because of the environment, not the tool.

What about the implementation team?

Both vendor and in-house team implementations:

IBM has top-notch support and tool services, along with other partners as well. Depending on the partner, this can go from installation and configuration to solution development, etc.

Most in-house teams that I have seen tend to have good developers, but not always good architects. As with most data integration projects, if you do not have a strong architecture, your solution will eventually fail.

What was our ROI?

Depends on the project.

Which other solutions did I evaluate?

Have done many ETL tool evaluations based on client requirements. DataStage has always been in the top-three. It may not have been selected due to different weights being used for different sections of the evaluation for different clients, but it has always been in the top-three consistently.

What other advice do I have?

If you have the budget and your solution requires industrial/enterprise strength data integration, this product is always a good choice.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Kirill Slivchikov - PeerSpot reviewer
Owner at 7Spring Consult
Real User
Top 5
A stable and scalable ETL tool that needs to integrate basic data quality check features
Pros and Cons
  • "I am impressed with the tool's ETL tracing."
  • "It would be great if they can include some basic version of data quality checking features."

What is our primary use case?

I work as a consultant and have several projects with Russian banks. My main expertise is building data warehouses, and I use the product as an ETL tool.

What is most valuable?

I am impressed with the tool's ETL tracing. 

What needs improvement?

It would be great if they can include some basic version of data quality checking features. 

What do I think about the stability of the solution?

The solution is quite stable. 

What do I think about the scalability of the solution?

I would rate the product's scalability an eight out of ten. 

What other advice do I have?

I would rate the product a nine out of ten. You need to get a balance between batch ETL processing and streaming. 

Which deployment model are you using for this solution?

On-premises
Disclosure: My company has a business relationship with this vendor other than being a customer:
PeerSpot user
Manager at a consultancy with 1,001-5,000 employees
Real User
Robust and scalable but the initial setup is not straightforward and the price is high
Pros and Cons
  • "It's a robust solution."
  • "The initial setup could be more straightforward."

What is our primary use case?

We are a solution provider and this is one of the products that we implement for our clients.

This solution has an end-to-end process used for data integration.

What is most valuable?

It's a robust solution.

What needs improvement?

The initial setup could be more straightforward.

For how long have I used the solution?

We have been providing IBM InfoSphere DataStage for one year.

What do I think about the stability of the solution?

I believe this solution is stable. We have not received any feedback from our clients.

What do I think about the scalability of the solution?

To my understanding, this solution is scalable.

We have several customers who are currently using it.

How are customer service and technical support?

I have not contacted technical support.

Which solution did I use previously and why did I switch?

We have a long list of different providers such as Informatica, IBM, Oracle, Microsoft SSIS, Pentaho, and Talend.

How was the initial setup?

The installation was not straightforward and I would rate it at medium complexity.

What about the implementation team?

The installation required assistance from an expert from IBM.

What's my experience with pricing, setup cost, and licensing?

The price is expensive but there are no licensing fees.

What other advice do I have?

Informatica provides a cloud-based deployment but we only work with the on-premises version. This is a product that I can recommend.

I would rate this solution a six out of ten.

Which deployment model are you using for this solution?

On-premises
Disclosure: My company has a business relationship with this vendor other than being a customer: Partner
PeerSpot user
Utkarsh Shrivastava - PeerSpot reviewer
ETL/Solution Architect at Crux
Real User
Top 20
Good performance optimization and useful for ETL purposes when we're building data warehouses or data marts
Pros and Cons
  • "The performance optimization is quite good in DataStage. It provides parallelism and pipelining mechanisms"
  • "In the future, I would like to see more integration with cloud technologies."

What is our primary use case?

The primary use case is ETL when we're building data warehouses or data marts. We use DataStage to get data from disparate sources, transform it, and load it into the data warehouse, database, or data mart.

This solution used to be on-premises, but they've recently come out with a hybrid offering.

What is most valuable?

The performance optimization is quite good in DataStage. It provides parallelism and pipelining mechanisms. I have not found those in Informatica or Talend.
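For readers unfamiliar with the two terms, here is a minimal Python sketch of the general ideas. It is not DataStage code and does not reflect how the product implements them. The function names `partition`, `transform`, and `run`, and the sample rows, are hypothetical and chosen only for illustration. Partition parallelism splits the rows by key so that each partition is processed independently; pipelining chains the stages so that each row flows through them without an intermediate data set being materialized between stages.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(rows, n):
    """Split (key, value) rows into n partitions by key (modulo partitioning)."""
    parts = [[] for _ in range(n)]
    for key, value in rows:
        parts[key % n].append((key, value))
    return parts


def transform(part):
    """Run one partition through two chained stages.

    Each stage is a generator, so a row moves through both stages
    before the next row is read (a simple form of pipelining).
    """
    trimmed = ((key, value.strip()) for key, value in part)
    uppercased = ((key, value.upper()) for key, value in trimmed)
    return list(uppercased)


def run(rows, n=2):
    """Process all partitions concurrently and merge the results by key."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        results = list(pool.map(transform, partition(rows, n)))
    return sorted(row for part in results for row in part)


rows = [(1, " a "), (2, "b"), (3, " c")]
print(run(rows, n=2))
print(run(rows, n=4))
```

Changing `n` changes only how many partitions run side by side; the transformation logic itself is untouched, and the merged result is the same for any `n`.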

What needs improvement?

As a product, it needs to be more stable. It's a legacy product, so even though it's high-performing, it's not very stable compared to other products like Informatica or Talend. The UI also looks dated.

In the future, I would like to see more integration with cloud technologies. Technical support could be improved.

For how long have I used the solution?

I've worked with DataStage for about 9 years.

What do I think about the stability of the solution?

The stability could be better.

What do I think about the scalability of the solution?

It's scalable.

How are customer service and support?

I would rate technical support 6 out of 10.

How was the initial setup?

For the on-prem solution, it was moderately complex. I'm not sure about the hybrid version.

What other advice do I have?

I would rate this solution 8 out of 10.

Which deployment model are you using for this solution?

Hybrid Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Systems Integration Associate Director at a computer software company with 10,001+ employees
Real User
Helpful support, and the Hierarchical Data Stage is good
Pros and Cons
  • "The Hierarchical Data Stage is good."
  • "The interface needs improvement."

What is our primary use case?

We are a consulting company and we use this solution for our clients. We set up the data for them. We have various healthcare-related information from their vendors and business partners. They have integrated it and get data reports from it.

How has it helped my organization?

It improves how our client's organization functions.

What is most valuable?

We mainly use the designer and developer qualities. We use the basic features that we have.

They have many good features. The Hierarchical Data Stage is good.

What needs improvement?

The interface needs improvement. The interface in Informatica is easier than in DataStage.

The licensing can be improved. Many companies are moving away from DataStage because it is expensive.

The biggest unresolved issue is how DataStage jobs can be integrated into DevOps when they are stored as binary files.

We would like to see DataStage integrated with DevOps so that a pipeline can be created for auto-deployment. Right now we are all doing it manually.

For how long have I used the solution?

I have been working with IBM InfoSphere DataStage for seven years.

We have the 11.3 version but have recently migrated to the 11.7 version.

What do I think about the stability of the solution?

It's a stable product, it's not new.

What do I think about the scalability of the solution?

It's very scalable. Our clients are medium-sized companies with a 1.5 billion turnover.

How are customer service and technical support?

We reached out to IBM because the file was not readable, and they resolved the issue.

Technical support is good. I have not found any issues with technical support. I would rate them an eight out of ten.

In some cases, they have a delay in giving suggestions for the configuration.

Which solution did I use previously and why did I switch?

Previously, in another company, I worked with Informatica. There are not a lot of differences, but the Informatica interface is easier than the one in DataStage.

How was the initial setup?

I don't do the setup, but I think that they have many challenges.

Initially, we had challenges with the configuration. We were trying to use the comparison for Excel, and reading the Excel files from the source, but the files were not readable.

What's my experience with pricing, setup cost, and licensing?

It's very expensive.

What other advice do I have?

I am not a developer, I have a team within our company for that.

There is a cloud migration strategy going on, so they are thinking of moving to the cloud. They want a tool that is not heavy and suitable for their budget.

The recommendation for using this tool would depend on the requirements. 

I don't have anything bad to say about this product.

I would rate this solution an eight out of ten.

Which deployment model are you using for this solution?

On-premises
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Data/Solution Architect at a computer software company with 51-200 employees
Real User
Robust, easy to use, has a simple error logging mechanism, and works very well for huge volumes of data
Pros and Cons
  • "As a data integration platform, it is easy to use. It is quite robust and useful for volumetric analysis when you have huge volumes of data. We have tested it for up to ten million rows, and it is robust enough to process ten million rows internally with its parallel processing. Its error logging mechanism is far simpler and easier to understand than other data integration tools. The newer version of InfoSphere has the data catalog and IDC lineage. They are helpful in the easy traceability of columns and tables."
  • "Its documentation is not up to the mark. While building APIs, we had a lot of problems trying to get around it because it is not very user-friendly. We tried to get hold of API documentation, but the documentation is not very well thought out. It should be more structured and elaborate. In terms of additional features, I would like to see good reporting on performance and performance-tuning recommendations that can be based on AI. I would also like to see better data profiling information being reported on InfoSphere."

What is our primary use case?

We use it for creating a pattern for data integration with our data vault. We have also used it for creating APIs.

What is most valuable?

As a data integration platform, it is easy to use. It is quite robust and useful for volumetric analysis when you have huge volumes of data. We have tested it for up to ten million rows, and it is robust enough to process ten million rows internally with its parallel processing. 

Its error logging mechanism is far simpler and easier to understand than other data integration tools.

The newer version of InfoSphere has the data catalog and IDC lineage. They are helpful in the easy traceability of columns and tables.

What needs improvement?

Its documentation is not up to the mark. While building APIs, we had a lot of problems trying to get around it because it is not very user-friendly. We tried to get hold of API documentation, but the documentation is not very well thought out. It should be more structured and elaborate.

In terms of additional features, I would like to see good reporting on performance and performance-tuning recommendations that can be based on AI. I would also like to see better data profiling information being reported on InfoSphere.

For how long have I used the solution?

It was DataStage previously, and then it became InfoSphere. I have used DataStage for ten years and InfoSphere for one year.

What do I think about the stability of the solution?

It is quite stable. In the newer components of InfoSphere, you have a mapping tool called FastTrack and a metadata generator, which can have issues from time to time, but they get resolved.

What do I think about the scalability of the solution?

It is not that easy to scale on-premises. I have worked on the ones deployed on Windows or Unix, and scalability is often dependent on whether you can add more CPUs or boxes. On the cloud, it would have been easier to scale. However, the current version can only be deployed on Windows or Unix.

How are customer service and technical support?

I have not been in touch with them recently. Earlier, I was in touch with their technical support and had raised tickets because some weird errors, such as phantom errors, were being logged in the error log, which made no sense. We used to get in touch with their support team to understand these.

Which solution did I use previously and why did I switch?

I have used Informatica and SAS CA. IBM InfoSphere has the highest cost of licensing as compared to others. It is not very widely used, and it is very difficult to find people who have this sort of knowledge. 

The newer version of Informatica is on the cloud and is much more user-friendly than InfoSphere because it provides profiling information in nice graphs and charts. It also provides a lot of templates. For example, if I want to build a whole dimensional kind of structure, Informatica has a template. I just need to use that template. So, the ease of use is far better in Informatica, and it has everything that InfoSphere has. The only thing is that Informatica comes in bundles. That's the reason sometimes organizations don't go for it. For example, the data integration is a separate section, and the data quality is a separate section. They have separate pricing.

How was the initial setup?

The initial setup is quite simple. It didn't take more than half an hour to set it up on my laptop.

What about the implementation team?

I implemented it myself. In terms of maintenance, a particular version might not require any maintenance. There could be bug fixes and minor versions going in for some versions.

What's my experience with pricing, setup cost, and licensing?

It is quite expensive.

What other advice do I have?

I would recommend this solution for large-scale implementation where you need a complex transformation and data integration to happen according to a structured format, either a data vault or a dimension model. It is suitable for big companies because of the cost. It is a very valuable platform for data in large volumes. For small volumes, you have other open-source tools that can do the same thing for you.

I am part of a consultancy, and I have deployed this product for companies. We have five to eight developers. Because InfoSphere is a licensed product, and its licenses cost a lot, there are not many InfoSphere developers.

I would rate IBM InfoSphere DataStage an eight out of ten.

Which deployment model are you using for this solution?

On-premises
Disclosure: My company has a business relationship with this vendor other than being a customer: Partner
PeerSpot user
CEO at DELOMID IT
Real User
Top 20
A solution that is easy to use for designing and transferring data
Pros and Cons
  • "The most valuable feature is the ability to transfer information via notes."
  • "The documentation and in-application help for this solution need to be improved, especially for new features."

What is our primary use case?

We implement this solution for our customers. The majority of them are Enterprise companies.

What is most valuable?

This solution is very easy to use because you can design, compile, and run.

The most valuable feature is the ability to transfer information via notes.

What needs improvement?

The documentation and in-application help for this solution need to be improved, especially for new features. By comparison, in Talend, there is help available for all of the features.

One of my clients has a problem using this solution with MongoDB.

In the next release of this solution, I would like to see the ability to copy and paste schemas. It would be very good because as it is now, you have to save the schema to a repository and then re-load it. It can be done in Talend, but in DataStage, it is not as good.

For how long have I used the solution?

Eight years.

What do I think about the stability of the solution?

This is a stable solution. You have to be careful when you install a service pack because sometimes it causes problems. There may be a second service pack to solve problems that were introduced by the first one.

What do I think about the scalability of the solution?

Scaling this solution is not difficult. When you first install, you choose which components you need. My clients are enterprise companies, with at least five hundred or a thousand employees.

How are customer service and technical support?

There are several technical support teams, and the quality of support depends on where the customer is situated. Normally, technical support answers quickly, but it can be improved.

How was the initial setup?

The initial setup is not easy. You have to be an expert to install DataStage.

Sometimes I get calls from clients who ask me to install this solution because their Unix administrator is not able to do it. You have to configure your OS, database, web server, and more. There are a lot of things to install.

If you are not experienced then it is not possible to install.

What's my experience with pricing, setup cost, and licensing?

Small and medium-sized companies cannot afford to pay for this solution.

Which other solutions did I evaluate?

This solution is for larger companies. Smaller businesses use Talend.

What other advice do I have?

This is a good product, but there is room for improvement.

I would rate this solution an eight out of ten.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user