IBM InfoSphere DataStage and StreamSets both compete in the data integration solutions category. StreamSets appears to have the upper hand due to its user-friendly approach, rapid deployment capabilities, and cost-effectiveness.
Features: DataStage offers advanced ETL functionalities, supports parallel processing, and integrates effectively with IBM systems. Its capabilities extend to comprehensive metadata management and robust system integration. StreamSets provides flexible pipeline design, diverse connectors, and excellent data drift management. It supports both batch and streaming data with ease, allowing seamless integration across various platforms.
Room for Improvement: DataStage is noted for its high cost and complex setup, with calls for better cloud integration and a modernized interface. The administrative experience could benefit from alignment with cloud services. StreamSets would benefit from improved user documentation and data quality assessments. It also requires technical expertise for complex transformations and could enhance logging and visualization mechanisms.
Ease of Deployment and Customer Service: DataStage is mostly deployed on-premises, offering limited flexibility for hybrid environments. Users report varying experiences with its customer support. StreamSets supports hybrid and cloud environments, making deployment more flexible. Its customer service receives positive feedback, although the cost of the support model and the documentation are noted as areas for improvement.
Pricing and ROI: DataStage's enterprise-level pricing is considered high, which limits accessibility for smaller businesses despite its ROI. StreamSets provides a more flexible pricing model that includes open-source options, offering favorable ROI due to ease of use and integration. However, smaller enterprises may still find its commercial licensing costly, even though the model is relatively cost-effective.
We also have the flexibility to submit a feature request to be included as part of the wishlist, potentially becoming a product feature in subsequent releases.
IBM tech support has allocated dedicated resources, making it satisfactory.
IBM technical support sometimes transfers tickets between different teams due to shift changes, which can be frustrating.
I wonder if it supports other areas, such as cloud environments with open source support, or EdgeShift.
The solution needs improvement in connectivity with big data technologies such as Spark.
It would be beneficial if StreamSets addressed any potential memory leak issues to prevent unnecessary upgrades.
Pricing for IBM InfoSphere DataStage is moderate and not very expensive.
It is straightforward from a design and development perspective, and also for deployment.
IBM InfoSphere DataStage is very scalable, allowing us to extend it according to our processing needs.
It allows a hybrid installation approach, rather than being completely cloud-based or on-premises.
IBM InfoSphere DataStage is a data integration tool for designing, developing, and running jobs that move and transform data, aimed at organizations of different sizes. The product integrates data across multiple systems through a high-performance parallel framework. It supports extended metadata management, enterprise connectivity, and integration of all types of data.
The solution is the data integration component of IBM InfoSphere Information Server, providing a graphical framework for moving data from source systems to target systems. IBM InfoSphere DataStage can deliver data to data warehouses, data marts, operational data stores, and other enterprise applications. The tool supports both extract, transform, and load (ETL) and extract, load, and transform (ELT) patterns. The platform achieves its scalability through parallel processing and enterprise connectivity.
The solution has various versions, catering to different types of companies, which include the Server Edition, the Enterprise Edition, and the MVS Edition. Depending on which version a company has bought, different goals can be achieved. They include the following:
IBM InfoSphere DataStage can be deployed in various ways, including:
IBM InfoSphere DataStage Features
The tool has various features through which users can integrate and utilize their data effectively. The components of IBM InfoSphere DataStage include:
IBM InfoSphere DataStage Benefits
This solution offers many benefits for the companies that utilize it for data integration. Some of these benefits include:
Reviews from Real Users
A data/solution architect at a computer software company says the product is robust, easy to use, has a simple error logging mechanism, and works very well for huge volumes of data.
Tirthankar Roy Chowdhury, team leader at Tata Consultancy Services, feels the tool is user-friendly with a lot of functionalities, and doesn't require much coding because of its drag-and-drop features.
StreamSets is a data integration platform that enables organizations to efficiently move and process data across various systems. It offers a user-friendly interface for designing, deploying, and managing data pipelines, allowing users to easily connect to various data sources and destinations. StreamSets also provides real-time monitoring and alerting capabilities, ensuring that data is flowing smoothly and any issues are quickly addressed.