

IBM InfoSphere DataStage and StreamSets both compete in the data integration solutions category. StreamSets appears to have the upper hand due to its user-friendly approach, rapid deployment capabilities, and cost-effectiveness.
Features: DataStage offers advanced ETL functionalities, supports parallel processing, and integrates effectively with IBM systems. Its capabilities extend to comprehensive metadata management and robust system integration. StreamSets provides flexible pipeline design, diverse connectors, and excellent data drift management. It supports both batch and streaming data with ease, allowing seamless integration across various platforms.
Room for Improvement: DataStage is noted for its high cost and complex setup, with calls for better cloud integration and a modernized interface. The administrative experience could benefit from alignment with cloud services. StreamSets would benefit from improved user documentation and data quality assessments. It also requires technical expertise for complex transformations and could enhance logging and visualization mechanisms.
Ease of Deployment and Customer Service: DataStage is mostly deployed on-premises, offering limited hybrid environment flexibility. Users report varying experiences with its customer support. StreamSets supports hybrid and cloud environments, making deployment more flexible. Its customer service receives positive feedback, although the cost of the support model and the quality of the documentation are noted as areas for improvement.
Pricing and ROI: DataStage's enterprise-level deployment pricing is deemed high, limiting accessibility for smaller businesses despite ROI outcomes. StreamSets provides a more flexible pricing model that includes open-source options, offering favorable ROI due to ease of use and integration. However, smaller enterprises may find commercial licensing costly, despite the model's relative cost-effectiveness.
We also have the flexibility to submit a feature request to be included as part of the wishlist, potentially becoming a product feature in subsequent releases.
I rate their support as nine on a scale from one to ten.
IBM tech support has allocated dedicated resources, making it satisfactory.
IBM technical support sometimes transfers tickets between different teams due to shift changes, which can be frustrating.
If the job provided suggestions about running this kind of parallel processing and how many virtual nodes are required, it would help.
An additional feature I would want to see in the next release is the ability to work on logs, especially machine logs or artificial logs, to pull semi-structured or unstructured data without having to write extensive code in Python and integrate it.
I wonder if it supports other areas, such as cloud environments with open source support, or EdgeShift.
The solution needs improvement in connectivity with big data technologies such as Spark.
It would be beneficial if StreamSets addressed any potential memory leak issues to prevent unnecessary upgrades.
Pricing for IBM InfoSphere DataStage is moderate and not very expensive.
IBM InfoSphere DataStage is very scalable, allowing us to extend it according to our processing needs.
It is straightforward from a design and development perspective, and also for deployment.
I have leveraged IBM InfoSphere DataStage's integration with IBM's Information Server suite, and it is indeed beneficial.
It allows a hybrid installation approach, rather than being completely cloud-based or on-premises.
| Product | Mindshare (%) |
|---|---|
| IBM InfoSphere DataStage | 1.5% |
| StreamSets | 1.2% |
| Other | 97.3% |

| Company Size | Count |
|---|---|
| Small Business | 23 |
| Midsize Enterprise | 4 |
| Large Enterprise | 26 |

| Company Size | Count |
|---|---|
| Small Business | 9 |
| Midsize Enterprise | 2 |
| Large Enterprise | 11 |
IBM InfoSphere DataStage offers powerful ETL capabilities focusing on data transformation and integration, ensuring seamless data processing and management in complex environments. It is particularly valued for handling extensive data volumes with robust transformation features and scalability options.
IBM InfoSphere DataStage is renowned for its strength in data extraction, transformation, and loading, making it a preferred choice for businesses handling large datasets. It provides extensive database connectors, integrates efficiently with existing systems, and facilitates complex data transformations. Users appreciate its scalability, metadata management, and effectiveness in applying business rules. Despite this, areas for improvement include enhanced cloud integration, better error messaging, and expanded connectivity with modern databases. Its pricing scheme and deployment complexity also present considerations for potential users.
Businesses in sectors like telecommunications, banking, and insurance commonly implement IBM InfoSphere DataStage for ETL processes. It's used for integrating data from multiple sources into data warehouses, supporting business intelligence initiatives, and managing data quality. Known for efficiently handling integration of mainframes and Oracle databases, it supports complex data projects tailored to industry needs.
StreamSets streamlines data pipeline creation, connecting data from multiple sources to destinations like cloud platforms with minimal coding. Its centralized platform and intuitive design enhance ETL and data migration processes.
StreamSets integrates seamlessly with analytics platforms, offering tools such as Data Collector and Control Hub to facilitate data ingestion, transformation, and machine learning integrations. Its user-friendly interface and ready connectors aid in configuring complex data pipelines. With built-in data drift resilience and scheduling options, users experience efficient, scalable data management, despite challenges like latency in cloud storage and interface enhancement needs. Users often employ StreamSets for batch loading, real-time data processing, and smart data pipeline management.
In industries like finance and technology, StreamSets supports data migration, machine learning integrations, and analytics by simplifying data transformation and enhancing decision-making capabilities through its robust pipeline management.