Data Scientist at a wellness & fitness company with 51-200 employees
Real User
Data ingestion has reduced manual effort to import data

What is our primary use case?

The primary use case is data ingestion. We currently have HDP 2.6 installed on Ubuntu 16.04.

How has it helped my organization?

It has reduced the manual effort needed to import data.

What is most valuable?

Data ingestion

What needs improvement?

Not enough material is available for beginners.

Buyer's Guide
Talend Data Quality
June 2025
Learn what your peers think about Talend Data Quality. Get advice and tips from experienced pros sharing their opinions. Updated: June 2025.
863,564 professionals have used our research since 2012.

For how long have I used the solution?

Less than one year.
Disclosure: My company does not have a business relationship with this vendor other than being a customer.

    it_user827655 - PeerSpot reviewer
    Principal Developer
    Real User
    It lowers the amount of time in development from weeks to a day
    Pros and Cons
    • "It lowers the amount of time in development from weeks to a day."
    • "If the SQL input controls could dynamically determine the schema based on the SQL alone, it would simplify the steps of having to use a manually created and saved schema for use in the tMap for the Postgres and Redshift components. This would make things even easier."

    What is our primary use case?

    We use it to load our big data system with S3 and Redshift. We also use it to process HL7 messages from hospitals in real time.

    How has it helped my organization?

    It lowers the amount of time in development from weeks to a day.

    What is most valuable?

    The ease of transforming data with inputs to tMap and tJavaRow makes life so easy.

    What needs improvement?

    There is one place where I would appreciate an upgrade, if it is possible. If the SQL input controls could dynamically determine the schema based on the SQL alone, it would remove the need to manually create and save a schema for use in the tMap for the Postgres and Redshift components. This would make things even easier. When it does guess the schema, it tends to bring back every column from every table, or every column from the table specified in the component's table name. Sometimes, the SQL draws on multiple tables and applies some transformations to the data.

    I do not know if it would even be possible, but if this could be figured out automatically for the column names and types, that would be amazing.
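As an illustration of the idea only (this is not Talend behavior and not the reviewer's code), the sketch below runs a multi-table query with transformations and reads the result columns back from the database cursor. It uses Python's built-in `sqlite3` module as a stand-in for Postgres or Redshift; the tables `patients` and `visits` and the helper `infer_schema` are hypothetical. Because `sqlite3` only reports column names, the types are guessed from the first row, which is an assumption that fails if that row contains NULLs.

```python
import sqlite3

def infer_schema(conn, sql):
    """Run the query and return [(column_name, python_type_name), ...].

    Column names come from cursor.description; types are guessed from the
    first result row (None if the query returns no rows).
    """
    cur = conn.execute(sql)
    names = [d[0] for d in cur.description]
    first = cur.fetchone()
    if first is None:
        return [(n, None) for n in names]
    return [(n, type(v).__name__) for n, v in zip(names, first)]

# Hypothetical tables standing in for a real source database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE patients (id INTEGER, name TEXT);
    CREATE TABLE visits (patient_id INTEGER, cost REAL);
    INSERT INTO patients VALUES (1, 'Ann');
    INSERT INTO visits VALUES (1, 12.5), (1, 7.5);
""")

# A query that joins two tables and derives new columns.
schema = infer_schema(conn, """
    SELECT p.name AS patient, COUNT(*) AS visit_count, SUM(v.cost) AS total_cost
    FROM patients p JOIN visits v ON v.patient_id = p.id
    GROUP BY p.name
""")
print(schema)
```

The result lists only the three columns the query actually produces, rather than every column of every table involved.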

    For how long have I used the solution?

    More than five years.

    What other advice do I have?

    I have not run into anything we could not use Talend to find a solution for.

    Disclosure: My company does not have a business relationship with this vendor other than being a customer.

      it_user826299 - PeerSpot reviewer
      Junior ETL Developer at a marketing services firm with 51-200 employees
      Real User
      Heap space issues plague us consistently. However, the file fetch process is impeccable.
      Pros and Cons
      • "The file fetch process is impeccable."
      • "We are able to get emails from URLs very easily using this function when others fail."
      • "tLogRows are also great for finding bad data."
      • "NullPointerExceptions are going to be the death of me and are a big reason for our transition away from Talend. One day, it is fine with 1,000 blank rows, then the next day, it will find one blank cell and it breaks down."
      • "Heap space issues plague us consistently. We maxed it out and it runs fine, then it doesn’t, then it does."
      • "Finding assistance with issues can be spotty. With Python, there are literally millions of open source answers which are recent and apply to the version that we are using."

      What is our primary use case?

      We are a marketing and advertising company. We use this tool to fetch data from Google, Bing, and Adobe. We receive marketing data daily via email, FTP, and API, then process the data into MySQL tables.

      How has it helped my organization?

      I came into the department with no knowledge of Talend. The interface has been user-friendly enough that I came up to speed on almost all of its functions in four to five months and can now use it like a pro.

      What is most valuable?

      • The file fetch process is impeccable. 
      • We are able to get emails from URLs very easily using this function when others fail. 
      • tLogRows are also great for finding bad data.

      What needs improvement?

      NullPointerExceptions are going to be the death of me and are a big reason for our transition away from Talend. One day, it is fine with 1,000 blank rows, then the next day, it will find one blank cell and it breaks down. When we are dealing with millions of rows of data, the offending cell can be super hard to find.
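One way to narrow down where blank cells sit before a job runs is a quick pre-scan of the input file. The following is a minimal sketch in Python, assuming the data arrives as CSV with a header row; the function name `find_blank_cells` and the sample columns are hypothetical and not part of Talend.

```python
import csv
import io

def find_blank_cells(lines):
    """Yield (data_row_number, column_name) for each empty or whitespace-only cell.

    Rows are numbered from 1, not counting the header line.
    """
    reader = csv.DictReader(lines)
    for row_number, row in enumerate(reader, start=1):
        for column, value in row.items():
            if column is None:
                continue  # extra fields beyond the header; not checked here
            if value is None or value.strip() == "":
                yield row_number, column

# Hypothetical marketing extract with two blank cells.
sample = io.StringIO(
    "id,email,clicks\n"
    "1,a@example.com,10\n"
    "2,,7\n"
    "3,c@example.com, \n"
)
blanks = list(find_blank_cells(sample))
print(blanks)
```

Because the function is a generator, it can stream through a large file without loading every row into memory.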

      Heap space issues also plague us consistently. We maxed it out and it runs fine, then it doesn’t, then it does. 

      Finding assistance with issues can be spotty. With Python, there are literally millions of open source answers which are recent and apply to the version that we are using. 

      Inconsistency is a big issue.

      For how long have I used the solution?

      Three to five years.
      Disclosure: My company does not have a business relationship with this vendor other than being a customer.

        it_user826677 - PeerSpot reviewer
        Technical Consultant
        Consultant
        Provides a flexible development environment to the coder
        Pros and Cons
        • "It has definitely streamlined certain processes."
        • "Provides a flexible development environment to the coder."
        • "The ability to change the code when debugging the JavaScript could be improved."

        What is our primary use case?

        Data migration (database to database using direct DB access and commands or using web services).

        How has it helped my organization?

        It has definitely streamlined certain processes.

        What is most valuable?

        The ability to build the interface using clear components and to access the code (Java) to validate and trace any error. The wide range of components suits a variety of purposes and provides a flexible development environment to the coder.

        What needs improvement?

        The ability to change the code when debugging the JavaScript could be improved.

        For how long have I used the solution?

        One to three years.
        Disclosure: My company does not have a business relationship with this vendor other than being a customer.

          Practice Manager
          Real User
          It reduces the QA effort immensely by handling most of the test scenarios in a reusable way
          Pros and Cons
          • "It reduces the QA effort immensely by handling most of the test scenarios in a reusable way."
          • "This product speeds up the unit testing and QA for specific test scenarios. As a result, the development output quality can be evaluated and adjusted."
          • "I like the idea of storing the results of Data Quality jobs in a DB and having the ability to run reports in the DB to show a dashboard of quality metrics."
          • "There are too many functions which could be streamlined."
          • "Many functions are not streamlined and could be refined into better off-the-shelf functions."

          What is our primary use case?

          Data Quality is used to automate the quality control check on the data loaded from batch jobs. This includes BCA for field-level data quality and cross-table checks for key column mismatches.

          The data is in Redshift and the load volume is around 10 million records per batch load over more than 100 tables in a Data Vault model.

          This is for a short three-month project. I have used it from the dev phase through QA. This reduces the QA effort immensely by handling most of the test scenarios in a reusable way.
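To make the two kinds of check mentioned above concrete, here is a minimal sketch in plain SQL driven from Python, not the Talend Data Quality components themselves. It uses the built-in `sqlite3` module in place of Redshift, and the Data Vault style tables `hub_customer` and `sat_customer` are hypothetical.

```python
import sqlite3

# Hypothetical hub and satellite tables standing in for a Data Vault model.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE hub_customer (customer_key TEXT PRIMARY KEY);
    CREATE TABLE sat_customer (customer_key TEXT, email TEXT);
    INSERT INTO hub_customer VALUES ('C1'), ('C2');
    INSERT INTO sat_customer VALUES ('C1', 'a@example.com'), ('C3', NULL);
""")

# Cross-table check: satellite keys that have no matching hub row.
orphans = [row[0] for row in conn.execute("""
    SELECT s.customer_key
    FROM sat_customer s
    LEFT JOIN hub_customer h ON h.customer_key = s.customer_key
    WHERE h.customer_key IS NULL
    ORDER BY s.customer_key
""")]

# Field-level check: how many satellite rows are missing an email.
null_emails = conn.execute(
    "SELECT COUNT(*) FROM sat_customer WHERE email IS NULL"
).fetchone()[0]

print(orphans, null_emails)
```

Written this way, the same two queries can be rerun after every batch load, which is the reusable aspect the review points to.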

          How has it helped my organization?

          This product speeds up the unit testing and QA for specific test scenarios. As a result, the development output quality can be evaluated and adjusted.

          What is most valuable?

          I like the components provided by Data Quality, such as:

          • Address standardization
          • Fuzzy match
          • Schema compliance check

          They pack a lot of code that is required to perform these standard data operations. Doing the same by coding would be error-prone, take a lot of time, and provide output quality which is biased.

          Apart from specific components, I like the idea of storing the results of Data Quality jobs in a DB and having the ability to run reports in the DB to show a dashboard of quality metrics.

          What needs improvement?

          • The report generation and using the report in DI job steps could be improved. 
          • There are too many functions which could be streamlined. 
          • The report generated often has too many pages to go through, if not loaded into a DB.
          • Many functions are not streamlined and could be refined into better off-the-shelf functions.

          For how long have I used the solution?

          Trial/evaluations only.
          Disclosure: My company does not have a business relationship with this vendor other than being a customer.

            it_user497733 - PeerSpot reviewer
            Executive Director and Business Unit Manager at a tech company with 51-200 employees
            Vendor
            It helps more accurately identify data-quality issues, and it is simple to install.

            What is most valuable?

            • Analysing data trends: This works when you add a column to analyse. It shows you max, min, nulls, etc. per field. It allows a snapshot of your data.
            • Duplication
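For readers unfamiliar with this kind of column analysis, the sketch below computes the same sort of per-field snapshot (max, min, nulls) in plain Python. It is only an illustration under assumed data; the function `profile_column` and the sample rows are hypothetical and do not reproduce Talend's analysis.

```python
def profile_column(values):
    """Return basic statistics for one column; None counts as a null."""
    present = [v for v in values if v is not None]
    return {
        "count": len(values),
        "nulls": len(values) - len(present),
        "distinct": len(set(present)),
        "min": min(present) if present else None,
        "max": max(present) if present else None,
    }

# Hypothetical rows with one missing age.
rows = [
    {"age": 34, "city": "Oslo"},
    {"age": None, "city": "Bergen"},
    {"age": 51, "city": "Oslo"},
]

# Build one profile per column name found in the first row.
profile = {col: profile_column([r[col] for r in rows]) for col in rows[0]}
print(profile["age"])
print(profile["city"])
```

Comparing `count` with `distinct` also gives a rough signal of duplicated values in a field.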

            How has it helped my organization?

            • More accurate data-quality issue identification
            • Reporting

            What needs improvement?

            I would like to see them add a configuration wizard.

            For how long have I used the solution?

            I have been using it for two years.

            What do I think about the stability of the solution?

            I did not encounter any stability issues.

            What do I think about the scalability of the solution?

            I encountered scalability issues.

            How are customer service and technical support?

            I consulted a lot of product forums, but I did not ask for support from Talend.

            How was the initial setup?

            The Talend software is very simple to install. Because it runs on the Java platform, you need to make sure you have a JRE installed. Then, you download the ZIP file from the Talend website. You extract the file, and the software is ready to use by executing the EXE file.

            What's my experience with pricing, setup cost, and licensing?

            Try the free version first!

            What other advice do I have?

            It is a good tool; include it in your planning.

            Disclosure: My company has a business relationship with this vendor other than being a customer. We are a Talend distribution partner.

              it_user158814 - PeerSpot reviewer
              Developer with 51-200 employees
              Vendor
              Has allowed us to organise & deploy our staged ETL transformation processes; toolbox integration could be better.

              What is most valuable?

              Fuzzy matching lookups.
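As a rough illustration of what a fuzzy lookup does, the sketch below matches a misspelled value against a reference list using Python's standard `difflib` module. The review does not describe which algorithm Talend uses, so the similarity measure, the `cutoff` of 0.8, and the helper name `fuzzy_lookup` are all assumptions made for this example.

```python
from difflib import get_close_matches

def fuzzy_lookup(value, reference, cutoff=0.8):
    """Return the closest reference entry to value, or None if nothing is close enough.

    Comparison is case-insensitive; cutoff is a similarity ratio between 0 and 1.
    """
    by_key = {name.lower(): name for name in reference}
    hits = get_close_matches(value.lower(), list(by_key), n=1, cutoff=cutoff)
    return by_key[hits[0]] if hits else None

# Hypothetical reference data for the lookup.
cities = ["Stockholm", "Gothenburg", "Copenhagen"]

print(fuzzy_lookup("Stokholm", cities))
print(fuzzy_lookup("Paris", cities))
```

Raising the cutoff makes the lookup stricter, and lowering it accepts more distant spellings at the cost of more false matches.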

              How has it helped my organization?

              Talend has allowed us to systematically organise, structure, and deploy our staged ETL transformation processes from development into production. We have tracked our data quality efforts during our runs and supplied comprehensive feedback during our development.

              What needs improvement?

              Toolbox/component integration. Performance benchmarks (for optimal memory performance) and a manual covering 64-bit and 32-bit architectures do not exist.

              For how long have I used the solution?

              2-4 years.

              What was my experience with deployment of the solution?

              No.

              What do I think about the stability of the solution?

              Sometimes when working with larger datasets (possibly due to insufficient memory).

              What do I think about the scalability of the solution?

              No.

              How are customer service and technical support?

              Customer Service:

              Excellent

              Technical Support:

              Excellent

              Which solution did I use previously and why did I switch?

              Yes, I have. I found Talend less fussy with different data and debugging tools. It is a superior solution once you are acquainted with it.

              How was the initial setup?

              Straightforward.

              What about the implementation team?

              In-house.

              What was our ROI?

              100%.

              What's my experience with pricing, setup cost, and licensing?

              No setup costs or usage costs. We use Talend Open Studio.

              Which other solutions did I evaluate?

              Yes, SSIS and Pentaho.

              What other advice do I have?

              Platform/Technology specific decisions need to be made upfront before considering this solution.

              Disclosure: My company does not have a business relationship with this vendor other than being a customer.

                PeerSpot user
                Information Architect at a healthcare company
                Vendor
                Good and easy debugging functions while better tools for geo-data are needed.

                Valuable Features

                Maybe the best thing is how easy the product is to start with when you are familiar with Java. Job creation is also fast compared to some other tools. Another good thing is that table metadata is easy to bring into the tool and use. The last thing to mention here is the flexibility to use Java code inside the job.

                Improvements to My Organization

                Fast job creation from start to finish, which improves ROI, and good, easy debugging functions.

                Room for Improvement

                First, we faced problems with the stability of the products. Some components were also clearly not tested well, which meant that there were bugs. Better tools for geo-data are needed. Documentation was poor in the beginning, but it got better over time.

                Use of Solution

                Talend Enterprise Data Integration 5.1 (1) and Talend Platform for Data Services (2)

                Two years at one customer (without Data Quality (1)), six months at another customer (with Data Quality (2)).

                Deployment Issues

                At the customer, deployment from the test environment to the production environment was a bit exhausting. This could be because they didn't use or know the best practices.

                Stability Issues

                Yes, we had issues. Quite often the server needed rebooting, as if there were memory leaks. Sometimes the CVS version management got stuck.

                Scalability Issues

                No issues. Only issues were with the Java memory which is scalable and changeable from the job settings.

                Customer Service and Technical Support

                Customer Service:

                Customer service was good most of the time. Answers came in a timely fashion.

                Technical Support:

                It was good most of the time. Answers came in a timely fashion.

                Initial Setup

                It was pretty straightforward. The memory settings on the client needed some modification at first. I cannot speak to the server side.

                Implementation Team

                In-house team.

                Other Solutions Considered

                Yes. We evaluated IBM DataStage.

                Disclosure: My company does not have a business relationship with this vendor other than being a customer.
