Data Scientist at a tech vendor with 51-200 employees
Real User
Top 20
Mar 24, 2026
My main use case for Dataiku involves ETL pipelines, mainly for data analysis, and I mostly use SQL queries for that. I had to create the output by combining a few datasets and then running SQL queries, applying filters, joining the tables, and so on, so I used Dataiku for that. I primarily use it for analysis only, and Dataiku's visual recipes and SQL queries are enough for that. The only challenge so far is that Dataiku sometimes gets slow and lags a lot.
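The combine-filter-join work described above has a common SQL shape. Here is a minimal sketch using Python's sqlite3 as a stand-in backend; the `customers` and `orders` tables and all column names are hypothetical, not from the review:

```python
import sqlite3

# In-memory database standing in for the datasets combined in Dataiku.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'EMEA'), (2, 'APAC');
INSERT INTO orders VALUES (10, 1, 120.0), (11, 1, 80.0), (12, 2, 200.0);
""")

# Join the tables, apply a filter, and aggregate -- the same shape of query
# a SQL recipe (or the equivalent visual join/filter recipes) would run.
rows = conn.execute("""
    SELECT c.region, SUM(o.amount) AS total
    FROM orders o
    JOIN customers c ON c.id = o.customer_id
    WHERE o.amount > 100
    GROUP BY c.region
    ORDER BY c.region
""").fetchall()
print(rows)  # [('APAC', 200.0), ('EMEA', 120.0)]
```

In Dataiku the same query would typically run as a SQL recipe pushed down to the connected database rather than in-process.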
My main use case for Dataiku is mostly based on the client's data, where we look into life sciences data, mostly aligned to claims, campaign measurement, campaign targeting, and sources such as IQVIA, LAD Epsilon, and Komodo. Apart from this, I work on creating an end-to-end pipeline as a bundled unit project. We primarily work on Next Best Engagement and Next Best Action, largely on the healthcare side, while sometimes working on the consumer front and on the professionals front, meaning healthcare professionals (HCPs). A specific project I built in Dataiku was on HCP campaign measurement. Our day-to-day cycle involves ingesting data from the client's S3 storage. We fetch the data, use visual recipes to bring it onto Dataiku DSS, make preliminary changes, preprocess the data, do some data preparation, and perform feature engineering to produce the final model-ready dataset. We create multiple iterations of the model, where Dataiku is a great help: it lets us try multiple modeling iterations with different hyperparameters, saving a lot of time and providing a visual overview for everyone to understand how the model is performing. Once modeling is done, we push the data downstream through an API or use MLOps for productionization, either via a CI/CD pipeline or simple scenario triggers, such as sending an email once a job finishes. That sums up our day-to-day activity.
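The modeling-iteration step above amounts to sweeping a hyperparameter grid and keeping the best result. A minimal stdlib sketch, with a hypothetical grid and a placeholder scoring function (in Dataiku's visual ML lab this loop is handled for you):

```python
from itertools import product

# Hypothetical hyperparameter grid -- stand-ins for the settings tried
# across modeling iterations.
grid = {
    "max_depth": [3, 5, 7],
    "learning_rate": [0.05, 0.1],
}

def evaluate(params):
    # Placeholder metric; a real run would train a model on the prepared,
    # feature-engineered dataset and return a validation score.
    return 1.0 / (params["max_depth"] * params["learning_rate"])

results = []
for values in product(*grid.values()):
    params = dict(zip(grid.keys(), values))
    results.append((evaluate(params), params))

# Pick the iteration with the best score.
best_score, best_params = max(results, key=lambda r: r[0])
print(best_params)  # {'max_depth': 3, 'learning_rate': 0.05}
```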
My main use of Dataiku is data preparation. I use Dataiku a lot for preparing my data, in particular processing and transforming my datasets by creating recipes, and what I really value is being able to reuse recipes already created for one dataset on another. I was preparing for my Core Dataiku certificate, and all of the modules were focused on data preparation. I load the data into Dataiku, then use the recipes and tools to add, delete, unpivot, and transpose columns so I can format them. Then I create groups of recipes, and I reuse them by importing another dataset into Dataiku, which saves time: I don't have to redo all the previous processing since I already have a data-preparation recipe I can reuse. After data preparation, I had the opportunity to carry out an end-to-end project with Dataiku: first the data preparation, then setting up a model to predict a stock index, using the machine learning models. Suppose I have datasets that I use every time, and each time I check the formats of a certain number of columns, for example dates, to see whether they are in date format, delete certain columns, rename certain columns, transform the data, and clean it. If I've done all these steps once and put all these recipes into a group, next time it's an enormous time saver not to repeat the steps one by one but to apply the recipe or group of recipes directly. I want to emphasize the recipes and how we can reuse them with Dataiku. In most data projects, data preparation takes a huge amount of professionals' time, and we sometimes unfortunately repeat the same tasks. Dataiku really brings a solution to this: we can create recipes or groups of recipes that we can reuse, which is already a significant advantage.
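A reusable "recipe group" of the kind described (check date format, rename a column, drop another) can be sketched as one function applied to any dataset with the same layout. All column names here are hypothetical:

```python
from datetime import datetime

def prep_recipe(rows):
    """A reusable preparation 'recipe': parse dates, rename, drop columns.

    Column names are illustrative; the point is that the same steps can be
    re-applied unchanged to the next dataset with the same layout.
    """
    out = []
    for row in rows:
        row = dict(row)
        # Step 1: ensure the date column is in ISO date format.
        row["date"] = datetime.strptime(row["date"], "%d/%m/%Y").date().isoformat()
        # Step 2: rename a column.
        row["revenue"] = row.pop("rev")
        # Step 3: delete an unneeded column.
        row.pop("comment", None)
        out.append(row)
    return out

january = [{"date": "31/01/2024", "rev": 100, "comment": "ok"}]
print(prep_recipe(january))  # [{'date': '2024-01-31', 'revenue': 100}]
```

The next month's dataset goes through `prep_recipe` unchanged, which is the time saving the review highlights.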
Additionally, Dataiku is one of the rare platforms offering end-to-end services, so we can carry out a data project from start to finish. For example, for my personal project on predicting a stock index, I did practically everything on Dataiku, from data preparation to putting my model into production. Another example: we have an Excel file from an outside source that grows every month, and the format of the file is always exactly the same. We built the transformation only once, on the first file; when the following files come in, we apply the existing recipes. This saved us an enormous amount of time because we were able to automate everything with Dataiku: as soon as a file comes in, the recipes are automatically applied to it. We no longer have to intervene at the end of each month and spend twenty to thirty minutes cleaning the files, and it also ensures the validation of our data and consistency between the files.
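The "apply the recipes automatically when a file lands" pattern can be sketched as a small poll-a-folder loop; in Dataiku this is a scenario triggered on dataset change, so the folder name and processing hook below are purely illustrative:

```python
from pathlib import Path

def process_new_files(incoming: Path, seen: set):
    """Poll a drop folder and apply the saved recipe to any new file.

    A stand-in for a Dataiku scenario triggered when the monthly file
    arrives; `incoming` and the recipe call are hypothetical.
    """
    processed = []
    for path in sorted(incoming.glob("*.csv")):
        if path.name in seen:
            continue  # already handled in an earlier run
        seen.add(path.name)
        # Here the saved preparation recipe would run on `path`.
        processed.append(path.name)
    return processed
```

Each run only touches files it has not seen before, so the monthly file is cleaned once, automatically, with no manual intervention.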
My main use case for Dataiku is for data science and AI projects. I use Dataiku for a demand forecasting use case where the objective is to predict the demand for each product for the next four months. Demand forecasting is the primary focus where I use Dataiku.
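A four-month-ahead demand forecast can be illustrated with a naive moving-average baseline; this is a hedged sketch, not the reviewer's actual model, and the numbers are invented:

```python
def forecast_next_months(history, horizon=4, window=3):
    """Naive recursive moving-average forecast: predict each future month
    as the mean of the last `window` observed or predicted values."""
    values = list(history)
    for _ in range(horizon):
        values.append(sum(values[-window:]) / window)
    return values[-horizon:]

# Monthly demand history for one hypothetical product.
print(forecast_next_months([120, 130, 125], horizon=4))
```

In practice a baseline like this is what a trained time-series model in Dataiku would be benchmarked against.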
Head of Delivery & Practice (Data & AI) at a recruiting/HR firm with 201-500 employees
Real User
Top 10
Dec 11, 2025
We are a consulting firm serving BFSI customers, and we use Dataiku for FSI value-chain use cases, based on the problem statement each customer comes up with.
My main use cases in Dataiku include ensuring strong data pipeline ingestion. We have people from data management, so we need to take care of the pipelines, their data quality, data drift, and so on. We handle this with rule-based alert systems we have created in Dataiku.
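Rule-based quality and drift checks of the kind mentioned can be sketched as simple threshold rules; the rule names, thresholds, and data are hypothetical stand-ins for Dataiku data-quality rules and scenario alerts:

```python
from statistics import mean

def run_checks(batch, baseline_mean, null_rate_max=0.05, drift_tol=0.2):
    """Rule-based checks: flag excess nulls and a drifted mean.

    Thresholds and rule names are illustrative, standing in for the
    rule-based alert system described in the review.
    """
    alerts = []
    nulls = sum(1 for v in batch if v is None)
    if nulls / len(batch) > null_rate_max:
        alerts.append("null-rate above threshold")
    observed = [v for v in batch if v is not None]
    if observed and abs(mean(observed) - baseline_mean) / baseline_mean > drift_tol:
        alerts.append("mean drifted from baseline")
    return alerts

print(run_checks([10, 11, None, 30, 31], baseline_mean=12.0))
# ['null-rate above threshold', 'mean drifted from baseline']
```

In Dataiku, a scenario would evaluate rules like these after each pipeline run and send the alert (for example, by email) when any rule fires.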
My primary use case for Dataiku is for data science, Gen AI, and data science applications. Our AGN team also uses it for various purposes.
Big Data Consultant at a computer software company with 1,001-5,000 employees
Reseller
Top 10
Feb 27, 2025
My company sells licenses for both Dataiku and Alteryx, and we have clients who use them. I engage with several companies in telecommunications, retail, and energy to assess how our clients are utilizing these platforms.
My current client has Dataiku. We do sentiment analysis and some work with small large language models right now. We use Dataiku as a Jupyter Notebook environment. We use it a lot for marketing and analytics; the marketing and sales teams use Dataiku.
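For flavor, here is a minimal lexicon-based sentiment scorer; it is an illustrative sketch far simpler than the LLM-based analysis the team runs in Dataiku notebooks, and the word lists are invented:

```python
# Tiny hypothetical sentiment lexicons.
POSITIVE = {"great", "love", "good", "excellent"}
NEGATIVE = {"bad", "poor", "hate", "slow"}

def sentiment(text: str) -> str:
    """Score text by counting positive vs. negative lexicon hits."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

print(sentiment("The campaign results were excellent"))  # positive
```

A notebook-based workflow would swap this scorer for an LLM call while keeping the same apply-to-each-row structure.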
The use case is data science, and we've deployed Data Science Studio in multiple regions for four environments: dev, preset, pre-production, and production.
Business Intelligence Developer/ Data Scientist at a tech services company with 11-50 employees
Real User
Dec 4, 2019
I just use the product for data migration. Once a month, I push data from one Redshift data warehouse to another Redshift data warehouse. I do that by using a simple SQL query.
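The monthly warehouse-to-warehouse push is, in spirit, one `INSERT ... SELECT`. A sketch using two sqlite3 in-memory databases standing in for the source and target Redshift clusters; table and column names are hypothetical:

```python
import sqlite3

# Two in-memory databases standing in for source and target warehouses.
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE events (id INTEGER, payload TEXT)")
src.executemany("INSERT INTO events VALUES (?, ?)", [(1, "a"), (2, "b")])

dst = sqlite3.connect(":memory:")
dst.execute("CREATE TABLE events (id INTEGER, payload TEXT)")

# Within one warehouse this would be a single INSERT ... SELECT; across
# two connections we copy the selected rows explicitly.
rows = src.execute("SELECT id, payload FROM events").fetchall()
dst.executemany("INSERT INTO events VALUES (?, ?)", rows)

count = dst.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(count)  # 2
```

In Dataiku, a sync or SQL recipe between two Redshift connections handles this copy, and a monthly scenario triggers it.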
My usual use cases for Dataiku are mostly data science use cases, or model creation and training.
I use Dataiku since I am preparing cohorts for health investment research.
We use the solution for data science and machine learning.
Our primary uses for this solution are data preparation and data modeling. We have a testing environment, a production environment, and two API nodes.
I use this solution to make predictions from time-series data. I am a consultant and operate this solution for my clients.