Data Scientist at a tech vendor with 51-200 employees
Real User
Top 20
Mar 24, 2026
My main use case for Dataiku involves ETL pipelines, mainly for data analysis, and I mostly use SQL queries for that. I had to create the output by combining a few datasets and then running SQL queries, applying filters, joining the tables, and so on, so I used Dataiku for that. I primarily use it for analysis only, and Dataiku's visual recipes and SQL queries are enough for that. The only challenge so far is that Dataiku sometimes gets slow and lags a lot.
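The combine-filter-join work described above has a common SQL shape. Here is a minimal sketch using Python's sqlite3 as a stand-in backend; the `customers` and `orders` tables and all column names are hypothetical, not from the review:

```python
import sqlite3

# In-memory database standing in for the datasets combined in Dataiku.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'EMEA'), (2, 'APAC');
INSERT INTO orders VALUES (10, 1, 120.0), (11, 1, 80.0), (12, 2, 200.0);
""")

# Join the tables, apply a filter, and aggregate -- the same shape of query
# a SQL recipe (or the equivalent visual join/filter recipes) would run.
rows = conn.execute("""
    SELECT c.region, SUM(o.amount) AS total
    FROM orders o
    JOIN customers c ON c.id = o.customer_id
    WHERE o.amount > 100
    GROUP BY c.region
    ORDER BY c.region
""").fetchall()
print(rows)  # [('APAC', 200.0), ('EMEA', 120.0)]
```

In Dataiku the same query would typically run as a SQL recipe pushed down to the connected database rather than in-process.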
My main use case for Dataiku is mostly based on the client's data, where we look into life sciences data, mostly aligned to claims, campaign measurement, campaign targeting, and sources such as IQVIA, LAD Epsilon, and Komodo. Apart from this, I work on creating an end-to-end pipeline as a bundled unit project. We primarily work on Next Best Engagement and Next Best Action, largely on the healthcare side, while sometimes working on the consumer front and on the professionals front, meaning healthcare professionals (HCPs). A specific project I built in Dataiku was on HCP campaign measurement. Our day-to-day cycle involves ingesting data from the client's S3 storage. We fetch the data, use visual recipes to bring it onto Dataiku DSS, make preliminary changes, preprocess the data, do some data preparation, and perform feature engineering to produce the final model-ready dataset. We create multiple iterations of the model, where Dataiku is a great help: it lets us try multiple modeling iterations with different hyperparameters, saving a lot of time and providing a visual overview for everyone to understand how the model is performing. Once modeling is done, we push the data downstream through an API or use MLOps for productionization, either via a CI/CD pipeline or simple scenario triggers, such as sending an email once a job finishes. That sums up our day-to-day activity.
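The modeling-iteration step above amounts to sweeping a hyperparameter grid and keeping the best result. A minimal stdlib sketch, with a hypothetical grid and a placeholder scoring function (in Dataiku's visual ML lab this loop is handled for you):

```python
from itertools import product

# Hypothetical hyperparameter grid -- stand-ins for the settings tried
# across modeling iterations.
grid = {
    "max_depth": [3, 5, 7],
    "learning_rate": [0.05, 0.1],
}

def evaluate(params):
    # Placeholder metric; a real run would train a model on the prepared,
    # feature-engineered dataset and return a validation score.
    return 1.0 / (params["max_depth"] * params["learning_rate"])

results = []
for values in product(*grid.values()):
    params = dict(zip(grid.keys(), values))
    results.append((evaluate(params), params))

# Pick the iteration with the best score.
best_score, best_params = max(results, key=lambda r: r[0])
print(best_params)  # {'max_depth': 3, 'learning_rate': 0.05}
```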
My main use of Dataiku is data preparation. I use Dataiku a lot for preparing my data, in particular processing and transforming my datasets by creating recipes, and what I really value is being able to reuse recipes already created for one dataset on another. I was preparing for my Core Dataiku certificate, and all of the modules were focused on data preparation. I load the data into Dataiku, then use the recipes and tools to add, delete, unpivot, and transpose columns so I can format them. Then I create groups of recipes, and I reuse them by importing another dataset into Dataiku, which saves time: I don't have to redo all the previous processing since I already have a data-preparation recipe I can reuse. After data preparation, I had the opportunity to carry out an end-to-end project with Dataiku: first the data preparation, then setting up a model to predict a stock index, using the machine learning models. Suppose I have datasets that I use every time, and each time I check the formats of a certain number of columns, for example dates, to see whether they are in date format, delete certain columns, rename certain columns, transform the data, and clean it. If I've done all these steps once and put all these recipes into a group, next time it's an enormous time saver not to repeat the steps one by one but to apply the recipe or group of recipes directly. I want to emphasize the recipes and how we can reuse them with Dataiku. In most data projects, data preparation takes a huge amount of professionals' time, and we sometimes unfortunately repeat the same tasks. Dataiku really brings a solution to this: we can create recipes or groups of recipes that we can reuse, which is already a significant advantage.
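A reusable "recipe group" of the kind described (check date format, rename a column, drop another) can be sketched as one function applied to any dataset with the same layout. All column names here are hypothetical:

```python
from datetime import datetime

def prep_recipe(rows):
    """A reusable preparation 'recipe': parse dates, rename, drop columns.

    Column names are illustrative; the point is that the same steps can be
    re-applied unchanged to the next dataset with the same layout.
    """
    out = []
    for row in rows:
        row = dict(row)
        # Step 1: ensure the date column is in ISO date format.
        row["date"] = datetime.strptime(row["date"], "%d/%m/%Y").date().isoformat()
        # Step 2: rename a column.
        row["revenue"] = row.pop("rev")
        # Step 3: delete an unneeded column.
        row.pop("comment", None)
        out.append(row)
    return out

january = [{"date": "31/01/2024", "rev": 100, "comment": "ok"}]
print(prep_recipe(january))  # [{'date': '2024-01-31', 'revenue': 100}]
```

The next month's dataset goes through `prep_recipe` unchanged, which is the time saving the review highlights.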
Additionally, Dataiku is one of the rare platforms offering end-to-end services, so we can carry out a data project from start to finish. For example, for my personal project on predicting a stock index, I did practically everything on Dataiku, from data preparation to putting my model into production. Another example: we have an Excel file from an outside source that grows every month, and the format of the file is always exactly the same. We built the transformation only once, on the first file; when the following files come in, we apply the existing recipes. This saved us an enormous amount of time because we were able to automate everything with Dataiku: as soon as a file comes in, the recipes are automatically applied to it. We no longer have to intervene at the end of each month and spend twenty to thirty minutes cleaning the files, and it also ensures the validation of our data and consistency between the files.
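The "apply the recipes automatically when a file lands" pattern can be sketched as a small poll-a-folder loop; in Dataiku this is a scenario triggered on dataset change, so the folder name and processing hook below are purely illustrative:

```python
from pathlib import Path

def process_new_files(incoming: Path, seen: set):
    """Poll a drop folder and apply the saved recipe to any new file.

    A stand-in for a Dataiku scenario triggered when the monthly file
    arrives; `incoming` and the recipe call are hypothetical.
    """
    processed = []
    for path in sorted(incoming.glob("*.csv")):
        if path.name in seen:
            continue  # already handled in an earlier run
        seen.add(path.name)
        # Here the saved preparation recipe would run on `path`.
        processed.append(path.name)
    return processed
```

Each run only touches files it has not seen before, so the monthly file is cleaned once, automatically, with no manual intervention.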
My main use case for Dataiku is for data science and AI projects. I use Dataiku for a demand forecasting use case where the objective is to predict the demand for each product for the next four months. Demand forecasting is the primary focus where I use Dataiku.
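A four-month-ahead demand forecast can be illustrated with a naive moving-average baseline; this is a hedged sketch, not the reviewer's actual model, and the numbers are invented:

```python
def forecast_next_months(history, horizon=4, window=3):
    """Naive recursive moving-average forecast: predict each future month
    as the mean of the last `window` observed or predicted values."""
    values = list(history)
    for _ in range(horizon):
        values.append(sum(values[-window:]) / window)
    return values[-horizon:]

# Monthly demand history for one hypothetical product.
print(forecast_next_months([120, 130, 125], horizon=4))
```

In practice a baseline like this is what a trained time-series model in Dataiku would be benchmarked against.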
Head of Delivery & Practice (Data & AI) at a recruiting/HR firm with 201-500 employees
Real User
Top 10
Dec 11, 2025
We are a consulting firm serving BFSI customers, and we use Dataiku for FSI value-chain use cases, based on the problem statement each customer comes up with.
My main use cases in Dataiku include ensuring strong data pipeline ingestion. We have people from data management, so we need to take care of the pipelines, their data quality, data drift, and so on. We handle this with rule-based alert systems we have created in Dataiku.
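Rule-based quality and drift checks of the kind mentioned can be sketched as simple threshold rules; the rule names, thresholds, and data are hypothetical stand-ins for Dataiku data-quality rules and scenario alerts:

```python
from statistics import mean

def run_checks(batch, baseline_mean, null_rate_max=0.05, drift_tol=0.2):
    """Rule-based checks: flag excess nulls and a drifted mean.

    Thresholds and rule names are illustrative, standing in for the
    rule-based alert system described in the review.
    """
    alerts = []
    nulls = sum(1 for v in batch if v is None)
    if nulls / len(batch) > null_rate_max:
        alerts.append("null-rate above threshold")
    observed = [v for v in batch if v is not None]
    if observed and abs(mean(observed) - baseline_mean) / baseline_mean > drift_tol:
        alerts.append("mean drifted from baseline")
    return alerts

print(run_checks([10, 11, None, 30, 31], baseline_mean=12.0))
# ['null-rate above threshold', 'mean drifted from baseline']
```

In Dataiku, a scenario would evaluate rules like these after each pipeline run and send the alert (for example, by email) when any rule fires.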
My primary use case for Dataiku is for data science, Gen AI, and data science applications. Our AGN team also uses it for various purposes.
Big Data Consultant at a computer software company with 1,001-5,000 employees
Reseller
Top 10
Feb 27, 2025
My company sells licenses for both Dataiku and Alteryx, and we have clients who use them. I engage with several companies in telecommunications, retail, and energy to assess how our clients are utilizing these platforms.
My current client has Dataiku. We do sentiment analysis and some work with small large language models right now. We use Dataiku as a Jupyter Notebook environment. We use it a lot for marketing and analytics; the marketing and sales teams use Dataiku.
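For flavor, here is a minimal lexicon-based sentiment scorer; it is an illustrative sketch far simpler than the LLM-based analysis the team runs in Dataiku notebooks, and the word lists are invented:

```python
# Tiny hypothetical sentiment lexicons.
POSITIVE = {"great", "love", "good", "excellent"}
NEGATIVE = {"bad", "poor", "hate", "slow"}

def sentiment(text: str) -> str:
    """Score text by counting positive vs. negative lexicon hits."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

print(sentiment("The campaign results were excellent"))  # positive
```

A notebook-based workflow would swap this scorer for an LLM call while keeping the same apply-to-each-row structure.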
The use case is data science, and we've deployed Data Science Studio in multiple regions for four environments: dev, preset, pre-production, and production.
Business Intelligence Developer/ Data Scientist at a tech services company with 11-50 employees
Real User
Dec 4, 2019
I just use the product for data migration. Once a month, I push data from one Redshift data warehouse to another Redshift data warehouse. I do that by using a simple SQL query.
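The monthly warehouse-to-warehouse push is, in spirit, one `INSERT ... SELECT`. A sketch using two sqlite3 in-memory databases standing in for the source and target Redshift clusters; table and column names are hypothetical:

```python
import sqlite3

# Two in-memory databases standing in for source and target warehouses.
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE events (id INTEGER, payload TEXT)")
src.executemany("INSERT INTO events VALUES (?, ?)", [(1, "a"), (2, "b")])

dst = sqlite3.connect(":memory:")
dst.execute("CREATE TABLE events (id INTEGER, payload TEXT)")

# Within one warehouse this would be a single INSERT ... SELECT; across
# two connections we copy the selected rows explicitly.
rows = src.execute("SELECT id, payload FROM events").fetchall()
dst.executemany("INSERT INTO events VALUES (?, ?)", rows)

count = dst.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(count)  # 2
```

In Dataiku, a sync or SQL recipe between two Redshift connections handles this copy, and a monthly scenario triggers it.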
My usual use cases for Dataiku are mostly data science use cases, or model creation and training.
I use Dataiku since I am preparing cohorts for health investment research.
We use the solution for data science and machine learning.
Our primary uses for this solution are data preparation and data modeling. We have a testing environment, a production environment, and two API nodes.
I use this solution to make predictions from time-series data. I am a consultant and operate this solution for my clients.