

Databricks and Dataiku compete in the data analytics and machine learning platforms category. Databricks seems to have the upper hand due to its scalability and strong integration with big data tools.
Features: Databricks offers seamless integration with big data tools, built-in optimization, and advanced machine learning capabilities. It provides powerful analytics and data processing abilities that are favored by technical users. Dataiku features a visually intuitive workflow, allowing non-technical stakeholders to navigate the platform easily. It emphasizes streamlined data science workflows and integrates with existing data sources to enhance usability for business users.
Room for Improvement: Databricks could enhance its visualization capabilities and improve its user interface for non-technical users. There is a need for better integration with popular visualization tools like Power BI and Tableau. Dataiku needs to expand its no-code capabilities to compete with platforms like DataRobot and enhance its collaboration features. Better integration with platforms like GitHub would help to address its learning curve for non-IT users.
Ease of Deployment and Customer Service: Databricks offers flexible deployment, predominantly on public and hybrid clouds, catering to various business needs. Its customer service is generally satisfactory, though response times can vary. Dataiku is primarily deployed on-premises and private clouds, with comprehensive documentation reducing the need for direct support. Both platforms provide satisfactory customer assistance, although the deployment options differ.
Pricing and ROI: Databricks adopts a pay-as-you-go model, which can become costly without careful management, but delivers good ROI for large operations. Dataiku, while priced higher, provides substantial value for larger companies with robust data capabilities, though its pricing structure, which lacks consumption-based options, may deter smaller businesses.
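To make the pricing trade-off concrete, here is a minimal sketch of how a consumption-based bill grows with cluster uptime and where it would cross a flat monthly fee. Every figure and name in it (`UsagePlan`, the units per hour, the unit price, the flat fee) is a hypothetical placeholder chosen for illustration, not a rate published by Databricks or Dataiku.

```python
from dataclasses import dataclass


@dataclass
class UsagePlan:
    """Hypothetical monthly workload; no figure here comes from either vendor."""
    cluster_hours: float    # hours the cluster runs per month
    units_per_hour: float   # billing units consumed per cluster hour (assumed)
    price_per_unit: float   # price per billing unit (assumed)


def pay_as_you_go_cost(plan: UsagePlan) -> float:
    """Monthly cost under consumption pricing: hours * units/hour * price/unit."""
    return plan.cluster_hours * plan.units_per_hour * plan.price_per_unit


def break_even_hours(flat_monthly_fee: float, units_per_hour: float, price_per_unit: float) -> float:
    """Cluster hours per month at which consumption pricing equals a flat fee."""
    return flat_monthly_fee / (units_per_hour * price_per_unit)


# Illustrative numbers only.
managed = UsagePlan(cluster_hours=200, units_per_hour=4.0, price_per_unit=0.5)
unmanaged = UsagePlan(cluster_hours=720, units_per_hour=4.0, price_per_unit=0.5)  # cluster left running all month

print(pay_as_you_go_cost(managed))         # 400.0
print(pay_as_you_go_cost(unmanaged))       # 1440.0
print(break_even_hours(1000.0, 4.0, 0.5))  # 500.0
```

Under these assumed numbers, a cluster that is shut down when idle stays well below the flat fee, while one left running all month exceeds it, which is the sense in which pay-as-you-go pricing rewards careful management.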
For a lot of different tasks, including machine learning, it is a nice solution.
When it comes to big data processing, I prefer Databricks over other solutions.
The market is competitive, and Dataiku must adopt a consumption-based model instead of the current monthly model.
I consider the return on investment with Dataiku valuable because it gives us a single platform where all our data scientists come together to build models. It combines collaboration, everything organized in one place, proper project management, and built-in capabilities that facilitate model building.
In terms of ROI, the use of Dataiku simplifies the architecture of customers, which helps them to decommission some of their existing tools.
Whenever we reach out, they respond promptly.
As of now, we are raising issues and they are providing solutions without any problems.
I rate the technical support as fine because they have several levels of technical support available, especially for partners, who get really good support from Databricks on new features.
Dataiku partners with local industry experts who understand the business better and provide support.
The support team does not provide adequate assistance.
They should not take the complaints so lightly.
The patches have sometimes caused issues leading to our jobs being paused for about six hours.
Databricks is an easily scalable platform.
I would rate the scalability of this solution as very high, about nine out of ten.
Dataiku is quite scalable; as long as I can pay for more licenses, there is no technical limitation.
They release patches that sometimes break our code.
Although it is too early to definitively state the platform's stability, we have not encountered any issues so far.
Databricks is definitely a very stable product and reliable.
In terms of stability, it is quite stable as long as the raw data contains no outliers.
As for stability and reliability, so far so good; after the installation, I really had no problems.
Adjusting features like worker nodes and node utilization during cluster creation could mitigate these failures.
We prefer using a small to mid-sized cluster for many jobs to keep costs low, but this sometimes doesn't support our operations properly.
We use MLflow for managing MLOps; however, further improvement would be beneficial, especially for large language models and related tools.
Someone who needs to do coding can do it, and someone who does not know coding can also build solutions.
The license is very expensive.
I would love for Dataiku to allow more flexibility with code-based components and provide the possibility to extend it by developing and integrating custom components easily with existing ones.
It is not a cheap solution.
There are no extra expenses beyond the existing licensing cost.
I find the pricing of Dataiku quite affordable for our customers, as they are usually large companies.
The pricing for Dataiku is very high, which is its biggest downside.
Databricks' capability to process data in parallel enhances data processing speed.
The platform allows us to leverage cloud advantages effectively, enhancing our AI and ML projects.
The Unity Catalog is for data governance, and the Delta Lake is to build the lakehouse.
This feature is useful because it simplifies tasks and eliminates the need for a data scientist.
Dataiku primarily enhances the speed at which our customers can develop or train their machine learning models because it is a drag-and-drop platform.
It offers most of the capabilities required for data science, MLOps, and LLMOps.
| Product | Market Share |
|---|---|
| Databricks | 9.6% |
| Dataiku | 8.0% |
| Other | 82.4% |


| Company Size | Count |
|---|---|
| Small Business | 25 |
| Midsize Enterprise | 12 |
| Large Enterprise | 56 |

| Company Size | Count |
|---|---|
| Small Business | 4 |
| Midsize Enterprise | 2 |
| Large Enterprise | 10 |
Databricks offers a scalable, versatile platform that integrates seamlessly with Spark and multiple languages, supporting data engineering, machine learning, and analytics in a unified environment.
Databricks stands out for its scalability, ease of use, and powerful integration with Spark, multiple languages, and leading cloud services like Azure and AWS. It provides tools such as the Notebook for collaboration, Delta Lake for efficient data management, and Unity Catalog for data governance. While enhancing data engineering and machine learning workflows, it faces challenges in visualization and third-party integration, with pricing and user interface navigation being common concerns. Despite needing improvements in connectivity and documentation, it remains popular for tasks like real-time processing and data pipeline management.
In the tech industry, Databricks empowers teams to perform comprehensive data analytics, enabling them to conduct extensive ETL operations, run predictive modeling, and prepare data for SparkML. In retail, it supports real-time data processing and batch streaming, aiding in better decision-making. Enterprises across sectors leverage its capabilities for creating secure APIs and managing data lakes effectively.
Dataiku Data Science Studio is acclaimed for its versatile capabilities in advanced analytics, data preparation, machine learning, and visualization. It streamlines complex data tasks with an intuitive visual interface, supports multiple languages such as Python, R, and SQL, and scales efficiently to large datasets, boosting organizational efficiency and collaboration.