Comparison Buyer's Guide

Executive Summary
Updated on Mar 6, 2024
 

Categories and Ranking

Databricks
Ranking in Data Science Platforms: 1st
Average Rating: 8.2
Number of Reviews: 78
Ranking in other categories: Streaming Analytics (2nd)

Dataiku
Ranking in Data Science Platforms: 7th
Average Rating: 8.2
Number of Reviews: 7
Ranking in other categories: None
 

Mindshare comparison

As of June 2024, in the Data Science Platforms category, the mindshare of Databricks is 20.3%, up from 19.4% the previous year. The mindshare of Dataiku is 9.9%, up from 6.7% the previous year. Mindshare is calculated from PeerSpot user engagement data.
Unique categories:
Databricks: Streaming Analytics (17.8%)
Dataiku: No other categories found
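
The year-over-year mindshare figures quoted above can be read in two ways: as a change in percentage points, or as growth relative to the previous year's share. The following is a minimal sketch of that arithmetic using only the four numbers given in the text; the function name and dictionary layout are illustrative assumptions, not part of PeerSpot's methodology.

```python
# Mindshare figures quoted in the text (percent of category engagement).
MINDSHARE = {
    "Databricks": {"previous": 19.4, "current": 20.3},
    "Dataiku": {"previous": 6.7, "current": 9.9},
}


def mindshare_change(previous, current):
    """Return (percentage-point change, relative change in percent)."""
    point_change = current - previous
    relative_change = point_change / previous * 100
    # Round to one decimal place, matching the precision of the source figures.
    return round(point_change, 1), round(relative_change, 1)


for product, values in MINDSHARE.items():
    points, relative = mindshare_change(values["previous"], values["current"])
    print(f"{product}: {points:+.1f} points ({relative:+.1f}% relative)")
```

Read this way, Databricks holds the larger share, while Dataiku's share grew faster from a smaller base.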
 

Featured Reviews

DevSmita Asthana - PeerSpot reviewer
Dec 11, 2023
Helps to have a good data presence but needs to incorporate learning aspects
The product has helped in data fabrication. Databricks has helped us have a good presence in data. The product should incorporate more learning aspects. It needs to have a free trial version that the team can practice. I have been using the product for more than six months. I rate…
RK
May 17, 2024
Gives different aspects of modeling approaches and good for multiple teams' collaboration
I used DataRobot. Dataiku has a different kind of structure to it. It's not financially heavy like DataRobot, which caters more to financial companies, like banks. Dataiku doesn't have that yet. I think they are also working on that area. But yeah, there are some key differences between the two products. DataRobot has an additional feature with financial firms that it creates all these financial metrics when you run a time series analysis. Those things I have not seen in Dataiku. If any financial company is choosing between DataRobot and Dataiku, they will definitely go for DataRobot because it creates all these financial metrics. It creates deltas, time series, time difference fields, and things like that. So, that is an added feature that DataRobot has.

Quotes from Members

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Pros

"We like that this solution can handle a wide variety and velocity of data engineering, either in batch mode or real-time."
"The ease of use and its accessibility are valuable."
"One of the features provides nice interactive clusters, or compute instances that you don't really need to manage often."
"The most valuable feature is the ability to use SQL directly with Databricks."
"We are completely satisfied with the ease of connecting to different sources of data or pocket files in the search"
"Databricks is a unified solution that we can use for streaming. It is supporting open source languages, which are cloud-agnostic. When I do database coding if any other tool has a similar language pack to Excel or SQL, I can use the same knowledge, limiting the need to learn new things. It supports a lot of Python libraries where I can use some very easily."
"The most valuable feature is the Spark cluster, which is very fast for heavy loads, big data processing, and PySpark."
"Databricks is a scalable solution. It is the largest advantage of the solution."
"Data Science Studio's data science model is very useful."
"The most valuable feature is the set of visual data preparation tools."
"The most valuable feature of this solution is that it is one tool that can do everything, and you have the ability to very easily push your design to prediction."
"If many teams are collaborating and sharing Jupyter notebooks, it's very useful."
"I like the interface, which is probably my favorite part of the solution. It is really user-friendly for an IT person."
"Extremely easy to use with its GUI-based functionality and large compatibility with various data sources. Also, maintenance processes are much more automated than ever, with fewer errors."
"Cloud-based process run helps in not keeping the systems on while processes are running."
"The solution is quite stable."
 

Cons

"It should have more compatible and more advanced visualization and machine learning libraries."
"There is room for improvement in the documentation of processes and how it works."
"Some of the error messages that we receive are too vague, saying things like 'unknown exception', and these should be improved to make it easier for developers to debug problems."
"The solution could be improved by integrating it with data packets. Right now, the load tables provide a function, like team collaboration. Still, it's unclear as to if there's a function to create different branches and/or more branches. Our team had used data packets before, however, I feel it's difficult to integrate the current with the previous data packets."
"Databricks' technical support takes a while to respond and could be improved."
"If I want to create a Databricks account, I need to have a prior cloud account such as an AWS account or an Azure account. Only then can I create a Databricks account on the cloud. However, if they can make it so that I can still try Databricks even if I don't have a cloud account on AWS and Azure, it would be great. That is, it would be nice if it were possible to create a pseudo account and be provided with a free trial. It is very essential to creating a workforce on Databricks. For example, students or corporate staff can then explore and learn Databricks."
"I would like to see the integration between Databricks and MLflow improved. It is quite hard to train multiple models in parallel in a distributed fashion. You hit rate limits on the clients very fast."
"Databricks would benefit from enhanced metrics and tighter integration with Azure's diagnostics."
"The interface for the web app can be a bit difficult. It needs to have better capabilities, at least for developers who like to code. This is due to the fact that everything is enabled in a single window with different tabs. For them to actually develop and do the concurrent testing that needs to be done, it takes a bit of time. That is one improvement that I would like to see - from a web app developer perspective."
"Server up-time needs to be improved. Also, query engines like Spark and Hive need to be more stable."
"In the next release of this solution, I would like to see deep learning better integrated into the tool and not simply an extension or plugin."
"Dataiku still needs some coding, and that could be a difference where business data scientists would go for DataRobot more than Dataiku."
"Although known for Big Data, the processing time to process 1.8 billion records was terribly slow (five days)."
"I think it would help if Data Science Studio added some more features and improved the data model."
"I find that it is a little slow during use. It takes more time than I would expect for operations to complete."
"There were stability issues: 1) SQL operations, such as partitioning, had bugs and showed wrong results. 2) Due to server downtime, scheduled processes used to fail. 3) Access to project folders was compromised (privacy issue) with wrong people getting access to confidential project folders."
 

Pricing and Cost Advice

"Databricks' cost could be improved."
"Whenever we want to find the actual costing, we have to send an email to Databricks, so having the information available on the internet would be helpful."
"We implement this solution on behalf of our customers who have their own Azure subscription and they pay for Databricks themselves. The pricing is more expensive if you have large volumes of data."
"I would rate the tool’s pricing an eight out of ten."
"We have only incurred the cost of our AWS cloud services. This is because during this period, Databricks provided us with an extended evaluation period, and we have not spent much money yet. We are just starting to incur costs this month, I will know more later on the full cost perspective."
"My smallest project is around a hundred euros, and my most expensive is just under a thousand euros a week. That is based on terabytes of data processed each month."
"The solution uses a pay-per-use model with an annual subscription fee or package. Typically this solution is used on a cloud platform, such as Azure or AWS, but more people are choosing Azure because the price is more reasonable."
"Databricks is a very expensive solution. Pricing is an area that could definitely be improved. They could provide a lower end compute and probably reduce the price."
"Pricing is pretty steep. Dataiku is also not that cheap."
"The annual licensing fees are approximately €20 ($22 USD) per key for the basic version and €40 ($44 USD) per key for the version with everything."
787,779 professionals have used our research since 2012.
 

Top Industries

By visitors reading reviews

Databricks
Financial Services Firm: 15%
Computer Software Company: 12%
Manufacturing Company: 9%
Healthcare Company: 6%

Dataiku
Financial Services Firm: 18%
Educational Organization: 14%
Manufacturing Company: 8%
Computer Software Company: 7%
 

Company Size

[Chart: company size by reviewers; categories: Large Enterprise, Midsize Enterprise, Small Business]
 

Questions from the Community

Which do you prefer - Databricks or Azure Machine Learning Studio?
Databricks gives you the option of working with several different languages, such as SQL, R, Scala, Apache Spark, or Python. It offers many different cluster choices and excellent integration with ...
How would you compare Databricks vs Amazon SageMaker?
We researched AWS SageMaker, but in the end, we chose Databricks. Databricks is a Unified Analytics Platform designed to accelerate innovation projects. It is based on Spark so it is very fast. It...
Which would you choose - Databricks or Azure Stream Analytics?
Databricks is an easy-to-set-up and versatile tool for data management, analysis, and business analytics. For analytics teams that have to interpret data to further the business goals of their orga...
What is your experience regarding pricing and costs for Dataiku Data Science Studio?
Pricing is pretty steep. Dataiku is also not that cheap. It depends on the client and how much they want to spend towards a tool.
What needs improvement with Dataiku Data Science Studio?
The no-code/low-code aspect, where DataRobot doesn't need much coding at all. Dataiku still needs some coding, and that could be a difference where business data scientists would go for DataRobot m...
What is your primary use case for Dataiku Data Science Studio?
My current client has Dataiku. We do sentiment analysis and some small large language models right now. We use Dataiku as a Jupyter Notebook. We use it a lot for marketing and analytics. The market...
 

Also Known As

Databricks: Databricks Unified Analytics, Databricks Unified Analytics Platform, Redash
Dataiku: Dataiku DSS
 

Sample Customers

Databricks: Elsevier, MyFitnessPal, Sharethrough, Automatic Labs, Celtra, Radius Intelligence, Yesware
Dataiku: BGL BNP Paribas, Dentsu Aegis, Link Mobility Group, AramisAuto
Find out what your peers are saying about Databricks vs. Dataiku and other solutions. Updated: May 2024.