Comparison Buyer's Guide

Executive Summary
Updated on Mar 6, 2024
 

Categories and Ranking

Databricks
Ranking in Data Science Platforms: 1st
Average Rating: 8.2
Number of Reviews: 78
Ranking in other categories: Streaming Analytics (2nd)

Dremio
Ranking in Data Science Platforms: 10th
Average Rating: 8.6
Number of Reviews: 6
Ranking in other categories: Cloud Data Warehouse (11th)
 

Mindshare comparison

As of June 2024, in the Data Science Platforms category, Databricks holds a 20.3% mindshare, up from 19.4% a year earlier. Dremio holds a 5.8% mindshare, up from 2.3% a year earlier. Mindshare is calculated from PeerSpot user engagement data.
Data Science Platforms
Unique Categories:
Streaming Analytics: 17.8%
Cloud Data Warehouse: 2.8%
 

Featured Reviews

DevSmita Asthana - PeerSpot reviewer
Dec 11, 2023
Helps to have a good data presence but needs to incorporate learning aspects
The product has helped in data fabrication.  Databricks has helped us have a good presence in data.  The product should incorporate more learning aspects. It needs to have a free trial version that the team can practice.  I have been using the product for more than six months.  I rate…
AM
Jan 16, 2024
A highly stable solution that works like a data warehouse on top of data lakes
Dremio's interface is good, but it has a few limitations. I cannot do a lot of things with ANSI SQL or basic SQL. I cannot use the recursive common table expression (CTE) in Dremio because the support page says it's currently unsupported. The use case I am working on requires building trees and hierarchical structures. Most of the time, it requires complex nested data structures to be made simpler for end users. It would be good if Dremio could provide a way to create trees just like Oracle does using commands like CONNECT BY and NOCYCLE. You can use a few languages to simplify complicated JSON and XML. It would be very helpful if Dremio could provide a solution to simplify building trees and building meaningful data from complex data.
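
For readers unfamiliar with the tree-building need described in this review, the sketch below illustrates one possible client-side workaround in Python. It is not a Dremio feature and not the reviewer's method. It assumes the parent-child rows have already been fetched with an ordinary, non-recursive query; the function name, the row shape, and the sample data are all hypothetical. It walks the hierarchy from a root and skips any edge that would close a cycle, which is roughly the behavior a recursive CTE or Oracle's CONNECT BY NOCYCLE would provide.

```python
from collections import defaultdict


def walk_hierarchy(edges, root):
    """Yield (node, level, path) for every node reachable from root.

    edges: iterable of (parent_id, child_id) pairs, for example the rows
    of a flat parent-child table (hypothetical input shape).
    """
    children = defaultdict(list)
    for parent, child in edges:
        children[parent].append(child)

    # Iterative depth-first walk; each stack entry carries its path so far.
    stack = [(root, 1, (root,))]
    while stack:
        node, level, path = stack.pop()
        yield node, level, path
        # Push children in reverse so they are visited in input order.
        for child in reversed(children[node]):
            if child in path:
                continue  # skip an edge that would close a cycle
            stack.append((child, level + 1, path + (child,)))


# Hypothetical sample rows; the last edge deliberately creates a cycle.
rows = [("ceo", "cto"), ("ceo", "cfo"), ("cto", "dev"), ("dev", "ceo")]
result = list(walk_hierarchy(rows, "ceo"))
for node, level, path in result:
    print("  " * (level - 1) + node, "/".join(path))
```

Each yielded tuple carries the node's depth and its full path from the root, so the output can be indented for display or written back as a flat table for end users.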

Quotes from Members

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Pros

"The most valuable feature of Databricks is the notebook, data factory, and ease of use."
"A very valuable feature is the data processing, and the solution is specifically good at using the Spark ecosystem."
"The solution's features are fantastic and include interactive clusters that perform at top speed when compared to other solutions."
"The most valuable feature is the Spark cluster which is very fast for heavy loads, big data processing and Pi Spark."
"The capacity of use of the different types of coding is valuable. Databricks also has good performance because it is running in spark extra storage, meaning the performance and the capacity use different kinds of codes."
"The solution is built from Spark and has integration with MLflow, which is important for our use case."
"The initial setup phase of Databricks was good."
"Databricks provides a consistent interface for data engineers to work with data in a consistent language on a single integrated platform for ingesting, processing, and serving data to the end user."
"The most valuable feature of Dremio is it can sit on top of any other data storage, such as Amazon S3, Azure Data Factory, SGFS, or Hive. The memory competition is good. If you are running any kind of materialized view, you'd be running in memory."
"Dremio gives you the ability to create services which do not require additional resources and sterilization."
"Everyone uses Dremio in my company; some use it only for the analytics function."
"Dremio enables you to manage changes more effectively than any other data warehouse platform. There are two things that come into play. One is data lineage. If you are looking at data in Dremio, you may want to know the source and what happened to it along the way or how it may have been transformed in the data pipeline to get to the point where you're consuming it."
"We primarily use Dremio to create a data framework and a data queue."
"Dremio allows querying the files I have on my block storage or object storage."
 

Cons

"The pricing of Databricks could be cheaper."
"This solution only supports queries in SQL and Python, which is a bit limiting."
"Can be improved by including drag-and-drop features."
"I'm not the guy that I'm working with Databricks on a daily basis. I'm on the management team. However, my team tells me there are limitations with streaming events. The connectors work with a small set of platforms. For example, we can work with Kafka, but if we want to move to an event-driven solution from AWS, we cannot do it. We cannot connect to all the streaming analytics platforms, so we are limited in choosing the best one."
"There should be better integration with other platforms."
"There would also be benefits if more options were available for workers, or the clusters of the two points."
"The solution could be improved by integrating it with data packets. Right now, the load tables provide a function, like team collaboration. Still, it's unclear as to if there's a function to create different branches and/or more branches. Our team had used data packets before, however, I feel it's difficult to integrate the current with the previous data packets."
"It would be great if Databricks could integrate all the cloud platforms."
"I cannot use the recursive common table expression (CTE) in Dremio because the support page says it's currently unsupported."
"We've faced a challenge with integrating Dremio and Databricks, specifically regarding authentication. It is not shaking hands very easily."
"Dremio doesn't support the Delta connector. Dremio writes the IT support for Delta, but the support isn't great. There is definitely room for improvement."
"It shows errors sometimes."
"Dremio takes a long time to execute large queries or the executing of correlated queries or nested queries. Additionally, the solution could improve if we could read data from the streaming pipelines or if it allowed us to create the ETL pipeline directly on top of it, similar to Snowflake."
"They have an automated tool for building SQL queries, so you don't need to know SQL. That interface works, but it could be more efficient in terms of the SQL generated from those things. It's going through some growing pains. There is so much value in tools like these for people with no SQL experience. Over time, Dermio will make these capabilities more accessible to users who aren't database people."
 

Pricing and Cost Advice

"The solution requires a subscription."
"The solution is affordable."
"There are different versions."
"Licensing on site I would counsel against, as on-site hardware issues tend to really delay and slow down delivery."
"The licensing costs of Databricks is a tiered licensing regime, so it is flexible."
"Databricks are not costly when compared with other solutions' prices."
"I'm not involved in the financing, but I can say that the solution seemed reasonably priced compared to the competitors. Similar products are usually in the same price range. With five being affordable and one being expensive, I would rate Databricks a four out of five."
"I would rate the tool’s pricing an eight out of ten."
"Dremio is less costly competitively to Snowflake or any other tool."
"Right now the cluster costs approximately $200,000 per month and is based on the volume of data we have."
Use our free recommendation engine to learn which Data Science Platforms solutions are best for your needs.
789,442 professionals have used our research since 2012.
 

Top Industries

By visitors reading reviews

Databricks
Financial Services Firm: 15%
Computer Software Company: 12%
Manufacturing Company: 9%
Healthcare Company: 6%

Dremio
Financial Services Firm: 30%
Computer Software Company: 11%
Manufacturing Company: 8%
Retailer: 4%
 

Company Size

By reviewers
Large Enterprise
Midsize Enterprise
Small Business
No data available
 

Questions from the Community

Which do you prefer - Databricks or Azure Machine Learning Studio?
Databricks gives you the option of working with several different languages, such as SQL, R, Scala, Apache Spark, or Python. It offers many different cluster choices and excellent integration with ...
How would you compare Databricks vs Amazon SageMaker?
We researched AWS SageMaker, but in the end, we chose Databricks. Databricks is a Unified Analytics Platform designed to accelerate innovation projects. It is based on Spark so it is very fast. It...
Which would you choose - Databricks or Azure Stream Analytics?
Databricks is an easy-to-set-up and versatile tool for data management, analysis, and business analytics. For analytics teams that have to interpret data to further the business goals of their orga...
What do you like most about Dremio?
Dremio allows querying the files I have on my block storage or object storage.
What is your experience regarding pricing and costs for Dremio?
Every tool has a value based on its visualization, and the pricing is worth its value.
What needs improvement with Dremio?
Dremio's interface is good, but it has a few limitations. I cannot do a lot of things with ANSI SQL or basic SQL. I cannot use the recursive common table expression (CTE) in Dremio because the supp...
 

Also Known As

Databricks: Databricks Unified Analytics, Databricks Unified Analytics Platform, Redash
Dremio: No data available
 

Sample Customers

Databricks: Elsevier, MyFitnessPal, Sharethrough, Automatic Labs, Celtra, Radius Intelligence, Yesware
Dremio: UBS, TransUnion, Quantium, Daimler, OVH
Find out what your peers are saying about Databricks vs. Dremio and other solutions. Updated: May 2024.