Data Engineering Manager at a pharma/biotech company with 10,001+ employees
A great and easy-to-use platform for data engineers and data scientists who rely on a large dataset to do advanced analytics reporting
Pros and Cons
- "The most valuable feature is the Spark cluster which is very fast for heavy loads, big data processing and PySpark."
- "It would be great if Databricks could integrate all the cloud platforms."
What is our primary use case?
We use Databricks for data science work in projects that involve data pipelines, pre-processing, data wrangling, big data cluster management, and machine learning and deep learning tasks.
How has it helped my organization?
Databricks works very well with the Azure platform and with Dataiku, an enterprise AI tool. For us, Databricks is a new connection for pulling data or connecting to the Spark cluster. It helps us spin up instances and distribute the load across the Spark cluster, and it is very user-friendly.
What is most valuable?
The most valuable feature is the Spark cluster, which is very fast for heavy loads, big data processing, and PySpark.
What needs improvement?
Databricks is well integrated with Azure, but there are some restrictions on Google Cloud. I'm not sure about AWS, but it would be great if Databricks could integrate with all the cloud platforms. Regarding additional features, we would like to see them mostly on the data engineering side, where we have a Spark cluster and some built-in ML. In addition, more pre-processing steps would be useful.
Buyer's Guide
Databricks
July 2025

Learn what your peers think about Databricks. Get advice and tips from experienced pros sharing their opinions. Updated: July 2025.
864,155 professionals have used our research since 2012.
For how long have I used the solution?
We have been using this solution for two years and are using the latest update.
What do I think about the stability of the solution?
It is a stable solution as long as the Microsoft Azure Platform is stable too.
What do I think about the scalability of the solution?
It is a scalable solution, both vertically and horizontally, which is good. My organization is big, and we have a lot of users. In my department, we have about 15 people using Databricks.
How are customer service and support?
We have not escalated any issues to technical support. We initially struggled with the configuration and settings of the Hive metastore, but we resolved it. I rate the technical support a nine out of ten.
How would you rate customer service and support?
Positive
Which solution did I use previously and why did I switch?
We were using EMR (Elastic MapReduce) from AWS before using Databricks. We switched to Databricks because the whole platform moved from AWS to Azure, and Databricks comes as part of the package.
How was the initial setup?
The initial setup was easy to complete and not complex. It may be challenging for a new user at first, but it gets easier over time. The CI/CD pipeline works well with the Microsoft Azure platform because continuous integration and deployment come with the Git integration, which makes CI/CD easier for Databricks. The deployment could be improved in terms of AutoML functionality, since it doesn't have extensive automated machine learning capability.
We don't use Databricks directly because we work on a data science project, which requires AutoML and built-in machine learning capabilities. We found that capabilities such as large language models for NLP and other deep learning models are not that extensive. It is meant for data engineering rather than data science. It would be great if Databricks offered more for data science.
We used a third-party platform, Dataiku, for the deployment, where we connected to Databricks and completed the MLOps work. We required about three people for deployment, and the solution is easy to maintain.
What was our ROI?
We have seen an ROI but cannot isolate it because Databricks comes bundled with the Azure platform.
What's my experience with pricing, setup cost, and licensing?
I do not have details about the pricing.
What other advice do I have?
I rate this solution a nine out of ten. Regarding advice, Databricks is a very good platform, popular and easy to use daily for data engineers and data scientists who rely on a large dataset to do advanced analytics reporting. It's a very good tool.
Disclosure: My company does not have a business relationship with this vendor other than being a customer.

CEO - Founder / Principal Data Scientist / Principal AI Architect at Kanayma LLC
Saves time and effort; thousands of applicable use cases
Pros and Cons
- "Databricks has improved my organization by allowing us to transform data from sources to a different format and feed that to the analytics, business intelligence, and reporting teams. This tool makes it easy to do those kinds of things."
- "In the next release, I would like to see more optimization features."
What is our primary use case?
Databricks is very useful and can handle thousands of different use cases. The use cases are all over the place.
How has it helped my organization?
Databricks has improved my organization by allowing us to transform data from sources to a different format and feed that to the analytics, business intelligence, and reporting teams. This tool makes it easy to do those kinds of things.
What is most valuable?
The most valuable Databricks feature for us is that it does not require us to configure clusters. It automatically configures the clusters to the right size, the right number of clusters, the right number of nodes per cluster, et cetera.
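As a hedged illustration of the automatic sizing described above: a Databricks cluster can be given a worker range instead of a fixed size, and the platform scales within that range. The sketch below only builds such a specification as JSON and does not call any service. The field names follow the public Databricks Clusters API; the function name `autoscaling_cluster_spec`, the cluster name, and the default runtime and node-type values are placeholders chosen for illustration, not this reviewer's configuration.

```python
import json


def autoscaling_cluster_spec(
    name: str,
    min_workers: int,
    max_workers: int,
    spark_version: str = "13.3.x-scala2.12",  # placeholder runtime label
    node_type_id: str = "Standard_DS3_v2",    # placeholder Azure node type
) -> dict:
    """Return a cluster spec that asks for a worker range, not a fixed size."""
    if not 0 <= min_workers <= max_workers:
        raise ValueError("expected 0 <= min_workers <= max_workers")
    return {
        "cluster_name": name,
        "spark_version": spark_version,
        "node_type_id": node_type_id,
        # The platform picks the actual worker count inside this range.
        "autoscale": {"min_workers": min_workers, "max_workers": max_workers},
        # Shut the cluster down after 30 idle minutes (illustrative value).
        "autotermination_minutes": 30,
    }


if __name__ == "__main__":
    print(json.dumps(autoscaling_cluster_spec("reporting-etl", 2, 8), indent=2))
```

Supplying only the lower and upper bounds leaves the node count to the platform, which is one way to read the reviewer's point about not sizing clusters by hand.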
What needs improvement?
The area in which this product can be improved is optimization. In the next release, I would like to see more optimization features.
For how long have I used the solution?
I have been using Databricks for a couple of years.
What was our ROI?
I would say the ROI for this solution is expressed mainly in terms of effort and time.
What's my experience with pricing, setup cost, and licensing?
I would advise new users to train themselves before using Databricks. They should figure out which advantages Databricks has over plain Spark and use them as fully as they can.
What other advice do I have?
I am currently implementing the latest version of Databricks.
The Databricks solution is deployed in the cloud.
I would rate the Databricks solution a nine.
Which deployment model are you using for this solution?
Private Cloud
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Data Engineer Analyst at Metyis
Highly scalable, easy to use, and performs well
Pros and Cons
- "The most valuable features of Databricks are the notebook, Data Factory, and ease of use."
- "When I used the support, I had communication problems because of the language barrier with the agent. The accent was difficult to understand."
What is our primary use case?
I am using Databricks in my company.
What is most valuable?
The most valuable features of Databricks are the notebook, Data Factory, and ease of use.
For how long have I used the solution?
I have been using Databricks for approximately nine months.
What do I think about the stability of the solution?
The performance and stability of Databricks are good. It is quick and I have not had problems.
What do I think about the scalability of the solution?
Databricks is highly scalable.
We have 200 people using the solution in my organization.
How are customer service and support?
When I used the support, I had communication problems because of the language barrier with the agent. The accent was difficult to understand.
Which solution did I use previously and why did I switch?
I have not worked with another solution prior to Databricks.
What's my experience with pricing, setup cost, and licensing?
The price of Databricks is reasonable compared to other solutions.
What other advice do I have?
I rate Databricks an eight out of ten.
Which deployment model are you using for this solution?
Public Cloud
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Vice President at a tech services company with 51-200 employees
Very easy to use and requires minimal coding and customizations
Pros and Cons
- "Easy to use and requires minimal coding and customizations."
- "Doesn't provide a lot of credits or trial options."
What is our primary use case?
Our primary use case of this product is for our customers who are running large systems and looking for an API -- a quick, easy integration with their own system. We use Databricks to create a secure API interface. I'm vice president of data science and we are customers of Databricks.
What is most valuable?
Databricks is quite easy to use and requires less coding and customization than a solution like AWS SageMaker, which I'd previously used on a lot of projects. Databricks enables more people to efficiently build and host their ML code. Another great aspect is that MLflow is already integrated with Databricks, which makes a big difference. It enables us to track and monitor all our different experiments. We have mostly used MLflow and generic notebooks for building machine learning models, as well as PyTorch for some of our medical imaging work. We were able to quickly deploy both without requiring anything extra.
What needs improvement?
I'm struggling a little because I wanted to do some POC solutions. I present a lot of projects in various forums and seminars and there aren't a lot of credits and trial options with Databricks. Even if we want to explore, we're not able to and that's a challenge. The solution is quite expensive.
For how long have I used the solution?
I've been using this solution for a year.
What do I think about the stability of the solution?
It's currently stable although we have not yet tested it with a huge volume of data. We'll focus on the performance and model serving capability in the near future. We're still carrying out performance testing, developing the models and figuring out the infrastructure.
What do I think about the scalability of the solution?
Scalability is quite good because we just used 128 GB of resources. It's quite easy to scale.
How was the initial setup?
It was relatively simple, we didn't face any challenges. Deployment takes around two days.
Which other solutions did I evaluate?
We did a POC in Azure ML Studio, which is quite a good solution, easy to deploy and use. It's almost a no-code platform. We've also found Azure ML Studio to be quite cost-effective.
What other advice do I have?
I would recommend trying Databricks because it's cloud agnostic. A lot of customers currently use Azure but want to build something on their own down the track. Databricks makes that easy with its integration with other cloud providers. If somebody wants to build something on their own infrastructure or their own virtual cloud, this is a good platform.
I rate the solution eight out of 10 because of the issue I'm having with a lack of trial options.
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Lead Architect at Birlasoft India Ltd.
Data analytics platform that supports large volumes of data and related activities
Pros and Cons
- "This solution offers a lakehouse data concept that we have found exciting. We are able to have a large amount of data in a data lake and can manage all relational activities."
- "The connectivity with various BI tools could be improved, specifically the performance and real time integration."
What is most valuable?
This solution offers a lakehouse data concept that we have found exciting. We are able to have a large amount of data in a data lake and can manage all relational activities. All ACID compliance properties are available, and this is very useful to ensure the quality of all data.
What needs improvement?
The connectivity with various BI tools could be improved, specifically the performance and real-time integration. Some improvement is also required in the semantic layers used to manage the data marts, as well as in the data warehouse features.
In a future release, we would like to have features to better manage all ML development activities.
For how long have I used the solution?
I have been using this solution for three years.
What do I think about the stability of the solution?
This is a stable solution, especially compared to other technology on the market.
What do I think about the scalability of the solution?
It is a scalable solution but this depends on the platform that is being used. If you use a cloud platform such as Azure, it offers scalability. However, some platforms will not support scalability using Databricks.
We have around 20 users in our development team using Databricks.
How are customer service and support?
The customer service and support for this solution is good.
How would you rate customer service and support?
Positive
How was the initial setup?
The initial setup is pretty simple and requires minimal configuration compared to other technology.
What's my experience with pricing, setup cost, and licensing?
I would rate the pricing for this solution a four out of five. This does depend on the environment or the infrastructure that one is using. There is a difference in pricing between using Azure or being on-premises.
Which other solutions did I evaluate?
Azure Synapse is a competitor that we evaluated, but it is not mature enough to provide better performance than Databricks. We chose Databricks due to the ability to have a lot of data in data lakes and the data warehouse. We are also able to run data science activities using MLflow.
What other advice do I have?
If you are looking for custom model development and a lot of data management in a cloud agnostic manner, then Databricks is a good solution.
I would rate this solution an eight out of ten.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Microsoft Azure
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Senior Data Engineer at TCS
Supports multiple languages, plenty of Python libraries, but user-interface could improve
Pros and Cons
- "Databricks is a unified solution that we can use for streaming. It supports open-source languages, which are cloud-agnostic. When I do database coding, if any other tool has a similar language pack to Excel or SQL, I can use the same knowledge, limiting the need to learn new things. It supports a lot of Python libraries, which I can use very easily."
- "The query plan is not easy to work with at the Databricks job level. If I want to tune any of the code, guidance is not easily available in the blogs either."
What is our primary use case?
We use Databricks to receive data from the Data Lake, where we process, transform, and cleanse it. Once it is processed, we send the data to an Azure SQL database.
What is most valuable?
Databricks is a unified solution that we can use for streaming. It supports open-source languages, which are cloud-agnostic. When I do database coding, if any other tool has a similar language pack to Excel or SQL, I can use the same knowledge, limiting the need to learn new things. It supports a lot of Python libraries, which I can use very easily.
What needs improvement?
The query plan is not easy to work with at the Databricks job level. If I want to tune any of the code, guidance is not easily available in the blogs either.
For how long have I used the solution?
I have been using Databricks for approximately three years.
What do I think about the stability of the solution?
Databricks is stable.
What do I think about the scalability of the solution?
The scalability of Databricks is good. However, if I want to use larger clusters or high-concurrency clusters, I need to wait longer for the clusters to spin up.
We have different teams. I'm part of the data analytics team, where almost 10 people use it, but I'm not sure about the rest of the teams.
We are using Databricks extensively.
How was the initial setup?
The initial setup of Databricks is not straightforward. You need to create VLANs, VPNs, and networks. We have two ways of deploying: the legacy PowerShell method and the template method, which we use to deploy the Databricks code to higher levels.
We have not integrated Databricks directly into the DevOps architecture. We download the notebooks manually and upload them.
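The manual download and upload of notebooks described above is the kind of step that can be scripted. As a rough sketch only, and not this team's process, the example below builds (but never sends) HTTP requests against the Databricks Workspace API export and import endpoints using the Python standard library. The helper names `build_export_request` and `build_import_request`, the host, the token, and the notebook paths are hypothetical.

```python
import base64
import json
import urllib.parse
import urllib.request


def build_export_request(host: str, token: str, notebook_path: str) -> urllib.request.Request:
    """Build a GET request that exports a notebook in SOURCE format."""
    query = urllib.parse.urlencode({"path": notebook_path, "format": "SOURCE"})
    return urllib.request.Request(
        f"{host}/api/2.0/workspace/export?{query}",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def build_import_request(
    host: str, token: str, notebook_path: str, source_code: str, language: str = "PYTHON"
) -> urllib.request.Request:
    """Build a POST request that uploads notebook source to another workspace."""
    body = {
        "path": notebook_path,
        "format": "SOURCE",
        "language": language,
        # The import endpoint expects the notebook content base64-encoded.
        "content": base64.b64encode(source_code.encode("utf-8")).decode("ascii"),
        "overwrite": True,
    }
    return urllib.request.Request(
        f"{host}/api/2.0/workspace/import",
        data=json.dumps(body).encode("utf-8"),
        headers={"Authorization": f"Bearer {token}", "Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical workspace URL, token, and path; nothing is sent over the network.
    req = build_export_request("https://dev.example.invalid", "TOKEN", "/Shared/etl/cleanse")
    print(req.get_method(), req.full_url)
```

Passing each request to `urllib.request.urlopen` from a pipeline job would replace the manual step; authentication, error handling, and secret storage are deliberately left out of this sketch.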
What's my experience with pricing, setup cost, and licensing?
The billing of Databricks can be difficult and should improve.
Which other solutions did I evaluate?
We have evaluated Azure Synapse and SQL. Databricks and Azure Synapse are similar; the UI is the only difference. SQL and Databricks are also similar, and one of the largest setbacks is that processing a lot of data takes a long time.
What other advice do I have?
I rate Databricks a seven out of ten.
Which deployment model are you using for this solution?
Public Cloud
Disclosure: My company has a business relationship with this vendor other than being a customer: Partner
Head of Credit Risk and Data at Cegid Invoice and Financing
It's a reasonably priced all-in-one platform that enables us to build a lakehouse framework
Pros and Cons
- "Databricks gives us the ability to build a lakehouse framework and do everything implicit to this type of database structure. We also like the ability to stream events. Databricks covers a broad spectrum, from reporting and machine learning to streaming events. It's important for us to have all these features in one platform."
- "I'm not the one working with Databricks on a daily basis. I'm on the management team. However, my team tells me there are limitations with streaming events. The connectors work with a small set of platforms. For example, we can work with Kafka, but if we want to move to an event-driven solution from AWS, we cannot do it. We cannot connect to all the streaming analytics platforms, so we are limited in choosing the best one."
What is our primary use case?
We primarily use Databricks for reporting and machine learning.
What is most valuable?
Databricks gives us the ability to build a lakehouse framework and do everything implicit to this type of database structure. We also like the ability to stream events. Databricks covers a broad spectrum, from reporting and machine learning to streaming events. It's important for us to have all these features in one platform.
What needs improvement?
I'm not the one working with Databricks on a daily basis. I'm on the management team. However, my team tells me there are limitations with streaming events. The connectors work with a small set of platforms. For example, we can work with Kafka, but if we want to move to an event-driven solution from AWS, we cannot do it. We cannot connect to all the streaming analytics platforms, so we are limited in choosing the best one.
Also, this is an all-in-one platform, but it might be preferable if there were an a la carte model where we could select the best tool in each class for reporting, machine learning, etc. I'm not yet sure if this strategy is the best one.
For how long have I used the solution?
We've been using Databricks since the start of the year.
What do I think about the stability of the solution?
Databricks is quite stable. We haven't had any issues with stability. It's always working perfectly with no downtime.
What do I think about the scalability of the solution?
Databricks is based on Spark, which is based on Scala. These languages aren't easy to handle, and it's challenging to find people who know them well. At the same time, a couple of other vendors offer low-code platforms that work on top of Databricks. We have to work around Databricks' lack of scalability by using these low-code platforms to give us scalability.
How are customer service and support?
I'll give Databricks support 10 out of 10. They are always prompt even though we didn't buy a support package. They have done an excellent job.
How would you rate customer service and support?
Positive
How was the initial setup?
Setting up Databricks is a bit complex, and the initial deployment took a few days, closer to a week. Of course, not everyone is working full-time on this. There are intervals when people are doing other stuff.
What was our ROI?
It's too soon to tell what kind of return we're getting because we just started using it, and we're still migrating.
What's my experience with pricing, setup cost, and licensing?
The cost of Databricks is in the lower range compared to other solutions. That was one of the main reasons we chose Databricks over other vendors and platforms.
We pay as we go, so there isn't a fixed price. It's charged by the unit. I don't have any details about how they measure this, but it should be a mix between processing and quantity of data handled. We ran a simulation based on our use cases, which gave us an estimate. We've been monitoring this, and the costs have met our expectations.
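To make the idea of a per-unit estimate concrete, here is a minimal sketch of the arithmetic behind such a simulation. It assumes usage is metered in Databricks Units (DBUs) per hour, with the cloud provider's virtual machine cost billed separately. Every rate, workload name, and hour count below is a made-up placeholder, not the reviewer's data or a published price.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    dbu_per_hour: float      # DBUs the cluster consumes per hour (hypothetical)
    hours_per_month: float   # expected runtime per month (hypothetical)


def estimate_monthly_cost(
    workloads: list[Workload],
    price_per_dbu: float,
    vm_cost_per_hour: float = 0.0,
) -> float:
    """Sum per-unit platform charges plus optional VM charges across workloads."""
    total = 0.0
    for w in workloads:
        total += w.dbu_per_hour * w.hours_per_month * price_per_dbu  # platform charge
        total += w.hours_per_month * vm_cost_per_hour                # infrastructure charge
    return round(total, 2)


if __name__ == "__main__":
    plan = [
        Workload("nightly_etl", dbu_per_hour=4.0, hours_per_month=60),
        Workload("reporting", dbu_per_hour=2.0, hours_per_month=100),
    ]
    print(estimate_monthly_cost(plan, price_per_dbu=0.5, vm_cost_per_hour=1.0))
```

Comparing an estimate like this against the actual monthly bill is one way to do the kind of monitoring mentioned above.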
What other advice do I have?
I give Databricks nine out of 10. The solution has met all our expectations. I'd recommend it to a friend. It's a reasonably priced all-in-one solution that gives us data lake and lakehouse capabilities. Those were the primary reasons we chose Databricks.
Which deployment model are you using for this solution?
Public Cloud
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Coordenador Financeiro at Icatu
Good technical support, but is difficult to set up and integrate
Pros and Cons
- "The technical support is good."
- "The initial setup is difficult."
What is our primary use case?
I believe we are using the new version.
Our company makes comprehensive use of the solution to consolidate data and do a certain amount of reporting and analytics. All the data consumers use Databricks to develop the information.
What needs improvement?
Data governance should be addressed. We have some trouble connecting all the governance solutions with Databricks. This means the integrative capabilities are problematic.
The initial setup is difficult.
For how long have I used the solution?
We have been using Databricks for a year-and-a-half.
What do I think about the stability of the solution?
The solution is stable.
What do I think about the scalability of the solution?
The solution is scalable.
How are customer service and support?
The technical support is good.
Which solution did I use previously and why did I switch?
As we are talking about a corporate solution, the deployment of Databricks lasted longer than the one day it took for Alteryx.
We used Alteryx prior to Databricks and continue to do so, it being the only other solution we have employed. We use the two with different software.
How was the initial setup?
The initial setup is difficult.
While I don't know exactly how long the deployment took, I do know that it lasted longer than the one day needed for Alteryx.
What about the implementation team?
I believe we used a partner for the deployment, although I cannot say for certain, as this is not within my purview.
I don't know how many people are needed for maintenance and deployment.
What's my experience with pricing, setup cost, and licensing?
As the licensing is not within my purview, I am not in a position to comment on this.
What other advice do I have?
My company makes use of the solution. It is employed by my data team and the technology one. I do not have personal experience using the solution.
The solution is deployed on base, on data.
I am not aware of how many people make use of it.
I rate Databricks as a seven out of ten.
Which deployment model are you using for this solution?
Private Cloud
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
