We were using Databricks to build an AI solution. We were only evaluating it, and approximately three people tried it out. Later we chose another solution and did not fully deploy Databricks.
Machine Learning Engineer at a mining and metals company with 10,001+ employees
Highly scalable, stable and good technical support
Pros and Cons
- "Databricks is a scalable solution. It is the largest advantage of the solution."
- "Before I used Databricks it took me a long time to do some functions and now with Databricks I can do them much quicker."
- "The interface of Databricks could be easier to use when compared to other solutions. It is not easy for non-data scientists. The user interface is important: before, we had to write code manually, and as solutions move to "no-code AI" it is critical that the interface is very good."
- "The interface of Databricks could be easier to use when compared to other solutions. It is not easy for non-data scientists."
What is our primary use case?
How has it helped my organization?
Before I used Databricks it took me a long time to do some functions and now with Databricks I can do them much quicker. It scales very well.
What needs improvement?
The interface of Databricks could be easier to use when compared to other solutions. It is not easy for non-data scientists. The user interface is important: before, we had to write code manually, and as solutions move to "no-code AI" it is critical that the interface is very good.
For how long have I used the solution?
I have used Databricks within the last 12 months.
Buyer's Guide
Databricks
March 2026
Learn what your peers think about Databricks. Get advice and tips from experienced pros sharing their opinions. Updated: March 2026.
885,286 professionals have used our research since 2012.
What do I think about the stability of the solution?
The solution is stable.
What do I think about the scalability of the solution?
Databricks is a scalable solution. It is the largest advantage of the solution.
How are customer service and support?
We have been in contact with Databricks technical support, and they were good.
Which solution did I use previously and why did I switch?
We have used a lot of different solutions, such as Watson and DataIQ.
How was the initial setup?
The initial setup is easy. However, I do not know much about the implementation because the company does it.
What about the implementation team?
We did the implementation of the solution.
What other advice do I have?
If companies want scalability, they should choose Databricks.
I rate Databricks a nine out of ten.
Which deployment model are you using for this solution?
Public Cloud
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Practice Head, Data & Analytics at a tech vendor with 10,001+ employees
Key feature is ability to make changes in structure or data size and align for subsequent consumption
Pros and Cons
- "Can cut across the entire ecosystem of open source technology to give an extra level of getting the transformatory process of the data."
- "Stability of the product is good, whether it's handling large volumes, diverse elements of data or processing data at speed."
- "Implementation of Databricks is still very code heavy."
- "In my view, the fundamental approach of implementing Databricks is still very code heavy, more than you find in Azure Data Factory and other technologies like Informatica or SQL Server Integration Services."
What is our primary use case?
We have a team that works on Databricks for our clients. We are customers of Databricks.
What is most valuable?
Databricks can cut across the entire ecosystem of open source technology which gives an extra level in terms of getting the transformatory process of the data. The solution is primarily open source and they have bolstered its components to make it more fit for purpose for a complete Azure Data platform. The solution is responsible for the core transformatory activities. While Azure Data Factory is very good for pulling in the data, doing the basic standardization and profiling, Databricks is more about making fundamental changes in structure or in size of the data and aligning it for subsequent consumption, or for the final layer on Synapse. It also has the power to complement and work with Spark and elements related to Python.
What needs improvement?
In my view, the fundamental approach of implementing Databricks is still very code heavy, more than you find in Azure Data Factory and other technologies like Informatica or SQL Server Integration Services. From my perspective, that could be improved. I'd also like to have the ability to facilitate predictive analytics within the solution.
For how long have I used the solution?
I've been using the solution for a year and a half.
What do I think about the stability of the solution?
Stability of the product is good, whether it's handling large volumes, diverse elements of data or processing data at speed. It has stood the test of time. It's a solution that really lends itself to that higher level of stability, versatility and diversity in terms of its capability to process different forms of data.
What's my experience with pricing, setup cost, and licensing?
The cost of the solution is slightly on the high side so it's important to use it efficiently.
What other advice do I have?
Use the solution wisely and in tandem with Azure Data Factory. Apply that perspective in the overall design of your pipelines and flows so that the solution is used to its potential. Databricks adds significant transformation and data tranching capability, across a diverse variety of data, to the Azure data stack. In terms of the license, ensure that the customer is getting what they paid for so that the value for money is realized.
I rate the solution eight out of 10.
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Associate Manager at a consultancy with 501-1,000 employees
Efficient, high data volume processing, and easy to use
Pros and Cons
- "The main features of the solution are efficiency."
- "By going through this solution, we were able to complete the processing of the data in half an hour."
- "There should be better integration with other platforms."
What is our primary use case?
We use this solution to process data, for example, data validation.
What is most valuable?
The main features of the solution are efficiency.
We were trying to process 300 million records covering 10 years. Processing that many records through the ADF pipeline in Azure took approximately six hours. To reduce the burden on our ADF pipeline, we wrote some simple code in this solution that reads the file and writes it to temporary storage. By going through this solution, we were able to complete the processing of the data in half an hour.
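To put those figures in perspective, the arithmetic can be sketched in a few lines of Python. The record count and both durations come from the review; the function name is illustrative, and the calculation assumes a uniform processing rate, which real pipelines rarely have.

```python
def records_per_second(total_records: int, hours: float) -> float:
    """Average throughput, assuming a uniform processing rate."""
    return total_records / (hours * 3600)


TOTAL_RECORDS = 300_000_000   # record count stated in the review
ADF_HOURS = 6.0               # approximate duration through the ADF pipeline
DATABRICKS_HOURS = 0.5        # duration reported after moving the step to Databricks

adf_rate = records_per_second(TOTAL_RECORDS, ADF_HOURS)
databricks_rate = records_per_second(TOTAL_RECORDS, DATABRICKS_HOURS)
speedup = ADF_HOURS / DATABRICKS_HOURS

print(f"ADF pipeline: ~{adf_rate:,.0f} records/s")
print(f"Databricks:   ~{databricks_rate:,.0f} records/s")
print(f"Speedup:      ~{speedup:.0f}x")
```

On these numbers the move corresponds to roughly a twelvefold reduction in elapsed time.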
The technology that allows us to write scripts within the solution is extremely beneficial. If I am able to script in, for example, SQL, R, Scala, Apache Spark, or Python, I can use that knowledge to write a script in this solution. It is very user-friendly, and you can also process the records from a validation point of view.
The ability to migrate from one environment to another is useful.
What needs improvement?
There should be better integration with other platforms.
For how long have I used the solution?
I have been using this solution for two years.
What do I think about the scalability of the solution?
I have approximately 20 users using this solution in my organization. We have plans to increase our usage in the future.
How was the initial setup?
There is no installation required. It is easy to use. For example, it is available in Azure: you subscribe and use it.
What's my experience with pricing, setup cost, and licensing?
The solution uses a pay-per-use model with an annual subscription fee or package. Typically this solution is used on a cloud platform, such as Azure or AWS, but more people are choosing Azure because the price is more reasonable.
What other advice do I have?
I rate Databricks a nine out of ten.
Which deployment model are you using for this solution?
Public Cloud
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Lead Data Architect at a government with 1,001-5,000 employees
Good integration with majority of data sources through Databricks Notebooks using Python, Scala, SQL, R.
Pros and Cons
- "The initial setup is pretty easy."
- "In the current capacity as an Architect and the end user of Databricks I would say I do have confidence that Databricks can provide a wealth of functionalities to start with."
- "Overall it's a good product, however, it doesn't do well against any individual best-of-breed products."
- "Overall it's a good product, however, it might get challenged over time with individual best-of-breed products."
What is our primary use case?
We used Databricks in AWS on top of S3 buckets as a data lake. The primary use case was providing consistent, ACID-compliant data sets with full history and time series that could be used for analytics.
How has it helped my organization?
Databricks (Delta Lake) and the underlying file storage (data lake) are at the centre of the organisation's enterprise data hub. Most of our data is structured (CSV files), with some semi-structured data (JSON files), but we are beginning to ingest unstructured data (PDF files) and use natural language processing (Textract) to obtain insights driven by keywords.
What is most valuable?
The Databricks notebooks with SQL and Python provide a good, intuitive development environment. Delta Lake, the reading of the underlying file storage, and the Delta tables mounted on top of the data lake (AWS in our case) provide full ACID compliance, good connectivity, and interoperability.
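The idea of ACID-compliant tables with full history can be illustrated with a toy model in plain Python. This is only a sketch of the general technique (an append-only commit log with version-based reads) and not Delta Lake's actual implementation; the class name, the row layout, and the "every row needs an id" rule are all hypothetical. In Delta Lake itself, reading an older snapshot is done through its time travel feature, for example the `versionAsOf` read option.

```python
from copy import deepcopy


class VersionedTable:
    """Toy append-only commit log; NOT Delta Lake's actual implementation."""

    def __init__(self):
        self._versions = [[]]  # version 0 is the empty table

    def commit(self, new_rows):
        """Atomically publish a new version: all rows land, or none do."""
        staged = deepcopy(self._versions[-1])
        for row in new_rows:
            if "id" not in row:  # hypothetical validity rule
                raise ValueError("row is missing 'id'; commit aborted")
            staged.append(dict(row))
        self._versions.append(staged)  # single publish step
        return len(self._versions) - 1

    def read(self, version=None):
        """Read the latest snapshot, or an earlier one by version number."""
        index = len(self._versions) - 1 if version is None else version
        return deepcopy(self._versions[index])


table = VersionedTable()
v1 = table.commit([{"id": 1, "value": "a"}])
v2 = table.commit([{"id": 2, "value": "b"}])

try:
    # The second row is invalid, so the whole commit is rejected.
    table.commit([{"id": 3, "value": "c"}, {"value": "bad"}])
except ValueError:
    pass  # the failed commit leaves no partial rows behind

print(len(table.read()))    # rows in the latest version
print(len(table.read(v1)))  # rows as of the first commit
```

Because every commit creates a new snapshot rather than modifying an old one, earlier versions stay readable, which is what makes history and point-in-time queries possible.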
The initial setup is fairly straightforward. The stability is good.
What needs improvement?
The product is quite ambitious. It's trying to become a centralized platform for all data ingestion, transformation, and analytics needs. It may encounter stiff competition from best-of-breed solutions powered by open source software.
Overall it's a good product, however, it might get challenged over time by individual best-of-breed products.
For example, in the area of data science, RStudio seems to be the industry standard at the moment. The RStudio IDE is richer, and there are more out-of-the-box functionalities, like push-button publishing. It's more difficult to run R within Databricks. Especially when it comes to synchronizing the R packages, it lags behind. It's not even supporting the latest version of R 1.3. I believe eventually all analytics will converge into data science. The analytics of the future will be data science, because predicting the future will be one of the most prevalent use cases. The stuff we used to do before, such as slicing and dicing, drilling through, and trend analysis, will become redundant once analytics toolsets are powered by AI/ML and fully automated. Unless organisations acquire platforms that can cater for machine learning and artificial intelligence, including natural language processing, they will have a hard time surviving.
With Databricks I would like to see more integration with and accommodation of open-source products. This could be controversial, as it could question the whole configuration and the purpose of the product. I'm pretty sure Microsoft is trying to position it in a monopoly market, as they did with Windows and MS Office, so that they could charge a premium. We are beginning to see a similar product strategy behind Databricks.
For how long have I used the solution?
I've been working with Databricks for about two years.
What do I think about the stability of the solution?
From what I know and from what I've heard, talking to our data operations team, it is stable and it's quite powerful.
What do I think about the scalability of the solution?
Running on top of Spark ensures fast processing and elasticity for coping with big data volumes, up to 2 petabytes. You can spin up a cluster very quickly and shut it down. It's elastic.
How are customer service and technical support?
Excellent customer service from Databricks. Very proactive, constantly attuned to customer needs, even connected us with other customers for knowledge exchange.
Which solution did I use previously and why did I switch?
I am an IT consultant and in the past have used different solutions for ETL on top of databases, particularly for data warehousing. However, in the last 6 years I have seen large clients using a mixture of open source and proprietary technologies, such as the Informatica stack with a data lake in AWS, Kafka Confluence with MQ Series on top of mainframes and a data lake in AWS, Databricks and Azure data lake, etc.
How was the initial setup?
It was pretty easy to set up. At least, that is my understanding. I'm not the data engineer though. I don't actually do installs and configurations. I explore features and build them in my architecture designs.
What about the implementation team?
We implemented Databricks through a vendor, and the vendor was pretty good.
What was our ROI?
Don't really know.
What's my experience with pricing, setup cost, and licensing?
I can't speak on pricing of the solution. It's not an aspect of the solution I deal with directly.
Which other solutions did I evaluate?
The options were Talend, EMC Isilon, native AWS services, and others.
What other advice do I have?
In my current capacity as an Architect and an end user of Databricks, I would say I do have confidence that Databricks can provide a wealth of functionalities to start with.
My advice to future adopters of Databricks would be to be careful about the overall architectural roadmap for this application. Adopt a flexible, modular, microservices-like architecture whose components can be replaced in the future should they be deemed inadequate for evolving business needs.
Which deployment model are you using for this solution?
Hybrid Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Chief Data-strategist and Director at Theworkshop.es
Flexible, stable, and reasonably priced
Pros and Cons
- "The solution is very easy to use."
- "The capability of the product is quite good and we are very satisfied with it overall."
- "The integration of data could be a bit better."
What is our primary use case?
We primarily use the solution for retail and manufacturing companies. It allows us to build data lakes.
What is most valuable?
The solution is very easy to use.
The storage on offer is very good.
The solution is perfect for dealing with big data.
The artificial intelligence on offer is very good.
The product is quite flexible.
We have found the solution to be stable.
The cloud services on offer are very reasonably priced.
Technical support is very good. They also have very good documentation on offer to help you navigate the product and learn about its offerings.
What needs improvement?
The solution works very well for us. I can't recall any missing features or anything the solution really lacks. It's very complete.
It would help if there were different versions of the solution on offer.
The integration of data could be a bit better.
For how long have I used the solution?
I've worked for about 20 to 25 years in business intelligence analytics and have worked with Databricks for about four years at this point.
What do I think about the stability of the solution?
The stability of the solution is very good. It doesn't crash or freeze. There are no bugs or glitches. Its performance is very good.
What do I think about the scalability of the solution?
The scalability is quite good. A company that needs to expand it can do so with ease.
We only have four people on the solution at this time. The front-end users never use the product directly. The companies aren't that big here. If the economy improves, we'll likely have more of a need for the product.
How are customer service and technical support?
I've dealt with technical support in the past and have found them to be very good. They are helpful and responsive. We are satisfied with their level of service.
Which solution did I use previously and why did I switch?
I work with Databricks, Cloudera and Snowflake.
How was the initial setup?
The solution is on the cloud and therefore there isn't really an installation process that you need to go through. You only really need to configure the clusters.
Within the clusters, you configure according to how many platforms you need, or if you want to, you can build a cluster for artificial intelligence. You just configure it as required.
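As a rough illustration of what "configuring the cluster" can look like when done programmatically, the sketch below assembles a cluster definition as a plain Python dictionary. The field names are meant to mirror those of the Databricks Clusters API (`cluster_name`, `spark_version`, `node_type_id`, `autoscale`, `autotermination_minutes`), but treat that as an assumption to check against the current API reference. The helper function, the cluster name, and the placeholder values are hypothetical and do not come from the review.

```python
import json


def build_cluster_spec(name, spark_version, node_type_id,
                       min_workers, max_workers, idle_minutes):
    """Assemble a cluster definition as a plain dict (nothing is sent anywhere)."""
    if min_workers > max_workers:
        raise ValueError("min_workers must not exceed max_workers")
    return {
        "cluster_name": name,
        "spark_version": spark_version,
        "node_type_id": node_type_id,
        # Let the cluster grow and shrink between these bounds.
        "autoscale": {"min_workers": min_workers, "max_workers": max_workers},
        # Shut the cluster down after this many idle minutes.
        "autotermination_minutes": idle_minutes,
    }


spec = build_cluster_spec(
    name="example-ml-cluster",          # hypothetical name
    spark_version="<runtime-version>",  # placeholder: pick a supported runtime
    node_type_id="<node-type>",         # placeholder: depends on the cloud provider
    min_workers=1,
    max_workers=4,
    idle_minutes=30,
)
print(json.dumps(spec, indent=2))
```

Keeping the definition as data makes it easy to review and reuse, whether it is then entered through the UI or submitted through an API client.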
What's my experience with pricing, setup cost, and licensing?
The pricing of the product is very reasonable. The fact that it is on the cloud makes it a less expensive option. Other solutions that are on-premises are quite expensive.
What other advice do I have?
We are customers and end-users.
Databricks is on the cloud and therefore we're always on the latest version of the solution. It's constantly updated for us so that we have access to the latest updates and upgrades.
I'd rate the solution at a nine out of ten. The capability of the product is quite good and we are very satisfied with it overall.
I'd recommend the solution to other companies and organizations.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Global Data Architecture and Data Science Director at FH
Flexible with support for several programming languages, good visualization and workload management functionality
Pros and Cons
- "Databricks gives you the flexibility of using several programming languages independently or in combination to build models."
- "Databricks requires writing code in Python or SQL, so if you're a good programmer then you can use Databricks."
- "For a small workload, Databricks may not be worth the costs."
What is our primary use case?
The primary use is for data management and managing workloads of data pipelines.
Databricks can also be used for data visualization, as well as to implement machine learning models. Machine learning development can be done using R, Python, and Spark programming.
What is most valuable?
Databricks gives you the flexibility of using several programming languages independently or in combination to build models.
The quick visualization of the data is very good.
The workload management functionality works well.
What needs improvement?
Databricks requires writing code in Python or SQL, so if you're a good programmer then you can use Databricks.
For how long have I used the solution?
I have been using Databricks since 2017. I am no longer using it personally, although my team is, and will continue to do so in the future.
What do I think about the stability of the solution?
Databricks is quite popular these days and it appears to be stable. I have not found any issues with stability.
What do I think about the scalability of the solution?
Databricks is scalable, regardless of which cloud provider is being used. It is supported on Microsoft Azure, AWS, and they have their own cloud as well.
For a small workload, Databricks may not be worth the costs. However, for larger workloads, Databricks is a very good solution.
In my previous organization, there were between 10 and 15 users.
How are customer service and technical support?
The technical support is handled by Microsoft partners and because we had premium support, it was easy to get. That said, I did not require any support.
Which solution did I use previously and why did I switch?
I have not used tools that are similar to Databricks for workload management, but Azure ADFv2, Google BigQuery, and SAS are some of the most powerful tools in this space that I have used in the past. I have also heard of Dataiku and other tools but I have not used them. The only things that I have used are tools written in Python or scripting languages.
How was the initial setup?
There is no installation required.
What's my experience with pricing, setup cost, and licensing?
Databricks uses a pay-per-use model, where you can use as much compute as you need. I think that the cost can be reduced, given that there are more users on the platform, although it is not as expensive as some other solutions like SAS.
What other advice do I have?
As we transition to the Azure cloud, I expect that we will be using Databricks for workloads.
This is a product that I recommend for those who want to scale and have a good budget. It is good for automating a data pipeline and managing workloads. My advice for anybody who is starting to use it is to take the proper training.
Overall, based on my uses, I think that this product is pretty good.
I would rate this solution an eight out of ten.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Microsoft Azure
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Very elastic, easy to scale, and a straightforward setup
Pros and Cons
- "It's easy to increase performance as required."
- "I don't think you can find any other tool or any other service that is faster than Databricks."
- "Instead of relying on a massive instance, the solution should offer micro partition levels. They're working on it, however, they need to implement it to help the solution run more effectively."
- "The solution is expensive. It's not like a lot of competitors, which are open-source."
What is our primary use case?
We work with clients in the insurance space mostly. Insurance companies need to process claims. Their claim systems run under Databricks, where we do multiple transformations of the data.
What is most valuable?
The elasticity of the solution is excellent.
The storage, etc., can be scaled up quite easily when we need it to.
It's easy to increase performance as required.
The solution runs on Spark very well.
What needs improvement?
Instead of relying on a massive instance, the solution should offer micro partition levels. They're working on it, however, they need to implement it to help the solution run more effectively.
They're currently coming out with a new feature, which is Delta Lake. It will come with a new layer of data compliance.
For how long have I used the solution?
We've been using the solution for two years.
What do I think about the stability of the solution?
I don't see any issues with stability going down to the cluster. It would certainly be fine if it's maintained. It's highly available even if things are dropped. It will still be up and running. I would describe it as very reliable. We don't have issues with crashing. There aren't bugs and glitches that affect the way it works.
What do I think about the scalability of the solution?
The system is extremely scalable. It's one of its greatest features and a big selling point. If a company needs to scale or expand, they can do so very easily.
We require daily usage from the solution even though we don't directly work with Databricks on a day to day basis. Due to the fact that we schedule everything we need and it will trigger work that needs to be done, it's used often. Do you need to log into the database console every day? No. You just need to configure it one time and that's it. Then it will deliver everything needed in the time required.
How are customer service and technical support?
We use Microsoft support, so we are enterprise customers for them. We raise a service request for Databricks, however, we use Microsoft. Overall, we've been satisfied with the support we've been given. They're responsive to our needs.
Which solution did I use previously and why did I switch?
We work with multiple clients and this solution is just one of the examples of products we work with. We use several others as well, depending on the client.
They are all wrappers around the same underlying systems, for example Spark, which is open source. We've worked with Spark itself as well as with the wrappers around it, whether the product was labeled Databricks, IBM insights, Cloudera, etc. These wrappers are all built on the same open-source system.
If we work with Azure data, we use Databricks. Otherwise, we have to create a VM separately. Those things are not needed because Azure is already providing them for us.
How was the initial setup?
The situation may have been a bit different for me than for many users or organizations. I've been in this industry for more than 15 or 17 years. I have a lot of experience. I also took the time to do some research and preparation for the setup. It was straightforward for me.
The deployment with Microsoft usually can be done in 20 minutes. However, it can take 40 to 45 minutes to complete. An organization only requires one person to upload the data and have complete access to the account.
What about the implementation team?
I deployed the solution myself. I didn't require any assistance, so I didn't enlist any resellers or consultants to help with the process.
What's my experience with pricing, setup cost, and licensing?
The solution is expensive. It's not like a lot of competitors, which are open-source.
What other advice do I have?
There isn't really a version, per se.
It's a popular service. I'd recommend the solution. The solution is cloud-agnostic right now, so it really can go into any cloud. It's the users who will be leveraging installed environments that can have these services, no matter if they are using Azure or Ubiquiti, or other systems.
I don't think you can find any other tool or any other service that is faster than Databricks. I don't see that right now. It's your best option.
Overall, I'd rate the solution eight out of ten. The reason I'm not giving it full marks is that it's expensive compared to open source alternatives. Also, the configuration is difficult, so sometimes you need to spend a couple of hours to get it right.
Which deployment model are you using for this solution?
Public Cloud
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Associate Machine Learning Engineer at a tech services company with 501-1,000 employees
Provides resources to users quickly without much hassle
Pros and Cons
- "The most valuable features of the solution are the hardware and the resources it quickly provides without much hassle."
- "I think setting up the whole account for one person and giving access are areas that can be difficult to manage and should be made a little easier."
What is our primary use case?
I have recently gotten into Databricks and trained one model on it. I started using Databricks because of its hardware support and everything else it provides, and because it is easier to get into. Earlier, when I had to test some part of my code to see whether it was working (not a full production run, just a fair test), I had to get a machine, raise a request, and go through the whole process. With Databricks, I can simply create one myself. I can get whatever resources are required, test everything there, and then go ahead, and that is the main reason I have been using it.
What is most valuable?
The most valuable features of the solution are the hardware and the resources it quickly provides without much hassle.
What needs improvement?
I think setting up the whole account for one person and giving access are areas that can be difficult to manage and should be made a little easier.
For how long have I used the solution?
I have experience with Databricks.
What do I think about the stability of the solution?
I think there's a duration after which our training without any activity would expire, which I think is a fair point, and that is the only place where I think this will stop. I haven't come across a lot of problems with Databricks.
What do I think about the scalability of the solution?
The tool is not used as frequently as PyTorch. I don't know why I am comparing Databricks to PyTorch, but I think around five people use it.
How are customer service and support?
I have not contacted the solution's technical support team.
Which solution did I use previously and why did I switch?
Before Databricks, I used to use a cloud support platform.
How was the initial setup?
The solution is deployed on the cloud.
Which other solutions did I evaluate?
I chose Databricks over other products, considering the hardware support it offers.
What other advice do I have?
A little bit of time will be needed to get comfortable with Databricks.
I rate the tool an eight out of ten.
Disclosure: My company does not have a business relationship with this vendor other than being a customer.