Data Engineer at a tech vendor with 1,001-5,000 employees
Real User
Top 20
Experiencing smooth performance and cost advantages over previous tools
Pros and Cons
  • "Databricks is definitely a very stable product and reliable."
  • "My experience with the pricing and licensing model is that it remains relatively expensive. Though it's less expensive than AWS, we still need a more cost-effective solution."

What is our primary use case?

We use Databricks clusters for large-scale big data processing.

What is most valuable?

I think it is difficult to determine which feature of Databricks I enjoy the most since there are many valuable features.

What's valuable about Databricks to my organization is that it is more cost-effective and performs better than the AWS tools and services we used previously.

What needs improvement?

I am uncertain about specific improvements for Databricks.

It would be beneficial to make Databricks even more cost-effective.

For how long have I used the solution?

I have been using Databricks for two years.

Buyer's Guide
Databricks
September 2025
Learn what your peers think about Databricks. Get advice and tips from experienced pros sharing their opinions. Updated: September 2025.
867,349 professionals have used our research since 2012.

What do I think about the stability of the solution?

My experience with Databricks has been smooth, and I haven't encountered any issues.

Databricks is definitely a very stable product and reliable.

How are customer service and support?

I have not used Databricks customer service or support.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

Before Databricks, I used Batch processing, Fargate, and possibly Kubernetes.

I switched from my previous solutions because they were either too expensive or too difficult to configure.

Which other solutions did I evaluate?

I have considered other solutions besides Databricks, such as Snowflake, but we haven't explored it extensively yet.

We are still early in our Snowflake experience, so we don't know the pros and cons compared to Databricks.

What other advice do I have?

My deployment model for Databricks is limited as I'm not a heavy user.

I am not the person who purchased Databricks, but it was possibly acquired through the AWS Marketplace.

I may not have utilized Databricks machine learning capabilities.

My experience with the pricing and licensing model is that it remains relatively expensive. Though it's less expensive than AWS, we still need a more cost-effective solution.

I would rate Databricks overall a nine out of ten.

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Dunstan Matekenya - PeerSpot reviewer
Data Scientist at a financial services firm with 10,001+ employees
Real User
Top 5
Processes large-scale data sets and integrates with Apache Spark in a notebook environment

What is our primary use case?

I primarily use Databricks to process large-scale data sets with Apache Spark. My main use case is processing large data sets, such as 600 GB or 800 GB.

What is most valuable?

Databricks integrates natively with Apache Spark, which I use as a processing engine for large-scale datasets. This native integration is one of its strengths. Another strength is that the platform makes it very easy to manage resources. For example, setting up a cluster of five or fifteen nodes is straightforward with Databricks. The notebook environment is also excellent, making it easy to perform various tasks.

What needs improvement?

While Databricks allows you to upload your packages, we encountered some limitations with its capabilities, particularly with Apache Spark, which also affected Databricks. We had issues working with spatial data. You had to go through many steps to find libraries that could process spatial data in a distributed fashion.

For how long have I used the solution?

I have been using Databricks since 2018.

What do I think about the scalability of the solution?

I might have a project that runs for one or two months, and perhaps I won't use it for six months. Self-service is one of its strengths. I can shut down everything and easily spin up resources when I need to use them again.  We have a dedicated group of fifty people who consistently use Databricks for analytics.

How was the initial setup?

The initial setup was very easy and took around 10-15 people. We have a data science infrastructure team helping with this.

What was our ROI?

Databricks stands out among most data platforms mainly because of its ease of use. The learning curve is not as steep, making it accessible for anyone to handle large-scale data processing on Databricks. This ease of use contributes positively to our return on investment. However, in our line of work, converting this efficiency into direct monetary gains can be challenging, given our nonprofit nature. 

What's my experience with pricing, setup cost, and licensing?

We purchased high-performance laptops to reduce our reliance on the cloud. The main issue was the cost. Internally, if I used Databricks, that cost would return to my team. There was a time when my monthly cost was around ten thousand dollars, which was quite high. Due to these costs, several teams, including ours, moved away from using Databricks and other cloud providers. It became prohibitive, so we invested in our own high-performance computers internally instead.

What other advice do I have?

Databricks provides ease of use for me, particularly due to its seamless integration with Apache Spark. This integration simplifies the process of conducting machine learning on large-scale datasets.

I recommend this solution 100%. Overall, I rate the solution an eight out of ten.

Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Solution Architect at an insurance company with 10,001+ employees
Real User
A nice interface with good features for turning off clusters to save on computing
Pros and Cons
  • "There are good features for turning off clusters."
  • "It would be nice to have more guidance on integrations with ETLs and other data quality tools."

What is our primary use case?

Our company uses the solution for big data and as an interface for analytics. 

We also create custom APIs to get data and provide SQL endpoints so users can access it over traditional tools like JDBC or ODBC.

We use the solution on AWS and Azure. The data lake is wide open for departmental use. We have ten departments and two or three people from each department access the solution. 

How has it helped my organization?

The platform as a service allows us to ramp up a new database pretty fast. We deploy some of the infrastructure as code. End users can access data immediately and connect with Power BI for reporting.

What is most valuable?

There are good features for turning off clusters. Basically, if we aren't using it, then it is turned off. When a user starts accessing, it starts up so we save on computing. 
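The saving from turning idle clusters off can be sketched with a little arithmetic. The example below is a minimal illustration and not Databricks code: the function name `billed_minutes`, the 30-minute idle timeout, and the usage windows are all assumptions made up for the example. It compares a cluster that stays on all day with one that shuts down after a period of inactivity and restarts on the next access.

```python
from typing import List, Tuple


def billed_minutes(sessions: List[Tuple[int, int]], idle_timeout: int) -> int:
    """Return the minutes a cluster is running when it shuts down after
    `idle_timeout` idle minutes and restarts on the next access.

    `sessions` is a list of (start, end) minutes of user activity,
    sorted by start time and non-overlapping (hypothetical input).
    """
    total = 0
    for i, (start, end) in enumerate(sessions):
        total += end - start  # the cluster is busy while users are active
        if i + 1 < len(sessions):
            gap = sessions[i + 1][0] - end
            # the cluster idles until the timeout fires or the next user arrives
            total += min(gap, idle_timeout)
        else:
            total += idle_timeout  # final idle period before shutdown
    return total


# Hypothetical working day: activity 09:00-10:00 and 14:00-15:30.
sessions = [(540, 600), (840, 930)]
always_on = 24 * 60
with_auto_off = billed_minutes(sessions, idle_timeout=30)
print(always_on, with_auto_off)
```

With these made-up numbers the cluster runs for 210 minutes instead of 1,440, which is where the compute saving comes from.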

Our data lake team likes the interface very much because it is straightforward. Of course, you need to understand the different clusters when they are started.

There are nice features for machine learning and analytics.

The security features allow us to integrate with the active directory and assign different people to different databases. 

The solution has a good interface with Python.

There is good integration with Azure so we can access the solution over the standard Azure interface and use the storage pro measure. 

What needs improvement?

It would be nice to have more guidance on integrations with ETLs and other data quality tools. The solution is not really a product for ETL or data quality so we use other DBT tools. 

For how long have I used the solution?

I have been using the solution for four months but my company has been using it for one year. 

What do I think about the stability of the solution?

The solution is very stable with no issues so I rate stability a ten out of ten. 

What do I think about the scalability of the solution?

The solution is scalable to the cluster size and Azure storage. 

Scalability is rated an eight out of ten. 

How are customer service and support?

I have not used technical support. 

The company has regular calls with Databricks and they are pretty good but are more on the technical presale side. 

Which solution did I use previously and why did I switch?

We previously used Azure's data lake product and possibly some Hortonworks. 

How was the initial setup?

The setup is not easy but also is not too complicated. An infrastructure needs to be set up first. We use Azure storage or SQL S3 and create private endpoints.

This is maybe a little more complex or a bit different than other databases in the cloud. For a traditional setup, you need to also think about file systems and disks. Here, you just transform it into the storage and private endpoint.

The first setup might be a bit of a struggle until you learn and understand what is necessary. 

What about the implementation team?

We implemented the solution in-house with support from Databricks. Two team members were involved in the implementation. 

Three team members handle ongoing development and maintenance. 

What's my experience with pricing, setup cost, and licensing?

The solution is affordable. 

What other advice do I have?

The solution is pretty good because it uses Azure's data lake storage. It is basically the tool on top that provides the SQL interface and APIs for Python. I like the solution because it enables people to work with it.

I rate the solution a nine out of ten. 

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Business Architect at YASH Technologies
Real User
Very quick run time but there are some limitations for legacy integrations
Pros and Cons
  • "The solution is an impressive tool for data migration and integration."
  • "The solution has some scalability and integration limitations when consolidating legacy systems."

What is our primary use case?

Our company uses the solution for series-based and panel-based migrations. We collect and store user requirements, use apps to fetch data, and provide customers with better data for business reports. There are 30 to 40 users in our company.

What is most valuable?

The solution is an impressive tool for data migration and integration. 

The run time is very quick.

What needs improvement?

The solution has some scalability and integration limitations when consolidating legacy systems.

For how long have I used the solution?

I have been using the solution for two years. 

What do I think about the stability of the solution?

The solution is stable. 

What do I think about the scalability of the solution?

It is not really scalability but more about the combination of the structure, consolidation, and different formats we can split and merge. We do a lot of things while storing the target operational model. Snowflake is more flexible and scalable in that regard. 

How are customer service and support?

We have contacted technical support a lot about replicating values in PDF files. So far, they have not been able to provide a viable solution. 

How was the initial setup?

The setup is of average difficulty but tougher than Snowflake. 

Deployment is easy and run time is quick. 

What about the implementation team?

We implemented the solution in-house.

One resource manages services for end-to-end monitoring and maintenance activities. 

What's my experience with pricing, setup cost, and licensing?

The solution is based on a licensing model. Updates occur automatically by the task base. 

Which other solutions did I evaluate?

Snowflake is quite impressive in comparison to the solution because there is flexibility in the way you consolidate. In contrast, the solution has some scalability and integration limitations when consolidating legacy systems. Tool-wise, Snowflake is easy from a technical perspective because connectors are included.

We are evaluating options for one particular use case. The customer wants to replicate values from PDFs and enter them in the data model. We contacted the solution's technical support but do not yet have a viable answer. There are gaps in what we do and how we capture. The only option right now is for the customer to manually upload values that we integrate using Synapse to consolidate report data. We haven't yet found another tool that maps to meet our customer's requirement. 

What other advice do I have?

I rate the solution a seven out of ten. 

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Other
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
AbhishekGupta - PeerSpot reviewer
Engineering Leader at Walmart
Real User
Fantastic features such as interactive clusters that perform at top speed
Pros and Cons
  • "The solution's features are fantastic and include interactive clusters that perform at top speed when compared to other solutions."
  • "CI/CD needs additional leverage and support."

What is our primary use case?

Our company uses the solution's Spark module as a processing engine for big data analytics.

We do not use the module as a streaming engine. The historic perception is that Spark is for batches, machine learning, analytics, and big data processing but not for streaming and that is exactly how we use it. 

What is most valuable?

The solution's features are fantastic and include interactive clusters that perform at top speed when compared to other solutions.

The ATC monitoring experience and the maturity of the APIs are very good. 

What needs improvement?

CI/CD needs additional leverage and support. Community forums are helpful for gaining knowledge but the solution should provide specific documentation.

Streaming services such as Flink should be amplified and better supported. 

There are not many connectors to MapReduce.

For how long have I used the solution?

I have been using the solution for seven years. 

What do I think about the stability of the solution?

The solution is mature and stable compared to other products. 

What do I think about the scalability of the solution?

The solution is scalable with no issues from a compute perspective.

How are customer service and support?

I received support for initial challenges and it was very good. The support team was very professional and provided the answers I needed. 

I rate support an eight out of ten. 

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

I previously used Cloud-Bricks. 

How was the initial setup?

The initial setup is easy for me because I access the solution on a web browser. 

What about the implementation team?

Unilever had a specific team for implementing and managing the solution.

Walmart had a team of ten engineers for implementation and a couple of engineers for management. 

What was our ROI?

We receive an ROI for our batch constructs. 

What's my experience with pricing, setup cost, and licensing?

The solution is a good value for batch processing and huge workloads. 

The price might be high for use cases that are for streaming or strictly data science. 

Which other solutions did I evaluate?

I have evaluated multiple options including Cloud-Brick and Dataproc for price versus performance, technical support, and CI/CD approach.

I started as a consumer and used the solution for on-premises deployment with Unilever from a data science perspective. At that time, the solution was in its beta stage but viewed as good, far ahead of its competition, and expensive. The key comparison used to be HDInsight or Adobe Cluster for cloud data and the solution was thought of as a cluster service rather than for unified analytics.

I moved along on my journey to Walmart where I was building their platform and compared it to the solution from a cloud perspective and a cluster service with notebooks. Consumers at the time were using Project Lightspeed and ATC for streaming. Spark was used as a micro-batching engine for machine learning, analytics, and big data processing. At some point, the solution became preferred and more than 100 staff members were leveraging its use.

I found that the solution had interesting features that I liked such as its notebook, interactive clusters with fast speed, and the ATC monitoring experience. I did not like the solution from a CI/CD perspective because it had a rigidity in terms of the approval process.

The solution grew from that original space and, by the time I had moved to Microsoft, was partnered with Microsoft Azure. An integration with ADF and other products solved the CI/CD issues for me.

I am now leading streaming platforms for Walmart so my interest is in the solution's streaming capabilities. I began building a streaming platform using Spark PM in Microsoft so the solution was its key competitor. Then the solution launched Photon, a vectorized engine for Spark. Its performance was a key factor in moving from Microsoft because it performed much better than other products including open-source Spark, Microsoft Synapse Spark, and Dataproc.

What other advice do I have?

It is important to do POCs and run tests to control the meter that also controls the price. The meter can go really high from a computing perspective if POCs and settings are not streamlined. 

I rate the solution an eight out of ten. 

Which deployment model are you using for this solution?

On-premises
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Axel Richier - PeerSpot reviewer
Tech Lead Consultant | Manager Data Engineering at Ekimetrics
Real User
Top 20
Simple to set up, fast to deploy, and with regular product updates
Pros and Cons
  • "We can scale the product."
  • "I would love an integration in my desktop IDE. For now, I have to code on their webpage."

What is our primary use case?

We're using it to provide a unified development experience for all our data experts, including data engineers, data scientists, and IT engineers. With the Databricks platform, we allow teams to collaborate easily on building data science models for our clients. The development environment allows us to ingest data from various sources, scale the data processing, and expose the results either through APIs or through enriched datasets made available to web apps or dashboards that leverage the serverless capacities of SQL warehouse endpoints.

How has it helped my organization?

Databricks allowed us to offer a homogeneous development environment across different accounts and domains, and also across different clouds. The upskilling of our employees is far more linear and faster, and the complexity of infrastructure management is removed. This led to increased collaboration between domains thanks to a better onboarding experience, more performant pipelines, and a smoother industrialization process. Overall client satisfaction has increased and the time to first insight has been reduced.

What is most valuable?

The shared experience of collaborative notebooks is probably the most useful aspect since it allows me, as an expert, to help my juniors debug their notebooks and their code live. I can do some live coding with them or help them find the errors very efficiently.

It has become very simple to set up thanks to its official Terraform provider and the open-source modules made available on GitHub.

I love Databricks due to the fact that we can now deploy it in 15 minutes and it's ready to use. That's very nice since we often help our clients in deploying their first Data Platform with Databricks.

The solution is stable, with LTS Runtimes that have proven to remain stable over the years. 

What needs improvement?

I would love to be able to declare my workflows as code, in an Airflow-like way. This would help us create more robust Python ingestion modules that we can test, share, and update within the company.
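To make the request concrete, the sketch below shows what an Airflow-like, code-declared workflow could look like. It is plain Python using only the standard library (`graphlib`, available from Python 3.9), not a Databricks or Airflow API: the `Workflow` class, the `task` decorator, and the task names are hypothetical and exist only for illustration.

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, List, Sequence


class Workflow:
    """Hypothetical in-memory workflow: tasks plus their upstream dependencies."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._tasks: Dict[str, Callable[[], None]] = {}
        self._upstream: Dict[str, Sequence[str]] = {}

    def task(self, depends_on: Sequence[str] = ()):
        """Decorator that registers a function as a task of this workflow."""
        def register(fn: Callable[[], None]) -> Callable[[], None]:
            self._tasks[fn.__name__] = fn
            self._upstream[fn.__name__] = tuple(depends_on)
            return fn
        return register

    def run(self) -> List[str]:
        """Run every task after its dependencies; return the execution order."""
        order = list(TopologicalSorter(self._upstream).static_order())
        for name in order:
            self._tasks[name]()
        return order


wf = Workflow("daily_ingestion")
log: List[str] = []


@wf.task()
def extract() -> None:
    log.append("extract")


@wf.task(depends_on=["extract"])
def transform() -> None:
    log.append("transform")


@wf.task(depends_on=["transform"])
def load() -> None:
    log.append("load")


print(wf.run())
```

Because the workflow is ordinary Python, each task function can be unit-tested on its own and the whole module can be versioned and shared like any other package, which is the benefit the review is asking for.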

We would also love to have access to cluster metrics in a programmatic way, so that we can analyse hardware logs and identify potential bottlenecks to optimize.

Lastly, the latest VS Code extension has proven to be useful and appreciated by the community, as it allows us to develop locally and benefit from traditional software best-practice tools such as pre-commit hooks.

For how long have I used the solution?

I've been using the solution for more than four years now, in the context of PoC to full end-to-end Data Platform deployment.

What do I think about the stability of the solution?

The product is very stable. I've been using it for three years now, and I have projects that have been running for three years without any big issues.

What do I think about the scalability of the solution?

It's very scalable. I have a project that started as a proof of concept on connected cars. We had 100 cars to track at first - just for the proof of concept. Now we have millions of cars that are being tracked. It scales very well. We have terabytes of data every day and it doesn't even flinch.

How are customer service and support?

I've had very good experiences with technical support where they answer me in a couple of hours. Sometimes it takes a bit longer. It's usually a matter of days, so it's very good overall. 

Even if it took a bit of time, I got my answer. They never left me without an answer or a solution.

How would you rate customer service and support?

Positive

How was the initial setup?

The implementation is very simple to set up. That's why we choose it over many other tools. Its Terraform provider is our go-to for the initial setup, as we reuse templates to get a functional workspace in minutes.

Usually, we have two to five data engineers handling the maintenance and running of our solutions.

What about the implementation team?

We deploy it in-house.

What's my experience with pricing, setup cost, and licensing?

The solution is a bit expensive. That said, it's worth it. I see it as an Apple product. For example, the iPhone is very expensive, yet you get what you pay for.

The cost depends on the size of your data. If you have lots of data, it's going to be more expensive since you will consume more pay-per-use compute units. My smallest project is around a hundred euros, and my most expensive is just under a thousand euros a week. That is based on terabytes of data processed each month.
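The relationship between data volume and cost can be sketched as simple arithmetic. The example assumes usage-based billing in compute units; the function name, the unit price, and the units-per-terabyte figure are invented for illustration and are not Databricks list prices.

```python
def weekly_cost(terabytes_per_week: float,
                units_per_terabyte: float,
                price_per_unit: float) -> float:
    """Estimate weekly spend under a simple pay-per-compute-unit model.

    All three inputs are hypothetical; real consumption depends on the
    workload, the cluster type, and negotiated pricing.
    """
    if min(terabytes_per_week, units_per_terabyte, price_per_unit) < 0:
        raise ValueError("inputs must be non-negative")
    return terabytes_per_week * units_per_terabyte * price_per_unit


# Made-up figures: 200 compute units per terabyte at 0.50 euros per unit.
for tb in (1, 5, 9):
    print(tb, "TB/week ->", weekly_cost(tb, 200, 0.50), "euros")
```

Under these assumed figures, one terabyte a week costs about a hundred euros and nine terabytes cost about nine hundred, the same order of magnitude as the range quoted above.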

Which other solutions did I evaluate?

We looked into Azure Synapse as an alternative, as well as Azure ML and Vertex on GCP. Vertex AI would be the main alternative.

Some people consider Snowflake a competitor; however, we can't deploy Snowflake ourselves just like we deploy Databricks ourselves. We use that as an advantage when we sell Databricks to our clients. We say, "If you go with us, we are going to deploy Databricks in your environment in 15 minutes," and they really like it.

Fabric was released recently and offers a product quite similar to Databricks. Yet the user experience, the CI/CD capabilities, and the frequent release cycle of Databricks remain strong advantages.

What other advice do I have?

We're a partner.

We use the solution on various clouds. Mostly it is Azure. However, we also use Google and AWS.

One of the big advantages is that it works across domains. I'm responsible for a data engineering team. However, I work on the same platform with data scientists, and I'm very close to my IT team, who is in charge of the data access and data access control, and they can manage all the accesses from one point to all the data assets. It's very useful for me as a data engineer. I'm sure that my IT director would say it's very useful for him too. They managed to build a solution that can very easily cross responsibilities. It unifies all the challenges in one place and solves them all mostly.

I'd rate the solution nine out of ten.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PraveenS - PeerSpot reviewer
Design Engineer at Cyient Limited
Real User
Top 5
A scalable and cost-effective solution that has excellent translation features and can be used for data analytics
Pros and Cons
  • "It is a cost-effective solution."
  • "The product should provide more advanced features in future releases."

What is our primary use case?

We use the solution for data analytics of industrial data.

What is most valuable?

We extensively use the product’s notebooks, jobs, and triggers. We can create activities. Wherever translation is required, we use Databricks. The product fulfills our customer requirements. It is a cost-effective solution.

What needs improvement?

The product should provide more advanced features in future releases.

For how long have I used the solution?

I have been using the solution for six months.

What do I think about the stability of the solution?

Our data was not too huge. It worked well. It is easily adaptable.

What do I think about the scalability of the solution?

The tool is scalable. We can make it available for a larger audience.

How was the initial setup?

The initial setup is not that difficult. I rate the ease of setup a seven out of ten. The solution is cloud-based. We use native services like Data Factory for orchestration. Sometimes, the customers require us to use Amazon as the cloud provider instead of Azure.

What's my experience with pricing, setup cost, and licensing?

The pricing is average.

What other advice do I have?

There are many services which are coming up. They are still in the preview stage. Overall, I rate the product an eight out of ten.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: My company has a business relationship with this vendor other than being a customer. Partner
Karan  Sharma - PeerSpot reviewer
Data Analyst at Allianz
Real User
An easy to setup tool that provides its users with an insight into the metadata of the data they process
Pros and Cons
  • "The initial setup phase of Databricks was good."
  • "Scalability is an area with certain shortcomings. The solution's scalability needs improvement."

What is our primary use case?

My company uses Databricks to process real-time and batch data with its streaming analytics part. We use Databricks' Unified Data Analytics Platform, for which we have Azure as a solution to bring the unified architecture on top of that to handle the streaming load for our platform.

What is most valuable?

The most valuable feature of the solution is its speed, especially in computation and in the atomicity of data reads. We have a storage account, and we can read the data on the go. Now that Databricks has Unity Catalog, we also get good insight into the metadata of the data we are going to process. There are a lot of things that are quite nice with Databricks.

What needs improvement?

Scalability is an area with certain shortcomings. The solution's scalability needs improvement.

For how long have I used the solution?

I have been using Databricks for a few years. I use the solution's latest version. Though currently my company is a user of the solution, we are planning to enter into a partnership with Databricks.

What do I think about the stability of the solution?

It is a stable solution. Stability-wise, I rate the solution an eight to nine out of ten.

What do I think about the scalability of the solution?

It is a scalable solution. Scalability-wise, I rate the solution an eight to nine out of ten.

My company has a team of 50 to 60 people who use the solution.

How are customer service and support?

Sometimes, my company does need support from the technical team of Databricks. The technical team of Databricks has been good and helpful. I rate the technical support an eight out of ten.

How would you rate customer service and support?

Positive

How was the initial setup?

The initial setup phase of Databricks was good. You can spin up clusters and integrate those with DevOps as well. Databricks is quite nice owing to its user-friendly UI, DPP, and workspaces.

The solution is deployed on the cloud.

The time taken for the deployment depends on the workload.

What's my experience with pricing, setup cost, and licensing?

I cannot judge whether the product is expensive or cheap since I am unaware of the prices of competing products. The licensing costs of Databricks depend on how many licenses we need, and Databricks provides a lot of discounts based on that number.

What other advice do I have?

It is a state-of-the-art product revolutionizing data analytics and machine learning workspaces. Databricks is a complete solution when it comes to working with data.

I rate the overall product an eight out of ten.

Disclosure: My company does not have a business relationship with this vendor other than being a customer.