Anirban Bhattacharya - PeerSpot reviewer
Practice Head, Data & Analytics at a tech vendor with 10,001+ employees
Real User
Key feature is ability to make changes in structure or data size and align for subsequent consumption
Pros and Cons
  • "Can cut across the entire ecosystem of open source technology to give an extra level of getting the transformatory process of the data."
  • "Implementation of Databricks is still very code heavy."

What is our primary use case?

We have a team that works on Databricks for our clients. We are customers of Databricks. 

What is most valuable?

Databricks can cut across the entire ecosystem of open source technology which gives an extra level in terms of getting the transformatory process of the data. The solution is primarily open source and they have bolstered its components to make it more fit for purpose for a complete Azure Data platform. The solution is responsible for the core transformatory activities. While Azure Data Factory is very good for pulling in the data, doing the basic standardization and profiling, Databricks is more about making fundamental changes in structure or in size of the data and aligning it for subsequent consumption, or for the final layer on Synapse. It also has the power to complement and work with Spark and elements related to Python. 

What needs improvement?

In my view, the fundamental approach of implementing Databricks is still very code heavy, more so than in Azure Data Factory and other technologies like Informatica or SQL Server Integration Services. From my perspective, that could be improved. I'd also like to have the ability to facilitate predictive analytics within the solution.

For how long have I used the solution?

I've been using the solution for a year and a half. 

Buyer's Guide
Databricks
June 2025
Learn what your peers think about Databricks. Get advice and tips from experienced pros sharing their opinions. Updated: June 2025.
856,873 professionals have used our research since 2012.

What do I think about the stability of the solution?

Stability of the product is good, whether it's handling large volumes, diverse elements of data or processing data at speed. It has stood the test of time. It's a solution that really lends itself to that higher level of stability, versatility and diversity in terms of its capability to process different forms of data.

What's my experience with pricing, setup cost, and licensing?

The cost of the solution is slightly on the high side so it's important to use it efficiently.

What other advice do I have?

Use the solution wisely and in tandem with Azure Data Factory. Keep that division of labor in view in the overall design of your pipelines, so that Databricks is used to its potential. Databricks adds significant transformation and data-tranching capability, across a diverse variety of data, to the Azure data stack. In terms of the license, ensure that the customer is getting what they paid for so that the value for money is realized.

I rate the solution eight out of 10. 

Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
reviewer1488372 - PeerSpot reviewer
Associate Manager at a consultancy with 501-1,000 employees
Real User
Efficient, high data volume processing, and easy to use
Pros and Cons
  • "The main features of the solution are efficiency."
  • "There should be better integration with other platforms."

What is our primary use case?

We use this solution to process data, for example, data validation.

What is most valuable?

The main features of the solution are efficiency.

We were trying to process 300 million records covering 10 years. Processing that many records through the Azure Data Factory (ADF) pipeline took approximately six hours. To reduce the burden on our ADF pipeline, we wrote some simple code in this solution that reads the file and writes it to temporary storage. By going through this solution, we were able to complete the processing of the data in half an hour.
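
The read, validate, and write pattern described here can be sketched in a few lines. This is a minimal illustration in plain Python, not the reviewer's actual code: on Databricks the same steps would typically run as a Spark job over distributed storage. The column names (`id`, `amount`), the validation rule, and the `is_valid` and `validate_file` helpers are all assumptions made for the example.

```python
import csv
import tempfile
from pathlib import Path


def is_valid(row: dict) -> bool:
    """Hypothetical rule: id must be present and amount must be a non-negative number."""
    if not row.get("id"):
        return False
    try:
        return float(row["amount"]) >= 0
    except (KeyError, ValueError, TypeError):
        return False


def validate_file(source: Path, out_dir: Path) -> tuple[int, int]:
    """Stream records from source, writing valid and rejected rows to out_dir.

    Returns (valid_count, rejected_count). Rows are handled one at a time,
    so memory use does not grow with the size of the input.
    """
    valid_count = rejected_count = 0
    with source.open(newline="") as src, \
            (out_dir / "valid.csv").open("w", newline="") as ok, \
            (out_dir / "rejected.csv").open("w", newline="") as bad:
        reader = csv.DictReader(src)
        ok_writer = csv.DictWriter(ok, fieldnames=reader.fieldnames, extrasaction="ignore")
        bad_writer = csv.DictWriter(bad, fieldnames=reader.fieldnames, extrasaction="ignore")
        ok_writer.writeheader()
        bad_writer.writeheader()
        for row in reader:
            if is_valid(row):
                ok_writer.writerow(row)
                valid_count += 1
            else:
                bad_writer.writerow(row)
                rejected_count += 1
    return valid_count, rejected_count


if __name__ == "__main__":
    # Tiny made-up input written to a temporary directory for demonstration.
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "records.csv"
        source.write_text("id,amount\n1,10.5\n,3\n2,-1\n")
        print(validate_file(source, Path(tmp)))
```

Splitting output into valid and rejected files keeps the rejected rows available for inspection instead of silently dropping them.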

The ability to write scripts within the solution is extremely beneficial. If I can script in SQL, R, Scala, Apache Spark, or Python, for example, I can use that knowledge to write a script in this solution. It is very user-friendly, and it also works well for processing records from a validation point of view.

The ability to migrate from one environment to another is useful.

What needs improvement?

There should be better integration with other platforms.

For how long have I used the solution?

I have been using this solution for two years.

What do I think about the scalability of the solution?

I have approximately 20 users using this solution in my organization. We have plans to increase our usage in the future.

How was the initial setup?

There is no installation required. It is easy to use: in Azure, for example, it is already available, so you subscribe and start using it.

What's my experience with pricing, setup cost, and licensing?

The solution uses a pay-per-use model with an annual subscription fee or package. Typically this solution is used on a cloud platform, such as Azure or AWS, but more people are choosing Azure because the price is more reasonable.

What other advice do I have?

I rate Databricks a nine out of ten.

Which deployment model are you using for this solution?

Public Cloud
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
reviewer1558740 - PeerSpot reviewer
Lead Data Architect at a government with 1,001-5,000 employees
Real User
Good integration with majority of data sources through Databricks Notebooks using Python, Scala, SQL, R.
Pros and Cons
  • "The initial setup is pretty easy."
  • "Overall it's a good product, however, it doesn't do well against any individual best-of-breed products."

What is our primary use case?

We used Databricks in AWS on top of S3 buckets as a data lake. The primary use case was providing consistent, ACID-compliant data sets with full history and time series that could be used for analytics.

How has it helped my organization?

Databricks (Delta Lake) and the underlying file storage (data lake) are at the centre of the organisation's enterprise data hub. Most of our data is structured (CSV files) and some is semi-structured (JSON files), but we are beginning to ingest unstructured data (PDF files) and use natural language processing (Textract) to obtain insights driven by keywords.

What is most valuable?

The Databricks notebooks with SQL and Python provide a good, intuitive development environment. Delta Lake, the reading of the underlying file storage, and the delta tables mounted on top of the data lake (AWS in our case) provide full ACID compliance, good connectivity, and interoperability.
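
To make "full history" concrete: a table format of this kind keeps every committed version readable, so an analysis can be rerun against the data as it stood at an earlier point. The toy class below illustrates only that idea in plain Python. It is not how Delta Lake is implemented, and `VersionedTable`, `commit`, and `read` are hypothetical names chosen for the sketch.

```python
from typing import Dict, List, Optional

Row = Dict[str, object]


class VersionedTable:
    """Append-only log of commits; each commit yields a new readable version."""

    def __init__(self) -> None:
        self._commits: List[List[Row]] = []

    def commit(self, rows: List[Row]) -> int:
        """Add a batch of rows as a single commit and return its version number."""
        # Copy the batch so later changes by the caller cannot alter history.
        self._commits.append([dict(r) for r in rows])
        return len(self._commits) - 1

    def read(self, version: Optional[int] = None) -> List[Row]:
        """Return all rows visible at `version` (the latest if omitted)."""
        if not self._commits:
            return []
        latest = len(self._commits) - 1
        v = latest if version is None else version
        if not 0 <= v <= latest:
            raise ValueError(f"unknown version {v}")
        # A version sees its own commit plus every commit before it.
        return [dict(r) for batch in self._commits[: v + 1] for r in batch]
```

Reading with an explicit version number returns the table as of that commit, while reading without one returns the latest state.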

The initial setup is fairly straightforward. The stability is good.

What needs improvement?

The product is quite ambitious. It's trying to become a centralized platform for all data ingestion, transformation, and analytics needs. It may encounter stiff competition from best-of-breed solutions powered by open source software.

Overall it's a good product; however, it might be challenged over time by individual best-of-breed products.

For example, in the area of data science, RStudio seems to be the industry standard at the moment. The RStudio IDE is richer, with more out-of-the-box functionality such as push-button publishing. It's more difficult to run R within Databricks. It lags behind especially when it comes to synchronizing R packages. It doesn't even support the latest version of R 1.3. I believe eventually all analytics will converge into data science. The analytics of the future will be data science, because predicting the future will be one of the most prevalent use cases. The things we used to do before, such as slicing and dicing, drilling through, and trend analysis, will become redundant operations once the analytics toolsets are powered by AI/ML and fully automated. Unless organisations acquire platforms that can cater for machine learning and artificial intelligence, including natural language processing, they will have a hard time surviving.

With Databricks, I would like to see more integration with and accommodation of open-source products. This could be controversial, as it could call into question the whole configuration and purpose of the product. I'm pretty sure Microsoft is trying to position it in a monopoly market, as they did with Windows and MS Office, so that they can charge a premium. We are beginning to see a similar product strategy behind Databricks.

For how long have I used the solution?

I've been working with Databricks for about two years. 

What do I think about the stability of the solution?

From what I know and from what I've heard, talking to our data operations team, it is stable and it's quite powerful. 

What do I think about the scalability of the solution?

Running on top of Spark ensures fast processing and the elasticity to cope with big data volumes, up to 2 petabytes. You can spin up a cluster very quickly and shut it down again. It's elastic.

How are customer service and technical support?

Excellent customer service from Databricks. Very proactive, constantly attuned to customer needs, even connected us with other customers for knowledge exchange. 

Which solution did I use previously and why did I switch?

I am an IT consultant and in the past have used different solutions for ETL on top of databases, particularly for data warehousing. However, in the last six years I have seen large clients using a mixture of open source and proprietary technologies, such as the Informatica stack with a data lake in AWS, Confluent Kafka with MQ Series on top of mainframes and a data lake in AWS, Databricks and Azure Data Lake, etc.

How was the initial setup?

It was pretty easy to set up. At least, that is my understanding. I'm not the data engineer though. I don't actually do installs and configurations. I explore features and build them in my architecture designs.

What about the implementation team?

We implemented Databricks through a vendor, and the vendor was pretty good.

What was our ROI?

Don't really know.

What's my experience with pricing, setup cost, and licensing?

I can't speak on pricing of the solution. It's not an aspect of the solution I deal with directly.

Which other solutions did I evaluate?

The options were Talend, EMC Isilon, native AWS services, and others.

What other advice do I have?

In my current capacity as an architect and end user of Databricks, I would say I have confidence that Databricks can provide a wealth of functionality to start with.

My advice to future adopters of Databricks would be to be careful about the overall architectural roadmap for this application. Adopt a flexible, modular, microservices-like architecture whose components can be replaced in the future should they be deemed inadequate for evolving business needs.

Which deployment model are you using for this solution?

Hybrid Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Oscar Estorach - PeerSpot reviewer
Chief Data-strategist and Director at Theworkshop.es
Real User
Top 10
Flexible, stable, and reasonably priced
Pros and Cons
  • "The solution is very easy to use."
  • "The integration of data could be a bit better."

What is our primary use case?

We primarily use the solution for retail and manufacturing companies. It allows us to build data lakes.

What is most valuable?

The solution is very easy to use. 

The storage on offer is very good. 

The solution is perfect for dealing with big data.

The artificial intelligence on offer is very good.

The product is quite flexible.

We have found the solution to be stable. 

The cloud services on offer are very reasonably priced.

Technical support is very good. They also have very good documentation on offer to help you navigate the product and learn about its offerings. 

What needs improvement?

The solution works very well for us. I can't recall any missing features or anything the solution really lacks. It's very complete. 

It would help if there were different versions of the solution on offer.

The integration of data could be a bit better.

For how long have I used the solution?

I've worked for about 20 to 25 years in business intelligence analytics and have worked with Databricks for about four years at this point. 

What do I think about the stability of the solution?

The stability of the solution is very good. It doesn't crash or freeze. There are no bugs or glitches. Its performance is very good.

What do I think about the scalability of the solution?

The scalability is quite good. A company that needs to expand it can do so with ease.

We only have four people on the solution at this time. The front-end users never use the product directly. The companies aren't that big here. If the economy improves, we'll likely have more of a need for the product.

How are customer service and technical support?

I've dealt with technical support in the past and have found them to be very good. They are helpful and responsive. We are satisfied with their level of service.

Which solution did I use previously and why did I switch?

I work with Databricks, Cloudera, and Snowflake.

How was the initial setup?

The solution is on the cloud and therefore there isn't really an installation process that you need to go through. You only really need to configure the clusters. 

Within the clusters, you configure according to how many platforms you need, or if you want to, you can build a cluster for artificial intelligence. You just configure it as required. 

What's my experience with pricing, setup cost, and licensing?

The pricing of the product is very reasonable. The fact that it is on the cloud makes it a less expensive option. Other solutions that are on-premises are quite expensive.

What other advice do I have?

We are customers and end-users. 

Databricks is on the cloud and therefore we're always on the latest version of the solution. It's constantly updated for us so that we have access to the latest updates and upgrades.

I'd rate the solution at a nine out of ten. The capability of the product is quite good and we are very satisfied with it overall. 

I'd recommend the solution to other companies and organizations.

Which deployment model are you using for this solution?

Public Cloud

Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
RichardXu - PeerSpot reviewer
Data Science Lead at a mining and metals company with 10,001+ employees
Real User
Top 20
Scalable and reliable, with helpful support
Pros and Cons
  • "It can send out large data amounts."
  • "It's not easy to use, and they need a better UI."

What is our primary use case?

We use this solution to build skill and text classification models.

What is most valuable?

The scalability brings value to this solution.

It can send out large data amounts.

What needs improvement?

The user experience can be improved. 

It's not easy to use, and they need a better UI.

For how long have I used the solution?

I have been dealing with Databricks for more than five years.

We last used this solution five months ago, and we used the most current version at that time.

What do I think about the stability of the solution?

This solution is quite stable. We have not had any issues with stability.

What do I think about the scalability of the solution?

It's a scalable solution. Very few people are using this solution in our organization. Most don't have the skill.

How are customer service and technical support?

We were using the free version which did not have a lot of support.

We didn't really need support at the time. I had one conversation with them and they were very nice. They were helpful.

Which solution did I use previously and why did I switch?

We are using Dataiku for one project and also SageMaker. We have some issues with scalability using SageMaker, which is why we may be going back to Databricks.

SageMaker is a very specific AI tool.

How was the initial setup?

The initial setup was okay.

What's my experience with pricing, setup cost, and licensing?

There are many different versions. 

We used the trial version, which was free.

What other advice do I have?

If you have a lot of data, Databricks is a good choice. 

With the integration of Microsoft and Databricks, they make it easy. It's the direction to go in.

It's a very good tool. I would rate Databricks a nine out of ten. 

Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
PeerSpot user
Global Data Architecture and Data Science Director at FH
Real User
ExpertModerator
Flexible with support for several programming languages, good visualization and workload management functionality
Pros and Cons
  • "Databricks gives you the flexibility of using several programming languages independently or in combination to build models."
  • "Databricks requires writing code in Python or SQL, so if you're a good programmer then you can use Databricks."

What is our primary use case?

The primary use is for data management and managing workloads of data pipelines.

Databricks can also be used for data visualization, as well as to implement machine learning models. Machine learning development can be done using R, Python, and Spark programming.

What is most valuable?

Databricks gives you the flexibility of using several programming languages independently or in combination to build models.

The quick visualization of the data is very good.

The workload management functionality works well.

What needs improvement?

Databricks requires writing code in Python or SQL, so if you're a good programmer then you can use Databricks.

For how long have I used the solution?

I have been using Databricks since 2017. I am no longer using it personally, although my team is, and will continue to do so in the future.

What do I think about the stability of the solution?

Databricks is quite popular these days and it appears to be stable. I have not found any issues with stability.

What do I think about the scalability of the solution?

Databricks is scalable, regardless of which cloud provider is being used. It is supported on Microsoft Azure, AWS, and they have their own cloud as well.

For a small workload, Databricks may not be worth the costs. However, for larger workloads, Databricks is a very good solution.

In my previous organization, there were between 10 and 15 users.

How are customer service and technical support?

The technical support is handled by Microsoft partners and because we had premium support, it was easy to get. That said, I did not require any support.

Which solution did I use previously and why did I switch?

I have not used tools that are similar to Databricks for workload management, but Azure ADFv2, Google BigQuery, and SAS are some of the most powerful tools in this space that I have used in the past. I have also heard of Dataiku and other tools but I have not used them. The only things that I have used are tools written in Python or scripting languages.

How was the initial setup?

There is no installation required.

What's my experience with pricing, setup cost, and licensing?

Databricks uses a pay-per-use model, where you can use as much compute as you need. I think that the cost can be reduced, given that there are more users on the platform, although it is not as expensive as some other solutions like SAS.

What other advice do I have?

As we transition to the Azure cloud, I expect that we will be using Databricks for workloads.

This is a product that I recommend for those who want to scale and have a good budget. It is good for automating a data pipeline and managing workloads. My advice for anybody who is starting to use it is to take the proper training.

Overall, based on my uses, I think that this product is pretty good.

I would rate this solution an eight out of ten.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
reviewer1438992 - PeerSpot reviewer
Cloud & Infra Security, Group Manager at a tech vendor with 10,001+ employees
MSP
A scalable solution to quickly process and analyze streams of information
Pros and Cons
  • "Databricks helps crunch petabytes of data in a very short period of time."
  • "Costs can quickly add up if you don't plan for it."

What is our primary use case?

We are working with Databricks and SMLS in the financial sector for big data and analytics. There are a number of business cases for analysis related to debt there. Several clients are working with it, analyzing data collected over a period of time and planning the next steps in multiple business divisions.

My organization is a professional consulting service. We provide services for the other organizations, which implement and use them in a production environment. We manage, implement, and upgrade those services, but we don't use them.

What is most valuable?

Databricks helps crunch petabytes of data in a very short period of time for data scientists or business analysts. It helps with fraud analysis, finance, projections, etc. I like it.

This is exactly the purpose of big data and analytics. It provides the mechanism to process and analyze a stream of information. It's best for share analysis and stream analysis.

What needs improvement?

Costs can quickly add up if you don't plan for it. 

For how long have I used the solution?

I've been using Databricks for just over a year.

What do I think about the stability of the solution?

Databricks is stable. It also helps that their support is included as part of the service.

What do I think about the scalability of the solution?

Databricks is scalable. The only issue is how much money you have for it. For example, if you need to run 100 servers, each with eight cores and 256 gigabytes of RAM, you run out of money easily. It's charged to your credit card or your account, and you'll have to pay for it if you don't plan for that in advance.
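
A back-of-the-envelope estimate makes this kind of planning concrete. The sketch below simply multiplies node count, running hours, and hourly rates. The rates shown are placeholder numbers, not Databricks or cloud-provider prices, and `estimate_cluster_cost` is a hypothetical helper written for this example.

```python
def estimate_cluster_cost(nodes: int, hours: float,
                          vm_rate_per_hour: float,
                          platform_rate_per_hour: float = 0.0) -> float:
    """Return the estimated cost of running `nodes` machines for `hours`.

    Both rates are per node per hour and must come from your own price sheet.
    """
    if nodes < 0 or hours < 0:
        raise ValueError("nodes and hours must be non-negative")
    return nodes * hours * (vm_rate_per_hour + platform_rate_per_hour)


# Placeholder rates for illustration only: 100 nodes, 8 hours a day, 22 working days.
monthly = estimate_cluster_cost(nodes=100, hours=8 * 22,
                                vm_rate_per_hour=1.00,
                                platform_rate_per_hour=0.50)
print(f"Estimated monthly cost: {monthly:,.2f}")
```

Running the same calculation with a cluster that shuts down when idle (fewer hours) shows quickly how much of the bill comes from machines left running.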

How are customer service and technical support?

Databricks technical support is excellent. They provided their responses on time, and they're useful. However, I don't have extensive experience with them.

Which solution did I use previously and why did I switch?

I have used different Microsoft solutions before.

How was the initial setup?

The initial setup depends on the readiness of the team working with Databricks. There is no one template saying that it's easy, and it isn't easy. It can be complex to set up if you don't have a really good plan.

You can get in this environment at least for a test. You can do it in the lab, follow it step by step, and that'll take about an hour. The difficulty depends on the business requirements. 

If it's for training purposes, you can do it in about half an hour, and you're good to go. If you need it to support a business, it will be much more rigorous because multiple divisions would be interested in running their own environment, working with their data.

What's my experience with pricing, setup cost, and licensing?

The price is okay. It's competitive. 

What other advice do I have?

If you're thinking of implementing Databricks, I would recommend working with professionals. It'll help you save time. Also, plan the work and work the plan. Otherwise, it'll be a waste of time and money.

On a scale from one to ten, I would give Databricks a nine.

Which deployment model are you using for this solution?

Public Cloud
Disclosure: My company has a business relationship with this vendor other than being a customer: Partner
PeerSpot user
reviewer1334334 - PeerSpot reviewer
Data Scientist at a retailer with 5,001-10,000 employees
Real User
Quick development, reliable, has interactive clusters, and is priced per usage
Pros and Cons
  • "One of the features provides nice interactive clusters, or compute instances that you don't really need to manage often."
  • "I would like to see more documentation in terms of how an end-user could use it, and users like me can easily try it and implement use cases."

What is our primary use case?

Currently, I am using this solution for a forecasting project.

What is most valuable?

One of the features provides nice interactive clusters, or compute instances that you don't really need to manage often. You can just spin it off and use that for a lot of your pre-processing, which is very convenient. 

The normal features are very good in terms of doing some quick development or doing some EDA.

Also, one of the newest features brought into this solution provides you with a way to solve, deploy, and train models using the platform itself. Or, it can connect to your Azure Machine Learning in order to train, deploy, and productionalize some of the machine learning models.

What needs improvement?

Since the Databricks community is not that old, there is not a lot of information about some of the issues that we face. We have to go back to the Databricks stream to get some of the issue resolutions from there. 

As time passes and more people start putting more information out there about this technology, it will be helpful.

I think even with the features that we currently have, they're still optimizing some of the clusters and trying to parallelize to better read from other types of data. So, that's going really well in terms of one of the features that they recently came up with to include the data format for data, which was really good, and that speeds up a lot of the processes.

I would like to see more documentation in terms of how an end-user could use it, and users like me can easily try it and implement use cases.

For how long have I used the solution?

I have been using Databricks on a daily basis for over a year.

It's deployed on the cloud, so it's always up to date.

What do I think about the stability of the solution?

It's definitely quite stable, in terms of an enterprise solution. 

I'd say that it's pretty stable. 

You have these clusters running on-demand, and you can also come up with these clusters that are scheduled, and that can be run for your production jobs.

What's my experience with pricing, setup cost, and licensing?

The pricing depends on the usage itself. They measure the cost of the companies in town. It also depends on the type of cluster that you are using. If you are using a very heavy cluster, it would be the price per CPU.

What other advice do I have?

I would rate Databricks an eight out of ten.

Which deployment model are you using for this solution?

Private Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user