Business Intelligence Coordinator Latam at a construction company with 5,001-10,000 employees
Real User
The capacity of use of the different types of coding is valuable
Pros and Cons
  • "The ability to use different types of code is valuable. Databricks also has good performance because it runs on Spark, so both the performance and the capacity support different kinds of code."
  • "There would also be benefits if more options were available for workers, or the clusters of the two points."

What is our primary use case?

My company is a customer of Databricks. We use Data Science products for machine learning, engineering, and data preparation.

We have between five and eight people working on coding in Databricks. Indirectly, we have 1,500 people consuming the data. We plan to increase our usage of Databricks by 30% next year.

What is most valuable?

The ability to use different types of code is valuable. Databricks also has good performance because it runs on Spark, so both the performance and the capacity support different kinds of code.

What needs improvement?

Databricks does not always communicate updates clearly. Often we find an update in the tool but we are not really sure what has changed. We would appreciate better communication from Databricks, for example a friendly notice that describes the updates.

There would also be benefits if more options were available for workers, or the clusters of the two points.

For how long have I used the solution?

I have been using Databricks for two years.

Buyer's Guide
Databricks
July 2025
Learn what your peers think about Databricks. Get advice and tips from experienced pros sharing their opinions. Updated: July 2025.
864,155 professionals have used our research since 2012.

What do I think about the stability of the solution?

Databricks is stable; however, we do encounter some errors and don't always understand what has happened. Usually, they are resolved within a few minutes. I would say it is 95% stable.

What do I think about the scalability of the solution?

Scalability is really good.

How are customer service and support?

I have not had to contact Databricks support other than during the deployment, when they helped a lot.

How was the initial setup?

The initial setup of Databricks is straightforward and simple. It is not complex because they provide a lot of documentation. The deployment was fast: it took less than three days with five people assigned to the task.

What about the implementation team?

We implemented in-house. It is difficult to find a good consultant or reseller for Databricks in Brazil.

What's my experience with pricing, setup cost, and licensing?

We pay monthly on a pay-as-you-go plan.

What other advice do I have?

With Databricks, you may have a lot of devices. It is important to dedicate a cluster to each kind of process and to avoid the small clusters. With a bigger cluster you will get better performance, the usage time is shorter, and that will save you money.

It is important to write your code in parts, because if you write it all as one block you could run into performance problems.

I would rate Databricks a 9 out of 10.

Which deployment model are you using for this solution?

Public Cloud
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
reviewer1708788 - PeerSpot reviewer
Technical Architect at a tech services company with 10,001+ employees
Real User
Facilitates robust solutions through collaboration but non-SQL users may struggle
Pros and Cons
  • "I like the ability to use workspaces with other colleagues because you can work together even without seeing the other team's job."
  • "Anyone who doesn't know SQL may find the product difficult to work with."

What is most valuable?

I like the ability to use workspaces with other colleagues because you can work together even without seeing the other team's job. So you can create a robust solution by working together with other professionals.

What needs improvement?

One area for improvement would be that anyone who doesn't know SQL may find the product difficult to work with. It would also be useful to have a remote support team inside Databricks, which would collect and analyze user feedback.

For how long have I used the solution?

I have been using Databricks since 2018.

How are customer service and support?

I had a little trouble with customer support but this was solved.

How was the initial setup?

The initial setup was a little complex because it was a new architecture for the customer, so there was nothing to compare it to in order to accelerate the project. This meant the deployment of the first project using Databricks took almost nine months and the second took almost a year.

Which deployment model are you using for this solution?

Hybrid Cloud
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
it_user1526169 - PeerSpot reviewer
Advanced Analytics Lead at a pharma/biotech company with 1,001-5,000 employees
Real User
Better tailored code and automation capabilities needed, but easy to use
Pros and Cons
  • "The solution is easy to use and has a quick start-up time due to being on the cloud."
  • "The solution could improve by providing better automation capabilities. For example, working together with more of a DevOps approach, such as continuous integration."

What is our primary use case?

Databricks can be used for large-scale data pre-processing and data transformations.

What is most valuable?

The solution is easy to use and has a quick start-up time due to being on the cloud.

What needs improvement?

The solution could improve by providing better automation capabilities. For example, working together with more of a DevOps approach, such as continuous integration. There is a lot of code from places, such as GitHub, but it is not tailored for Databricks. It requires a lot of effort to bring the code to a level where it can be used with Databricks capabilities.

For how long have I used the solution?

I have been using Databricks for two months.

What do I think about the stability of the solution?

The solution is stable.

What do I think about the scalability of the solution?

Databricks is scalable.

How are customer service and technical support?

We did not have a need to use technical support.

How was the initial setup?

The installation is straightforward, and it took approximately one hour.

What about the implementation team?

We did the implementation and maintenance of the solution ourselves using approximately three engineers.

What's my experience with pricing, setup cost, and licensing?

The solution requires a subscription.

Which other solutions did I evaluate?

We are evaluating other solutions.

What other advice do I have?

I would recommend this solution for those wanting to process large data sets, but if it is to be used for smaller data sets, I would not recommend it.

I rate Databricks a five out of ten.

Which deployment model are you using for this solution?

Public Cloud
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Head of Data & Analytics at a tech services company with 11-50 employees
Real User
Helpful integration with Python and notebooks, but it should be more user-friendly and less complicated to use
Pros and Cons
  • "The integration with Python and the notebooks really helps."
  • "Databricks is not geared towards the end-user, but rather it is for data engineers or data scientists."

What is our primary use case?

We are a consulting house and we employ solutions based on our customers' needs. We don't generally use products internally.

I am a certified data engineer from Microsoft and have worked on the Azure platform, which is why I have experience with Databricks. Now that Microsoft has launched Synapse, I think that there will be more use cases.

What is most valuable?

You can spin up an Azure Databricks cluster, and integrating with it is seamless.

The integration with Python and the notebooks really helps.

What needs improvement?

There is definitely room for improvement.

This is the type of solution where you need people with technical expertise to use it. Other products are self-service and can be used by end-users. Databricks is not geared towards the end-user, but rather it is for data engineers or data scientists. I'm not sure whether Databricks is working towards that or not.

It would be nice if it were more user-friendly, where you don't have to rely on Power BI or a visualization tool. I know that there is integration in the notebook where you can do it, but still, the relationships and semantics make it more difficult. It would be better to do it right in Databricks. You could put them within the portal and I don't have to log out and bring that into Power BI and then visualize.

What do I think about the stability of the solution?

We have not done any major implementation yet, although I think it's stable to an extent. I can't comment on it in terms of benchmark and experiencing any issues. It works seamlessly in the places where I've used it.

What do I think about the scalability of the solution?

Our implementations have been small and we haven't needed to scale as of yet. 

Databricks can help you to build a data lake, and it's something that they need to help make more popular. People are slowly understanding it because if you look, there are lots of data lakes that people are trying to create. I'm not intimate with it, but the concept seems complicated. I think they need to write up something where videos can explain it better. What I have seen on YouTube is quite complicated for an end-user to understand.

How was the initial setup?

The initial setup is easy. It's not difficult when you are used to Azure.

What's my experience with pricing, setup cost, and licensing?

I am based in South Africa, where adopting the cloud is expensive, and then there is the price of the tool itself.

The cost is difficult to estimate. I've got customers who went to the cloud and then they realized that the costs were more, compared to what they used to be on-premises. Also, because our exchange rate is so weak, I would always advocate that prices being lower is better, although I don't know how feasible it is.

What other advice do I have?

From a purely technical perspective, I would rate Databricks an eight out of ten. However, there is a failure in terms of user adoption. Having looked at other products, including Synapse, I find those are better in that respect. I still feel that Databricks is quite complicated for the average person.

I would rate this solution a five out of ten.

Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
reviewer1269582 - PeerSpot reviewer
Data Architect at a tech services company with 201-500 employees
Real User
A reliable solution for processing and transforming data
Pros and Cons
  • "The fast data loading process and data storage capabilities are great."
  • "There are no direct connectors — they are very limited."

What is our primary use case?

We specialize in project consulting for our clients. Whenever we get the opportunity, we recommend Databricks to them.

What is most valuable?

The fast data loading process and data storage capabilities are great.

Based on the data loads and the performance, you can easily scale up the clusters.

What needs improvement?

Sometimes we experience issues connecting our database to Databricks. There are no direct connectors — they are very limited. This should be addressed and corrected in the next release.  

Reading past data can also be tricky as there is no data spectrum like you would find with Snowflake and other solutions. 

For how long have I used the solution?

We have been using Databricks for one and a half years.

What do I think about the scalability of the solution?

Both the scalability and the stability of Databricks are good.

How are customer service and technical support?

Technical support is good but I have not interacted with them directly. We have a point of contact. We used to interact with tech support on a regular basis and they would respond quickly. We would get a response on the same day based on the priority level. Keep in mind, my company is in a partnership with them which could be a factor in their quick response time.


How was the initial setup?

The initial setup was not very complex. We had it up and running in no time; it's a quick process.

What about the implementation team?

We have just one solution architect and one data architect who handle all maintenance-related issues. 

What other advice do I have?

I would recommend purchasing a package that includes technical support. Compared to other companies, they offer great support to their clients.

On a scale from one to ten, I would give Databricks a rating of eight.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: My company has a business relationship with this vendor other than being a customer. Partner
PeerSpot user
Pre-sale Leader, Big Data Enterprise Solutions at Ness Technologies
Consultant
Easy to load and query data with SQL support, but it is difficult to deploy and the interface could be improved
Pros and Cons
  • "The most valuable feature is the ability to use SQL directly with Databricks."
  • "I have seen better user interfaces, so that is something that can be improved."

What is our primary use case?

My division works with Big Data and Data Science, and Databricks is one of the tools for Big Data that we work with. We are partners with Microsoft and we began working with this solution for one specific project in the financial industry.

What is most valuable?

The most valuable feature is the ability to use SQL directly with Databricks. That is the most relevant thing for my current project.

After deployment, it is easy to load files and query data.
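
As a rough illustration of the load-then-query workflow described above, here is a minimal PySpark-style sketch. It assumes a `spark` session object is passed in (Databricks notebooks provide one), and the view name `transactions`, the file path, and the function name `load_and_query` are hypothetical, chosen only for this example. It is not the reviewer's actual code.

```python
def load_and_query(spark, csv_path):
    """Load a CSV file, expose it as a temporary view, and query it with SQL.

    `spark` is expected to behave like a pyspark.sql.SparkSession.
    The view name "transactions" is a hypothetical placeholder.
    """
    # Read the file, treating the first row as column names.
    df = spark.read.option("header", True).csv(csv_path)

    # Register the DataFrame so it can be referenced from SQL.
    df.createOrReplaceTempView("transactions")

    # Run plain SQL against the view; the result is another DataFrame.
    return spark.sql("SELECT COUNT(*) AS n FROM transactions")
```

In a notebook, something like `load_and_query(spark, "/path/to/file.csv").show()` would display the row count, where the path is again a placeholder.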

What needs improvement?

I have seen better user interfaces, so that is something that can be improved.

It was quite hard to deploy.

For how long have I used the solution?

I have been using Databricks for about one year.

What do I think about the stability of the solution?

We have not found any bugs yet, although it is only the beginning of our work. I do not have enough information to say for sure.

What do I think about the scalability of the solution?

We have about 200 employees but it is only a small group using Databricks. We are at the beginning so scaling is not something we have had to do.

How are customer service and technical support?

We have not had to contact technical support because we are Microsoft partners and I can call a colleague of mine who helps me directly.

Which solution did I use previously and why did I switch?

I have used Snowflake and one of the differences is that Snowflake is much easier to deploy.

How was the initial setup?

The first deployment is difficult. It is not straightforward and you have to think about a lot of stuff. It is not really like a SaaS deployment and there are a lot of steps that you have to take.

What about the implementation team?

We have our own team, which includes colleagues from Microsoft. Because the current project is for a large client, they would like to see this project succeed.

What's my experience with pricing, setup cost, and licensing?

We find Databricks to be very expensive, although this improved when we found out how to shut it down at night.

What other advice do I have?

Our client is a bank and some of the information can be shared outside of the organization, whereas some of the data is confidential and private. Using a purely on-premises solution would have made it more difficult to share information with the outside, which is one of the reasons that they wanted a cloud-based deployment. 

My advice for anybody who is considering this solution is that it is very good for unstructured or semi-structured data. If, however, you have structured data then I would recommend a columnar database like Snowflake or Vertica. These solutions are easier to deploy.

This is a good solution that is working well, but I don't think that it is really a SaaS.

I would rate this solution a seven out of ten.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: My company has a business relationship with this vendor other than being a customer. Implementer
PeerSpot user
Tristan Bergh - PeerSpot reviewer
Data Scientist at a computer software company with 501-1,000 employees
Real User
Top 10
Good built-in optimization, easy to use with a great user interface
Pros and Cons
  • "The built-in optimization recommendations halved query run times and allowed us to reach decision points and deliver insights very quickly."
  • "The product could be improved by offering an expansion of their visualization capabilities, which currently assists in development in their notebook environment."

What is our primary use case?

We are using this solution to run large analytics queries and prepare datasets for SparkML and ML using PySpark.

We ran on multiple clusters set up for a minimum of three and a maximum of nine nodes having 16GB RAM each.

For one ad hoc requirement, a 32-node cluster was required.

Databricks clusters were set for autoscaling and to time out after forty minutes of inactivity. Multiple users attached their notebooks to a cluster. When some workloads required different libraries, a dedicated cluster was spun up for that user.
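
The cluster settings described above could be expressed roughly as the following cluster specification, written here as a Python dictionary that serializes to the JSON shape used by the Databricks Clusters API. This is a sketch under assumptions: the review's "nodes" are read as worker nodes, and the cluster name, runtime version, and node type are hypothetical placeholders rather than values from the review.

```python
import json

# Autoscaling range and idle timeout come from the review text;
# every other value below is a hypothetical placeholder.
cluster_spec = {
    "cluster_name": "shared-analytics",           # hypothetical name
    "spark_version": "<runtime-version>",         # placeholder, fill in for your workspace
    "node_type_id": "<node-type-with-16GB-RAM>",  # placeholder for a 16GB node type
    "autoscale": {
        "min_workers": 3,  # minimum of three nodes
        "max_workers": 9,  # maximum of nine nodes
    },
    "autotermination_minutes": 40,  # shut down after forty idle minutes
}

# Serialize to JSON, e.g. to paste into a cluster-creation request body.
print(json.dumps(cluster_spec, indent=2))
```

The idle timeout matters for cost as well as convenience, since a cluster that terminates itself stops accruing charges while nobody is using it.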

How has it helped my organization?

Databricks took care of all the underlying cluster management seamlessly. We could configure our clusters to run and deliver results without any delays due to hardware configuration or installation issues.

Databricks allowed us to go from non-existent insights (because the datasets were just too large) to immediate and rich insights once the datasets were ingested into our PySpark notebooks.

What is most valuable?

Immense ease in running very large scale analytics, with a convenient and slick UI. This saved us from having to tweak, tune, dive into deeper abstractions, get involved in procurement, and also having to wait for other workloads to run.

The built-in optimization recommendations halved query run times and allowed us to reach decision points and deliver insights very quickly.

The Delta data format proved excellent. Databricks had already done the heavy lifting and optimized the format for large scale interactive querying. They saved us a lot of time.

What needs improvement?

The product could be improved by offering an expansion of their visualization capabilities, which currently assists in development in their notebook environment. Perhaps a few connectors that auto-deploy to a reporting server?

More parallelized Machine Learning libraries would be excellent for predictive analytics algorithms.

For how long have I used the solution?

I have been using this solution for three years.

What do I think about the stability of the solution?

This solution is stable and proved very robust. When very obvious programmatic recommendations were not followed, causing memory overruns on a driver, the clusters required restarting.

What do I think about the scalability of the solution?

Absolutely, seamlessly, and massively scalable, within only budgetary limits. Also, the product itself offers real-time efficiency and optimization recommendations. 

How are customer service and support?

So brilliant, it was never required. Their documentation is comprehensive, clear, simple, and thorough. 

Which solution did I use previously and why did I switch?

Previously I used Hive and Livy in Zeppelin on an in-house Hadoop installation. The queries constantly threw exceptions and timeouts and the necessary configuration changes proved time-consuming and problematic. Databricks, on the other hand, simply made all those problems vanish. 

How was the initial setup?

Setup and Support are single-click.

What about the implementation team?

We used an in-house team for implementation.

What was our ROI?

Our ROI was on the order of USD 75k per year for one deployment. We were able to switch our workloads from an onsite Hadoop cluster, billed to our department at more than USD 100k per year, to a Databricks workspace in the cloud for a quarter of that expenditure.
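
The arithmetic behind that figure can be checked with a short sketch. It treats the on-site cost as exactly USD 100k per year, which is only the lower bound the review gives, so the real saving would be at least this much. The function name `annual_saving` is made up for this example.

```python
def annual_saving(on_prem_cost, cloud_fraction):
    """Return the yearly saving when the cloud bill is a fraction of the on-prem bill."""
    cloud_cost = on_prem_cost * cloud_fraction
    return on_prem_cost - cloud_cost

# Assumption: on-site Hadoop cost taken as exactly USD 100k per year
# (the review says "more than"); the cloud bill is a quarter of that.
saving = annual_saving(100_000, 0.25)
print(f"USD {saving:,.0f} per year")
```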

Further, we were able to transparently and efficiently scale our queries to run in under fifteen minutes per major analytics use case, whereas on the in-house Hadoop cluster we had been subject to unstable queries and highly brittle data flows.

We are further reducing spending on our traditional RDBMS solution by offloading reporting workloads to the Databricks PySpark notebooks, which is reducing our expensive datacenter resources and freeing up RDBMS resources for OLTP loads. 

What's my experience with pricing, setup cost, and licensing?

Set up a cluster in your cloud of choice, but Databricks' service might also be very competitive as their pricing units will be built in. 

I would counsel against licensing on site, as on-site hardware issues tend to delay and slow down delivery.

Which other solutions did I evaluate?

I evaluated Hortonworks, Livy, and Zeppelin. These were unsuitable due to the unavailability of sufficiently skilled personnel.

What other advice do I have?

By investing in people skilled in data querying, Python coding, and even basic Data Science, a Databricks setup will reward the business. 

Once the Databricks data flows are established, it is a matter of a few incremental steps to opening up streaming and running up-to-the-minute queries, allowing the business to build its data-driven processes. 

Databricks continues to advance the state-of-the-art and will be my go-to choice for mission-critical PySpark and ML workflows. 

Which deployment model are you using for this solution?

Public Cloud
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
reviewer1270416 - PeerSpot reviewer
Vice President, Business Intelligence and Analytics at a tech services company with 10,001+ employees
Real User
Stable cloud platform for data engineering and has a straightforward setup
Pros and Cons
  • "I haven't heard about any major stability issues. At this time I feel like it's stable."
  • "Pricing is one of the things that could be improved."

What is our primary use case?

We are still exploring the solution. For our clients, it works much better than the star schema models that they are trying to replace. We bring in Databricks and then see how they can leverage the additional analytical functionalities around the Databricks cloud. Our use is more exploratory. We recommend Databricks, especially with the Azure cloud frameworks.

What needs improvement?

Pricing is one of the things that could be improved.

Also, there could be improvement in the visual analytics space and in the machine learning functions. I haven't explored these, so I don't know which functions and features are there. If they are not there, then I think that's something they should consider including.

For how long have I used the solution?

My team has been exploring Databricks for close to five or six months.

What do I think about the stability of the solution?

I haven't heard about any major stability issues. At this time I feel like it's stable.

What do I think about the scalability of the solution?

In terms of scalability, I think once we put it across for larger use-cases the scalability question will really arise. So we'll need detailed information. I assume that we will be able to scale up.

I think we do not have more than 10 people working on it now. Because we are in the earlier stages of implementation, it's more like a POC now. I really don't know whether it has been opened up to a larger audience yet.

How was the initial setup?

The initial setup was straightforward.

What about the implementation team?

It is better to install it with the help of integrators, consultants, or an experienced team.

What other advice do I have?

It's more data scientists using Databricks. I would call them power users trying to see how they can get a hand on it, though they are not data scientists. They try to understand it a little bit better for their future use.

On a scale of one to ten, I would rate it an eight, easy. 

Which deployment model are you using for this solution?

Public Cloud
Disclosure: My company has a business relationship with this vendor other than being a customer. Partner
PeerSpot user
Buyer's Guide
Download our free Databricks Report and get advice and tips from experienced pros sharing their opinions.
Updated: July 2025