Data Science Consultant at Syniti
Consultant
Good performance, easy to set up, and easy to use if you have a Python background
Pros and Cons
  • "I work in the data science field and I found Databricks to be very useful."
  • "It would be very helpful if Databricks could integrate with platforms in addition to Azure."

What is our primary use case?

We are building internal tools and custom models for predictive analysis. We are currently building a platform where we can integrate multiple data sources, such as data that is coming from Azure, AWS, or any SQL database. We integrate the data and run our models on top of that.

We primarily use Databricks for data processing and for SQL databases.

What is most valuable?

I found that PySpark is the most useful tool. It uses in-memory computation, so when you want to run a model it runs very quickly. We used to use plain Python, and when we migrated to PySpark the performance was much better.

What needs improvement?

It would be very helpful if Databricks could integrate with platforms in addition to Azure.

Having an open-source version or having the option to get a trial version of Databricks would be very helpful.

It would be very useful for beginners if there were tutorials and examples on how to write code for PySpark, R, or Scala. Having examples would give people something to refer to and play with.

For how long have I used the solution?

We have been using Databricks for the past two or three years.

Buyer's Guide
Databricks
June 2025
Learn what your peers think about Databricks. Get advice and tips from experienced pros sharing their opinions. Updated: June 2025.
856,873 professionals have used our research since 2012.

What do I think about the stability of the solution?

A couple of times I faced an issue where a long-running process was consuming a lot of time and then stopped abruptly. It necessitated starting the process again.

What do I think about the scalability of the solution?

We are in the prototyping stage so we do not plan on increasing our usage yet.

How are customer service and support?

We have not been in contact with technical support.

Which solution did I use previously and why did I switch?

Before using Databricks, we were running our own cluster with a web server that executed our Python queries.

How was the initial setup?

The initial setup is straightforward. With respect to deployment, the development can be done within half an hour and we can use code and deploy from there.

What about the implementation team?

We implemented Databricks on our own. We haven't deployed as such, as we are just running our queries and it is not in production yet.

What other advice do I have?

I work in the data science field and I found Databricks to be very useful. If I want to run any models then I can code them in PySpark. If you are coming from a Python background then you can write code in PySpark and it runs quickly. This is a good solution in terms of performance. 

I would rate this solution a nine out of ten.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: My company has a business relationship with this vendor other than being a customer: Partner
PeerSpot user
it_user1235523 - PeerSpot reviewer
Machine Learning Engineer at a tech vendor with 51-200 employees
Real User
A convenient notebook, good stability, and a straightforward setup
Pros and Cons
  • "The most valuable aspect of the solution is its notebook. It's quite convenient to use, both in terms of research and development and in final deployment. I can just declare the Spark jobs by the load tables. It's quite convenient."
  • "The solution could be improved by integrating it with data packets. Right now, the load tables provide a function, like team collaboration. Still, it's unclear whether there's a function to create different branches and/or more branches. Our team had used data packets before; however, I feel it's difficult to integrate the current with the previous data packets."

What is our primary use case?

We primarily use the solution to run current jobs; to run the Spark jobs as the current job.

What is most valuable?

The most valuable aspect of the solution is its notebook. It's quite convenient to use, both in terms of research and development and in final deployment. I can just declare the Spark jobs by the load tables. It's quite convenient.

What needs improvement?

The solution could be improved by integrating it with data packets. Right now, the load tables provide a function, like team collaboration. Still, it's unclear whether there's a function to create different branches and/or more branches. Our team had used data packets before; however, I feel it's difficult to integrate the current with the previous data packets.

The support could be improved a bit around the database. When we stream it to Data Lake, some data cannot be loaded. It should be a priority to fix this.

For how long have I used the solution?

I've been using the solution for half a year.

What do I think about the stability of the solution?

The solution is stable.

What do I think about the scalability of the solution?

The solution is scalable. However, it still needs us to manually set the number of nodes in a cluster, which really depends on the application. Sometimes, when the tasks are bigger, it gets a little difficult for us to define the number of nodes in a cluster. If the solution could allow users to set up the clusters, I think that would be good.
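For illustration, a minimal sketch of a cluster specification that requests a range of workers rather than a fixed node count. The field names (`autoscale`, `min_workers`, `max_workers`, `num_workers`) follow the Databricks Clusters API as commonly documented, but should be checked against the current documentation; the helper `build_cluster_spec`, the cluster name, and the placeholder runtime and node type are assumptions, not something the reviewer described.

```python
import json


def build_cluster_spec(name, min_workers, max_workers):
    """Return a cluster spec that asks for autoscaling instead of a fixed size."""
    if not 0 < min_workers <= max_workers:
        raise ValueError("need 0 < min_workers <= max_workers")
    return {
        "cluster_name": name,
        # Placeholders: choose a runtime and node type available in your workspace.
        "spark_version": "<runtime-version>",
        "node_type_id": "<node-type>",
        # With "autoscale", the worker count may vary between the two bounds;
        # a fixed-size cluster would set "num_workers" instead.
        "autoscale": {"min_workers": min_workers, "max_workers": max_workers},
    }


spec = build_cluster_spec("prediction-jobs", 2, 8)
print(json.dumps(spec, indent=2))
```

Whether autoscaling suits a given workload still depends on the application, as the reviewer notes; the bounds only limit how far the cluster may grow or shrink.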

Currently, we have three people using the solution. We may increase usage in the future.

How are customer service and technical support?

The technical support is quite good. In the beginning, when we had a few POC projects, they were very supportive.

Which solution did I use previously and why did I switch?

We didn't previously use a different solution; instead, we built our own from scratch. This is the first unified platform that we've used.

How was the initial setup?

The initial setup is very straightforward. We just use their job functions. Deploying as a Spark job is quite straightforward.

In our use case, we also had some external databases to handle in the deployment. For example, we only generated some prediction results and saved them into an external database. The solution takes time to deploy to the external database, but the Spark job is quite easy.

What other advice do I have?

I'm a software development engineer. I'm working with the latest version.

As long as the developers have an understanding of Spark and of the technical tricks, it's very fast in terms of using the database.

I'd rate the solution eight out of ten.

Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
it_user1149000 - PeerSpot reviewer
Data Science Developer at a tech services company with 501-1,000 employees
Real User
Good performance and support for big data, built-in machine learning libraries are powerful
Pros and Cons
  • "Databricks is based on a Spark cluster and it is fast. Performance-wise, it is great."
  • "It should have more compatible and more advanced visualization and machine learning libraries."

What is our primary use case?

We use this solution for streaming analytics. We use machine learning functions that output to the API and work directly with the database.

How has it helped my organization?

Prior to using Azure Databricks in the cloud, we had Databricks installed in clusters. Since our implementation, the performance has increased and our cost has been reduced.

What is most valuable?

Databricks is based on a Spark cluster and it is fast. Performance-wise, it is great.

This solution has very good machine learning libraries built-in.

The support for big data is good.

What needs improvement?

Databricks should have more libraries for predictive analysis and machine learning.

It should have more compatible and more advanced visualization and machine learning libraries. As it is now, I have to try a custom algorithm in order for things to be compatible.

I would like to see more deep learning analytics.

For how long have I used the solution?

I have been using Databricks for about one year.

What do I think about the stability of the solution?

This is a cluster-based solution, so it is stable.

What do I think about the scalability of the solution?

We started using Databricks with a small PoC application, and then we developed it into a larger one. It's scalable, and it's a simple process to scale.

We have eight people in our team who are using this solution. We do not plan to increase usage at this time.

How are customer service and technical support?

I did not contact technical support myself, but when one of our team members contacted them they were given good answers. I would say that the support is good.

How was the initial setup?

It is not difficult to deploy this solution because it is well documented. We followed the normal steps that included all of the APIs.

What's my experience with pricing, setup cost, and licensing?

I do not know the exact costs, but one of our clients pays between $100 and $200 USD monthly.

What other advice do I have?

Databricks has been good and I like it. However, it would be improved with the enhancement of the machine learning libraries, and with the inclusion of visualization libraries.

I would rate this solution an eight out of ten.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
it_user1146978 - PeerSpot reviewer
Business Intelligence and Analytics Consultant at a tech services company with 201-500 employees
Consultant
Easy to switch loads between clusters and automation is easy using the API
Pros and Cons
  • "Automation with Databricks is very easy when using the API."
  • "Some of the error messages that we receive are too vague, saying things like "unknown exception", and these should be improved to make it easier for developers to debug problems."

What is our primary use case?

I am a developer and I do a lot of consulting using Databricks.

We have been primarily using this solution for ETL purposes. We also do some migration of on-premises data to the cloud.

What is most valuable?

The most valuable feature is the ability to switch loads between multiple clusters.

Automation with Databricks is very easy when using the API.

The ability to write code and SQL in the same interface is useful.

It is easy to connect notebooks to a cluster.

There are a large number of inbuilt functions that help to make things easier.
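As an illustration of the API-based automation mentioned above, here is a minimal sketch that builds a request to trigger an existing job. The `/api/2.1/jobs/run-now` path and the `job_id` field follow the Databricks Jobs REST API as commonly documented; the host, token, job ID, and the helper name `build_run_now_request` are placeholders and assumptions. The request is only constructed here, not sent.

```python
import json
import urllib.request


def build_run_now_request(host, token, job_id):
    """Build (but do not send) a request that triggers an existing job."""
    body = json.dumps({"job_id": job_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{host.rstrip('/')}/api/2.1/jobs/run-now",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder workspace URL, token, and job ID.
req = build_run_now_request(
    "https://example.cloud.databricks.com", "<personal-access-token>", 123
)
print(req.get_method(), req.full_url)
# To actually trigger the job: urllib.request.urlopen(req)
```

Keeping request construction separate from sending makes the automation easy to check before it touches a real workspace.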

What needs improvement?

Some of the error messages that we receive are too vague, saying things like "unknown exception", and these should be improved to make it easier for developers to debug problems. As it is now, we have to go into the driver logs to identify the error messages properly. 
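One possible mitigation on the developer side, sketched under assumptions: wrapping each pipeline step so that the exception type, message, and stack trace are logged at the point of failure. The names `run_step` and `parse_amount` are hypothetical and used only for this example; this does not change the messages Databricks itself produces.

```python
import logging
import traceback

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("pipeline")


def run_step(name, func, *args, **kwargs):
    """Run one pipeline step; on failure, log the full traceback and re-raise."""
    try:
        return func(*args, **kwargs)
    except Exception as exc:
        # Record the exception type, message, and stack next to the step name,
        # which can make the failure easier to locate in the logs.
        log.error(
            "step %r failed: %s: %s\n%s",
            name, type(exc).__name__, exc, traceback.format_exc(),
        )
        raise


def parse_amount(text):
    # Hypothetical step used only to demonstrate the wrapper.
    return float(text)


print(run_step("parse", parse_amount, "12.5"))
```

Re-raising after logging keeps the job's failure status intact while still leaving a more specific message behind.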

There is not much information about Databricks available online, such as cost. Whenever we want to find the actual costing, we have to send an email to Databricks, so having the information available on the internet would be helpful.

I would like to see integration with Power BI or Tableau for the business users. They may use Databricks to check on things, but it will be a little bit complicated for them. The GUI interfaces for Tableau and Power BI are ones that they are used to, so the integration would help.

For how long have I used the solution?

I have been using Databricks for about five and a half years.

What do I think about the stability of the solution?

We have found that in the development environment, Databricks is pretty stable. We have had problems where something works in development but does not work in production, and this can happen when the version is updated and certain features have been deprecated. This means that more testing is required before moving to production, but this is the only drawback that we have seen.

Basically, when we move across versions we have found issues, but otherwise, it's pretty stable.

What do I think about the scalability of the solution?

Scalability is one of the main features of Databricks. We have used datasets that are one hundred megabytes in size up to one terabyte, and we can manage, so it's easily scalable.

We have a large company with between 400 and 500 people using this solution.

How are customer service and technical support?

We have not reached out for technical support on Databricks.

How was the initial setup?

I found the initial setup easy because I had previously worked on Spark.

If somebody goes through the training, which is available on the website, then it should be straightforward. I don't think that it is very hard.

When it comes to developing things based on use cases, it can take between three days and two weeks, plus two to three days for testing and deploying it. I would say that for an entire use case, it will take a maximum of three weeks.

What other advice do I have?

My advice for developers who are interested in working with this solution is to first go through the Spark architecture.

I would rate this solution a nine out of ten.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
it_user1050483 - PeerSpot reviewer
CEO at Inosense
Real User
Great for dealing with huge amounts of data and it is easy to connect to different sources of data
Pros and Cons
  • "We are completely satisfied with the ease of connecting to different sources of data or pocket files in the search"
  • "The integration features could be more interesting, more involved."

What is our primary use case?

Our primary use case is really DevOps, for integration and continuous development. We've combined our database with some components from Azure to deploy elements in Sandbox for our data scientists and for our data engineers. 

What is most valuable?

Valuable features would have to include the Notebook for piping some models and the feature of executing the notebooks in parallel, in batches, which is also something that we use. We use the Notebook on Spark with Python.
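A minimal sketch of running several notebooks in parallel batches, assuming a thread pool as the mechanism (the reviewer does not say how it is done). `run_notebook` is a hypothetical stand-in so the example runs anywhere; inside a Databricks notebook it could instead wrap `dbutils.notebook.run(path, timeout_seconds, arguments)`. The notebook paths and parameters are made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def run_notebook(path, params):
    # Hypothetical stand-in. Inside a Databricks notebook this could wrap
    # dbutils.notebook.run(path, timeout_seconds, params) instead.
    return f"{path} finished with {params}"


def run_batch(jobs, max_parallel=4):
    """Run (path, params) pairs concurrently and return results in input order."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        futures = [pool.submit(run_notebook, path, params) for path, params in jobs]
        return [future.result() for future in futures]


jobs = [
    ("/models/train", {"region": "eu"}),
    ("/models/train", {"region": "us"}),
]
results = run_batch(jobs)
for line in results:
    print(line)
```

Collecting results from the futures in submission order keeps the output aligned with the input list even when runs finish out of order.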

What needs improvement?

Improvements could include the pricing: the product is a little expensive, although I think it is comparable to other similar options. The integration features could be more interesting, more involved. For example, we use the Databricks Notebook, which is not as good as Jupyter Notebook at providing a great user experience. The look and feel are not the same and we've had complaints from some of our users. They say that it's easier and more productive for them to use Jupyter Notebook.

And then there is the integration feature for connecting to data sources, for example, Jupyter Notebook through publishes connect. The problem is that when you do that, you don't get all the Jupyter features which is a shame for us. 

For additional features, having some PyTorch or TensorFlow type features inside would definitely be great. For now, my users are developing for themselves by importing their libraries into their Notebook and then creating models based on TensorFlow or PyTorch. That requires a lot of imports, particularly library imports, something that is now available in the new version of Machine Learning services. These things are very important because the data science community has shifted from the traditional way of preparing models to deep learning. It's now more common to have those features.

For how long have I used the solution?

I've been using the product inside Azure for about six months now. 

What do I think about the stability of the solution?

Given my experience, the product is very stable. 

What do I think about the scalability of the solution?

The product is quite easy to scale and increasing the number of users is quite simple. 

Which solution did I use previously and why did I switch?

We previously used the earlier version of Azure Machine Learning services and we decided to move over because over time it became more difficult to deploy. That was two years ago, but now with the new version, it's much easier to deploy Machine Learning.

How was the initial setup?

The setup is straightforward, I did it myself. 

What other advice do I have?

The product has improved and I'm sure this will continue in the next versions. We are completely satisfied with it, the ease of connecting to different sources of data or pocket files in the search. 

I think it could be very interesting for users looking for a framework to use Databricks. I would, however, recommend a more complicated architecture for using Databricks and achieving a great result for end-users. 

I would rate this product an eight out of 10. 

Which deployment model are you using for this solution?

Private Cloud
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
reviewer2144922 - PeerSpot reviewer
Data engineer
Real User
A stable solution that can be scaled depending on the project, but the price could be cheaper
Pros and Cons
  • "The setup was straightforward."
  • "The pricing of Databricks could be cheaper."

What is our primary use case?

I primarily use the solution in two conditions: machine learning and big data computing.

What needs improvement?

The pricing of Databricks could be cheaper. The solution can also improve by providing more intelligence to the coder.

For how long have I used the solution?

I have been using Databricks for the past two years.

What do I think about the stability of the solution?

The solution is stable. I would rate the stability a seven out of ten.

What do I think about the scalability of the solution?

The scalability depends on the project. At present, around 20 people use the solution in my company.

How are customer service and support?


How was the initial setup?

The setup was straightforward. It also depends on the projects.

What about the implementation team?

The deployment process was automated.

Which other solutions did I evaluate?

Evaluating solutions is not my work. I depend on Databricks.

What other advice do I have?

I rate Databricks a seven out of ten.

Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Enterprise Data Architect at a financial services firm with 51-200 employees
Real User
Assists with quickly computing a considerable amount of historical data and helps us with data ingestion
Pros and Cons
  • "Its lightweight and fast processing are valuable."
  • "The Databricks cluster can be improved."

What is our primary use case?

Our primary use case for this solution is for data ingestion and the DQ rules we are implementing. We deploy the solution on Azure cloud.

How has it helped my organization?

Whenever we send data to downstream applications for creating a file, multiple business rules are involved, and this solution assists with quickly computing a considerable amount of historical data.

What is most valuable?

Its lightweight and fast processing are valuable.

What needs improvement?

The product could include some UI features to improve the ease of use, like drag and drop for a few aggregated functions. Additionally, the Databricks cluster can be improved.

For how long have I used the solution?

We have been using Databricks for approximately two years and are currently using the latest version.

What do I think about the stability of the solution?

The solution is very stable. However, sometimes it intermittently restarts. I rate the stability an eight out of ten.

What do I think about the scalability of the solution?

The solution is scalable, and we are trying to implement more use cases with Databricks in our organization as we advance. I rate the scalability an eight out of ten.

How are customer service and support?

I rate customer service and support a nine out of ten.

How would you rate customer service and support?

Positive

How was the initial setup?

The initial setup was not very complex. We deploy the solution manually and the time required depends on the complexity of the business logic. I rate it an eight out of ten.

What about the implementation team?

We implemented the solution through an in-house team.

What other advice do I have?

I rate the solution an eight out of ten.

Disclosure: My company has a business relationship with this vendor other than being a customer: Gold Partners
PeerSpot user
reviewer1888527 - PeerSpot reviewer
Big Data and Cloud Architect at a computer software company with 201-500 employees
Real User
Excellent workspace and notebooks
Pros and Cons
  • "Databricks' most valuable features are the workspace and notebooks. Its integration, interface, and documentation are also good."
  • "Databricks' technical support takes a while to respond and could be improved."

What is our primary use case?

I primarily use Databricks for data pipelines.

What is most valuable?

Databricks' most valuable features are the workspace and notebooks. Its integration, interface, and documentation are also good.

For how long have I used the solution?

I've been working with Databricks for around five years.

What do I think about the stability of the solution?

Databricks is stable.

What do I think about the scalability of the solution?

Databricks is scalable.

How are customer service and support?

Databricks' technical support takes a while to respond and could be improved.

How was the initial setup?

The initial setup was easy.

What's my experience with pricing, setup cost, and licensing?

Databricks' cost could be improved.

What other advice do I have?

I would give Databricks a rating of eight out of ten.

Which deployment model are you using for this solution?

Private Cloud
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Buyer's Guide
Download our free Databricks Report and get advice and tips from experienced pros sharing their opinions.
Updated: June 2025