Alex Tsui - PeerSpot reviewer
Sr. Director at Omnicell
Real User
A stable, scalable solution that simplifies the development process but needs more debuggers and components
Pros and Cons
  • "The simplicity of development is the most valuable feature."
  • "Databricks has a lack of debuggers, and it would be good to see more components."

What is our primary use case?

We use the solution for data engineering. 

How has it helped my organization?

The tool helps us manage large amounts of data. 

What is most valuable?

The simplicity of development is the most valuable feature. 

What needs improvement?

Databricks has a lack of debuggers, and it would be good to see more components. 

Another issue is that the D4 data format keeps changing on our cluster. This doesn't affect me much because I use functions to define it, but it is very frustrating for some more casual users. One day the output will be in a particular format, and then it becomes an object without us changing the cluster configuration. As a small team, we don't have the capacity to dig deeply into the issue, which has been frustrating.

Buyer's Guide
Databricks
April 2024
Learn what your peers think about Databricks. Get advice and tips from experienced pros sharing their opinions. Updated: April 2024.
768,415 professionals have used our research since 2012.

For how long have I used the solution?

We have been using the solution for three years. 

What do I think about the stability of the solution?

The solution's stability is good. 

What do I think about the scalability of the solution?

The product is scalable. We're a small organization with 12 users, and we don't currently have any plans to increase our usage.

What was our ROI?

We see an ROI from Databricks. 

What other advice do I have?

I would rate the solution seven out of ten. 

It's a good solution, best suited to handling large amounts of data. Databricks is better as a batch processing system than as an interactive system. The performance is a little disappointing because the in-memory processing is supposed to be excellent, but it's not as competitive as some other solutions out there in this regard. Even classical databases can respond and process faster.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
RichardXu - PeerSpot reviewer
Data Science Lead at a mining and metals company with 10,001+ employees
Real User
Top 10
Scalable and reliable, with helpful support
Pros and Cons
  • "It can send out large data amounts."
  • "It's not easy to use, and they need a better UI."

What is our primary use case?

We use this solution to build skill and text classification models.

What is most valuable?

The scalability brings value to this solution.

It can send out large data amounts.

What needs improvement?

The user experience can be improved. 

It's not easy to use, and they need a better UI.

For how long have I used the solution?

I have been dealing with Databricks for more than five years.

We last used this solution five months ago, and we used the most current version at that time.

What do I think about the stability of the solution?

This solution is quite stable. We have not had any issues with stability.

What do I think about the scalability of the solution?

It's a scalable solution. Very few people are using this solution in our organization. Most don't have the skill.

How are customer service and technical support?

We were using the free version which did not have a lot of support.

We didn't really need support at the time. I had one conversation with them and they were very nice. They were helpful.

Which solution did I use previously and why did I switch?

We are using Dataiku for one project and also SageMaker. We have some issues with scalability using SageMaker, which is why we may be going back to Databricks.

SageMaker is a very specific AI tool.

How was the initial setup?

The initial setup was okay.

What's my experience with pricing, setup cost, and licensing?

There are many different versions. 

We used the trial version, which was free.

What other advice do I have?

If you have a lot of data, Databricks is a good choice. 

With the integration of Microsoft and Databricks, they make it easy. It's the direction to go in.

It's a very good tool. I would rate Databricks a nine out of ten. 

Disclosure: I am a real user, and this review is based on my own experience and opinions.
Sahil Taneja - PeerSpot reviewer
Principal Consultant/Manager at Tenzing
Real User
Top 5
Processes tremendous data easily
Pros and Cons
  • "The processing capacity is tremendous in the database."
  • "There is room for improvement in the documentation of processes and how it works."

What is our primary use case?

In our project, we deal with geospatial data, which requires a lot of computing resources. A traditional warehouse cannot handle the amount of data we use, and this is where Databricks comes into the picture.

What is most valuable?

The processing capacity is tremendous in the database. We use Azure for storage and have not faced any challenges. The connectors to different data sources are also valuable. Moreover, it is not a language-dependent tool, so development is faster. It is one of the best features of Databricks.

What needs improvement?

There is room for improvement in the documentation of processes and how it works. I was trying to get one of the certifications, so I saw an area of improvement there. 

For how long have I used the solution?

I have been using Databricks for eight to nine months.

What do I think about the stability of the solution?

It is a stable product for us. We didn't see any challenges. 

What do I think about the scalability of the solution?

There are around 30 to 35 users in our organization. 

How was the initial setup?

The initial setup was easy because the third-party team made the clusters for us. 

What about the implementation team?

A third-party team enabled the cluster to make the setup easy for us. 

What other advice do I have?

I would advise using it based on the use case because it easily handles big data. It is your go-to tool if you are dealing with massive data. 

Overall, I would rate the solution a nine out of ten. The tool performs well in various use cases, has documentation available online, and is compatible with cloud platforms like GCP, Azure, and AWS.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Lead Data Scientist at a manufacturing company with 10,001+ employees
Real User
Top 5
A great solution that has allowed for collaboration within our organization
Pros and Cons
  • "We have the ability to scale, collaborate and do machine learning."
  • "The product cannot be integrated with a popular coding IDE."

What is our primary use case?

Our primary use case for this solution is research for data scientists. The solution is deployed on cloud.

How has it helped my organization?

It has allowed our data engineers, data scientists, and analysts to collaborate and work on the same platform. 

What is most valuable?

We have the ability to scale, collaborate and do machine learning.

What needs improvement?

The product cannot be integrated with a popular coding IDE.

For how long have I used the solution?

We have been using this solution for approximately three years.

What do I think about the stability of the solution?

The solution is stable.

What do I think about the scalability of the solution?

The solution is scalable. There are five people using it in our organization.

How are customer service and support?

I rate my experience with customer service and support an eight out of ten.

Which solution did I use previously and why did I switch?

We previously used H2O.

How was the initial setup?

The initial setup was straightforward.

What about the implementation team?

Implementation was done in-house.

What was our ROI?

We have seen a return on investment.

What's my experience with pricing, setup cost, and licensing?

Licensing is charged on a yearly basis and costs between 25,000 and 30,000.

Which other solutions did I evaluate?

We evaluated other options but this solution was the best fit for what we required.

What other advice do I have?

I rate this solution nine out of ten. The solution is good but can be improved by integrating with a popular coding IDE.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
Anand Sharma - PeerSpot reviewer
Sr Data Engineer at PIMCO
Real User
Supports several coding languages, good performance, and facilitates team collaboration
Pros and Cons
  • "The load distribution capabilities are good, and you can perform data processing tasks very quickly."
  • "In the future, I would like to see Data Lake support. That is something that I'm looking forward to."

What is our primary use case?

Our primary use case is ETL.

How has it helped my organization?

Using Databricks enables us to use the Data Mesh methodology, where every team performs their own ETL.

What is most valuable?

The most valuable feature is the versatility of the ecosystem. You can write code in SQL, Python, or Java.

The load distribution capabilities are good, and you can perform data processing tasks very quickly.

You can save and share notebooks between different teams.

The interface is easy to use.

What needs improvement?

The cost of this solution is on the expensive side.

In the future, I would like to see Data Lake support. That is something that I'm looking forward to.

For how long have I used the solution?

I worked with Databricks for approximately two years in my previous company.

What do I think about the scalability of the solution?

This is a very scalable solution. We have twenty-five data engineers that use it, and we may grow our usage.

How are customer service and support?

The technical support is okay. I would rate them a seven out of ten.

How would you rate customer service and support?

Neutral

Which solution did I use previously and why did I switch?

We did not use another similar solution prior to Databricks.

How was the initial setup?

The cloud-based deployment is simple.

If you use an on-premises deployment then there is more to do.

What about the implementation team?

We deployed it with our in-house team.

There is no maintenance required.

What was our ROI?

We have seen a return on our investment with Databricks.

What's my experience with pricing, setup cost, and licensing?

Price-wise, I would rate Databricks a three out of five.

Which other solutions did I evaluate?

When we looked into Databricks, we evaluated Azure Data Factory and some of the others on the market. We found that Databricks was one of the easiest ones to use.

What other advice do I have?

My advice for anybody that is looking into Databricks is not to use the on-premises deployment. Instead, use the cloud-based setup.

In summary, this is a good product.

I would rate this solution an eight out of ten.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Rupal Sharma - PeerSpot reviewer
Data Architect at Three Ireland (Hutchison) - Infrastructure
Real User
Top 5
Processes large data for data science and data analytics purposes
Pros and Cons
  • "Specifically for data science and data analytics purposes, it can handle large amounts of data in less time. I can compare it with Teradata. If a job takes five hours with Teradata databases, Databricks can complete it in around three to three and a half hours."
  • "There is room for improvement in visualization."

What is our primary use case?

It's mainly used for data science, data analytics, visualization, and industrial analytics.

What is most valuable?

Specifically for data science and data analytics purposes, it can handle large amounts of data in less time. I can compare it with Teradata. If a job takes five hours with Teradata databases, Databricks can complete it in around three to three and a half hours.

So that's why it's quite convenient to use for data science, for training machine learning models. By using more computing power, you can make it even faster.
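
As a rough check on the figures quoted above, the five-hour and three to three-and-a-half-hour runtimes can be turned into a percentage reduction. The sketch below uses only the reviewer's own estimates, not benchmark results, and the function name `runtime_reduction` is purely illustrative.

```python
def runtime_reduction(baseline_hours: float, new_hours: float) -> float:
    """Return the percentage reduction in runtime relative to the baseline."""
    return (baseline_hours - new_hours) * 100.0 / baseline_hours


# Reviewer's estimates: five hours on Teradata, three to three and a half on Databricks.
teradata_hours = 5.0
databricks_hours = (3.0, 3.5)

reductions = [runtime_reduction(teradata_hours, h) for h in databricks_hours]
print(reductions)
```

On these numbers the job finishes roughly 30 to 40 percent sooner, which is a useful saving but well short of an order-of-magnitude speed-up.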

What needs improvement?

There is room for improvement in visualization.

For how long have I used the solution?

I used it for two years. I worked with the latest update. 

What do I think about the stability of the solution?

I would rate the stability a nine out of ten. I didn't face performance drops.

What do I think about the scalability of the solution?

I would rate the scalability an eight out of ten.

How are customer service and support?

Databricks' support is great. If we need any support, they are very quick with it. And they genuinely want you to use Databricks. So, whatever we ask them, they come up with multiple solutions to problem statements. That's really good.

Overall, the customer service and support are very good.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

I personally prefer using Databricks. However, we also considered using Snowflake, but the pricing was different: it's priced per query.

A data science or data analytics team needs to query the stored data again and again, so that pricing model does not suit a data-heavy organization.

What was our ROI?

It's a good return on investment for Databricks from a delivery perspective. We delivered multiple dashboards, so it's quite a good return on investment. And being a small organization, everyone can use Databricks, and cost-wise, it's also good for small organizations.

Which other solutions did I evaluate?

If the company is a startup, Databricks might be suitable. If a big company needs a lot of storage, Teradata might be best for them. It depends on the situation.

What other advice do I have?

Overall, I would rate the solution an eight out of ten. I would definitely recommend this solution for small organizations.

Which deployment model are you using for this solution?

Private Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Disclosure: I am a real user, and this review is based on my own experience and opinions.
Kevin McAllister - PeerSpot reviewer
Executive Manager at Hexagon AB
Real User
Top 5
Excellent data transformation but data-serving performance could be better
Pros and Cons
  • "Databricks' most valuable feature is the data transformation through PySpark."
  • "Databricks' performance when serving the data to an analytics tool isn't as good as Snowflake's."

What is our primary use case?

We mainly use Databricks to ingest and process data, to run the ELT processes that get it ready for analytics, and to serve the data to ThoughtSpot, which queries Databricks to get the data.

How has it helped my organization?

We didn't have any good tooling for ELT processing prior to Databricks. We were using Microsoft HDInsight, but it was taking too long to process the data. When we moved our ELT processes over to Databricks, the processing time was reduced to a fraction of what HDInsight needed, so we were able to run jobs much faster.

What is most valuable?

Databricks' most valuable feature is the data transformation through PySpark.

What needs improvement?

Databricks' performance when serving the data to an analytics tool isn't as good as Snowflake's. In the next release, Databricks should include a better data-sharing platform to facilitate data sharing between companies.

For how long have I used the solution?

I've been using Databricks for three years.

What do I think about the stability of the solution?

Databricks' stability has been great, and I would rate it eight out of ten.

What do I think about the scalability of the solution?

Databricks is very scalable because it's very easy to spin up multiple clusters, but the cost of doing that is tremendous. I'd rate its scalability nine out of ten, but you'll pay for it.

How are customer service and support?

The technical support has been really bad, but that's because we don't have a direct agreement with Databricks.

How would you rate customer service and support?

Neutral

Which solution did I use previously and why did I switch?

I previously used HDInsight from Microsoft, but it took many, many hours to process data, so we switched to Databricks.

How was the initial setup?

The initial setup was pretty complex and required three people.

What about the implementation team?

We used an in-house team with some consulting help.

What was our ROI?

We've had a low ROI from Databricks.

What's my experience with pricing, setup cost, and licensing?

I would rate Databricks' pricing seven out of ten.

What other advice do I have?

I would advise anyone thinking of implementing Databricks to know their use case. For example, if you're looking for a big data repository to query data and do ELT processing, I recommend looking at other platforms, like Snowflake. However, if you're going to do AI and machine learning, then Databricks is probably stronger in that area. Overall, I would rate Databricks seven out of ten.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PankajKumar13 - PeerSpot reviewer
Computer Scientist at Adobe
Real User
Top 10
Pumps up performance and the processing power; comes with helpful Lakehouse and SQL environments
Pros and Cons
  • "When we have a huge volume of data that we want to process with speed, velocity, and volume, we go through Databricks."
  • "I believe that this product could be improved by becoming more user-friendly."

What is our primary use case?

Our primary use case is for data analytics. Essentially, we use it for the financial reporting for Adobe.

How has it helped my organization?

Databricks has improved my organization by giving us better performance and processing power, which we could never achieve with traditional data warehouses. When we have a huge volume of data that we want to process with speed, velocity, and volume, we go through Databricks.

What is most valuable?

The features I found most helpful with Databricks are the Lakehouse and SQL environments.

What needs improvement?

I believe that this product could be improved by becoming more user-friendly. 

In the next release, I would like to see more flexibility in the dashboard. It has plenty of features but it can be enhanced so that it matches with other visualization tools, like Power BI and Tableau. Also, the integrations with other tools could be better.

For how long have I used the solution?

I have been using Databricks for the last three years.

What do I think about the stability of the solution?

I would rate the stability of Databricks an eight, on a scale from one to 10, with one being the worst and 10 being the best.

What do I think about the scalability of the solution?

I would rate the scalability of this solution a nine, on a scale from one to 10, with one being the worst and 10 being the best. I would say there are around 2,000 to 3,000 users of this solution in our organization.

How are customer service and support?

I've been in contact with the Databricks support team and received timely support from them. I would rate their support an eight, on a scale from one to 10, with one being the worst and 10 being the best.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

Prior to Databricks, we initially used Hadoop. Afterwards, we used SAP HANA and Microsoft SQL Server.

How was the initial setup?

The initial setup was relatively straightforward. I would rate it nine, on a scale from one to 10, with one being the easiest and 10 being the hardest.

There is no need to worry about the deployment, as it can be done quickly and is relatively automated. We used Terraform for automated deployment in Azure. There are two options: you can deploy manually by creating the services yourself, or you can automate the deployment with Terraform. Terraform is an infrastructure-as-code tool, so the deployment itself is written as code.

There were two or three people involved in the deployment of this solution.

What other advice do I have?

The new version of the Databricks solution requires code maintenance. This is done by the platform team.

Disclosure: I am a real user, and this review is based on my own experience and opinions.