Shiva Prasad ELLUR - PeerSpot reviewer
Vice President - Data Engineering and Analytics at a financial services firm with 10,001+ employees
Real User
Top 5
A good, but expensive, web-based platform for automated cluster management with some coding limitations
Pros and Cons
  • "We like that this solution can handle a wide variety and velocity of data engineering, either in batch mode or real-time."
  • "This solution only supports queries in SQL and Python, which is a bit limiting."

What is our primary use case?

We use this solution for advanced civilization power.

What is most valuable?

We like that this solution can handle a wide variety and velocity of data engineering, either in batch mode or real-time.

This product allows us to write the email models in a way that lets us take advantage of the parallel scaling computer window backend on any of the satellite services.

What needs improvement?

This solution only supports queries in SQL and Python, which is a bit limiting. 

This is a fairly expensive solution for any service outside of the basic package, and costs can add up quite quickly if there are large scaling requirements.

What do I think about the stability of the solution?

This is a stable solution in our experience.

Buyer's Guide
Databricks
April 2024
Learn what your peers think about Databricks. Get advice and tips from experienced pros sharing their opinions. Updated: April 2024.
768,740 professionals have used our research since 2012.

What do I think about the scalability of the solution?

We have found that part of the beauty of this platform is that it is easy to scale and expand.

How are customer service and support?

The support for this product uses Microsoft as a middle man, and due to this there have been times when we experienced communication delays, as well as misunderstandings of what our issues are.

How would you rate customer service and support?

Neutral

How was the initial setup?

The initial setup for this solution is very simple.

What's my experience with pricing, setup cost, and licensing?

The basic version of this solution is now open-source, so there are no license costs involved. However, there is a charge for any advanced functionality and this can be quite expensive.

Which other solutions did I evaluate?

We looked at both Snowflake and BigQuery as a comparison with this solution. We chose this product because it offered more scalability and a higher level of security, which is extremely important in our banking environment.

What other advice do I have?

We would rate this solution an eight out of ten.

Which deployment model are you using for this solution?

On-premises
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Sr. BigData Architect at ITC Infotech
MSP
Very elastic, easy to scale, and a straightforward setup
Pros and Cons
  • "It's easy to increase performance as required."
  • "Instead of relying on a massive instance, the solution should offer micro partition levels. They're working on it, however, they need to implement it to help the solution run more effectively."

What is our primary use case?

We work with clients in the insurance space mostly. Insurance companies need to process claims. Their claim systems run under Databricks, where we do multiple transformations of the data. 

What is most valuable?

The elasticity of the solution is excellent.

The storage, etc., can be scaled up quite easily when we need it to.

It's easy to increase performance as required.

The solution runs on Spark very well.

What needs improvement?

Instead of relying on a massive instance, the solution should offer micro partition levels. They're working on it, however, they need to implement it to help the solution run more effectively.

They're currently coming out with a new feature, which is Delta Lake. It will come with a new layer of data compliance.

For how long have I used the solution?

We've been using the solution for two years.

What do I think about the stability of the solution?

I don't see any stability issues, right down to the cluster level. It will certainly be fine as long as it's maintained. It's highly available: even if things are dropped, it will still be up and running. I would describe it as very reliable. We don't have issues with crashing. There aren't bugs and glitches that affect the way it works.

What do I think about the scalability of the solution?

The system is extremely scalable. It's one of its greatest features and a big selling point. If a company needs to scale or expand, they can do so very easily.

The solution is used daily, even though we don't work with Databricks directly on a day-to-day basis. Because we schedule everything we need and the schedule triggers the work that has to be done, it's used often. Do you need to log into the database console every day? No. You just need to configure it one time and that's it. Then it will deliver everything needed in the time required.

How are customer service and technical support?

We use Microsoft support, as we are enterprise customers of theirs. We raise service requests for Databricks, but we do so through Microsoft. Overall, we've been satisfied with the support we've been given. They're responsive to our needs.

Which solution did I use previously and why did I switch?

We work with multiple clients and this solution is just one of the examples of products we work with. We use several others as well, depending on the client.

It's all wrappers around the same underlying systems. For example, Spark. It's all open-source. We've worked with them as well as the wrappers around it, whether the company was labeled Databricks, IBM Insights, Cloudera, etc. These wrappers are all on the same open-source system.

If we work with Azure data, we take up Databricks. Otherwise, we would have to create a VM separately. That is not needed because Azure already provides those things for us.

How was the initial setup?

The situation may have been a bit different for me than for many users or organizations. I've been in this industry for more than 15 or 17 years. I have a lot of experience. I also took the time to do some research and preparation for the setup. It was straightforward for me.

The deployment with Microsoft usually can be done in 20 minutes. However, it can take 40 to 45 minutes to complete. An organization only requires one person to upload the data and have complete access to the account.

What about the implementation team?

I deployed the solution myself. I didn't require any assistance, so I didn't enlist any resellers or consultants to help with the process.

What's my experience with pricing, setup cost, and licensing?

The solution is expensive. It's not like a lot of competitors, which are open-source.

What other advice do I have?

There isn't really a version, per se. 

It's a popular service. I'd recommend the solution. The solution is cloud-agnostic right now, so it really can go into any cloud. It's the users who will be leveraging installed environments that can have these services, no matter if they are using Azure or Ubiquiti, or other systems.

I don't think you can find any other tool or any other service that is faster than Databricks. I don't see that right now. It's your best option.

Overall, I'd rate the solution eight out of ten. The reason I'm not giving it full marks is that it's expensive compared to open source alternatives. Also, the configuration is difficult, so sometimes you need to spend a couple of hours to get it right.

Which deployment model are you using for this solution?

Public Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Principal at a computer software company with 5,001-10,000 employees
Real User
Top 20
Has advanced modeling and machine-learning features; highly scalable, with no stability issues
Pros and Cons
  • "What I like about Databricks is that it's one of the most popular platforms that give access to folks who are trying not just to do exploratory work on the data but also go ahead and build advanced modeling and machine learning on top of that."
  • "I have had some issues with some of the Spark clusters running on Databricks, where the Spark runtime and clusters go up and down, which is an area for improvement."

What is our primary use case?

I've worked with Databricks primarily in the pharmaceuticals and life sciences space, which means a lot of work on patient-level data and the predictive analytics around that.

Another use case for Databricks is in the manufacturing industry. I'm a consultant, so the use cases for the product vary, but my primary use case for it is in the pharma space.

What is most valuable?

From a data science and applied analytics perspective, what I like about Databricks is that it's probably one of the most popular platforms that give access to folks who are trying not just to do exploratory work on the data but also go ahead and build advanced modeling and machine learning on top of that, and then go ahead and make that available for dissemination of insights. For example, you can save all data and build out endpoints, so business analysts and users can access that data through a dashboard.

During the process, I also like that Databricks allows you to do version control to keep track of your operations on the data and maintain that lineage to create reproducible results.

The most significant Databricks advantage is that you can do everything within the platform. You don't need to exit the platform because it's a one-stop shop that can help you do all processes.

The solution is top-notch from a data science, applied ML, or advanced analytics perspective.

What needs improvement?

I have had some issues with some of the Spark clusters running on Databricks, where the Spark runtime and clusters go up and down, which is an area for improvement. Still, I am generally unaware of any super-critical issues.

For how long have I used the solution?

My experience with Databricks is two and a half years.

What do I think about the stability of the solution?

Databricks stability is an eight out of ten because I never had issues with its stability.

What do I think about the scalability of the solution?

Databricks has high scalability. Most of my work on the solution has been in the pharma space, which has massive data sets, so it's a nine out of ten, scalability-wise.

How are customer service and support?

I've never dealt with the Databricks technical support team.

How was the initial setup?

I don't have experience setting up Databricks because that's generally taken care of by the IT, data, or software engineering team before the data science team comes in and starts leveraging the platform. I have yet to experience setting up the Databricks environment personally. However, I have had experience setting up clusters, which was pretty straightforward. Still, in the overall environment of an enterprise-wide system, I have yet to gain experience setting Databricks up.

What's my experience with pricing, setup cost, and licensing?

The cost for Databricks depends on the use case. I work on it as a consultant, so I'm using the client's Databricks, so it depends on how big the client is. If it's a global organization, that cost varies versus a smaller organization that has just adopted the platform and is trying to onboard a small team of five people. It depends.

What other advice do I have?

I'm a data scientist, so I frequently use Databricks and Domino Data Science Platform.

I'm a consultant, so every client has a different version or a different runtime in Databricks, so the versions used would vary per client.

The deployment for the solution is on the cloud, predominantly on AWS or Azure.

My clients adopted Databricks as the platform of choice, and with different use cases and more teams coming on board, the usage of Databricks will increase. I don't see that going down. It can only go up.

My advice to anyone looking into implementing Databricks is that it should be one of your top choices, especially if you're looking to focus on data processing, standard ETL operations, advanced analytics, or the ML type of work.

I'd rate the solution as nine out of ten. It checks almost all the boxes that modern applications need to have.

My organization is an active partner and implementer of Databricks, but it doesn't resell the solution.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: My company has a business relationship with this vendor other than being a customer: Partner
PeerSpot user
PraveenS - PeerSpot reviewer
Design Engineer at Cyient Limited
Real User
Top 5
A scalable and cost-effective solution that has excellent translation features and can be used for data analytics
Pros and Cons
  • "It is a cost-effective solution."
  • "The product should provide more advanced features in future releases."

What is our primary use case?

We use the solution for data analytics of industrial data.

What is most valuable?

We extensively use the product’s notebooks, jobs, and triggers. We can create activities. Wherever translation is required, we use Databricks. The product fulfills our customer requirements. It is a cost-effective solution.

What needs improvement?

The product should provide more advanced features in future releases.

For how long have I used the solution?

I have been using the solution for six months.

What do I think about the stability of the solution?

Our data was not too huge. It worked well. It is easily adaptable.

What do I think about the scalability of the solution?

The tool is scalable. We can make it available for a larger audience.

How was the initial setup?

The initial setup is not that difficult. I rate the ease of setup a seven out of ten. The solution is cloud-based. We use native services like Data Factory for orchestration. Sometimes, the customers require us to use Amazon as the cloud provider instead of Azure.

What's my experience with pricing, setup cost, and licensing?

The pricing is average.

What other advice do I have?

There are many services which are coming up. They are still in the preview stage. Overall, I rate the product an eight out of ten.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: My company has a business relationship with this vendor other than being a customer: Partner
PeerSpot user
Jeremy Salt - PeerSpot reviewer
Sr. Data Quality Analyst at Seek
Real User
Top 10
Can use different technologies to do data analysis and can quickly get data
Pros and Cons
  • "Databricks makes it really easy to use a number of technologies to do data analysis. In terms of languages, we can use Scala, Python, and SQL. Databricks enables you to run very large queries, at a massive scale, within really good timeframes."
  • "Databricks has added some alerts and query functionality into their SQL persona, but the whole SQL persona, which is like a role, needs a lot of development. The alerts are not very flexible, and the query interface itself is not as polished as the notebook interface that is used through the data science and machine learning persona. It is clunky at present."

What is our primary use case?

We use it for data analysis and testing of high volume web user behavioral data.

What is most valuable?

Databricks makes it really easy to use a number of technologies to do data analysis. In terms of languages, we can use Scala, Python, and SQL. Databricks enables you to run very large queries, at a massive scale, within really good timeframes.

I'm starting to build a solution using Delta Live Tables and Delta Live pipelines, and it is proving to be exceptionally easy to use. I have also been able to quickly implement a pipeline.

What needs improvement?

Databricks has added some alerts and query functionality into their SQL persona, but the whole SQL persona, which is like a role, needs a lot of development. The alerts are not very flexible, and the query interface itself is not as polished as the notebook interface that is used through the data science and machine learning persona. It is clunky at present.

For how long have I used the solution?

I've been using Databricks for a year.

What do I think about the stability of the solution?

It is a stable and reliable solution. I'd rate stability at eight out of ten.

What do I think about the scalability of the solution?

Databricks is absolutely scalable, and I'd rate scalability at eight out of ten. We probably have between 60 and 100 users in our organization, and we hope to increase usage in the future.

How are customer service and support?

The technical support staff we have worked with have been amazing. They helped us initially with our Delta Live pipelines. I would give them a rating of ten out of ten.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

I have previously worked with Apache Hadoop, and Databricks is definitely a better product. It's much easier to get data quickly in Databricks. As a result, a lot of the drudgery is taken away. Whereas with Hadoop, it's a bit more tricky to get data together.

What's my experience with pricing, setup cost, and licensing?

We're charged on what the data throughput is and also what the compute time is.

What other advice do I have?

I'd strongly recommend giving Databricks a try. We have found it to be a fantastic tool that has accelerated some of our solutions. We're an AI-heavy shop, and there are a lot of data scientists using the MLflow capabilities. I hear a lot of good things from that side as well. From a data analysis point of view, Databricks has been fantastic, and I would rate it at eight on a scale from one to ten.

Which deployment model are you using for this solution?

Public Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
STI Data Leader at grupo gtd
Real User
Top 5
Easy to use with a free community version and helpful documentation
Pros and Cons
  • "The solution offers a free community version."
  • "We'd like a more visual dashboard for analysis. It needs a better UI."

What is most valuable?

I like the simplicity and ease of use. 

You can deploy the solution to many clouds easily. 

The initial setup is straightforward.

The solution offers a free community version.

What needs improvement?

The auto models can be improved. 

We can create auto models like Microsoft Azure Machine Learning. In Azure Machine Learning, they have these features, for example, for auto models or code, or by code. They need this in Databricks. 

We need more connectors between on-premises and the cloud. 

We'd like a more visual dashboard for analysis. It needs a better UI.

For how long have I used the solution?

I've used the solution for one and a half months. 

What do I think about the stability of the solution?

The solution is very stable. There are no bugs or glitches. It doesn't crash or freeze. 

What do I think about the scalability of the solution?

Scalability is no problem. At the beginning, we created a cluster, for example, and if we need more performance in the future, for example, or to accelerate the training, we can change the cluster. It's quite straightforward. 

We have five people using the solution. 

In one or two years, we'd like to promote the solution to clients and increase usage. Right now, the way it is used is limited. I know that some banks and aeronautics companies use it.

How are customer service and support?

In terms of technical support, for now, we use the community. 

Which solution did I use previously and why did I switch?

We are also aware of KNIME, Azure Machine Learning, and Anaconda. In Anaconda, we use many frameworks, for example.

We started with other platforms, like Azure Machine Learning due to the fact that, with AutoML, it's easy to use. However, now that we have more skills, we need other tools or platforms like Databricks. It's a good platform to deploy and develop machine learning in employees.

How was the initial setup?

The implementation is quite easy. It's not complex or difficult. The first time, I did it using a tutorial which was quite helpful. Later, I took a course. I know it quite well. 

The deployment only takes a few days. 

You only need to deploy or maintain the solution. 

What about the implementation team?

We did not need any outside assistance in terms of setting up the solution. 

What's my experience with pricing, setup cost, and licensing?

For us, this product is free. We use the community version.

I am interested in using the enterprise version, however. Whether we use it or not depends on the projects and customers we get.

What other advice do I have?

I work with a solution provider. We are a Databricks customer.

We are not partners of Databricks. We are partnered only with Microsoft Azure and Amazon AWS.

We are using the latest version of the solution. However, I do not know the exact version number. 

I still need time with the solution before providing advice to others. I need to prepare the capacity internally. So far, it's been great.

I'd rate the solution eight out of ten. 

Which deployment model are you using for this solution?

Public Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Elizabeth Ho - PeerSpot reviewer
Manager, Customer Journey at a retailer with 10,001+ employees
Real User
Top 20
You can connect multiple data sources and share work easily
Pros and Cons
  • "I like how easy it is to share your notebook with others. You can give people permission to read or edit. I think that's a great feature. You can also pull in code from GitHub pretty easily. I didn't use it that often, but I think that's a cool feature."
  • "I would like it if Databricks adopted an interface more like R Studio. When I create a data frame or a table, R Studio provides a preview of the data. In R Studio, I can see that it created a table with so many columns or rows. Then I can click on it and open a preview of that data."

What is our primary use case?

I use Databricks for customer marketing analytics.

What is most valuable?

Databricks lets you schedule jobs pretty easily, and you can use SQL, Spark SQL, Python, or R. It also allows you to save a table or view. 

I like that you can connect to multiple data sources. Most of our data is stored in the Azure data lake, but my previous company connected to SQL databases or even blob storage. 

They've improved on many features. I don't do data engineering, but I had an issue a couple of years ago, two companies back. It took a long time to read and save tables, but I think the new Delta feature helped.

I like how easy it is to share your notebook with others. You can give people permission to read or edit. I think that's a great feature. You can also pull in code from GitHub pretty easily. I didn't use it that often, but I think that's a cool feature.

What needs improvement?

I would like it if Databricks adopted an interface more like R Studio. When I create a data frame or a table, R Studio provides a preview of the data. In R Studio, I can see that it created a table with so many columns or rows. Then I can click on it and open a preview of that data. 

Because I work in analytics and not data engineering, I think that's probably the biggest one. There are better graphical tools, so I don't think Databricks can compete. You can do a simple graph, and it's not that great. However, I don't think they can ever stack up to Tableau, so it's probably not worth it to improve upon that. 

For how long have I used the solution?

I've been using Databricks for two years.

What do I think about the stability of the solution?

Databricks is stable.

What do I think about the scalability of the solution?

Databricks is scalable.

How are customer service and support?

Databricks tech support has been great every time I've dealt with them. Their team is highly knowledgeable. 

How was the initial setup?

Setting up Databricks is easy. I set it up at my previous company. That was on Azure as well, but they utilized a third-party team with expertise in Databricks to ensure everything was optimized. 

What other advice do I have?

I rate Databricks 10 out of 10. I recommend taking advantage of Databricks support or a third-party provider to ensure it's set up optimally. I don't know if it's an additional service you must pay for, but we always had access to Databricks support in my last company. 

I think that's worth the money because there are so many different scenarios with distributed computing. Even people who study analytics may not understand the ins and outs of Spark. It's worth it to have a service contract for support.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
DevSmita Asthana - PeerSpot reviewer
Strategic Alliances & Ecosystems Manager at an outsourcing company with 501-1,000 employees
MSP
Top 10
Helps to have a good data presence but needs to incorporate learning aspects
Pros and Cons
  • "Databricks has helped us have a good presence in data."
  • "The product should incorporate more learning aspects. It needs to have a free trial version that the team can practice with."

What is our primary use case?

The product has helped in data fabrication. 

How has it helped my organization?

Databricks has helped us have a good presence in data. 

What needs improvement?

The product should incorporate more learning aspects. It needs to have a free trial version that the team can practice with.

For how long have I used the solution?

I have been using the product for more than six months. 

What do I think about the stability of the solution?

I rate Databricks' stability an eight out of ten.

What do I think about the scalability of the solution?

I rate the tool's scalability an eight out of ten. 

How was the initial setup?

The transition to Databricks was smooth. 

What's my experience with pricing, setup cost, and licensing?

Databricks' price is high. 

What other advice do I have?

I rate the solution a nine out of ten. 

Disclosure: My company has a business relationship with this vendor other than being a customer:
PeerSpot user
Buyer's Guide
Download our free Databricks Report and get advice and tips from experienced pros sharing their opinions.
Updated: April 2024