I like the simplicity and ease of use.
You can deploy the solution to many clouds easily.
The initial setup is straightforward.
The solution offers a free community version.
The auto models can be improved.
We would like to create auto models the way Microsoft Azure Machine Learning does. Azure Machine Learning lets you build models either automatically or by writing code. Databricks needs this too.
We need more connectors between on-premises and the cloud.
We'd like a more visual dashboard for analysis. It needs a better UI.
I've used the solution for one and a half months.
The solution is very stable. There are no bugs or glitches. It doesn't crash or freeze.
Scalability is no problem. At the beginning, we created a cluster, and if we need more performance in the future, for example to accelerate training, we can change the cluster. It's quite straightforward.
We have five people using the solution.
In one or two years, we'd like to promote the solution to clients and increase usage. Right now, the way it is used is limited. I know that some banks and aeronautics companies use it.
In terms of technical support, for now, we use the community.
We are also aware of KNIME, Azure Machine Learning, and Anaconda. In Anaconda, we use many frameworks, for example.
We started with other platforms, like Azure Machine Learning, because AutoML makes them easy to use. However, now that we have more skills, we need other tools or platforms like Databricks. It's a good platform for our employees to develop and deploy machine learning.
The implementation is quite easy. It's not complex or difficult. The first time, I did it using a tutorial which was quite helpful. Later, I took a course. I know it quite well.
The deployment only takes a few days.
You only need to deploy or maintain the solution.
We did not need any outside assistance in terms of setting up the solution.
For us, this product is free. We use the community version.
I am interested in using the enterprise version, however. Whether we use it or not depends on the projects and customers we get.
I work with a solution provider. We are a Databricks customer.
We are not partners of Databricks. We are partnered only with Microsoft Azure and Amazon AWS.
We are using the latest version of the solution. However, I do not know the exact version number.
I still need time with the solution before providing advice to others. I need to prepare the capacity internally. So far, it's been great.
I'd rate the solution eight out of ten.
The product has helped in data fabrication.
Databricks has helped us have a good presence in data.
The product should incorporate more learning aspects. It needs to have a free trial version that the team can practice on.
I have been using the product for more than six months.
I rate Databricks an eight out of ten.
I rate the tool's scalability an eight out of ten.
The transition to Databricks was smooth.
Databricks' price is high.
I rate the solution a nine out of ten.
Our primary use case for the solution is data analysis. Databricks provides a Spark cluster environment with a driver, so we can analyze huge amounts of data, many gigabytes, and create notebooks. We can write SQL commands, Python code, Scala, or Spark with Python. With Databricks, we get a cluster hosted in the public cloud, and we adjust it based on how much we use it.
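As a rough illustration of mixing SQL and Python in a notebook, here is a minimal sketch. It assumes the notebook already exposes a Spark session (passed in as `spark`), and the table name `sales_events` and its columns are hypothetical, not taken from our project.

```python
# Sketch only: in a Databricks notebook a SparkSession named `spark` is
# normally predefined, so it is passed in here instead of being created.
# The table and column names below are hypothetical.


def build_daily_totals_query(table_name: str, min_amount: float) -> str:
    """Return a SQL string that totals amounts per day above a threshold."""
    return (
        f"SELECT event_date, SUM(amount) AS total_amount "
        f"FROM {table_name} "
        f"WHERE amount >= {min_amount} "
        f"GROUP BY event_date "
        f"ORDER BY event_date"
    )


def daily_totals(spark, table_name: str = "sales_events", min_amount: float = 0.0):
    """Run the query on the cluster and return whatever `spark.sql` returns."""
    return spark.sql(build_daily_totals_query(table_name, min_amount))
```

In a notebook cell this would be called as `daily_totals(spark)`, and the result could then be handed to further Python code in the same notebook.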
The most valuable features are data engineering and data science because we can create Notebooks on them. We can use any Python library to build data science models, or we can use libraries like Seaborn or Matplotlib to create charts based on data for data analysis. It is a really valuable capability.
Microsoft Azure has its learning environment on the Microsoft website. We can complete certifications, but the Databricks certification is more expensive than Microsoft's. It costs between $2,000 and $2,500, and the knowledge is linked. Users are also charged even if they do not want to analyze large amounts of data. Hence, we would like free capacity for student users so that people can learn and build their professional skills.
We have been using the solution for approximately one year.
The solution is stable. Microsoft offers a public service, and we can get it from the Databricks website. Additionally, many companies use it to analyze their data or create a Spark cluster to run Python or SQL scripts based on their data. I rate the stability a nine out of ten.
The setup is quite easy, and Databricks has also partnered with Microsoft, so we get this service on Microsoft Azure.
We have seen a return on investment.
We have a pay-as-you-go subscription and pay for it based on our usage.
We chose this solution because my company uses Microsoft Azure for a project, and my role as a data engineer primarily focuses on data-related services. For storing data, we use Data Lake; similarly, for the data processing engine, we use Spark, which Databricks provides.
I rate the solution an eight out of ten. The solution is good but can be improved by including drag-and-drop features because it can be helpful for users who are unfamiliar with coding. I advise new users to have prior experience with Python or SQL before utilizing this solution if they use it for data science or model building.
I use Databricks to manage the setting up of data lakes for SaaS.
The biggest problem associated with the product is that it is quite pricey. We cannot find a better solution than Databricks in the market currently.
I have been using Databricks for a year.
It is an expensive tool. The licensing model is a pay-as-you-go one.
The tool helps with data processing and analytics on big data, since it is designed for managing data at a large scale.
For my general use cases, I would say that I am not a technical person, so I cannot explain to you how the tool helps with the area of data engineering tasks.
There is another team in my company that is involved in the use of machine learning and AI features in Databricks. My team is mostly into operations. The tool is used in a multi-country project.
For example, in my company, purchasing decisions about solutions are based on which product the company as a whole has chosen.
I rate the tool an eight out of ten.
We build data solutions for the banking industry. Previously, we worked with AWS, but now we are on Azure. My role is to assess the current legacy applications and provide cloud alternatives based on the customers' requirements and expectations.
Databricks is a unified platform that provides features like streaming and batch processing. All the data scientists, analysts, and engineers can collaborate on a single platform. It has all the features you need, so you don't need to go for any other tool.
I like that Databricks is a unified platform that lets you do streaming and batch processing in the same place. You can do analytics, too. They have added something called Databricks SQL Analytics, which allows users to connect to the data lake to perform analytics. Databricks also enables you to share your data securely. It integrates with your reporting system as well.
The Unity Catalog provides you with data lineage and metadata capabilities. These are some of the unique features that fulfill all the requirements of the banking domain.
Every tool has room for improvement. What normally happens is that a solution claims it can do ETL and everything else, but you encounter some limitations when you actually start. Then you keep interacting with the vendor, and they continue to upgrade it. For example, we haven't fully implemented Databricks Unity Catalog, a newly introduced feature. We need to check how it works, and there may be room for improvement there as well.
Databricks may not be as easy to use as other tools, but if you simplify a tool too much, it won't have the flexibility to go in-depth. Databricks is completely in the programmer's hands. I prefer flexibility rather than simplicity.
I have been using Databricks for a year.
Databricks relies on scalability and performance. Every cloud vendor prioritizes scalability, high availability, performance, and security. These are the most important reasons to move to the cloud.
Deploying Databricks on the cloud is straightforward. It's not like an on-premise solution, where you must create a cluster and all those other prerequisites for big data.
I don't think it's challenging to maintain, but you need an expert programmer because Databricks isn't GUI-based. With GUI-based tools, building ETLs is drag-and-drop. Databricks relies entirely on coding, so you need skilled programmers to build your code, ETLs, etc.
The price of Databricks is based on the computing volume. You also need to pay storage costs for the cloud where you're hosting Databricks, whether it is AWS, Azure, or Google.
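To make the two cost components concrete, here is a back-of-the-envelope sketch. All rates and quantities are made-up placeholders, not actual Databricks or cloud prices, and the function and parameter names are hypothetical.

```python
def estimate_monthly_cost(
    compute_units_per_hour: float,
    hours_per_month: float,
    price_per_compute_unit: float,
    storage_gb: float,
    price_per_gb_month: float,
) -> dict:
    """Split a monthly bill into usage-based compute and cloud storage."""
    # Compute is billed on usage: units consumed per hour times hours run.
    compute = compute_units_per_hour * hours_per_month * price_per_compute_unit
    # Storage is billed separately by the cloud hosting the workspace.
    storage = storage_gb * price_per_gb_month
    return {"compute": compute, "storage": storage, "total": compute + storage}


# Placeholder numbers purely for illustration.
example = estimate_monthly_cost(
    compute_units_per_hour=4,
    hours_per_month=100,
    price_per_compute_unit=0.5,
    storage_gb=1000,
    price_per_gb_month=0.02,
)
```

With these placeholder inputs, compute dominates the total, which is why shutting down idle clusters tends to matter more than trimming storage.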
I rate Databricks nine out of 10. Databricks is one of the best tools on the market.
We use Databricks for batch data processing and stream data processing.
Databricks provides a consistent interface for data engineers to work with data in a consistent language on a single integrated platform for ingesting, processing, and serving data to the end user.
The flexibility of Databricks is the most valuable feature. It gives us the ability to write analytics code in multiple languages.
There is a single workspace for different data roles like data engineers, machine learning engineers, and the end user, who can connect to the same system.
Databricks separates compute from storage, so you are not coupled to the underlying data sets, which allows multiple processes and multiple programs to be written against the same data.
I would like to see improvement with the UI. It is functional and useful, but it's a bit clunky at times. It should be more user-friendly.
In future releases, Databricks would benefit from enhanced metrics and tighter integration with Azure's diagnostics.
I have been using Databricks for eight months.
Databricks is very stable.
The scalability of this solution is good. In our organization, users include analysts, data engineers, and data scientists.
I would give Databricks' service and support a four and a half out of five overall.
Prior to using Databricks, we used Azure Stream Analytics. We made the switch because of the scalability and integrated platform.
The initial setup of Databricks is more complex. I would rate it a four out of five on the complexity of the setup. It took two days to deploy the solution.
We used a third party for some of the implementations of Databricks. The number of staff required to deploy and maintain this solution depends on the number of processes you have. Due to the cloud nature of the technology, it is easy to deploy and maintain.
The licensing of Databricks is a tiered licensing regime, so it is flexible. I feel their pricing is a five out of five.
Databricks is a one-stop shop for everything data related, and it can scale with you.
I would rate this solution a 9.5 out of 10 overall.
Our use case is confidential, but I can say we use it for a deep learning model for machine learning.
The solution is built from Spark and has integration with MLflow, which is important for our use case.
Databricks is also user-friendly, providing customizable codes and models that allow people to experiment quickly.
Integration of Delta Lake is another useful feature.
Writing pandas-profiling reports could be easier.
The ability to customize our own pipelines would enhance the product, similar to what's possible using ML files in Microsoft Azure DevOps.
I have been using this product for one and a half years.
For now the solution seems stable.
The solution is easy to scale horizontally and it has a useful auto-scaling feature. For vertical scaling, you need to bring the system down and make some adjustments.
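The difference between the two kinds of scaling can be sketched as a cluster definition. This assumes the field names I understand the Clusters API to use (`autoscale`, `min_workers`, `max_workers`, `node_type_id`, `spark_version`); the values and helper names are placeholders, so check them against the current documentation before relying on them.

```python
import json


def autoscaling_cluster_spec(name, min_workers, max_workers, node_type_id, spark_version):
    """Build a cluster definition that scales horizontally between two bounds."""
    if min_workers < 0 or max_workers < min_workers:
        raise ValueError("expected 0 <= min_workers <= max_workers")
    return {
        "cluster_name": name,
        "spark_version": spark_version,  # placeholder runtime label
        "node_type_id": node_type_id,    # placeholder machine size
        "autoscale": {"min_workers": min_workers, "max_workers": max_workers},
    }


def with_node_type(spec, new_node_type_id):
    """Vertical scaling: same definition with a different machine size.

    Returns a copy; in practice applying it means editing the cluster,
    which involves bringing it down and starting it again.
    """
    return {**spec, "node_type_id": new_node_type_id}


spec = autoscaling_cluster_spec("training", 2, 8, "node-small", "runtime-x")
payload = json.dumps(spec)  # JSON body that would be sent to the service
```

Horizontal scaling is just the `autoscale` range and happens while the cluster runs, whereas `with_node_type` changes the machines themselves.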
On my current project I have a team of 30 members under me, including data engineers and data science people. Our data science, engineering, and MLOps projects are expanding, so we are planning to do some vertical scaling to increase the team size to over 100 members. In our company, we are trying to certify more and more people in Databricks because it's cloud-agnostic.
We have never needed to contact customer support. Online resources have been sufficient to solve our problems.
The initial setup of the solution is straightforward. Once you understand the UI, it is easy to implement. I would rate Databricks a four out of five for ease of setup.
One migration project took two to three months, including writing all the code and implementing end-to-end pipelines.
We are planning to deploy the solution in stages over the next 15 months to completely implement MLOps for our organization.
I'm not involved in the financing, but I can say that the solution seemed reasonably priced compared to the competitors. Similar products are usually in the same price range. With five being affordable and one being expensive, I would rate Databricks a four out of five.
I find that deployed systems work out cheaper than having to operate manually, which appeals to our customers.
I would rate this solution an eight out of ten.
There is an issue where clusters are automatically deleted after termination or after 100 days of non-usage. This could be more user-friendly, and they could include an enabler to pin the clusters you want to keep, instead of having to go and research why clusters got deleted after implementing the product. That documentation needs to be right in front of the user to avoid issues.
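As far as I can tell, the Clusters REST API does expose a pin operation that keeps a cluster's configuration from being cleaned up, though it is not obvious from the UI. Below is a minimal sketch of building such a request, assuming the `/api/2.0/clusters/pin` path and a `cluster_id` field. The workspace URL, token, and cluster ID are placeholders, and the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_pin_request(workspace_url: str, cluster_id: str, token: str) -> urllib.request.Request:
    """Construct (but do not send) a request asking to pin one cluster."""
    # Assumed endpoint path; verify against the current REST API reference.
    url = workspace_url.rstrip("/") + "/api/2.0/clusters/pin"
    body = json.dumps({"cluster_id": cluster_id}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; sending would use urllib.request.urlopen(request).
request = build_pin_request("https://example.invalid/", "cluster-123", "token-abc")
```

Pinning typically requires administrator rights on the workspace, so this is something to arrange before clusters go unused for long.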
I definitely recommend this product to other users.
We are using Databricks for machine learning workloads specifically.
Databricks aligns well with our skillset and overall approach. We sought out their solution specifically for a big data application we are currently working on, as we needed a platform capable of handling large amounts of data and building models. Additionally, the fact that they use open-source software and can integrate data warehouse and data lake systems was particularly appealing, as we have encountered such issues in the past. We determined that Databricks would be an effective solution for our needs.
The most valuable feature of Databricks is the integration of the data warehouse and data lake, and the development of the lake house. Additionally, it integrates well with Spark for processing data in production.
The solution could be improved by adding a feature that would make it more user-friendly for our team. The feature is simple, but it would be useful. Currently, our team is more familiar with the R language, but Databricks requires the use of Jupyter-style notebooks, which primarily support Python. We have tried using RStudio, but it is not a fully integrated solution. To fully utilize Databricks, we have to use the Jupyter interface. One feature that would make it easier for our team to adopt the Jupyter interface would be the ability to select a specific variable or line of code and execute it within a cell. This feature is available in other Jupyter notebooks outside of Databricks and in our own IDE, but it is not currently available within Databricks. If this feature were added, it would make the transition to using Databricks much smoother for our team.
The most important feature other than the Jupyter interface would be to have the RStudio interface inside Databricks. This would be perfect.
We have been using Databricks for approximately one year.
The stability of Databricks is good.
I rate the stability of Databricks a nine out of ten.
Databricks is scalable.
I rate the scalability of Databricks a nine out of ten.
I have been receiving responsive answers from Databricks's support. I have been pleased with the support.
I rate the support from Databricks a ten out of ten.
The initial setup of Databricks is simple. I did not experience any challenges. The time it takes for the deployment is approximately four hours.
I rate the initial setup of Databricks.
We did the deployment of the solution in-house. There were three people involved in the deployment: a data engineer, a data analyst, and a machine learning engineer.
We have only incurred the cost of our AWS cloud services. This is because Databricks provided us with an extended evaluation period, so we have not spent much money yet. We are just starting to incur costs this month, so I will know more about the full cost later on.
We only pay standard fees for the solution.
We use a data engineer, data analyst, and machine learning engineer for the maintenance of the solution.
I rate Databricks a nine out of ten.