

Databricks and Amazon SageMaker are key players in the big data analytics and machine learning market, offering extensive tools and services. Databricks stands out for its integration capabilities and interactive notebooks, while Amazon SageMaker is often recognized for its strong deployment within the AWS ecosystem and model tuning capabilities.
Features: Databricks offers interactive notebooks supporting both Python and SQL, automation capabilities, and seamless integration with Spark. Its performance and scalability are highly rated. Amazon SageMaker is integrated well within the AWS ecosystem, featuring robust deployment services, user-friendly SageMaker Studio, and flexible machine learning pipelines.
Room for Improvement: Databricks could improve its visualization capabilities, integrate better with tools like Power BI and Tableau, and make its pricing more transparent. Broader support for machine learning libraries would also be beneficial. Amazon SageMaker could simplify its interface for beginners, offer clearer documentation, and improve cost management, since its pricing is high, especially over extended periods.
Ease of Deployment and Customer Service: Databricks is deployable on both public and hybrid clouds, allowing flexible infrastructure use, and receives positive feedback for customer service and responsive technical support. Amazon SageMaker excels at deployment within AWS's ecosystem but faces criticism for complex interfaces and documentation gaps, impacting customer experience despite comprehensive service support.
Pricing and ROI: Databricks’ pay-as-you-go model can result in high costs with large data volumes. Users find it cost-intensive but acknowledge its value through faster analytics and reduced data processing costs. Amazon SageMaker also has a premium pricing model, with costs quickly accumulating without proper management. Users report that while it is more expensive than some competitors, its integration with AWS services results in a positive ROI.
The return on investment varies by use case and offers significant value in revenue increases and cost-saving capabilities, especially in real-time fraud detection and targeted advertisements.
For a lot of different tasks, including machine learning, it is a nice solution.
When it comes to big data processing, I prefer Databricks over other solutions.
The technical support from AWS is excellent.
The support is very good with well-trained engineers.
The response time is generally swift, usually within seven to eight hours.
Whenever we reach out, they respond promptly.
As of now, we are raising issues and they are providing solutions without any problems.
I rate the technical support as fine because there are several levels of technical support available; partners in particular get really good support from Databricks on new features.
The availability of GPU instances can be a challenge, requiring proper planning.
It works very well with large data sets from one terabyte to fifty terabytes.
Amazon SageMaker is scalable and works well from an infrastructure perspective.
The patches have sometimes caused issues leading to our jobs being paused for about six hours.
Databricks is an easily scalable platform.
I would rate the scalability of this solution as very high, about nine out of ten.
There are issues, but they are easily detectable and fixable, with smooth error handling.
The product has been stable and scalable.
I rate the stability of Amazon SageMaker between seven and eight.
They release patches that sometimes break our code.
Although it is too early to definitively state the platform's stability, we have not encountered any issues so far.
Databricks is definitely a very stable product and reliable.
Having all documentation easily accessible on the front page of SageMaker would be a great improvement.
This would empower citizen data scientists to utilize the tool more effectively since many data scientists do not have a core development background.
Integration of the latest machine learning models like the new Amazon LLM models could enhance its capabilities.
Adjusting features like worker nodes and node utilization during cluster creation could mitigate these failures.
We prefer using a small to mid-sized cluster for many jobs to keep costs low, but this sometimes doesn't support our operations properly.
We use MLflow for managing MLOps; however, further improvement would be beneficial, especially for large language models and related tools.
The cost for small to medium instances is not very high.
For a single user, prices might be high, yet user-managed services could be cheaper than AWS-managed services.
I would rate the pricing as high as eight or nine out of ten, making it more expensive than some cloud alternatives yet more economical than on-premises setups.
It is not a cheap solution.
SageMaker supports building, training, and deploying AI models from scratch, which is crucial for my ML project.
They offer insights into everyone making calls in my organization.
The most valuable features include the ML operations that allow for designing, deploying, testing, and evaluating models.
Databricks' capability to process data in parallel enhances data processing speed.
The platform allows us to leverage cloud advantages effectively, enhancing our AI and ML projects.
Unity Catalog handles data governance, and Delta Lake is used to build the lakehouse.
| Product | Market Share (%) |
|---|---|
| Databricks | 12.3 |
| Amazon SageMaker | 5.4 |
| Other | 82.3 |


| Company Size | Count |
|---|---|
| Small Business | 12 |
| Midsize Enterprise | 11 |
| Large Enterprise | 16 |

| Company Size | Count |
|---|---|
| Small Business | 25 |
| Midsize Enterprise | 12 |
| Large Enterprise | 56 |
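
The two company-size tables have different totals, so raw counts are hard to compare directly. A minimal sketch of converting them to percentage shares follows; the names `first_table`, `second_table`, and `shares` are illustrative only, and because the source does not say which product each table belongs to, the tables are labeled by position.

```python
# Reviewer counts copied from the two company-size tables above.
# The source does not state which product each table describes,
# so the dictionaries are named by position only.
first_table = {"Small Business": 12, "Midsize Enterprise": 11, "Large Enterprise": 16}
second_table = {"Small Business": 25, "Midsize Enterprise": 12, "Large Enterprise": 56}


def shares(counts):
    """Return each category's percentage of the table total, rounded to 0.1."""
    total = sum(counts.values())
    return {size: round(100 * n / total, 1) for size, n in counts.items()}


for name, table in (("first", first_table), ("second", second_table)):
    # Print the table label, its total reviewer count, and the percentage shares.
    print(name, sum(table.values()), shares(table))
```

On these counts, large enterprises account for about 41% of the 39 reviewers in the first table and about 60% of the 93 reviewers in the second.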
Amazon SageMaker is a fully managed platform that enables developers and data scientists to quickly build, train, and deploy machine learning models at any scale. It aims to remove the barriers that typically slow down developers who want to use machine learning.
Databricks offers a scalable, versatile platform that integrates seamlessly with Spark and multiple languages, supporting data engineering, machine learning, and analytics in a unified environment.
Databricks stands out for its scalability, ease of use, and powerful integration with Spark, multiple languages, and leading cloud services like Azure and AWS. It provides tools such as notebooks for collaboration, Delta Lake for efficient data management, and Unity Catalog for data governance. While it enhances data engineering and machine learning workflows, it faces challenges in visualization and third-party integration, with pricing and user interface navigation being common concerns. Despite needing improvements in connectivity and documentation, it remains popular for tasks like real-time processing and data pipeline management.
In the tech industry, Databricks empowers teams to perform comprehensive data analytics, enabling them to conduct extensive ETL operations, run predictive modeling, and prepare data for SparkML. In retail, it supports real-time data processing and batch streaming, aiding in better decision-making. Enterprises across sectors leverage its capabilities for creating secure APIs and managing data lakes effectively.