Databricks and Amazon Kinesis both compete in the data analytics and streaming domain. Databricks appears to have the upper hand in scalability and integration owing to its versatile Lakehouse architecture, while Amazon Kinesis leads in real-time streaming capabilities within AWS environments.
Features: Databricks offers large-scale analytics with built-in optimization, flexible programming options, and seamless integration with various environments. Its Lakehouse architecture and collaborative features resonate with data professionals. Amazon Kinesis excels in real-time data streaming, capturing, processing, and analyzing large data volumes. Its integration and management services simplify streaming processes significantly.
Room for Improvement: Databricks could improve its visualization capabilities, data source integration, and machine learning libraries. Users also point to pricing and unclear documentation as areas needing attention. For Amazon Kinesis, users ask for better documentation, scalability enhancements, and more automation, as well as more data retention options and reduced latency.
Ease of Deployment and Customer Service: Both Databricks and Amazon Kinesis are mostly deployed in public cloud environments. Databricks supports hybrid and multi-cloud deployments and offers good customer service, although it has a learning curve and demands technical expertise. Amazon Kinesis integrates smoothly within AWS and allows straightforward deployment for real-time applications, but it requires familiarity with AWS systems.
Pricing and ROI: Databricks is considered expensive and is best suited to enterprises focused on batch processing and data management, where ROI comes from infrastructure cost savings. It offers flexible pricing models but must be used efficiently to justify its cost. Amazon Kinesis offers competitive pricing for streaming; users find it cost-effective compared with building separate infrastructure, and because it runs as part of AWS it needs no extra hardware, which improves ROI.
With Lambda, there is no need for data transfer charges, which is beneficial for less frequent workloads.
When it comes to big data processing, I prefer Databricks over other solutions.
For a lot of different tasks, including machine learning, it is a nice solution.
We receive prompt support from AWS solution architects or TAMs.
As of now, we are raising issues and they are providing solutions without any problems.
Whenever we reach out, they respond promptly.
I rate the technical support as fine because they have levels of technical support available, especially partners who get really good support from Databricks on new features.
Amazon Kinesis provides auto-scaling with streams that handle large volumes well.
Databricks is an easily scalable platform.
I would rate the scalability of this solution as very high, about nine out of ten.
The patches have sometimes caused issues leading to our jobs being paused for about six hours.
Although it is too early to definitively state the platform's stability, we have not encountered any issues so far.
They release patches that sometimes break our code.
Databricks is definitely a very stable product and reliable.
Amazon Kinesis could improve its pricing to be more competitive, especially for large volumes.
We could use their job clusters; however, that increases costs, which is challenging for us as a startup.
This feature, if made publicly available, may act as a game-changer, considering many global organizations use SAP data for their ERP requirements.
If I could right-click to copy absolute paths or to read files directly into a data frame, it would standardize and simplify the process.
Amazon Kinesis and Lambda pricing is competitive, but we noticed that scaling and large volumes could potentially increase costs significantly.
It is not a cheap solution.
Lambda's scalability, seamless integration with other AWS services, and support for multiple programming languages are very beneficial.
The Unity Catalog is for data governance, and the Delta Lake is to build the lakehouse.
The platform allows us to leverage cloud advantages effectively, enhancing our AI and ML projects.
Databricks' capability to process data in parallel enhances data processing speed.
Amazon Kinesis makes it easy to collect, process, and analyze real-time, streaming data so you can get timely insights and react quickly to new information. Amazon Kinesis offers key capabilities to cost-effectively process streaming data at any scale, along with the flexibility to choose the tools that best suit the requirements of your application. With Amazon Kinesis, you can ingest real-time data such as video, audio, application logs, website clickstreams, and IoT telemetry data for machine learning, analytics, and other applications. Amazon Kinesis enables you to process and analyze data as it arrives and respond instantly instead of having to wait until all your data is collected before the processing can begin.
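As a rough illustration of the ingestion step described above, the sketch below shows how a clickstream or application-log event might be packaged for a Kinesis data stream. It assumes Python with the AWS SDK (boto3), whose Kinesis client exposes put_record(StreamName=..., Data=..., PartitionKey=...). The stream name "example-stream", the "user_id" field, and the helper names build_record and send_event are hypothetical and chosen only for this example. The client is passed in as an argument so the packaging logic can be tried without AWS credentials.

```python
import json
from typing import Any, Dict


def build_record(event: Dict[str, Any], key_field: str) -> Dict[str, Any]:
    """Package an event as the Data/PartitionKey pair that put_record expects."""
    return {
        # Kinesis treats the payload as opaque bytes; JSON is one common choice.
        "Data": json.dumps(event, sort_keys=True).encode("utf-8"),
        # Records that share a partition key are routed to the same shard.
        "PartitionKey": str(event[key_field]),
    }


def send_event(client: Any, stream_name: str, event: Dict[str, Any],
               key_field: str = "user_id") -> Any:
    """Send one event through a boto3-style Kinesis client (assumed interface)."""
    record = build_record(event, key_field)
    return client.put_record(StreamName=stream_name, **record)


# Hypothetical usage (requires boto3 and AWS credentials):
#   import boto3
#   kinesis = boto3.client("kinesis")
#   send_event(kinesis, "example-stream", {"user_id": 42, "page": "/home"})
```

Choosing a partition key with many distinct values (such as a user or device identifier) tends to spread records more evenly across shards, which matters for the large-volume scenarios reviewers mention.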
Databricks is utilized for advanced analytics, big data processing, machine learning models, ETL operations, data engineering, streaming analytics, and integrating multiple data sources.
Organizations leverage Databricks for predictive analysis, data pipelines, data science, and unifying data architectures. It is also used for consulting projects, financial reporting, and creating APIs. Industries like insurance, retail, manufacturing, and pharmaceuticals use Databricks for data management and analytics due to its user-friendly interface, built-in machine learning libraries, support for multiple programming languages, scalability, and fast processing.
Databricks is implemented in insurance for risk analysis and claims processing; in retail for customer analytics and inventory management; in manufacturing for predictive maintenance and supply chain optimization; and in pharmaceuticals for drug discovery and patient data analysis. Users value its scalability, machine learning support, collaboration tools, and Delta Lake performance but seek improvements in visualization, pricing, and integration with BI tools.