Badges

125 Points
3 Years
Top 20

User Activity

Almost 2 years ago
We live in a world where data drives decisions. Every business should be able to make decisions in real time or near real time. This is where stream analytics comes to the rescue. Stream analytics helps businesses make quick decisions based on the data so that they…
Over 2 years ago
The primary use case is the orchestration and automation of ELT/ETL data pipelines. Apache Airflow is great in this respect, and its scheduling options allow pipelines to be fully automated based on the use case.
Over 2 years ago
Yeah, I too felt that the pricing could be more open and explicit for customers.
Over 2 years ago
Open-source Apache Airflow is free to use and does not itself incur any cost. However, the managed services from AWS and GCP have a cost, as do packaged products like Astronomer.
Over 2 years ago
Near-real-time analytics using near-real-time data ingestion.
Over 2 years ago
I don't think using Apache Spark without Hadoop has any major drawbacks or issues. I have used Apache Spark quite successfully with AWS S3 on many batch-based projects. That said, for very high-performance systems, HDFS is a better option. The main problem with Apache…
Over 2 years ago
If you are dealing with semi-structured data like JSON, Snowflake has great support for handling and querying it. It is also good to use as a data lake and can act as a one-stop solution for a data lake and cloud data warehouse. Query performance and low maintenance is…
Over 2 years ago
It is a little more costly, but it has great features for keeping costs under control if used properly, such as different warehouse sizes, caching, auto-suspend of warehouses, and more.
Over 2 years ago
Handling of semi-structured data like JSON. It has great support for JSON, and we can write SQL on JSON, which is amazing. Performance on semi-structured data is a little poorer than on structured data, but it is still great.
Over 2 years ago
Very good review on Snowflake, very helpful.
Over 2 years ago
Apache Airflow is a great orchestration and automation tool. Its connectivity with other systems is a big plus. Other strengths are the interactive UI, the scheduling options, and its compatibility with Python.
Almost 3 years ago
I used CDP on-premises almost 2.5 years ago. I would rate it 8/10. I did not have much to compare it against in those days, because cloud was not accessible in my organisation. But CDP was definitely a good choice then with respect to the open-source distribution. The installation was…
Almost 3 years ago
Have you used Purview, the Azure data governance tool? If yes, what's your view, and is it mature enough?
Almost 3 years ago
Snowflake is an amazing product. It is currently one of the best cloud data warehouses. The separation of storage and compute, together with the warehouse concept, makes it unique. It has lots of features and low maintenance, and the cost can be optimised to a great extent if we understand…
Almost 3 years ago
Many features: 1) Separate warehouses and the control the user gets over them. 2) Automatic caching. 3) JSON and XML handling. 4) Minimal DBA activity.
Almost 3 years ago
We are using it as a data lake and a data warehouse (DWH).
Almost 3 years ago
Have you used Azure Purview for data governance, data lineage, and as a data catalog?
Almost 3 years ago
1. TCO: options of long-term commitment vs. pay-as-you-go 2. Ease of setup, security & performance 3. High availability & support

Reviews

Questions

Answers

Almost 2 years ago
Streaming Analytics
Over 2 years ago
Business Process Management (BPM)
Over 2 years ago
Business Process Management (BPM)
Over 2 years ago
Software Configuration Management
Over 2 years ago
Business Process Management (BPM)
Almost 3 years ago
Data Warehouse
Almost 3 years ago
Data Warehouse

Comments

Almost 3 years ago
Infrastructure as a Service Clouds (IaaS)
Almost 3 years ago
Infrastructure as a Service Clouds (IaaS)

About me

I have 19+ years of experience in building software. I initially worked primarily on Java, Spring, and database systems, and then moved to distributed systems over the last 8 years. I have worked on Apache Spark, Snowflake, Hadoop, HBase, Hive, Kafka, etc. I have also gained exposure to multiple cloud technologies on AWS, Azure, and GCP.