Badges

60 Points
4 Years

User Activity

About 4 years ago
Apache Spark still has room for improvement in terms of integration and speed. The Apache Spark community could use Rust or C++ implementations to improve performance.
About 4 years ago
I would say that for some use cases we don't need Apache Spark, and they can be implemented as an ordinary Python, Go, or Java application. For other use cases, if Apache Spark gives better performance and a reduction in time, we can go for Apache Spark. I…
About 4 years ago
I love every core functionality of Apache Spark. Initially, it provided only the basic RDD interface for processing data across a distributed cluster. It then evolved to the DataFrame and Dataset interfaces, with an optimised execution engine and more flexibility for developers to…
About 4 years ago
Apache Spark is available in cloud services such as AWS and Azure. We have to use the specific service that fits our use case. For example, we can use AWS Glue, which runs Spark, for ETL processes, and AWS EMR or Azure Databricks for on-demand data processing in the cloud. Basically it…
About 4 years ago
Apache Spark can be used for multiple use cases in big data and data engineering tasks. We are using Apache Spark for ETL, integration with streaming data, real-time prediction such as anomaly detection and price prediction, and data exploration on large volumes of data.
About 4 years ago
Answered a question: What is data lake storage?
A data lake can hold vast pools of raw data at an optimal price. One way to imagine a data lake is to compare it with a natural lake, which stores all the water from different sources in its raw form. For any enterprise product there will be more than one application, which…

Projects

About 4 years ago
Building a data platform and leveraging machine learning models
Building a data platform and leveraging machine learning models to deliver business insights

Interesting Projects and Accomplishments