What is our primary use case?
We use Snowflake primarily for data warehousing: building data models, storing raw data, and producing reporting data for downstream analytics.
Whenever we receive files through API integrations from the front end, such as Excel files containing HR, sales, or marketing data, we load them into Snowflake. These files may be in non-relational formats, such as Parquet or text files. In Snowflake, we transform the files into tables and then create dimension tables according to business needs, in either a star schema or a snowflake schema. On top of that schema we build curated data, and then we create views that serve as the reporting layer for analytics. In short, we load the raw data, transform it, and use it for reporting and analytics.
We run Snowflake in a hybrid cloud setup. We primarily use it as a private cloud, but some components are hybrid.
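The load, transform, model, and publish flow described above can be sketched roughly as follows. This is only an illustration: it uses Python's built-in sqlite3 module as a local stand-in for Snowflake so that it runs anywhere, and the table names (raw_sales, dim_region, fact_sales, v_sales_report) and sample rows are hypothetical, not taken from our environment.

```python
import sqlite3


def build_reporting_layer(rows):
    """Load raw rows, build a tiny star schema, and expose a reporting view.

    sqlite3 is only a stand-in for a warehouse; all table names are hypothetical.
    """
    con = sqlite3.connect(":memory:")
    cur = con.cursor()

    # 1. Raw layer: land the file contents unchanged.
    cur.execute("CREATE TABLE raw_sales (region TEXT, product TEXT, amount REAL)")
    cur.executemany("INSERT INTO raw_sales VALUES (?, ?, ?)", rows)

    # 2. Dimension table: one row per distinct region.
    cur.execute(
        "CREATE TABLE dim_region (region_id INTEGER PRIMARY KEY, region TEXT UNIQUE)"
    )
    cur.execute(
        "INSERT INTO dim_region (region) "
        "SELECT DISTINCT region FROM raw_sales ORDER BY region"
    )

    # 3. Fact table (curated layer): raw rows keyed by the dimension.
    cur.execute("CREATE TABLE fact_sales (region_id INTEGER, product TEXT, amount REAL)")
    cur.execute(
        "INSERT INTO fact_sales "
        "SELECT d.region_id, r.product, r.amount "
        "FROM raw_sales r JOIN dim_region d ON d.region = r.region"
    )

    # 4. Reporting view: what BI tools would query.
    cur.execute(
        "CREATE VIEW v_sales_report AS "
        "SELECT d.region, SUM(f.amount) AS total_amount "
        "FROM fact_sales f JOIN dim_region d ON d.region_id = f.region_id "
        "GROUP BY d.region"
    )
    con.commit()
    return con


sample_rows = [
    ("EMEA", "widget", 100.0),
    ("EMEA", "gadget", 50.0),
    ("APAC", "widget", 75.0),
]
con = build_reporting_layer(sample_rows)
report = con.execute(
    "SELECT region, total_amount FROM v_sales_report ORDER BY region"
).fetchall()
print(report)  # prints [('APAC', 75.0), ('EMEA', 150.0)]
```

In Snowflake the same four steps would typically be a staged file load followed by SQL transformations, but the layering (raw, dimension, fact, view) is the part this sketch is meant to show.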
What is most valuable?
Snowflake has a very wide range of data integration capabilities. For workflows and data sources, we can capture semi-structured or unstructured data such as Avro, Parquet, JSON, and text files, or we fetch data through API integrations such as the Salesforce API or SAP. We try to gather all the data and load it into Snowflake. Snowflake works very well because it creates and manages micro-partitions automatically. Whenever we run SQL queries, it automatically distributes the query across a multi-cluster warehouse, and it supports multi-cloud enterprise workflows.
The top features of Snowflake are as follows. First, it is cloud native. Everything is in the cloud, so we have almost no maintenance for compute or for Snowflake itself. Second, it is a very simple tool to understand, with an architecture that separates the data layer, storage layer, compute layer, and UI layer. This separation gives us very low costs and is very helpful for SQL analysis. We keep data in Snowflake at low cost, and data we do not query can also be moved to cloud storage. Third, it scales elastically. Whenever a large amount of SQL or computational work is required, the warehouse automatically scales in and out. This is one of the best features of Snowflake: it manages this automatically and supports ACID properties (atomicity, consistency, isolation, and durability). Each query runs in isolation, and Snowflake optimizes performance well. It is very simple to use.
The elastic scaling helps significantly. For example, if a large amount of data arrives after a holiday period, we need to scale the warehouse up to a larger size. If we have a heavy query, the warehouse scales to handle it, and the work is spread across micro-partitions for quick execution. If we encounter unexpected data, such as special characters, we can create a zero-copy clone to aid development and testing. A clone requires no additional physical storage until the data changes, which reduces storage costs.
Snowflake has positively impacted us by making everything cloud native, which significantly reduces the effort of running systems and applications. Maintenance costs are low because Snowflake stores large volumes of data efficiently at low cost. Data warehousing performance also scales massively. Whenever our SQL workload grows, the warehouse scales up or down based on our needs and query requirements. If a query is not heavy, costs remain low, and the separation of the storage and compute layers makes Snowflake very cost-effective compared to other data warehousing tools.
Snowflake has contributed to significant cost savings. Previously, we paid for storage and compute together, so unused data accumulated costs unnecessarily. With Snowflake, storage and compute are billed separately, and we can transfer less frequently accessed data to Glacier, further optimizing expenses. Snowflake's performance improves business efficiency and helps us meet targets on time. The analytics and transformations we run in Snowflake are notably faster, which is a significant advantage. The scalability is remarkable, and marketing teams can analyze time-sensitive trends swiftly.
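The scaling and cloning operations mentioned above map to a few short SQL statements. The helper below is a minimal sketch that only builds those statements as strings; it does not connect to Snowflake. The function names and the example object names (etl_wh, sales, sales_dev) are hypothetical. To my understanding, changing the warehouse size is an explicit statement, while the cluster count of a multi-cluster warehouse (an Enterprise Edition feature) scales automatically between the configured minimum and maximum.

```python
import re

# Plain unquoted identifiers only; this keeps the sketch safe from SQL injection.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_$]*$")
# A subset of the warehouse sizes, enough for illustration.
_SIZES = {"XSMALL", "SMALL", "MEDIUM", "LARGE", "XLARGE"}


def _ident(name: str) -> str:
    """Return the name unchanged if it is a plain identifier, else raise."""
    if not _IDENTIFIER.match(name):
        raise ValueError(f"unsupported identifier: {name!r}")
    return name


def resize_warehouse_sql(warehouse: str, size: str) -> str:
    """Scale up or down: change the size of an existing warehouse."""
    size = size.upper()
    if size not in _SIZES:
        raise ValueError(f"unsupported warehouse size: {size!r}")
    return f"ALTER WAREHOUSE {_ident(warehouse)} SET WAREHOUSE_SIZE = '{size}'"


def autoscale_warehouse_sql(warehouse: str, min_clusters: int, max_clusters: int) -> str:
    """Scale out or in: let a multi-cluster warehouse vary its cluster count."""
    if not 1 <= min_clusters <= max_clusters:
        raise ValueError("expected 1 <= min_clusters <= max_clusters")
    return (
        f"ALTER WAREHOUSE {_ident(warehouse)} SET "
        f"MIN_CLUSTER_COUNT = {min_clusters} MAX_CLUSTER_COUNT = {max_clusters}"
    )


def clone_table_sql(source: str, target: str) -> str:
    """Zero-copy clone of a table, for example for development or testing."""
    return f"CREATE TABLE {_ident(target)} CLONE {_ident(source)}"


print(resize_warehouse_sql("etl_wh", "large"))
print(autoscale_warehouse_sql("etl_wh", 1, 4))
print(clone_table_sql("sales", "sales_dev"))
```

The resulting strings could be passed to whichever client is used to run SQL against Snowflake.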
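To make the point about light queries staying cheap more concrete, here is a rough estimate of compute usage. It is a sketch under stated assumptions: the credits-per-hour figures are the standard warehouse rates as I understand them (each size doubles the previous one), billing is per second with a 60-second minimum each time the warehouse starts, and the price per credit is a hypothetical placeholder that depends on the contract.

```python
# Credits per hour for standard warehouses, as I understand the published rates.
CREDITS_PER_HOUR = {"XSMALL": 1, "SMALL": 2, "MEDIUM": 4, "LARGE": 8, "XLARGE": 16}

# Assumed minimum billed time each time a warehouse starts or resumes.
MINIMUM_BILLED_SECONDS = 60


def estimate_credits(size: str, running_seconds: float) -> float:
    """Estimate credits for one continuous run of a warehouse."""
    if running_seconds < 0:
        raise ValueError("running_seconds must not be negative")
    billed_seconds = max(running_seconds, MINIMUM_BILLED_SECONDS)
    return CREDITS_PER_HOUR[size.upper()] * billed_seconds / 3600


def estimate_cost(size: str, running_seconds: float, price_per_credit: float) -> float:
    """Convert credits to money; price_per_credit is a hypothetical input."""
    return estimate_credits(size, running_seconds) * price_per_credit


# A 15-minute run on a MEDIUM warehouse: 4 credits/hour * 0.25 hours.
print(estimate_credits("MEDIUM", 900))  # prints 1.0
```

Storage is billed separately, so a suspended warehouse adds nothing to this figure.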
What needs improvement?
Snowflake is already quite mature, and it has recently introduced AI features. AI integration would be beneficial for capturing data directly from systems such as SAP and Salesforce into Snowflake as raw data, enabling efficient data warehousing.
Snowflake is very good overall, but it could improve its documentation on supporting different data structures. Even though Snowflake has strong ANSI SQL support, some features do not perform optimally, particularly Change Data Capture (CDC).
I do not rate it a 10 because there are still areas for improvement, such as AI integration and raw data capture. Once those enhancements arrive, our dependency on ETL tools and external scheduling would decrease, streamlining major data workloads within Snowflake.
For how long have I used the solution?
I have been working with Snowflake for more than six months, and it will soon be a year.
What do I think about the stability of the solution?
Snowflake is highly stable and performs well even with data sets larger than a terabyte.
What do I think about the scalability of the solution?
Snowflake's scalability is excellent across dataset sizes and query loads. It can scale from one engine up to 32 engines seamlessly while maintaining performance.
How are customer service and support?
We interacted with customer support, and they were quite helpful. They listened to our issues and provided solutions, but we had difficulty obtaining documentation for various data structures, even after requesting it multiple times.
How would you rate customer service and support?
Which solution did I use previously and why did I switch?
Earlier, we were using on-premises data warehousing with Hadoop, but it was complex and high-maintenance. The setup was also lengthy and costly. Therefore, we moved to Snowflake, which allows rapid setup.
What's my experience with pricing, setup cost, and licensing?
Pricing, setup, and licensing have all gone smoothly. Licensing is inexpensive. The setup cost is low, mainly because we buy through AWS Marketplace, and we only need to pay for serverless configurations. As a cloud service, its setup cost is very low compared to other platforms, such as Oracle or PostgreSQL, which typically cost more. Snowflake's handling of compute resources and hardware is excellent, making it one of the most affordable data warehousing solutions on the market.
Which other solutions did I evaluate?
Before choosing Snowflake, we evaluated other options such as Redshift and Databricks. We observed that Redshift's architecture seemed tightly coupled, leading to slower scaling, limited concurrency, and extensive maintenance. Redshift's clustering methods also increased the complexity of data management. In contrast, we found that Snowflake offers a fully managed service with micro-partitioning and independent scaling, making it the more desirable option.
What other advice do I have?
One more feature is Time Travel, which maintains historical data. Retention depends on the table type and edition: transient tables are retained for at most one day, while permanent tables can be retained for up to 90 days on Enterprise Edition. The retention period can be adjusted within those limits. If we accidentally drop a table, we can restore it within the retention period. After that, for permanent tables, we can reach out to the Snowflake support team for recovery through Fail-safe, which covers a further seven days. Additionally, Snowflake offers very secure data sharing, which avoids moving data between applications while bringing data into one place for transformation and loading into the warehouse. Storage, security, and compliance are all managed well within Snowflake, and we just need to maintain the security policies as an administrative task.
Another point is very low maintenance. The infrastructure has moved to the cloud, which is very manageable. Snowflake gives us very high concurrency, and we can predict our costs effectively. For the enterprise data analytics we perform, whether BI reports or other analytics, Snowflake is very helpful.
My advice is to leverage Snowflake for its scalability, efficient query handling, user role assignments, and direct integrations with platforms such as SAP and Salesforce. The decoupling of compute and storage is a key feature, allowing us to pay based on actual usage. Features such as zero-copy cloning and Time Travel enable easy data recovery and improve overall data management. Additionally, automatic tuning enhances query performance, and secure data sharing aligns with stringent security policies.
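The Time Travel operations mentioned above correspond to a handful of SQL statements. The sketch below only builds those statements as strings and does not connect to Snowflake; the function names and the example table name (orders) are hypothetical, and the 0 to 90 day bound is the overall maximum as I understand it, with the actual limit depending on edition and table type.

```python
import re

# Plain unquoted identifiers only, to keep the generated SQL safe.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_$]*$")


def _ident(name: str) -> str:
    """Return the name unchanged if it is a plain identifier, else raise."""
    if not _IDENTIFIER.match(name):
        raise ValueError(f"unsupported identifier: {name!r}")
    return name


def query_as_of_sql(table: str, seconds_ago: int) -> str:
    """Read a table as it looked a number of seconds in the past."""
    if seconds_ago <= 0:
        raise ValueError("seconds_ago must be positive")
    return f"SELECT * FROM {_ident(table)} AT(OFFSET => -{seconds_ago})"


def undrop_table_sql(table: str) -> str:
    """Restore a dropped table while it is still within its retention period."""
    return f"UNDROP TABLE {_ident(table)}"


def set_retention_sql(table: str, days: int) -> str:
    """Change how many days of history Time Travel keeps for a table."""
    if not 0 <= days <= 90:
        raise ValueError("days must be between 0 and 90")
    return f"ALTER TABLE {_ident(table)} SET DATA_RETENTION_TIME_IN_DAYS = {days}"


print(query_as_of_sql("orders", 3600))
print(undrop_table_sql("orders"))
print(set_retention_sql("orders", 30))
```

Fail-safe has no equivalent statement here because, as noted, recovery at that stage goes through Snowflake support.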
I believe we have covered almost everything. To summarize, Snowflake offers scalable storage that adjusts automatically, high concurrency, and virtually no downtime. It continually enhances performance by improving cluster configurations. I rate this product an 8.5 out of 10.
Which deployment model are you using for this solution?
Hybrid Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?