NarendraBelkar - PeerSpot reviewer
Manager, Project at a consultancy with 1,001-5,000 employees
Real User
Top 5
Enables operational reporting and machine learning implementations and helps users to create reusable frameworks
Pros and Cons
  • "We can implement one framework and utilize it for multiple use cases."
  • "Microsoft Fabric costs more since we are charged for all the components."

What is our primary use case?

We are using Azure Data Lake Storage as a secondary storage solution. We use it for operational reporting and machine learning implementations. We also use the security and encryption features.

What is most valuable?

We can use the solution to implement a data warehouse. I am working on machine learning use cases. I have implemented a reusable framework. We can implement one framework and utilize it for multiple use cases.

What needs improvement?

Microsoft Fabric costs more since we are charged for all the components.

What do I think about the scalability of the solution?

I rate the tool’s scalability an eight out of ten. A lot of the code is customizable. I have created multiple frameworks. We have more than 300 users. Once we start our ML use cases, there will be more users.

Buyer's Guide
Azure Data Lake Storage
June 2025
Learn what your peers think about Azure Data Lake Storage. Get advice and tips from experienced pros sharing their opinions. Updated: June 2025.
856,873 professionals have used our research since 2012.

How are customer service and support?

The support team provides quick support. It provides quick solutions when we create tickets.

How would you rate customer service and support?

Positive

How was the initial setup?

I rate the ease of setup a nine out of ten. It is pretty straightforward. We need three people to deploy the product. There are two developers on our team. Everyone on our team has more than 12 years of experience.

What's my experience with pricing, setup cost, and licensing?

The pricing is dependent on what we use and how we use it. In my previous project, we paid a lot of money for the tool. Now, we only use the product for storage purposes. The price will increase if the machine learning use cases are productionised.

The product is bundled with Microsoft. The vendor charges us based on services. Recently, they introduced Microsoft Fabric, which includes all components. It costs more because we are charged for all the services available. We will not choose it for the initial stages of implementation.

Which other solutions did I evaluate?

Whenever we provide solutions to our customers, we often compare the product to Amazon Web Services. I prefer Microsoft for mid-sized companies.

What other advice do I have?

I have been using Azure Synapse for the last seven years. In my current organization, we are exploring more options because SAP does not have machine-learning capabilities. We are planning to use Microsoft Purview to share data.

We don't have any maintenance needs. We do a little bit of monitoring. We have automated everything. Maintenance is mostly done by Microsoft. I started my career in Microsoft. I gradually moved from Microsoft Stack on-premises to the cloud. Overall, I rate the product a nine out of ten.

Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Data Architect / Data Engineer at Regional Council
Real User
Efficient data integration enabling modern data platforms
Pros and Cons
  • "Azure Data Lake itself doesn't have built-in features for handling data on its own. It's used as a collection of data. Tools you can use with it include Azure Data Factory and Databricks."
  • "Azure Data Lake Storage should support other formats apart from the Data Lake format, such as the Iceberg format."

What is our primary use case?

The main use cases are for people who don't want their data to be siloed. They want to integrate it into one place, which is a data lake, where they can have data coming from different sources and formats. They want to be able to report on the data from one place, have different personnel work in the same place, and utilize the modern data platform.

What is most valuable?

Azure Data Lake itself doesn't have built-in features for handling data on its own. It's used as a collection of data. Tools you can use with it include Azure Data Factory and Databricks. Most recently, Microsoft Fabric has been the main tool I have recommended.

What needs improvement?

Azure Data Lake Storage should support other formats apart from the Data Lake format, such as the Iceberg format. Additionally, improvements in supporting various formats would be beneficial.

For how long have I used the solution?

I've been working with Azure Data Lake Storage for about three years now.

What do I think about the stability of the solution?

I would rate stability between nine and ten. It is very stable; however, it depends on settings like geographical redundancy. It also depends on expertise and the company's willingness to pay for specific features.

What do I think about the scalability of the solution?

It is very scalable, especially the Gen 2 version. Azure Data Lake Gen 2 is very flexible and uses hierarchical file structures. The newest version, Gen 3, also supports a lake house approach and integrates with other cloud storage like Amazon S3, Google Storage, and Snowflake.

How are customer service and support?

Community support responds within one to two weeks. Organization-paid technical support gets an almost immediate response. Overall, I rate their service an eight.

How would you rate customer service and support?

Positive

How was the initial setup?

Microsoft Fabric is easier to use than Databricks. Fabric is software as a service (SaaS), while Databricks is a platform as a service (PaaS), making Fabric easier for starters. Setting up Azure Data Lake Storage can be quick if done manually, but using code ensures scalability.

What's my experience with pricing, setup cost, and licensing?

It's very cheap to store terabytes of data. It costs just a few dollars per terabyte per month. Computational costs vary with usage and are generally higher than storage costs.
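
As a rough illustration of the storage arithmetic described here, the sketch below multiplies a data volume by a per-terabyte monthly rate. The rate of 5 dollars is a hypothetical placeholder chosen to match "a few dollars", not a quoted Azure price; actual rates depend on tier, region, and redundancy, and compute is billed separately.

```python
def monthly_storage_cost(terabytes: float, usd_per_tb_month: float) -> float:
    """Return the estimated monthly storage bill in US dollars."""
    if terabytes < 0 or usd_per_tb_month < 0:
        raise ValueError("inputs must be non-negative")
    return terabytes * usd_per_tb_month


# Hypothetical rate for illustration only; check current pricing.
ASSUMED_USD_PER_TB_MONTH = 5.0

estimate = monthly_storage_cost(50, ASSUMED_USD_PER_TB_MONTH)
print(f"50 TB at ${ASSUMED_USD_PER_TB_MONTH:.0f}/TB/month: ${estimate:,.2f}")
```

Keeping the rate as a parameter makes it easy to compare tiers once real prices are known.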

What other advice do I have?

For startups, I recommend using Microsoft Fabric as it integrates well with other tools and is cost-efficient. The pricing model is easy to understand. I'd rate the solution nine out of ten.

Which deployment model are you using for this solution?

Hybrid Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Venkatesh Kollana - PeerSpot reviewer
Associate Software Engineer at Systech Solutions
Real User
Top 10
Can store different types of data (structured, unstructured, and semi-structured) in one lake
Pros and Cons
  • "The tool's best feature is that it can store different types of data (structured, unstructured, and semi-structured) in one lake. We can use the required data for analytics and dashboard design."
  • "I suggest enhancing the connectors for improvement. Some software, like HubSpot and Xero, has connectors, but they're limited to a few fields. That's why we use REST API calls instead. It would be better if the connectors could retrieve all data."

What is our primary use case?

We use Azure Data Lake Storage for sources like HubSpot (CRM software) and Xero (invoice software). We call their APIs, get the data, and store it in the product. From there, we use it to get the responses and load them into Azure SQL DB.
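
The flow described above (call an API, land the response as a file, then load it into a SQL database) might be sketched as follows. This is a local stand-in under stated assumptions, not the reviewer's pipeline: a temporary folder plays the role of the lake, `sqlite3` stands in for Azure SQL DB, and the payload shape, the `contacts` table, and the function names are all hypothetical.

```python
import json
import sqlite3
import tempfile
from pathlib import Path


def land_response(landing_dir: Path, name: str, payload: dict) -> Path:
    """Write one API response to the landing folder as a JSON file."""
    landing_dir.mkdir(parents=True, exist_ok=True)
    target = landing_dir / f"{name}.json"
    target.write_text(json.dumps(payload), encoding="utf-8")
    return target


def load_contacts(landed_file: Path, conn: sqlite3.Connection) -> int:
    """Read a landed file and upsert its records into a SQL table."""
    records = json.loads(landed_file.read_text(encoding="utf-8"))["results"]
    conn.execute(
        "CREATE TABLE IF NOT EXISTS contacts (id TEXT PRIMARY KEY, email TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO contacts (id, email) VALUES (?, ?)",
        [(r["id"], r.get("email")) for r in records],
    )
    conn.commit()
    return len(records)


# Hypothetical payload standing in for a CRM API response.
payload = {"results": [{"id": "1", "email": "a@example.com"}, {"id": "2"}]}

with tempfile.TemporaryDirectory() as tmp:
    landed = land_response(Path(tmp) / "crm" / "contacts", "page-0001", payload)
    conn = sqlite3.connect(":memory:")
    loaded = load_contacts(landed, conn)
    row_count = conn.execute("SELECT COUNT(*) FROM contacts").fetchone()[0]
    conn.close()

print(loaded, row_count)
```

Landing the raw response before loading it keeps an untouched copy that can be reprocessed if the table schema changes later.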

What is most valuable?

The tool's best feature is that it can store different types of data (structured, unstructured, and semi-structured) in one lake. We can use the required data for analytics and dashboard design.

What needs improvement?

I suggest enhancing the connectors for improvement. Some software, like HubSpot and Xero, has connectors, but they're limited to a few fields. That's why we use REST API calls instead. It would be better if the connectors could retrieve all data.

For how long have I used the solution?

I have been using the product for a year. 

What do I think about the stability of the solution?

The tool is a stable product; they've been improving it recently.

What do I think about the scalability of the solution?

About 400 to 500 people (40 to 50 percent of employees) use Azure Data Lake Storage for ETL or development. It's scalable - we can add or remove users and change permissions within minutes.

How are customer service and support?

I haven't talked directly to Microsoft support, but I use blogs and forums to find answers. 

How was the initial setup?

Setting up and deploying Azure Data Lake Storage is easy. We use Azure DevOps to connect and deploy all our services.

What's my experience with pricing, setup cost, and licensing?

The tool is cheap, depending on the services and requirements you need.

What other advice do I have?

I use Azure Data Lake Storage in a cloud-only setup, not on-premises. We receive API calls and store the responses in the product. Then, we process these files using the tool.

For first-time users, I recommend learning from Microsoft materials or YouTube videos before using the tool. It is better to gain some knowledge before using it. It's easy for beginners to learn and use, especially compared to AWS and other services.

I'd rate Azure Data Lake Storage eight out of ten. I find it user-friendly as a fresher with about 2.8 years in my tech career. I started my career with this tool, gained much knowledge, and now I can lead a team.

Disclosure: My company has a business relationship with this vendor other than being a customer: customer/partner
PeerSpot user
MACIEJPOLAKOWSKI - PeerSpot reviewer
Senior Manager at IT Squad
Real User
Top 20
A cost-effective solution to store data and allows flexible capacity management
Pros and Cons
    • "The version was a bit outdated compared to the newer Microsoft Fabric offerings."

    What is our primary use case?

    We use the solution for storing data but don’t use Synapse to store data directly in it. Instead, Azure Synapse Analytics is utilized to analyze and process data in Data Lake Storage. Data Lake Storage is a large, scalable solution that handles extensive volumes of structured and unstructured data rather than a direct disk storage system.

    What needs improvement?

    In Azure Data Lake Storage, the tool we're using, Spark, handles the management, storage, retrieval, and organization of data. Spark employs its algorithms to abstract the underlying complexities. We don’t work with a large amount of data. If we were to handle larger datasets, we would need to focus more on optimizing storage and retrieval processes, as the efficiency of these operations would become more critical.

    The version was a bit outdated compared to the newer Microsoft Fabric offerings. For instance, the directory services are already available in Fabric, so I don't think adding them to Azure Data Lake Storage would be necessary. Snowflake, a cloud data analytics platform, adds its own capabilities and optimizations on top of Azure Data Lake Storage, such as improved performance or easier integration with SQL. Compared to other similar services, Azure Data Lake Storage remains very competitive.

    For how long have I used the solution?

    I have been using Azure Data Lake Storage for over a year.

    What do I think about the stability of the solution?

    Azure is a stable platform. Interruptions are relatively rare and usually last only a few minutes. It is good for data-oriented applications that don’t require continuous online processing.

    These brief outages do not significantly impact the quality of service. We haven’t experienced major stability issues with Azure Storage. 

    What do I think about the scalability of the solution?

    It is scalable.

    How are customer service and support?

    Any issues are handled by the team responsible for managing the platform.

    Which solution did I use previously and why did I switch?

    We primarily use Azure Synapse, which integrates with Azure Data Lake Storage. Synapse leverages the storage provided by Data Lake Storage, so both are part of the Azure ecosystem but remain distinct services.

    Another integration involves SQL Server, which serves data to various consumers as an SQL database. The main consumer is Power BI, which provides extensive reporting capabilities. Additionally, Azure Functions integrates with internal systems at the client’s end.

    What's my experience with pricing, setup cost, and licensing?

    It is a cost-effective solution.

    What other advice do I have?

    Using a cloud platform generally allows for flexible capacity management, meaning you can use and pay for resources only when needed. This is particularly useful for our customers, who can run Spark clusters in serverless mode. They only pay for the time they use the service, which is cost-effective since they don’t need constant access to high power and typically run jobs for shorter periods, like half an hour.

    It is available continuously and supports data archiving. However, since the current volume of data is not large, the client doesn’t need to focus on archiving or optimization. As their data grows and becomes more historical, they may need to optimize storage and archiving practices.

    The other team manages the integration tasks. The process is straightforward as long as the systems, functions, or other components interact with external systems. The ease of integration can depend on the intensity of the integration requirements.

    Overall, I rate the solution an eight out of ten.

    Disclosure: My company does not have a business relationship with this vendor other than being a customer.
    PeerSpot user
    Richard Mottershead - PeerSpot reviewer
    Enterprise Architect at a non-profit with 501-1,000 employees
    Real User
    Top 5 Leaderboard
    Able to partition data into various datasets using a directory hierarchy
    Pros and Cons
    • "The most valuable feature of Azure Data Lake Storage is the ability to partition data into various datasets using a directory hierarchy. This folder structure is key for any delivery. Currently, we're not doing much with the data in the tool, but when Databricks comes along, we'll convert it to Parquet format. It's a two-step process: raw data is moved to Parquet, which Databricks can manipulate easily."
    • "One improvement I'd suggest is the out-of-the-box conversion of input data, like spreadsheet or table data, to various formats. We'll be using Parquet, which enables transactional integrity."

    What is most valuable?

    The most valuable feature of Azure Data Lake Storage is the ability to partition data into various datasets using a directory hierarchy. This folder structure is key for any delivery. Currently, we're not doing much with the data in the tool, but when Databricks comes along, we'll convert it to Parquet format. It's a two-step process: raw data is moved to Parquet, which Databricks can manipulate easily.
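
One way to picture the directory hierarchy and the two-step raw-to-Parquet layout described above is the path-building sketch below. It only computes paths and does not convert any data; the zone names `raw` and `curated`, the `sales` dataset, and the `year=/month=/day=` folder convention are assumptions made for illustration, not something the reviewer specified.

```python
from datetime import date
from pathlib import PurePosixPath


def partition_path(zone: str, dataset: str, day: date, filename: str) -> PurePosixPath:
    """Build zone/dataset/year=YYYY/month=MM/day=DD/filename."""
    return PurePosixPath(
        zone, dataset, f"year={day:%Y}", f"month={day:%m}", f"day={day:%d}", filename
    )


def curated_parquet_path(raw_path: PurePosixPath) -> PurePosixPath:
    """Mirror a raw file path into the curated zone with a .parquet suffix."""
    # Swap the first path component (the zone) and the file extension.
    return PurePosixPath("curated", *raw_path.parts[1:]).with_suffix(".parquet")


raw = partition_path("raw", "sales", date(2025, 6, 3), "orders.csv")
print(raw)                        # raw/sales/year=2025/month=06/day=03/orders.csv
print(curated_parquet_path(raw))  # curated/sales/year=2025/month=06/day=03/orders.parquet
```

Mirroring the raw layout in the converted zone keeps each Parquet file traceable to its source file.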

    What needs improvement?

    One improvement I'd suggest is the out-of-the-box conversion of input data, like spreadsheet or table data, to various formats. We'll be using Parquet, which enables transactional integrity.

    For how long have I used the solution?

    I have been using the product for a year. 

    What do I think about the stability of the solution?

    Stability is good if you build your Azure Data Lake Storage well in the first place.

    What do I think about the scalability of the solution?

    Scalability depends on process complexity: it is high for simple processes and low for complex ones. This is due to the architecture of a data lake, but once converted to a data lakehouse, scalability is high across the board. I think Azure Data Lake Storage would suit medium to large enterprises.

    How are customer service and support?

    Microsoft's documentation is superb, and support is good, especially if you have a relevant intermediate supplier.

    Which solution did I use previously and why did I switch?

    We haven't compared Azure Data Lake Storage with products from other vendors because we're an Azure shop. We did check that the Azure product was good enough for our needs, and it was, so we didn't explore alternatives like AWS, Google, or Snowflake.

    How was the initial setup?

    The initial setup is fairly complex, but if you get your data architecture right from the start, it's not a problem. We're using a totally cloud-based deployment with Azure.

    What other advice do I have?

    Integration capabilities are fairly smooth and comparable to AWS in terms of cloud integration. Some might say it's slightly better, others slightly worse, but I think it's good. I'd rate Azure Data Lake Storage an eight out of ten. However, it's important to note that it's only eventually consistent, so don't expect immediate consistency when changes are made. It works well as a data storage bucket for future use, but it's unsuitable for transactional work. You need to use a data lakehouse like Databricks for transactional processes, which can handle transactional work once the data is in the correct format (like Parquet). The tool is great for storing data you want to put into a data lakehouse, but not for frequent transactions. It's suitable for daily archiving, but anything more frequent than that might cause issues.

    If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

    Microsoft Azure
    Disclosure: My company does not have a business relationship with this vendor other than being a customer.
    PeerSpot user
    Data Architecture and Engineering Specialist at coprocenva
    User
    Top 5 Leaderboard
    Manages large data volumes and has user-friendly automation
    Pros and Cons
    • "Azure Data Lake Storage is user-friendly and easy to use."
    • "Maybe the solution could be a bit more user-friendly."
    • "The scalability is limited. However, it's easy to set up."

    What is our primary use case?

    We use Azure Data Lake Storage for managing large data volumes in our big data projects.

    How has it helped my organization?

    I have configured the tool to automate the deletion of data and transfer data from one repository to another automatically.

    What is most valuable?

    Azure Data Lake Storage is user-friendly and easy to use. It effectively manages large data volumes and allows for automated configuration of data operations such as deletion and transfer between repositories.
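
The age-based decision behind automated deletion and transfer could look roughly like the sketch below. Azure Storage offers lifecycle management policies for rules of this kind; the code only illustrates the selection logic and does not call any Azure API. The thresholds of 30 and 365 days, the file names, and the function name are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def plan_retention(files, now, transfer_after_days=30, delete_after_days=365):
    """Split (path, last_modified) pairs into transfer and delete lists."""
    to_transfer, to_delete = [], []
    for path, last_modified in files:
        age = now - last_modified
        if age >= timedelta(days=delete_after_days):
            to_delete.append(path)      # old enough to remove
        elif age >= timedelta(days=transfer_after_days):
            to_transfer.append(path)    # old enough to move to another repository
    return to_transfer, to_delete


now = datetime(2025, 6, 30, tzinfo=timezone.utc)
files = [
    ("raw/a.csv", now - timedelta(days=5)),
    ("raw/b.csv", now - timedelta(days=90)),
    ("raw/c.csv", now - timedelta(days=400)),
]
to_transfer, to_delete = plan_retention(files, now)
print(to_transfer, to_delete)  # ['raw/b.csv'] ['raw/c.csv']
```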

    What needs improvement?

    Maybe the solution could be a bit more user-friendly.

    What do I think about the stability of the solution?

    It is very stable and reliable. It is a good solution that doesn't crash.

    What do I think about the scalability of the solution?

    The scalability is limited. However, it's easy to set up.

    How are customer service and support?

    The support from Microsoft for Azure products is good. It's timely.

    How would you rate customer service and support?

    Positive

    How was the initial setup?

    The initial setup is very easy with this tool.

    What's my experience with pricing, setup cost, and licensing?

    I am not familiar with the pricing.

    What other advice do I have?

    Overall, I would rate the Azure Data Lake Storage as nine out of ten.

    Disclosure: My company does not have a business relationship with this vendor other than being a customer.
    PeerSpot user
    Mukesh  Kumar - PeerSpot reviewer
    Senior Software Engineer at a tech consulting company with 10,001+ employees
    Real User
    Top 5
    Easily integrates with a company's current workflow
    Pros and Cons
    • "The response time and quality offered by the support team are good."
    • "The high price of the product is an area of concern where improvements are required."

    What is our primary use case?

    I use the solution in my company as per our project requirements. We only move data from an on-premises RDBMS into Azure Data Lake Storage Gen2, where the files are stored in Parquet format. Another data engineering team in my company then reads that data for further processing.

    What is most valuable?

    For writing data to the data lake, my company uses Oracle GoldenGate for Big Data. With Oracle GoldenGate for Big Data, my company had to use Handler (Java Platform SE 8), but now we use HDFS Handler. With it, we have to configure some files and open some ports between a bank's private network and Azure Data Lake Storage Gen2. Once all of that is in place, my company is able to push the data to Azure Data Lake Storage Gen2.

    What needs improvement?

    In my company, we are not facing any slowness or other kinds of issues with the product. Each day in my company, we create new directories and put the current files into them, so there is the segregation part that is taken care of, and because of this, there are no issues with the tool.

    In our company, one of the teams uses Azure Databricks to read data from the Azure Data Lake Storage account and, as per the business use case, moves the data on or processes it further. The project I am currently working on has only limited scope. I haven't explored all aspects of the tool.

    The high price of the product is an area of concern where improvements are required.

    For how long have I used the solution?

    I have been using Azure Data Lake Storage for a year.

    What do I think about the stability of the solution?

    The product's stability is good.

    What do I think about the scalability of the solution?

    The scalability part of the product is very good, and my company has not faced any issues with it.

    At present, 15 to 16 percent of the company uses the tool, but it will increase by a percent in the future.

    How are customer service and support?

    The response time and quality offered by the support team are good. I rate the technical support as nine out of ten.

    How would you rate customer service and support?

    Positive

    How was the initial setup?

    The product's initial setup phase is not too simple, and it can be described as a moderate process.

    The solution is deployed on the cloud.

    What's my experience with pricing, setup cost, and licensing?

    I rate the product price as two or three, where one is high, and ten is low. The product's price is really high.

    What other advice do I have?

    Integrating Azure Data Lake Storage into my company's current workflow was easy.

    I recommend the product to those who plan to use it.

    I rate the tool an eight to nine out of ten.

    Disclosure: My company does not have a business relationship with this vendor other than being a customer.
    PeerSpot user
    CurtisFisher - PeerSpot reviewer
    Data Manager at National Committee for Quality Assurance
    Real User
    Top 20
    User-friendly and helps aggregate all information into one particular environment
    Pros and Cons
    • "Azure Data Lake Storage is a user-friendly and easy-to-learn solution."
    • "Some terminology is not easy to understand when using some interfaces."

    What is our primary use case?

    We use the tool to aggregate all our information into one particular environment. We have separate environments, but we aggregate the important pieces into one environment so they can be used for reporting and analytics.

    What is most valuable?

    Azure Data Lake Storage is a user-friendly and easy-to-learn solution.

    What needs improvement?

    The solution's instructions could be improved. Some terminology is not easy to understand when using some interfaces. Microsoft must provide more examples because everybody uses the tool differently.

    For how long have I used the solution?

    I have been using Azure Data Lake Storage for 5 years.

    What do I think about the stability of the solution?

    I rate the solution’s stability ten out of ten.

    What do I think about the scalability of the solution?

    Around 300 users use the solution in our organization, including developers and people who use the data through report interfaces.

    I rate the solution ten out of ten for scalability.

    How was the initial setup?

    The solution’s initial setup was straightforward.

    What about the implementation team?

    We implemented the solution through an in-house team. Certain features are turned on to deploy the tool. You go in there and set up what you need, like the database or data factory. Once they turn it on and give you permission, you go in there and set it up yourself.

    What was our ROI?

    The tool is worth the money because I've worked in other environments, and this one seems to be the most flexible and easy to use.

    What other advice do I have?

    I use Azure Data Lake Storage version 2. We integrate the solution with other internal and external sources through REST and SQL. We have external sites that we integrate or pull the information into. We can see all the data in one place and report off of it.

    The tool can be expanded at a moment's notice. Once I'm done, I can lower the threshold back down. The solution's flexibility allows me to increase my CPU or database size at a moment's notice.

    Overall, I rate the solution 10 out of 10.

    Disclosure: My company does not have a business relationship with this vendor other than being a customer.
    PeerSpot user