Apache Spark vs Zadara comparison

Apache Spark: 3,093 views | 2,345 comparisons | 89% willing to recommend
Zadara: 95 views | 43 comparisons | 100% willing to recommend
Comparison Buyer's Guide
Executive Summary

We performed a comparison between Apache Spark and Zadara based on real PeerSpot user reviews.

Find out in this report how the two Compute Service solutions compare in terms of features, pricing, service and support, ease of deployment, and ROI.
To learn more, read our detailed Apache Spark vs. Zadara Report (Updated: March 2024).
768,578 professionals have used our research since 2012.
Quotes From Members
We asked business professionals to review the solutions they use.
Here are some excerpts of what they said:
Pros
Apache Spark
  • "The most valuable feature of Apache Spark is its flexibility."
  • "The product is useful for analytics."
  • "The deployment of the product is easy."
  • "The features we find most valuable are the machine learning, data learning, and Spark Analytics."
  • "It is highly scalable, allowing you to efficiently work with extensive datasets that might be problematic to handle using traditional tools that are memory-constrained."
  • "Now, when we're tackling sentiment analysis using NLP technologies, we deal with unstructured data: customer chats, feedback on promotions or demos, and even media like images, audio, and video files. For processing such data, we rely on PySpark. Beneath the surface, Spark functions as a compute engine with in-memory processing capabilities, enhancing performance through features like broadcasting and caching. It's become a crucial tool, widely adopted by 90% of companies for a decade or more."
  • "This solution provides a clear and convenient syntax for our analytical tasks."
  • "Its scalability and speed are very valuable. You can scale it a lot. It is a great technology for big data. It is definitely better than a lot of earlier warehouse or pipeline solutions, such as Informatica. Spark SQL is very compliant with normal SQL that we have been using over the years. This makes it easy to code in Spark. It is just like using normal SQL. You can use the APIs of Spark or you can directly write SQL code and run it. This is something that I feel is useful in Spark."

Zadara
  • "Zadara Storage Cloud having 24/7 management saves me support and engineering costs because the storage and computing are managed by a third-party. We are able to focus more attention on the customer, which is truly our core business. Even at 1:00 AM or 2:00 AM at night, someone will answer, which is important."
  • "A nice feature is the immutable object storage, which can be used in conjunction with Veeam."
  • "The most valuable feature is the flexibility in terms of deployment options."
  • "One of the most valuable features is its integration with other cloud solutions. We have a presence within Amazon EC2 and we leverage compute instances in there. Being able to integrate with compute, both locally within Zadara, as well as with other cloud vendors such as Amazon, is very helpful, while also being able to maintain extremely low latency between those connections."
  • "The processing is much faster with this product."
  • "Being able to scale on demand, and being able to get out of our security operation center, and not having to purchase hardware upfront, has drastically reduced the overhead that was required to maintain our information. We have also gained additional capabilities in terms of speed of replicating that information."
  • "The most valuable features of Zadara are its visibility and simplicity to use."
  • "The most valuable feature of Zadara is its ease of use and safety. Overall the solution is a complete package, it has all the features needed."

Cons
Apache Spark
  • "When you want to extract data from your HDFS and other sources then it is kind of tricky because you have to connect with those sources."
  • "Apache Spark can improve the use case scenarios from the website. There is not any information on how you can use the solution across the relational databases toward multiple databases."
  • "It would be beneficial to enhance Spark's capabilities by incorporating models that utilize features not traditionally present in its framework."
  • "Apart from the restrictions that come with its in-memory implementation. It has been improved significantly up to version 3.0, which is currently in use."
  • "We've had problems using a Python process to try to access something in a large volume of data. It crashes if somebody gives me the wrong code because it cannot handle a large volume of data."
  • "It requires overcoming a significant learning curve due to its robust and feature-rich nature."
  • "The solution’s integration with other platforms should be improved."
  • "In data analysis, you need to take real-time data from different data sources. You need to process this in a subsecond, do the transformation in a subsecond, and all that."

Zadara
  • "The range of support of VMware could be better. It can support Windows, however, it cannot support other operating systems like IBM AIX. This needs to improve."
  • "The initial setup of the solution is complex."
  • "There are still some storage features that they lack. For example, other vendors implemented the auto-tiering feature a long time ago, while Zadara Storage Cloud is just coming out with this feature today. So, they are a little bit late compared to the market."
  • "Having iSCSI over the internet using a VPN, the IPSec tunnel is really the only thing that I find missing from this product."
  • "Cost-wise, because it's a pay-per-use model, it may ultimately end up costing us more in the long run than something we developed ourselves."
  • "The management interface is more geared towards end-users rather than a service partner like ourselves, and there are improvements that can be made around that."
  • "In the next release, there can be some improvements to the web console by adding more features because the console is simple. Additionally, the calculator could improve."
  • "Some of the features are a little bit slow to come to market."

Pricing and Cost Advice
  • "Since we are using the Apache Spark version, not the data bricks version, it is an Apache license version, the support and resolution of the bug are actually late or delayed. The Apache license is free."
  • "Apache Spark is open-source. You have to pay only when you use any bundled product, such as Cloudera."
  • "We are using the free version of the solution."
  • "Apache Spark is not too cheap. You have to pay for hardware and Cloudera licenses. Of course, there is a solution with open source without Cloudera."
  • "Apache Spark is an expensive solution."
  • "Spark is an open-source solution, so there are no licensing costs."
  • "On the cloud model can be expensive as it requires substantial resources for implementation, covering on-premises hardware, memory, and licensing."
  • "It is an open-source solution, it is free of charge."

  • "For our use, it's appropriately priced and overall, it's proved to be very cost-effective against other tier-one vendors."
  • "The pricing and licensing are very simple and the cost is predictable, although, like everything that you pay for as you use, you have to be mindful of what you're using."
  • "It is a nice licensing model and it makes it quite simple because we just pay for what we use, and the bill that comes shows us exactly what customers are using what resources."
  • "The pricing is very competitive and the fact that they have very compelling discounts for multi-year commitments is great."
  • "The price of Zadara is very good and it covers everything. There is no subscription needed."
  • "One of the factors that ruled out several providers was cost. They were way too expensive for the volume of data that we needed and the speed at which we needed to be able to manage it. There aren't a lot of providers that can do that."
  • "If you just take the street price of Zadara Storage Cloud and look up the price or cost per hour, then you could think that Zadara Storage Cloud is extremely expensive or a solution only for enterprise use. That is not true. You need to compare the entire system. This means that you don't stop looking at just the street price, but you need to consider all the features, requirements, and costs of support as well as the extra cost that other vendors have. Other players just play with hidden, additional costs. Everything is included in Zadara Storage Cloud's licensing cost; what you get is what you pay for."

    Use our free recommendation engine to learn which Compute Service solutions are best for your needs.
    Questions from the Community
    Top Answer: We use Spark to process data from different data sources.
    Top Answer: In data analysis, you need to take real-time data from different data sources. You need to process this in a subsecond, and do the transformation in a subsecond.
    Top Answer: The most valuable features of Zadara are its visibility and simplicity to use.
    Top Answer: In the next release, there can be some improvements to the web console by adding more features because the console is simple. Additionally, the calculator could improve.
    Top Answer: We use services on Zadara for our customers in Brazil.
    Ranking
                                Apache Spark    Zadara
    Rank in Compute Service     5th of 16       9th of 16
    Views                       3,093           95
    Comparisons                 2,345           43
    Reviews                     25              3
    Average Words per Review    432             1,918
    Rating                      8.7             9.3
    Comparisons (Zadara)
    MinIO: compared 23% of the time.
    Amazon S3: compared 21% of the time.
    Wasabi: compared 10% of the time.
    Red Hat Ceph Storage: compared 8% of the time.
    Overview

    Spark provides programmers with an application programming interface centered on a data structure called the resilient distributed dataset (RDD), a read-only multiset of data items distributed over a cluster of machines and maintained in a fault-tolerant way. It was developed in response to limitations in the MapReduce cluster computing paradigm, which forces a particular linear dataflow structure on distributed programs: MapReduce programs read input data from disk, map a function across the data, reduce the results of the map, and store reduction results on disk. Spark's RDDs function as a working set for distributed programs that offers a (deliberately) restricted form of distributed shared memory.
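
The RDD idea described above can be illustrated with a small, single-process sketch. This is not Spark's API: the `ToyRDD` class, its `parallelize`, `map`, `reduce`, and `lose_partition` methods, and the round-robin partitioning are all assumptions made for illustration. It only tries to show three properties from the description: datasets are read-only, computed partitions stay in memory as a reusable working set, and a lost partition can be rebuilt from its lineage rather than from a replica.

```python
import functools
import operator


class ToyRDD:
    """Toy stand-in for an RDD (hypothetical class, not Spark's API)."""

    def __init__(self, partitions=None, parent=None, fn=None):
        self._partitions = partitions  # source data as tuples; None if derived
        self._parent = parent          # lineage: the dataset this one came from
        self._fn = fn                  # lineage: the function applied to the parent
        self._cache = {}               # partition index -> computed tuple (in memory)

    @classmethod
    def parallelize(cls, data, num_partitions):
        # Split the input round-robin into immutable partitions.
        items = list(data)
        parts = [tuple(items[i::num_partitions]) for i in range(num_partitions)]
        return cls(partitions=parts)

    @property
    def num_partitions(self):
        if self._parent is None:
            return len(self._partitions)
        return self._parent.num_partitions

    def map(self, fn):
        # Transformations never mutate; they return a new dataset with lineage.
        return ToyRDD(parent=self, fn=fn)

    def compute(self, index):
        # Reuse the in-memory result if present, otherwise rebuild from lineage.
        if index not in self._cache:
            if self._parent is None:
                self._cache[index] = self._partitions[index]
            else:
                parent_items = self._parent.compute(index)
                self._cache[index] = tuple(self._fn(x) for x in parent_items)
        return self._cache[index]

    def lose_partition(self, index):
        # Simulate a machine failure that drops one computed partition.
        self._cache.pop(index, None)

    def reduce(self, fn):
        items = [x for i in range(self.num_partitions) for x in self.compute(i)]
        return functools.reduce(fn, items)


calls = []  # records every invocation of the mapped function


def square(x):
    calls.append(x)
    return x * x


numbers = ToyRDD.parallelize(range(1, 7), num_partitions=3)
squares = numbers.map(square)

total = squares.reduce(operator.add)        # first action computes all partitions
largest = squares.reduce(max)               # second action reuses the working set
squares.lose_partition(1)                   # drop one computed partition
total_again = squares.reduce(operator.add)  # only the lost partition is recomputed
```

In this sketch the second action does not call `square` again, and after the simulated failure only the two items of the lost partition are recomputed. A MapReduce-style job would instead write the intermediate result to disk and read it back for each subsequent pass.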

    Zadara is a powerful enterprise-level storage solution whose design enables it to handle every aspect of a user’s data storage needs. It can be deployed in any location, using any protocol, and storing any data type that an organization requires. With Zadara, organizations can do everything that they were able to do with more traditional systems in a cheaper and more efficient way. 

    Zadara Benefits

    Some of the ways that organizations can benefit by choosing to deploy Zadara include:

    • Security. Zadara is built with data security in mind. All of a user’s data is heavily encrypted by Zadara’s software so that it is safe both while it is in storage and while it is being transferred somewhere else. It comes with role-based access controls that prevent a user’s data from being seen by those who lack the proper level of authorization. It also comes with dual-factor authentication and other similar methods of identity management. 
    • Reduced costs. While Zadara offers enterprise-level storage capabilities, the costs associated with utilizing it are far less than those of similar solutions. Some of the resources that it offers are totally free of cost. The model that it utilizes is a pay-as-you-go model. Organizations only pay for the resources that they use and nothing more. Users are not required to sign up for any particular period of time. They are in complete and total control of what they spend on the solution. 
    • Storage efficiency. Zadara ensures that users are able to store their data in the most efficient way possible. It takes a number of approaches that prevent inefficient data storage. These approaches are:

      1. Thin-provision volumes. This method only allows actually written data to be counted toward the storage capacity that is consumed.

      2. Pattern removal. Zadara searches a user’s stored data for repetitive binary sequences. Once it identifies the repetitive binary sequences, it deletes them.

      3. Inline deduplication (all-flash only). This method ensures that Zadara only stores unique data blocks in its system.

      4. Inline compression. Zadara takes data that is stored in its system and compresses it. This keeps larger blocks of data from taking up too much room.
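
As a rough illustration of how thin provisioning, inline deduplication, and inline compression can combine, here is a minimal sketch. It does not describe Zadara's implementation: the `ToyVolume` class, the 4 KiB `BLOCK_SIZE`, the use of SHA-256 digests to detect identical blocks, and zlib as the compressor are all assumptions chosen for the example. Pattern removal is not modeled, and overwritten blocks are never garbage-collected.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # assumed block size for this sketch


class ToyVolume:
    """Hypothetical thin-provisioned volume with inline dedup and compression."""

    def __init__(self, provisioned_blocks):
        self.provisioned_blocks = provisioned_blocks  # advertised size, not reserved
        self.block_map = {}  # logical block number -> content digest
        self.store = {}      # content digest -> compressed block bytes

    def write(self, lbn, data):
        if not 0 <= lbn < self.provisioned_blocks:
            raise IndexError("block outside provisioned range")
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block")
        digest = hashlib.sha256(data).hexdigest()
        # Inline deduplication: store each unique block only once.
        if digest not in self.store:
            # Inline compression: compress before the block is stored.
            self.store[digest] = zlib.compress(data)
        self.block_map[lbn] = digest

    def read(self, lbn):
        digest = self.block_map.get(lbn)
        if digest is None:
            # Thin provisioning: unwritten blocks consume nothing and read as zeros.
            return bytes(BLOCK_SIZE)
        return zlib.decompress(self.store[digest])

    def consumed_bytes(self):
        # Only unique, compressed blocks count toward consumed capacity.
        return sum(len(blob) for blob in self.store.values())


vol = ToyVolume(provisioned_blocks=1_000_000)
block_a = b"A" * BLOCK_SIZE
block_b = bytes(range(256)) * 16  # a second, different 4096-byte block

vol.write(0, block_a)
vol.write(1, block_a)  # duplicate content: no new block is stored
vol.write(2, block_b)
```

Three logical blocks are written here, but only two unique blocks are stored, and the consumed capacity is the size of their compressed forms rather than the million blocks that were provisioned.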

    Zadara Features

    • Centralized management. Zadara comes with a centralized dashboard that enables organizations to monitor and control their storage operations from a single location.
    • File analytics. Organizations can leverage a powerful analytics package that can provide them with critical insights. These tools can help users sort through their data and make more informed data management decisions.

    • Remote mirroring. This feature enables users to replicate data in two remote locations at the same time. This makes it easy for data to be recovered in the event of an unexpected loss. It also aids remote teams in their collaboration efforts by giving them access to the same data at the same time. 

    Reviews from Real Users

    Zadara is a highly effective solution that stands out when compared to many of its competitors. Two major advantages it offers are its extensive suite of cloud solution integrations and its object storage capability. 

    Steve H., the chief technology officer at Pratum, writes, “One of the most valuable features is its integration with other cloud solutions. We have a presence within Amazon EC2 and we leverage compute instances in there. Being able to integrate with compute, both locally within Zadara, as well as with other cloud vendors such as Amazon, is very helpful, while also being able to maintain extremely low latency between those connections.”

    Mauro R., the CEO of Momit SRL, says, “The object storage feature is wonderful. With traditional storage, you have a cost per gigabyte that is extremely high or related to the number of disks. With Zadara Storage Cloud, you have a cost per gigabyte that you can cut and tailor to your needs independent of the number or size of the disks.”

    Sample Customers
    Apache Spark: NASA JPL, UC Berkeley AMPLab, Amazon, eBay, Yahoo!, UC Santa Cruz, TripAdvisor, Taboola, Agile Lab, Art.com, Baidu, Alibaba Taobao, EURECOM, Hitachi Solutions
    Zadara: Time, Inc., A&E Network, The Washington Post, News UK, McGraw Hill, Gilt, Toshiba, Deloitte, VMware
    Top Industries
    REVIEWERS
    Computer Software Company: 30%
    Financial Services Firm: 15%
    University: 9%
    Marketing Services Firm: 6%
    VISITORS READING REVIEWS
    Financial Services Firm: 24%
    Computer Software Company: 13%
    Manufacturing Company: 7%
    Comms Service Provider: 6%
    VISITORS READING REVIEWS
    Computer Software Company: 24%
    Manufacturing Company: 13%
    Financial Services Firm: 8%
    Government: 7%
    Company Size
    Apache Spark
    REVIEWERS
    Small Business: 40%
    Midsize Enterprise: 19%
    Large Enterprise: 40%
    VISITORS READING REVIEWS
    Small Business: 17%
    Midsize Enterprise: 12%
    Large Enterprise: 71%
    Zadara
    REVIEWERS
    Small Business: 73%
    Midsize Enterprise: 9%
    Large Enterprise: 18%
    VISITORS READING REVIEWS
    Small Business: 35%
    Midsize Enterprise: 15%
    Large Enterprise: 50%

    Apache Spark is ranked 5th in Compute Service with 60 reviews, while Zadara is ranked 9th in Compute Service with 9 reviews. Apache Spark is rated 8.4, while Zadara is rated 8.8. The top reviewer of Apache Spark writes "Reliable, able to expand, and handle large amounts of data well". On the other hand, the top reviewer of Zadara writes "We're able to scale up or down almost instantly, and changes are handled efficiently by their managed services team". Apache Spark is most compared with Spring Boot, AWS Batch, Spark SQL, SAP HANA and Cloudera Distribution for Hadoop, whereas Zadara is most compared with MinIO, Amazon S3, Nutanix Cloud Infrastructure (NCI), Wasabi and Red Hat Ceph Storage. See our Apache Spark vs. Zadara report.


    We monitor all Compute Service reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.