We performed a comparison between Apache Spark and QueryIO based on real PeerSpot user reviews.
"It is useful for handling large amounts of data. It is very useful for scientific purposes."
"The processing time is very much improved over the data warehouse solution that we were using."
"One of Apache Spark's most valuable features is that it supports in-memory processing; the execution of jobs compared to traditional tools is very fast."
"Provides a lot of good documentation compared to other solutions."
"The solution is very stable."
"I like that it can handle multiple tasks parallelly. I also like the automation feature. JavaScript also helps with the parallel streaming of the library."
"The distribution of tasks, like the seamless map-reduce functionality, is quite impressive."
"One of the key features is that Apache Spark is a distributed computing framework. You can have multiple slaves and distribute the workload between them."
"Anyone who has even a little bit of knowledge of the solution can begin to create things. You don't have to be technical to use the solution."
"When you want to extract data from your HDFS and other sources then it is kind of tricky because you have to connect with those sources."
"There could be enhancements in optimization techniques, as there are some limitations in this area that could be addressed to further refine Spark's performance."
"The management tools could use improvement. Some of the debugging tools need some work as well. They need to be more descriptive."
"The setup I worked on was really complex."
"Spark could be improved by adding support for other open-source storage layers than Delta Lake."
"At the initial stage, the product provides no container logs to check the activity."
"One limitation is that not all machine learning libraries and models support it."
"Apache Spark could improve the connectors that it supports. There are a lot of open-source databases in the market. For example, cloud databases, such as Redshift, Snowflake, and Synapse. Apache Spark should have connectors present to connect to these databases. There are a lot of workarounds required to connect to those databases, but it should have inbuilt connectors."
"There needs to be some simplification of the user interface."
Apache Spark is ranked 1st in Hadoop with 60 reviews while QueryIO is ranked 16th in Hadoop. Apache Spark is rated 8.4, while QueryIO is rated 8.0. The top reviewer of Apache Spark writes "Reliable, able to expand, and handle large amounts of data well". On the other hand, the top reviewer of QueryIO writes "Stable with good connectivity and good integration capabilities". Apache Spark is most compared with Spring Boot, AWS Batch, Spark SQL, SAP HANA and Cloudera Distribution for Hadoop, whereas QueryIO is most compared with Splice Machine.
We monitor all Hadoop reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.