Aug 1, 2024
Read in 4 minutes
Recently Apache Hive 4 was released after a hiatus of several years. We have released Hive 4 on MR3 which replaces Tez with MR3 as the default execution engine in Hive 4. In this article, we evaluate the performance of Hive 4 on MR3 using the TPC-DS Benchmark. To compare Hive 4 on MR3 with competing technologies, we also evaluate the performance of the latest release of Trino.
Similarly to Tez and the core module of Spark, MR3 uses an execution framework generalizing the two-stage model of MapReduce. It represents computations as DAGs (Directed Acyclic Graphs) which describe how to produce map/reduce tasks to be fetched and executed in workers. In order to support fault tolerance, tasks may spill intermediate data on local disks. In contrast, Trino is an MPP system which does not exploit local disks and relies solely on in-memory storage to store intermediate data. As such, there has been a common misconception that Trino (or formerly Presto) runs much faster than Hive.
Currently Trino promotes itself as a query engine that runs at ludicrous speed on its website. This article demonstrates that Hive 4 on MR3 runs nearly at the same ludicrous speed, but unlike Trino, without sacrificing fault tolerance and without compromising the correctness.
For the experiment, we use a cluster consisting of 1 master and 12 worker nodes with the following properties:
In total, the amount of memory of worker nodes is 12 * 256GB = 3072GB. The cluster runs HDP (Hortonworks Data Platform) 3.1.4 and uses HDFS replication factor of 3.
We use a variant of the TPC-DS benchmark introduced in the previous article which replaces an existing LIMIT clause with a new SELECT clause so that different results from the same query translate to different numbers of rows. The reader can find the set of modified TPC-DS queries in the GitHub repository.
The scale factor for the TPC-DS benchmark is 10TB.
We generate datasets in ORC with Snappy compression.
Both Hive 4 on MR3 and Trino use Java 22 (which is required by Trino 453).
For Trino, we use a JVM option -Xmx196G
and choose the following configuration
after performance tuning:
memory.heap-headroom-per-node=58GB
query.max-total-memory=1680GB
query.max-memory-per-node=120GB
query.max-memory=1440GB
For Hive 4 on MR3, we use the default configuration in the MR3 distribution
except that
we use a JVM option -Xmx86G
for every worker and create two workers on each node.
We sequentially submit 99 queries from the TPC-DS benchmark. We report the total running time, the geometric mean of running times, and the running time of each individual query.
In order to check the correctness, we report the number of rows from each query. If the result contains a single row, we report the sum of all numerical values in it.
For the reader's perusal, we attach the table containing the raw data of the experiment results. Here is a link to [Google Docs].
Both the systems successfully complete all the queries and agree on the results, except that Trino returns wrong answers on query 23 and query 72.
In terms of the total running time, Hive 4 on MR3 runs slightly faster than Trino.
In terms of the geometric mean of running times, Trino responds about 15 percent faster than Hive 4 on MR3.
From the comparison with Trino 453, we conclude that Hive 4 on MR3 is indeed a fast query engine suitable for modern computing environments. It is a practical data warehouse solution that significantly reduces the complexity of operating Apache Hive. For example, one can run Hive 4 on MR3 not only on Hadoop, but also on Kubernetes and even in standalone mode without requiring any resource manager.
If you are interested in Hive 4 on MR3, join MR3 Slack or see Quick Start Guide.