Hive on MR3 & Spark on MR3

The easiest way to run Hive and Spark on Kubernetes

Run Hive and Spark on Kubernetes - Easy and Fast

Hive on MR3 allows Apache Hive to run directly on Kubernetes without requiring an additional Hadoop layer. Spark on MR3 allows Apache Spark to share the Hive Metastore in the same Kubernetes cluster. The enabling technology is MR3, an execution engine that provides native support for Kubernetes. Our product can also package Grafana, Superset, and Apache Ranger.

We provide a quick and ready solution to the following problems.

#1. You want to migrate from Hadoop to Kubernetes and continue to use Hive.


As enterprise environments gravitate towards Kubernetes at an accelerating pace, the industry is looking for a solution that enables Hive to run on Kubernetes. Unfortunately, the only solution available today is an expedient one: first operate Hadoop on Kubernetes and then run Hive on Hadoop, thereby introducing two layers of complexity. Hive on MR3 is a ready solution to this problem, as it runs Hive directly on Kubernetes.

#2. You want the speed of LLAP without setting it up.


LLAP (Low-Latency Analytical Processing) is a major component of Hive that allows it to far outperform competing technologies such as Presto and Spark SQL. Unfortunately, enabling and configuring LLAP is excruciatingly difficult because of its complex architecture. Hive on MR3 automatically achieves the speed of LLAP, whether on Hadoop or on Kubernetes.

#3. You want to run Hive and Spark sharing Metastore.


A common use case is to run Hive as a BI solution for its performance and Spark as an ETL solution for its flexibility. Our solution automatically configures Hive and Spark to share a single Metastore, thus making it trivial to run Hive and Spark together on Kubernetes, as the sketch below illustrates.
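
Once the shared Metastore service is reachable inside the Kubernetes cluster, a Spark application only needs to point at it. The following is a minimal sketch in PySpark; the Thrift address metastore.hivemr3.svc.cluster.local:9083 is a hypothetical service name and port, not a default of our product, so substitute the Metastore address of your own cluster.

    from pyspark.sql import SparkSession

    # Minimal sketch: connect a Spark application to the Metastore shared with Hive.
    # The Thrift address below is hypothetical; use the Metastore service name and
    # port configured in your own Kubernetes cluster.
    spark = (
        SparkSession.builder
        .appName("shared-metastore-example")
        .config("hive.metastore.uris", "thrift://metastore.hivemr3.svc.cluster.local:9083")
        .enableHiveSupport()
        .getOrCreate()
    )

    # Tables created by Hive are visible to Spark through the shared Metastore.
    spark.sql("SHOW TABLES").show()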

#4. You want to maximize resource utilization of Spark applications.


While building a Spark cluster on Kubernetes is relatively easy, optimizing compute resources for a multi-tenant cluster is far from trivial. This is because different Spark applications maintain their own sets of executors and do not share compute resources. As a result, resource utilization can be especially low if short-running Spark applications are executed frequently. Our solution enables multiple Spark applications to share compute resources, thus significantly increasing resource utilization.


Pricing


Our solution is free for up to 1 terabyte of aggregate memory of worker Pods on Kubernetes. By purchasing a commercial use license, the customer can raise this limit on the aggregate memory of worker Pods.
Commercial use licenses start at 2 terabytes and increase in increments of 1 terabyte. A commercial use license costs $867 (in US Dollars) per terabyte per month, which is roughly equivalent to paying $0.075 per hour for a node with 64 gigabytes of memory.
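
As a rough check of this equivalence, assuming a 30-day month of 720 hours, the arithmetic can be sketched as follows:

    # Back-of-the-envelope check of the pricing equivalence above.
    # Assumes a 30-day month (720 hours); actual billing terms may differ.
    price_per_tb_month = 867.0      # USD per terabyte of worker-Pod memory per month
    node_memory_tb = 64 / 1024      # a node with 64 gigabytes of memory, in terabytes
    hours_per_month = 24 * 30

    price_per_node_month = price_per_tb_month * node_memory_tb    # about 54.19 USD
    price_per_node_hour = price_per_node_month / hours_per_month  # about 0.075 USD

    print(f"{price_per_node_hour:.3f} USD per hour")   # prints "0.075 USD per hour"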

Ready to get started?

If you are interested in our solution, you can try it yourself or request a demo. For any questions about commercial use licenses, our discount program, or technical support, please contact us.

Latest news