Configuring HiveServer2 for CDH
 Important: Because of concurrency
and security issues, HiveServer1 was deprecated in CDH 5.3 and has been removed from CDH 6.
HiveServer2 Memory and Hardware Requirements
| Component | Connections | Java Heap | CPU | Disk |
|---|---|---|---|---|
| HiveServer2 | Single connection | 4 GB | Minimum 4 dedicated cores | Minimum 1 disk |
| | 2-10 connections | 4-6 GB | | |
| | 11-20 connections | 6-12 GB | | |
| | 21-40 connections | 12-16 GB | | |
| | 41-80 connections | 16-24 GB | | |
| Hive Metastore | Single connection | 4 GB | Minimum 4 dedicated cores | Minimum 1 disk, which the Hive metastore uses to store its artifacts |
| | 2-10 connections | 4-10 GB | | |
| | 11-20 connections | 10-12 GB | | |
| | 21-40 connections | 12-16 GB | | |
| | 41-80 connections | 16-24 GB | | |
| Beeline CLI | | Minimum: 2 GB | N/A | N/A |

HiveServer2: Cloudera recommends splitting HiveServer2 into multiple instances and load balancing them once you start allocating more than 12 GB to HiveServer2. The objective is to adjust the size to reduce the impact of Java garbage collection on active processing by the service. Set the heap size using the Java Heap Size of HiveServer2 in Bytes Hive configuration property. For more information, see Tuning Hive in CDH.

Hive Metastore: Set the heap size using the Java Heap Size of Hive Metastore Server in Bytes Hive configuration property. For more information, see Tuning Hive in CDH.
 Important:
These numbers are general guidance only, and can be affected by factors such as number of columns, partitions, complex joins, and client activity. Based on your anticipated deployment, refine through
testing to arrive at the best values for your environment.
For information on configuring heap for HiveServer2, as well as Hive metastore and Hive clients, see Tuning Apache Hive in CDH.
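The HiveServer2 heap guidance above can be read as a simple lookup from connection count to a heap range. The following is a minimal sketch of that lookup, not a Cloudera tool: the class name `Hs2HeapGuide` and its methods are hypothetical, and the byte conversion assumes 1 GB = 1024^3 bytes, which the source does not specify.

```java
/** Illustrative lookup of the HiveServer2 heap guidance from the sizing table. */
final class Hs2HeapGuide {
    // Heap size above which the guidance suggests splitting into multiple instances.
    static final int SPLIT_THRESHOLD_GB = 12;

    /** Returns {minGb, maxGb} for the given number of concurrent connections. */
    static int[] heapRangeGb(int connections) {
        if (connections < 1) {
            throw new IllegalArgumentException("connections must be >= 1");
        }
        if (connections == 1) return new int[] {4, 4};
        if (connections <= 10) return new int[] {4, 6};
        if (connections <= 20) return new int[] {6, 12};
        if (connections <= 40) return new int[] {12, 16};
        if (connections <= 80) return new int[] {16, 24};
        throw new IllegalArgumentException("the table gives no guidance above 80 connections");
    }

    /** True when the upper end of the range exceeds the 12 GB split threshold. */
    static boolean considerSplitting(int connections) {
        return heapRangeGb(connections)[1] > SPLIT_THRESHOLD_GB;
    }

    /** Converts GB to bytes (assumption: 1 GB = 1024^3 bytes). */
    static long gbToBytes(int gb) {
        return gb * 1024L * 1024L * 1024L;
    }
}
```

For example, `Hs2HeapGuide.heapRangeGb(15)` yields the 6-12 GB row, and `gbToBytes` gives a number in the unit that the Java Heap Size of HiveServer2 in Bytes property expects. Treat the result as a starting point for testing, as the note above advises.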
hive.zookeeper.client.port
If ZooKeeper is not using the default client port (2181), set hive.zookeeper.client.port in /etc/hive/conf/hive-site.xml to the value that ZooKeeper is using. To find that value, check the clientPort setting in /etc/zookeeper/conf/zoo.cfg. For example, if clientPort is set to 2222, set hive.zookeeper.client.port to 2222 as well:

    <property>
      <name>hive.zookeeper.client.port</name>
      <value>2222</value>
      <description>The port at which the clients will connect.</description>
    </property>
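The check described above can be sketched in a few lines of Java. This is an illustrative helper, not part of Hive or ZooKeeper: the class `ZkClientPortCheck` is hypothetical, it assumes the setting appears in zoo.cfg as a `clientPort=<number>` line, and it falls back to 2181 when the line is absent, following the default stated in the text.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Illustrative check of whether hive.zookeeper.client.port needs to be set. */
final class ZkClientPortCheck {
    static final int DEFAULT_CLIENT_PORT = 2181;

    /** Reads clientPort from zoo.cfg-style text (key=value lines). */
    static int clientPort(String zooCfgText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(zooCfgText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String value = props.getProperty("clientPort");
        // Assumption: a missing entry means the default port is in use.
        return value == null ? DEFAULT_CLIENT_PORT : Integer.parseInt(value.trim());
    }

    /** True when hive-site.xml must set hive.zookeeper.client.port explicitly. */
    static boolean needsHiveOverride(int zkClientPort) {
        return zkClientPort != DEFAULT_CLIENT_PORT;
    }
}
```

In practice you would pass in the contents of /etc/zookeeper/conf/zoo.cfg, for example via `new String(Files.readAllBytes(Paths.get("/etc/zookeeper/conf/zoo.cfg")))`.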
JDBC driver
The connection URL format and the driver class for HiveServer2:
| Connection URL | Driver Class | 
|---|---|
| jdbc:hive2://<host>:<port> | org.apache.hive.jdbc.HiveDriver | 
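As a rough illustration of how the URL format and driver class fit together, the sketch below builds the connection URL and opens a connection through standard JDBC. The class `HiveServer2Jdbc`, its method names, and the user and password parameters are hypothetical. The `connect` method only works with the Hive JDBC driver JAR on the classpath and a reachable HiveServer2 instance.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

/** Illustrative JDBC helper for HiveServer2. */
final class HiveServer2Jdbc {
    static final String DRIVER_CLASS = "org.apache.hive.jdbc.HiveDriver";

    /** Builds a URL of the form jdbc:hive2://<host>:<port>. */
    static String url(String host, int port) {
        if (host == null || host.isEmpty()) {
            throw new IllegalArgumentException("host is required");
        }
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("port out of range: " + port);
        }
        return "jdbc:hive2://" + host + ":" + port;
    }

    /** Opens a connection; requires the Hive JDBC driver on the classpath. */
    static Connection connect(String host, int port, String user, String password)
            throws ClassNotFoundException, SQLException {
        // Explicit load of the driver class named in the table above.
        Class.forName(DRIVER_CLASS);
        return DriverManager.getConnection(url(host, port), user, password);
    }
}
```

A caller would typically wrap the connection in try-with-resources, for example `try (Connection c = HiveServer2Jdbc.connect("hs2.example.com", 10000, "user", "")) { ... }`, where the host name and credentials are placeholders.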
Authentication
By default, HiveServer2 allows any client to connect, but it can be configured to authenticate all connections. HiveServer2 supports either Kerberos or LDAP authentication; configure this with the hive.server2.authentication property in the hive-site.xml file. You can also configure Pluggable Authentication, which allows you to use a custom authentication provider for HiveServer2, and HiveServer2 Impersonation, which allows users to execute queries and access HDFS files as the connected user rather than the superuser who started the HiveServer2 daemon. For more information, see Hive Security Configuration.
Running HiveServer2
 Important: Because of concurrency and security issues, HiveServer1 was deprecated in CDH 5.3 and has been removed from CDH 6. The Hive CLI is deprecated and will be removed in a future
release. Cloudera recommends you migrate to Beeline and HiveServer2 as soon as possible. The Hive CLI is not needed if you are using Beeline with HiveServer2.
 
To change the TCP port that HiveServer2 listens on (10000 by default), set hive.server2.thrift.port in hive-site.xml. For example, to listen on port 10001:

    <property>
      <name>hive.server2.thrift.port</name>
      <value>10001</value>
      <description>TCP port number to listen on, default 10000</description>
    </property>
You can also specify the port and the host IP address for HiveServer2 by setting these environment variables:
| Port | Host Address | 
|---|---|
| HIVE_SERVER2_THRIFT_PORT | HIVE_SERVER2_THRIFT_BIND_HOST | 
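A client-side script or launcher may need to work out which port and host these settings resolve to. The sketch below is one possible way to do that, not HiveServer2's own logic. The class `Hs2Endpoint` is hypothetical, and it assumes that an environment variable, when set, takes precedence over the value configured in hive-site.xml; verify that behavior for your release.

```java
import java.util.Map;

/** Illustrative resolution of the HiveServer2 port and bind host. */
final class Hs2Endpoint {
    static final String PORT_VAR = "HIVE_SERVER2_THRIFT_PORT";
    static final String HOST_VAR = "HIVE_SERVER2_THRIFT_BIND_HOST";

    /** Returns the port from the environment, else the configured port. */
    static int port(Map<String, String> env, int configuredPort) {
        String raw = env.get(PORT_VAR);
        // Assumption: the environment variable overrides the configured value.
        if (raw == null || raw.trim().isEmpty()) {
            return configuredPort;
        }
        return Integer.parseInt(raw.trim());
    }

    /** Returns the bind host from the environment, else the configured host. */
    static String bindHost(Map<String, String> env, String configuredHost) {
        String raw = env.get(HOST_VAR);
        return (raw == null || raw.trim().isEmpty()) ? configuredHost : raw.trim();
    }
}
```

Passing the environment as a map keeps the helper easy to test; in a real program you would call `Hs2Endpoint.port(System.getenv(), 10000)`.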
©2016 Cloudera, Inc. All rights reserved