Cloudera Enterprise 6.0.x

Exposing Hadoop Metrics to Ganglia

Core Hadoop services and HBase can write their metrics to Ganglia, a data representation and visualization tool.

HDFS, YARN, and HBase support the Metrics2 framework; MapReduce1 and HBase support the Metrics framework. See the Cloudera blog post, What is Hadoop Metrics2?

Configure Hadoop Metrics for Ganglia Using Cloudera Manager

Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)

  1. Go to the Home page by clicking the Cloudera Manager logo.
  2. Click Configuration > Advanced Configuration Snippets.
  3. Search on the term Metrics.
  4. To configure HDFS, YARN, or HBase, use Hadoop Metrics2 Advanced Configuration Snippet (Safety Valve). For MapReduce1 (or HBase), use Hadoop Metrics Advanced Configuration Snippet (Safety Valve).
  5. Click Edit Individual Values to see the supported daemons and their default groups.
  6. Configure each default group with a metrics class, sampling period, and Ganglia server. See the tables below.
  7. To add optional parameters for socket connection retry, modify this example as necessary:
    # Interval between socket connection retries, in milliseconds
    *.sink.ganglia.retry_socket_interval=60000
    # Number of socket connection retries; set to 0 to disable retries
    *.sink.ganglia.socket_connection_retries=10
  8. To define a filter, configure it on the sink side. Filtering is recommended to prevent YARN metrics from overwhelming the Ganglia server. For example:
    *.source.filter.class=org.apache.hadoop.metrics2.filter.GlobFilter
    *.record.filter.class=${*.source.filter.class}
    *.metric.filter.class=${*.source.filter.class}
    nodemanager.sink.ganglia.record.filter.exclude=ContainerResource*
  9. Click Save Changes.
  10. Restart the cluster or the service, depending on the scope of your changes.
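
The exclude pattern in step 8 is a glob: any record whose name begins with ContainerResource is dropped before it reaches the Ganglia sink. The following is a minimal sketch of that matching, assuming Python's fnmatch module as a stand-in for Hadoop's GlobFilter (the two are similar for a simple trailing-wildcard pattern but are not the same implementation). The record names in the sample list are hypothetical and serve only as illustration.

```python
from fnmatch import fnmatchcase

# Exclude pattern taken from the nodemanager filter example in step 8.
EXCLUDE_PATTERN = "ContainerResource*"


def is_excluded(record_name: str, pattern: str = EXCLUDE_PATTERN) -> bool:
    """Return True if the record name matches the exclude glob (case-sensitive)."""
    return fnmatchcase(record_name, pattern)


# Hypothetical record names, for illustration only.
sample_records = [
    "ContainerResource_container_0001",
    "NodeManagerMetrics",
    "JvmMetrics",
]

# Keep only the records that the exclude pattern does not match.
kept = [name for name in sample_records if not is_excluded(name)]
print(kept)
```

Checking a pattern this way before saving the snippet can help confirm that it removes the per-container records without also hiding daemon-level records you still want to see in Ganglia.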

Ganglia Configuration Settings Per Daemon

Table 1. Hadoop Metrics2 Ganglia Configuration

Service | Daemon Default Group | Ganglia Configuration Settings

HBase | Master and RegionServer
    *.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31
    *.period=10
    hbase.sink.ganglia.servers=<hostname>:<port>

HDFS | DataNode
    *.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31
    *.period=10
    datanode.sink.ganglia.servers=<hostname>:<port>

HDFS | NameNode
    *.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31
    *.period=10
    namenode.sink.ganglia.servers=<hostname>:<port>

HDFS | SecondaryNameNode
    *.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31
    *.period=10
    secondarynamenode.sink.ganglia.servers=<hostname>:<port>

YARN | NodeManager
    *.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31
    *.period=10
    nodemanager.sink.ganglia.servers=<hostname>:<port>

YARN | ResourceManager
    *.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31
    *.period=10
    resourcemanager.sink.ganglia.servers=<hostname>:<port>

YARN | JobHistory Server
    *.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31
    *.period=10
    jobhistoryserver.sink.ganglia.servers=<hostname>:<port>

  Note: To use metrics, set values for each context. For example, for MapReduce1, add values for both the JobTracker Default Group and the TaskTracker Default Group.
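
Every Metrics2 entry in Table 1 follows the same three-line shape and differs only in the daemon prefix of the servers property. As a rough sketch, the lines for a given default group could be generated as shown below. The function name, the host name ganglia.example.com, and the default port 8649 (the usual gmond port) are assumptions for illustration; substitute the values for your own Ganglia server.

```python
# Sink class used for every Metrics2 entry in Table 1.
SINK_CLASS = "org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31"


def metrics2_ganglia_settings(prefix, host, port=8649, period=10):
    """Build the three Metrics2 lines for one daemon default group.

    prefix is the daemon prefix from Table 1, such as "datanode" or "hbase".
    host and port identify the Ganglia server (assumed values in this sketch).
    """
    return [
        f"*.sink.ganglia.class={SINK_CLASS}",
        f"*.period={period}",
        f"{prefix}.sink.ganglia.servers={host}:{port}",
    ]


# Hypothetical Ganglia host, for illustration only.
for line in metrics2_ganglia_settings("datanode", "ganglia.example.com"):
    print(line)
```

The printed lines can be pasted into the Hadoop Metrics2 Advanced Configuration Snippet (Safety Valve) for the matching default group.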

Table 2. Hadoop Metrics Ganglia Configuration

Service | Daemon Default Group | Ganglia Configuration Settings

HBase | Master, RegionServer (set in each default group)
    hbase.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
    hbase.period=10
    hbase.servers=<hostname>:<port>

    jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
    jvm.period=10
    jvm.servers=<hostname>:<port>

    rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
    rpc.period=10
    rpc.servers=<hostname>:<port>

MapReduce1 | JobTracker, TaskTracker (set in each default group)
    dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext
    dfs.period=10
    dfs.servers=<hostname>:<port>

    mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext
    mapred.period=10
    mapred.servers=<hostname>:<port>

    jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext
    jvm.period=10
    jvm.servers=<hostname>:<port>

    rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext
    rpc.period=10
    rpc.servers=<hostname>:<port>

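
In the older Metrics framework, each context (for example hbase, jvm, and rpc) needs its own class, period, and servers lines, as Table 2 shows. The sketch below generates those lines for a list of contexts. It is an illustration only: the function name, the host name ganglia.example.com, and the default port 8649 (the usual gmond port) are assumptions, not values from Cloudera Manager.

```python
def metrics_ganglia_settings(contexts, context_class, host, port=8649, period=10):
    """Build class, period, and servers lines for each Metrics context."""
    lines = []
    for ctx in contexts:
        lines.append(f"{ctx}.class={context_class}")
        lines.append(f"{ctx}.period={period}")
        lines.append(f"{ctx}.servers={host}:{port}")
    return lines


# Contexts and context classes as listed in Table 2.
hbase_lines = metrics_ganglia_settings(
    ["hbase", "jvm", "rpc"],
    "org.apache.hadoop.metrics.ganglia.GangliaContext31",
    "ganglia.example.com",  # hypothetical Ganglia host
)
mr1_lines = metrics_ganglia_settings(
    ["dfs", "mapred", "jvm", "rpc"],
    "org.apache.hadoop.metrics.ganglia.GangliaContext",
    "ganglia.example.com",  # hypothetical Ganglia host
)

print("\n".join(hbase_lines))
```

Following the note above the table, the same generated lines would be added to each relevant default group, for example both JobTracker and TaskTracker for MapReduce1.
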
Page generated July 25, 2018.