婉兮清扬

A carefree life of books on the desk and wine in the cup

Using JConsole to Monitor Hadoop Processes

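Published: 2015-04-20 17:32:18
Many friends ask how to use JConsole to watch what Hadoop is doing in real time. This is actually quite easy. Assuming the IP address of your master node is 192.168.10.1, all you need to do is add the following configuration to etc/hadoop/hadoop-env.sh. Note that the last option, -Dcom.sun.management.jmxremote.port, is deliberately left without a value; the port number is appended separately for each process: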
export JMX_OPTS="-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false  -Djava.rmi.server.hostname=192.168.10.1 -Dcom.sun.management.jmxremote.local.only=false -Dcom.sun.management.jmxremote.port"
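
Because JMX_OPTS ends with an option that has no value yet, appending "=<port>" to it completes that option. The sketch below illustrates the expansion with an abbreviated JMX_OPTS; the variable names NAMENODE_JMX and DATANODE_JMX are made up for this illustration and are not Hadoop settings.

```shell
# Abbreviated JMX_OPTS for illustration only; the real one carries all the
# options shown above. Note the trailing option without a value.
JMX_OPTS="-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port"

# "=" cannot be part of a variable name, so the shell expands $JMX_OPTS and
# then appends "=8006" literally, completing the port option.
NAMENODE_JMX="$JMX_OPTS=8006"   # hypothetical name, for illustration
DATANODE_JMX="$JMX_OPTS=8007"   # hypothetical name, for illustration

echo "$NAMENODE_JMX"
echo "$DATANODE_JMX"
```

Each process therefore gets the same JMX settings but its own port.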


To monitor the NameNode and the DataNode, append JMX_OPTS and the desired port number to HADOOP_NAMENODE_OPTS and HADOOP_DATANODE_OPTS. In the example below, we use port 8006 for the NameNode and port 8007 for the DataNode.
export HADOOP_NAMENODE_OPTS="-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,NullAppender} $HADOOP_NAMENODE_OPTS $JMX_OPTS=8006"

export HADOOP_DATANODE_OPTS="-Dhadoop.security.logger=ERROR,RFAS $HADOOP_DATANODE_OPTS $JMX_OPTS=8007"

After you start HDFS with start-dfs.sh as usual, you can use JConsole to connect remotely to the JVMs running the NameNode and the DataNode on ports 8006 and 8007, respectively. There is a catch if you are running Hadoop on AWS EC2: if you run JConsole from outside AWS, -Djava.rmi.server.hostname must be set to the Elastic IP (EIP) of the EC2 instance. If you use the private IP, you will only be able to connect from within your VPC.
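
On the machine where you run JConsole, the connection can be given on the command line as host:port. The following is a minimal sketch, assuming the example address and ports used above and that jconsole (which ships with the JDK) is on your PATH; the names MASTER, NAMENODE_TARGET, DATANODE_TARGET and open_jconsole are invented for this illustration.

```shell
# Address of the master node from the example above.
# On EC2, when connecting from outside AWS, use the Elastic IP instead.
MASTER=192.168.10.1

# host:port targets matching the ports chosen in hadoop-env.sh
NAMENODE_TARGET="${MASTER}:8006"
DATANODE_TARGET="${MASTER}:8007"

# Start JConsole with a connection to each daemon.
# Defined as a function so that nothing is launched until you call it.
open_jconsole() {
  jconsole "$NAMENODE_TARGET" "$DATANODE_TARGET"
}

echo "JConsole targets: $NAMENODE_TARGET $DATANODE_TARGET"
```

Calling open_jconsole then starts JConsole with both connections; you can also type the same host:port values into the "Remote Process" field of the JConsole connection dialog.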

If you want to use JConsole to monitor a particular Hadoop application such as WordCount, you can then temporarily modify etc/hadoop/hadoop-env.sh with the following configuration:
export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true $JMX_OPTS=8010"


When you then run your Hadoop application as usual, you can use JConsole to connect remotely to the JVM running it on port 8010. Do not make this configuration persistent. HADOOP_OPTS is reused every time you run a Hadoop command, so each new Hadoop process would try to listen on the same port. (Also, the HADOOP_NAMENODE_OPTS and HADOOP_DATANODE_OPTS settings mentioned above extend HADOOP_OPTS. If we put a persistent JMX_OPTS into HADOOP_OPTS, the NameNode and the DataNode would try to use that JMX port too.)
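
One possible way to avoid editing the file back and forth is to guard the setting with an environment variable, so the JMX port is only added when you ask for it. This is just a sketch and not something Hadoop provides: JMX_APP_PORT and add_app_jmx are invented names, and JMX_OPTS is assumed to be defined earlier in etc/hadoop/hadoop-env.sh as shown above.

```shell
# Hypothetical guard for etc/hadoop/hadoop-env.sh.
# JMX_APP_PORT is an invented variable, not a Hadoop setting.
# JMX_OPTS is assumed to be defined earlier in this file.
add_app_jmx() {
  # Only touch HADOOP_OPTS when both a port and JMX_OPTS are available.
  if [ -n "${JMX_APP_PORT:-}" ] && [ -n "${JMX_OPTS:-}" ]; then
    export HADOOP_OPTS="${HADOOP_OPTS:-} ${JMX_OPTS}=${JMX_APP_PORT}"
  fi
}
add_app_jmx
```

With this in place, you would set the variable for a single command only, for example by prefixing your usual job submission with JMX_APP_PORT=8010. Commands run without the variable, including start-dfs.sh, are left unchanged.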

云与清风常拥有,
冰雪知音世难求。
击节纵歌相对笑,
案上诗书杯中酒。

蒋清野
2000.12.31 于 洛杉矶