MR debugging by taking JVM heap dumps
Taking Heap Dump manually:
jmap -histo:live <pid> (histogram of live objects)
jmap -dump:live,format=b,file=file-name.bin <pid> (dump the JVM heap to a file on disk)
- Log on to the datanode where the map/reduce JVM is running and run ps -eaf | grep <attempt_id> to get the pid.
- Use sudo -u <appropriate user> to take the heap dump with the jmap command.
- Never use the -f option when taking the dump with jmap.
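The steps above can be sketched as a small helper script. The attempt id and the mapred user below are placeholders; substitute whatever your cluster actually uses. The jmap command itself is only echoed here, not executed:

```shell
#!/bin/sh
# Hypothetical attempt id; take the real one from the JobTracker UI.
ATTEMPT="attempt_201101010000_0001_m_000000_0"

# Find the pid of the task JVM whose command line mentions the attempt id
# (grep -v grep drops our own grep from the process listing).
find_task_pid() {
  ps -eaf | grep "$1" | grep -v grep | awk '{print $2}' | head -n 1
}

PID=$(find_task_pid "$ATTEMPT")
# Take the dump as the user owning the task JVM (assumed 'mapred' here);
# note: no -f flag, per the warning above. Echoed rather than run:
echo "sudo -u mapred jmap -dump:live,format=b,file=/tmp/$ATTEMPT.bin $PID"
```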
To analyse the dump, use jhat:
jhat -port <portno> <heap_file_path>
jhat starts an HTTP server on the given port; the analysis can then be browsed at http://localhost:<portno>.
What to look for in the jhat analysis:
- Object addresses with the highest memory footprint
- Objects with the highest instance counts
Set the following option in the job configuration:
set mapred.child.java.opts '-Xmx512m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/@taskid@S2sSdebug.hprof'
This launches the map/reduce task JVM with the specified values, giving us a handle to control various JVM memory-related parameters.
A few things to note:
- -Xmx512m : maximum heap size for the task JVM (512 MB here)
- -XX:+HeapDumpOnOutOfMemoryError : dump the heap to disk when the JVM runs out of memory
- -XX:HeapDumpPath=/tmp/@taskid@S2sSdebug.hprof : @taskid@ is replaced by the Hadoop framework with the actual task id, which is unique.
Taking a heap dump on OutOfMemoryError and collecting the dump files from the datanodes into an HDFS location for further analysis.
The options above require one to log on to each datanode on which a map/reduce task was spawned and run jmap and jhat there. For an MR job with hundreds of map/reduce tasks this quickly becomes impractical. The approach below provides a mechanism to collect all heap dumps in a single HDFS location.
Make a shell script named dump.sh:

#!/bin/sh
# Flatten the task working directory into the file name, so we can
# tell which heap dump belongs to which task.
text=`echo $PWD | sed 's=/=_=g'`
hadoop fs -put heapdump.hprof /user/kunalg/hprof/$text.hprof
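The sed expression above just flattens the task's working-directory path into a file-name-safe string. A quick sketch of what it produces (the example path is hypothetical):

```shell
# Replace every '/' in a path with '_' so it can serve as a flat file name.
flatten_path() {
  echo "$1" | sed 's=/=_=g'
}

flatten_path "/hadoop/jobcache/job_001/attempt_001/work"
# -> _hadoop_jobcache_job_001_attempt_001_work
```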
- Place the dump.sh script in an HDFS location, for example: hadoop dfs -put dump.sh /user/kunalg/dump.sh
- Create a directory on HDFS where you want to gather all the heap dumps and give it 777 permissions, for example: hadoop dfs -chmod -R 777 /user/kunalg/hprof
- Set the following properties in the MR job:
- set mapred.child.java.opts '-Xmx256m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=./heapdump.hprof -XX:OnOutOfMemoryError=./dump.sh'
- set mapred.create.symlink 'yes'
- set mapred.cache.files 'hdfs:///user/kunalg/dump.sh#dump.sh'
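A note on the mapred.cache.files value: with symlink creation enabled, the fragment after '#' becomes the name of the symlink created in the task's working directory, which is why the script can be invoked as ./dump.sh. A minimal sketch of that naming rule:

```shell
# The part after the last '#' in a cache-file URI is the local symlink name.
symlink_name() {
  echo "${1##*#}"
}

symlink_name 'hdfs:///user/kunalg/dump.sh#dump.sh'
# -> dump.sh
```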
One can verify sane execution of the script in the task's stdout log:

java.lang.OutOfMemoryError: Java heap space
Dumping heap to ./heapdump.hprof ...
Heap dump file created [12039655 bytes in 0.081 secs]
#
# java.lang.OutOfMemoryError: Java heap space
# -XX:OnOutOfMemoryError="./dump.sh"
#   Executing /bin/sh -c "./dump.sh"...

Use the Hadoop default profiler for profiling and finding issues
set mapred.task.profile 'true';
set mapred.task.profile.params '-agentlib:hprof=cpu=samples,heap=sites,depth=6,force=n,thread=y,verbose=n,file=%s';
set mapred.task.profile.maps '0-1';
set mapred.task.profile.reduces '0-1';
The profiler will provide details for the JVM tasks in the specified range. The output will be available in the task logs, under the profile.out section.