Flame Graph
Introduction
Flame graphs are a visualization of profiled software, allowing the most frequent code-paths to be identified quickly and accurately. They can be generated using open source programs on github.com/brendangregg/FlameGraph, which create interactive SVGs.
Example Usage
The following step-by-step instructions use seti (the SETI@home client) as the example workload:
- Install systemtap and clone nginx-systemtap-toolkit:
apt-get install -y linux-tools-`uname -r` git systemtap
git clone --depth 1 https://github.com/wxdublin/nginx-systemtap-toolkit.git
- Get FlameGraph:
git clone --depth 1 https://github.com/brendangregg/FlameGraph.git
- Install the kernel debug symbols.
- Install BOINC and seti:
apt-get install boinc-app-seti-dbg
- Create a SETI@home account, copy the account key, and attach the project:
boinccmd --project_attach http://setiathome.berkeley.edu <app key>
- Generate the on-CPU user-space flame graph:
cd ~/nginx-systemtap-toolkit
./sample-bt -p `pgrep seti | head -n 1` -t 5 -u > on.user
c++filt < on.user | ~/FlameGraph/stackcollapse-stap.pl | ~/FlameGraph/flamegraph.pl > on.user.svg
- Generate the on-CPU kernel-space flame graph:
./sample-bt -p `pgrep seti | head -n 1` -t 5 -k > on.kern
c++filt < on.kern | ~/FlameGraph/stackcollapse-stap.pl | ~/FlameGraph/flamegraph.pl > on.kern.svg
- Generate the off-CPU user-space flame graph:
cd ~/nginx-systemtap-toolkit
./sample-bt-off-cpu -p `pgrep seti | head -n 1` -t 5 -u > off.user
c++filt < off.user | ~/FlameGraph/stackcollapse-stap.pl | ~/FlameGraph/flamegraph.pl > off.user.svg
- Generate the off-CPU kernel-space flame graph:
./sample-bt-off-cpu -p `pgrep seti | head -n 1` -t 5 -k > off.kern
c++filt < off.kern | ~/FlameGraph/stackcollapse-stap.pl | ~/FlameGraph/flamegraph.pl > off.kern.svg
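The four graphs above differ only in the sampling tool (sample-bt or sample-bt-off-cpu) and in the -u or -k flag. As a rough sketch, a small helper can print the matching pair of commands so they can be reviewed before running. The function name flame_cmds is hypothetical and not part of either toolkit; the paths and the 5-second duration are simply the ones used in the steps above.

```shell
# Print the sampling and rendering commands for one flame graph.
# $1 = on | off     (on-CPU or off-CPU sampling)
# $2 = user | kern  (user-space or kernel-space stacks)
# $3 = PID of the process to sample
# flame_cmds is a hypothetical helper, not part of nginx-systemtap-toolkit.
flame_cmds() {
    case "$1" in
        on)  tool=./sample-bt ;;
        off) tool=./sample-bt-off-cpu ;;
        *)   echo "usage: flame_cmds on|off user|kern PID" >&2; return 1 ;;
    esac
    case "$2" in
        user) flag=-u ;;
        kern) flag=-k ;;
        *)    echo "usage: flame_cmds on|off user|kern PID" >&2; return 1 ;;
    esac
    out="$1.$2"   # e.g. on.user, off.kern (same names as in the steps above)
    printf '%s -p %s -t 5 %s > %s\n' "$tool" "$3" "$flag" "$out"
    printf 'c++filt < %s | ~/FlameGraph/stackcollapse-stap.pl | ~/FlameGraph/flamegraph.pl > %s.svg\n' "$out" "$out"
}
```

For example, running flame_cmds off kern "$(pgrep seti | head -n 1)" from ~/nginx-systemtap-toolkit prints the two commands of the off-CPU kernel step; piping that output to sh would execute them.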
Analyze
In the user-level on-CPU graph, the top edges of the boxes show which functions are running on the CPU. The function seti_analyze is the main consumer; more specifically, the time goes to its child functions: 1) math routines: lcgf, float_to_uchar, f_getChiSq; 2) memory-related: GAUSS_INFO::GAUSS_INFO and GAUSS_INFO::~GAUSS_INFO.
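The "top edge" reading can also be done numerically on the folded stacks that stackcollapse-stap.pl writes, where each line is a semicolon-separated stack followed by a count. The sketch below sums the counts per leaf (top-of-stack) frame; leaf_totals is a hypothetical helper name, and it assumes each line ends in a single space followed by the count.

```shell
# Sum counts per leaf (top-of-stack) frame from folded stacks on stdin.
# Each input line is assumed to look like "frame1;frame2;...;leaf COUNT".
# leaf_totals is a hypothetical helper, not part of FlameGraph.
leaf_totals() {
    awk '{
        count = $NF                    # last field is the count
        stack = $0
        sub(/ [^ ]+$/, "", stack)      # strip the trailing " COUNT"
        n = split(stack, frames, ";")  # frames[n] is the leaf frame
        total[frames[n]] += count
    }
    END { for (f in total) print total[f], f }' | sort -rn
}
```

A possible use is c++filt < on.user | ~/FlameGraph/stackcollapse-stap.pl | leaf_totals | head, which lists the functions that were most often found running on the CPU.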
The kernel-level on-CPU graph shows that the kernel does only four things: timer management, memory allocation, user/kernel switching, and memory freeing.
The kernel-level off-CPU graph shows only that the kernel is doing a lot of scheduling.
The user-level off-CPU graph is the hardest to understand: what took the app off the CPU is not clear at all. The root cause is that I traced only the user level. To capture both kernel and user stacks (as perf does), I ran:
Off-CPU kernel/user:
./sample-bt-off-cpu -p `pgrep seti | head -n 1` -t 5 -k -u > off.all
c++filt < off.all | ~/FlameGraph/stackcollapse-stap.pl | ~/FlameGraph/flamegraph.pl > off.all.svg
From the graph above the cause is clear: the app invokes a syscall, and on return resume_userspace is called; because the app's timeslice is already used up, another thread (or process) is scheduled, which puts the app in the off-CPU state.
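To put a number on that observation, one option is to measure how much of the folded off-CPU data passes through a given frame such as resume_userspace. The sketch below is an assumption-laden illustration: frame_share is a hypothetical helper, it matches frame names exactly, and it treats the trailing number on each folded line as the weight, whatever unit the collector emitted.

```shell
# Report how much of the total weight belongs to stacks containing frame $1.
# Input: folded stacks on stdin, "frame1;frame2;...;leaf WEIGHT" per line.
# frame_share is a hypothetical helper, not part of FlameGraph.
frame_share() {
    awk -v frame="$1" '{
        weight = $NF
        stack = $0
        sub(/ [^ ]+$/, "", stack)      # strip the trailing " WEIGHT"
        all += weight
        n = split(stack, frames, ";")
        for (i = 1; i <= n; i++)
            if (frames[i] == frame) { hit += weight; break }
    }
    END {
        if (all > 0)
            printf "%d of %d (%.1f%%)\n", hit, all, 100 * hit / all
    }'
}
```

For instance, c++filt < off.all | ~/FlameGraph/stackcollapse-stap.pl | frame_share resume_userspace would report the share of off-CPU weight whose stacks include resume_userspace.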