platform/kernel/linux-3.10 - Domain: System / Kernel;

diff options

author	Roberto Agostino Vitillo <ravitillo@lbl.gov>	2012-02-09 23:21:03 +0100
committer	Ingo Molnar <mingo@elte.hu>	2012-03-09 08:26:05 +0100
commit	b50311dc2ac1c04ad19163c2359910b25e16caf6 (patch)
tree	80a4489b2b268b7512dd4eb566858a6bae8bfffe /include
parent	bdfebd848f2a14e639031a0b0e61d7c7ee5e5fd2 (diff)
download	linux-3.10-b50311dc2ac1c04ad19163c2359910b25e16caf6.tar.gz linux-3.10-b50311dc2ac1c04ad19163c2359910b25e16caf6.tar.bz2 linux-3.10-b50311dc2ac1c04ad19163c2359910b25e16caf6.zip

perf report: Add support for taken branch sampling

This patch adds support for taken branch sampling, i.e, the PERF_SAMPLE_BRANCH_STACK feature to perf report. In other words, to display histograms based on taken branches rather than executed instructions addresses. The new option is called -b and it takes no argument. To generate meaningful output, the perf.data must have been obtained using perf record -b xxx ... where xxx is a branch filter option. The output shows symbols, modules, sorted by 'who branches where' the most often. The percentages reported in the first column refer to the total number of branches captured and not the usual number of samples. Here is a quick example. Here branchy is simple test program which looks as follows: void f2(void) {} void f3(void) {} void f1(unsigned long n) { if (n & 1UL) f2(); else f3(); } int main(void) { unsigned long i; for (i=0; i < N; i++) f1(i); return 0; } Here is the output captured on Nehalem, if we are only interested in user level function calls. $ perf record -b any_call,u -e cycles:u branchy $ perf report -b --sort=symbol 52.34% [.] main [.] f1 24.04% [.] f1 [.] f3 23.60% [.] f1 [.] f2 0.01% [k] _IO_new_file_xsputn [k] _IO_file_overflow 0.01% [k] _IO_vfprintf_internal [k] _IO_new_file_xsputn 0.01% [k] _IO_vfprintf_internal [k] strchrnul 0.01% [k] __printf [k] _IO_vfprintf_internal 0.01% [k] main [k] __printf About half (52%) of the call branches captured are from main() -> f1(). The second half (24%+23%) is split in two equal shares between f1() -> f2(), f1() ->f3(). The output is as expected given the code. It should be noted, that using -b in perf record does not eliminate information in the perf.data file. Consequently, a typical profile can also be obtained by perf report by simply not using its -b option. It is possible to sort on branch related columns: - dso_from, symbol_from - dso_to, symbol_to - mispredict Signed-off-by: Roberto Agostino Vitillo <ravitillo@lbl.gov> Signed-off-by: Stephane Eranian <eranian@google.com> Cc: peterz@infradead.org Cc: acme@redhat.com Cc: robert.richter@amd.com Cc: ming.m.lin@intel.com Cc: andi@firstfloor.org Cc: asharma@fb.com Cc: vweaver1@eecs.utk.edu Cc: khandual@linux.vnet.ibm.com Cc: dsahern@gmail.com Link: http://lkml.kernel.org/r/1328826068-11713-14-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@elte.hu>

Diffstat (limited to 'include')

0 files changed, 0 insertions, 0 deletions


context:
space:
mode: