summaryrefslogtreecommitdiff
path: root/tools/perf/builtin-diff.c
AgeCommit message (Collapse)AuthorFilesLines
2014-11-18perf diff: Use internal rb tree for hists__precomputeJiri Olsa1-3/+10
There's missing change for hists__precompute to iterate either entries_collapsed or entries_in tree. The change was initiated for hists_compute_resort function in commit: 66f97ed perf diff: Use internal rb tree for compute resort but was missing for hists__precompute function changes. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1355404152-16523-2-git-send-email-jolsa@redhat.com [ committer note: Reduce patch size, no functional change ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-11-18perf report: Add --percent-limit optionNamhyung Kim1-1/+1
The --percent-limit option is for not showing small overhead entries in the output. Maybe we want to set a certain default value like 0.1. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Pekka Enberg <penberg@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1368497347-9628-7-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-11-18perf sort: Consolidate sort_entry__setup_elide()Namhyung Kim1-3/+1
The same code was duplicate to places, factor them out to common sort__setup_elide(). Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1364991979-3008-11-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-04-01perf tools: Add support for weight v7 (modified)Andi Kleen1-3/+4
perf record has a new option -W that enables weightened sampling. Add sorting support in top/report for the average weight per sample and the total weight sum. This allows to both compare relative cost per event and the total cost over the measurement period. Add the necessary glue to perf report, record and the library. v2: Merge with new hist refactoring. v3: Fix manpage. Remove value check. Rename global_weight to weight and weight to local_weight. v4: Readd sort keys to manpage v5: Move weight to end v6: Move weight to template v7: Rename weight key. Original patch from Andi modified by Stephane Eranian <eranian@google.com> to include ONLY the weight supporting code and apply to pristine 3.8.0-rc4. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1359040242-8269-6-git-send-email-eranian@google.com [ committer note: changed to cope with fc5871ed and the hists_link perf test entry ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-02-06perf sort: Make setup_sorting returns an error codeNamhyung Kim1-1/+3
Currently the setup_sorting() is called for parsing sort keys and exits if it failed to add the sort key. As it's included in libperf it'd be better returning an error code rather than exiting application inside of the library. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Suggested-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1360130237-9963-2-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-01-24perf diff: Use internal rb tree for compute resortNamhyung Kim1-10/+21
There's no reason to run hists_compute_resort() using output tree. Convert it to use internal tree so that it can remove unnecessary _output_resort. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1355128197-18193-4-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-01-24perf hists: Link hist entries before inserting to an output treeNamhyung Kim1-48/+17
For matching and/or linking hist entries, they need to be sorted by given sort keys. However current hists__match/link did this on the output trees, so that the entries in the output tree need to be resort before doing it. This looks not so good since we have trees for collecting or collapsing entries before passing them to an output tree and they're already sorted by the given sort keys. Since we don't need to print anything at the time of matching/linking, we can use these internal trees directly instead of bothering with double resort on the output tree. Its only user - at the time of this writing - perf diff can be easily converted to use the internal tree and can save some lines too by getting rid of unnecessary resorting codes. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1355128197-18193-3-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-01-24perf hists: Exchange order of comparing items when collapsing histsNamhyung Kim1-1/+1
When comparing entries for collapsing put the given entry first, and then the iterated entry. This is not the case of hist_entry__cmp() when called if given sort keys don't require collapsing. So change the order for the sake of consistency. It will be required for matching and/or linking multiple hist entries. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1355128197-18193-2-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-12-09perf diff: Remove displacement output optionJiri Olsa1-22/+7
It seems not very useful, because it's possible and event more convenient to lookup related symbol by name. Also the output value for both 'baseline' and 'new' data is quite apparent from diff output. And above all it complicates hist code factoring ;) Ditching out PERF_HPP__DISPL column with related output functions. Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Ingo Molnar <mingo@kernel.org> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20121206132228.GB1080@krava.brq.redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-12-09perf diff: Change formula methods to work with pair directlyJiri Olsa1-22/+13
Changing formula methods to operate over hist entry and its pair directly. This makes the code more obvious and readable, instead of all time checking for pair being != NULL. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1354110769-2998-7-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-12-09perf diff: Change compute methods to work with pair directlyJiri Olsa1-21/+17
Changing compute methods to operate over hist entry and its pair directly. This makes the code more obvious and readable, instead of all time checking for pair being != NULL. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1354110769-2998-6-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-12-09perf hists: Introduce perf_hpp__list for period related columnsJiri Olsa1-13/+8
Adding perf_hpp__list list to register and contain all period related columns the command is interested in. This way we get rid of static array holding all possible columns and enable commands to register their own columns. It'll be handy for diff command in future to process and display data for multiple files. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/n/tip-kiykge4igrcl7etmpmveto1h@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-11-08perf diff: Use hists__link when not pairing just with baselineArnaldo Carvalho de Melo1-0/+2
Previously there were blind spots because we were not looking at symbols that didn't ocurred in the latest run: # perf record usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.018 MB perf.data (~801 samples) ] # perf record usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.018 MB perf.data (~801 samples) ] Before: # perf diff # Event 'cycles' # # Baseline Delta Shared Object Symbol # ........ ....... ................. ............................. # +10.38% [kernel.kallsyms] [k] get_empty_filp +9.51% [kernel.kallsyms] [k] update_sd_lb_stats +9.41% libpopt.so.0.0.0 [.] _init +9.29% [kernel.kallsyms] [k] vma_interval_tree_insert 9.05% +0.12% [kernel.kallsyms] [k] do_sys_open +9.14% [kernel.kallsyms] [k] kfree +8.98% [kernel.kallsyms] [k] free_pages_and_swap_cache +8.78% [kernel.kallsyms] [k] unmap_page_range 9.36% -0.90% [kernel.kallsyms] [k] zap_pte_range 7.60% +0.09% [kernel.kallsyms] [k] find_next_bit +4.37% [kernel.kallsyms] [k] place_entity +3.38% [kernel.kallsyms] [k] __do_page_fault +0.80% [kernel.kallsyms] [k] native_apic_mem_write 0.21% +0.43% [kernel.kallsyms] [k] native_write_msr_safe # So 9.05 + 9.36 + 7.60 + 0.21 != 100% Now using the recently introduced hists__link we can see the whole picture: # perf diff # Event 'cycles' # # Baseline Delta Shared Object Symbol # ........ ....... ................. ............................. # 8.44% -8.44% [kernel.kallsyms] [k] _raw_spin_lock 9.05% -9.05% [kernel.kallsyms] [k] sha_transform 10.55% -10.55% [kernel.kallsyms] [k] __d_lookup_rcu +10.38% [kernel.kallsyms] [k] get_empty_filp 17.70% -17.70% [kernel.kallsyms] [k] kmem_cache_free +9.51% [kernel.kallsyms] [k] update_sd_lb_stats +9.41% libpopt.so.0.0.0 [.] _init +9.29% [kernel.kallsyms] [k] vma_interval_tree_insert 9.05% +0.12% [kernel.kallsyms] [k] do_sys_open +9.14% [kernel.kallsyms] [k] kfree +8.98% [kernel.kallsyms] [k] free_pages_and_swap_cache +8.78% [kernel.kallsyms] [k] unmap_page_range 9.36% -0.90% [kernel.kallsyms] [k] zap_pte_range 7.60% +0.09% [kernel.kallsyms] [k] find_next_bit +4.37% [kernel.kallsyms] [k] place_entity +3.38% [kernel.kallsyms] [k] __do_page_fault 4.01% -4.01% [kernel.kallsyms] [k] handle_pte_fault 9.27% -9.27% [kernel.kallsyms] [k] find_get_page 0.78% -0.78% [kernel.kallsyms] [k] rcu_irq_enter 0.57% -0.57% [kernel.kallsyms] [k] finish_task_switch 4.25% -4.25% [kernel.kallsyms] [k] run_timer_softirq +0.80% [kernel.kallsyms] [k] native_apic_mem_write 0.21% +0.43% [kernel.kallsyms] [k] native_write_msr_safe 9.16% -9.16% ld-2.12.so [.] close # Now: 8.44 + 9.05 + 10.55 + 17.70 + 9.05 + 9.36 + 7.60 + 4.01 + 9.27 + 0.78 + 0.57 + 4.25 + 0.21 + 9.16 == 100% Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-jeq55qdgby1745bs8r9sscdh@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-11-08perf diff: Move hists__match to the hists libArnaldo Carvalho de Melo1-34/+1
Its not 'diff' specific and will be useful for other use cases, like bucketizing multiple events in a single session. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-o35urjgxfxxm70aw1wa81s4w@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-11-08perf diff: Start moving to support matching more than two histsArnaldo Carvalho de Melo1-9/+12
We want to match more than two hists, so that we can match more than two perf.data files and moreover, match hist_entries (buckets) in multiple events in a group. So the "baseline"/"leader" will instead of a ->pair pointer, use a list_head, that will link to the pairs and hists__match use it. Following that perf_evlist__link will link the hists in its evsel groups. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-2kbmzepoi544ygj9godseqpv@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-10-06perf event: No need to create a thread when handling PERF_RECORD_EXITArnaldo Carvalho de Melo1-2/+2
When we were processing a PERF_RECORD_EXIT event we first used machine__findnew_thread for both the thread exiting and for its parent, only to use just the thread struct associated with the one exiting, and to just delete it. If it existed, i.e. not created at this very moment in machine__findnew_thread, it will be moved to the machine->dead_threads linked list, because we may have hist_entries pointing to it, but if it was created just do be deleted, it will just sit there with no references at all. Use the new machine__find_thread() method so that if it is not there, we don't create it. As a bonus the parent thread will also not be created at this point. Create process_fork() and process_exit() helpers to use this and make the builtins use it instead of the generic process_task(), ditched by this patch. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-z7n2y98ebjyrvmytaope4vdl@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-10-05perf diff: Include samples without symbol in overall statsJiri Olsa1-1/+1
Currently we omit samples without symbols. This way we get different and confusing numbers for samples than from report command. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1349448287-18919-8-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-10-05perf diff: Add -F option to display formula for computationJiri Olsa1-1/+66
Adding -F option to display the formula for specified computation. This is mainly to facilitate debugging, but can be useful anyway. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1349448287-18919-7-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-10-05perf diff: Add -p option to display period values for hist entriesJiri Olsa1-1/+9
Adding -p option to show period values for both compared hist entries. Showing hist column PERF_HPP__PERIOD and newly added hist column PERF_HPP__PERIOD_BASELINE. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1349448287-18919-6-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-10-05perf diff: Add weighted diff computation way to compare hist entriesJiri Olsa1-5/+111
Adding 'wdiff' as new computation way to compare hist entries. If specified the 'Weighted diff' column is displayed with value 'd' computed as: d = B->period * WEIGHT-A - A->period * WEIGHT-B - A/B being matching hist entry from first/second file specified (or perf.data/perf.data.old) respectively. - period being the hist entry period value - WEIGHT-A/WEIGHT-B being user suplied weights in the the '-c' option behind ':' separator like '-c wdiff:1,2'. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1349448287-18919-5-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-10-05perf diff: Add option to sort entries based on diff computationJiri Olsa1-0/+137
Adding support to sort hist entries based on the outcome of selected computation. It's now possible to specify '+' as a first character of '-c' option value to make such sort. Example: $ perf diff -c ratio -b # Event 'cache-misses' # # Baseline Ratio Shared Object Symbol # ........ .............. ................. ................................ # 19.64% 0.69 [kernel.kallsyms] [k] clear_page 0.30% 0.17 [kernel.kallsyms] [k] mm_alloc 0.04% 0.20 [kernel.kallsyms] [k] kmem_cache_alloc $ perf diff -c +ratio -b # Event 'cache-misses' # # Baseline Ratio Shared Object Symbol # ........ .............. ................. ................................ # 19.64% 0.69 [kernel.kallsyms] [k] clear_page 0.04% 0.20 [kernel.kallsyms] [k] kmem_cache_alloc 0.30% 0.17 [kernel.kallsyms] [k] mm_alloc Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1349448287-18919-4-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-10-05perf diff: Add ratio computation way to compare hist entriesJiri Olsa1-2/+50
Adding -c option to select computation method with the current 'Delta' computation as default. Current possible values are of this option are: 'delta' and 'ratio'. Adding 'ratio' as new computation way to compare hist entries. If specified the 'Ratio' column is displayed with value 'r' computed as: r = A->period / B->period with: - A/B being matching hist entry from first/second file specified (or perf.data/perf.data.old) respectively. - period being the hist entry period value Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1349448287-18919-3-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-10-05perf diff: Add -b option for perf diff to display paired entries onlyJiri Olsa1-2/+29
Adding -b option to perf diff command to display only entries with match in the baseline. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1349448287-18919-2-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-10-04perf tool: Add hpp interface to enable/disable hpp columnJiri Olsa1-1/+17
Adding perf_hpp__column_enable function to enable/disable hists column and removing diff command specific stuff 'need_pair and show_displacement' from hpp code. The diff command now enables/disables columns separately according to the user arguments. This will be helpful in future patches where more columns are added into diff output. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Ingo Molnar <mingo@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1349354994-17853-6-git-send-email-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-10-04perf tools: Removing hists pair argument from output pathJiri Olsa1-2/+1
The hists pointer is now part of the 'struct hist_entry'. And since the overhead and baseline columns are split now, there's no reason to pass it through the output path. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1349354994-17853-5-git-send-email-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-10-04perf diff: Refactor diff displacement possition infoJiri Olsa1-17/+32
Moving the position calculation into the diff command, so the position as prepared inside struct hist_entry data and there's no need to compute in the output display path. Removing 'displacement' from struct perf_hpp as it is no longer needed. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1349354994-17853-3-git-send-email-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-09-11perf tools: Use __maybe_used for unused variablesIrina Tirdea1-2/+2
perf defines both __used and __unused variables to use for marking unused variables. The variable __used is defined to __attribute__((__unused__)), which contradicts the kernel definition to __attribute__((__used__)) for new gcc versions. On Android, __used is also defined in system headers and this leads to warnings like: warning: '__used__' attribute ignored __unused is not defined in the kernel and is not a standard definition. If __unused is included everywhere instead of __used, this leads to conflicts with glibc headers, since glibc has a variables with this name in its headers. The best approach is to use __maybe_unused, the definition used in the kernel for __attribute__((unused)). In this way there is only one definition in perf sources (instead of 2 definitions that point to the same thing: __used and __unused) and it works on both Linux and Android. This patch simply replaces all instances of __used and __unused with __maybe_unused. Signed-off-by: Irina Tirdea <irina.tirdea@intel.com> Acked-by: Pekka Enberg <penberg@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1347315303-29906-7-git-send-email-irina.tirdea@intel.com [ committer note: fixed up conflict with a116e05 in builtin-sched.c ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-09-08perf hists: Introduce perf_hpp for hist period printingNamhyung Kim1-0/+1
Current hist print functions are messy because it has to consider many of command line options and the code doing that is scattered around to places. So when someone wants to add an option to manipulate the hist output it'd very easy to miss to update all of them in sync. And things getting worse as more options/features are added continuously. So I'd like to refactor them using hpp formats and move common code to ui/hist.c in order to make it easy to maintain and to add new features. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1346640790-17197-2-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-09-07perf diff: Make diff command work with evsel histsJiri Olsa1-32/+61
Putting 'perf diff' command back on track with the 'latest' evsel hists changes. Each evsel has its own 'hists' object gathering stats for the particular event. While currently counts are accumulated for the whole session regardless of the events diversification within compared sessions. The 'perf diff' command now outputs all matching events within compared sessions (with event name specified). The per event diff output stays the same. $ ./perf diff # Event 'cycles' # # Baseline Delta Shared Object Symbol # ........ .......... ................. .............................. # 0.00% +15.14% [kernel.kallsyms] [k] __wake_up 0.00% +13.38% [kernel.kallsyms] [k] ext4fs_dirhash ... SNIP 0.00% +0.42% [kernel.kallsyms] [k] local_clock 0.17% -0.05% [kernel.kallsyms] [k] native_write_msr_safe # Event 'faults' # # Baseline Delta Shared Object Symbol # ........ .......... ................. .............................. # 0.00% +79.12% ld-2.15.so [.] _dl_relocate_object 0.00% +11.62% ld-2.15.so [.] openaux Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1346946426-13496-2-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-03-22perf diff: Fix to work with new hists designJiri Olsa1-25/+35
The perf diff command is broken since: perf hists: Threaded addition and sorting of entries commit 1980c2ebd7020d82c024b8c4046849b38e78e7da Several places were broken: - hists data need to be collected into opened sessions instead of into events - session's hists data need to be initialized properly when the session is created - hist_entry__pcnt_snprintf: the percentage and displacement buffer preparation must not use 'ret' because it's used as a pointer to the final buffer Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20120322133726.GB1601@m.brq.redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-11-28perf tools: Rename perf_event_ops to perf_toolArnaldo Carvalho de Melo1-5/+6
To better reflect that it became the base class for all tools, that must be in each tool struct and where common stuff will be put. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-qgpc4msetqlwr8y2k7537cxe@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-11-28perf tools: Resolve machine earlier and pass it to perf_event_opsArnaldo Carvalho de Melo1-4/+5
Reducing the exposure of perf_session further, so that we can use the classes in cases where no perf.data file is created. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-stua66dcscsezzrcdugvbmvd@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-11-28perf tools: Pass tool context in the the perf_event_ops functionsArnaldo Carvalho de Melo1-1/+2
So that we don't need to have that many globals. Next steps will remove the 'session' pointer, that in most cases is not needed. Then we can rename perf_event_ops to 'perf_tool' that better describes this class hierarchy. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-wp4djox7x6w1i2bab1pt4xxp@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-10-07perf hists: Allow limiting the number of rows and columns in fprintfArnaldo Carvalho de Melo1-1/+1
So that we can reuse hists__fprintf for in the new perf top tool. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-huazw48x05h8r9niz5cf63za@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-23perf session: Pass evsel in event_ops->sample()Arnaldo Carvalho de Melo1-0/+1
Resolving the sample->id to an evsel since the most advanced tools, report and annotate, and the others will too when they evolve to properly support multi-event perf.data files. Good also because it does an extra validation, checking that the ID is valid when present. When that is not the case, the overhead is just a branch + function call (perf_evlist__id2evsel). Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-29perf tools: Kill event_t typedef, use 'union perf_event' insteadArnaldo Carvalho de Melo1-7/+7
And move the event_t methods to the perf_event__ too. No code changes, just namespace consistency. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-29perf tools: Rename 'struct sample_data' to 'struct perf_sample'Arnaldo Carvalho de Melo1-1/+1
Making the namespace more uniform. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-21perf symbols: Add symfs option for off-box analysis using specified treeDavid Ahern1-0/+2
The symfs argument allows analysis of perf.data file using a locally accessible filesystem tree with debug symbols - e.g., tree created during image builds, sshfs mount, loop mounted KVM disk images, USB keys, initrds, etc. Anything with an OS tree can be analyzed from anywhere without the need to populate a local data store with build-ids. Commiter notes: o Fixed up symfs="/" variants handling. o prefixed DSO__ORIG_GUEST_KMODULE case with symfs too, avoiding use of files outside the symfs directory. LKML-Reference: <1291926427-28846-1-git-send-email-daahern@cisco.com> Signed-off-by: David Ahern <daahern@cisco.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-21perf record,report,annotate,diff: Process events in orderIan Munsie1-0/+2
This patch changes perf report to ask for the ID info on all events be default if recording from multiple CPUs. Perf report, annotate and diff will now process the events in order if the kernel is able to provide timestamps on all events. This ensures that events such as COMM and MMAP which are necessary to correctly interpret samples are processed prior to those samples so that they are attributed correctly. Before: # perf record ./cachetest # perf report # Events: 6K cycles # # Overhead Command Shared Object Symbol # ........ ....... ................. ............................... # 74.11% :3259 [unknown] [k] 0x4a6c 1.50% cachetest ld-2.11.2.so [.] 0x1777c 1.46% :3259 [kernel.kallsyms] [k] .perf_event_mmap_ctx 1.25% :3259 [kernel.kallsyms] [k] restore 0.74% :3259 [kernel.kallsyms] [k] ._raw_spin_lock 0.71% :3259 [kernel.kallsyms] [k] .filemap_fault 0.66% :3259 [kernel.kallsyms] [k] .memset 0.54% cachetest [kernel.kallsyms] [k] .sha_transform 0.54% :3259 [kernel.kallsyms] [k] .copy_4K_page 0.54% :3259 [kernel.kallsyms] [k] .find_get_page 0.52% :3259 [kernel.kallsyms] [k] .trace_hardirqs_off 0.50% :3259 [kernel.kallsyms] [k] .__do_fault <SNIP> After: # perf report # Events: 6K cycles # # Overhead Command Shared Object Symbol # ........ ....... ................. ............................... # 44.28% cachetest cachetest [.] sumArrayNaive 22.53% cachetest cachetest [.] sumArrayOptimal 6.59% cachetest ld-2.11.2.so [.] 0x1777c 2.13% cachetest [unknown] [k] 0x340 1.46% cachetest [kernel.kallsyms] [k] .perf_event_mmap_ctx 1.25% cachetest [kernel.kallsyms] [k] restore 0.74% cachetest [kernel.kallsyms] [k] ._raw_spin_lock 0.71% cachetest [kernel.kallsyms] [k] .filemap_fault 0.66% cachetest [kernel.kallsyms] [k] .memset 0.54% cachetest [kernel.kallsyms] [k] .copy_4K_page 0.54% cachetest [kernel.kallsyms] [k] .find_get_page 0.54% cachetest [kernel.kallsyms] [k] .sha_transform 0.52% cachetest [kernel.kallsyms] [k] .trace_hardirqs_off 0.50% cachetest [kernel.kallsyms] [k] .__do_fault <SNIP> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> LKML-Reference: <1291872833-839-1-git-send-email-imunsie@au1.ibm.com> Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-21perf session: Fallback to unordered processing if no sample_id_allIan Munsie1-2/+2
If we are running the new perf on an old kernel without support for sample_id_all, we should fall back to the old unordered processing of events. If we didn't than we would *always* process events without timestamps out of order, whether or not we hit a reordering race. In other words, instead of there being a chance of not attributing samples correctly, we would guarantee that samples would not be attributed. While processing all events without timestamps before events with timestamps may seem like an intuitive solution, it falls down as PERF_RECORD_EXIT events would also be processed before any samples. Even with a workaround for that case, samples before/after an exec would not be attributed correctly. This patch allows commands to indicate whether they need to fall back to unordered processing, so that commands that do not care about timestamps on every event will not be affected. If we do fallback, this will print out a warning if report -D was invoked. This patch adds the test in perf_session__new so that we only need to test once per session. Commands that do not use an event_ops (such as record and top) can simply pass NULL in it's place. Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> LKML-Reference: <1291951882-sup-6069@au1.ibm.com> Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-04perf session: Parse sample earlierArnaldo Carvalho de Melo1-5/+6
At perf_session__process_event, so that we reduce the number of lines in eache tool sample processing routine that now receives a sample_data pointer already parsed. This will also be useful in the next patch, where we'll allow sample the identity fields in MMAP, FORK, EXIT, etc, when it will be possible to see (cpu, timestamp) just after before every event. Also validate callchains in perf_session__process_event, i.e. as early as possible, and keep a counter of the number of events discarded due to invalid callchains, warning the user about it if it happens. There is an assumption that was kept that all events have the same sample_type, that will be dealt with in the future, when this preexisting limitation will be removed. Tested-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Ian Munsie <imunsie@au1.ibm.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Stephane Eranian <eranian@google.com> LKML-Reference: <1291318772-30880-4-git-send-email-acme@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-01perf diff: Fix displacement and modules options short flagShawn Bohrer1-1/+1
The --displacement and --modules options to perf diff both use -m as a short flag. Change --displacement to use -M since other perf commands use -m, --modules. Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1291168642-11402-4-git-send-email-shawn.bohrer@gmail.com> Signed-off-by: Shawn Bohrer <shawn.bohrer@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-07-27perf tools: Remove unneeded code for tracking the cwd in perf sessionsDave Martin1-2/+0
Tidy-up patch to remove some code and struct perf_session data members which are no longer needed due to the previous patch: "perf tools: Don't abbreviate file paths relative to the cwd". LKML-Reference: <new-submission> Signed-off-by: Dave Martin <dave.martin@linaro.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-06-05perf tools: Make event__preprocess_sample parse the sampleArnaldo Carvalho de Melo1-6/+1
Simplifying the tools that were using both in sequence and allowing upcoming simplifications, such as Arun's patch to sort by cpus. Cc: David S. Miller <davem@davemloft.net> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-14perf report: Report number of events, not samplesArnaldo Carvalho de Melo1-3/+3
Number of samples is meaningless after we switched to auto-freq, so report the number of events, i.e. not the sum of the different periods, but the number PERF_RECORD_SAMPLE emitted by the kernel. While doing this I noticed that naming "count" to the sum of all the event periods can be confusing, so rename it to .period, just like in struct sample.data, so that we become more consistent. This helps with the next step, that was to record in struct hist_entry the number of sample events for each instance, we need that because we use it to generate the number of events when applying filters to the tree of hist entries like it is being done in the TUI report browser. Suggested-by: Ingo Molnar <mingo@elte.hu> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-14perf hist: Clarify events_stats fields usageArnaldo Carvalho de Melo1-1/+1
The events_stats.total field is too generic, rename it to .total_period, and also add a comment explaining that it is the sum of all the .period fields in samples, that is needed because we use auto-freq to avoid sampling artifacts. Ditto for events_stats.lost, that is the sum of all lost_event.lost fields, i.e. the number of events the kernel dropped. Looking at the users, builtin-sched.c can make use of these fields and stop doing it again. Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-10perf hist: Introduce hists class and move lots of methods to itArnaldo Carvalho de Melo1-27/+23
In cbbc79a we introduced support for multiple events by introducing a new "event_stat_id" struct and then made several perf_session methods receive a point to it instead of a pointer to perf_session, and kept the event_stats and hists rb_tree in perf_session. While working on the new newt based browser, I realised that it would be better to introduce a new class, "hists" (short for "histograms"), renaming the "event_stat_id" struct and the perf_session methods that were really "hists" methods, as they manipulate only struct hists members, not touching anything in the other perf_session members. Other optimizations, such as calculating the maximum lenght of a symbol name present in an hists instance will be possible as we add them, avoiding a re-traversal just for finding that information. The rationale for the name "hists" to replace "event_stat_id" is that we may have multiple sets of hists for the same event_stat id, as, for instance, the 'perf diff' tool has, so event stat id is not what characterizes what this struct and the functions that manipulate it do. Cc: Eric B Munson <ebmunson@us.ibm.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-09perf hist: Simplify the insertion of new hist_entry instancesArnaldo Carvalho de Melo1-11/+3
And with that fix at least one bug: The first hit for an entry, the one that calls malloc to create a new instance in __perf_session__add_hist_entry, wasn't adding the count to the per cpumode (PERF_RECORD_MISC_USER, etc) total variable. Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-02perf: add perf-inject builtinTom Zanussi1-2/+2
Currently, perf 'live mode' writes build-ids at the end of the session, which isn't actually useful for processing live mode events. What would be better would be to have the build-ids sent before any of the samples that reference them, which can be done by processing the event stream and retrieving the build-ids on the first hit. Doing that in perf-record itself, however, is off-limits. This patch introduces perf-inject, which does the same job while leaving perf-record untouched. Normal mode perf still records the build-ids at the end of the session as it should, but for live mode, perf-inject can be injected in between the record and report steps e.g.: perf record -o - ./hackbench 10 | perf inject -v -b | perf report -v -i - perf-inject reads a perf-record event stream and repipes it to stdout. At any point the processing code can inject other events into the event stream - in this case build-ids (-b option) are read and injected as needed into the event stream. Build-ids are just the first user of perf-inject - potentially anything that needs userspace processing to augment the trace stream with additional information could make use of this facility. Cc: Ingo Molnar <mingo@elte.hu> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frédéric Weisbecker <fweisbec@gmail.com> LKML-Reference: <1272696080-16435-3-git-send-email-tzanussi@gmail.com> Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-19perf: 'perf kvm' tool for monitoring guest performance from hostZhang, Yanmin1-1/+5
Here is the patch of userspace perf tool. Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>