From 97e1c18e8d17bd87e1e383b2e9d9fc740332c8e2 Mon Sep 17 00:00:00 2001 From: Mathieu Desnoyers Date: Fri, 18 Jul 2008 12:16:16 -0400 Subject: tracing: Kernel Tracepoints Implementation of kernel tracepoints. Inspired from the Linux Kernel Markers. Allows complete typing verification by declaring both tracing statement inline functions and probe registration/unregistration static inline functions within the same macro "DEFINE_TRACE". No format string is required. See the tracepoint Documentation and Samples patches for usage examples. Taken from the documentation patch : "A tracepoint placed in code provides a hook to call a function (probe) that you can provide at runtime. A tracepoint can be "on" (a probe is connected to it) or "off" (no probe is attached). When a tracepoint is "off" it has no effect, except for adding a tiny time penalty (checking a condition for a branch) and space penalty (adding a few bytes for the function call at the end of the instrumented function and adds a data structure in a separate section). When a tracepoint is "on", the function you provide is called each time the tracepoint is executed, in the execution context of the caller. When the function provided ends its execution, it returns to the caller (continuing from the tracepoint site). You can put tracepoints at important locations in the code. They are lightweight hooks that can pass an arbitrary number of parameters, which prototypes are described in a tracepoint declaration placed in a header file." Addition and removal of tracepoints is synchronized by RCU using the scheduler (and preempt_disable) as guarantees to find a quiescent state (this is really RCU "classic"). The update side uses rcu_barrier_sched() with call_rcu_sched() and the read/execute side uses "preempt_disable()/preempt_enable()". We make sure the previous array containing probes, which has been scheduled for deletion by the rcu callback, is indeed freed before we proceed to the next update. It therefore limits the rate of modification of a single tracepoint to one update per RCU period. The objective here is to permit fast batch add/removal of probes on _different_ tracepoints. Changelog : - Use #name ":" #proto as string to identify the tracepoint in the tracepoint table. This will make sure not type mismatch happens due to connexion of a probe with the wrong type to a tracepoint declared with the same name in a different header. - Add tracepoint_entry_free_old. - Change __TO_TRACE to get rid of the 'i' iterator. Masami Hiramatsu : Tested on x86-64. Performance impact of a tracepoint : same as markers, except that it adds about 70 bytes of instructions in an unlikely branch of each instrumented function (the for loop, the stack setup and the function call). It currently adds a memory read, a test and a conditional branch at the instrumentation site (in the hot path). Immediate values will eventually change this into a load immediate, test and branch, which removes the memory read which will make the i-cache impact smaller (changing the memory read for a load immediate removes 3-4 bytes per site on x86_32 (depending on mov prefixes), or 7-8 bytes on x86_64, it also saves the d-cache hit). About the performance impact of tracepoints (which is comparable to markers), even without immediate values optimizations, tests done by Hideo Aoki on ia64 show no regression. His test case was using hackbench on a kernel where scheduler instrumentation (about 5 events in code scheduler code) was added. Quoting Hideo Aoki about Markers : I evaluated overhead of kernel marker using linux-2.6-sched-fixes git tree, which includes several markers for LTTng, using an ia64 server. While the immediate trace mark feature isn't implemented on ia64, there is no major performance regression. So, I think that we don't have any issues to propose merging marker point patches into Linus's tree from the viewpoint of performance impact. I prepared two kernels to evaluate. The first one was compiled without CONFIG_MARKERS. The second one was enabled CONFIG_MARKERS. I downloaded the original hackbench from the following URL: http://devresources.linux-foundation.org/craiger/hackbench/src/hackbench.c I ran hackbench 5 times in each condition and calculated the average and difference between the kernels. The parameter of hackbench: every 50 from 50 to 800 The number of CPUs of the server: 2, 4, and 8 Below is the results. As you can see, major performance regression wasn't found in any case. Even if number of processes increases, differences between marker-enabled kernel and marker- disabled kernel doesn't increase. Moreover, if number of CPUs increases, the differences doesn't increase either. Curiously, marker-enabled kernel is better than marker-disabled kernel in more than half cases, although I guess it comes from the difference of memory access pattern. * 2 CPUs Number of | without | with | diff | diff | processes | Marker [Sec] | Marker [Sec] | [Sec] | [%] | -------------------------------------------------------------- 50 | 4.811 | 4.872 | +0.061 | +1.27 | 100 | 9.854 | 10.309 | +0.454 | +4.61 | 150 | 15.602 | 15.040 | -0.562 | -3.6 | 200 | 20.489 | 20.380 | -0.109 | -0.53 | 250 | 25.798 | 25.652 | -0.146 | -0.56 | 300 | 31.260 | 30.797 | -0.463 | -1.48 | 350 | 36.121 | 35.770 | -0.351 | -0.97 | 400 | 42.288 | 42.102 | -0.186 | -0.44 | 450 | 47.778 | 47.253 | -0.526 | -1.1 | 500 | 51.953 | 52.278 | +0.325 | +0.63 | 550 | 58.401 | 57.700 | -0.701 | -1.2 | 600 | 63.334 | 63.222 | -0.112 | -0.18 | 650 | 68.816 | 68.511 | -0.306 | -0.44 | 700 | 74.667 | 74.088 | -0.579 | -0.78 | 750 | 78.612 | 79.582 | +0.970 | +1.23 | 800 | 85.431 | 85.263 | -0.168 | -0.2 | -------------------------------------------------------------- * 4 CPUs Number of | without | with | diff | diff | processes | Marker [Sec] | Marker [Sec] | [Sec] | [%] | -------------------------------------------------------------- 50 | 2.586 | 2.584 | -0.003 | -0.1 | 100 | 5.254 | 5.283 | +0.030 | +0.56 | 150 | 8.012 | 8.074 | +0.061 | +0.76 | 200 | 11.172 | 11.000 | -0.172 | -1.54 | 250 | 13.917 | 14.036 | +0.119 | +0.86 | 300 | 16.905 | 16.543 | -0.362 | -2.14 | 350 | 19.901 | 20.036 | +0.135 | +0.68 | 400 | 22.908 | 23.094 | +0.186 | +0.81 | 450 | 26.273 | 26.101 | -0.172 | -0.66 | 500 | 29.554 | 29.092 | -0.461 | -1.56 | 550 | 32.377 | 32.274 | -0.103 | -0.32 | 600 | 35.855 | 35.322 | -0.533 | -1.49 | 650 | 39.192 | 38.388 | -0.804 | -2.05 | 700 | 41.744 | 41.719 | -0.025 | -0.06 | 750 | 45.016 | 44.496 | -0.520 | -1.16 | 800 | 48.212 | 47.603 | -0.609 | -1.26 | -------------------------------------------------------------- * 8 CPUs Number of | without | with | diff | diff | processes | Marker [Sec] | Marker [Sec] | [Sec] | [%] | -------------------------------------------------------------- 50 | 2.094 | 2.072 | -0.022 | -1.07 | 100 | 4.162 | 4.273 | +0.111 | +2.66 | 150 | 6.485 | 6.540 | +0.055 | +0.84 | 200 | 8.556 | 8.478 | -0.078 | -0.91 | 250 | 10.458 | 10.258 | -0.200 | -1.91 | 300 | 12.425 | 12.750 | +0.325 | +2.62 | 350 | 14.807 | 14.839 | +0.032 | +0.22 | 400 | 16.801 | 16.959 | +0.158 | +0.94 | 450 | 19.478 | 19.009 | -0.470 | -2.41 | 500 | 21.296 | 21.504 | +0.208 | +0.98 | 550 | 23.842 | 23.979 | +0.137 | +0.57 | 600 | 26.309 | 26.111 | -0.198 | -0.75 | 650 | 28.705 | 28.446 | -0.259 | -0.9 | 700 | 31.233 | 31.394 | +0.161 | +0.52 | 750 | 34.064 | 33.720 | -0.344 | -1.01 | 800 | 36.320 | 36.114 | -0.206 | -0.57 | -------------------------------------------------------------- Signed-off-by: Mathieu Desnoyers Acked-by: Masami Hiramatsu Acked-by: 'Peter Zijlstra' Signed-off-by: Ingo Molnar --- include/asm-generic/vmlinux.lds.h | 6 +- include/linux/module.h | 17 +++++ include/linux/tracepoint.h | 127 ++++++++++++++++++++++++++++++++++++++ 3 files changed, 149 insertions(+), 1 deletion(-) create mode 100644 include/linux/tracepoint.h (limited to 'include') diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h index 7440a0dcedd..3d8e472a09c 100644 --- a/include/asm-generic/vmlinux.lds.h +++ b/include/asm-generic/vmlinux.lds.h @@ -52,7 +52,10 @@ . = ALIGN(8); \ VMLINUX_SYMBOL(__start___markers) = .; \ *(__markers) \ - VMLINUX_SYMBOL(__stop___markers) = .; + VMLINUX_SYMBOL(__stop___markers) = .; \ + VMLINUX_SYMBOL(__start___tracepoints) = .; \ + *(__tracepoints) \ + VMLINUX_SYMBOL(__stop___tracepoints) = .; #define RO_DATA(align) \ . = ALIGN((align)); \ @@ -61,6 +64,7 @@ *(.rodata) *(.rodata.*) \ *(__vermagic) /* Kernel version magic */ \ *(__markers_strings) /* Markers: strings */ \ + *(__tracepoints_strings)/* Tracepoints: strings */ \ } \ \ .rodata1 : AT(ADDR(.rodata1) - LOAD_OFFSET) { \ diff --git a/include/linux/module.h b/include/linux/module.h index 68e09557c95..8b611350386 100644 --- a/include/linux/module.h +++ b/include/linux/module.h @@ -16,6 +16,7 @@ #include #include #include +#include #include #include @@ -331,6 +332,10 @@ struct module struct marker *markers; unsigned int num_markers; #endif +#ifdef CONFIG_TRACEPOINTS + struct tracepoint *tracepoints; + unsigned int num_tracepoints; +#endif #ifdef CONFIG_MODULE_UNLOAD /* What modules depend on me? */ @@ -454,6 +459,9 @@ extern void print_modules(void); extern void module_update_markers(void); +extern void module_update_tracepoints(void); +extern int module_get_iter_tracepoints(struct tracepoint_iter *iter); + #else /* !CONFIG_MODULES... */ #define EXPORT_SYMBOL(sym) #define EXPORT_SYMBOL_GPL(sym) @@ -558,6 +566,15 @@ static inline void module_update_markers(void) { } +static inline void module_update_tracepoints(void) +{ +} + +static inline int module_get_iter_tracepoints(struct tracepoint_iter *iter) +{ + return 0; +} + #endif /* CONFIG_MODULES */ struct device_driver; diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h new file mode 100644 index 00000000000..e623a6fca5c --- /dev/null +++ b/include/linux/tracepoint.h @@ -0,0 +1,127 @@ +#ifndef _LINUX_TRACEPOINT_H +#define _LINUX_TRACEPOINT_H + +/* + * Kernel Tracepoint API. + * + * See Documentation/tracepoint.txt. + * + * (C) Copyright 2008 Mathieu Desnoyers + * + * Heavily inspired from the Linux Kernel Markers. + * + * This file is released under the GPLv2. + * See the file COPYING for more details. + */ + +#include +#include + +struct module; +struct tracepoint; + +struct tracepoint { + const char *name; /* Tracepoint name */ + int state; /* State. */ + void **funcs; +} __attribute__((aligned(8))); + + +#define TPPROTO(args...) args +#define TPARGS(args...) args + +#ifdef CONFIG_TRACEPOINTS + +/* + * it_func[0] is never NULL because there is at least one element in the array + * when the array itself is non NULL. + */ +#define __DO_TRACE(tp, proto, args) \ + do { \ + void **it_func; \ + \ + rcu_read_lock_sched(); \ + it_func = rcu_dereference((tp)->funcs); \ + if (it_func) { \ + do { \ + ((void(*)(proto))(*it_func))(args); \ + } while (*(++it_func)); \ + } \ + rcu_read_unlock_sched(); \ + } while (0) + +/* + * Make sure the alignment of the structure in the __tracepoints section will + * not add unwanted padding between the beginning of the section and the + * structure. Force alignment to the same alignment as the section start. + */ +#define DEFINE_TRACE(name, proto, args) \ + static inline void trace_##name(proto) \ + { \ + static const char __tpstrtab_##name[] \ + __attribute__((section("__tracepoints_strings"))) \ + = #name ":" #proto; \ + static struct tracepoint __tracepoint_##name \ + __attribute__((section("__tracepoints"), aligned(8))) = \ + { __tpstrtab_##name, 0, NULL }; \ + if (unlikely(__tracepoint_##name.state)) \ + __DO_TRACE(&__tracepoint_##name, \ + TPPROTO(proto), TPARGS(args)); \ + } \ + static inline int register_trace_##name(void (*probe)(proto)) \ + { \ + return tracepoint_probe_register(#name ":" #proto, \ + (void *)probe); \ + } \ + static inline void unregister_trace_##name(void (*probe)(proto))\ + { \ + tracepoint_probe_unregister(#name ":" #proto, \ + (void *)probe); \ + } + +extern void tracepoint_update_probe_range(struct tracepoint *begin, + struct tracepoint *end); + +#else /* !CONFIG_TRACEPOINTS */ +#define DEFINE_TRACE(name, proto, args) \ + static inline void _do_trace_##name(struct tracepoint *tp, proto) \ + { } \ + static inline void trace_##name(proto) \ + { } \ + static inline int register_trace_##name(void (*probe)(proto)) \ + { \ + return -ENOSYS; \ + } \ + static inline void unregister_trace_##name(void (*probe)(proto))\ + { } + +static inline void tracepoint_update_probe_range(struct tracepoint *begin, + struct tracepoint *end) +{ } +#endif /* CONFIG_TRACEPOINTS */ + +/* + * Connect a probe to a tracepoint. + * Internal API, should not be used directly. + */ +extern int tracepoint_probe_register(const char *name, void *probe); + +/* + * Disconnect a probe from a tracepoint. + * Internal API, should not be used directly. + */ +extern int tracepoint_probe_unregister(const char *name, void *probe); + +struct tracepoint_iter { + struct module *module; + struct tracepoint *tracepoint; +}; + +extern void tracepoint_iter_start(struct tracepoint_iter *iter); +extern void tracepoint_iter_next(struct tracepoint_iter *iter); +extern void tracepoint_iter_stop(struct tracepoint_iter *iter); +extern void tracepoint_iter_reset(struct tracepoint_iter *iter); +extern int tracepoint_get_iter_range(struct tracepoint **tracepoint, + struct tracepoint *begin, struct tracepoint *end); + +#endif -- cgit v1.2.3 From 0a16b6075843325dc402edf80c1662838b929aff Mon Sep 17 00:00:00 2001 From: Mathieu Desnoyers Date: Fri, 18 Jul 2008 12:16:17 -0400 Subject: tracing, sched: LTTng instrumentation - scheduler Instrument the scheduler activity (sched_switch, migration, wakeups, wait for a task, signal delivery) and process/thread creation/destruction (fork, exit, kthread stop). Actually, kthread creation is not instrumented in this patch because it is architecture dependent. It allows to connect tracers such as ftrace which detects scheduling latencies, good/bad scheduler decisions. Tools like LTTng can export this scheduler information along with instrumentation of the rest of the kernel activity to perform post-mortem analysis on the scheduler activity. About the performance impact of tracepoints (which is comparable to markers), even without immediate values optimizations, tests done by Hideo Aoki on ia64 show no regression. His test case was using hackbench on a kernel where scheduler instrumentation (about 5 events in code scheduler code) was added. See the "Tracepoints" patch header for performance result detail. Changelog : - Change instrumentation location and parameter to match ftrace instrumentation, previously done with kernel markers. [ mingo@elte.hu: conflict resolutions ] Signed-off-by: Mathieu Desnoyers Acked-by: 'Peter Zijlstra' Signed-off-by: Ingo Molnar --- include/trace/sched.h | 45 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 45 insertions(+) create mode 100644 include/trace/sched.h (limited to 'include') diff --git a/include/trace/sched.h b/include/trace/sched.h new file mode 100644 index 00000000000..506ae132365 --- /dev/null +++ b/include/trace/sched.h @@ -0,0 +1,45 @@ +#ifndef _TRACE_SCHED_H +#define _TRACE_SCHED_H + +#include +#include + +DEFINE_TRACE(sched_kthread_stop, + TPPROTO(struct task_struct *t), + TPARGS(t)); +DEFINE_TRACE(sched_kthread_stop_ret, + TPPROTO(int ret), + TPARGS(ret)); +DEFINE_TRACE(sched_wait_task, + TPPROTO(struct rq *rq, struct task_struct *p), + TPARGS(rq, p)); +DEFINE_TRACE(sched_wakeup, + TPPROTO(struct rq *rq, struct task_struct *p), + TPARGS(rq, p)); +DEFINE_TRACE(sched_wakeup_new, + TPPROTO(struct rq *rq, struct task_struct *p), + TPARGS(rq, p)); +DEFINE_TRACE(sched_switch, + TPPROTO(struct rq *rq, struct task_struct *prev, + struct task_struct *next), + TPARGS(rq, prev, next)); +DEFINE_TRACE(sched_migrate_task, + TPPROTO(struct rq *rq, struct task_struct *p, int dest_cpu), + TPARGS(rq, p, dest_cpu)); +DEFINE_TRACE(sched_process_free, + TPPROTO(struct task_struct *p), + TPARGS(p)); +DEFINE_TRACE(sched_process_exit, + TPPROTO(struct task_struct *p), + TPARGS(p)); +DEFINE_TRACE(sched_process_wait, + TPPROTO(struct pid *pid), + TPARGS(pid)); +DEFINE_TRACE(sched_process_fork, + TPPROTO(struct task_struct *parent, struct task_struct *child), + TPARGS(parent, child)); +DEFINE_TRACE(sched_signal_send, + TPPROTO(int sig, struct task_struct *p), + TPARGS(sig, p)); + +#endif -- cgit v1.2.3 From cf569a932217b97e2fc2c48aa597fe29519a0cff Mon Sep 17 00:00:00 2001 From: Ingo Molnar Date: Wed, 23 Jul 2008 13:48:22 +0200 Subject: sched: clean up tracepoints make it a bit more structured hence more readable. Signed-off-by: Ingo Molnar --- include/trace/sched.h | 35 +++++++++++++++++++++++------------ 1 file changed, 23 insertions(+), 12 deletions(-) (limited to 'include') diff --git a/include/trace/sched.h b/include/trace/sched.h index 506ae132365..ad47369d01b 100644 --- a/include/trace/sched.h +++ b/include/trace/sched.h @@ -6,40 +6,51 @@ DEFINE_TRACE(sched_kthread_stop, TPPROTO(struct task_struct *t), - TPARGS(t)); + TPARGS(t)); + DEFINE_TRACE(sched_kthread_stop_ret, TPPROTO(int ret), - TPARGS(ret)); + TPARGS(ret)); + DEFINE_TRACE(sched_wait_task, TPPROTO(struct rq *rq, struct task_struct *p), - TPARGS(rq, p)); + TPARGS(rq, p)); + DEFINE_TRACE(sched_wakeup, TPPROTO(struct rq *rq, struct task_struct *p), - TPARGS(rq, p)); + TPARGS(rq, p)); + DEFINE_TRACE(sched_wakeup_new, TPPROTO(struct rq *rq, struct task_struct *p), - TPARGS(rq, p)); + TPARGS(rq, p)); + DEFINE_TRACE(sched_switch, TPPROTO(struct rq *rq, struct task_struct *prev, struct task_struct *next), - TPARGS(rq, prev, next)); + TPARGS(rq, prev, next)); + DEFINE_TRACE(sched_migrate_task, TPPROTO(struct rq *rq, struct task_struct *p, int dest_cpu), - TPARGS(rq, p, dest_cpu)); + TPARGS(rq, p, dest_cpu)); + DEFINE_TRACE(sched_process_free, TPPROTO(struct task_struct *p), - TPARGS(p)); + TPARGS(p)); + DEFINE_TRACE(sched_process_exit, TPPROTO(struct task_struct *p), - TPARGS(p)); + TPARGS(p)); + DEFINE_TRACE(sched_process_wait, TPPROTO(struct pid *pid), - TPARGS(pid)); + TPARGS(pid)); + DEFINE_TRACE(sched_process_fork, TPPROTO(struct task_struct *parent, struct task_struct *child), - TPARGS(parent, child)); + TPARGS(parent, child)); + DEFINE_TRACE(sched_signal_send, TPPROTO(int sig, struct task_struct *p), - TPARGS(sig, p)); + TPARGS(sig, p)); #endif -- cgit v1.2.3 From 36dcd67ae994fece615b7c700958d215e884b9ae Mon Sep 17 00:00:00 2001 From: Ingo Molnar Date: Tue, 29 Jul 2008 12:00:59 +0200 Subject: ftrace: ignore functions that cannot be kprobe-ed kprobes already has an extensive list of annotations for functions that should not be instrumented. Add notrace annotations to these functions as well. This is particularly useful for functions called by the NMI path. Signed-off-by: Ingo Molnar --- include/linux/kprobes.h | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) (limited to 'include') diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h index 0be7795655f..497b1d1f7a0 100644 --- a/include/linux/kprobes.h +++ b/include/linux/kprobes.h @@ -29,6 +29,7 @@ * and Prasanna S Panchamukhi * added function-return probes. */ +#include #include #include #include @@ -47,7 +48,7 @@ #define KPROBE_HIT_SSDONE 0x00000008 /* Attach to insert probes on any functions which should be ignored*/ -#define __kprobes __attribute__((__section__(".kprobes.text"))) +#define __kprobes __attribute__((__section__(".kprobes.text"))) notrace struct kprobe; struct pt_regs; @@ -256,7 +257,7 @@ void recycle_rp_inst(struct kretprobe_instance *ri, struct hlist_head *head); #else /* CONFIG_KPROBES */ -#define __kprobes /**/ +#define __kprobes notrace struct jprobe; struct kretprobe; -- cgit v1.2.3 From 8da3821ba5634497da63d58a69e24a97697c4a2b Mon Sep 17 00:00:00 2001 From: Steven Rostedt Date: Thu, 14 Aug 2008 15:45:07 -0400 Subject: ftrace: create __mcount_loc section This patch creates a section in the kernel called "__mcount_loc". This will hold a list of pointers to the mcount relocation for each call site of mcount. For example: objdump -dr init/main.o [...] Disassembly of section .text: 0000000000000000 : 0: 55 push %rbp [...] 000000000000017b : 17b: 55 push %rbp 17c: 48 89 e5 mov %rsp,%rbp 17f: 53 push %rbx 180: 48 83 ec 08 sub $0x8,%rsp 184: e8 00 00 00 00 callq 189 185: R_X86_64_PC32 mcount+0xfffffffffffffffc [...] We will add a section to point to each function call. .section __mcount_loc,"a",@progbits [...] .quad .text + 0x185 [...] The offset to of the mcount call site in init_post is an offset from the start of the section, and not the start of the function init_post. The mcount relocation is at the call site 0x185 from the start of the .text section. .text + 0x185 == init_post + 0xa We need a way to add this __mcount_loc section in a way that we do not lose the relocations after final link. The .text section here will be attached to all other .text sections after final link and the offsets will be meaningless. We need to keep track of where these .text sections are. To do this, we use the start of the first function in the section. do_one_initcall. We can make a tmp.s file with this function as a reference to the start of the .text section. .section __mcount_loc,"a",@progbits [...] .quad do_one_initcall + 0x185 [...] Then we can compile the tmp.s into a tmp.o gcc -c tmp.s -o tmp.o And link it into back into main.o. ld -r main.o tmp.o -o tmp_main.o mv tmp_main.o main.o But we have a problem. What happens if the first function in a section is not exported, and is a static function. The linker will not let the tmp.o use it. This case exists in main.o as well. Disassembly of section .init.text: 0000000000000000 : 0: 55 push %rbp 1: 48 89 e5 mov %rsp,%rbp 4: e8 00 00 00 00 callq 9 5: R_X86_64_PC32 mcount+0xfffffffffffffffc The first function in .init.text is a static function. 00000000000000a8 t __setup_set_reset_devices 000000000000105f t __setup_str_set_reset_devices 0000000000000000 t set_reset_devices The lowercase 't' means that set_reset_devices is local and is not exported. If we simply try to link the tmp.o with the set_reset_devices we end up with two symbols: one local and one global. .section __mcount_loc,"a",@progbits .quad set_reset_devices + 0x10 00000000000000a8 t __setup_set_reset_devices 000000000000105f t __setup_str_set_reset_devices 0000000000000000 t set_reset_devices U set_reset_devices We still have an undefined reference to set_reset_devices, and if we try to compile the kernel, we will end up with an undefined reference to set_reset_devices, or even worst, it could be exported someplace else, and then we will have a reference to the wrong location. To handle this case, we make an intermediate step using objcopy. We convert set_reset_devices into a global exported symbol before linking it with tmp.o and set it back afterwards. 00000000000000a8 t __setup_set_reset_devices 000000000000105f t __setup_str_set_reset_devices 0000000000000000 T set_reset_devices 00000000000000a8 t __setup_set_reset_devices 000000000000105f t __setup_str_set_reset_devices 0000000000000000 T set_reset_devices 00000000000000a8 t __setup_set_reset_devices 000000000000105f t __setup_str_set_reset_devices 0000000000000000 t set_reset_devices Now we have a section in main.o called __mcount_loc that we can place somewhere in the kernel using vmlinux.ld.S and access it to convert all these locations that call mcount into nops before starting SMP and thus, eliminating the need to do this with kstop_machine. Note, A well documented perl script (scripts/recordmcount.pl) is used to do all this in one location. Signed-off-by: Steven Rostedt Signed-off-by: Ingo Molnar --- include/asm-generic/vmlinux.lds.h | 8 ++++++++ 1 file changed, 8 insertions(+) (limited to 'include') diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h index 3d8e472a09c..838d9b2a0da 100644 --- a/include/asm-generic/vmlinux.lds.h +++ b/include/asm-generic/vmlinux.lds.h @@ -37,6 +37,13 @@ #define MEM_DISCARD(sec) *(.mem##sec) #endif +#ifdef CONFIG_FTRACE_MCOUNT_RECORD +#define MCOUNT_REC() VMLINUX_SYMBOL(__start_mcount_loc) = .; \ + *(__mcount_loc) \ + VMLINUX_SYMBOL(__stop_mcount_loc) = .; +#else +#define MCOUNT_REC() +#endif /* .data section */ #define DATA_DATA \ @@ -192,6 +199,7 @@ /* __*init sections */ \ __init_rodata : AT(ADDR(__init_rodata) - LOAD_OFFSET) { \ *(.ref.rodata) \ + MCOUNT_REC() \ DEV_KEEP(init.rodata) \ DEV_KEEP(exit.rodata) \ CPU_KEEP(init.rodata) \ -- cgit v1.2.3 From 68bf21aa15c85d2e9b623dcda2b1ed8893275fa1 Mon Sep 17 00:00:00 2001 From: Steven Rostedt Date: Thu, 14 Aug 2008 15:45:08 -0400 Subject: ftrace: mcount call site on boot nops core This is the infrastructure to the converting the mcount call sites recorded by the __mcount_loc section into nops on boot. It also allows for using these sites to enable tracing as normal. When the __mcount_loc section is used, the "ftraced" kernel thread is disabled. This uses the current infrastructure to record the mcount call sites as well as convert them to nops. The mcount function is kept as a stub on boot up and not converted to the ftrace_record_ip function. We use the ftrace_record_ip to only record from the table. This patch does not handle modules. That comes with a later patch. Signed-off-by: Steven Rostedt Signed-off-by: Ingo Molnar --- include/asm-x86/ftrace.h | 10 ++++++++++ include/linux/ftrace.h | 6 ++++++ 2 files changed, 16 insertions(+) (limited to 'include') diff --git a/include/asm-x86/ftrace.h b/include/asm-x86/ftrace.h index be0e004ad14..1bb6f9bbe1a 100644 --- a/include/asm-x86/ftrace.h +++ b/include/asm-x86/ftrace.h @@ -7,6 +7,16 @@ #ifndef __ASSEMBLY__ extern void mcount(void); + +static inline unsigned long ftrace_call_adjust(unsigned long addr) +{ + /* + * call mcount is "e8 <4 byte offset>" + * The addr points to the 4 byte offset and the caller of this + * function wants the pointer to e8. Simply subtract one. + */ + return addr - 1; +} #endif #endif /* CONFIG_FTRACE */ diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h index bb384068272..d4d6ab453b7 100644 --- a/include/linux/ftrace.h +++ b/include/linux/ftrace.h @@ -162,4 +162,10 @@ static inline void ftrace_special(unsigned long arg1, unsigned long arg2, unsigned long arg3) { } #endif +#ifdef CONFIG_FTRACE_MCOUNT_RECORD +extern void ftrace_init(void); +#else +static inline void ftrace_init(void) { } +#endif + #endif /* _LINUX_FTRACE_H */ -- cgit v1.2.3 From 90d595fe5ca4b685465c068907e6e554760abea8 Mon Sep 17 00:00:00 2001 From: Steven Rostedt Date: Thu, 14 Aug 2008 15:45:09 -0400 Subject: ftrace: enable mcount recording for modules This patch enables the loading of the __mcount_section of modules and changing all the callers of mcount into nops. The modification is done before the init_module function is called, so again, we do not need to use kstop_machine to make these changes. Signed-off-by: Steven Rostedt Signed-off-by: Ingo Molnar --- include/linux/ftrace.h | 3 +++ 1 file changed, 3 insertions(+) (limited to 'include') diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h index d4d6ab453b7..4936489f9ed 100644 --- a/include/linux/ftrace.h +++ b/include/linux/ftrace.h @@ -164,8 +164,11 @@ ftrace_special(unsigned long arg1, unsigned long arg2, unsigned long arg3) { } #ifdef CONFIG_FTRACE_MCOUNT_RECORD extern void ftrace_init(void); +extern void ftrace_init_module(unsigned long *start, unsigned long *end); #else static inline void ftrace_init(void) { } +static inline void +ftrace_init_module(unsigned long *start, unsigned long *end) { } #endif #endif /* _LINUX_FTRACE_H */ -- cgit v1.2.3 From 29e71abf56cebc5c5a4e184a6eb4360cc58554ad Mon Sep 17 00:00:00 2001 From: Steven Rostedt Date: Thu, 14 Aug 2008 15:45:10 -0400 Subject: ftrace: rebuild everything on change to FTRACE_MCOUNT_RECORD When enabling or disabling CONFIG_FTRACE_MCOUNT_RECORD, we want a full kernel compile to handle the adding of the __mcount_loc sections. Signed-off-by: Steven Rostedt Signed-off-by: Ingo Molnar --- include/linux/kernel.h | 5 +++++ 1 file changed, 5 insertions(+) (limited to 'include') diff --git a/include/linux/kernel.h b/include/linux/kernel.h index 75d81f157d2..ecce4a4ccd5 100644 --- a/include/linux/kernel.h +++ b/include/linux/kernel.h @@ -486,4 +486,9 @@ struct sysinfo { #define NUMA_BUILD 0 #endif +/* Rebuild everything on CONFIG_FTRACE_MCOUNT_RECORD */ +#ifdef CONFIG_FTRACE_MCOUNT_RECORD +# define REBUILD_DUE_TO_FTRACE_MCOUNT_RECORD +#endif + #endif -- cgit v1.2.3 From 28614889bcb2558a47d02d52394b7fd9795a9547 Mon Sep 17 00:00:00 2001 From: Steven Rostedt Date: Thu, 14 Aug 2008 22:47:18 -0400 Subject: ftrace: move notrace to compiler.h The notrace define belongs in compiler.h so that it can be used in init.h Signed-off-by: Steven Rostedt Signed-off-by: Ingo Molnar --- include/linux/compiler.h | 2 ++ include/linux/linkage.h | 2 -- 2 files changed, 2 insertions(+), 2 deletions(-) (limited to 'include') diff --git a/include/linux/compiler.h b/include/linux/compiler.h index 8322141ee48..98115d9d04d 100644 --- a/include/linux/compiler.h +++ b/include/linux/compiler.h @@ -44,6 +44,8 @@ extern void __chk_io_ptr(const volatile void __iomem *); # error Sorry, your compiler is too old/not recognized. #endif +#define notrace __attribute__((no_instrument_function)) + /* Intel compiler defines __GNUC__. So we will overwrite implementations * coming from above header files here */ diff --git a/include/linux/linkage.h b/include/linux/linkage.h index 56ba3739465..9fd1f859021 100644 --- a/include/linux/linkage.h +++ b/include/linux/linkage.h @@ -4,8 +4,6 @@ #include #include -#define notrace __attribute__((no_instrument_function)) - #ifdef __cplusplus #define CPP_ASMLINKAGE extern "C" #else -- cgit v1.2.3 From fed1939c64d2288938fdc1c367d49082da65e195 Mon Sep 17 00:00:00 2001 From: Steven Rostedt Date: Thu, 14 Aug 2008 22:47:19 -0400 Subject: ftrace: remove old pointers to mcount When a mcount pointer is recorded into a table, it is used to add or remove calls to mcount (replacing them with nops). If the code is removed via removing a module, the pointers still exist. At modifying the code a check is always made to make sure the code being replaced is the code expected. In-other-words, the code being replaced is compared to what it is expected to be before being replaced. There is a very small chance that the code being replaced just happens to look like code that calls mcount (very small since the call to mcount is relative). To remove this chance, this patch adds ftrace_release to allow module unloading to remove the pointers to mcount within the module. Another change for init calls is made to not trace calls marked with __init. The tracing can not be started until after init is done anyway. Signed-off-by: Steven Rostedt Signed-off-by: Ingo Molnar --- include/linux/ftrace.h | 2 ++ include/linux/init.h | 2 +- 2 files changed, 3 insertions(+), 1 deletion(-) (limited to 'include') diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h index 4936489f9ed..6b232a2460c 100644 --- a/include/linux/ftrace.h +++ b/include/linux/ftrace.h @@ -165,10 +165,12 @@ ftrace_special(unsigned long arg1, unsigned long arg2, unsigned long arg3) { } #ifdef CONFIG_FTRACE_MCOUNT_RECORD extern void ftrace_init(void); extern void ftrace_init_module(unsigned long *start, unsigned long *end); +extern void ftrace_release(void *start, unsigned long size); #else static inline void ftrace_init(void) { } static inline void ftrace_init_module(unsigned long *start, unsigned long *end) { } +static inline void ftrace_release(void *start, unsigned long size) { } #endif #endif /* _LINUX_FTRACE_H */ diff --git a/include/linux/init.h b/include/linux/init.h index 93538b696e3..27f61f6b3cb 100644 --- a/include/linux/init.h +++ b/include/linux/init.h @@ -40,7 +40,7 @@ /* These are for everybody (although not all archs will actually discard it in modules) */ -#define __init __section(.init.text) __cold +#define __init __section(.init.text) __cold notrace #define __initdata __section(.init.data) #define __initconst __section(.init.rodata) #define __exitdata __section(.exit.data) -- cgit v1.2.3 From dd0e545f061f90099a3dcc13aa77e29c6295cf23 Mon Sep 17 00:00:00 2001 From: Steven Rostedt Date: Fri, 1 Aug 2008 12:26:41 -0400 Subject: ftrace: printk formatting infrastructure This patch adds a feature that can help kernel developers debug their code using ftrace. int ftrace_printk(const char *fmt, ...); This records into the ftrace buffer using printf formatting. The entry size in the buffers are still a fixed length. A new type has been added that allows for more entries to be used for a single recording. The start of the print is still the same as the other entries. It returns the number of characters written to the ftrace buffer. For example: Having a module with the following code: static int __init ftrace_print_test(void) { ftrace_printk("jiffies are %ld\n", jiffies); return 0; } Gives me: insmod-5441 3...1 7569us : ftrace_print_test: jiffies are 4296626666 for the latency_trace file and: insmod-5441 [03] 1959.370498: ftrace_print_test jiffies are 4296626666 for the trace file. Note: Only the infrastructure should go into the kernel. It is to help facilitate debugging for other kernel developers. Calls to ftrace_printk is not intended to be left in the kernel, and should be frowned upon just like scattering printks around in the code. But having this easily at your fingertips helps the debugging go faster and bugs be solved quicker. Maybe later on, we can hook this with markers and have their printf format be sucked into ftrace output. Signed-off-by: Steven Rostedt Signed-off-by: Ingo Molnar --- include/linux/ftrace.h | 10 ++++++++++ 1 file changed, 10 insertions(+) (limited to 'include') diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h index 6b232a2460c..f53b975e32f 100644 --- a/include/linux/ftrace.h +++ b/include/linux/ftrace.h @@ -157,9 +157,18 @@ static inline void __ftrace_enabled_restore(int enabled) #ifdef CONFIG_TRACING extern void ftrace_special(unsigned long arg1, unsigned long arg2, unsigned long arg3); +# define ftrace_printk(x...) __ftrace_printk(_THIS_IP_, x) +extern int +__ftrace_printk(unsigned long ip, const char *fmt, ...) + __attribute__ ((format (printf, 2, 3))); #else static inline void ftrace_special(unsigned long arg1, unsigned long arg2, unsigned long arg3) { } +static inline int +ftrace_printk(const char *fmt, ...) __attribute__ ((format (printf, 1, 0))) +{ + return 0; +} #endif #ifdef CONFIG_FTRACE_MCOUNT_RECORD @@ -173,4 +182,5 @@ ftrace_init_module(unsigned long *start, unsigned long *end) { } static inline void ftrace_release(void *start, unsigned long size) { } #endif + #endif /* _LINUX_FTRACE_H */ -- cgit v1.2.3 From 2f2c99dba2398ef7d9c21f7c793180a50e68b1f0 Mon Sep 17 00:00:00 2001 From: Steven Rostedt Date: Fri, 1 Aug 2008 16:45:49 -0400 Subject: ftrace: ftrace_printk doc moved Based on Randy Dunlap's suggestion, the ftrace_printk kernel-doc belongs with the ftrace_printk macro that should be used. Not with the __ftrace_printk internal function. Signed-off-by: Steven Rostedt Acked-by: Randy Dunlap Signed-off-by: Ingo Molnar --- include/linux/ftrace.h | 19 ++++++++++++++++++- 1 file changed, 18 insertions(+), 1 deletion(-) (limited to 'include') diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h index f53b975e32f..018af16bce5 100644 --- a/include/linux/ftrace.h +++ b/include/linux/ftrace.h @@ -157,7 +157,24 @@ static inline void __ftrace_enabled_restore(int enabled) #ifdef CONFIG_TRACING extern void ftrace_special(unsigned long arg1, unsigned long arg2, unsigned long arg3); -# define ftrace_printk(x...) __ftrace_printk(_THIS_IP_, x) + +/** + * ftrace_printk - printf formatting in the ftrace buffer + * @fmt: the printf format for printing + * + * Note: __ftrace_printk is an internal function for ftrace_printk and + * the @ip is passed in via the ftrace_printk macro. + * + * This function allows a kernel developer to debug fast path sections + * that printk is not appropriate for. By scattering in various + * printk like tracing in the code, a developer can quickly see + * where problems are occurring. + * + * This is intended as a debugging tool for the developer only. + * Please refrain from leaving ftrace_printks scattered around in + * your code. + */ +# define ftrace_printk(fmt...) __ftrace_printk(_THIS_IP_, fmt) extern int __ftrace_printk(unsigned long ip, const char *fmt, ...) __attribute__ ((format (printf, 2, 3))); -- cgit v1.2.3 From 3f5a54e371ca20b119b73704f6c01b71295c1714 Mon Sep 17 00:00:00 2001 From: Steven Rostedt Date: Wed, 30 Jul 2008 22:36:46 -0400 Subject: ftrace: dump out ftrace buffers to console on panic At OLS I had a lot of interest to be able to have the ftrace buffers dumped on panic. Usually one would expect to uses kexec and examine the buffers after a new kernel is loaded. But sometimes the resources do not permit kdump and kexec, so having an option to still see the sequence of events up to the crash is very advantageous. This patch adds the option to have the ftrace buffers dumped to the console in the latency_trace format on a panic. When the option is set, the default entries per CPU buffer are lowered to 16384, since the writing to the serial (if that is the console) may take an awful long time otherwise. [ Changes since -v1: Got alpine to send correctly (as well as spell check working). Removed config option. Moved the static variables into ftrace_dump itself. Gave printk a log level. ] Signed-off-by: Steven Rostedt Cc: Peter Zijlstra Signed-off-by: Ingo Molnar --- include/linux/ftrace.h | 2 ++ 1 file changed, 2 insertions(+) (limited to 'include') diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h index 018af16bce5..f7fb92045bf 100644 --- a/include/linux/ftrace.h +++ b/include/linux/ftrace.h @@ -178,6 +178,7 @@ ftrace_special(unsigned long arg1, unsigned long arg2, unsigned long arg3); extern int __ftrace_printk(unsigned long ip, const char *fmt, ...) __attribute__ ((format (printf, 2, 3))); +extern void ftrace_dump(void); #else static inline void ftrace_special(unsigned long arg1, unsigned long arg2, unsigned long arg3) { } @@ -186,6 +187,7 @@ ftrace_printk(const char *fmt, ...) __attribute__ ((format (printf, 1, 0))) { return 0; } +static inline void ftrace_dump(void) { } #endif #ifdef CONFIG_FTRACE_MCOUNT_RECORD -- cgit v1.2.3 From 7b928c23fa3e9fa37d1d4ba52ba963f41ee5aae0 Mon Sep 17 00:00:00 2001 From: Ingo Molnar Date: Fri, 15 Aug 2008 17:48:02 +0200 Subject: ftrace: build fix MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit fix: In file included from init/main.c:65: include/linux/ftrace.h:166: error: expected ‘,' or ‘;' before ‘{' token make[1]: *** [init/main.o] Error 1 make: *** [init/main.o] Error 2 Signed-off-by: Ingo Molnar --- include/linux/ftrace.h | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) (limited to 'include') diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h index f7fb92045bf..ce929cb5543 100644 --- a/include/linux/ftrace.h +++ b/include/linux/ftrace.h @@ -183,7 +183,10 @@ extern void ftrace_dump(void); static inline void ftrace_special(unsigned long arg1, unsigned long arg2, unsigned long arg3) { } static inline int -ftrace_printk(const char *fmt, ...) __attribute__ ((format (printf, 1, 0))) +ftrace_printk(const char *fmt, ...) __attribute__ ((format (printf, 1, 0))); + +static inline int +ftrace_printk(const char *fmt, ...) { return 0; } -- cgit v1.2.3 From c5131ad6c3cbe8f6674993e29a76cecf8deb4384 Mon Sep 17 00:00:00 2001 From: Ingo Molnar Date: Fri, 15 Aug 2008 18:22:09 +0200 Subject: ftrace: ftrace_kill_atomic() build fix fix: kernel/built-in.o: In function `ftrace_dump': (.text+0x2e2ea): undefined reference to `ftrace_kill_atomic' Signed-off-by: Ingo Molnar --- include/linux/ftrace.h | 1 + 1 file changed, 1 insertion(+) (limited to 'include') diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h index ce929cb5543..36c439927ff 100644 --- a/include/linux/ftrace.h +++ b/include/linux/ftrace.h @@ -36,6 +36,7 @@ extern void ftrace_stub(unsigned long a0, unsigned long a1); # define register_ftrace_function(ops) do { } while (0) # define unregister_ftrace_function(ops) do { } while (0) # define clear_ftrace_function(ops) do { } while (0) +static inline void ftrace_kill_atomic(void) { } #endif /* CONFIG_FTRACE */ #ifdef CONFIG_DYNAMIC_FTRACE -- cgit v1.2.3 From 3700273586ee6a58b95dd07d9f8a02db4a9b476f Mon Sep 17 00:00:00 2001 From: Huang Ying Date: Mon, 18 Aug 2008 16:24:56 +0800 Subject: ftrace: fix incorrect comment style of __ftrace_enabled_save() This patch fixes incorrect comment style of __ftrace_enabled_save(). Signed-off-by: Huang Ying Signed-off-by: Ingo Molnar --- include/linux/ftrace.h | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) (limited to 'include') diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h index 36c439927ff..8b4cf38c80d 100644 --- a/include/linux/ftrace.h +++ b/include/linux/ftrace.h @@ -99,9 +99,11 @@ static inline void tracer_disable(void) #endif } -/* Ftrace disable/restore without lock. Some synchronization mechanism +/* + * Ftrace disable/restore without lock. Some synchronization mechanism * must be used to prevent ftrace_enabled to be changed between - * disable/restore. */ + * disable/restore. + */ static inline int __ftrace_enabled_save(void) { #ifdef CONFIG_FTRACE -- cgit v1.2.3 From c0719e5a4b1ccc04180b7a7b71095c9fb7131919 Mon Sep 17 00:00:00 2001 From: Steven Rostedt Date: Sat, 6 Sep 2008 01:06:03 -0400 Subject: ftrace: use ftrace_release for all dynamic ftrace functions ftrace_release is necessary for all uses of dynamic ftrace and not just the archs that have CONFIG_FTRACE_MCOUNT_RECORD defined. Signed-off-by: Steven Rostedt Signed-off-by: Ingo Molnar --- include/linux/ftrace.h | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) (limited to 'include') diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h index 8b4cf38c80d..5de9903645d 100644 --- a/include/linux/ftrace.h +++ b/include/linux/ftrace.h @@ -77,8 +77,10 @@ extern void mcount_call(void); extern int skip_trace(unsigned long ip); -void ftrace_disable_daemon(void); -void ftrace_enable_daemon(void); +extern void ftrace_release(void *start, unsigned long size); + +extern void ftrace_disable_daemon(void); +extern void ftrace_enable_daemon(void); #else # define skip_trace(ip) ({ 0; }) @@ -86,6 +88,7 @@ void ftrace_enable_daemon(void); # define ftrace_set_filter(buf, len, reset) do { } while (0) # define ftrace_disable_daemon() do { } while (0) # define ftrace_enable_daemon() do { } while (0) +static inline void ftrace_release(void *start, unsigned long size) { } #endif /* CONFIG_DYNAMIC_FTRACE */ /* totally disable ftrace - can not re-enable after this */ @@ -199,12 +202,10 @@ static inline void ftrace_dump(void) { } #ifdef CONFIG_FTRACE_MCOUNT_RECORD extern void ftrace_init(void); extern void ftrace_init_module(unsigned long *start, unsigned long *end); -extern void ftrace_release(void *start, unsigned long size); #else static inline void ftrace_init(void) { } static inline void ftrace_init_module(unsigned long *start, unsigned long *end) { } -static inline void ftrace_release(void *start, unsigned long size) { } #endif -- cgit v1.2.3 From 9e57fb35d711331a9b1410c5c56ebeb3733428a0 Mon Sep 17 00:00:00 2001 From: Pekka Paalanen Date: Tue, 16 Sep 2008 22:00:34 +0300 Subject: x86 mmiotrace: implement mmiotrace_printk() Offer mmiotrace users a function to inject markers from inside the kernel. This depends on the trace_vprintk() patch. Signed-off-by: Pekka Paalanen Acked-by: Steven Rostedt Signed-off-by: Ingo Molnar --- include/linux/mmiotrace.h | 17 +++++++++++++++-- 1 file changed, 15 insertions(+), 2 deletions(-) (limited to 'include') diff --git a/include/linux/mmiotrace.h b/include/linux/mmiotrace.h index 61d19e1b7a0..60cc3bf5c53 100644 --- a/include/linux/mmiotrace.h +++ b/include/linux/mmiotrace.h @@ -34,11 +34,15 @@ extern void unregister_kmmio_probe(struct kmmio_probe *p); /* Called from page fault handler. */ extern int kmmio_handler(struct pt_regs *regs, unsigned long addr); -/* Called from ioremap.c */ #ifdef CONFIG_MMIOTRACE +/* Called from ioremap.c */ extern void mmiotrace_ioremap(resource_size_t offset, unsigned long size, void __iomem *addr); extern void mmiotrace_iounmap(volatile void __iomem *addr); + +/* For anyone to insert markers. Remember trailing newline. */ +extern int mmiotrace_printk(const char *fmt, ...) + __attribute__ ((format (printf, 1, 2))); #else static inline void mmiotrace_ioremap(resource_size_t offset, unsigned long size, void __iomem *addr) @@ -48,7 +52,15 @@ static inline void mmiotrace_ioremap(resource_size_t offset, static inline void mmiotrace_iounmap(volatile void __iomem *addr) { } -#endif /* CONFIG_MMIOTRACE_HOOKS */ + +static inline int mmiotrace_printk(const char *fmt, ...) + __attribute__ ((format (printf, 1, 0))); + +static inline int mmiotrace_printk(const char *fmt, ...) +{ + return 0; +} +#endif /* CONFIG_MMIOTRACE */ enum mm_io_opcode { MMIO_READ = 0x1, /* struct mmiotrace_rw */ @@ -81,5 +93,6 @@ extern void enable_mmiotrace(void); extern void disable_mmiotrace(void); extern void mmio_trace_rw(struct mmiotrace_rw *rw); extern void mmio_trace_mapping(struct mmiotrace_map *map); +extern int mmio_trace_printk(const char *fmt, va_list args); #endif /* MMIOTRACE_H */ -- cgit v1.2.3 From 4427414170a63331a9cc36b9598502c5cdfe453b Mon Sep 17 00:00:00 2001 From: Pekka Paalanen Date: Tue, 16 Sep 2008 22:03:56 +0300 Subject: mmiotrace: remove left-over marker cruft Signed-off-by: Pekka Paalanen Acked-by: Steven Rostedt Signed-off-by: Ingo Molnar --- include/linux/mmiotrace.h | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) (limited to 'include') diff --git a/include/linux/mmiotrace.h b/include/linux/mmiotrace.h index 60cc3bf5c53..139d7c88d9c 100644 --- a/include/linux/mmiotrace.h +++ b/include/linux/mmiotrace.h @@ -67,8 +67,7 @@ enum mm_io_opcode { MMIO_WRITE = 0x2, /* struct mmiotrace_rw */ MMIO_PROBE = 0x3, /* struct mmiotrace_map */ MMIO_UNPROBE = 0x4, /* struct mmiotrace_map */ - MMIO_MARKER = 0x5, /* raw char data */ - MMIO_UNKNOWN_OP = 0x6, /* struct mmiotrace_rw */ + MMIO_UNKNOWN_OP = 0x5, /* struct mmiotrace_rw */ }; struct mmiotrace_rw { -- cgit v1.2.3 From e98d0eabef2748d88fa58760d104e8e68517406b Mon Sep 17 00:00:00 2001 From: Mathieu Desnoyers Date: Mon, 29 Sep 2008 11:05:13 -0400 Subject: markers: marker_synchronize_unregister() Create marker_synchronize_unregister() which must be called before the end of exit() to make sure every probe callers have exited the non preemptible section and thus are not executing the probe code anymore. Signed-off-by: Mathieu Desnoyers Signed-off-by: Ingo Molnar --- include/linux/marker.h | 7 +++++++ 1 file changed, 7 insertions(+) (limited to 'include') diff --git a/include/linux/marker.h b/include/linux/marker.h index 1290653f924..889196c7fbb 100644 --- a/include/linux/marker.h +++ b/include/linux/marker.h @@ -160,4 +160,11 @@ extern int marker_probe_unregister_private_data(marker_probe_func *probe, extern void *marker_get_private_data(const char *name, marker_probe_func *probe, int num); +/* + * marker_synchronize_unregister must be called between the last marker probe + * unregistration and the end of module exit to make sure there is no caller + * executing a probe when it is freed. + */ +#define marker_synchronize_unregister() synchronize_sched() + #endif -- cgit v1.2.3 From 53c8c8fdfd2d2d515bdcb3d0f2a11d1f3f42ece1 Mon Sep 17 00:00:00 2001 From: Mathieu Desnoyers Date: Fri, 3 Oct 2008 11:52:54 -0400 Subject: markers: turn marker_synchronize_unregister() into an inline Turn marker synchronize unregister into a static inline. There is no reason to keep it as a macro over a static inline. Signed-off-by: Mathieu Desnoyers Signed-off-by: Ingo Molnar --- include/linux/marker.h | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) (limited to 'include') diff --git a/include/linux/marker.h b/include/linux/marker.h index 889196c7fbb..38e32e781ed 100644 --- a/include/linux/marker.h +++ b/include/linux/marker.h @@ -13,6 +13,7 @@ */ #include +#include struct module; struct marker; @@ -165,6 +166,9 @@ extern void *marker_get_private_data(const char *name, marker_probe_func *probe, * unregistration and the end of module exit to make sure there is no caller * executing a probe when it is freed. */ -#define marker_synchronize_unregister() synchronize_sched() +static inline void marker_synchronize_unregister(void) +{ + synchronize_sched(); +} #endif -- cgit v1.2.3 From d13744cd6e3fef373a3fe656ac349b4e7c49ff79 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Fr=C3=A9d=C3=A9ric=20Weisbecker?= Date: Tue, 23 Sep 2008 11:32:08 +0100 Subject: tracing/ftrace: add the boot tracer Add the boot/initcall tracer. It's primary purpose is to be able to trace the initcalls. It is intended to be used with scripts/bootgraph.pl after some small improvements. Note that it is not active after its init. To avoid tracing (and so crashing) before the whole tracing engine init, you have to explicitly call start_boot_trace() after do_pre_smp_initcalls() to enable it. Signed-off-by: Frederic Weisbecker Signed-off-by: Ingo Molnar --- include/linux/ftrace.h | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) (limited to 'include') diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h index 5de9903645d..91954eb6460 100644 --- a/include/linux/ftrace.h +++ b/include/linux/ftrace.h @@ -5,6 +5,8 @@ #include #include +#include +#include extern int ftrace_enabled; extern int @@ -209,4 +211,21 @@ ftrace_init_module(unsigned long *start, unsigned long *end) { } #endif +struct boot_trace { + pid_t caller; + initcall_t func; + int result; + unsigned long long duration; +}; + +#ifdef CONFIG_BOOT_TRACER +extern void trace_boot(struct boot_trace *it); +extern void start_boot_trace(void); +#else +static inline void trace_boot(struct boot_trace *it) { } +static inline void start_boot_trace(void) { } +#endif + + + #endif /* _LINUX_FTRACE_H */ -- cgit v1.2.3 From 7a8e76a3829f1067b70f715771ff88baf2fbf3c3 Mon Sep 17 00:00:00 2001 From: Steven Rostedt Date: Mon, 29 Sep 2008 23:02:38 -0400 Subject: tracing: unified trace buffer This is a unified tracing buffer that implements a ring buffer that hopefully everyone will eventually be able to use. The events recorded into the buffer have the following structure: struct ring_buffer_event { u32 type:2, len:3, time_delta:27; u32 array[]; }; The minimum size of an event is 8 bytes. All events are 4 byte aligned inside the buffer. There are 4 types (all internal use for the ring buffer, only the data type is exported to the interface users). RINGBUF_TYPE_PADDING: this type is used to note extra space at the end of a buffer page. RINGBUF_TYPE_TIME_EXTENT: This type is used when the time between events is greater than the 27 bit delta can hold. We add another 32 bits, and record that in its own event (8 byte size). RINGBUF_TYPE_TIME_STAMP: (Not implemented yet). This will hold data to help keep the buffer timestamps in sync. RINGBUF_TYPE_DATA: The event actually holds user data. The "len" field is only three bits. Since the data must be 4 byte aligned, this field is shifted left by 2, giving a max length of 28 bytes. If the data load is greater than 28 bytes, the first array field holds the full length of the data load and the len field is set to zero. Example, data size of 7 bytes: type = RINGBUF_TYPE_DATA len = 2 time_delta: - array[0..1]: <7 bytes of data> <1 byte empty> This event is saved in 12 bytes of the buffer. An event with 82 bytes of data: type = RINGBUF_TYPE_DATA len = 0 time_delta: - array[0]: 84 (Note the alignment) array[1..14]: <82 bytes of data> <2 bytes empty> The above event is saved in 92 bytes (if my math is correct). 82 bytes of data, 2 bytes empty, 4 byte header, 4 byte length. Do not reference the above event struct directly. Use the following functions to gain access to the event table, since the ring_buffer_event structure may change in the future. ring_buffer_event_length(event): get the length of the event. This is the size of the memory used to record this event, and not the size of the data pay load. ring_buffer_time_delta(event): get the time delta of the event This returns the delta time stamp since the last event. Note: Even though this is in the header, there should be no reason to access this directly, accept for debugging. ring_buffer_event_data(event): get the data from the event This is the function to use to get the actual data from the event. Note, it is only a pointer to the data inside the buffer. This data must be copied to another location otherwise you risk it being written over in the buffer. ring_buffer_lock: A way to lock the entire buffer. ring_buffer_unlock: unlock the buffer. ring_buffer_alloc: create a new ring buffer. Can choose between overwrite or consumer/producer mode. Overwrite will overwrite old data, where as consumer producer will throw away new data if the consumer catches up with the producer. The consumer/producer is the default. ring_buffer_free: free the ring buffer. ring_buffer_resize: resize the buffer. Changes the size of each cpu buffer. Note, it is up to the caller to provide that the buffer is not being used while this is happening. This requirement may go away but do not count on it. ring_buffer_lock_reserve: locks the ring buffer and allocates an entry on the buffer to write to. ring_buffer_unlock_commit: unlocks the ring buffer and commits it to the buffer. ring_buffer_write: writes some data into the ring buffer. ring_buffer_peek: Look at a next item in the cpu buffer. ring_buffer_consume: get the next item in the cpu buffer and consume it. That is, this function increments the head pointer. ring_buffer_read_start: Start an iterator of a cpu buffer. For now, this disables the cpu buffer, until you issue a finish. This is just because we do not want the iterator to be overwritten. This restriction may change in the future. But note, this is used for static reading of a buffer which is usually done "after" a trace. Live readings would want to use the ring_buffer_consume above, which will not disable the ring buffer. ring_buffer_read_finish: Finishes the read iterator and reenables the ring buffer. ring_buffer_iter_peek: Look at the next item in the cpu iterator. ring_buffer_read: Read the iterator and increment it. ring_buffer_iter_reset: Reset the iterator to point to the beginning of the cpu buffer. ring_buffer_iter_empty: Returns true if the iterator is at the end of the cpu buffer. ring_buffer_size: returns the size in bytes of each cpu buffer. Note, the real size is this times the number of CPUs. ring_buffer_reset_cpu: Sets the cpu buffer to empty ring_buffer_reset: sets all cpu buffers to empty ring_buffer_swap_cpu: swaps a cpu buffer from one buffer with a cpu buffer of another buffer. This is handy when you want to take a snap shot of a running trace on just one cpu. Having a backup buffer, to swap with facilitates this. Ftrace max latencies use this. ring_buffer_empty: Returns true if the ring buffer is empty. ring_buffer_empty_cpu: Returns true if the cpu buffer is empty. ring_buffer_record_disable: disable all cpu buffers (read only) ring_buffer_record_disable_cpu: disable a single cpu buffer (read only) ring_buffer_record_enable: enable all cpu buffers. ring_buffer_record_enabl_cpu: enable a single cpu buffer. ring_buffer_entries: The number of entries in a ring buffer. ring_buffer_overruns: The number of entries removed due to writing wrap. ring_buffer_time_stamp: Get the time stamp used by the ring buffer ring_buffer_normalize_time_stamp: normalize the ring buffer time stamp into nanosecs. I still need to implement the GTOD feature. But we need support from the cpu frequency infrastructure. But this can be done at a later time without affecting the ring buffer interface. Signed-off-by: Steven Rostedt Signed-off-by: Ingo Molnar --- include/linux/ring_buffer.h | 130 ++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 130 insertions(+) create mode 100644 include/linux/ring_buffer.h (limited to 'include') diff --git a/include/linux/ring_buffer.h b/include/linux/ring_buffer.h new file mode 100644 index 00000000000..c52375b8330 --- /dev/null +++ b/include/linux/ring_buffer.h @@ -0,0 +1,130 @@ +#ifndef _LINUX_RING_BUFFER_H +#define _LINUX_RING_BUFFER_H + +#include +#include + +struct ring_buffer; +struct ring_buffer_iter; + +/* + * Don't reference this struct directly, use functions below. + */ +struct ring_buffer_event { + u32 type:2, len:3, time_delta:27; + u32 array[]; +}; + +/** + * enum ring_buffer_type - internal ring buffer types + * + * @RINGBUF_TYPE_PADDING: Left over page padding + * array is ignored + * size is variable depending on how much + * padding is needed + * + * @RINGBUF_TYPE_TIME_EXTEND: Extend the time delta + * array[0] = time delta (28 .. 59) + * size = 8 bytes + * + * @RINGBUF_TYPE_TIME_STAMP: Sync time stamp with external clock + * array[0] = tv_nsec + * array[1] = tv_sec + * size = 16 bytes + * + * @RINGBUF_TYPE_DATA: Data record + * If len is zero: + * array[0] holds the actual length + * array[1..(length+3)/4-1] holds data + * else + * length = len << 2 + * array[0..(length+3)/4] holds data + */ +enum ring_buffer_type { + RINGBUF_TYPE_PADDING, + RINGBUF_TYPE_TIME_EXTEND, + /* FIXME: RINGBUF_TYPE_TIME_STAMP not implemented */ + RINGBUF_TYPE_TIME_STAMP, + RINGBUF_TYPE_DATA, +}; + +unsigned ring_buffer_event_length(struct ring_buffer_event *event); +void *ring_buffer_event_data(struct ring_buffer_event *event); + +/** + * ring_buffer_event_time_delta - return the delta timestamp of the event + * @event: the event to get the delta timestamp of + * + * The delta timestamp is the 27 bit timestamp since the last event. + */ +static inline unsigned +ring_buffer_event_time_delta(struct ring_buffer_event *event) +{ + return event->time_delta; +} + +void ring_buffer_lock(struct ring_buffer *buffer, unsigned long *flags); +void ring_buffer_unlock(struct ring_buffer *buffer, unsigned long flags); + +/* + * size is in bytes for each per CPU buffer. + */ +struct ring_buffer * +ring_buffer_alloc(unsigned long size, unsigned flags); +void ring_buffer_free(struct ring_buffer *buffer); + +int ring_buffer_resize(struct ring_buffer *buffer, unsigned long size); + +struct ring_buffer_event * +ring_buffer_lock_reserve(struct ring_buffer *buffer, + unsigned long length, + unsigned long *flags); +int ring_buffer_unlock_commit(struct ring_buffer *buffer, + struct ring_buffer_event *event, + unsigned long flags); +int ring_buffer_write(struct ring_buffer *buffer, + unsigned long length, void *data); + +struct ring_buffer_event * +ring_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts); +struct ring_buffer_event * +ring_buffer_consume(struct ring_buffer *buffer, int cpu, u64 *ts); + +struct ring_buffer_iter * +ring_buffer_read_start(struct ring_buffer *buffer, int cpu); +void ring_buffer_read_finish(struct ring_buffer_iter *iter); + +struct ring_buffer_event * +ring_buffer_iter_peek(struct ring_buffer_iter *iter, u64 *ts); +struct ring_buffer_event * +ring_buffer_read(struct ring_buffer_iter *iter, u64 *ts); +void ring_buffer_iter_reset(struct ring_buffer_iter *iter); +int ring_buffer_iter_empty(struct ring_buffer_iter *iter); + +unsigned long ring_buffer_size(struct ring_buffer *buffer); + +void ring_buffer_reset_cpu(struct ring_buffer *buffer, int cpu); +void ring_buffer_reset(struct ring_buffer *buffer); + +int ring_buffer_swap_cpu(struct ring_buffer *buffer_a, + struct ring_buffer *buffer_b, int cpu); + +int ring_buffer_empty(struct ring_buffer *buffer); +int ring_buffer_empty_cpu(struct ring_buffer *buffer, int cpu); + +void ring_buffer_record_disable(struct ring_buffer *buffer); +void ring_buffer_record_enable(struct ring_buffer *buffer); +void ring_buffer_record_disable_cpu(struct ring_buffer *buffer, int cpu); +void ring_buffer_record_enable_cpu(struct ring_buffer *buffer, int cpu); + +unsigned long ring_buffer_entries(struct ring_buffer *buffer); +unsigned long ring_buffer_overruns(struct ring_buffer *buffer); + +u64 ring_buffer_time_stamp(int cpu); +void ring_buffer_normalize_time_stamp(int cpu, u64 *ts); + +enum ring_buffer_flags { + RB_FL_OVERWRITE = 1 << 0, +}; + +#endif /* _LINUX_RING_BUFFER_H */ -- cgit v1.2.3 From d769041f865330034131525ee6a7f72eb4af2a24 Mon Sep 17 00:00:00 2001 From: Steven Rostedt Date: Wed, 1 Oct 2008 00:29:53 -0400 Subject: ring_buffer: implement new locking The old "lock always" scheme had issues with lockdep, and was not very efficient anyways. This patch does a new design to be partially lockless on writes. Writes will add new entries to the per cpu pages by simply disabling interrupts. When a write needs to go to another page than it will grab the lock. A new "read page" has been added so that the reader can pull out a page from the ring buffer to read without worrying about the writer writing over it. This allows us to not take the lock for all reads. The lock is now only taken when a read needs to go to a new page. This is far from lockless, and interrupts still need to be disabled, but it is a step towards a more lockless solution, and it also solves a lot of the issues that were noticed by the first conversion of ftrace to the ring buffers. Note: the ring_buffer_{un}lock API has been removed. Signed-off-by: Steven Rostedt Signed-off-by: Ingo Molnar --- include/linux/ring_buffer.h | 3 --- 1 file changed, 3 deletions(-) (limited to 'include') diff --git a/include/linux/ring_buffer.h b/include/linux/ring_buffer.h index c52375b8330..536b0ca46a0 100644 --- a/include/linux/ring_buffer.h +++ b/include/linux/ring_buffer.h @@ -63,9 +63,6 @@ ring_buffer_event_time_delta(struct ring_buffer_event *event) return event->time_delta; } -void ring_buffer_lock(struct ring_buffer *buffer, unsigned long *flags); -void ring_buffer_unlock(struct ring_buffer *buffer, unsigned long flags); - /* * size is in bytes for each per CPU buffer. */ -- cgit v1.2.3 From cb5ab74204a6e2579d1119bf1348eb806526b12b Mon Sep 17 00:00:00 2001 From: Frederic Weisbecker Date: Thu, 2 Oct 2008 12:59:20 +0200 Subject: tracing/fastboot: change the printing of boot tracer according to bootgraph.pl Change the boot tracer printing to make it parsable for the scripts/bootgraph.pl script. We have now to output two lines for each initcall, according to the printk in do_one_initcall() in init/main.c We need now the call's time and the return's time. Signed-off-by: Frederic Weisbecker Signed-off-by: Ingo Molnar --- include/linux/ftrace.h | 2 ++ 1 file changed, 2 insertions(+) (limited to 'include') diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h index 91954eb6460..4455490d91b 100644 --- a/include/linux/ftrace.h +++ b/include/linux/ftrace.h @@ -216,6 +216,8 @@ struct boot_trace { initcall_t func; int result; unsigned long long duration; + ktime_t calltime; + ktime_t rettime; }; #ifdef CONFIG_BOOT_TRACER -- cgit v1.2.3 From 5601020feb0c3010e9e3e0131e9697ac6a06777b Mon Sep 17 00:00:00 2001 From: Frederic Weisbecker Date: Thu, 2 Oct 2008 13:26:05 +0200 Subject: tracing/fastboot: get the initcall name before it disappears After some initcall traces, some initcall names may be inconsistent. That's because these functions will disappear from the .init section and also their name from the symbols table. So we have to copy the name of the function in a buffer large enough during the trace appending. It is not costly for the ring_buffer because the number of initcall entries is commonly not really large. Signed-off-by: Frederic Weisbecker Signed-off-by: Ingo Molnar --- include/linux/ftrace.h | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) (limited to 'include') diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h index 4455490d91b..e672e51c40a 100644 --- a/include/linux/ftrace.h +++ b/include/linux/ftrace.h @@ -7,6 +7,7 @@ #include #include #include +#include extern int ftrace_enabled; extern int @@ -213,7 +214,7 @@ ftrace_init_module(unsigned long *start, unsigned long *end) { } struct boot_trace { pid_t caller; - initcall_t func; + char func[KSYM_NAME_LEN]; int result; unsigned long long duration; ktime_t calltime; @@ -221,10 +222,10 @@ struct boot_trace { }; #ifdef CONFIG_BOOT_TRACER -extern void trace_boot(struct boot_trace *it); +extern void trace_boot(struct boot_trace *it, initcall_t fn); extern void start_boot_trace(void); #else -static inline void trace_boot(struct boot_trace *it) { } +static inline void trace_boot(struct boot_trace *it, initcall_t fn) { } static inline void start_boot_trace(void) { } #endif -- cgit v1.2.3 From 3e1932ad59726d794a865cc159c0593d54bf0cb6 Mon Sep 17 00:00:00 2001 From: Ingo Molnar Date: Thu, 2 Oct 2008 17:45:47 +0200 Subject: tracing/fastboot: build fix fix: In file included from kernel/sysctl.c:52: include/linux/ftrace.h:217: error: 'KSYM_NAME_LEN' undeclared here (not in a function) Signed-off-by: Ingo Molnar --- include/linux/ftrace.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'include') diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h index e672e51c40a..deded114dff 100644 --- a/include/linux/ftrace.h +++ b/include/linux/ftrace.h @@ -1,14 +1,14 @@ #ifndef _LINUX_FTRACE_H #define _LINUX_FTRACE_H -#ifdef CONFIG_FTRACE - #include #include #include #include #include +#ifdef CONFIG_FTRACE + extern int ftrace_enabled; extern int ftrace_enable_sysctl(struct ctl_table *table, int write, -- cgit v1.2.3 From eb7fa935274bb233686fdf7a53f40c5d9ee76ed6 Mon Sep 17 00:00:00 2001 From: Steven Noonan Date: Thu, 2 Oct 2008 12:00:07 -0700 Subject: ftrace: ktime.h not included in ftrace.h Including eliminates the following error: include/linux/ftrace.h:220: error: expected specifier-qualifier-list before 'ktime_t' Signed-off-by: Steven Noonan Signed-off-by: Ingo Molnar --- include/linux/ftrace.h | 1 + 1 file changed, 1 insertion(+) (limited to 'include') diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h index deded114dff..ed53265d1f6 100644 --- a/include/linux/ftrace.h +++ b/include/linux/ftrace.h @@ -3,6 +3,7 @@ #include #include +#include #include #include #include -- cgit v1.2.3 From 097d036a2f25eecc42435c57e010aaf4a2eed2d9 Mon Sep 17 00:00:00 2001 From: Frederic Weisbecker Date: Fri, 3 Oct 2008 15:39:21 +0200 Subject: tracing/fastboot: only trace non-module initcalls At this time, only built-in initcalls interest us. We can't really produce a relevant graph if we include the modules initcall too. I had good results after this patch (see svg in attachment). Signed-off-by: Frederic Weisbecker Signed-off-by: Ingo Molnar --- include/linux/ftrace.h | 2 ++ 1 file changed, 2 insertions(+) (limited to 'include') diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h index ed53265d1f6..5812dba4ee2 100644 --- a/include/linux/ftrace.h +++ b/include/linux/ftrace.h @@ -225,9 +225,11 @@ struct boot_trace { #ifdef CONFIG_BOOT_TRACER extern void trace_boot(struct boot_trace *it, initcall_t fn); extern void start_boot_trace(void); +extern void stop_boot_trace(void); #else static inline void trace_boot(struct boot_trace *it, initcall_t fn) { } static inline void start_boot_trace(void) { } +static inline void stop_boot_trace(void) { } #endif -- cgit v1.2.3 From ca538f6bbe583406f941f3041d40c41f9a13d1de Mon Sep 17 00:00:00 2001 From: Tim Bird Date: Thu, 9 Oct 2008 15:23:05 -0700 Subject: tracing/fastboot: add better resolution to initcall debug/tracing Change the time resolution for initcall_debug to microseconds, from milliseconds. This is handy to determine which initcalls you want to work on for faster booting. One one of my test machines, over 90% of the initcalls are less than a millisecond and (without this patch) these are all reported as 0 msecs. Working on the 900 us ones is more important than the 4 us ones. With 'quiet' on the kernel command line, this adds no significant overhead to kernel boot time. Signed-off-by: Tim Bird Signed-off-by: Andrew Morton Signed-off-by: Ingo Molnar --- include/linux/ftrace.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'include') diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h index 5812dba4ee2..a3d46151be1 100644 --- a/include/linux/ftrace.h +++ b/include/linux/ftrace.h @@ -215,9 +215,9 @@ ftrace_init_module(unsigned long *start, unsigned long *end) { } struct boot_trace { pid_t caller; - char func[KSYM_NAME_LEN]; + char func[KSYM_NAME_LEN]; int result; - unsigned long long duration; + unsigned long long duration; /* usecs */ ktime_t calltime; ktime_t rettime; }; -- cgit v1.2.3 From bfadadfccc19e36f7d600c5ce7b3e5ba5197fbf0 Mon Sep 17 00:00:00 2001 From: Mathieu Desnoyers Date: Fri, 10 Oct 2008 03:48:25 -0400 Subject: markers: fix synchronize marker unregister static inline Use a #define for synchronize marker unregister to fix include dependencies. Fixes the slab circular inclusion which triggers when slab.git is combined with tracing.git, where rcupdate includes slab, which includes markers which includes rcupdate. Signed-off-by: Mathieu Desnoyers Acked-by: Pekka Enberg Signed-off-by: Ingo Molnar --- include/linux/marker.h | 6 +----- 1 file changed, 1 insertion(+), 5 deletions(-) (limited to 'include') diff --git a/include/linux/marker.h b/include/linux/marker.h index 38e32e781ed..889196c7fbb 100644 --- a/include/linux/marker.h +++ b/include/linux/marker.h @@ -13,7 +13,6 @@ */ #include -#include struct module; struct marker; @@ -166,9 +165,6 @@ extern void *marker_get_private_data(const char *name, marker_probe_func *probe, * unregistration and the end of module exit to make sure there is no caller * executing a probe when it is freed. */ -static inline void marker_synchronize_unregister(void) -{ - synchronize_sched(); -} +#define marker_synchronize_unregister() synchronize_sched() #endif -- cgit v1.2.3 From f2461fc82a083dd60062e05e704c5fcc1c658ba1 Mon Sep 17 00:00:00 2001 From: Mathieu Desnoyers Date: Mon, 6 Oct 2008 10:33:00 -0400 Subject: tracepoints: tracepoint_synchronize_unregister() Create tracepoint_synchronize_unregister() which must be called before the end of exit() to make sure every probe callers have exited the non preemptible section and thus are not executing the probe code anymore. Signed-off-by: Mathieu Desnoyers Signed-off-by: Ingo Molnar --- include/linux/tracepoint.h | 7 +++++++ 1 file changed, 7 insertions(+) (limited to 'include') diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h index e623a6fca5c..199f4c207c1 100644 --- a/include/linux/tracepoint.h +++ b/include/linux/tracepoint.h @@ -124,4 +124,11 @@ extern void tracepoint_iter_reset(struct tracepoint_iter *iter); extern int tracepoint_get_iter_range(struct tracepoint **tracepoint, struct tracepoint *begin, struct tracepoint *end); +/* + * tracepoint_synchronize_unregister must be called between the last tracepoint + * probe unregistration and the end of module exit to make sure there is no + * caller executing a probe when it is freed. + */ +#define tracepoint_synchronize_unregister() synchronize_sched() + #endif -- cgit v1.2.3 From 231375cc5cc3549bb413f94a164bdcbd5f9ce943 Mon Sep 17 00:00:00 2001 From: Mathieu Desnoyers Date: Fri, 3 Oct 2008 15:01:33 -0400 Subject: tracepoints: synchronize unregister static inline Turn tracepoint synchronize unregister into a static inline. There is no reason to keep it as a macro over a static inline. Signed-off-by: Mathieu Desnoyers Signed-off-by: Ingo Molnar --- include/linux/tracepoint.h | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) (limited to 'include') diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h index 199f4c207c1..c5bb39c7a77 100644 --- a/include/linux/tracepoint.h +++ b/include/linux/tracepoint.h @@ -129,6 +129,9 @@ extern int tracepoint_get_iter_range(struct tracepoint **tracepoint, * probe unregistration and the end of module exit to make sure there is no * caller executing a probe when it is freed. */ -#define tracepoint_synchronize_unregister() synchronize_sched() +static inline void tracepoint_synchronize_unregister(void) +{ + synchronize_sched(); +} #endif -- cgit v1.2.3