  • 7/30/2019 Ftrace Tutorial

    1/45

    Ftrace Tutorial

    Steven Rostedt

    Introduction

    Kernel internal tracer
    Derived from -rt patch Latency Tracer
    Plugin tracers
        ftrace : function tracer
        irqsoff : interrupt disabled latency
        wakeup : latency of highest priority task to wake up
        sched_switch: task context switches
        (more)
    Ring buffer
    Saved traces (snap shots)
        used to save maximum latency traces

    The Debug File System

    /sys/kernel/debug
    I prefer:
        mkdir /debug
        mount -t debugfs nodev /debug
    /etc/fstab
        debugfs /sys/kernel/debug debugfs defaults 0 0
        debugfs /debug debugfs defaults 0 0
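
    Before relying on either path, it can help to confirm where debugfs is actually mounted. The following is a minimal sketch, assuming the usual /proc/mounts layout (device, mount point, type, ...). The function name debugfs_mountpoints is made up for illustration and is not part of ftrace.

```shell
# Print every mount point whose filesystem type is debugfs.
# The mounts file is a parameter so the function can be tried on a copy;
# it defaults to /proc/mounts.
debugfs_mountpoints() {
    mounts_file=${1:-/proc/mounts}
    # Field 2 is the mount point, field 3 is the filesystem type.
    awk '$3 == "debugfs" { print $2 }' "$mounts_file"
}
```

    Calling debugfs_mountpoints with no argument lists the live mounts; an empty result suggests the mount command from this slide has not been run yet.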

    /debug/tracing

    available_tracers
    current_tracer
    tracing_enabled
    trace
    latency_trace
    trace_pipe
    iter_ctrl
    tracing_max_latency
    tracing_cpumask
    trace_entries

    Selecting a tracer

    wakeup preemptirqsoff preemptoff irqsoff ftrace sysprof sched_switch none

    wakeup
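
    The two outputs above look like the contents of available_tracers and current_tracer. A small sketch of selecting a tracer only when it is actually listed, assuming the /debug/tracing layout used in these slides; select_tracer and the TRACING_DIR variable are hypothetical helpers, not ftrace interfaces.

```shell
# Directory holding the tracing files; /debug/tracing follows the slides.
TRACING_DIR=${TRACING_DIR:-/debug/tracing}

# select_tracer NAME: write NAME to current_tracer only if it appears in
# available_tracers. Returns non-zero for an unknown name.
select_tracer() {
    name=$1
    for t in $(cat "$TRACING_DIR/available_tracers"); do
        if [ "$t" = "$name" ]; then
            echo "$name" > "$TRACING_DIR/current_tracer"
            return 0
        fi
    done
    echo "unknown tracer: $name" >&2
    return 1
}
```

    For example, select_tracer wakeup would leave "wakeup" in current_tracer, while a misspelled name is rejected before anything is written.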

    The none tracer

    No tracer selected
    none is special
        it is not a tracer
    echo none > /debug/tracing/current_tracer

    Starting a trace

    Do not rely on tracing being enabled
    echo 1 > /debug/tracing/tracing_enabled
        note, make sure to have a space between the '1' and the '>'. This has burnt many a kernel programmer.
    The enabled setting stays across tracers.
    echo 1 > /debug/tracing/tracing_enabled
    echo ftrace > /debug/tracing/current_tracer
    echo irqsoff > /debug/tracing/current_tracer
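
    The space matters because the shell reads "1>" as "redirect file descriptor 1", so echo 1> file runs echo with no arguments and the '1' never reaches the file. A minimal sketch that wraps the sequence from this slide so the redirection is written once and correctly; start_trace, stop_trace and TRACING_DIR are names invented here, and the paths assume the /debug mount point used in the slides.

```shell
# Directory holding the tracing files; /debug/tracing follows the slides.
TRACING_DIR=${TRACING_DIR:-/debug/tracing}

# start_trace NAME: enable tracing explicitly, then pick the tracer.
start_trace() {
    echo 1 > "$TRACING_DIR/tracing_enabled"   # note the space before '>'
    echo "$1" > "$TRACING_DIR/current_tracer"
}

# stop_trace: disable tracing; the selected tracer is left untouched.
stop_trace() {
    echo 0 > "$TRACING_DIR/tracing_enabled"
}
```

    A typical session would be start_trace irqsoff, run the workload, then stop_trace before reading the output files.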

    Stopping a trace

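    echo 0 > /debug/tracing/tracing_enabled
        do not forget that space!

    Or in a program:

    #include <fcntl.h>
    #include <unistd.h>

    int trace_fd;

    [...]

    int main(int argc, char *argv[])
    {
            [...]
            /* keep the file open so stopping the trace is a single write */
            trace_fd = open("/debug/tracing/tracing_enabled", O_WRONLY);
            [...]
            if (condition_hit()) {
                    /* stop tracing as soon as the condition is seen */
                    write(trace_fd, "0", 1);
            }
            [...]
    }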

    Reading the Output

    latency_trace
    trace
    trace_pipe

    Latency Trace Output

    # tracer: irqsoff
    #
    irqsoff latency trace v1.1.5 on 2.6.26-tip
    --------------------------------------------------------------------
     latency: 971 us, #3/3, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
        -----------------
        | task: swapper-0 (uid:0 nice:20 policy:0 rt_prio:0)
        -----------------
     => started at: acpi_os_acquire_lock
     => ended at:   cpuidle_idle_call

    #                _------=> CPU#
    #               / _-----=> irqs-off
    #              | / _----=> need-resched
    #              || / _---=> hardirq/softirq
    #              ||| / _--=> preempt-depth
    #              |||| /
    #              |||||     delay
    #  cmd     pid ||||| time  |   caller
    #     \   /    |||||   \   |   /
      <idle>-0     1d..1    1us!: _spin_lock_irqsave (acpi_os_acquire_lock)
      <idle>-0     1d..1  971us : acpi_idle_enter_bm (cpuidle_idle_call)
      <idle>-0     1d..2  972us : trace_hardirqs_on (cpuidle_idle_call)
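
    The five-character column after the pid packs the fields named in the header: CPU#, irqs-off, need-resched, hardirq/softirq and preempt-depth. As a rough sketch, the helper below spells one such column out in words. The letter meanings ('d' for irqs off, 'N' for need-resched, 'h', 's' and 'H' for hardirq, softirq and hardirq inside softirq) are taken from the kernel's ftrace documentation as I understand it, the single-digit CPU number is an assumption based on the sample, and decode_latency_flags is a made-up name.

```shell
# Decode the five-character flags column of a latency trace line,
# e.g. "1d.h2". Assumes a single-digit CPU number, as in the sample.
decode_latency_flags() {
    flags=$1
    cpu=$(printf '%s' "$flags" | cut -c1)
    irq=$(printf '%s' "$flags" | cut -c2)
    resched=$(printf '%s' "$flags" | cut -c3)
    ctx=$(printf '%s' "$flags" | cut -c4)
    depth=$(printf '%s' "$flags" | cut -c5)

    case $irq in d) irq_s="irqs off" ;; *) irq_s="irqs on" ;; esac
    case $resched in
        N) res_s="need-resched set" ;;
        *) res_s="need-resched clear" ;;
    esac
    case $ctx in
        h) ctx_s="hardirq" ;;
        s) ctx_s="softirq" ;;
        H) ctx_s="hardirq inside softirq" ;;
        *) ctx_s="task context" ;;
    esac
    # A '.' in the last position means a preempt depth of zero.
    case $depth in .) depth=0 ;; esac

    printf 'cpu=%s, %s, %s, %s, preempt-depth=%s\n' \
        "$cpu" "$irq_s" "$res_s" "$ctx_s" "$depth"
}
```

    For the first sample line, decode_latency_flags 1d..1 reports CPU 1 with interrupts off, in task context, at preempt depth 1.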

    Various outputs

      <idle>-0     1d.h2 1335164us : tick_sched_timer (__run_hrtimer)
      <idle>-0     0.Ns2 1386686us+: _spin_lock_irq (run_timer_softirq)
      <idle>-0     1d.H4 1388217us : ktime_get_ts (ktime_get)
      bash-3498    1.... 1576794us : rw_verify_area (vfs_write)
      bash-3498    1d..4  120768us+: 0:140:R + 3096:120:S gnome-terminal
      bash-3498    1d..3  120796us!: 3498:120:S ==> 0:140:R

    trace output

    <idle>-0 [01] 1977.853298: read_hpet

    iter_ctrl

    print-parent
    sym-offset
    sym-addr
    verbose
    raw
    hex
    binary
    block
    stacktrace
    sched-tree

    Using iter_ctrl

    <idle>-0 [01] 2975.463936: tick_program_event

    The tracers

    sched_switch
    ftrace
    wakeup
    irqsoff
    preemptoff
    preemptirqsoff

    Available Tracers?

    wakeup preemptirqsoff preemptoff irqsoff ftrace sysprof sched_switch none

    sched_switch

    Traces task wakeups
    Traces task context switches

    bash-3498 [01] 5459.824565: 0:140:R + 7971:120:R
    <idle>-0 [00] 5459.824836: 0:140:R ==> 7971:120:R
    bash-3498 [01] 5459.824984: 3498:120:S ==> 0:140:R
    <idle>-0 [01] 5459.825342: 0:140:R ==> 7971:120:R
    ls-7971 [00] 5459.825380: 7971:120:R + 3: 0:S
    ls-7971 [00] 5459.825384: 7971:120:R ==> 3: 0:R
    migration/0-3 [00] 5459.825401: 3: 0:S ==> 0:140:R
    ls-7971 [01] 5459.825565: 7971:120:R + 598:115:S
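
    Each event in this output has the shape pid:prio:state, an operator, and a second pid:prio:state. As far as I recall from the kernel's ftrace documentation, "+" marks a wakeup (current task, then the task being woken) and "==>" marks a context switch (previous task, then next task). The sketch below classifies the event part of a line on that assumption; describe_sched_event is a hypothetical helper.

```shell
# describe_sched_event "PID:PRIO:STATE OP PID:PRIO:STATE"
# OP is taken to be "+" for a wakeup and "==>" for a context switch.
describe_sched_event() {
    ev=$1
    sw=' ==> '
    wk=' + '
    case $ev in
        *"$sw"*) echo "switch: ${ev%%"$sw"*} -> ${ev##*"$sw"}" ;;
        *"$wk"*) echo "wakeup: ${ev%%"$wk"*} wakes ${ev##*"$wk"}" ;;
        *) echo "unrecognized: $ev" >&2; return 1 ;;
    esac
}
```

    Feeding it the event text from the third sample line, describe_sched_event '3498:120:S ==> 0:140:R', labels it as a switch from pid 3498 to pid 0.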

    stacktrace

    iter_ctrl option that affects the tracing itself

    bash-3498 [01] 6216.772637: 0:140:R + 8495:120:R

    bash-3498 [01] 6216.772639: do_fork

    ftrace - function tracer

    Traces at every non-inline function
    Other functions not traced
        annotated with notrace
        Makefile with CFLAGS_REMOVE_... = -pg
    Must have /proc/sys/kernel/ftrace_enabled=1
    Appears in most other tracers
    Very verbose

    init-1 [00] 6710.079562: _spin_lock
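
    Since the function tracer stays silent unless /proc/sys/kernel/ftrace_enabled is 1, a quick check before starting a trace can save some confusion. A minimal sketch; ftrace_is_enabled is a name invented for this example, and the file path is a parameter only so the check can be tried on a scratch file.

```shell
# ftrace_is_enabled [FILE]: succeed if the sysctl file contains 1.
# FILE defaults to the path named on the slide.
ftrace_is_enabled() {
    sysctl_file=${1:-/proc/sys/kernel/ftrace_enabled}
    [ "$(cat "$sysctl_file" 2>/dev/null)" = "1" ]
}
```

    It can guard a trace run, for example: ftrace_is_enabled || echo "function tracing is switched off" >&2.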

    Latency Tracers

    Stores the last maximum latency trace
    wakeup : scheduling latency of RT tasks
    irqsoff : interrupts off
    preemptoff : preemption off
    preemptirqsoff: interrupts and/or preemption off
    tracing_max_latency

    wakeup - sched latency

    Only traces RT tasks
        use LatencyTop for non-RT tasks
    Records and traces the maximum latency an RT task took from wake up to schedule
    Remember to reset tracing_max_latency
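
    Because only a new maximum replaces the saved trace, a stale high value in tracing_max_latency can hide everything that follows. A short sketch of resetting it before a run, assuming the /debug/tracing layout from these slides; reset_max_latency, show_max_latency and TRACING_DIR are illustrative names.

```shell
# Directory holding the tracing files; /debug/tracing follows the slides.
TRACING_DIR=${TRACING_DIR:-/debug/tracing}

# reset_max_latency: clear the recorded maximum so the next run starts fresh.
reset_max_latency() {
    echo 0 > "$TRACING_DIR/tracing_max_latency"
}

# show_max_latency: print the currently recorded maximum.
show_max_latency() {
    cat "$TRACING_DIR/tracing_max_latency"
}
```

    A run would then be: reset_max_latency, select the wakeup tracer, exercise the RT task, and read the result with show_max_latency.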

    Wakeup without function tracing

    # tracer: wakeup
    #
    wakeup latency trace v1.1.5 on 2.6.26-tip
    --------------------------------------------------------------------
     latency: 9 us, #2/2, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
        -----------------
        | task: migration/1-7663 (uid:0 nice:-5 policy:1 rt_prio:99)
        -----------------

    #                _------=> CPU#
    #               / _-----=> irqs-off
    #              | / _----=> need-resched
    #              || / _---=> hardirq/softirq
    #              ||| / _--=> preempt-depth
    #              |||| /
    #              |||||     delay
    #  cmd     pid ||||| time  |   caller
    #     \   /    |||||   \   |   /
      usleep-10237 1d..2    2us+: try_to_wake_up (wake_up_process)
      usleep-10237 1d..3    9us : schedule (preempt_schedule)

    With function tracing

    # tracer: wakeup
    #
    wakeup latency trace v1.1.5 on 2.6.26-tip
    --------------------------------------------------------------------
     latency: 19 us, #18/18, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
        -----------------
        | task: usleep-10133 (uid:0 nice:0 policy:1 rt_prio:10)
        -----------------

    #                _------=> CPU#
    #               / _-----=> irqs-off
    #              | / _----=> need-resched
    #              || / _---=> hardirq/softirq
    #              ||| / _--=> preempt-depth
    #              |||| /
    #              |||||     delay
    #  cmd     pid ||||| time  |   caller
    #     \   /    |||||   \   |   /
      <idle>-0     0d.h4    1us : try_to_wake_up (wake_up_process)
      <idle>-0     0dNh4    2us : _spin_unlock_irqrestore (try_to_wake_up)
      <idle>-0     0dNh3    3us : _spin_lock (__run_hrtimer)
      <idle>-0     0dNh4    4us : _spin_unlock (hrtimer_interrupt)
      <idle>-0     0dNh3    5us : tick_program_event (hrtimer_interrupt)
    [...]
      <idle>-0     0.N.2   14us : _spin_lock_irqsave (hrtick_set)
      <idle>-0     0dN.3   15us+: _spin_unlock_irqrestore (hrtick_set)
      <idle>-0     0dN.2   16us : _spin_lock (schedule)
    [...]

    irqsoff

    local_irq_save(flags);        /* irqsoff: measured span starts */
    [...]
    preempt_disable();
    [...]
    local_irq_restore(flags);     /* irqsoff: measured span ends */
    [...]
    preempt_enable();

    preemptoff

    local_irq_save(flags);
    [...]
    preempt_disable();            /* preemptoff: measured span starts */
    [...]
    local_irq_restore(flags);
    [...]
    preempt_enable();             /* preemptoff: measured span ends */

    preemptirqsoff

    local_irq_save(flags);        /* preemptirqsoff: measured span starts */
    [...]
    preempt_disable();
    [...]
    local_irq_restore(flags);
    [...]
    preempt_enable();             /* preemptirqsoff: measured span ends */

    trace_entries

    Not enough data recorded
    Too much data recorded
    Run-time configurable
    Must be done with none tracer or it will give an -EBUSY
    Number is number of entries, but the buffers are allocated via pages.
    If more entries can fit on a page that was allocated to handle requested entries, the remaining page will be filled with entries
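
    The -EBUSY rule suggests a fixed order of operations: switch to none, resize, then switch back. A minimal sketch of that sequence, assuming the /debug/tracing layout from these slides; set_trace_entries and TRACING_DIR are hypothetical names, and the value read back afterwards may be larger than requested because of the page rounding described above.

```shell
# Directory holding the tracing files; /debug/tracing follows the slides.
TRACING_DIR=${TRACING_DIR:-/debug/tracing}

# set_trace_entries N: resize the buffer while the none tracer is selected,
# then restore whichever tracer was active before.
set_trace_entries() {
    entries=$1
    saved=$(cat "$TRACING_DIR/current_tracer")
    echo none > "$TRACING_DIR/current_tracer"
    echo "$entries" > "$TRACING_DIR/trace_entries"
    status=$?
    # Restore the previous tracer even if the resize failed.
    echo "$saved" > "$TRACING_DIR/current_tracer"
    return $status
}
```

    After set_trace_entries 65536, reading trace_entries shows the size the kernel actually settled on.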

    Dynamic Ftrace (the fun begins!)

    Produces non-measurable overhead when tracing is disabled
    Requires kernel thread ftraced to check for more updates
    Calls kstop_machine to execute text modification
        Not safe to modify code text in SMP environment
    /debug/tracing/ftraced_enabled

    How it works?

    With the gcc profiler switch -pg
    Every non-inline function calls mcount

    00001adb <do_fork>:
        1adb:  55                    push   %ebp
        1adc:  89 e5                 mov    %esp,%ebp
        1ade:  57                    push   %edi
        1adf:  56                    push   %esi
        1ae0:  53                    push   %ebx
        1ae1:  83 ec 1c              sub    $0x1c,%esp
        1ae4:  e8 fc ff ff ff        call   1ae5 <do_fork+0xa>
                       1ae5: R_386_PC32 mcount
        1ae9:  89 c3                 mov    %eax,%ebx
        1aeb:  89 c7                 mov    %eax,%edi
        1aed:  81 e3 00 00 00 02     and    $0x2000000,%ebx
        1af3:  89 ce                 mov    %ecx,%esi

    Non dynamic i386 mcount

    ENTRY(mcount)
            cmpl $ftrace_stub, ftrace_trace_function
            jnz trace
    .globl ftrace_stub
    ftrace_stub:
            ret

            /* taken from glibc */
    trace:
            pushl %eax
            pushl %ecx
            pushl %edx
            movl 0xc(%esp), %eax
            movl 0x4(%ebp), %edx
            subl $MCOUNT_INSN_SIZE, %eax

            call *ftrace_trace_function

            popl %edx
            popl %ecx
            popl %eax

            jmp ftrace_stub
    END(mcount)

    Dynamic i386 mcount

    ENTRY(mcount)
            pushl %eax
            pushl %ecx
            pushl %edx
            movl 0xc(%esp), %eax
            subl $MCOUNT_INSN_SIZE, %eax

    .globl mcount_call
    mcount_call:
            call ftrace_stub

            popl %edx
            popl %ecx
            popl %eax

            ret
    END(mcount)

    Call ftrace_record_ip

    ENTRY(mcount)
            pushl %eax
            pushl %ecx
            pushl %edx
            movl 0xc(%esp), %eax
            subl $MCOUNT_INSN_SIZE, %eax

    .globl mcount_call
    mcount_call:
            call ftrace_record_ip

            popl %edx
            popl %ecx
            popl %eax

            ret
    END(mcount)

    ftrace_record_ip

    [Diagram: do_fork -> ftrace_record_ip -> HASH, which holds the entry do_fork+0x9]

    ftraced

    [Diagram: ftraced takes do_fork+0x9 from the HASH, puts it on a List, and calls kstop_machine (modify code: nop)]

    nop

    00001adb <do_fork>:
        1adb:  55                    push   %ebp
        1adc:  89 e5                 mov    %esp,%ebp
        1ade:  57                    push   %edi
        1adf:  56                    push   %esi
        1ae0:  53                    push   %ebx
        1ae1:  83 ec 1c              sub    $0x1c,%esp
        1ae4:  (nop; the 5-byte call to mcount has been replaced)
        1ae9:  89 c3                 mov    %eax,%ebx
        1aeb:  89 c7                 mov    %eax,%edi
        1aed:  81 e3 00 00 00 02     and    $0x2000000,%ebx
        1af3:  89 ce                 mov    %ecx,%esi

    Starting of ftrace

    00001adb <do_fork>:
        1adb:  55                    push   %ebp
        1adc:  89 e5                 mov    %esp,%ebp
        1ade:  57                    push   %edi
        1adf:  56                    push   %esi
        1ae0:  53                    push   %ebx
        1ae1:  83 ec 1c              sub    $0x1c,%esp
        1ae4:
                       1ae5: R_386_PC32
        1ae9:  89 c3                 mov    %eax,%ebx
        1aeb:  89 c7                 mov    %eax,%edi
        1aed:  81 e3 00 00 00 02     and    $0x2000000,%ebx
        1af3:  89 ce                 mov    %ecx,%esi

    ftrace_caller

    ENTRY(ftrace_caller)
            pushl %eax
            pushl %ecx
            pushl %edx
            movl 0xc(%esp), %eax
            movl 0x4(%ebp), %edx
            subl $MCOUNT_INSN_SIZE, %eax

    .globl ftrace_call
    ftrace_call:
            call ftrace_stub

            popl %edx
            popl %ecx
            popl %eax

    .globl ftrace_stub
    ftrace_stub:
            ret
    END(ftrace_caller)

    Registering an ftrace caller

    ENTRY(ftrace_caller)
            pushl %eax
            pushl %ecx
            pushl %edx
            movl 0xc(%esp), %eax
            movl 0x4(%ebp), %edx
            subl $MCOUNT_INSN_SIZE, %eax

    .globl ftrace_call
    ftrace_call:
            call            /* target is patched to the registered ftrace caller */

            popl %edx
            popl %ecx
            popl %eax

    .globl ftrace_stub
    ftrace_stub:
            ret
    END(ftrace_caller)

    Selective function tracer

    Tracing is dynamically enabled
    We have a list of functions that need to be traced
    Why not filter which functions we trace?

    Picking what functions to trace

    /debug/tracing/available_filter_functions
    /debug/tracing/set_ftrace_filter
    /debug/tracing/set_ftrace_notrace

    available_filter_functions

    filelock_init

    __rcu_read_lock

    kmem_cache_create

    notifier_call_chain

    down_write

    __rcu_read_unlock

    _spin_lock_irq

    _spin_unlock_irq

    _spin_lock

    __kmalloc

    set_ftrace_filter

    # tracer: ftrace
    #
    #           TASK-PID   CPU#    TIMESTAMP  FUNCTION
    #              | |      |          |         |
                ls-5652  [00]  2320.450897: sys_open

    set_ftrace_notrace

    Modify like set_ftrace_filter
    Acts like a notrace added to the function
    The function will not be traced even if it is in set_ftrace_filter

    set_ftrace_* wildcards

    Prefix: echo 'sys_*' > /debug/tracing/set...
    Postfix: echo '*lock' > /debug/tracing/set...
    Included: echo '*device*' > /debug/tracing/set...
    Anything else:
        use grep on available_filter_functions
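
    For patterns the three wildcard forms cannot express, the grep route can be wrapped in a small helper. This is only a sketch: it assumes the /debug/tracing layout from these slides and that set_ftrace_filter accepts a newline-separated list of names in a single write. filter_by_regex and TRACING_DIR are names invented for the example.

```shell
# Directory holding the tracing files; /debug/tracing follows the slides.
TRACING_DIR=${TRACING_DIR:-/debug/tracing}

# filter_by_regex REGEX: trace only the functions whose names match the
# extended regular expression REGEX. Returns grep's status, so a pattern
# that matches nothing is reported as a failure.
filter_by_regex() {
    grep -E -- "$1" "$TRACING_DIR/available_filter_functions" \
        > "$TRACING_DIR/set_ftrace_filter"
}
```

    For instance, filter_by_regex '^_spin_lock(_irq)?$' would select _spin_lock and _spin_lock_irq from the list shown earlier while leaving out _spin_unlock_irq, which none of the prefix, postfix or included wildcards can do in one step.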

    Todo:

    ftrace dump on OOPS
    change sleep interval of ftraced thread
    use CPU clock (aka TSC) for interrupt and preemption latency traces
    option to force per CPU trace interleaving integrity
    printk like hooks (for debugging purposes only)
    Hooks for tuna to show in the oscilloscope