Deadlock Analysis of spin_lock() in the Linux Kernel (English)


Upload: sneeker-yeh

Post on 31-Jul-2015


Page 1: Dead Lock Analysis of spin_lock() in Linux Kernel (english)

Outline
• spin_lock and semaphore in the Linux kernel
  – Introduction and differences.
  – Deadlock example of spin_lock.
• What is context
  – What is “context”.
  – Control flow of a procedure call and of an interrupt handler.
• Log analysis
• Conclusion
  – How to prevent spin_lock deadlock.

Spin lock & Semaphore

• Semaphore:
  – When its initial value is 1, it can serve as a mutex lock that protects a critical section, just like a spin lock.
  – Unlike a spin lock, a thread that fails to get the lock goes to sleep while it waits.
• Spin lock:
  – A thread that fails to get the lock does not go to sleep; it keeps looping, retrying the lock until it succeeds.
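The busy-wait behaviour described above can be sketched in user space. This is a minimal model, assuming C11 atomics; the `model_*` names are hypothetical and this is not the kernel's `spinlock_t` implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space model of a spin lock (not the kernel's spinlock_t). */
typedef struct {
    atomic_flag held;
} model_spinlock;

/* One attempt: returns true if the lock was free and now belongs to the caller. */
static bool model_spin_trylock(model_spinlock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Busy-wait: the caller never sleeps, it retries until the attempt succeeds. */
static void model_spin_lock(model_spinlock *l)
{
    while (!model_spin_trylock(l))
        ;  /* spin */
}

/* Release the lock so that a spinning caller can take it. */
static void model_spin_unlock(model_spinlock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

A semaphore's acquire path would instead put the caller on a wait queue and give up the processor, which is why it can only be used where sleeping is allowed.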

Spin lock

• Spin lock used as a mutex lock:

[Figure: timeline of thread A and thread B. Each thread runs spin_lock(&mutex_lock), its critical section code, then spin_unlock(&mutex_lock).]

1. Thread A starts execution and enters its critical section.
2. A timer interrupt preempts thread A. Kernel code: the thread's time slice has decreased to zero, so the thread's context is saved and the processor is assigned to another thread.
3. Thread B fails to get the lock and keeps looping, retrying it.
4. A timer interrupt preempts thread B (same kernel code as in step 2).
5. Thread A finishes its critical section.

What is context

• What does “context” mean?
  – The set of dedicated hardware resources that a program uses in order to execute successfully.
    • Such as:
      – general-purpose registers for computing.
      – stack memory to support procedure calls.
  – From the kernel's point of view, however, the “dedicated context of a process” is simulated; the real resources are limited.
    • The kernel slices time and saves and restores context in order to emulate a multi-processor environment.
    • The program (process) behaves as if it had a dedicated context.

What is context

• What are user context and interrupt context?
  – User context: provided by the kernel's context-switch facility, which is triggered by the timer interrupt. Its owner is called a user process, which runs user-space code in user mode or kernel-space code in svc mode.
  – Interrupt context: the part of the registers (context) that the interrupt handler saves and restores by itself.
    • In practice, part of the interrupt context (registers) is also the context (registers) of some user process.

[Figure: processor time axis showing thread A (including A's subroutine), thread B, two timer interrupts and a PCI bus interrupt. The PCI bus interrupt runs Int_handler(), which:
  1. saves every register it will use later onto the stack;
  2. ……
  3. restores the registers it used and jumps to the return address (r14 register).]

What is context

• Comparing an interrupt handler with a procedure call:
  – An interrupt handler runs like a procedure call.
  – The differences are:
    • Int_handler() receives no parameters and returns no value.
    • The program is not even aware that Int_handler() has executed.

[Figure: processor time axis showing thread A, thread B, a subroutine, two timer interrupts and a PCI bus interrupt, with the two control flows below.]

void Foo(void): user space
  Save every register which will be used later onto the stack.
  Read the parameter from the param register.
  …
  Put the return value in the param register.
  Restore the registers which have been used, and jump to the return address (r14).

Int_handler(): kernel space
  Save every register which will be used later onto the stack.
  ……
  Restore the registers which have been used, and jump to the return address (r14).

Double-acquire deadlock (1/2)

• spin_lock convention
  – Unlike spin lock implementations in some other operating systems, the Linux kernel's spin lock is not recursive.
  – Double-acquire deadlock example:

Thread A:
  spin_lock(&mutex_lock);
  fooB();
  spin_unlock(&mutex_lock);

void fooB(void):
  Save every register which will be used later onto the stack.
  Read the parameter from the param register.
  …
  spin_lock(&mutex_lock);    (thread A already holds mutex_lock, so this call spins forever)
  …
  Put the return value in the param register.
  Restore the registers which have been used, and jump to the return address (r14).
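The double-acquire case can be reproduced in a small user-space model. This is a sketch under stated assumptions: `bounded_spin_lock()` and `model_unlock()` are hypothetical helpers, and the spin is capped at a fixed number of attempts so that the demonstration terminates, whereas the kernel's `spin_lock()` would loop forever at this point.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_flag mutex_lock = ATOMIC_FLAG_INIT;  /* models the slide's mutex_lock */

/* Bounded stand-in for spin_lock(): gives up after max_spins attempts so the
 * demo terminates. A real, non-recursive spin lock would keep looping. */
static bool bounded_spin_lock(atomic_flag *l, int max_spins)
{
    for (int i = 0; i < max_spins; i++)
        if (!atomic_flag_test_and_set(l))
            return true;   /* lock was free and is now held */
    return false;          /* still held: spin_lock() would hang here */
}

static void model_unlock(atomic_flag *l)
{
    atomic_flag_clear(l);
}

/* fooB() takes the same lock as its caller, as in the example above. */
static bool fooB(void)
{
    if (!bounded_spin_lock(&mutex_lock, 1000))
        return false;      /* the nested acquire never succeeds */
    model_unlock(&mutex_lock);
    return true;
}

/* Thread A's code path: lock, call fooB(), unlock. */
static bool thread_a(void)
{
    bool ok;

    bounded_spin_lock(&mutex_lock, 1000);  /* first acquire succeeds */
    ok = fooB();                           /* second acquire by the same thread */
    model_unlock(&mutex_lock);
    return ok;
}
```

In this model `thread_a()` reports failure because the lock has no notion of an owner: the second acquire cannot tell that the holder is the caller itself. Calling `fooB()` on its own, with the lock free, succeeds.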

Double-acquire deadlock (2/2)

• spin_lock synchronization between user context and interrupt context
  – Double-acquire deadlock example (2):

Thread A:
  spin_lock(&mutex_lock);
  (the interrupt happens just after thread A gets the spin lock)
  spin_unlock(&mutex_lock);

Sdio_int_handler():
  Save every register which will be used later onto the stack.
  …
  spin_lock(&mutex_lock);    (Sdio_int_handler busy-waits on mutex_lock)
  …
  Restore the registers which have been used, and jump to the return address (r14).

  – Example that does not produce a double-acquire deadlock:

Thread A:
  spin_lock(&mutex_lock);
  (a timer interrupt happens just after thread A gets the spin lock)
  spin_unlock(&mutex_lock);

Kernel code: the thread's time slice has decreased to zero. The thread's context is saved, then the processor is assigned to another thread.

Thread B's user code executes.

The SDIO interrupt happens after thread A has taken the spin lock.

Sdio_int_handler():
  Save every register which will be used later onto the stack.
  …
  spin_lock(&mutex_lock);
  …
  Restore the registers which have been used, and jump to the return address (r14).

Sdio_int_handler and thread B busy-wait on mutex_lock.

Log Analysis (1)

• In our case, CheckCallbackTimeout() may interrupt WiMAXQueryImformation() while it runs in user context (the CM_Query thread).

Thread A:
  spin_lock(&mutex_lock);
  (a timer interrupt happens just after thread A gets the spin lock)
  spin_unlock(&mutex_lock);

Kernel code:
  …
  if (timer has to be executed) {
      CheckCallbackTimeout();
  }
  ……
  return;

CheckCallbackTimeout {
    LDDB_spin_lock();
    …
}

Log Analysis (2)

• The timer callback function is called in __irq_svc.
• __irq_svc is a subroutine which is called only by the irq handler.

Conclusion – Immediate Solution

• Use spin_lock_irqsave and spin_unlock_irqrestore.
  – Interrupts on the local processor are turned off before the spin lock is acquired, and the saved interrupt state is restored after the lock is released.
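The effect of disabling interrupts around the critical section can be sketched with a single-CPU model in user space. None of the names below are kernel APIs: `model_lock_irqsave()` and `model_unlock_irqrestore()` are hypothetical stand-ins for `spin_lock_irqsave()` and `spin_unlock_irqrestore()`, and the interrupt-enable bit is just a variable.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical single-CPU model; these are not kernel interfaces. */
static bool irq_enabled = true;   /* models the CPU's interrupt-enable bit */
static bool irq_pending = false;  /* an interrupt raised while masked */
static bool lock_held = false;    /* models mutex_lock */
static int handler_runs = 0;      /* times the handler got the lock */
static int handler_hangs = 0;     /* times it would have spun forever */

/* Models the interrupt handler, which takes the same lock as the thread. */
static void model_handler(void)
{
    if (lock_held) {
        handler_hangs++;          /* a real handler would deadlock here */
        return;
    }
    lock_held = true;
    handler_runs++;
    lock_held = false;
}

/* Models the device raising an interrupt at this instant. */
static void raise_irq(void)
{
    if (irq_enabled)
        model_handler();
    else
        irq_pending = true;       /* delivered once interrupts are back on */
}

/* Models spin_lock_irqsave(): remember the old state, mask, then lock. */
static bool model_lock_irqsave(void)
{
    bool flags = irq_enabled;

    irq_enabled = false;
    lock_held = true;
    return flags;
}

/* Models spin_unlock_irqrestore(): unlock, then restore the saved state. */
static void model_unlock_irqrestore(bool flags)
{
    lock_held = false;
    irq_enabled = flags;
    if (irq_enabled && irq_pending) {
        irq_pending = false;
        model_handler();          /* the deferred interrupt finds the lock free */
    }
}
```

In this model an interrupt raised between `model_lock_irqsave()` and `model_unlock_irqrestore()` is held back until the lock is free, so the handler never spins on a lock owned by the code it interrupted.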

Conclusion – What action we have to take right now

• What we should do before implementation:
  – Identify the contexts which take the same lock for synchronization.
  – Prevent the double-acquire deadlock scenario with the interrupt-disabling API when a lock is shared between interrupt context and user context.
  – Avoid using semaphores in interrupt context.
  – Leave interrupt context as soon as possible, and postpone the task to another user context, such as a work queue.
• Turn on CONFIG_PROVE_LOCKING, CONFIG_DEBUG_LOCK_ALLOC and CONFIG_DEBUG_SPINLOCK.
  – They help with debugging.
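The advice to leave interrupt context quickly and postpone the task can be sketched as a user-space model. The names are hypothetical: in a driver the deferred function would typically be queued with the kernel's work queue API, which is not shown here.

```c
#include <assert.h>
#include <stdatomic.h>

static atomic_int pending_events;  /* written by the handler, needs no lock */
static int processed_events;       /* touched only by the deferred work */

/* Models the interrupt handler: record the event and return at once,
 * without taking any lock that user context might hold. */
static void model_irq_handler(void)
{
    atomic_fetch_add(&pending_events, 1);
}

/* Models the deferred work function. It runs later, outside interrupt
 * context, where taking a lock or even sleeping would be allowed. */
static void model_work_fn(void)
{
    int n = atomic_exchange(&pending_events, 0);  /* claim all recorded events */

    processed_events += n;
}
```

Because the handler only increments a counter, it never waits on a lock, and the lengthy part of the job moves to a context where waiting is safe.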


Appendix: context switch

• Context-switch code
  – Restore and jump should be combined into an atomic operation.

Timer interrupt code:
  …
  if the thread's time slice has decreased to zero {
      save r0~r15 into current's TCB;
      restore B's r0~r14 registers;
      jump r15 <- B's TCB[15] + 3
  }
  return from interrupt;

Code run by thread A and thread B (shown side by side in the original figure):
  spin_lock(&mutex_lock);
  ……
  spin_unlock(&mutex_lock);
  ……
  Sleep(2000ms);
  ……
  Sema_get(&mutex_lock)

Sleep function (kernel code):
  ……
  save r0~r1 into current's TCB;
  restore A's r0~r14 registers;
  jump r15 <- A's TCB[15] + 3
  return;

Semaphore function (kernel code):
  ….
  if semaphore is zero {
      save r0~r14 into current's TCB;
      restore A's r0~r14 registers;
      jump r15 <- B's TCB[15] + 3
  }
  return;

Copyright 2009 FUJITSU LIMITED