linux kernel booting process (1) - for nlkb


Upload: shimosawa

Post on 29-Nov-2014


DESCRIPTION

Describes the bootstrapping part of Linux and some related technologies. This is part one of the slides; the succeeding slides will contain the errata for this part.

TRANSCRIPT

Page 1: Linux Kernel Booting Process (1) - For NLKB

Booting Process (1)
Taku Shimosawa

For the New Linux Kernel Book (1)

Page 2: Linux Kernel Booting Process (1) - For NLKB

References
• Wikipedia (!)
  • Wikipedia knows everything
• Wiktionary
  • I would have used the OED if I had one…
• Source Files
  • Linux 3.15
  • U-Boot 2014.04
  • ELILO 3.16
  • GRUB 2.00
• Manual
  • Intel® 64 and IA-32 Architectures Software Developer’s Manual
  • ARM® Architecture Reference Manual

Page 3: Linux Kernel Booting Process (1) - For NLKB

1. Booting
“What’s boot?”

Page 4: Linux Kernel Booting Process (1) - For NLKB

What is “boot”?
• boot (n.)

[1] http://en.wikipedia.org/wiki/Boot

Page 5: Linux Kernel Booting Process (1) - For NLKB

Brief etymology [2]
• Phrase “pull oneself up by one’s bootstraps”
  • Misattributed (at latest in 1901!) to “The Surprising Adventures of Baron Munchausen” (1781, Rudolf Erich Raspe): the baron pulls himself out of a swamp by his hair (pigtail).
• The use of this phrase is found in 1834 in the U.S.
  • “[S]omeone is attempting or has claimed some ludicrously far-fetched or impossible task”
• In the 20th century, the “possible task” meaning appeared
  • “To begin an enterprise or recover from a setback without any outside help; to succeed only on one's own effort or abilities”

(Figure: bootstrap [3])

[2] http://en.wiktionary.org/wiki/pull_oneself_up_by_one%27s_bootstraps
[3] http://en.wikipedia.org/wiki/Bootstrapping

Page 6: Linux Kernel Booting Process (1) - For NLKB

Bootstrapping (in Computer)
• The process of loading the basic software (typically, operating systems) into the main memory from persistent memory (HDD, flash ROM, etc.)
• “Boot” is an abbreviation for “bootstrap(ping)”

(Diagram: the bootstrapping code loads the OS)

Page 7: Linux Kernel Booting Process (1) - For NLKB

Boot loader
• “It is responsible for loading and transferring control to the operating system kernel software (such as the Hurd or Linux).” [4]
• Boot loader
  • BIOS (PC)
  • UEFI (Unified Extensible Firmware Interface) (PC)
    • “Secure Boot” issue
  • Das U-Boot (Universal bootloader) (for embedded systems)
• Second-stage boot loader
  • LILO (Linux Loader, Ver. 24.0, released on Jun 7, 2013)
    • Supports GPT and RAID (!?)
  • GRUB2 (Ver. 2.00, Jun 26, 2012)
    • Supports BIOS and UEFI boot
  • GRUB Legacy (Grand Unified Bootloader, Ver. 0.97, May 8, 2005)
  • ELILO (EFI Linux Boot Loader, Ver. 3.16, Mar 29, 2013)
    • Originally for EFI and Itanium; currently bug fixes only
  • SYSLINUX (Ver. 6.02, Oct 13, 2013)
  • NTLDR, BOOTMGR (beginning from Windows Vista)

[4] http://www.gnu.org/software/grub/

Page 8: Linux Kernel Booting Process (1) - For NLKB

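What loads and what is loaded

(Diagram)
• Power On → BIOS → GRUB2, loaded from HDD (MBR) → Linux bzImage, loaded from HDD
• BIOS/NIC Option ROM → PXELINUX (a part of SYSLINUX), loaded over the network (tftp) → Linux bzImage, loaded over the network (tftp)
• U-Boot → Linux uImage, loaded from flash ROM, HDD, network, SD card, etc.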

Page 9: Linux Kernel Booting Process (1) - For NLKB

2. Prerequisites
How to say “Hello” in x86?

Page 10: Linux Kernel Booting Process (1) - For NLKB

Miscellanea
• Architecture and GNU Assembly Language
  • Very briefly
  • x86
  • ARMv7
• Things left…
  • Linker script

Page 11: Linux Kernel Booting Process (1) - For NLKB

x86 Architecture : Mode
• Too complicated to explain
• 3 modes
  • Real mode
    • 16-bit mode
    • No mode switch (always privileged)
    • No virtual memory (segmentation only)
  • Protected mode
    • 32-bit mode
    • Segmentation / virtual memory
    • (Virtual 8086 mode)
      • Compatibility for executing 16-bit code in 32-bit mode
  • Long mode
    • 64-bit mode
    • Virtual memory only
    • (Another mode, “compatibility mode,” for executing 32-bit code)
• What is this bit?
  • Size of the virtual address
  • Default size of the operand registers (*)

(*) Of course, you can use %al in 32-bit mode, %ax in 64-bit mode…

Page 12: Linux Kernel Booting Process (1) - For NLKB

x86 Architecture : Registers
• Registers before 64-bit
  • 8 general-purpose registers
    • Some instructions use a certain set of registers for their input and output…
    • In particular, sp is used only as the stack pointer
  • Each register has names for certain parts (the lower 8 bits, for example)
  • Example: eax register

(Diagram: eax is 32 bits wide; ax is its lower 16 bits; ah and al are the upper and lower 8 bits of ax.)
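The naming of register parts can be sketched in C as plain masking and shifting. This is only an illustration of how ax, ah, and al overlap eax; the function names are made up for this example and do not come from the slides or the kernel.

```c
#include <stdint.h>

/* Lower 16 bits of a 32-bit value: what %ax is to %eax. */
uint16_t reg_low16(uint32_t eax)
{
    return (uint16_t)(eax & 0xffffu);
}

/* Bits 8..15 of a 32-bit value: what %ah is to %eax. */
uint8_t reg_high8(uint32_t eax)
{
    return (uint8_t)((eax >> 8) & 0xffu);
}

/* Bits 0..7 of a 32-bit value: what %al is to %eax. */
uint8_t reg_low8(uint32_t eax)
{
    return (uint8_t)(eax & 0xffu);
}
```

For instance, with eax = 0x12345678 the three helpers give 0x5678, 0x56, and 0x78.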

Page 13: Linux Kernel Booting Process (1) - For NLKB

x86 Architecture : Registers
• In 64-bit mode, the registers are extended to 64 bits and new names for them are introduced (r**)
• Eight new registers (r8 ~ r15) are also introduced

64-bit  Lower 32-bit  Lower 16-bit  Higher/lower 8-bit in lower 16-bit
rax     eax           ax            ah/al
rcx     ecx           cx            ch/cl
rdx     edx           dx            dh/dl
rbx     ebx           bx            bh/bl
rsp     esp           sp            --/spl
rbp     ebp           bp            --/bpl
rsi     esi           si            --/sil
rdi     edi           di            --/dil
r8      r8d           r8w           --/r8l

Page 14: Linux Kernel Booting Process (1) - For NLKB

x86 Architecture : Segmentation
• 6 segment registers (16-bit registers)
  • Code segment register: CS
  • Data segment registers: DS, ES, FS, GS
  • Stack segment register: SS
• Real mode : 20-bit address space
  • Linear address = physical address
  • The size of each segment is 64 KB (16-bit offset)
  • The segment register holds the upper 16 bits of the segment's 20-bit base address (base = segment register * 16)
• Protected mode : 32-bit/36-bit physical address space
  • Logical (virtual) –(Segmentation)-> Linear –(Paging)-> Physical
  • The base (offset) and limit are stored in the descriptor table
  • The segment register points to an entry in the table
• Long mode : 48-bit virtual address space
  • For CS, DS, ES, and SS, the base is always 0 and the limit is ignored.
  • For FS and GS, the base can be set by the descriptor or through an MSR (for > 32-bit addresses)

Page 15: Linux Kernel Booting Process (1) - For NLKB

x86 Architecture : Segmentation
• Default segment register
  • For code accesses, CS is used (CS:IP)
  • For data accesses, DS is used (DS:xx)
    • For string instructions, ES is used for the destination (ES:(E)DI)
  • For stack accesses, SS is used (SS:SP)
• Anyway, in real mode:
  • When CS = 0x0700 and IP = 0x0c00, the instruction at 0x7c00 is executed.
  • Of course, there are many ways to point to this address
    • CS : 0x0000, IP : 0x7c00
    • CS : 0x07c0, IP : 0x0000
  • DS, ES, and SS are similar
    • movw $3, 0(%bx) stores 3 at the address DS * 16 + BX

(Diagram: the code segment starts at CS * 16 = 0x7000; IP = 0xc00 is the offset within it.)
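The real-mode address calculation above can be written as a one-line C function. This is a minimal sketch; the function name is hypothetical, and the result is deliberately not masked to 20 bits (whether addresses above 1 MB wrap depends on the A20 gate).

```c
#include <stdint.h>

/* Real-mode translation: linear address = segment * 16 + offset. */
uint32_t real_mode_linear(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}
```

All three CS:IP pairs listed on the slide (0x0700:0x0c00, 0x0000:0x7c00, 0x07c0:0x0000) map to the same linear address, 0x7c00.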

Page 16: Linux Kernel Booting Process (1) - For NLKB

x86 architecture – Misc.
• Ring (privilege mode)
  • Ring 0, that’s all
• Descriptor table
• Paging, interrupts
  • Left to the next
• Basic instructions
  • MOV
  • PUSH/POP
  • ADD/SUB…
  • LEA
  • JMP
  • Jcc

Page 17: Linux Kernel Booting Process (1) - For NLKB

x86 Architecture : Assembly Lang.
• Two types of syntax
  • AT&T style (gcc (gas)) : Of course, Linux uses this style
  • Intel style (MASM, NASM)

Sample
  AT&T:   movb $0xff, %al
          addl $8, 0(%rax, %rcx, 4)
  Intel:  MOV AL, 0FFh
          ADD DWORD PTR [RAX + RCX * 4], 8
Operand order
  AT&T:   Source, Destination
  Intel:  Destination, Source
Symbol
  AT&T:   Immediate: prefixed with $; Register: prefixed with %
  Intel:  No prefix
Suffix
  AT&T:   b for 8-bit operation, w for 16-bit, l for 32-bit, and q for 64-bit
  Intel:  No suffix (inferred from the operands)
Addressing
  AT&T:   displacement(base, index, scale)
  Intel:  (width) ptr [base + index * scale + displacement]

Page 18: Linux Kernel Booting Process (1) - For NLKB

ARM Architecture (ARMv7-A)
• Also too complicated
  • ARM instruction set (32-bit)
  • Thumb instruction set (16-bit, higher code density)
    • ARMv6T2 introduced Thumb-2
    • Thumb-2 has almost the same functionality as ARM
  • Jazelle, ThumbEE
  • The execution state register indicates which instruction set the processor executes.
• Registers
  • 16 “general-purpose registers”
    • 13 general-purpose registers (r0 to r12)
    • 3 special-purpose registers (SP, LR, and PC / or r13 to r15)
      • Reading PC returns the address of the current instruction + 8 (in ARM), + 4 (in Thumb)
    • r8 to r14 (r8 to r12, SP, and LR) are banked

Page 19: Linux Kernel Booting Process (1) - For NLKB

Instruction Sets
• How to switch the instruction set?
  • BX, BLX instructions
    • If the least significant bit of the target address is ‘1’, it switches to Thumb. If the two least significant bits are ‘00’, it switches to ARM. [interworking address]
  • LDR, LDM to PC (r15)
  • Also in ARMv7, ALU instructions (ADD, MOV, etc.) that write the PC register in the ARM instruction set without setting condition flags
  • Exception entries and returns
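The interworking-address rule for BX/BLX can be sketched as a small decoder. This is an illustration only; the enum and function names are hypothetical, and the "unpredictable" case for low bits '10' is stated from the ARM architecture's general description rather than from these slides.

```c
#include <stdint.h>

enum arm_iset { ISET_ARM, ISET_THUMB, ISET_UNPREDICTABLE };

/* Instruction set selected by a BX/BLX to this interworking address. */
enum arm_iset interworking_iset(uint32_t addr)
{
    if (addr & 1u)
        return ISET_THUMB;          /* bit 0 set: Thumb */
    if ((addr & 2u) == 0)
        return ISET_ARM;            /* bits [1:0] == 00: ARM */
    return ISET_UNPREDICTABLE;      /* bits [1:0] == 10 */
}

/* Address actually branched to: bit 0 is a mode flag, not part of the target. */
uint32_t interworking_target(uint32_t addr)
{
    return (addr & 1u) ? (addr & ~1u) : addr;
}
```

A function pointer to Thumb code therefore has its lowest bit set, which is why such pointers look "odd" in a debugger.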

Page 20: Linux Kernel Booting Process (1) - For NLKB

ARM architecture – Misc.
• Mode
  • User, Supervisor, FIQ, etc.
• Paging, interrupts
• Conditional instructions
• Etc.

Page 21: Linux Kernel Booting Process (1) - For NLKB

ARM Assembler
• UAL (Unified Assembler Language)
  • Canonical form for ARM and Thumb instructions
    • ADC (Thumb) => ADCS
• Instruction example
  • MOV{S}<c> <Rd>, #<const>
    • Loads the immediate (8-bit) into the register
  • MOV{S}<c><q> <Rd>, <Rm>
    • Copies the contents of <Rm> to <Rd>
  • <c> : condition
  • <q> : encoding (16-bit/32-bit) qualifier
    • When not specified and both are available, the 16-bit encoding is selected

Page 22: Linux Kernel Booting Process (1) - For NLKB

3. Booting in x86
GRUB and bzImage

Page 23: Linux Kernel Booting Process (1) - For NLKB

Boot Sequence in This Presentation
• Typical boot sequence in PC (x86_64)

(Diagram)
Power On
→ BIOS: loads boot.img from HDD (MBR/VBR) (1 sector = 512 bytes)
→ GRUB2 (boot.img): loads core.img from HDD (between the MBR and the 1st partition) (up to 62 sectors = approx. 31 KB)
→ GRUB2 (core.img): loads grub.cfg, bzImage, and *.mod from HDD (/boot partition)
→ Entry point in Linux

Page 24: Linux Kernel Booting Process (1) - For NLKB

BIOS
• BIOS (Basic Input/Output System)
  • Executed right after the machine turns on
  • Initializes the CPUs and hardware
  • Provides basic I/O services
    • Used by boot loaders (in real mode)
    • E.g., loading from a hard disk drive (INT 0x13), memory information (INT 0x15, AX = 0xe820), etc.
  • Builds up various data structures for machine information
    • ACPI tables
  • Starts up boot loaders
    • Loads the boot sector and jumps to it at CS:IP = 0x0000:0x7c00
  • Provides a user interface for boot, hardware settings, and various management tasks
  • To be replaced by UEFI…

Page 25: Linux Kernel Booting Process (1) - For NLKB

BIOS Call
• Uses the “INT” instruction
  • It executes an interrupt handler
  • BIOS sets the address of its service code in the interrupt vector table.
  • Some operating systems also use this for system calls
    • INT 0x21 for MS-DOS
    • INT 0x80 for Linux (in the past)
• Parameters are specified in registers
  • AH / AX : function number
  • Other registers : parameters
• Example
  • INT 0x13 (disk access)
    • AH = 0x02 (read by CHS), 0x03 (write by CHS)…
    • AL = number of sectors
    • CH = cylinder number (lower 8 bits)
    • CL = sector number (bits 0-5), higher bits of the cylinder number (bits 6-7)
    • DH = head number
    • DL = drive number
    • ES:BX = buffer
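The way a 10-bit cylinder number is split across CH and CL for the INT 0x13 CHS calls can be sketched as follows. The struct and function names are hypothetical; the code only prepares register values and does not issue a BIOS call.

```c
#include <stdint.h>

/* Register values for INT 0x13, AH = 0x02/0x03 (CHS addressing). */
struct chs_regs {
    uint8_t ch; /* cylinder, low 8 bits */
    uint8_t cl; /* sector in bits 0-5, cylinder bits 8-9 in bits 6-7 */
    uint8_t dh; /* head */
};

struct chs_regs pack_chs(uint16_t cylinder, uint8_t head, uint8_t sector)
{
    struct chs_regs r;

    r.ch = (uint8_t)(cylinder & 0xffu);
    r.cl = (uint8_t)((sector & 0x3fu) | (((cylinder >> 8) & 0x03u) << 6));
    r.dh = head;
    return r;
}
```

With this layout the largest addressable cylinder is 1023 and the largest sector number is 63, which is one reason the LBA extensions (AH = 0x42, visible in the boot.img dump later) were introduced.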

Page 26: Linux Kernel Booting Process (1) - For NLKB

GRUB
• boot.img (512 bytes)
  • Usually located in the first sector (MBR) of the HDD
  • Loaded at 0x7c00 by BIOS
  • Real mode
  • Loads the next sector from the HDD
    • The position is embedded by the GRUB installer (as a sector number; highlighted in blue on the slide)
    • Typically, at sector 1 (the next sector)
• core.img
  • Located in the gap sectors between the MBR and the first partition
    • The first partition begins at the 63rd sector (traditionally) or at 1 MB (recently, as in the dump on this slide)
  • The first sector of core.img loads the remaining part of core.img from the HDD

# dd if=/dev/vda count=1 bs=512 2> /dev/null | od -t x1 -A x
000000 eb 63 90 00 00 00 00 00 00 00 00 00 00 00 00 00
000010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
*
000030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 03 02
000040 ff 00 00 20 01 00 00 00 00 02 fa 90 90 f6 c2 80
000050 75 02 b2 80 ea 59 7c 00 00 31 00 80 01 00 00 00
000060 00 00 00 00 ff fa 90 90 f6 c2 80 74 05 f6 c2 70
000070 74 02 b2 80 ea 79 7c 00 00 31 c0 8e d8 8e d0 bc
000080 00 20 fb a0 64 7c 3c ff 74 02 88 c2 52 be 80 7d
000090 e8 17 01 be 05 7c b4 41 bb aa 55 cd 13 5a 52 72
0000a0 3d 81 fb 55 aa 75 37 83 e1 01 74 32 31 c0 89 44
0000b0 04 40 88 44 ff 89 44 02 c7 04 10 00 66 8b 1e 5c
0000c0 7c 66 89 5c 08 66 8b 1e 60 7c 66 89 5c 0c c7 44
0000d0 06 00 70 b4 42 cd 13 72 05 bb 00 70 eb 76 b4 08
0000e0 cd 13 73 0d f6 c2 80 0f 84 d8 00 be 8b 7d e9 82
0000f0 00 66 0f b6 c6 88 64 ff 40 66 89 44 04 0f b6 d1
000100 c1 e2 02 88 e8 88 f4 40 89 44 08 0f b6 c2 c0 e8
000110 02 66 89 04 66 a1 60 7c 66 09 c0 75 4e 66 a1 5c
000120 7c 66 31 d2 66 f7 34 88 d1 31 d2 66 f7 74 04 3b
000130 44 08 7d 37 fe c1 88 c5 30 c0 c1 e8 02 08 c1 88
000140 d0 5a 88 c6 bb 00 70 8e c3 31 db b8 01 02 cd 13
000150 72 1e 8c c3 60 1e b9 00 01 8e db 31 f6 bf 00 80
000160 8e c6 fc f3 a5 1f 61 ff 26 5a 7c be 86 7d eb 03
000170 be 95 7d e8 34 00 be 9a 7d e8 2e 00 cd 18 eb fe
000180 47 52 55 42 20 00 47 65 6f 6d 00 48 61 72 64 20
000190 44 69 73 6b 00 52 65 61 64 00 20 45 72 72 6f 72
0001a0 0d 0a 00 bb 01 00 b4 0e cd 10 ac 3c 00 75 f4 c3
0001b0 00 00 00 00 00 00 00 00 0e 14 50 70 00 00 00 20
0001c0 21 00 83 35 37 3e 00 08 00 00 00 38 0f 00 00 35
0001d0 38 3e 82 51 60 31 00 40 0f 00 00 98 3b 00 00 51
0001e0 61 31 83 fe ff ff 00 d8 4a 00 00 10 7e 03 00 fe
0001f0 ff ff 05 fe ff ff fe ef c8 03 02 08 37 15 55 aa
000200

Jump!: eb 63 at offset 0x000
Boot sector signature: 55 aa at offset 0x1fe
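Two facts from the dump can be checked with a few lines of C: the 55 aa signature at the end of the sector, and where the first partition starts. The sketch below assumes the classic MBR layout (four 16-byte partition entries starting at offset 0x1be, with the starting LBA stored little-endian at entry offset 8); the function names are hypothetical.

```c
#include <stdint.h>

/* Nonzero if the 512-byte sector ends with the boot signature 55 aa. */
int mbr_has_boot_signature(const uint8_t *sector)
{
    return sector[510] == 0x55 && sector[511] == 0xaa;
}

/* Starting LBA of primary partition `index` (0..3). */
uint32_t mbr_partition_start_lba(const uint8_t *sector, int index)
{
    const uint8_t *p = sector + 0x1be + 16 * index + 8;

    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}
```

In the dump, the bytes at offset 0x1c6 are 00 08 00 00, so the first partition starts at LBA 0x800 = 2048, that is, at 2048 * 512 bytes = 1 MB, matching the "recent" layout mentioned on the slide.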

Page 27: Linux Kernel Booting Process (1) - For NLKB

GRUB (2)
• core.img
  • Includes the modules required to boot operating systems
    • Menu facilities
      • e.g., vga.mod
    • File system modules to access the configuration file (grub.cfg)
      • e.g., ext2.mod
    • OS loader modules
      • e.g., linux.mod
  • Modularized to fit in the gap sectors

Page 28: Linux Kernel Booting Process (1) - For NLKB

Linux image
• Linux boot image
  • vmlinux [+ compression + setup code + headers]
• Various types of boot images
  • bzImage
    • “big zImage”
    • Mainly used in x86
  • uImage
    • Used in systems booted by U-Boot
    • ARM, SPARC, PPC, SH, …
  • treeImage
    • Used by OpenBIOS (ppc)
    • Includes the DeviceTree blob
  • simpleImage
    • Used by OpenFirmware (ppc)
  • xipImage
    • “eXecute-In-Place” image

Page 29: Linux Kernel Booting Process (1) - For NLKB

bzImage
• What you’ve got in /boot on your PC
  • Usually named /boot/vmlinuz-(version)
• What format is this?
  • Originally bootable from FDD
    • bzImage was written starting at the first sector of the FDD
    • Deprecated in 2.5.xx? (not verified)

000000 ea 05 00 c0 07 8c c8 8e d8 8e c0 8e d0 31 e4 fb
000010 fc be 2d 00 ac 20 c0 74 09 b4 0e bb 07 00 cd 10
000020 eb f2 31 c0 cd 16 cd 19 ea f0 ff 00 f0 44 69 72
000030 65 63 74 20 66 6c 6f 70 70 79 20 62 6f 6f 74 20
000040 69 73 20 6e 6f 74 20 73 75 70 70 6f 72 74 65 64
000050 2e 20 55 73 65 20 61 20 62 6f 6f 74 20 6c 6f 61
000060 64 65 72 20 70 72 6f 67 72 61 6d 20 69 6e 73 74
000070 65 61 64 2e 0d 0a 0a 52 65 6d 6f 76 65 20 64 69
000080 73 6b 20 61 6e 64 20 70 72 65 73 73 20 61 6e 79
000090 20 6b 65 79 20 74 6f 20 72 65 62 6f 6f 74 20 2e
0000a0 2e 2e 0d 0a 00 00 00 00 00 00 00 00 00 00 00 00
0000b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
*
0001e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ff
0001f0 ff 1d 01 00 5c db 00 00 00 00 ff ff 00 00 55 aa
...

Far jump: ea 05 00 c0 07 at offset 0x000
Boot sector signature again: 55 aa at offset 0x1fe

bugger_off_msg:
    .ascii "Direct floppy boot is not supported. "
    .ascii "Use a boot loader program instead.\r\n"
    .ascii "\n"
    .ascii "Remove disk and press any key to reboot ...\r\n"
    .byte 0

Page 30: Linux Kernel Booting Process (1) - For NLKB

How is bzImage created?
• Magical ceremonies in arch/x86/boot
• After “vmlinux” is ready, the following sequence runs.

  LD      vmlinux
  SORTEX  vmlinux
  SYSMAP  System.map
  CC      arch/x86/boot/a20.o
  AS      arch/x86/boot/bioscall.o
  ...
  LDS     arch/x86/boot/compressed/vmlinux.lds
  AS      arch/x86/boot/compressed/head_32.o
  ...
  CC      arch/x86/boot/compressed/early_serial_console.o
  OBJCOPY arch/x86/boot/compressed/vmlinux.bin
  GZIP    arch/x86/boot/compressed/vmlinux.bin.gz
  HOSTCC  arch/x86/boot/compressed/mkpiggy
  MKPIGGY arch/x86/boot/compressed/piggy.S
  AS      arch/x86/boot/compressed/piggy.o
  ...
  LD      arch/x86/boot/compressed/vmlinux
  ZOFFSET arch/x86/boot/zoffset.h
  ...
  LD      arch/x86/boot/setup.elf
  OBJCOPY arch/x86/boot/setup.bin
  OBJCOPY arch/x86/boot/vmlinux.bin
  ...
  BUILD   arch/x86/boot/bzImage

Page 31: Linux Kernel Booting Process (1) - For NLKB

So what?

(Diagram)
vmlinux
→ (1) Strip symbols → boot/compressed/vmlinux.bin
→ (2) Compress (gzip, bzip2, lzma, lzo, lz4) → vmlinux.bin.gz
→ (3) mkpiggy (piggy-back): make an object that contains the compressed image → piggy.o
→ (4) Link piggy.o with the other objects (*.o) in boot/compressed (the decompressing code) → boot/compressed/vmlinux
→ (5) Transform it into a simple binary → boot/vmlinux.bin
→ (6) Concatenate boot/setup.bin (real-mode setup code, headers), boot/vmlinux.bin, and the CRC32 → boot/bzImage

The deprecated FDD boot code is (was) here! (pointing at boot/setup.bin)

Page 32: Linux Kernel Booting Process (1) - For NLKB

Column: How to embed a binary in your executable?

• Many ways to do that
  • Convert it to “hex” text, and #include it
  • Use the “.incbin” directive in the assembler
    • mkpiggy automatically generates this assembler file.

(binary_file.hex)
0xeb, 0xfe, 0x90, 0x90, …

(C file)
unsigned char binary[] = {
#include "binary_file.hex"
};

(Assembler file)
    .section .rodata
    .globl input_data, input_data_end
input_data:
    .incbin "binary_file.bin"
input_data_end:

Page 33: Linux Kernel Booting Process (1) - For NLKB

Boot Protocol
• Documentation/x86/boot.txt
• Build-time parameters (size of the setup code, etc.) and parameters filled in by boot loaders (the addresses of the command-line parameters, initrd, etc.) are located at the end of the boot sector and in the header of the setup code.

(Diagram: layout of bzImage)
• boot/setup.bin = real-mode kernel: boot sector (header.S) followed by the setup code; contains the real-mode (16-bit) entry point
• boot/vmlinux.bin = protected-mode kernel: 32-bit entry point (+0x0), 64-bit entry point (+0x200)
• CRC

Page 34: Linux Kernel Booting Process (1) - For NLKB

Boot Protocol in bzImage
• 4 entry points
  (1) 16-bit entry point (real mode)
  (2) 32-bit entry point (protected mode)
  (3) 64-bit entry point (long mode)
  (4) The true entry point (in vmlinux)

(Diagram: layout of bzImage)
• boot/setup.bin = real-mode kernel: boot sector (header.S) followed by the setup code; (1) real-mode (16-bit) entry point
• boot/vmlinux.bin = protected-mode kernel: (2) 32-bit entry point (+0x0), (3) 64-bit entry point (+0x200)
• CRC
• After decompression: vmlinux, with (4) the entry point in vmlinux

Page 35: Linux Kernel Booting Process (1) - For NLKB

Boot Protocol in bzImage
• To avoid excess mode transitions
  • Modern boot loaders/firmware run in protected mode or long mode
• For the later entry points, the boot loader/firmware should provide the information that would have been collected in the prior stage of the Linux kernel
  • In most cases, such information is already retrieved by the boot loader, as the boot loader also needs it.

Page 36: Linux Kernel Booting Process (1) - For NLKB

Fast Backward

(1) bzImage is loaded by a boot loader
(2) The real-mode (RM) kernel runs and switches the CPU to protected mode.
(3) The protected-mode kernel runs
(4) It switches the CPU to long mode. (In x86_64 only)
(5) It decompresses the compressed vmlinux (moving the decompressing code if necessary)
(6) It jumps to the entry point in vmlinux (the startup_32/startup_64 function)

(Diagram labels: the stages run in 16-bit mode, then 32-bit mode, then 32-bit/64-bit mode; the protected-mode kernel is placed at 0x100000, above the RM kernel; after step (5), vmlinux and the decompressing code occupy that area.)

Page 37: Linux Kernel Booting Process (1) - For NLKB

32-bit or 64-bit in x86
• Originally, the “arch” directories for the 32-bit kernel and the 64-bit kernel were different (i386 and x86_64).
• In Linux 2.6.24, they were merged into a single directory (x86).
  • At first it was almost just merging the directories and renaming the 32-bit source files to xxx_32.c and the 64-bit source files to xxx_64.c
  • Then they were merged into single files with #ifdef’s
• Now, the duplication of the code is minimized
  • (non-suffixed) and xxxx_64.c/h
  • xxxx_32.c/h and xxxx_64.c/h
  • CONFIG_X86_32 and CONFIG_X86_64

Page 38: Linux Kernel Booting Process (1) - For NLKB

3-1. Real Mode
“640 k ought to be enough for anybody”

Page 39: Linux Kernel Booting Process (1) - For NLKB

Real-mode protocol
• Used with the “linux16” module in GRUB2
• Starts with the transition from real mode to protected mode, and jumps into the protected-mode kernel (32-bit entry point)
• Suggested memory layout:

(Diagram, from address 0 toward higher addresses: boot sector (BS), setup code, heap/stack, BIOS reserved area, I/O memory hole from 0xA0000 (640 KB), protected-mode kernel from 0x100000 (1 MB). The setup code jumps to the protected-mode kernel.)

Page 40: Linux Kernel Booting Process (1) - For NLKB

Headers (1)
• Boot sector
  • The code itself is entirely useless
  • setup_sects denotes the size of the setup code in sectors
    • Thus, the protected-mode kernel begins at the offset (1 + setup_sects) * 512

    .globl hdr
hdr:
setup_sects:    .byte 0             /* Filled in by build.c */
root_flags:     .word ROOT_RDONLY
syssize:        .long 0             /* Filled in by build.c */
ram_size:       .word 0             /* Obsolete */
vid_mode:       .word SVGA_MODE
root_dev:       .word 0             /* Filled in by build.c */
boot_flag:      .word 0xAA55

(arch/x86/boot/header.S)
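As a worked example of the formula above, the sketch below reads setup_sects (offset 0x1f1) and syssize (offset 0x1f4) from the first sector of a bzImage and derives where the protected-mode kernel starts and how large it is. The struct and function names are hypothetical. The rule that a setup_sects of 0 means 4 comes from the boot protocol document, not from these slides.

```c
#include <stdint.h>

struct bz_layout {
    uint32_t pm_kernel_offset; /* file offset of the protected-mode kernel */
    uint32_t pm_kernel_bytes;  /* its size in bytes (syssize * 16) */
};

/* `image` must hold at least the first 512 bytes of a bzImage.
 * Returns 0 on success, -1 if boot_flag (0xAA55) is missing. */
int bzimage_layout(const uint8_t *image, struct bz_layout *out)
{
    uint32_t setup_sects = image[0x1f1];
    uint32_t syssize = (uint32_t)image[0x1f4] | ((uint32_t)image[0x1f5] << 8) |
                       ((uint32_t)image[0x1f6] << 16) | ((uint32_t)image[0x1f7] << 24);

    if (image[0x1fe] != 0x55 || image[0x1ff] != 0xaa)
        return -1;
    if (setup_sects == 0)
        setup_sects = 4; /* backward-compatibility default */

    out->pm_kernel_offset = (1u + setup_sects) * 512u;
    out->pm_kernel_bytes = syssize * 16u;
    return 0;
}
```

Applied to the bzImage dump shown earlier (setup_sects = 0x1d, syssize = 0x0000db5c), this gives an offset of 30 * 512 = 15360 bytes and a protected-mode kernel of 898496 bytes.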

Page 41: Linux Kernel Booting Process (1) - For NLKB

Headers (2)
• Setup code
  • The top of the setup code contains parameters
• struct setup_header
  • The parameters in the boot sector and in the header of the setup code are defined as one struct in C

struct setup_header {
    __u8  setup_sects;
    __u16 root_flags;
    __u32 syssize;
    __u16 ram_size;
    __u16 vid_mode;
    __u16 root_dev;
    __u16 boot_flag;
    __u16 jump;
    __u32 header;
    __u16 version;
    __u32 realmode_swtch;
    ...
(arch/x86/include/uapi/asm/bootparam.h)

(Diagram: the boot sector occupies 0x0000 to 0x0200 and is followed by the setup code; struct setup_header starts at 0x1f1.)

Page 42: Linux Kernel Booting Process (1) - For NLKB

struct setup_header (1)

Member          Sz  Description
setup_sects     1   Number of sectors for the setup code
syssize         4   Size of the protected-mode kernel in 16-byte units
header          4   “HdrS”
version         2   Header version. Latest = 0x020d (protocol 2.13)
type_of_loader  1   Type of boot loader + version, 0xTV (T: 0 = LILO, 7 = GRUB…)
code32_start    4   Address at which the protected-mode kernel is loaded. Default: 0x100000. Used to hook, or to load at another address.
ramdisk_image   4   Address of the initial ramdisk/ramfs. 0 = none
ramdisk_size    4   Size of the initial ramdisk/ramfs. 0 = none
heap_end_ptr    2   Offset of the end of the heap/stack minus 0x200
cmd_line_ptr    4   Address of the command line parameter. Somewhere between the heap/stack end and 0xA0000. If zero, the loader is assumed not to support the 2.02 protocol. For an empty parameter, point to “auto” or an empty string.

Page 43: Linux Kernel Booting Process (1) - For NLKB

struct setup_header (2)

Member              Sz  Description
relocatable_kernel  1   Indicates whether the protected-mode kernel is relocatable.
payload_offset      4   Offset to the payload from the protected-mode code
payload_length      4   Length of the payload
setup_data          8   Pointer to the singly linked list of additional setup_data entries
realmode_swtch      4   Hook called just before switching to protected mode

Page 44: Linux Kernel Booting Process (1) - For NLKB

What does the setup code set up?

• header.S
  • Contains setup_header
  • Prepares the stack and BSS to run C programs
  • Jumps into the C program (main.c)
• main.c
  • Copies setup_header into the “zeropage”
  • Sets up the early console
  • Initializes the heap
  • Checks the CPUs (64-bit capable for a 64-bit kernel?)
  • Collects HW information by querying the BIOS, and stores the results in the “zeropage”
  • Finally switches to protected mode, and jumps into the “protected-mode kernel”

Page 45: Linux Kernel Booting Process (1) - For NLKB

struct boot_params
• Traditionally called “zeropage”
  • A page that contains additional boot information for 32-bit mode
  • Statically allocated in main.c
  • Includes the whole struct setup_header.
    • main.c first copies the contents of struct setup_header into &boot_params.hdr

struct boot_params {
    struct screen_info screen_info;         /* 0x000 */
    …
    __u8  e820_entries;                     /* 0x1e8 */
    …
    struct setup_header hdr;                /* setup header */ /* 0x1f1 */
    …
    struct e820entry e820_map[E820MAX];     /* 0x2d0 */
    …
} __attribute__((packed));

Page 46: Linux Kernel Booting Process (1) - For NLKB

Collecting HW Information
• Memory size [arch/x86/boot/memory.c]
  • Try the methods in the following order:
    • AX = 0xe820, INT 0x15
    • AX = 0xe801, INT 0x15
    • AH = 0x88, INT 0x15
• IST (Intel SpeedStep Technology) information

Page 47: Linux Kernel Booting Process (1) - For NLKB

Memory Information
• AX = 0xe820, INT 0x15 [detect_memory_e820()]
  • INPUT
    • AX = 0xe820
    • CX = size of the buffer
    • EDX = “SMAP” (0x534d4150 / signature)
    • EBX = continuation value
    • ES:DI = address of the buffer
  • OUTPUT
    • CF = 0 if successful, 1 otherwise
    • CX = number of bytes returned
    • EBX = continuation value
  • Each call returns information for one range
    • To get information for the next range, pass the continuation value returned by the previous call
  • The range information is returned in the following structure
    • Stored in boot_params.e820_map (struct e820entry[128])

struct e820entry {
    __u64 addr; /* start of memory segment */
    __u64 size; /* size of memory segment */
    __u32 type; /* type of memory segment */
} __attribute__((packed));

(arch/x86/include/uapi/asm/e820.h)

Type            Value
E820_RAM        1
E820_RESERVED   2
E820_ACPI       3
E820_NVS        4
E820_UNUSABLE   5

Page 48: Linux Kernel Booting Process (1) - For NLKB

(Example)

BIOS-e820: [mem 0x0000000000000000-0x000000000009ebff] usable
BIOS-e820: [mem 0x000000000009ec00-0x000000000009ffff] reserved
BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved
BIOS-e820: [mem 0x0000000000100000-0x00000000668c2fff] usable
BIOS-e820: [mem 0x00000000668c3000-0x000000006690afff] ACPI NVS
BIOS-e820: [mem 0x000000006690b000-0x0000000066913fff] ACPI data
BIOS-e820: [mem 0x0000000066914000-0x0000000066916fff] ACPI NVS
BIOS-e820: [mem 0x0000000066917000-0x0000000066918fff] usable
BIOS-e820: [mem 0x0000000066919000-0x0000000066919fff] reserved
BIOS-e820: [mem 0x000000006691a000-0x000000006691afff] ACPI NVS
BIOS-e820: [mem 0x000000006691b000-0x000000006693dfff] reserved
BIOS-e820: [mem 0x000000006693e000-0x0000000066945fff] ACPI NVS
BIOS-e820: [mem 0x0000000066946000-0x000000006699ffff] reserved
BIOS-e820: [mem 0x00000000669a0000-0x00000000669a3fff] ACPI NVS
BIOS-e820: [mem 0x00000000669a4000-0x0000000066d95fff] usable
BIOS-e820: [mem 0x0000000066d96000-0x0000000066ef2fff] reserved
BIOS-e820: [mem 0x0000000066ef3000-0x0000000066efffff] usable
BIOS-e820: [mem 0x00000000fec00000-0x00000000fec00fff] reserved
BIOS-e820: [mem 0x00000000fec10000-0x00000000fec10fff] reserved
BIOS-e820: [mem 0x00000000fed00000-0x00000000fed00fff] reserved
BIOS-e820: [mem 0x00000000fed61000-0x00000000fed70fff] reserved
BIOS-e820: [mem 0x00000000fed80000-0x00000000fed8ffff] reserved
BIOS-e820: [mem 0x00000000fef00000-0x00000000ffffffff] reserved
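To show how a map like the one above is consumed, here is a minimal sketch that totals the usable RAM in an array of e820 entries. The struct mirrors the e820entry definition from the slide but uses <stdint.h> types so that it compiles outside the kernel; the function name is hypothetical, and the packed attribute assumes GCC or Clang.

```c
#include <stddef.h>
#include <stdint.h>

#define E820_RAM 1

/* Same layout as the kernel's struct e820entry shown on the slide. */
struct e820entry {
    uint64_t addr; /* start of memory segment */
    uint64_t size; /* size of memory segment */
    uint32_t type; /* type of memory segment */
} __attribute__((packed));

/* Sum of the sizes of all entries marked as usable RAM. */
uint64_t e820_usable_bytes(const struct e820entry *map, size_t n)
{
    uint64_t total = 0;

    for (size_t i = 0; i < n; i++) {
        if (map[i].type == E820_RAM)
            total += map[i].size;
    }
    return total;
}
```

Note that the log prints inclusive end addresses, so the first line above corresponds to addr = 0 and size = 0x9ec00.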

Page 49: Linux Kernel Booting Process (1) - For NLKB

Memory Information : Old Age
• AH = 0x88, INT 0x15 [detect_memory_88()]
  • INPUT
    • AH = 0x88
  • OUTPUT
    • AX = size of memory above 1 MB [in KB]
    • CF = 1 if error, 0 otherwise
  • Can return up to 64 MB
  • Stored in boot_params.screen_info.ext_mem_k
• AX = 0xe801, INT 0x15 [detect_memory_e801()]
  • INPUT
    • AX = 0xe801
  • OUTPUT
    • AX = size of memory between 1 MB and 16 MB [in KB]
    • BX = size of memory above 16 MB [in 64 KB units]
    • CX, DX = unknown (same as AX and BX, respectively)
  • Can return up to 4 GB
    • Currently, Linux ignores the area > 16 MB if AX != 15 MB.
  • Stored in boot_params.alt_mem_k

(Note on the slide: these values are eventually converted to an e820 map in arch/x86/kernel/e820.c)
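The E801 rule described above ("ignore the area above 16 MB if AX != 15 MB") can be sketched as below. This is written from the description on the slide, not copied from memory.c; the function name and the choice to return -1 for an implausible AX are assumptions of this example.

```c
#include <stdint.h>

/* Extended memory size in KB derived from an INT 0x15, AX = 0xe801 result.
 * ax: KB between 1 MB and 16 MB, bx: 64 KB blocks above 16 MB.
 * Returns -1 if ax exceeds 15 MB, which cannot be valid. */
int64_t e801_ext_mem_kb(uint16_t ax, uint16_t bx)
{
    if (ax > 15 * 1024)
        return -1;
    if (ax == 15 * 1024)
        return (int64_t)ax + ((int64_t)bx << 6); /* 64 KB blocks to KB */
    return ax; /* hole below 16 MB: ignore memory above 16 MB */
}
```

For example, AX = 15360 (15 MB) and BX = 16 (1 MB) yield 16384 KB, while AX = 1024 yields 1024 KB regardless of BX.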

Page 50: Linux Kernel Booting Process (1) - For NLKB

Goes into the protected mode
• go_to_protected_mode() in pm.c
  • Calls realmode_switch_hook
  • Enables A20
  • Resets the coprocessor (deasserts IGNNE# of the x87)
  • Masks interrupts in the PICs
  • Sets up the IDT and GDT
• protected_mode_jump in pmjump.S
  • Enables PE in CR0
  • And ljmp (0x66 0xea)
  • Jumps into the 32-bit entry point

Page 51: Linux Kernel Booting Process (1) - For NLKB

3-2. Protected Mode
Now it’s 32 bit.

Page 52: Linux Kernel Booting Process (1) - For NLKB

Protected-Mode Protocol
• Starts at the top of the protected-mode kernel
  • Usually loaded at 0x100000 (1 MB)
  • Can be at any position if compiled as relocatable
  • Must be at the position specified at compile time if compiled as non-relocatable
• Used by the “linux” module in GRUB2
• [Protocol] At the entry point,
  • The loaded GDT must have __BOOT_CS (0x10 / execute and read) and __BOOT_DS (0x18 / read and write)
  • %cs must be __BOOT_CS
  • %ds, %es, and %ss must be __BOOT_DS
  • Interrupts must be disabled
  • %esi must be the address of struct boot_params
  • %ebp, %edi, and %ebx must be zero.

Page 53: Linux Kernel Booting Process (1) - For NLKB

Protected-Mode Kernel
• arch/x86/boot/compressed/head_{32,64}.S
• Goal: decompress the kernel (vmlinux.gz/.bz2/.xz…) and start the kernel
  • Relocates the decompressing code (if relocatable and loaded at a different address)
  • Enables paging and enters long mode (in head_64.S)
  • Clears the BSS, and prepares the heap and stack
  • Decompresses the kernel
  • Relocates if required
    • RANDOMIZE_BASE or RELOCATABLE (in 32-bit)

Page 54: Linux Kernel Booting Process (1) - For NLKB

Relocation of the Kernel (1)
• When is it required?
  • When a program is loaded at an address different from the one expected at compile time, and
  • When an instruction in the program uses an absolute address for its operand(s)
• When does that happen?
  • In 32-bit mode, kernel data address = kernel code address
    • There is no simple way to avoid absolute addresses
    • No RIP-relative!
  • Address randomization (RANDOMIZE_BASE)
    • To randomize the kernel physical/virtual address

Page 55: Linux Kernel Booting Process (1) - For NLKB

Relocation of the Kernel (2)
• How is it done?
  • Create tables of the positions of all the references to absolute symbols
  • At runtime, rewrite the addresses by adding the delta between the expected address and the actual address.
  • Done!
• How is the table created?
  • The object files have the table for linking with the other objects
  • LD’s option (-q/--emit-relocs) leaves the table in the ELF file
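The "rewrite the addresses adding the delta" step can be sketched in a few lines. This is a simplified illustration, not the kernel's actual relocation code: the table here is assumed to be a plain array of byte offsets into the image, each pointing at a 32-bit absolute address, and the function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Add `delta` (actual load address minus expected load address) to every
 * 32-bit absolute address stored at the listed offsets in `image`. */
void apply_relocs32(uint8_t *image, const uint32_t *offsets, size_t n,
                    uint32_t delta)
{
    for (size_t i = 0; i < n; i++) {
        uint32_t value;

        /* memcpy avoids unaligned access problems. */
        memcpy(&value, image + offsets[i], sizeof value);
        value += delta;
        memcpy(image + offsets[i], &value, sizeof value);
    }
}
```

If a kernel linked for 0x1000000 is actually loaded 1 MB higher, delta is 0x100000 and a stored address such as 0xc1000010 becomes 0xc1100010.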

Page 56: Linux Kernel Booting Process (1) - For NLKB

Kernel memory map in x86(_32)

(Diagram: virtual vs. physical address. Virtual: user space from 0x00000000; lowmem from PAGE_OFFSET (0xc0000000) to 0xf8000000, containing the Linux kernel. Physical: the Linux kernel in low memory, mapped by lowmem.)

Page 57: Linux Kernel Booting Process (1) - For NLKB

Kernel memory map in x86_64

(Diagram: virtual vs. physical address. Virtual: user space from 0x0000000000000000; lowmem from PAGE_OFFSET (0xFFFF880000000000); Linux kernel text & data from __START_KERNEL_map (0xFFFFFFFF80000000). Physical: the Linux kernel.)

Page 58: Linux Kernel Booting Process (1) - For NLKB

Decompressing the Kernel
• Heap and stack are taken from the static area in head_{32,64}.S
• If the output and the decompressing code may overlap, first relocate the decompressing code
• When all is done, it parses the ELF header and loads the segments to the appropriate addresses
  • Now, jumping!! to the entry point in ELF!

Page 59: Linux Kernel Booting Process (1) - For NLKB

Welcome to Linux kernel!
• Now we are at arch/x86/kernel/head_{32,64}.S!
• The details from here on are in the next presentation

Page 60: Linux Kernel Booting Process (1) - For NLKB

3-3. Modern World


Page 61: Linux Kernel Booting Process (1) - For NLKB

GPT
• GPT (GUID Partition Table)
  • Resolves MBR’s issues
    • The offset and size of a partition in the MBR’s table are expressed as 32-bit LBA (Logical Block Addressing) values
    • Cannot point to sectors beyond 2 TB
      • 32 bits (2^32) * 512 bytes/sector (2^9) = 2 TB (2^41)
  • Part of UEFI
• Sectors used
  • LBA 0 = MBR (compatible partition table)
    • The MBR’s partition table is set so that the whole disk area is reserved for one partition (System ID = 0xee)
  • LBA 1 = Header
  • LBA 2 ~ 33 = Partition information (128 partitions)
    • Partition type is expressed by a GUID (Globally Unique Identifier)
    • EBD0A0A2-B9E5-4433-87C0-68B6B72699C7 for Linux file systems
  • LBA -33 ~ -1 = Backup
    • The copy of LBA 1 ~ 33
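The 2 TB limit is plain arithmetic, and can be checked in C. The function name is hypothetical; the second sector size in the usage note (4096 bytes) is only an illustration and is not discussed on the slides.

```c
#include <stdint.h>

/* Bytes addressable when sector numbers are limited to 32 bits. */
uint64_t mbr_addressable_bytes(uint32_t sector_size)
{
    return ((uint64_t)1 << 32) * sector_size;
}
```

With 512-byte sectors this is 2^41 bytes (2 TB); with 4096-byte sectors it would be 2^44 bytes (16 TB).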

Page 62: Linux Kernel Booting Process (1) - For NLKB

GPT and GRUB

• Hey, where should "core.img" be??
  • In MBR partitions, there is a "gap" between the MBR and the first partition…
• In GPT, it should be allocated as a partition
  • "BIOS Boot Partition"
  • Created right after the GPT, and before 1 MB
    • All the other partitions are aligned to 1 MB boundaries
    • Thus, there is also a gap between the GPT and the next partition.
  • GUID: 21686148-6449-6E6F-744E-656564454649
    • "Hah!IdontNeedEFI"
    • Yes. If you use UEFI, you don't need such a partition!
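The joke in the GUID only shows up in the on-disk byte order: the first three fields of a GUID are stored little-endian, the remaining bytes as written. A minimal sketch of that conversion follows; the function name is hypothetical, and the parser is deliberately lenient (it does not reject trailing characters).

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Convert a textual GUID (xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx) into the
 * 16 bytes stored on disk. Returns 0 on success, -1 on a parse error.
 */
int guid_to_disk_bytes(const char *text, uint8_t out[16])
{
    unsigned int a, b, c, d[8];

    assert(text != NULL && out != NULL);
    if (sscanf(text, "%8x-%4x-%4x-%2x%2x-%2x%2x%2x%2x%2x%2x",
               &a, &b, &c, &d[0], &d[1], &d[2], &d[3],
               &d[4], &d[5], &d[6], &d[7]) != 11)
        return -1;

    /* First three fields: least significant byte first. */
    out[0] = a & 0xff;
    out[1] = (a >> 8) & 0xff;
    out[2] = (a >> 16) & 0xff;
    out[3] = (a >> 24) & 0xff;
    out[4] = b & 0xff;
    out[5] = (b >> 8) & 0xff;
    out[6] = c & 0xff;
    out[7] = (c >> 8) & 0xff;

    /* Last eight bytes: stored in the order they are written. */
    for (int i = 0; i < 8; i++)
        out[8 + i] = (uint8_t)d[i];
    return 0;
}
```

Feeding it the BIOS Boot Partition GUID yields the 16 ASCII bytes of "Hah!IdontNeedEFI".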

Page 63: Linux Kernel Booting Process (1) - For NLKB

UEFI

• Unified Extensible Firmware Interface
• Its origin, EFI, was developed by HP and Intel.
  • Developed for the Itanium systems (2000)
  • Addresses the 16-bit mode limitation in BIOS
• Advantages
  • CPU-independent architecture and drivers
    • Device drivers can be written in EFI Byte Code (EBC)
  • Modular design
  • Rich functionality like GUI, file systems, network boot (not only by PXE, but also SAN boot, iSCSI, etc.), cryptography, etc.
  • GPT (Can boot from >2TB disks)

Page 64: Linux Kernel Booting Process (1) - For NLKB

Boot loaders in UEFI

• OS Loaders
  • A kind of UEFI application
  • Loaded from the EFI System Partition
• GRUB2 supports UEFI boot
• The Linux kernel itself can be directly executed from UEFI
  • "EFI Stub" (CONFIG_EFI_STUB)
  • Makes bzImage a UEFI application (PE format)
  • Very simple functionality (no boot menu…), thus GRUB2 is recommended

        .global bootsect_start
    bootsect_start:
    #ifdef CONFIG_EFI_STUB
        # "MZ", MS-DOS header
        .byte 0x4d
        .byte 0x5a
    #endif

        # Normalize the start address
        ljmp    $BOOTSEG, $start2

    ...

    #ifdef CONFIG_EFI_STUB
        .org    0x3c
        #
        # Offset to the PE header.
        #
        .long   pe_header
    #endif /* CONFIG_EFI_STUB */

(arch/x86/boot/header.S)
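The two excerpts are what make a bzImage look like a PE file: "MZ" in the first two bytes, and a 32-bit little-endian offset to the PE header at 0x3c. A loader-side check of those two facts can be sketched as follows; the function name is hypothetical, and the "PE\0\0" signature test is the standard PE convention rather than something shown on the slide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Return the file offset of the PE header if `img` starts with an MS-DOS
 * "MZ" header that points to a "PE\0\0" signature, or -1 otherwise.
 */
long find_pe_header(const uint8_t *img, size_t len)
{
    uint32_t off;

    assert(img != NULL);
    if (len < 0x40 || img[0] != 0x4d || img[1] != 0x5a)   /* "MZ" */
        return -1;

    /* 32-bit little-endian offset stored at 0x3c. */
    off = (uint32_t)img[0x3c] |
          ((uint32_t)img[0x3d] << 8) |
          ((uint32_t)img[0x3e] << 16) |
          ((uint32_t)img[0x3f] << 24);

    if ((size_t)off + 4 > len)
        return -1;
    if (memcmp(img + off, "PE\0\0", 4) != 0)
        return -1;
    return (long)off;
}
```

Without CONFIG_EFI_STUB the first two bytes are not "MZ", so such a check rejects the image.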

Page 65: Linux Kernel Booting Process (1) - For NLKB

“Secure Boot”

• A mechanism that allows only “signed” binaries (OSes) to be executed
  • To protect PCs from malware (like the kind that infects MBRs)
• The trusted keys are pre-stored in the firmware
  • Microsoft can (practically) force PC vendors to include its key
    • Otherwise, Windows cannot be booted.
• Then, how about the other OSes?
• Several approaches
  • Having users add the distributors' keys to the trusted list
  • An open-source foundation gets PC vendors to include its keys
  • Bootloader projects (or distributions) have their bootloaders signed by Microsoft

Page 66: Linux Kernel Booting Process (1) - For NLKB

shim

• A simple EFI bootloader
• It just chainloads the GRUB2 UEFI bootloader.
• The path to the next bootloader is hardcoded in the program
  • “grubx64.efi”
  • “fallback.efi”
• In Ubuntu, the “shim-signed” package contains the signed version of “shim”

Page 67: Linux Kernel Booting Process (1) - For NLKB

4. Booting in ARM

uBoot and uImage

Page 68: Linux Kernel Booting Process (1) - For NLKB

U-Boot and uImage

• ARM Case

[Figure: build steps from vmlinux to uImage]
(1) vmlinux → boot/Image: strip symbols and transform into a simple binary
(2) boot/Image → piggy.gzip: compress (gzip, xzkern, lzma, lzo, lz4)
(3) piggy.gzip → piggy.gzip.o: make an object file piggyback the compressed image
(4) piggy.gzip.o, piggy.o, *.o → boot/compressed/vmlinux: link with the other objects in boot/compressed (decompressing code)
(5) boot/compressed/vmlinux → boot/zImage: transform it into a simple binary
(6) boot/zImage → boot/uImage: convert zImage to uImage by mkimage (U-Boot's utility)

Page 69: Linux Kernel Booting Process (1) - For NLKB

ARM Boot Protocol

• Documentation/arm/Booting
• The entry point (in compressed/head.S) is called with two arguments
  • r1: machine type number
  • r2: boot data
    • Either the pointer to ATAGs or to a DTB (device tree blob)
    • When r2 points to a DTB, r1 is ignored.
• No BIOS-like things, thus hardware information should be provided by the boot loader (and also hard-coded in the kernel itself)
• ATAG List
  • An array of ATAGs; each element is of variable length
  • ATAG_CORE: start of the list (flags, page size, root device), ATAG_MEM: memory size and start address, etc.
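Since r2 can carry either format, the receiver has to look at the data to tell them apart. A minimal sketch of such a check: a DTB starts with the big-endian magic 0xd00dfeed, while an ATAG list starts with a tag header whose second word is ATAG_CORE (0x54410001). The enum, the function name, and the assumption of a little-endian ARM system (for the ATAG words) are choices made for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FDT_MAGIC 0xd00dfeedu  /* first word of a DTB, stored big-endian */
#define ATAG_CORE 0x54410001u  /* tag of the first element of an ATAG list */

enum boot_data_kind { BOOT_DATA_UNKNOWN, BOOT_DATA_ATAGS, BOOT_DATA_DTB };

/* Classify the memory that r2 points to (`p`, at least `len` bytes). */
enum boot_data_kind classify_boot_data(const uint8_t *p, size_t len)
{
    uint32_t first_be, second_le;

    assert(p != NULL);
    if (len < 8)
        return BOOT_DATA_UNKNOWN;

    first_be = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
               ((uint32_t)p[2] << 8) | (uint32_t)p[3];
    if (first_be == FDT_MAGIC)
        return BOOT_DATA_DTB;

    /* ATAG header: word 0 = size, word 1 = tag (assumed little-endian). */
    second_le = (uint32_t)p[4] | ((uint32_t)p[5] << 8) |
                ((uint32_t)p[6] << 16) | ((uint32_t)p[7] << 24);
    if (second_le == ATAG_CORE)
        return BOOT_DATA_ATAGS;

    return BOOT_DATA_UNKNOWN;
}
```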

Page 70: Linux Kernel Booting Process (1) - For NLKB

ATAGs

• ATAG
  • Tagged information for hardware
  • Used by the “bootm” command in U-Boot
  • Converted to FDT if CONFIG_ARM_ATAG_DTB_COMPAT

    #define ATAG_NONE   0x00000000

    struct tag_header {
        __u32 size;
        __u32 tag;
    };
    ...
    #define ATAG_MEM    0x54410002

    struct tag_mem32 {
        __u32   size;
        __u32   start;  /* physical start address */
    };

(arch/arm/include/uapi/asm/setup.h)

[Figure: layout of an ATAG list in memory, in order]
• Header: ATAG_CORE, followed by the contents for ATAG_CORE
• Header: ATAG_MEM, followed by the contents for ATAG_MEM
• Header: ATAG_INITRD2, followed by the contents for ATAG_INITRD2
• Header: ATAG_CMDLINE, followed by the contents for ATAG_CMDLINE
• Header: ATAG_NONE
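Because every element begins with a tag_header, the list can be walked without knowing the payload of each tag: the size field counts 32-bit words including the header, and a zero size marks the terminating ATAG_NONE. A minimal sketch of such a walk, with a hypothetical function name and a max_words bound added so that it cannot run off the end of a buffer:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ATAG_NONE 0x00000000u
#define ATAG_CORE 0x54410001u
#define ATAG_MEM  0x54410002u

/*
 * Walk an ATAG list of at most `max_words` 32-bit words and return a pointer
 * to the payload of the first element whose tag is `wanted`, or NULL.
 */
const uint32_t *atag_find(const uint32_t *list, size_t max_words, uint32_t wanted)
{
    size_t pos = 0;

    assert(list != NULL);
    while (pos + 2 <= max_words) {
        uint32_t size = list[pos];      /* tag_header.size, in words */
        uint32_t tag = list[pos + 1];   /* tag_header.tag */

        if (size < 2 || tag == ATAG_NONE)
            return NULL;                /* end of list, or malformed element */
        if (tag == wanted)
            return &list[pos + 2];      /* payload follows the header */
        pos += size;
    }
    return NULL;
}
```

For ATAG_MEM the returned payload is a struct tag_mem32, so word 0 is the size and word 1 the physical start address.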

Page 71: Linux Kernel Booting Process (1) - For NLKB

Flattened Device Tree (FDT) Blobs

• Binary form of the flattened device tree
  • The Device Tree (described in the next slide) laid out sequentially in memory in binary form
• Used by the “fdt” command

    struct boot_param_header {
        __be32 magic;             /* magic word OF_DT_HEADER */
        __be32 totalsize;         /* total size of DT block */
        __be32 off_dt_struct;     /* offset to structure */
        __be32 off_dt_strings;    /* offset to strings */
        __be32 off_mem_rsvmap;    /* offset to memory reserve map */
        __be32 version;           /* format version */
        __be32 last_comp_version; /* last compatible version */
        /* version 2 fields below */
        __be32 boot_cpuid_phys;   /* Physical CPU id we're booting on */
        /* version 3 fields below */
        __be32 dt_strings_size;   /* size of the DT strings block */
        /* version 17 fields below */
        __be32 dt_struct_size;    /* size of the DT structure block */
    };

(include/linux/of_fdt.h)
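All header fields are big-endian (__be32), so they have to be assembled byte by byte on a little-endian CPU. The sketch below reads a few of the fields from a raw blob using the offsets implied by the struct above (ten 32-bit fields, 40 bytes); struct fdt_header_info and parse_fdt_header are hypothetical names used only here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OF_DT_HEADER 0xd00dfeedu  /* magic word at the start of the blob */

/* Host-endian copy of the header fields this sketch cares about. */
struct fdt_header_info {
    uint32_t totalsize;
    uint32_t off_dt_struct;
    uint32_t off_dt_strings;
    uint32_t version;
};

static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Returns 0 and fills `out` if `blob` starts with a plausible FDT header. */
int parse_fdt_header(const uint8_t *blob, size_t len, struct fdt_header_info *out)
{
    assert(blob != NULL && out != NULL);
    if (len < 40 || read_be32(blob) != OF_DT_HEADER)
        return -1;

    out->totalsize      = read_be32(blob + 4);
    out->off_dt_struct  = read_be32(blob + 8);
    out->off_dt_strings = read_be32(blob + 12);
    out->version        = read_be32(blob + 20);
    return 0;
}
```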

Page 72: Linux Kernel Booting Process (1) - For NLKB

Device Tree [5][6]

• Device Tree
  • Describes hardware
  • Simple tree of named nodes and properties
  • A property is a pair of a name and a value
• “chosen” node
  • Does not represent real hardware
  • Carries information passed between the firmware (bootloader) and the OS kernel
  • Can include initrd information and command line parameters (bootargs)

[5] http://www.devicetree.org/Main_Page
[6] http://www.devicetree.org/Device_Tree_Usage

Page 73: Linux Kernel Booting Process (1) - For NLKB

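FDT

[Figure: layout of the FDT blob that r2 points to]
• r2 points to struct boot_param_header, which holds off_dt_struct and off_dt_strings
• off_dt_struct points to the structure block:
  • OF_DT_BEGIN_NODE (0x01)
  • Path name (“chosen”)
  • OF_DT_PROP (0x03)
  • Size
  • Offset in the string
  • Contents
  • OF_DT_END_NODE (0x02)
  • OF_DT_END (0x09)
• off_dt_strings points to the strings block (“Strings...”), which holds property names such as “bootargs”; the “Offset in the string” field of a property refers into it

(drivers/of/fdt.c)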
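The token stream in the figure can be walked with a small state machine: node names and property values are padded to 4-byte boundaries, and a property stores its length and the offset of its name in the strings block. The following sketch looks up one property of a direct child of the root node (such as "bootargs" in "chosen"). The function name is hypothetical, and the blob is trusted to be well-formed, which real code must not assume.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { DT_BEGIN_NODE = 0x1, DT_END_NODE = 0x2, DT_PROP = 0x3, DT_NOP = 0x4, DT_END = 0x9 };

static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Search the structure block `st` for property `prop` of the node `node`
 * directly below the root. Returns a pointer to the property contents, or NULL.
 */
const char *fdt_find_prop(const uint8_t *st, size_t st_len, const char *strings,
                          const char *node, const char *prop)
{
    size_t pos = 0;
    int depth = 0, in_node = 0;

    assert(st != NULL && strings != NULL);
    while (pos + 4 <= st_len) {
        uint32_t token = be32(st + pos);
        pos += 4;

        switch (token) {
        case DT_BEGIN_NODE: {
            const char *name = (const char *)(st + pos);
            size_t n = strlen(name) + 1;            /* name plus its NUL */
            depth++;                                /* the root node is depth 1 */
            in_node = (depth == 2 && strcmp(name, node) == 0);
            pos += (n + 3) & ~(size_t)3;            /* skip padding to 4 bytes */
            break;
        }
        case DT_END_NODE:
            if (depth == 2)
                in_node = 0;
            depth--;
            break;
        case DT_PROP: {
            uint32_t len = be32(st + pos);          /* "Size" in the figure */
            uint32_t nameoff = be32(st + pos + 4);  /* "Offset in the string" */
            pos += 8;
            if (in_node && depth == 2 && strcmp(strings + nameoff, prop) == 0)
                return (const char *)(st + pos);    /* "Contents" */
            pos += ((size_t)len + 3) & ~(size_t)3;
            break;
        }
        case DT_NOP:
            break;
        default:                                    /* DT_END or unknown token */
            return NULL;
        }
    }
    return NULL;
}
```

Called with node "chosen" and prop "bootargs", it returns the kernel command line string stored in the blob, if there is one.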