Model-Specific Registers A look at Intel’s scheme for introducing new CPU features

Post on 20-Dec-2015

Microprocessor evolution…

  Year  Processor          Features
  1973  8080               64K memory, 8-bit registers (no mul/div, no FPU)
  1978  8086               1M memory, 16-bit registers, I/O-ports (8087 option)
  1981  80186              ins/outs, shift/rotate-immediate, integrated DMA+PIC+Timers
  1982  80286              16M memory, protected-mode multitasking (80287 option)
  1985  80386              4GB memory, 32-bit registers, paging (287/387 options); added TR6, TR7
  1989  80486              Integrated FPU, RISC, caching, xadd (APIC option); added TR3, TR4, TR5
  1993  80586 “Pentium”    MMX-instructions, integrated local-APIC, MSRs, dual-pipelines, branch-prediction; removed TR3, TR4, TR5, TR6, TR7

The ‘Model-Specific’ concept

• Beginning with the Pentium processor, Intel has been including ‘experimental’ features in its processors, warning that they may disappear from future designs, but providing a standard and permanent way for all such features to be accessed

• This access is via a pair of ‘privileged’ instructions (rdmsr and wrmsr) that can only be executed by ‘ring0’ code

Quite a few MSRs now!

• At first there were only about a dozen of these MSRs (Model-Specific Registers), but lately their number is well over 200

• Some MSRs have evidently proven useful enough that they are now treated as permanent fixtures of the defined i386 architecture

The Time-Stamp Counter

• This 64-bit Model-Specific Register was introduced in the Pentium processor and has been present in each CPU thereafter

• It increments once every CPU clock-cycle, starting from 0 when power is turned on

• It won’t overflow for at least ten years

• Unprivileged programs (ring3) normally can access it via the rdtsc instruction
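The "at least ten years" claim can be checked with a little arithmetic. The sketch below is only an illustration: the function name and the 3 GHz clock rate used in the usage note are assumptions, not values from the slides.

```c
#include <assert.h>

/* Seconds in a 365.25-day year. */
#define SECONDS_PER_YEAR (365.25 * 24.0 * 60.0 * 60.0)

/* Years until a 64-bit counter that starts at 0 and advances
 * clock_hz times per second wraps around to 0 again.
 * (Hypothetical helper, named here for illustration only.) */
double tsc_years_until_overflow(double clock_hz)
{
    assert(clock_hz > 0.0);
    double total_ticks = 18446744073709551616.0;   /* 2^64 */
    return total_ticks / clock_hz / SECONDS_PER_YEAR;
}
```

For an assumed 3 GHz clock, tsc_years_until_overflow(3.0e9) comes out at roughly 195 years, so the ten-year figure is a very conservative bound.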

Using the TSC

            EDX                          EAX
 63                      32 31                       0
 |<--------------------- 64-bits ------------------->|

time0:  .quad   0                   # saves starting value from the TSC
time1:  .quad   0                   # saves concluding value from TSC

        # how you can measure CPU clock-cycles in a code-fragment
        rdtsc                       # read the Time-Stamp Counter
        movl    %eax, time0+0       # save least-significant longword
        movl    %edx, time0+4       # save most-significant longword
        # <Your code-fragment to be measured goes here>
        rdtsc                       # read the Time-Stamp Counter
        movl    %eax, time1+0       # save least-significant longword
        movl    %edx, time1+4       # save most-significant longword
        # now subtract starting-value ‘time0’ from ending value ‘time1’
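The final subtraction step, and the way a 64-bit value is carried in the EDX:EAX register-pair, can be modeled in C. This is a minimal sketch; the function names are hypothetical and the register contents are passed in as plain integers rather than read from the hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Combine the two halves that rdtsc (or rdmsr) leaves in EDX:EAX. */
uint64_t tsc_join(uint32_t edx, uint32_t eax)
{
    return ((uint64_t)edx << 32) | eax;
}

/* Split a 64-bit value into the halves that wrmsr expects in EDX:EAX. */
void tsc_split(uint64_t value, uint32_t *edx, uint32_t *eax)
{
    assert(edx != NULL && eax != NULL);
    *edx = (uint32_t)(value >> 32);          /* most-significant longword  */
    *eax = (uint32_t)(value & 0xFFFFFFFFu);  /* least-significant longword */
}

/* Elapsed clock-cycles between two samples. Unsigned subtraction
 * is modulo 2^64, so the result is still right if the counter
 * wrapped around once between the two reads. */
uint64_t tsc_elapsed(uint64_t time0, uint64_t time1)
{
    return time1 - time0;
}
```

With time0 = 0x1FFFFFFF0 and time1 = 0x200000010 the elapsed count is 0x20 cycles, even though the low longword alone appears to have gone backwards.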

The TSC as an MSR

• Each Model-Specific Register has its own identifying register-number, and it can be accessed (from ring0) using the special pair of instructions: rdmsr and wrmsr

• The Time-Stamp Counter is MSR number 0x10

• To write a new 64-bit value into the TSC, you load the desired 64-bit value into the EDX:EAX register-pair, you put the MSR ID-number 0x10 into register ECX, then you execute wrmsr

IA32_APIC_BASE

• This register has MSR number 0x1B and is private to each CPU in an SMP system

• It establishes the base-address for the Local-APIC’s memory-mapped registers (the default base-address is 0xFEE00000, but that can be changed using this MSR)

• The CPU’s Local-APIC functions can be either enabled or disabled (via bit #11)

• The BSP can be recognized (via bit #8)

Relocating the APIC registers

IA32_APIC_BASE (64-bits)

  bits 63..32   reserved
  bits 31..12   APIC base-address (4K page-number)
  bit  11       EN
  bit  8        BSP

Default-value for APIC base-address page = 0xFEE00

EN  = Local-APIC Enable bit (1=enabled, 0=disabled)

BSP = Boot-Strap Processor (read-only): 1=yes, 0=no

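        # make the processor’s Local-APIC registers accessible in real-mode
        mov     $0x1B, %ecx             # MSR register-number
        rdmsr                           # read current IA32_APIC_BASE into EDX:EAX
        and     $0x00000FFF, %eax       # keep the EN and BSP flag-bits (EN=0 would disable the Local-APIC)
        or      $0x000D8000, %eax       # least-significant 32-bits: new base-address
        mov     $0x00000000, %edx       # most-significant 32-bits
        wrmsr                           # write to specified MSR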
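Because the base-address shares the register with the EN and BSP flags, it may help to see the fields pulled apart. The following is a sketch under the layout drawn on this slide (base-address in bits 31..12); the macro and function names are hypothetical, and the MSR value is passed in as an ordinary integer.

```c
#include <assert.h>
#include <stdint.h>

#define APIC_BASE_MSR   0x1Bu                   /* MSR number from the slide        */
#define APIC_BSP_FLAG   (1ull << 8)             /* 1 = Boot-Strap Processor         */
#define APIC_EN_FLAG    (1ull << 11)            /* 1 = Local-APIC enabled           */
#define APIC_PAGE_MASK  0x00000000FFFFF000ull   /* bits 31..12: base-address page   */

uint64_t apic_base_address(uint64_t msr) { return msr & APIC_PAGE_MASK; }
int      apic_is_enabled(uint64_t msr)   { return (msr & APIC_EN_FLAG) != 0; }
int      apic_is_bsp(uint64_t msr)       { return (msr & APIC_BSP_FLAG) != 0; }

/* Compute a new MSR image with a different base-address,
 * carrying every other bit (including EN and BSP) over unchanged. */
uint64_t apic_relocate(uint64_t msr, uint32_t new_base)
{
    assert((new_base & 0xFFFu) == 0);   /* base must be 4K-aligned */
    return (msr & ~APIC_PAGE_MASK) | new_base;
}
```

For example, starting from an image of 0xFEE00900 (default base, enabled, BSP), relocating to 0x000D8000 gives 0x000D8900.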

Extended Feature Enable Register

• This Model-Specific Register (MSR) was introduced in the AMD64 architecture and perpetuated by EM64T (for compatibility)

Extended Feature Enable Register (bits 63..0):

  bit 11   NXE
  bit 10   LMA
  bit  8   LME
  bit  0   SCE

Legend:
  SCE = SysCall/sysret is Enabled (1=yes, 0=no)
  LME = Long-Mode is Enabled (1=yes, 0=no)
  LMA = Long-Mode is Active (1=yes, 0=no)
  NXE = Non-eXecutable pages Enabled (1=yes, 0=no)

NOTE: The MSR address-index for EFER = 0xC0000080, and this register is accessed using RDMSR or WRMSR instructions
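The four EFER bits named in the legend can be expressed as masks. This is only an illustrative sketch: the names are hypothetical and the register image is an ordinary integer, since RDMSR itself needs ring0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EFER_MSR  0xC0000080u      /* MSR address-index from the slide */
#define EFER_SCE  (1ull << 0)      /* SysCall/sysret Enabled           */
#define EFER_LME  (1ull << 8)      /* Long-Mode Enabled                */
#define EFER_LMA  (1ull << 10)     /* Long-Mode Active                 */
#define EFER_NXE  (1ull << 11)     /* Non-eXecutable pages Enabled     */

struct efer_flags { int sce, lme, lma, nxe; };

/* Decode an EFER image into one 0/1 flag per legend entry. */
void efer_decode(uint64_t efer, struct efer_flags *out)
{
    assert(out != NULL);
    out->sce = (efer & EFER_SCE) != 0;
    out->lme = (efer & EFER_LME) != 0;
    out->lma = (efer & EFER_LMA) != 0;
    out->nxe = (efer & EFER_NXE) != 0;
}

/* The image that a read-modify-write would write back to set LME. */
uint64_t efer_set_lme(uint64_t efer)
{
    return efer | EFER_LME;
}
```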

The x86 operating ‘modes’

[Diagram: the x86 operating modes and the transitions among them]

  Power on -> Real mode
  Protected mode
  Virtual 8086 mode
  IA-32e mode (comprising 64-bit mode and Compatibility mode)
  System Management mode

Why CPU’s ‘mode’ matters

• Key differences among the x86 modes:
  – How memory is addressed and mapped
  – What instruction-set is available
  – Which registers are accessible
  – Which ‘exceptions’ may be generated
  – What data-structures are required
  – How task-switching can be accomplished
  – How interrupts will be processed

Mode transitions

• The processor starts up in ‘real mode’

• Mode-transitions normally happen under program control (except for transitions to the so-called ‘System Management Mode’)

• Details of programming a mode-change depend on which modes are involved

• Some mode-transfers aren’t possible

• ‘64-bit mode’ offers a lot of surprises

Registers in 64-bit mode

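  63                32 31          16 15      8 7       0
  |<----------------------- RAX ---------------------->|
                      |<------------- EAX ------------>|
                                     |<------ AX ----->|
                                               |<- AL ->|

Registers widened to 64 bits:
  EAX -> RAX   ECX -> RCX   EDX -> RDX   EBX -> RBX
  ESP -> RSP   EBP -> RBP   ESI -> RSI   EDI -> RDI
  EIP -> RIP   EFLAGS -> RFLAGS

Additional registers:
  R8 R9 R10 R11 R12 R13 R14 R15
  CR8

Control and debug registers:
  CR0 CR2 CR3 CR4 DR0 DR1 DR2 DR3 DR6 DR7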

Some missing features…

• Memory-segmentation is “turned off”
  – Base-address is zero for CS, DS, ES, SS
  – Segment-limit checking is not performed

• Certain familiar instructions no longer are defined while executing in ‘64-bit mode’
  – Cannot use ‘pusha’ and ‘popa’
  – Cannot ‘ljmp’ or ‘lcall’ with ‘direct’ addressing
  – Cannot use ‘lahf’ and ‘sahf’

“canonical” addresses

64-bit “virtual” address space:

  0xFFFFFFFFFFFFFFFF … 0xFFFF800000000000    “canonical” addresses
  (everything in between)                    “non-canonical” (invalid) virtual addresses
  0x00007FFFFFFFFFFF … 0x0000000000000000    “canonical” addresses

Analogy using 5-bit values:

  11000 11001 11010 11011 11100 11101 11110 11111    “canonical”
  01000 01001 01010 01011 01100 01101 01110 01111
  10000 10001 10010 10011 10100 10101 10110 10111    “non-canonical” (invalid)
  00000 00001 00010 00011 00100 00101 00110 00111    “canonical”
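The rule behind both pictures is sign-extension: an address is canonical when its upper bits are all copies of the top implemented bit (bit 47 for the real case, bit 3 in the 5-bit analogy). A minimal sketch, with hypothetical function names:

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low `bits` bits of value to a full 64 bits. */
uint64_t sign_extend(uint64_t value, unsigned bits)
{
    assert(bits >= 1 && bits <= 64);
    if (bits == 64)
        return value;
    uint64_t sign = 1ull << (bits - 1);      /* the top implemented bit */
    uint64_t mask = (1ull << bits) - 1;
    value &= mask;
    return (value ^ sign) - sign;            /* replicate the sign bit upward */
}

/* A 64-bit virtual address is canonical if it equals the
 * sign-extension of its own low 48 bits. */
int is_canonical48(uint64_t va)
{
    return sign_extend(va, 48) == va;
}

/* The 5-bit analogy: a value is "canonical" if it equals the
 * sign-extension of its own low 4 bits, truncated to 5 bits. */
int is_canonical5(unsigned v)
{
    assert(v < 32);
    return (sign_extend(v, 4) & 0x1F) == v;
}
```

Under this check 0x00007FFFFFFFFFFF and 0xFFFF800000000000 are canonical, while 0x0000800000000000 is not.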

4-Levels of mapping

CR3 -> Page Map Level-4 Table -> Page Directory Pointer Table -> Page Directory -> Page Table -> Page Frame (4KB)

64-bit ‘canonical’ virtual address:

  bits 63..48   sign-extension
  bits 47..39   PML4
  bits 38..30   PDPT
  bits 29..21   PDIR
  bits 20..12   PTBL
  bits 11..0    offset

Each mapping-table contains up to 512 quadword-size entries

4-level address-translation

• The CPU examines any virtual address it encounters, subdividing it into five fields

  bits 63..48   (16-bits)   sign-extension
  bits 47..39   (9-bits)    index into level-4 page-map table
  bits 38..30   (9-bits)    index into page-directory pointer table
  bits 29..21   (9-bits)    index into page-directory
  bits 20..12   (9-bits)    index into page-table
  bits 11..0    (12-bits)   offset into page-frame

Any 48-bit virtual-address is sign-extended to a 64-bit “canonical” address

Only “canonical” 64-bit virtual-addresses are legal in 64-bit mode
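The subdivision into four 9-bit table-indexes plus a 12-bit offset amounts to a handful of shifts and masks. The sketch below uses hypothetical names; the field boundaries are the ones given on the slide.

```c
#include <assert.h>
#include <stdint.h>

struct va_fields {
    unsigned pml4;     /* bits 47..39: index into level-4 page-map table        */
    unsigned pdpt;     /* bits 38..30: index into page-directory pointer table  */
    unsigned pdir;     /* bits 29..21: index into page-directory                */
    unsigned ptbl;     /* bits 20..12: index into page-table                    */
    unsigned offset;   /* bits 11..0 : offset into page-frame                   */
};

/* Split a canonical 64-bit virtual address into its five fields. */
struct va_fields va_split(uint64_t va)
{
    /* Bits 63..47 must be all 0s or all 1s for a canonical address. */
    uint64_t top = va >> 47;
    assert(top == 0 || top == 0x1FFFF);

    struct va_fields f;
    f.pml4   = (unsigned)((va >> 39) & 0x1FF);
    f.pdpt   = (unsigned)((va >> 30) & 0x1FF);
    f.pdir   = (unsigned)((va >> 21) & 0x1FF);
    f.ptbl   = (unsigned)((va >> 12) & 0x1FF);
    f.offset = (unsigned)(va & 0xFFF);
    return f;
}
```

Each 9-bit index selects one of the 512 entries in its table, which is why the tables hold at most 512 quadwords.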

Format of 64-bit table-entries

  bit  63       EXB
  bits 62..52   avl
  bits 51..40   Reserved (must be 0)
  bits 39..32   Page-frame physical base-address[39..32]
  bits 31..12   Page-frame physical base-address[31..12]
  bits 11..9    avl
  bits 8..6     Meaning of these bits varies with the table
  bit  5        A
  bit  4        PCD
  bit  3        PWT
  bit  2        U
  bit  1        W
  bit  0        P

Legend:
  P   = Present (1=yes, 0=no)
  W   = Writable (1=yes, 0=no)
  U   = User-page (1=yes, 0=no)
  PWT = Page Write-Through (1=yes, 0=no)
  PCD = Page Cache Disable (1=yes, 0=no)
  A   = Accessed (1=yes, 0=no)
  avl = available for user-defined purposes
  EXB = Execution-disabled Bit (if EFER.NXE=1)

Physical addresses on our current Core-2 CPUs are only 40 bits
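As an illustration of this entry format, a table-entry is just a page-frame address OR-ed with flag bits. This sketch assumes the 40-bit physical-address limit stated above; the macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PTE_P     (1ull << 0)     /* Present                */
#define PTE_W     (1ull << 1)     /* Writable               */
#define PTE_U     (1ull << 2)     /* User-page              */
#define PTE_PWT   (1ull << 3)     /* Page Write-Through     */
#define PTE_PCD   (1ull << 4)     /* Page Cache Disable     */
#define PTE_A     (1ull << 5)     /* Accessed               */
#define PTE_EXB   (1ull << 63)    /* Execution-disabled Bit */
#define PTE_ADDR_MASK 0x000000FFFFFFF000ull   /* base-address bits 39..12 */

/* Build an entry from a 4K-aligned physical address and flag bits. */
uint64_t pte_make(uint64_t frame_phys, uint64_t flags)
{
    assert((frame_phys & ~PTE_ADDR_MASK) == 0);  /* aligned, within 40 bits */
    assert((flags & PTE_ADDR_MASK) == 0);        /* flags stay out of the address field */
    return frame_phys | flags;
}

/* Recover the page-frame physical base-address from an entry. */
uint64_t pte_frame(uint64_t entry)
{
    return entry & PTE_ADDR_MASK;
}

/* Test one flag bit of an entry. */
int pte_test(uint64_t entry, uint64_t flag)
{
    return (entry & flag) != 0;
}
```

For instance, a present and writable entry for the frame at 0xB8000 is 0x00000000000B8003.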

RDMSR and WRMSR

• An assembly language code-fragment to turn on the LME-bit (‘Long-Mode’ Enable):

# Each Model-Specific Register (MSR) is 64-bits wide and has a unique
# 32-bit address-index which is first placed into register ECX. Then the
# least-significant 32-bits of that MSR is accessed using register EAX,
# while the most-significant 32-bits is accessed using register EDX.

        mov     $0xC0000080, %ecx       # setup EFER address-index
        rdmsr                           # read EFER into (EDX,EAX)
        bts     $8, %eax                # set the LME-bit’s image to 1
        wrmsr                           # write (EDX,EAX) into EFER

# NOTE: RDMSR and WRMSR must be executed at ‘Ring0’ privilege-level.

Control Registers CR4 and CR0

Control Register CR4 (bits 31..0; every bit not listed is 0)

  bit 13   VMXE
  bit 10   OSXMMEx
  bit  9   OSFXSR
  bit  8   PCE
  bit  7   PGE
  bit  6   MCE
  bit  5   PAE
  bit  4   PSE
  bit  3   DE
  bit  2   TSD
  bit  1   PVI
  bit  0   VME

Control Register CR0 (bits 31..0; every bit not listed is 0)

  bit 31   PG
  bit 30   CD
  bit 29   NW
  bit 18   AM
  bit 16   WP
  bit  5   NE
  bit  4   ET
  bit  3   TS
  bit  2   EM
  bit  1   MP
  bit  0   PE

Legend (for 64-bit mode):
  PE  = Protected-mode Enabled (1=yes, 0=no)
  PG  = Paging Enabled (1=yes, 0=no)
  PAE = Physical-Address Extensions (1=enabled, 0=disabled)

Segment-Descriptor Format

64-bit code-segment (‘LONG’ mode)

  Upper doubleword (bits 63..32):
    bits 63..56   Base[31..24] (if L=0)
    bit  55       G
    bit  54       D
    bit  53       L
    bit  52       AVL
    bits 51..48   Limit[19..16] (if L=0)
    bit  47       P
    bits 46..45   DPL
    bit  44       S
    bit  43       X
    bit  42       C/D
    bit  41       R/W
    bit  40       A
    bits 39..32   Base[23..16] (if L=0)

  Lower doubleword (bits 31..0):
    bits 31..16   Base[15..0] (if L=0)
    bits 15..0    Limit[15..0] (if L=0)

Legend:
  DPL = Descriptor Privilege Level (0..3)
  G   = Granularity (0 = byte, 1 = 4KB-page)
  P   = Present (0 = no, 1 = yes)
  D   = Default size (0 = 16-bit, 1 = 32-bit)
  S   = System (0 = yes, 1 = no)
  X   = eXecutable (0 = no, 1 = yes)
  A   = Accessed (0 = no, 1 = yes)
  L   = Long-mode (i.e., 64-bit addressing) (0=no, 1=yes)
  AVL = Available for user’s purposes

  code-segments: R = Readable (0 = no, 1 = yes)   C = Conforming (0=no, 1=yes)
  data-segments: W = Writable (0 = no, 1 = yes)   D = expands-Down (0=no, 1=yes)
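To make the bit-positions concrete, here is a sketch that assembles a descriptor for a readable 64-bit code-segment. The names are hypothetical. Base and limit are left at zero since they are ignored when L=1, and D is left at 0 as required in combination with L=1.

```c
#include <assert.h>
#include <stdint.h>

#define DESC_A   (1ull << 40)   /* Accessed                       */
#define DESC_R   (1ull << 41)   /* code-segments: Readable        */
#define DESC_C   (1ull << 42)   /* code-segments: Conforming      */
#define DESC_X   (1ull << 43)   /* eXecutable                     */
#define DESC_S   (1ull << 44)   /* 1 = not a system descriptor    */
#define DESC_P   (1ull << 47)   /* Present                        */
#define DESC_L   (1ull << 53)   /* Long-mode (64-bit) code        */
#define DESC_D   (1ull << 54)   /* Default size                   */
#define DESC_G   (1ull << 55)   /* Granularity                    */

/* Descriptor for a present, readable, non-conforming 64-bit
 * code-segment at the requested privilege-level. */
uint64_t code64_descriptor(unsigned dpl)
{
    assert(dpl <= 3);
    return DESC_P | ((uint64_t)dpl << 45) | DESC_S | DESC_X | DESC_R | DESC_L;
}
```

For ring0 this yields 0x00209A0000000000, a value often seen in GDT listings for long-mode code.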

IA-32e Call-Gate descriptor

  bits 127..96   Reserved (must be 0)
  bits 95..64    offset[63..32]
  bits 63..48    offset[31..16]
  bit  47        P
  bits 46..45    DPL
  bit  44        0
  bits 43..40    Gate Type (=1100)
  bits 39..32    Reserved (must be 0)
  bits 31..16    code-segment selector
  bits 15..0     offset[15..0]

We can use a call-gate to ‘jump’ from a 16-bit code-segment to a 64-bit code-segment
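A 16-byte call-gate scatters the 64-bit target offset across three fields, which is easy to get wrong by hand. This is a sketch only, with hypothetical names, that packs the fields in the order used by the IA-32e gate layout.

```c
#include <assert.h>
#include <stdint.h>

/* The two quadwords of a 16-byte IA-32e call-gate descriptor. */
struct call_gate64 {
    uint64_t low;    /* bits 63..0   */
    uint64_t high;   /* bits 127..64 */
};

/* Pack a present call-gate for the given code-segment selector,
 * 64-bit entry-point offset, and descriptor privilege-level. */
struct call_gate64 call_gate64_make(uint16_t selector, uint64_t offset, unsigned dpl)
{
    assert(dpl <= 3);
    struct call_gate64 g;
    g.low  = (offset & 0xFFFFull)                  /* offset[15..0]        */
           | ((uint64_t)selector << 16)            /* code-segment selector */
           | (0xCull << 40)                        /* gate type = 1100      */
           | ((uint64_t)dpl << 45)                 /* DPL                   */
           | (1ull << 47)                          /* P = present           */
           | (((offset >> 16) & 0xFFFFull) << 48); /* offset[31..16]        */
    g.high = offset >> 32;                         /* offset[63..32]; upper half stays 0 */
    return g;
}
```

The gate occupies two consecutive 8-byte slots in the descriptor table, so the selector that names it must leave room for the second quadword.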

Page 24: Model-Specific Registers A look at Intel’s scheme for introducing new CPU features

Summary of steps

• Transition from real-mode to IA-32e mode:
  – Build the table of global descriptors
  – Load GDTR with pseudo-descriptor for GDT
  – Build the 4-level page-mapping tables
  – Enable IA-32e mode (set EFER.LME=1)
  – Enable Physical-Address Extensions (CR4.PAE)
  – Load Level4 page-map table address in CR3
  – Activate IA-32e mode (CR0.PE and CR0.PG)
  – Transfer via call-gate to 64-bit code-segment
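The register side of these steps can be modeled as a small state check: IA-32e mode only becomes active once LME, PAE, PE and PG are all set. The sketch below is a software model with hypothetical names, not code that touches the real registers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EFER_LME  (1ull << 8)    /* EFER: Long-Mode Enable          */
#define CR4_PAE   (1u << 5)      /* CR4: Physical-Address Extension */
#define CR0_PE    (1u << 0)      /* CR0: Protected-mode Enable      */
#define CR0_PG    (1u << 31)     /* CR0: Paging Enable              */

/* Images of the registers that the transition steps modify. */
struct cpu_state {
    uint64_t efer;
    uint32_t cr0;
    uint32_t cr4;
};

/* Returns 1 once every enabling bit named in the steps is set,
 * i.e. the condition under which the CPU reports EFER.LMA=1. */
int ia32e_mode_active(const struct cpu_state *s)
{
    assert(s != NULL);
    return (s->efer & EFER_LME) != 0
        && (s->cr4  & CR4_PAE)  != 0
        && (s->cr0  & CR0_PE)   != 0
        && (s->cr0  & CR0_PG)   != 0;
}
```

Setting the bits in the order listed keeps the model inactive until the final write to CR0, which matches the distinction the slides draw between "enabling" (LME) and "activating" (PG) the mode.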

Page 25: Model-Specific Registers A look at Intel’s scheme for introducing new CPU features

Notes on the transition

• Code-segment must be “identity-mapped”

• Interrupts have to be temporarily disabled

• All memory-addressing in 64-bit mode via CS, SS, DS or ES uses 0 as base-address (and checking of segment-limits is omitted)

For a return to ‘real-mode’

• Processor must enter 16-bit code-segment in ‘compatibility-mode’ via indirect far jump
  – Load segment-registers DS, ES, and SS with ‘writable’ 16-bit segment-selectors (64K-limit)
  – Code-segment has to be “identity-mapped”
  – Deactivate IA-32e mode by clearing PG-bit
  – Leave ‘protected-mode’ by clearing PE-bit
  – Reload registers CS and SS with real-mode segment-addresses before enabling interrupts

In-class exercise #1

• Try running our ‘trymoves.s’ demo, to see the effect of changing the bottom-half of a 64-bit register

• Then modify the instructions in this demo so that you use as many of the new CPU registers as possible (i.e., use R8,…,R15 instead of RAX, RBX, etc., and R8L, R9L, …, instead of AL, BL, etc.)

Demo-program: ‘try64bit.s’

• We created a demo-program that starts in ‘real-mode’, enters 64-bit mode and draws a message, jumps to ‘compatibility mode’ and draws another message, then returns to real-mode and shows a final message

• It has to write directly to VRAM when it’s not executing in real-mode – because the ROM-BIOS routines use ‘real’-style code

How text-mode VRAM works

• The video memory resides at 0x000B8000 and in text-mode it is organized as a linear array of two-byte elements (i.e., ‘words’):

• Array-elements are arranged in “row-major” order (left-to-right, top-to-bottom)

  bits 15..8   Attribute-code for the foreground and background colors
  bits 7..0    ASCII code for character

Default color-programming

  bit:      7         6     5      4      3         2     1      0
  meaning:  Blinking  Red   Green  Blue   Intense   Red   Green  Blue
  value:    0         0     0      1      1         1     1      1
            |------- BACKCOLOR -------|   |-------- FORECOLOR -------|
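The attribute byte can be assembled from its color components with shifts and ORs. A minimal sketch, with hypothetical names, following the bit-layout shown above:

```c
#include <assert.h>

/* Color component bits, usable for both foreground and background. */
enum { VGA_BLUE = 1, VGA_GREEN = 2, VGA_RED = 4, VGA_INTENSE = 8 };

/* Build a text-mode attribute byte:
 * bit 7 = blinking, bits 6..4 = background, bits 3..0 = foreground. */
unsigned char vga_attribute(unsigned fore, unsigned back, int blink)
{
    assert(fore <= 15);   /* foreground may include the Intense bit     */
    assert(back <= 7);    /* background has only red, green, blue bits  */
    return (unsigned char)((blink ? 0x80u : 0u) | (back << 4) | fore);
}
```

The bit pattern on the slide (intense white on blue, not blinking) corresponds to vga_attribute(VGA_INTENSE | VGA_RED | VGA_GREEN | VGA_BLUE, VGA_BLUE, 0), which is 0x1F.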

Character-cell screen-locations

80 cells-per-row

25 rows

for (row 0, column 0) the address-offset is (0*80+0)*2

for (row 2, column 79) the address-offset is (2*80+79)*2

for (row 24, column 40) the address-offset is (24*80+40)*2
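The three offsets above all follow one formula, which can be written as a short function. This sketch uses hypothetical names; the base-address 0xB8000 and the 80x25 geometry are taken from the slides.

```c
#include <assert.h>
#include <stdint.h>

#define VRAM_BASE   0x000B8000u   /* text-mode video memory */
#define TEXT_COLS   80
#define TEXT_ROWS   25

/* Byte offset of a character-cell from the start of VRAM. */
unsigned cell_offset(unsigned row, unsigned col)
{
    assert(row < TEXT_ROWS && col < TEXT_COLS);
    return (row * TEXT_COLS + col) * 2;   /* two bytes per cell */
}

/* The 16-bit word stored in a cell: attribute in the high byte,
 * ASCII character-code in the low byte. */
uint16_t cell_word(unsigned char attribute, char ch)
{
    return (uint16_t)(((unsigned)attribute << 8) | (unsigned char)ch);
}
```

Adding cell_offset(row, col) to VRAM_BASE gives the address a demo would write to when it cannot call the ROM-BIOS.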

In-class exercise #2

• Can you modify the message-colors used in our ‘try64bit.s’ demo-program so that:
  – the first message is bright-red against white
  – the second message is brown against cyan
  – the final message is magenta against black?