input/output organization (chapter 4)
Post on 25-Feb-2016
in1705/07
1
Input/Output Organization (Chapter 4)
http://www.pds.ewi.tudelft.nl/~iosup/Courses/2011_ti1400_8.ppt
TU-DelftTI1400/11-PDS
2
The “Data Deluge”: Trivia
The Petabyte Age: Because More Isn't Just More — More Is Different, Wired, 23 June 2008. http://www.wired.com/science/discoveries/magazine/16-07/pb_intro#
3
The “Data Deluge”: Facts and Predictions
"Everywhere you look, the quantity of information in the world is soaring. According to one estimate, mankind created 150 exabytes (billion gigabytes) of data in 2005. This year, it will create 1,200 exabytes. Merely keeping up with this flood, and storing the bits that might be useful, is difficult enough. Analysing it, to spot patterns and extract useful information, is harder still." The Data Deluge, The Economist, 25 February 2010.
4
“Data Deluge”: The Mobile Example
Battling a Wireless Deluge: AT&T, Other Carriers Use Wi-Fi 'Hotzones' to Siphon Off Smartphone Traffic, Tech Journal, 2 February 2011.
Read more: http://online.wsj.com/article/SB10001424052748704124504576118353354099780.html#ixzz1LEpF4TuA
5
“Data Deluge”: The Personal Memex Example
• Vannevar Bush in the 1940s: record your life
• MIT Media Laboratory: The Human Speechome Project/TotalRecall, data mining/analysis/visio
- Deb Roy and Rupal Patel “record practically every waking moment of their son’s first three years” (20% privacy time… Is this even legal?! Should it be?!)
- 11x1MP/14fps cameras, 14x16b-48KHz mics, 4.4TB RAID + tapes, 10 computers; 200k hours audio-video
- Data size: 200GB/day, 1.5PB total
6
“Data Deluge”: The Gaming Analytics Example
• EQ II: 20TB/year all logs
• Halo3: 1.4PB served statistics on player logs
7
“Data Deluge”: Datasets in Comp.Sci.
Peer-to-Peer Trace Archive … PWA, ITA, CRAWDAD, …
• 1,000s of scientists: From theory to practice
[Figure: dataset size (1GB to 1TB, and 1TB/yr) versus year (‘06, ‘09, ‘10, ‘11), with points labelled P2PTA and GamTA]
The Failure Trace Archive: http://fta.inria.fr
http://gwa.ewi.tudelft.nl
8
The Simplest(?) Problem: How to Access Data by the CPU/Cores?
• Computers must be able to communicate with the outside world
• Large variety of devices
- size
- speed
- distance
• Timing and electrical properties are not the same as within the CPU
9
Single-bus structure
[Figure: processor, memory, and I/O devices #1 … #n all attached to a single bus]
10
Multiple buses
[Figure: processor connected to memory over a memory bus, and to I/O devices #1 … #n over a separate I/O bus]
11
Buses and interfaces
A bus generally contains three groups of lines:
• Data lines to transport data
• Address lines to identify devices
• Control lines that take care of correct transfer of data
12
Interfaces
Devices are coupled to the bus through an interface:
• Address decoder
- to detect whether data on the bus is meant for this device
• Data registers
- to store incoming and outgoing data
• Status and control registers
- to report the status of the device
- to control the transfer
13
Interface organization
[Figure: the I/O interface sits between the bus (address lines, data lines, control lines) and the device; it contains an address decoder, data and status registers, and control circuits]
14
Video terminal
[Figure: CPU connected to a video terminal; the keyboard feeds the DATAIN register (status flag SIN) and the DATAOUT register (status flag SOUT) feeds the display]
15
Operation (1)
Busy waiting:
READWAIT   Branch to READWAIT if SIN=0
           Input from DATAIN to R1
WRITEWAIT  Branch to WRITEWAIT if SOUT=0
           Output from R1 to DATAOUT
I/O-instructions:
           Move DATAIN, R1
           Move R1, DATAOUT
16
Operation (2)
[Figure: IOSTATUS register holding the SIN and SOUT flags (bit positions 0 to 2 labelled), alongside the DATAIN and DATAOUT registers]
READWAIT   Testbit #1, IOSTATUS
           Branch=0 READWAIT
           Move DATAIN, R1
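The busy-waiting loop can be tried out without hardware. The following is a minimal Python sketch, assuming a simulated keyboard interface in which SIN is bit 1 of IOSTATUS (as in the Testbit #1 example) and reading DATAIN clears SIN. `ToyKeyboard`, `read_char`, and the `delay` parameter are hypothetical names introduced only for this illustration.

```python
from collections import deque

SIN_BIT = 1  # assumed position of SIN, matching "Testbit #1, IOSTATUS"


class ToyKeyboard:
    """Simulated interface: a character becomes ready after `delay` status reads."""

    def __init__(self, text, delay=3):
        self._pending = deque(text)
        self._delay = delay
        self._countdown = delay
        self.iostatus = 0
        self._datain = 0

    def read_status(self):
        # Every status read advances the simulated device by one time step.
        if not (self.iostatus >> SIN_BIT) & 1 and self._pending:
            self._countdown -= 1
            if self._countdown == 0:
                self._datain = ord(self._pending.popleft())
                self.iostatus |= 1 << SIN_BIT      # device sets SIN
                self._countdown = self._delay
        return self.iostatus

    def read_datain(self):
        self.iostatus &= ~(1 << SIN_BIT)           # reading DATAIN clears SIN
        return self._datain


def read_char(dev):
    """READWAIT loop: spin until SIN=1, then move DATAIN into the result."""
    wasted_polls = 0
    while not (dev.read_status() >> SIN_BIT) & 1:
        wasted_polls += 1
    return chr(dev.read_datain()), wasted_polls
```

The second return value counts the status reads that found SIN=0, which is the CPU time lost to busy waiting in this model.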
17
I/O Instructions
• Memory-mapped I/O
- the registers of the devices have addresses in the same space as main memory locations
- normal instructions can be used
  • move DATAIN, R1
• I/O instructions
- special instructions for I/O
  • IN device, data
  • OUT data, device
18
Memory and register structure
[Figure: CPU, Memory, and the register sets of I/O processors IOPROC1, IOPROC2, …]
19
Address spaces
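[Figure: two CPUs compared. Memory mapped: Mem, IOPROC1, and IOPROC2 share one address range (addresses 0 to 6 in the example). Separate address spaces: Mem, IOPROC1, and IOPROC2 each have their own addresses starting at 0 (0 to 2 in the example)]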
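The difference between the two addressing schemes is easiest to see in the address decoding. This is a small Python sketch under assumed sizes and base addresses; the `Region` class, both decode functions, and the layout are illustrative only and are not taken from the slides.

```python
class Region:
    """A named block of consecutive addresses (memory or a device's registers)."""

    def __init__(self, name, size):
        self.name = name
        self.cells = [0] * size


def decode_memory_mapped(addr, layout):
    """layout: list of (base_address, Region) that all share ONE address space."""
    for base, region in layout:
        if base <= addr < base + len(region.cells):
            return region.name, addr - base
    raise ValueError(f"address {addr} is not mapped")


def decode_separate(space, addr, spaces):
    """spaces: dict name -> Region; every space starts at address 0.

    The `space` argument plays the role of the special IN/OUT instruction
    (or a memory-vs-I/O control line) that selects the address space.
    """
    region = spaces[space]
    if not 0 <= addr < len(region.cells):
        raise ValueError(f"address {addr} is outside {space}")
    return region.name, addr


mem, io1, io2 = Region("Mem", 3), Region("IOPROC1", 2), Region("IOPROC2", 2)
shared = [(0, mem), (3, io1), (5, io2)]     # assumed layout, addresses 0..6
separate = {"Mem": mem, "IOPROC1": io1, "IOPROC2": io2}
```

With the shared layout, an ordinary load from address 4 reaches register 1 of IOPROC1; with separate spaces the same register is address 1 in the IOPROC1 space.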
20
I/O and Programming
There are two basic mechanisms for I/O:
1. Programmed I/O
2. Non-programmed I/O
21
Programmed I/O
• Performed by executing a special program on the CPU
• Unconditional I/O
- no synchronization with I/O device
• Passive signaling
- synchronization between CPU and device by programmed interrogation (polling) by the CPU
• Active signaling
- synchronization between CPU and device by an active interrupt from the device
22
Non-programmed I/O
I/O is done by a separate active entity
• Direct Memory Access (DMA)
- some intelligence in the device takes care of data transport
• Special I/O processors
23
Interrupts
[Figure: a Compute routine (instructions 1 … M) is interrupted between instructions i and i+1; control jumps to the Print routine and then returns]
24
Service Routines
• The I/O device alerts the CPU by a hardware signal called the interrupt signal
• Usually a special line in the control group of the I/O bus is used for this: the interrupt request line
• The CPU stops the running program and starts executing a service routine
• Much like executing a subroutine
• Except: the interrupted program and the service routine have nothing in common!
25
Handling interrupts
1. Device raises interrupt request
2. Processor interrupts program in execution
3. Interrupts are disabled
4. Device is informed of acceptance and, as a consequence, lowers interrupt
5. Interrupt is handled by service routine
6. Interrupts are enabled
7. Execution of interrupted program is resumed
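The seven steps can be followed in a toy model. The sketch below is an illustration only: `ToyCPU`, `ToyDevice`, and the single-step `step` method are hypothetical, and the "program" is just a counter standing in for the PC.

```python
class ToyDevice:
    def __init__(self):
        self.irq = False                     # step 1 sets this to True


class ToyCPU:
    def __init__(self):
        self.interrupts_enabled = True
        self.pc = 0

    def step(self, device, service_routine):
        """Check for an interrupt, then execute one main-program instruction."""
        if device.irq and self.interrupts_enabled:
            saved_pc = self.pc               # 2. interrupt the program in execution
            self.interrupts_enabled = False  # 3. interrupts are disabled
            device.irq = False               # 4. acceptance: device lowers its request
            service_routine(self)            # 5. service routine handles the interrupt
            self.interrupts_enabled = True   # 6. interrupts are enabled
            self.pc = saved_pc               # 7. resume the interrupted program
        self.pc += 1                         # next instruction of the main program


cpu, dev, observed = ToyCPU(), ToyDevice(), []
cpu.step(dev, lambda c: observed.append(c.interrupts_enabled))
dev.irq = True                               # 1. device raises interrupt request
cpu.step(dev, lambda c: observed.append(c.interrupts_enabled))
cpu.step(dev, lambda c: observed.append(c.interrupts_enabled))
```

The service routine runs exactly once and sees interrupts disabled, while the main program still advances one instruction per step.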
26
Multiple devices
• How can the processor distinguish devices?
• How can the processor obtain the starting address of the appropriate service routine?
• Should we allow a new interrupt while another is being served?
• How do we handle simultaneous interrupts?
27
Interrupt line
[Figure: devices INT1, INT2, …, INTn share a single interrupt request line to the CPU]
INTR = INT1 + INT2 + .... + INTn
Finding the device by POLLING:
- search for the device with the IRQ bit set in its status register
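The shared request line and the polling search can be sketched in a few lines of Python. The position of the IRQ bit and the device names are assumptions made for the example; the polling order also acts as an implicit priority.

```python
IRQ_BIT = 0x01  # assumed position of the IRQ bit in each status register


def interrupt_request(status_registers):
    """INTR = INT1 + INT2 + ... + INTn (logical OR over all devices)."""
    return any(status & IRQ_BIT for status in status_registers.values())


def poll(status_registers):
    """Return the first device, in polling order, whose IRQ bit is set."""
    for name, status in status_registers.items():
        if status & IRQ_BIT:
            return name
    return None


# Hypothetical status registers; dict order is the polling order.
devices = {"keyboard": 0x00, "disk": 0x01, "printer": 0x01}
```

Here `interrupt_request(devices)` is true and `poll(devices)` returns `"disk"`, because the disk is polled before the printer.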
28
Vectored Interrupt
• Device sends an identification code on the bus
• This code is called the interrupt vector
• Issued after the GRANT signal from the CPU
[Figure: devices INT1, INT2, …, INTn share the interrupt request line to the CPU; the CPU answers with a grant signal]
29
Interrupt priority
[Figure: CPU with a priority circuit; devices INT1, INT2, …, INTn have individual grant lines grant1, grant2, grant3, …]
30
Bus arbitration (1)
[Figure: CPU with a grant line to the devices]
interrupt request line (req_i)
bus release line (rel_i)
bus is free iff: (rel_1 • rel_2 • ..... • rel_n) = 1
31
Bus arbitration (2)
• Request: set req_i to 1
• Acquire: if grant=1, then set req_i to 0 (interrupt once) and set rel_i to 0 (prevent others from interrupting)
• Release: set rel_i to 1
grant = (req_1 + req_2 + ..... + req_n) • (rel_1 • rel_2 • ..... • rel_n)
- first factor: at least one request
- second factor: bus released by all
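The request/acquire/release protocol and the grant equation translate directly into code. This is a minimal sketch with a hypothetical `ToyBus` class; it assumes devices call `acquire` one at a time, so it does not model how simultaneous acquirers would be separated.

```python
class ToyBus:
    def __init__(self, n_devices):
        self.req = [0] * n_devices   # req_i: device i wants the bus
        self.rel = [1] * n_devices   # rel_i: device i has released the bus

    def is_free(self):
        # bus is free iff rel_1 AND rel_2 AND ... AND rel_n
        return all(self.rel)

    def grant(self):
        # grant = (req_1 OR ... OR req_n) AND (rel_1 AND ... AND rel_n)
        return int(any(self.req) and self.is_free())

    def request(self, i):
        self.req[i] = 1

    def acquire(self, i):
        """Device i takes the bus if it requested it and grant is 1."""
        if self.req[i] and self.grant():
            self.req[i] = 0          # interrupt once
            self.rel[i] = 0          # prevent others from acquiring
            return True
        return False

    def release(self, i):
        self.rel[i] = 1
```

For example, with devices 0 and 2 both requesting, device 0 can acquire the bus; grant then stays 0 until device 0 releases, after which device 2's pending request raises grant again.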
32
PowerPC interrupt structure (1)
MSR = Machine State Register (bits 0 to 31)
• Bit 16: EE = External interrupt enable
• Bit 17: PR = Privilege level
• Bit 21: SE = Single step trace exception enable
• Bit 25: EP = Exception prefix
- EP=0: service routine starts at address 000001F4
- EP=1: service routine starts at address FFF001F4
33
PowerPC interrupt structure (2)
• PowerPC has two special Save/Restore registers: SRR0 and SRR1
• After interrupt:
- PC is saved in SRR0
- MSR is saved in SRR1
- the interrupt enable bit in MSR is cleared
34
IA-32 interrupt structure (1)
Processor status register (EFLAGS)
• CF, ZF, SF, OF: condition code flags
• TF: trap flag
• IF: Interrupt Enable Flag
• IOPL: I/O Privilege Level (4 levels)
• IA-32 has two interrupt request lines
Bit positions (bits 31 … 0): IOPL = 13-12, OF = 11, IF = 9, TF = 8, SF = 7, ZF = 6, CF = 0
35
IA-32 interrupt structure (2)
• Steps when an interrupt occurs:
1. push processor status register, current segment register (CS), and instruction pointer (EIP) onto the stack
2. clear interrupt-enable flag if needed
3. fetch starting address of interrupt-service routine from Interrupt Descriptor Table and load it into EIP
• At end of routine, execute IRET
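The entry and exit sequence can be mimicked with a stack and a dictionary. The sketch below is a deliberate simplification: the Interrupt Descriptor Table is modelled as a plain mapping from vector number to handler address, the interrupt-enable flag is always cleared, and the initial register values are arbitrary. `ToyIA32` is a hypothetical name.

```python
IF_FLAG = 1 << 9   # IF is bit 9 of EFLAGS


class ToyIA32:
    def __init__(self, idt):
        self.eflags = IF_FLAG      # interrupts enabled
        self.cs = 0x08             # arbitrary example values
        self.eip = 0x1000
        self.stack = []
        self.idt = idt             # toy IDT: vector -> handler start address

    def interrupt(self, vector):
        # 1. push processor status register, CS, and EIP
        self.stack += [self.eflags, self.cs, self.eip]
        # 2. clear the interrupt-enable flag (always done in this sketch)
        self.eflags &= ~IF_FLAG
        # 3. fetch the service routine's start address and load it into EIP
        self.eip = self.idt[vector]

    def iret(self):
        # IRET pops in the reverse order of the pushes
        self.eip = self.stack.pop()
        self.cs = self.stack.pop()
        self.eflags = self.stack.pop()


cpu = ToyIA32(idt={33: 0x2000})    # vector 33 is a made-up example
```

Calling `cpu.interrupt(33)` followed by `cpu.iret()` leaves EIP, CS, and EFLAGS exactly as they were before the interrupt, which is why IRET also re-enables interrupts without a separate instruction.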
36
Example
[Figure: keyboard interface with a STATUS register (SIN and SOUT flags; interrupt-enable bit IE at bit 6), a DATAIN register, and an interrupt line]
37
Memory Layout
[Figure: memory map with 32 K I/O space (STATUS, DATAIN) and 32 K program space; location 1F4 holds the address of READ; LINE marks the buffer area; READ marks the service routine]
38
PowerPC: Initialization
INTVEC  EQU  $1F4    Interrupt vector address (location where start address of interrupt routine is stored)
INTEN   EQU  $40     Keyboard interrupt enable
INTDIS  EQU  0       and disable masks (will be stored in status register of device)
NEWMSR  EQU  $8000   Desired contents of MSR (external interrupt enable)
RTRN    EQU  $0D     Code Carriage Return (for checking end-of-line)
39
PowerPC: Interrupt Processing (1)
START  ADDI  R2,0,READ      Get address of service
       STW   R2,INTVEC(0)   routine and store at interrupt vector location
       ADDI  R2,0,LINE      Get address of LINE
       STW   R2,PNTR(0)     and store at PNTR
       ADDI  R2,0,INTEN     Store interrupt enable
       STW   R2,STATUS(0)   in STATUS register
40
PowerPC: Interrupt Processing (3)
       ADDI    R2,0,NEWMSR  Store new MSR
       MTSRR1  R2           in SRR1
       ADDI    R2,0,MAIN    Store new PC
       MTSRR0  R2           in SRR0
       RFI                  Return From Interrupt (use new MSR and PC)
41
PowerPC: Program (1)
MAIN   <main program>
       .....
READ   .....                Save registers
       LBZ   R30,DATAIN(0)  Get input character
       LWZ   R31,PNTR(0)    Load value at PNTR
       STBU  R30,1(R31)     Store character in buffer
       STW   R31,PNTR(0)    Update PNTR for next character
42
PowerPC: Program (2)
       CMPWI  CR1,R30,RTRN  Check for CR (end of
       BNE    CR1,DONE      line)
       ADDI   R2,0,INTDIS   Store interrupt disable
       STW    R2,STATUS(0)  in STATUS register
       BL     TEXT          Call subroutine for dealing with line
DONE   ....                 Restore saved registers
       RFI                  Return from interrupt
43
IA-32: Program (1)
MAIN:  MOV   EOL,0            not yet end of line
       MOV   BL,4             set keyboard
       OR    CONTROL,BL       interrupt enable
       STI                    set interrupt flag in processor register
READ:  PUSH  EAX              save registers
       PUSH  EBX
       MOV   EAX,PNTR         load address pntr
       MOV   BL,DATAIN        get input,
       MOV   [EAX],BL         store it,
       INC   DWORD PTR PNTR   and increment pntr
44
IA-32: Program (2)
       CMP   BL,0DH           char=end of line?
       JNE   RTRN             no
       MOV   BL,4             yes
       XOR   CONTROL,BL       so disable interrupts
       MOV   EOL,1            and set EOL flag
RTRN:  POP   EBX              restore registers
       POP   EAX
       IRET                   return from interrupt
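The logic of the READ service routine is the same in the PowerPC and IA-32 versions: store the character, advance the pointer, and on carriage return disable keyboard interrupts and flag end of line. The following Python sketch restates that logic under assumptions: `KeyboardState` and `read_isr` are hypothetical names, the buffer is a `bytearray` instead of a raw pointer, and the enable bit is cleared with AND-NOT where the IA-32 listing toggles it with XOR.

```python
CR = 0x0D    # carriage return, the end-of-line code used in the listings
KEN = 0x04   # keyboard interrupt-enable bit, as in "MOV BL,4 / OR CONTROL,BL"


class KeyboardState:
    def __init__(self):
        self.line = bytearray()   # buffer area; its length plays the role of PNTR
        self.control = KEN        # interrupts enabled by the main program
        self.eol = False          # EOL flag, cleared by the main program


def read_isr(state, datain):
    """Body of the service routine, called once per keyboard interrupt."""
    state.line.append(datain)     # store the character and advance the pointer
    if datain == CR:              # end of line?
        state.control &= ~KEN     # disable further keyboard interrupts
        state.eol = True          # tell the main program a line is ready


state = KeyboardState()
for ch in b"ok\r":                # three simulated keyboard interrupts
    if state.control & KEN:
        read_isr(state, ch)
```

After the loop, `state.eol` is true and the buffer holds the two characters followed by the carriage return.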
45
Other interrupts
• Not only I/O devices can cause interrupts
• Recovery from errors, e.g.:
- illegal OP code used
- division by 0
• Debugging
• Privilege exception
46
Operating Systems (1)
• In general, interrupts are controlled by the Operating System
• CPU can be in user mode or supervisor mode
• Privileged instructions only allowed in supervisor mode
- starting of I/O operations
- setting of priorities
- setting of clock values
47
Operating Systems (2)
• Process: program in execution
- Program
- Data
- Status: PC, Registers, etc
• State of a process:
- Running
- Runnable (waiting for CPU)
- Blocked (waiting for something else)
• Multi-tasking
- Multiple tasks in execution
• Time-slicing
- Divide time across processes
48
Operating Systems (3)
• Context switch: change of processes
• After clock interrupt: dispatcher chooses suitable process
• Device drivers: service routines for devices
• System Call: call to OS service routine
- printf ("%d\n",a)
- fscanf (file,"%d\n",&a)
49
OS init, services, scheduler
OSINIT      Set interrupt vectors (set addresses for dealing with these interrupts):
              Time slice clock <- SCHEDULER
              Trap <- OSSERVICES
              VDT interrupts <- IODATA
              ...
OSSERVICES  Examine stack to determine request
            Call appropriate routine
SCHEDULER   (context switch)
            Save current context
            Select runnable process
            Restore saved context of new process
            Return from interrupt
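The SCHEDULER steps (save, select, restore) can be sketched as a round-robin dispatcher. This is an illustrative Python model, not the routine from the slides: `Process`, `Scheduler`, and the dictionary used as a "context" are hypothetical, and round-robin is only one possible selection policy.

```python
from collections import deque


class Process:
    def __init__(self, name):
        self.name = name
        self.state = "runnable"       # running | runnable | blocked
        self.context = {"pc": 0}      # stand-in for PC, registers, etc.


class Scheduler:
    def __init__(self, processes):
        self.ready = deque(processes)
        self.current = None

    def on_clock_interrupt(self, cpu_context):
        """Called on each time-slice interrupt; returns the context to load."""
        # Save current context.
        if self.current is not None:
            self.current.context = dict(cpu_context)
            if self.current.state == "running":
                self.current.state = "runnable"
                self.ready.append(self.current)
        # Select runnable process (round-robin).
        if not self.ready:
            self.current = None
            return None
        self.current = self.ready.popleft()
        self.current.state = "running"
        # Restore saved context of new process.
        return dict(self.current.context)
```

A process that has been marked blocked (for example by IOINIT) is not put back on the ready queue, so it is skipped until something sets it to runnable and re-queues it.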
50
I/O routines
IOINIT   Set process status to blocked
         Initialize memory buffers
         Call device driver to initialize device (e.g., VDT)
         Return from subroutine
IODATA   Poll devices to determine source of interrupt (e.g., VDT)
         Call appropriate driver
         if END=1 then set process to runnable
         Return from interrupt
51
VDT driver (e.g., Keyboard)
VDTINIT  Initialize device interface (e.g., baud rate)
         Enable interrupts
         Return from subroutine
VDTDATA  Check device status
         If ready then transfer character
         If character = CR (check end-of-line) then set END=1; Disable interrupts
         else set END=0
         Return from subroutine
52
Direct Memory Access
[Figure: DMA controller registers (bits 31 … 0): Status & Control (R/W, Done, IE, and IRQ flags), Word count, and Start address. The DMA controller is a more “intelligent” device interface that raises an interrupt request]
53
Direct Memory Access to Physical Devices
[Figure: processor and memory on the system bus, together with a DMA/disk controller (Disk1, Disk2) and a DMA controller serving the network interface; labels include DMA Ch. 1 and DMA Channel 2]
Bus priority modes
• Cycle stealing: DMA > CPU
• Burst: DMA exclusive
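A DMA controller's registers (start address, word count, Done, IE, IRQ) suggest a simple model. The sketch below is a toy Python version in which each call to `transfer_word` represents one bus cycle taken from the CPU (cycle stealing); the class and method names are hypothetical and the register bit positions are not modelled.

```python
class ToyDMAController:
    def __init__(self, memory):
        self.memory = memory          # shared main memory (a Python list)
        self.start_address = 0
        self.word_count = 0
        self.done = False
        self.ie = False               # interrupt enable
        self.irq = False              # interrupt request to the CPU

    def start(self, start_address, word_count, interrupt_enable=True):
        """CPU programs the controller, then carries on with other work."""
        self.start_address = start_address
        self.word_count = word_count
        self.ie = interrupt_enable
        self.done = False
        self.irq = False

    def transfer_word(self, word):
        """One stolen bus cycle: write a device word straight into memory."""
        if self.word_count == 0:
            return
        self.memory[self.start_address] = word
        self.start_address += 1
        self.word_count -= 1
        if self.word_count == 0:
            self.done = True
            self.irq = self.ie        # interrupt only when the whole block is done


memory = [0] * 8
dma = ToyDMAController(memory)
dma.start(start_address=2, word_count=3)
for word in (10, 20, 30):             # words arriving from a simulated disk
    dma.transfer_word(word)
```

The CPU is interrupted once per block rather than once per word, which is the main saving over interrupt-driven programmed I/O.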
54
Cell/B.E.: A Modern DMA Use
• 1 x PPE 64-bit PowerPC
- L1: 32 KB I$ + 32 KB D$
- L2: 512 KB
• 8 x SPE cores:
- Local mem (LS): 256 KB
- 128 x 128 bit vector registers
• Peak performance
- ~200 GFLOPS for all SPEs
- ~240 GFLOPS total
• Main memory access:
- PPE: Rd/Wr
- SPEs: Async DMA
55
Bus structures
Specification of bus
• Number of data lines
• Size of address space
• Multiplexing discipline
• Control structure
• Synchronous versus asynchronous
• Physical properties: connectors, pinning, electrical properties
56
NVIDIA G80/GT200/Fermi: I/O as Performance Bottleneck
[Figure: G80 and GT200 block diagrams]
• SM = streaming multiprocessor
• 1 SM = 8 SP (streaming proc/CUDA cores)
• 1 TPC = 2 x SM / 3 x SM = thread processing cluster
Per chip: 1+ TFLOPS
I/O: 2.5GB/s (1:400)
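The 1:400 figure follows from the two numbers on the slide if it is read as bytes of I/O per floating-point operation; that reading is an assumption. A short check, taking "1+ TFLOPS" as exactly 10^12 operations per second and 1 GB as 10^9 bytes:

```python
peak_flops = 1e12            # "1+ TFLOPS" per chip, taken as exactly 1 TFLOPS
io_bytes_per_second = 2.5e9  # "2.5GB/s", with 1 GB = 10**9 bytes (assumption)

# Floating-point operations the chip can execute per byte moved over I/O.
flops_per_byte = peak_flops / io_bytes_per_second
```

Under these assumptions the chip must perform about 400 operations on every byte it receives to stay busy, so kernels that do less work per byte are limited by I/O rather than by compute.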
57
Synchronous Bus
[Figure: timing diagram showing Bus clock, Address, and Data]
Clock “slow enough” for all connected devices
58
Asynchronous Bus (1)
[Figure: timing diagram showing Address, Data (from device), Ready (set by CPU), and Accept (set by device)]
Explicit handshaking: Input Cycle
(to allow for skew)
59
Asynchronous Bus (2)
[Figure: timing diagram showing Address, Data (from CPU), Ready, and Accept]
Explicit handshaking: Output Cycle
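The input cycle is a four-step handshake in which each side reacts only to the other side's signal, so no common clock is needed. The Python sketch below walks through one cycle in order; the function name, the dictionary standing in for the bus lines, and the trace strings are all made up for the illustration, and real signal delays and skew are not modelled.

```python
def input_cycle(address, device_registers):
    """Simulate one asynchronous input (read) cycle; return (data, trace)."""
    bus = {"address": None, "data": None, "ready": 0, "accept": 0}
    trace = []

    # CPU: place the address, then (after allowing for skew) raise Ready.
    bus["address"] = address
    bus["ready"] = 1
    trace.append("ready=1")

    # Device: sees Ready, decodes the address, drives the data, raises Accept.
    if bus["ready"]:
        bus["data"] = device_registers[bus["address"]]
        bus["accept"] = 1
        trace.append("accept=1")

    # CPU: sees Accept, latches the data, removes the address and Ready.
    latched = None
    if bus["accept"]:
        latched = bus["data"]
        bus["address"] = None
        bus["ready"] = 0
        trace.append("ready=0")

    # Device: sees Ready drop, removes the data and Accept; the bus is idle.
    if not bus["ready"]:
        bus["data"] = None
        bus["accept"] = 0
        trace.append("accept=0")

    return latched, trace
```

An output cycle follows the same pattern, except that the CPU drives the data lines together with the address and the device latches the data before raising Accept.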
60
SCSI bus
• Small Computer System Interface (SCSI)
• ANSI standard X3.131
• Up to 25 meters
• 50-wire cable
• Up to 8 (16) devices connected to bus
• A connection has an initiator and a target
• Target controls data transfer
61
SCSI- based Computer System
[Figure: processor, memory, a parallel interface (printer), a serial interface (terminal), and a SCSI controller on the processor bus; the SCSI bus connects the SCSI controller to a disk controller (Disk1, Disk2) and a CD-ROM controller (CD-ROM drive)]
62
SCSI bus signals
• Data: DB(0),..., DB(7)
• Parity: DB(P)
• Phase: BSY, SEL
• Information type: C/D, MSG (control/message)
• Handshake: REQ, ACK
• Direction: I/O
• Other: ATN, RST
• Data lines used for identifying bus controllers
• Signals are active in the low-voltage state
63
Typical sequence
[Figure: timing diagram of -DB2, -DB5, -DB6, -BSY, and -SEL across the bus free, arbitration, and selection phases; device 2 retreats, initiator 6 wins and selects the target]
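In SCSI arbitration each contending device asserts the data line matching its own ID, and the device with the highest ID wins while the others retreat. A minimal Python sketch of that rule, with a hypothetical `arbitrate` function and the data lines modelled as bits of an integer:

```python
def arbitrate(contender_ids):
    """Return (winner, losers) for the devices contending for the bus."""
    if not contender_ids:
        raise ValueError("no device is requesting the bus")
    data_lines = 0
    for dev_id in contender_ids:
        data_lines |= 1 << dev_id          # each contender asserts DB(id)
    winner = data_lines.bit_length() - 1   # highest asserted line wins
    losers = sorted(set(contender_ids) - {winner})
    return winner, losers
```

With devices 2 and 6 contending, as in the sequence above, `arbitrate([2, 6])` gives device 6 as the winner and device 2 as the one that retreats.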