PC Architectural Standards Page
JohnKalpus.com

Slot  Architecture                         CPU           Bits       Slot Color
XT    XT Architecture ('81)                8088          8 bit      Black
ISA   Industry Std. Arch. ('83)            80286         16 bit     Black
MCA   Micro Channel Arch. ('87), IBM       80386         32 bit     Blue
EISA  Enhanced Ind. Std. Arch. ('88)       80386 clones  32 bit     Brown
VLB   VESA Local Bus ('92)                 80486         32 bit     Brown
PCI   Peripheral Computer Interface ('93)  80586         32/64 bit  White

In order to ensure that the peripherals (modems, network interface cards (NICs), sound cards, video cards, etc.) one can purchase for a modern PC today will work with virtually any PC, motherboard manufacturers agree to adhere to several standards in the way the various boards are wired. These standards are called Architectures.

The surface of a computer motherboard is traversed by thousands of tiny embedded copper wires which connect everything on the motherboard to everything else. They're much too small for anyone to determine which architectural model they represent. Instead, the several types of expansion slots into which one can insert the above-mentioned peripherals tell us definitively which architectures are represented on the board.

The first PC IBM designed and marketed in 1981 was called the IBM PC - XT; XT standing for eXtended Technology. This computer was designed around the Intel 8088 processor which itself was able to "digest" 8 bits per clock tick. We call the way an XT computer is wired XT Architecture and the expansion slots one can find on an XT are called XT slots. By virtue of the 8088 processor being an 8-bit processor, the XT slots are also 8-bit slots -- and any peripheral inserted correctly into an XT slot can also be called an 8-bit peripheral.

When IBM introduced their next PC in 1983, they called it the AT, meaning "Advanced Technology." This computer was designed around the Intel 80286 chip -- which was a 16-bit processor. This

Page 3: PC Architectural Standards Page (1)

chip was able to "digest" 16 bits per clock tick, or two characters, more or less (see note below). In order for a new peripheral to send or receive 16 bits per clock tick into the motherboard on these new '286 computers, designers added a small extension to the XT slot and called the whole black slot an ISA slot -- Industry Standard Architecture.

With the advent of the '386 processor introduced in 1987 by Intel, the various competing PC manufacturers -- IBM, the clone makers, etc. -- split ranks and independently designed several different architectures which competed with each other. IBM developed Micro Channel Architecture -- or the MCA slot. Not surprisingly, "Big Blue" used

blue MCA slots on their '386 computers, also called PS/2s or Personal System/2 computers. While MCA architecture was advanced and fast -- 32 bits per clock tick -- IBM asked the clone manufacturers for a licensing fee in order to use MCA architecture, and they all refused. Instead, clone manufacturers decided to design their own architecture around the Intel '386 chip. A year later, in 1988, one could find EISA slots on


these machines. EISA slots communicated with the motherboard bus in 32-bit chunks of information and started to show up on clone computers. Unfortunately, motherboard manufacturers discovered that EISA architecture and its EISA peripheral cards were much more costly to produce than an ISA card was. EISA architecture fell out of favor for the


general PC market and can only be found now in server PCs and higher-end PCs.

In 1989, Intel announced the 80486 chip, or the venerable '486. By this time IBM and the PC clone manufacturers had come up with a new architecture which satisfied them all, and they called it VLB, or VESA Local Bus. VESA, the Video Electronics Standards Association, represents


makers of video cards. They were instrumental in designing the new 32-bit VLB slot -- a simple idea which extends the standard 16-bit ISA slot with another, shorter 16-bit extension about 3 inches long. Any card built to engage all the contacts in the entire VLB slot communicates in 32-bit chunks with the motherboard.


The Pentium chip debuted in 1993 and allowed for 32- and 64-bit communication channels with the motherboard; this new architecture is called PCI, or Peripheral Computer Interface. Designers could have added a 32-bit extension to the VLB slot, but this would have made a PCI card as big as the state of Rhode Island! Instead, a new slot was created -- the off-white and very stately PCI slot.


Nowadays, modern PC motherboards usually exhibit several ISA slots (which also accept XT cards) and several PCI slots as well. ISA slots will start to disappear from motherboards in 1999 -- they've really outlived their usefulness. When purchasing new peripheral cards, look for a PCI version to ensure usability through your subsequent motherboard upgrades. Enjoy!

Note: Bits per clock ticks have been simplified for ease of explanation. For those who demand the utmost accuracy, here's the real poop: For example, suppose you're typing the following on your keyboard -- "The quick brown fox jumped over the lazy dog's tail."

On an 8088 CPU with 8-bit architecture, the processor requires 52 clock ticks × 12 clock cycles/instruction, or a total of 624 clock ticks. On a 16-bit 80286 CPU, this would require 27 clock ticks × 12 cycles/instruction, or 324 clock ticks. A 32-bit 80386 CPU needs 13 clock ticks × 4.5 cycles/instruction, or 58.5 total clock ticks. An 80486 CPU, which digests 32 bits per clock tick, needs 13 clock ticks × 2 cycles/instruction = 26 total clock ticks. Finally, a Pentium or Pentium-type 586 CPU running at 64 bits per clock tick requires only 7 clock ticks × .5 cycles/instruction, or 3.5 total clock ticks to digest the whole sentence.
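The note's arithmetic can be sketched as a short Python function. The names are made up for this example, the cycles-per-instruction figures are the note's own, and this sketch does not round partial ticks up (so the '286 case comes out at 26 rather than the note's 27):

```python
# Illustrative sketch of the note's arithmetic, not a real CPU model.
SENTENCE = "The quick brown fox jumped over the lazy dog's tail."

def total_ticks(bits_per_tick, cycles_per_instruction):
    """Clock ticks to 'digest' the sentence, at 8 bits per character."""
    chars_per_tick = bits_per_tick // 8
    ticks_of_data = len(SENTENCE) / chars_per_tick
    return ticks_of_data * cycles_per_instruction

print(total_ticks(8, 12))   # 8088:  52 ticks x 12 -> 624.0
print(total_ticks(32, 2))   # 80486: 13 ticks x 2  -> 26.0
```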

The reason Pentium chips need only half a clock tick to digest all 64 bits flowing around a Pentium motherboard is that Pentium chips have two instruction pipelines and thus can execute 2 instructions per clock cycle. Pentium chips have two 32-bit data registers -- an arrangement called Superscalar Architecture.

Extended Industry Standard Architecture

From Wikipedia, the free encyclopedia



EISA
Enhanced Industry Standard Architecture

[Figure: Three EISA slots]

Year created: 1988
Created by: "Gang of Nine"
Superseded by: PCI (1993)
Width in bits: 32
Number of devices: 1 per slot
Capacity: 8.33 MHz
Style: Parallel
Hotplugging interface: No
External interface: No

[Figure: A SCSI controller (Adaptec AHA-1740)]
[Figure: ELSA Winner 1000 for ISA and EISA]


The Extended Industry Standard Architecture (in practice almost always shortened to EISA and frequently pronounced "eee-suh") is a bus standard for IBM compatible computers. It was announced in late 1988 by PC clone vendors (the "Gang of Nine") as a counter to IBM's use of its proprietary MicroChannel Architecture (MCA) in its PS/2 series.

EISA extends the AT bus, which the Gang of Nine retroactively renamed to the ISA bus to avoid infringing IBM's trademark on its PC/AT computer, to 32 bits and allows more than one CPU to share the bus. The bus mastering support is also enhanced to provide access to 4 GB of memory. Unlike MCA, EISA can accept older XT and ISA boards — the lines and slots for EISA are a superset of ISA.

EISA was much favoured by manufacturers due to the proprietary nature of MCA, and even IBM produced some machines supporting it. It was somewhat expensive to implement (though not as much as MCA), so it never became particularly popular in desktop PCs. However, it was reasonably successful in the server market, as it was better suited to bandwidth-intensive tasks (such as disk access and networking). Most EISA cards produced were either SCSI or network cards. EISA was also available on some non-IBM compatible machines such as the AlphaServer, HP 9000-D, SGI Indigo2 and MIPS Magnum.

By the time there was a strong market need for a bus of these speeds and capabilities, the VESA Local Bus and later PCI filled this niche and EISA vanished into obscurity.


History

The original IBM PC included five 8-bit slots, running at the system clock speed of 4.77 MHz. The PC/AT, introduced in 1984, had three 8-bit slots and five 16-bit slots, all running at the system clock speed of 6 MHz in the earlier models and 8 MHz in the last version of the computer. The 16-bit slots were a superset of the 8-bit configuration, so most 8-bit cards were able to plug into a 16-bit slot (some cards used a "skirt" design that interfered with the extended portion of the slot) and continue to run in 8-bit mode. One of the key reasons for the success of the IBM PC (and the PC clones that followed it) was the active ecosystem of third-party expansion cards available for the machines. IBM was restricted from patenting the bus, and widely published the bus specifications.

As the PC-clone industry continued to build momentum in the mid- to late-1980s, several problems with the bus began to be apparent. First, because the "AT slot" (as it was known at the time) was not managed by any central standards group, there was nothing to prevent a manufacturer from "pushing" the standard. One of the most common issues was that as PC


clones became more common, PC manufacturers began ratcheting up the processor speed to maintain a competitive advantage. Unfortunately, because the ISA bus was originally locked to the processor clock, this meant that some 286 machines had ISA buses that ran at 10, 12, or even 16 MHz. In fact, the first systems to clock the ISA bus at 8 MHz were the turbo 8088 clones that clocked their processors at 8 MHz. This caused many incompatibility issues, where a true IBM-compatible third-party card (designed for an 8 MHz or 4.77 MHz bus) might not work in a higher-speed system (or, even worse, would work unreliably). Although most PC makers eventually decoupled the slot clock from the system clock, there was still no standards body to "police" the industry.

The AT bus architecture was so well entrenched that no single clone manufacturer had the leverage to create a standardized alternative, and there was no compelling reason for them to cooperate on a new standard. Because of this, when the first 386-based system (the Compaq Deskpro 386) hit the market in 1986, it still supported 16-bit slots. Other 386 PCs followed suit, and the AT (later ISA) bus remained a part of most systems even into the late 1990s. Some of the 386 systems had proprietary 32-bit extensions to the ISA bus.

Meanwhile, IBM began to worry that it was losing control of the industry it had created. In 1987, IBM released the PS/2 line of computers, which included the MCA bus. MCA included numerous enhancements over the 16-bit AT bus, including bus mastering, burst mode, software configurable resources, and 32-bit capabilities. However, in an effort to reassert its dominant role, IBM patented the bus, and placed stringent licensing and royalty policies on its use. A few manufacturers did produce licensed MCA machines (most notably NCR), but overall the industry balked at IBM's restrictions.

In response, a group of PC manufacturers (the "Gang of Nine"), led by Compaq, created a new bus, which was named the Extended (or Enhanced) Industry Standard Architecture, or "EISA" (the Industry Standard Architecture, or "ISA", name replaced the "AT" name commonly used for the 16-bit bus). This provided virtually all of the technical advantages of MCA, while remaining compatible with existing 8-bit and 16-bit cards, and (most enticing to system and card makers) minimal licensing cost.

The first EISA computers to hit the market were the Compaq Deskpro 486 and the SystemPro. The SystemPro, being one of the first PC-style systems designed as a network server, was built from the ground up to take full advantage of the EISA bus. It included such features as multiprocessing, hardware RAID, and bus-mastering network cards.

Ironically, one of the benefits to come out of the EISA standard was a final codification of the standard to which ISA slots and cards should be held (in particular, clock speed was fixed at an industry standard of 8.33 MHz). Thus, even systems which didn't use the EISA bus gained the advantage of having ISA standardized, which contributed to its longevity.

Technical data


Bus width: 32 bit (compatible with 8-bit ISA, 16-bit ISA, 32-bit EISA)
Pins: 98 + 100 inlay
Vcc: +5 V, −5 V, +12 V, −12 V
Clock: 8.33 MHz
Theoretical data rate (32-bit): about 33 MB/s (8.33 MHz × 4 bytes)
Usable data rate (32-bit): about 20 MB/s

Although the EISA bus had a slight performance disadvantage over MCA (bus speed of 8.33 MHz, compared to 10 MHz), EISA contained almost all of the technological benefits that MCA boasted, including bus mastering, burst mode, software configurable resources, and 32-bit data/address buses. These brought EISA nearly to par with MCA from a performance standpoint, and EISA easily defeated MCA in industry support.

EISA replaced the tedious jumper configuration common with ISA cards with software-based configuration. Every EISA system shipped with an EISA configuration utility; this was usually a slightly customized version of the standard utilities written by the EISA chipset makers. The user would boot into this utility, either from floppy disk or on a dedicated hard drive partition. The utility software would detect all EISA cards in the system, and could configure any hardware resources (interrupts, memory ports, etc.) on any EISA card (each EISA card would include a disk with information that described the available options on the card), or on the EISA system motherboard. The user could also enter information about ISA cards in the system, allowing the utility to automatically reconfigure EISA cards to avoid resource conflicts.
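The conflict-avoidance idea can be sketched in a few lines of Python. This is illustrative only (the names and the IRQ choices are invented, and the real EISA utility read per-card option files); it just shows the principle of picking settings for flexible cards around fixed ISA assignments:

```python
# Toy resource-conflict solver, NOT the real EISA configuration utility.
from itertools import product

def assign_irqs(cards, reserved):
    """cards: {name: IRQs the card supports}; reserved: IRQs jumpered on ISA cards."""
    names = list(cards)
    for combo in product(*(cards[n] for n in names)):
        # A valid assignment uses distinct IRQs, none reserved by ISA cards.
        if len(set(combo)) == len(combo) and reserved.isdisjoint(combo):
            return dict(zip(names, combo))
    return None  # no conflict-free assignment exists

# A hypothetical ISA sound card is jumpered to IRQ 5; the EISA cards are flexible.
print(assign_irqs({"scsi": [11, 10], "nic": [10, 5, 11]}, reserved={5}))
# {'scsi': 11, 'nic': 10}
```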

Similarly, Windows 95, with its Plug-and-Play capability, was not able to change the configuration of EISA cards, but it could detect the cards, read their configuration, and reconfigure Plug and Play hardware to avoid resource conflicts. Windows 95 would also automatically attempt to install appropriate drivers for detected EISA cards.

Industry Acceptance


EISA's success was far from guaranteed. Many manufacturers, including those in the "Gang of Nine", researched the possibility of using MCA. For example, Compaq actually produced prototype DeskPro systems using the bus. However, these were never put into production, and when it was clear that MCA had lost, Compaq allowed its MCA license to expire (the license actually cost relatively little; the primary costs associated with MCA, and at which the industry revolted, were royalties to be paid per system shipped).

On the other hand, when it became clear to IBM that Micro Channel was dying, IBM actually licensed EISA for use in a few server systems. As a final jab at their competitor, Compaq (leader of the EISA consortium) didn't cash the first check sent by IBM for the EISA license. Instead, the check was framed and put on display in the company museum at Compaq's main campus in Houston, Texas.

See also

Industry Standard Architecture (ISA)
Micro Channel architecture (MCA)
NuBus
VESA Local Bus (VLB)
Peripheral Component Interconnect (PCI)
Accelerated Graphics Port (AGP)
PCI Express (PCIe)
List of device bandwidths
PCI-X
PC/104
CompactPCI
PC Card
Low Pin Count (LPC)
Universal Serial Bus

Conventional PCI

From Wikipedia, the free encyclopedia


Conventional PCI
PCI Local Bus

[Figure: Three 5 V 32-bit PCI expansion slots on a motherboard (PC bracket to left)]

Year created: July 1993
Created by: Intel
Superseded by: PCI Express (2004)
Width in bits: 32 or 64
Capacity: 133 MB/s (32-bit at 33 MHz); 266 MB/s (32-bit at 66 MHz or 64-bit at 33 MHz); 533 MB/s (64-bit at 66 MHz)
Style: Parallel
Hotplugging interface: Optional

[Figure: A typical 32-bit, 5 V-only PCI card, in this case a SCSI adapter from Adaptec]

Conventional PCI (part of the PCI Local Bus standard and often shortened to PCI) is a computer bus for attaching hardware devices in a computer. These devices can take either the form of an integrated circuit fitted onto the motherboard itself, called a planar device in the PCI specification, or an expansion card that fits into a slot. The name PCI is an initialism formed from Peripheral Component Interconnect. The PCI Local Bus is common in modern PCs, where it has displaced ISA and VESA Local Bus as the standard expansion bus, and it also appears in many other computer types. Despite the availability of faster interfaces such as PCI-X and PCI Express, conventional PCI remains a very common interface.

The PCI specification covers the physical size of the bus (including the size and spacing of the circuit board edge electrical contacts), electrical characteristics, bus timing, and protocols. The specification can be purchased from the PCI Special Interest Group (PCI-SIG).


Typical PCI cards used in PCs include: network cards, sound cards, modems, extra ports such as USB or serial, TV tuner cards and disk controllers. Historically video cards were typically PCI devices, but growing bandwidth requirements soon outgrew the capabilities of PCI. PCI video cards remain available for supporting extra monitors and upgrading PCs that do not have any AGP or PCI Express slots.[1]

Many devices traditionally provided on expansion cards are now commonly integrated onto the motherboard itself, meaning that modern PCs often have no cards fitted. However, PCI is still used for certain specialized cards, although many tasks traditionally performed by expansion cards may now be performed equally well by USB devices.


History

Work on PCI began at Intel's Architecture Development Lab circa 1990.

A team of Intel engineers (composed primarily of ADL engineers) defined the architecture and developed a proof of concept chipset and platform (Saturn) partnering with teams in the company's desktop PC systems and core logic product organizations. The original PCI architecture team included, among others, Dave Carson, Norm Rasmussen, Brad Hosler, Ed Solari, Bruce Young, Gary Solomon, Ali Oztaskin, Tom Sakoda, Rich Haslam, Jeff Rabe, and Steve Fischer.

PCI (Peripheral Component Interconnect) was immediately put to use in servers, replacing MCA and EISA as the server expansion bus of choice. In mainstream PCs, PCI was slower to replace VESA Local Bus (VLB), and did not gain significant market penetration until late 1994 in second-generation Pentium PCs. By 1996 VLB was all but extinct, and manufacturers had adopted PCI even for 486 computers.[2] EISA continued to be used alongside PCI through 2000. Apple Computer adopted PCI for professional Power Macintosh computers (replacing NuBus) in mid-1995, and the consumer Performa product line (replacing LC PDS) in mid-1996.

Later revisions of PCI added new features and performance improvements, including a 66 MHz 3.3 V standard and 133 MHz PCI-X, and the adaptation of PCI signaling to other form factors. Both PCI-X 1.0b and PCI-X 2.0 are backward compatible with some PCI standards.

The PCI-SIG introduced the serial PCI Express in 2004. At the same time they rechristened PCI as Conventional PCI. Since then, motherboard manufacturers have included progressively fewer Conventional PCI slots in favor of the new standard.

Auto Configuration

PCI provides separate memory and I/O port address spaces for the x86 processor family, 64 and 32 bits, respectively. Addresses in these address spaces are assigned by software. A third address space, called the PCI Configuration Space, which uses a fixed addressing scheme, allows software to determine the amount of memory and I/O address space needed by each device. Each device can request up to six areas of memory space or I/O port space via its configuration space registers.

In a typical system, the firmware (or operating system) queries all PCI buses at startup time (via PCI Configuration Space) to find out what devices are present and what system resources (memory space, I/O space, interrupt lines, etc.) each needs. It then allocates the resources and tells each device what its allocation is.
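The very first step of that query can be illustrated with a short sketch. PCI Configuration Space begins with a little-endian vendor ID (offset 0) and device ID (offset 2), which software reads to identify each device before allocating resources; the sample bytes below are hypothetical, though 0x8086 is Intel's real vendor ID:

```python
# Parse the first two fields of a PCI Configuration Space header.
import struct

def parse_ids(cfg: bytes):
    """Return (vendor_id, device_id) from the first 4 config-space bytes."""
    vendor_id, device_id = struct.unpack_from("<HH", cfg, 0)
    return vendor_id, device_id

# 16 bytes of a hypothetical device: vendor 0x8086, device 0x1237.
cfg = bytes.fromhex("86803712") + bytes(12)
print(tuple(hex(v) for v in parse_ids(cfg)))  # ('0x8086', '0x1237')
```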


The PCI configuration space also contains a small amount of device type information, which helps an operating system choose device drivers for it, or at least to have a dialogue with a user about the system configuration.

Devices may have an on-board ROM containing executable code for x86 or PA-RISC processors, an Open Firmware driver, or an EFI driver. These are typically necessary for devices used during system startup, before device drivers are loaded by the operating system.

In addition, there are PCI Latency Timers, a mechanism for PCI bus-mastering devices to share the PCI bus fairly. "Fair" in this case means that devices won't use such a large portion of the available PCI bus bandwidth that other devices aren't able to get needed work done. Note that this does not apply to PCI Express.

Each PCI device that can operate in bus-master mode is required to implement a timer, called the Latency Timer, that limits the time that device can hold the PCI bus. The timer starts when the device gains bus ownership, and counts down at the rate of the PCI clock. When the counter reaches zero, the device is required to release the bus. If no other devices are waiting for bus ownership, it may simply grab the bus again and transfer more data.[3]
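One consequence of the countdown rule can be sketched numerically (names and numbers are illustrative): if a master is forced off the bus each time its timer expires and must re-arbitrate, the number of bus grants it needs grows with the transfer size.

```python
# Toy model of the Latency Timer rule, not a cycle-accurate simulation.
import math

def bus_grants_needed(latency_timer_ticks, transfer_ticks):
    """How many times a master must (re)acquire the bus to finish a transfer."""
    return math.ceil(transfer_ticks / latency_timer_ticks)

print(bus_grants_needed(32, 100))  # 4 grants for a 100-tick transfer
print(bus_grants_needed(32, 20))   # 1 grant; the transfer ends before expiry
```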

Interrupts

Devices are required to follow a protocol so that the interrupt lines can be shared. The PCI bus includes four interrupt lines, all of which are available to each device. However, they are not wired in parallel as are the other PCI bus lines. The positions of the interrupt lines rotate between slots, so what appears to one device as the INTA# line is INTB# to the next and INTC# to the one after that. Single-function devices use their INTA# for interrupt signaling, so the device load is spread fairly evenly across the four available interrupt lines. This alleviates a common problem with sharing interrupts.
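The rotation described above can be written as a small lookup. Treating slot 0 as unrotated is an assumption for illustration (real boards vary in where the rotation starts); the modulo-4 rotation itself matches the text:

```python
# Which motherboard interrupt line a card's pin reaches in a given slot.
LINES = ["INTA#", "INTB#", "INTC#", "INTD#"]

def line_at_motherboard(slot, card_pin):
    """Rotate the card's interrupt pin by its slot position, modulo 4."""
    return LINES[(LINES.index(card_pin) + slot) % 4]

print(line_at_motherboard(0, "INTA#"))  # INTA#
print(line_at_motherboard(1, "INTA#"))  # INTB#
print(line_at_motherboard(2, "INTA#"))  # INTC#
```

Because single-function devices all signal on their own INTA#, this rotation is what spreads their load across the four physical lines.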

PCI bridges (between two PCI buses) map the four interrupt traces on each of their sides in varying ways. Some bridges use a fixed mapping, and in others it is configurable. In the general case, software cannot determine which interrupt line a device's INTA# pin is connected to across a bridge. The mapping of PCI interrupt lines onto system interrupt lines, through the PCI host bridge, is similarly implementation-dependent. The result is that it can be impossible to determine how a PCI device's interrupts will appear to software. Platform-specific BIOS code is meant to know this, and set a field in each device's configuration space indicating which IRQ it is connected to, but this process is not reliable.

PCI interrupt lines are level-triggered. This was chosen over edge-triggering in order to gain an advantage when servicing a shared interrupt line, and for robustness: edge triggered interrupts are easy to miss.

Later revisions of the PCI specification add support for message-signaled interrupts. In this system a device signals its need for service by performing a memory write, rather than by asserting a dedicated line. This alleviates the problem of scarcity of interrupt lines. Even if interrupt vectors are still shared, it does not suffer the sharing problems of level-triggered interrupts. It also resolves the routing problem, because the memory write is not unpredictably modified between device and host. Finally, because the message signaling is


in-band, it resolves some synchronization problems that can occur with posted writes and out-of-band interrupt lines.

PCI Express does not have physical interrupt lines at all. It uses message-signaled interrupts exclusively.

Conventional hardware specifications

[Figure: Diagram showing the different key positions for 32-bit and 64-bit PCI cards]

These specifications represent the most common version of PCI used in normal PCs.

33.33 MHz clock with synchronous transfers
Peak transfer rate of 133 MB/s (133 megabytes per second) for 32-bit bus width (33.33 MHz × 32 bits ÷ 8 bits/byte = 133 MB/s)
32-bit bus width
32- or 64-bit memory address space (4 gigabytes or 16 exabytes)
32-bit I/O port space
256-byte (per device) configuration space
5-volt signaling
Reflected-wave switching

The PCI specification also provides options for 3.3 V signaling, 64-bit bus width, and 66 MHz clocking, but these are not commonly encountered outside of PCI-X support on server motherboards.
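The peak-rate arithmetic above generalizes to the wider and faster variants; a quick sanity check (MB = 10^6 bytes, matching the figures quoted earlier in the article):

```python
# Peak PCI transfer rate in MB/s from clock and bus width.
def pci_peak_mb_s(clock_mhz, width_bits):
    return clock_mhz * width_bits / 8

print(int(pci_peak_mb_s(33.33, 32)))  # 133 (conventional PCI)
print(int(pci_peak_mb_s(33.33, 64)))  # 266 (64-bit at 33 MHz)
print(int(pci_peak_mb_s(66.66, 64)))  # 533 (64-bit at 66 MHz)
```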

The PCI bus arbiter performs bus arbitration among multiple masters on the PCI bus. Any number of bus masters can reside on the PCI bus and request it; one pair of request and grant signals is dedicated to each bus master.


Card keying

[Figure: A PCI-X Gigabit Ethernet expansion card; both 5 V and 3.3 V support notches are present]

Typical PCI cards present either one or two key notches, depending on their signaling voltage. Cards requiring 3.3 volts have a notch 56.21 mm from the front of the card (where the external connectors are), while those requiring 5 volts have a notch 104.47 mm from the front of the card. So-called "Universal cards" have both key notches and can accept both signaling voltages.
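The keying rule can be encoded as data (a sketch; the real check is mechanical, and the slot's ridge enforces it physically; this just expresses the compatibility relation):

```python
# Notch distances from the card's front, per signaling voltage (mm).
NOTCH_MM = {"3.3V": 56.21, "5V": 104.47}

def card_fits_slot(card_notches, slot_voltage):
    """A card fits a keyed slot only if it has the matching notch."""
    return slot_voltage in card_notches

print(card_fits_slot({"3.3V", "5V"}, "5V"))  # True: universal card
print(card_fits_slot({"5V"}, "3.3V"))        # False: 5 V-only card
```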

Connector pinout

The PCI connector is defined as having 62 contacts on each side of the edge connector, but two or four of them are replaced by key notches, so a card has 60 or 58 contacts on each side. Pin 1 is closest to the backplate. B and A sides are as follows, looking down into the motherboard connector.[4][5][6]

32-bit PCI connector pinout

Pin Side B Side A Comments

1 −12V TRST# JTAG port pins (optional)

2 TCK +12V

3 Ground TMS

4 TDO TDI

5 +5V +5V

6 +5V INTA# Interrupt lines (open-drain)

7 INTB# INTC#

8 INTD# +5V

9 PRSNT1# Reserved Pulled low to indicate 7.5 or 25 W power required

10 Reserved IOPWR +5V or +3.3V


11 PRSNT2# Reserved Pulled low to indicate 7.5 or 15 W power required

12 Ground Ground Key notch for 3.3V-capable cards

13 Ground Ground

14 Reserved 3.3Vaux Standby power (optional)

15 Ground RST# Bus reset

16 CLK IOPWR 33/66 MHz clock

17 Ground GNT# Bus grant from motherboard to card

18 REQ# Ground Bus request from card to motherboard

19 IOPWR PME# Power management event (optional)

20 AD[31] AD[30] Address/data bus (upper half)

21 AD[29] +3.3V

22 Ground AD[28]

23 AD[27] AD[26]

24 AD[25] Ground

25 +3.3V AD[24]

26 C/BE[3]# IDSEL

27 AD[23] +3.3V

28 Ground AD[22]

29 AD[21] AD[20]

30 AD[19] Ground

31 +3.3V AD[18]

32 AD[17] AD[16]

33 C/BE[2]# +3.3V

34 Ground FRAME# Bus transfer in progress

35 IRDY# Ground Initiator ready

36 +3.3V TRDY# Target ready


37 DEVSEL# Ground Target selected

38 Ground STOP# Target requests halt

39 LOCK# +3.3V Locked transaction

40 PERR# SMBCLK SDONE Parity error; SMBus clock or Snoop done (obsolete)

41 +3.3V SMBDAT SBO# SMBus data or Snoop backoff (obsolete)

42 SERR# Ground System error

43 +3.3V PAR Even parity over AD[31:00] and C/BE[3:0]#

44 C/BE[1]# AD[15] Address/data bus (lower half)

45 AD[14] +3.3V

46 Ground AD[13]

47 AD[12] AD[11]

48 AD[10] Ground

49 M66EN (Ground) AD[09]

50 Ground Ground Key notch for 5V-capable cards

51 Ground Ground

52 AD[08] C/BE[0]# Address/data bus (lower half)

53 AD[07] +3.3V

54 +3.3V AD[06]

55 AD[05] AD[04]

56 AD[03] Ground

57 Ground AD[02]

58 AD[01] AD[00]

59 IOPWR IOPWR

60 ACK64# REQ64# For 64-bit extension; no connect for 32-bit devices.

61 +5V +5V

62 +5V +5V


64-bit PCI extends this by an additional 32 contacts on each side which provide AD[63:32], C/BE[7:4]#, the PAR64 parity signal, and a number of power and ground pins.

Legend

Ground pin Zero volt reference

Power pin Supplies power to the PCI card

Output pin Driven by the PCI card, received by the motherboard

Initiator output Driven by the master/initiator, received by the target

I/O signal May be driven by initiator or target, depending on operation

Target output Driven by the target, received by the initiator/master

Input Driven by the motherboard, received by the PCI card

Open drain May be pulled low and/or sensed by multiple cards

Reserved Not presently used, do not connect

Most lines are connected to each slot in parallel. The exceptions are:

- Each slot has its own REQ# output to, and GNT# input from, the motherboard arbiter.
- Each slot has its own IDSEL line, usually connected to a specific AD line.
- TDO is daisy-chained to the following slot's TDI. Cards without JTAG support must connect TDI to TDO so as not to break the chain.
- PRSNT1# and PRSNT2# for each slot have their own pull-up resistors on the motherboard. The motherboard may (but does not have to) sense these pins to determine the presence of PCI cards and their power requirements.
- REQ64# and ACK64# are individually pulled up on 32-bit-only slots.
- The interrupt lines INTA# through INTD# are connected to all slots in different orders. (INTA# on one slot is INTB# on the next and INTC# on the one after that.)
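The interrupt-line rotation described above can be sketched as a small mapping. The numbering here (0 = INTA# through 3 = INTD#, consecutive slot indices starting at 0) is an illustrative assumption, since actual motherboards are free to wire the rotation differently:

```python
def slot_interrupt(slot: int, card_pin: int) -> int:
    """Motherboard interrupt line (0=INTA# ... 3=INTD#) reached by a
    card's interrupt pin from a given slot, assuming the simple
    one-step-per-slot rotation described above."""
    return (card_pin + slot) % 4
```

So a card asserting its INTA# in slot 0 reaches the motherboard's INTA#, while the same card in slot 1 reaches INTB#, which spreads interrupt load across the four lines.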

Notes:

IOPWR is +3.3V or +5V, depending on the backplane. The slots also have a ridge in one of two places which prevents insertion of cards that do not have the corresponding key notch, indicating support for that voltage standard. Universal cards have both key notches and use IOPWR to determine their I/O signal levels.

The PCI SIG strongly encourages 3.3 V PCI signaling,[5] requiring support for it since standard revision 2.3,[4] but most PC motherboards use the 5 V variant. Thus, while many currently available PCI cards support both, and have two key notches to indicate that, there are still a large number of 5 V-only cards on the market.


The M66EN pin is an additional ground on 5V PCI buses found in most PC motherboards. Cards and motherboards that do not support 66 MHz operation also ground this pin. If all participants support 66 MHz operation, a pull-up resistor on the motherboard raises this signal high and 66 MHz operation is enabled.

At least one of PRSNT1# and PRSNT2# must be grounded by the card. The combination chosen indicates the total power requirements of the card (25 W, 15 W, or 7.5 W).

SBO# and SDONE are signals from a cache controller to the current target. They are not initiator outputs, but are colored that way because they are target inputs.

Physical card dimensions

Full-size card

The original "full-size" PCI card is specified as 107 mm (4.2 inches) high and 312 mm (12.283 inches) deep; the height includes the edge card connector. However, most modern PCI cards are half-length or smaller (see below), and many modern PCs cannot fit a full-size card.

Card backplate

In addition to these dimensions, the physical size and location of a card's backplate are also standardized. The backplate is the part that fastens to the card cage to stabilize the card. It also carries the external connectors, so it usually occupies a window in the computer case, making the connectors accessible from outside. The backplate is fixed to the cage by a 6-32 screw.

The card itself can be a smaller size, but the backplate must still be full-size and properly located so that the card fits in any standard PCI slot.

Half-length extension card (de-facto standard)

This is in fact the practical standard now – the majority of modern PCI cards fit inside this length.

Width: 0.6 inches (15.24 mm) Depth: 6.9 inches (175.26 mm) Height: 4.2 inches (106.68 mm)

Low-profile (half-height) card

The PCI organization has defined a standard for "low-profile" cards, which fit in the following ranges:

Height: 1.42 inches (36.07 mm) to 2.536 inches (64.41 mm) Depth: 4.721 inches (119.91 mm) to 6.6 inches (167.64 mm)

The bracket is also reduced in height, to a standard 3.118 inches (79.2 mm). The smaller bracket will not fit a standard PC case, but will fit in a 2U rack-mount case. Many manufacturers supply both types of bracket (brackets are typically screwed to the card so changing them is not difficult).


These cards may be known by other names such as "slim".


Mini PCI

[Image: Mini PCI Wi-Fi card (Type IIIB)]

Mini PCI was added to PCI version 2.2 for use in laptops; it uses a 32-bit, 33 MHz bus with powered connections (3.3 V only; 5 V is limited to 100 mA) and supports bus mastering and DMA. The standard Mini PCI card is approximately a quarter the size of its full-sized counterpart. As there is less external access to the card than with desktop PCI cards, there are limitations on the functions it may perform.

[Image: MiniPCI-to-PCI converter (Type III)]

[Image: MiniPCI and MiniPCI Express cards in comparison]


Many Mini PCI devices were developed such as Wi-Fi, Fast Ethernet, Bluetooth, modems (often Winmodems), sound cards, cryptographic accelerators, SCSI, IDE–ATA, SATA controllers and combination cards. Mini PCI cards can be used with regular PCI-equipped hardware, using Mini PCI-to-PCI converters. Mini PCI has been superseded by PCI Express Mini Card.

Technical details of Mini PCI

Mini PCI cards have a 2 W maximum power consumption, which also limits the functionality that can be implemented in this form factor. They also are required to support the CLKRUN# PCI signal used to start and stop the PCI clock for power management purposes.

There are three card form factors: Type I, Type II, and Type III. Types I and II use a 100-pin stacking connector, while Type III uses a 124-pin edge connector, i.e. the connector is on the edge of the card, like on a SO-DIMM. The additional 24 pins provide the extra signals required to route I/O back through the system connector (audio, AC-Link, LAN, phone-line interface). Type II cards have mounted RJ11 and RJ45 connectors, and must therefore be located at the edge of the computer or docking station so that those ports can be reached from outside.

Type  Outer edge of host system  Connector          Size (mm)           Comments
IA    No                         100-pin stacking   7.5 × 70 × 45       Large Z dimension (7.5 mm)
IB    No                         100-pin stacking   5.5 × 70 × 45       Smaller Z dimension (5.5 mm)
IIA   Yes                        100-pin stacking   17.44 × 70 × 45     Large Z dimension (17.44 mm)
IIB   Yes                        100-pin stacking   5.5 × 78 × 45       Smaller Z dimension (5.5 mm)
IIIA  No                         124-pin card edge  2.4 × 59.6 × 50.95  Larger Y dimension (50.95 mm)
IIIB  No                         124-pin card edge  2.4 × 59.6 × 44.6   Smaller Y dimension (44.6 mm)

Other physical variations

Typically consumer systems specify "N × PCI slots" without specifying actual dimensions of the space available. In some small-form-factor systems, this may not be sufficient to allow even "half-length" PCI cards to fit. Despite this limitation, these systems are still useful because many modern PCI cards are considerably smaller than half-length.


PCI bus transactions

PCI bus traffic is made of a series of PCI bus transactions. Each transaction is made up of an address phase followed by one or more data phases. The direction of the data phases may be from initiator to target (write transaction) or vice-versa (read transaction), but all of the data phases must be in the same direction. Either party may pause or halt the data phases at any point. (One common example is a low-performance PCI device that does not support burst transactions, and always halts a transaction after the first data phase.)

Any PCI device may initiate a transaction. First, it must request permission from a PCI bus arbiter on the motherboard. The arbiter grants permission to one of the requesting devices. The initiator begins the address phase by broadcasting a 32-bit address plus a 4-bit command code, then waits for a target to respond. All other devices examine this address and one of them responds a few cycles later.

64-bit addressing is done using a two-stage address phase. The initiator broadcasts the low 32 address bits, accompanied by a special "dual address cycle" command code. Devices which do not support 64-bit addressing can simply not respond to that command code. The next cycle, the initiator transmits the high 32 address bits, plus the real command code. The transaction operates identically from that point on. To ensure compatibility with 32-bit PCI devices, it is forbidden to use a dual address cycle if not necessary, i.e. if the high-order address bits are all zero.
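The two-stage rule can be sketched on the initiator side. This is an illustration, not spec text: the helper name is hypothetical, and the 1101 encoding for the dual address cycle is the one given in the command-code list later in this document:

```python
DAC = 0b1101  # Dual Address Cycle command code

def address_phases(addr: int, command: int):
    """Return the (AD, C/BE#) pairs an initiator drives during the
    address phase.  A dual address cycle is used only when the high
    32 address bits are non-zero, as required for compatibility with
    32-bit targets."""
    low, high = addr & 0xFFFFFFFF, addr >> 32
    if high == 0:
        return [(low, command)]           # single address cycle, forbidden to use DAC
    return [(low, DAC), (high, command)]  # DAC + low bits first, then real command + high bits
```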

While the PCI bus transfers 32 bits per data phase, the initiator transmits a 4-bit byte mask indicating which 8-bit bytes are to be considered significant. In particular, a masked write must affect only the desired bytes in the target PCI device.
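For illustration, the active-low C/BE[3:0]# value for a masked transfer could be derived from the starting byte lane and length like this (a sketch; the function name is an assumption, the active-low encoding is as described above):

```python
def byte_enables(offset: int, nbytes: int) -> int:
    """Active-low C/BE[3:0]# value for a transfer touching `nbytes`
    bytes starting at byte lane `offset` (0-3) of the 32-bit word.
    A 0 bit in position n marks byte lane n as significant."""
    assert 0 <= offset <= 3 and 1 <= nbytes and offset + nbytes <= 4
    enabled = ((1 << nbytes) - 1) << offset  # active-high lane mask
    return ~enabled & 0xF                    # invert: PCI byte enables are active-low
```

For example, a full 4-byte transfer yields 0b0000 (all lanes enabled), while a single byte in lane 2 yields 0b1011.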

PCI address spaces

PCI has three address spaces: memory, I/O, and configuration.

Memory addresses are 32 bits (optionally 64 bits) wide, support caching, and can be used in burst transactions.

I/O addresses exist for compatibility with the Intel x86 architecture's I/O port address space. Although the PCI bus specification allows burst transactions in any address space, most devices support them only for memory addresses, not I/O.

Finally, PCI configuration space provides access to 256 bytes of special configuration registers per PCI device. Each PCI slot gets its own configuration space address range. The registers are used to configure each device's memory and I/O address ranges, i.e. the ranges it should respond to as a transaction target. When a computer is first turned on, all PCI devices respond only to their configuration space accesses; the computer's BIOS then scans for devices and assigns memory and I/O address ranges to them.

If an address is not claimed by any device, the transaction initiator's address phase will time out, causing the initiator to abort the operation. For reads, it is customary to supply the all-ones value (0xFFFFFFFF) as the read data in this case. PCI devices therefore generally attempt to avoid using the all-ones value in important status registers, so that such an error can be easily detected by software.
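The text above does not say how a host reaches configuration space in the first place; on PC-compatible systems the conventional route is "configuration mechanism #1" through I/O ports 0xCF8 (CONFIG_ADDRESS) and 0xCFC (CONFIG_DATA). A sketch of the address dword under that mechanism (the layout here is standard x86 background, not from the text above):

```python
def config_address(bus: int, device: int, function: int, register: int) -> int:
    """CONFIG_ADDRESS dword for x86 configuration mechanism #1
    (written to I/O port 0xCF8; the register is then read or written
    at port 0xCFC).  Layout: bit 31 = enable, 23:16 = bus,
    15:11 = device, 10:8 = function, 7:2 = dword register number."""
    assert device < 32 and function < 8 and register < 256 and register % 4 == 0
    return (1 << 31) | (bus << 16) | (device << 11) | (function << 8) | register
```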

PCI command codes

There are 16 possible 4-bit command codes, and 12 of them are assigned. With the exception of the unique dual address cycle, the least significant bit of the command code indicates whether the following data phases are a read (data sent from target to initiator) or a write (data sent from an initiator to target). PCI targets must examine the command code as well as the address and not respond to address phases which specify an unsupported command code.

The commands that refer to cache lines depend on the PCI configuration space cache line size register being set up properly; they may not be used until that has been done.

0000: Interrupt Acknowledge

This is a special form of read cycle implicitly addressed to the interrupt controller, which returns an interrupt vector. The 32-bit address field is ignored. One possible implementation is to generate an interrupt acknowledge cycle on an ISA bus using a PCI/ISA bus bridge. This command is for IBM PC compatibility; if there is no Intel 8259 style interrupt controller on the PCI bus, this cycle need never be used.

0001: Special Cycle

This cycle is a special broadcast write of system events that PCI cards may be interested in. The address field of a special cycle is ignored, but it is followed by a data phase containing a payload message. The currently defined messages announce that the processor is stopping for some reason (e.g. to save power). No device ever responds to this cycle; it is always terminated with a master abort after leaving the data on the bus for at least 4 cycles.

0010: I/O Read

This performs a read from I/O space. All 32 bits of the read address are provided, so that a device can (for compatibility reasons) implement fewer than 4 bytes' worth of I/O registers. If the byte enables request data not within the address range supported by the PCI device (e.g. a 4-byte read from a device which only supports 2 bytes of I/O address space), the transaction must be terminated with a target abort. Multiple data cycles are permitted, using linear (simple incrementing) burst ordering.

The PCI standard discourages the use of I/O space in new devices, preferring that as much as possible be done through main memory mapping.

0011: I/O Write

This performs a write to I/O space.

010x: Reserved

A PCI device must not respond to an address cycle with these command codes.


0110: Memory Read

This performs a read cycle from memory space. Because the smallest memory space a PCI device is permitted to implement is 16 bytes, the two least significant bits of the address are not needed; equivalent information arrives in the form of byte select signals. Instead, these two bits specify the order in which burst data must be returned. If a device does not support the requested order, it must provide the first word and then disconnect.

If a memory space is marked as "prefetchable", then the target device must ignore the byte select signals on a memory read and always return 32 valid bits.

0111: Memory Write

This operates similarly to a memory read. The byte select signals are more important in a write, as unselected bytes must not be written to memory.

Generally, PCI writes are faster than PCI reads, because a device can buffer the incoming write data and release the bus faster. For a read, it must delay the data phase until the data has been fetched.

100x: Reserved

A PCI device must not respond to an address cycle with these command codes.

1010: Configuration Read

This is similar to an I/O read, but reads from PCI configuration space. A device must respond only if the low 11 bits of the address specify a function and register that it implements, and if the special IDSEL signal is asserted. It must ignore the high 21 bits. Burst reads (using linear incrementing) are permitted in PCI configuration space.

Unlike I/O space, standard PCI configuration registers are defined so that reads never disturb the state of the device. It is possible for a device to have configuration space registers beyond the standard 64 bytes which have read side effects, but this is rare.[7]

Configuration space accesses often have a few cycles of delay in order to allow the IDSEL lines to stabilize, which makes them slower than other forms of access. Also, a configuration space access requires a multi-step operation rather than a single machine instruction. Thus, it is best to avoid them during routine operation of a PCI device.

1011: Configuration Write

This operates analogously to a configuration read.

1100: Memory Read Multiple

This command is identical to a generic memory read, but includes the hint that a long read burst will continue beyond the end of the current cache line, and the target should internally prefetch a large amount of data. A target is always permitted to consider this a synonym for a generic memory read.


1101: Dual Address Cycle

When accessing a memory address that requires more than 32 bits to represent, the address phase begins with this command and the low 32 bits of the address, followed by a second cycle with the actual command and the high 32 bits of the address. PCI targets that do not support 64-bit addressing can simply treat this as another reserved command code and not respond to it. This command code can only be used with a non-zero high-order address word; it is forbidden to use this cycle if not necessary.

1110: Memory Read Line

This command is identical to a generic memory read, but includes the hint that the read will continue to the end of the cache line. A target is always permitted to consider this a synonym for a generic memory read.

1111: Memory Write and Invalidate

This command is identical to a generic memory write, but comes with the guarantee that one or more whole cache lines will be written, with all byte selects enabled. This is an optimization for write-back caches snooping the bus. Normally, a write-back cache holding dirty data must interrupt the write operation long enough to write its own dirty data first. If the write is performed using this command, the data to be written back is guaranteed to be irrelevant, and can simply be invalidated in the write-back cache.

This optimization only affects the snooping cache, and makes no difference to the target, which may treat this as a synonym for the memory write command.
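The assigned command codes above can be collected in a table, and the read/write rule from the start of this section (the least significant bit, with the dual address cycle as the sole exception) becomes a one-liner. A sketch; the names are illustrative:

```python
PCI_COMMANDS = {
    0b0000: "Interrupt Acknowledge",
    0b0001: "Special Cycle",
    0b0010: "I/O Read",
    0b0011: "I/O Write",
    0b0110: "Memory Read",
    0b0111: "Memory Write",
    0b1010: "Configuration Read",
    0b1011: "Configuration Write",
    0b1100: "Memory Read Multiple",
    0b1101: "Dual Address Cycle",
    0b1110: "Memory Read Line",
    0b1111: "Memory Write and Invalidate",
}  # codes 010x and 100x are reserved; targets must not respond to them

def is_write(cmd: int) -> bool:
    """LSB of the command code distinguishes write (1) from read (0),
    except for the Dual Address Cycle, which carries no data direction."""
    if cmd == 0b1101:
        raise ValueError("Dual Address Cycle has no read/write direction")
    return bool(cmd & 1)
```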

PCI bus signals

PCI bus transactions are controlled by five main control signals, two driven by the initiator of a transaction (FRAME# and IRDY#), and three driven by the target (DEVSEL#, TRDY#, and STOP#). There are two additional arbitration signals (REQ# and GNT#) which are used to obtain permission to initiate a transaction. All are active-low, meaning that the active or asserted state is a low voltage. Pull-up resistors on the motherboard ensure they will remain high (inactive or deasserted) if not driven by any device, but the PCI bus does not depend on the resistors to change the signal level; all devices drive the signals high for one cycle before ceasing to drive the signals.

Signal timing

All PCI bus signals are sampled on the rising edge of the clock. Signals nominally change on the falling edge of the clock, giving each PCI device approximately one half a clock cycle to decide how to respond to the signals it observed on the rising edge, and one half a clock cycle to transmit its response to the other device.

The PCI bus requires that every time the device driving a PCI bus signal changes, one turnaround cycle must elapse between the time the one device stops driving the signal and the other device starts. Without this, there might be a period when both devices were driving the signal, which would interfere with bus operation.


The combination of this turnaround cycle and the requirement to drive a control line high for one cycle before ceasing to drive it means that each of the main control lines must be high for a minimum of two cycles when changing owners. The PCI bus protocol is designed so this is rarely a limitation; only in a few special cases (notably fast back-to-back transactions) is it necessary to insert additional delay to meet this requirement.

Arbitration

Any device on a PCI bus that is capable of acting as a bus master may initiate a transaction with any other device. To ensure that only one transaction is initiated at a time, each master must first wait for a bus grant signal, GNT#, from an arbiter located on the motherboard. Each device has a separate request line REQ# that requests the bus, but the arbiter may "park" the bus grant signal at any device if there are no current requests.

The arbiter may remove GNT# at any time. A device which loses GNT# may complete its current transaction, but may not start one (by asserting FRAME#) unless it observes GNT# asserted the cycle before it begins.

The arbiter may also provide GNT# at any time, including during another master's transaction. During a transaction, either FRAME# or IRDY# or both are asserted; when both are deasserted, the bus is idle. A device may initiate a transaction at any time that GNT# is asserted and the bus is idle.
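The start condition described in this section reduces to a simple predicate. In this sketch each active-low signal is modeled as a boolean "asserted" flag:

```python
def bus_idle(frame_asserted: bool, irdy_asserted: bool) -> bool:
    """The bus is idle when both FRAME# and IRDY# are deasserted."""
    return not frame_asserted and not irdy_asserted

def may_start(gnt_asserted: bool, frame_asserted: bool, irdy_asserted: bool) -> bool:
    """An initiator may begin a transaction (assert FRAME#) only while
    its GNT# is asserted and the bus is idle."""
    return gnt_asserted and bus_idle(frame_asserted, irdy_asserted)
```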

Address phase

A PCI bus transaction begins with an address phase. The initiator, seeing that it has GNT# and the bus is idle, drives the target address onto the AD[31:0] lines, the associated command (e.g. memory read, or I/O write) on the C/BE[3:0]# lines, and pulls FRAME# low.

Each other device examines the address and command and decides whether to respond as the target by asserting DEVSEL#. A device must respond by asserting DEVSEL# within 3 cycles. Devices which promise to respond within 1 or 2 cycles are said to have "fast DEVSEL" or "medium DEVSEL", respectively. (Actually, the time to respond is 2.5 cycles, since PCI devices must transmit all signals half a cycle early so that they can be received three cycles later.)

Note that a device must latch the address on the first cycle; the initiator is required to remove the address and command from the bus on the following cycle, even before receiving a DEVSEL# response. The additional time is available only for interpreting the address and command after it is captured.

On the fifth cycle of the address phase (or earlier if all other devices have medium DEVSEL or faster), a catch-all "subtractive decoding" is allowed for some address ranges. This is commonly used by an ISA bus bridge for addresses within its range (24 bits for memory and 16 bits for I/O).

On the sixth cycle, if there has been no response, the initiator may abort the transaction by deasserting FRAME#. This is known as master abort termination, and it is customary for PCI bus bridges to return all-ones data (0xFFFFFFFF) in this case. PCI devices therefore are generally designed to avoid using the all-ones value in important status registers, so that such an error can be easily detected by software.

Address phase timing

            0   1   2   3   4   5
           _   _   _   _   _   _
CLK      _/ \_/ \_/ \_/ \_/ \_/ \_/
         ___
GNT#        \___/XXXXXXXXXXXXXXXXXX  (GNT# irrelevant after cycle has started)
         _________
FRAME#            \_________________
                   ___
AD[31:0] ---------<___>-------------  (Address only valid for one cycle.)
                   ___ _____________
C/BE[3:0]#--------<___X_____________  (Command, then first data phase byte enables)
         _____________
DEVSEL#               \___\___\___\___
                      Fast Med Slow Subtractive
           _   _   _   _   _   _
CLK      _/ \_/ \_/ \_/ \_/ \_/ \_/
           0   1   2   3   4   5

On the rising edge of clock 0, the initiator observes FRAME# and IRDY# both high, and GNT# low, so it drives the address, command, and asserts FRAME# in time for the rising edge of clock 1. Targets latch the address and begin decoding it. They may respond with DEVSEL# in time for clock 2 (fast DEVSEL), 3 (medium) or 4 (slow). Subtractive decode devices, seeing no other response by clock 4, may respond on clock 5. If the master does not see a response by clock 5, it will terminate the transaction and remove FRAME# on clock 6.

TRDY# and STOP# are deasserted (high) during the address phase. The initiator may assert IRDY# as soon as it is ready to transfer data, which could theoretically be as soon as clock 2.

Dual-cycle address

To allow 64-bit addressing, a master will present the address over two consecutive cycles. First, it sends the low-order address bits with a special "dual-cycle address" command on the C/BE[3:0]#. On the following cycle, it sends the high-order address bits and the actual command. Dual-address cycles are forbidden if the high-order address bits are zero, so devices which do not support 64-bit addressing can simply not respond to dual cycle commands.

            0   1   2   3   4   5   6
           _   _   _   _   _   _   _
CLK      _/ \_/ \_/ \_/ \_/ \_/ \_/ \_/
         ___
GNT#        \___/XXXXXXXXXXXXXXXXXXXXXX
         _________
FRAME#            \_____________________
                   ___ ___
AD[31:0] ---------<___X___>-------------  (Low, then high bits)
                   ___ ___ _____________
C/BE[3:0]#--------<___X___X_____________  (DAC, then actual command)
         _________________
DEVSEL#                   \___\___\___
                          Fast Med Slow
           _   _   _   _   _   _   _
CLK      _/ \_/ \_/ \_/ \_/ \_/ \_/ \_/
           0   1   2   3   4   5   6

Configuration access

Addresses for PCI configuration space access are decoded specially. For these, the low-order address lines specify the offset of the desired PCI configuration register, and the high-order address lines are ignored. Instead, an additional address signal, the IDSEL input, must be high before a device may assert DEVSEL#. Each slot connects a different high-order address line to the IDSEL pin, and is selected using one-hot encoding on the upper address lines.
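A sketch of the AD pattern during such a configuration address phase. The text above does not say which AD line each slot's IDSEL is wired to; the AD[16 + device] mapping used below is a common convention and should be treated as an assumption here:

```python
def config_phase_ad(device: int, function: int, register: int) -> int:
    """AD[31:0] pattern for a configuration address phase, assuming the
    common convention that slot `device` has IDSEL wired to
    AD[16 + device] (one-hot on the upper address lines).  The low
    bits carry the function (10:8) and dword register offset (7:2)."""
    assert device < 16 and function < 8 and register < 256 and register % 4 == 0
    return (1 << (16 + device)) | (function << 8) | register
```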

Data phases

After the address phase (specifically, beginning with the cycle that DEVSEL# goes low) comes a burst of one or more data phases. In all cases, the initiator drives active-low byte select signals on the C/BE[3:0]# lines, but the data on AD[31:0] may be driven by the initiator (in case of writes) or the target (in case of reads).

During data phases, the C/BE[3:0]# lines are interpreted as active-low byte enables. In case of a write, the asserted signals indicate which of the four bytes on the AD bus are to be written to the addressed location. In the case of a read, they indicate which bytes the initiator is interested in. For reads, it is always legal to ignore the byte enable signals and simply return all 32 bits; cacheable memory resources are required to always return 32 valid bits. The byte enables are mainly useful for I/O space accesses where reads have side effects.

A data phase with all four C/BE# lines deasserted is explicitly permitted by the PCI standard, and must have no effect on the target (other than to advance the address in the burst access in progress).

The data phase continues until both parties are ready to complete the transfer and continue to the next data phase. The initiator asserts IRDY# (initiator ready) when it no longer needs to wait, while the target asserts TRDY# (target ready). Whichever side is providing the data must drive it on the AD bus before asserting its ready signal.

Once one of the participants asserts its ready signal, it may not become un-ready or otherwise alter its control signals until the end of the data phase. The data recipient must latch the AD bus each cycle until it sees both IRDY# and TRDY# asserted, which marks the end of the current data phase and indicates that the just-latched data is the word to be transferred.

To maintain full burst speed, the data sender then has half a clock cycle after seeing both IRDY# and TRDY# asserted to drive the next word onto the AD bus.

            0   1   2   3   4   5   6   7   8   9
           _   _   _   _   _   _   _   _   _   _
CLK      _/ \_/ \_/ \_/ \_/ \_/ \_/ \_/ \_/ \_/
             ___         _______     ___ ___ ___
AD[31:0] ---<___XXXXXXXXX_______XXXXX___X___X___  (If a write)
             ___             _______ ___ ___
AD[31:0] ---<___>~~~<XXXXXXXX___X_______X___X___  (If a read)
             ___ _______________ _______ ___ ___
C/BE[3:0]#--<___X_______________X_______X___X___  (Must always be valid)
         _______________     ___
IRDY#   x                \___/   \______________
         ___________________
TRDY#   x                    \__________________
         ___________
DEVSEL#             \___________________________
         ___
FRAME#      \___________________________________
           _   _   _   _   _  |_   _  |_  |_  |_
CLK      _/ \_/ \_/ \_/ \_/ \_/ \_/ \_/ \_/ \_/
           0   1   2   3   4   5   6   7   8   9

This continues the address cycle illustrated above, assuming a single address cycle with medium DEVSEL, so the target responds in time for clock 3. However, at that time, neither side is ready to transfer data. On clock 4, the initiator is ready but the target is not. On clock 5, both are ready, and a data transfer takes place (as indicated by the vertical lines). On clock 6, the target is ready to transfer but the initiator is not. On clock 7, the initiator becomes ready, and data is transferred. On clocks 8 and 9, both sides remain ready, and data is transferred at the maximum possible rate (32 bits per clock cycle).

In case of a read, clock 2 is reserved for turning around the AD bus, so the target is not permitted to drive data on the bus even if it is capable of fast DEVSEL.
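The IRDY#/TRDY# handshake can be modeled in a few lines: a data phase completes on each clock edge where both signals are asserted. This toy model (a sketch, with True meaning "asserted") reproduces the burst just described, where transfers occur on clocks 5, 7, 8, and 9:

```python
def transfer_clocks(irdy, trdy):
    """Clock numbers on which a data word is transferred: a data phase
    completes on each rising edge where both IRDY# and TRDY# are
    asserted."""
    return [t for t, (i, r) in enumerate(zip(irdy, trdy)) if i and r]

# Per-clock assertion timeline from the example above (clocks 0-9):
irdy = [False, False, False, False, True, True, False, True, True, True]
trdy = [False, False, False, False, False, True, True, True, True, True]
```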

Fast DEVSEL# on reads

A target that supports fast DEVSEL could in theory begin responding to a read the cycle after the address is presented. This cycle is, however, reserved for AD bus turnaround. Thus, a target may not drive the AD bus (and thus may not assert TRDY#) on the second cycle of a transaction. Note that most targets will not be this fast and will not need any special logic to enforce this condition.

Ending transactions

Either side may request that a burst end after the current data phase. Simple PCI devices that do not support multi-word bursts will always request this immediately. Even devices that do support bursts will have some limit on the maximum length they can support, such as the end of their addressable memory.

Initiator burst termination

The initiator can mark any data phase as the final one in a transaction by deasserting FRAME# at the same time as it asserts IRDY#. The cycle after the target asserts TRDY#, the final data transfer is complete, both sides deassert their respective RDY# signals, and the bus is idle again. The master may not deassert FRAME# before asserting IRDY#, nor may it deassert FRAME# while waiting, with IRDY# asserted, for the target to assert TRDY#.

The only minor exception is a master abort termination, when no target responds with DEVSEL#. Obviously, it is pointless to wait for TRDY# in such a case. However, even in this case, the master must assert IRDY# for at least one cycle after deasserting FRAME#. (Commonly, a master will assert IRDY# before receiving DEVSEL#, so it must simply hold IRDY# asserted for one cycle longer.) This is to ensure that bus turnaround timing rules are obeyed on the FRAME# line.


Target burst termination

The target requests the initiator end a burst by asserting STOP#. The initiator will then end the transaction by deasserting FRAME# at the next legal opportunity; if it wishes to transfer more data, it will continue in a separate transaction. There are several ways for the target to do this:

Disconnect with data

If the target asserts STOP# and TRDY# at the same time, this indicates that the target wishes this to be the last data phase. For example, a target that does not support burst transfers will always do this to force single-word PCI transactions. This is the most efficient way for a target to end a burst.

Disconnect without data

If the target asserts STOP# without asserting TRDY#, this indicates that the target wishes to stop without transferring data. STOP# is considered equivalent to TRDY# for the purpose of ending a data phase, but no data is transferred.

Retry

A Disconnect without data before transferring any data is a retry, and unlike other PCI transactions, PCI initiators are required to pause slightly before continuing the operation. See the PCI specification for details.

Target abort

Normally, a target holds DEVSEL# asserted through the last data phase. However, if a target deasserts DEVSEL# before disconnecting without data (asserting STOP#), this indicates a target abort, which is a fatal error condition. The initiator may not retry, and typically treats it as a bus error. Note that a target may not deassert DEVSEL# while waiting with TRDY# or STOP# low; it must do this at the beginning of a data phase.

There will always be at least one more cycle after a target-initiated disconnection, to allow the master to deassert FRAME#. There are two sub-cases, which take the same amount of time, but one requires an additional data phase:

Disconnect-A

If the initiator observes STOP# before asserting its own IRDY#, then it can end the burst by deasserting FRAME# at the end of the current data phase.

Disconnect-B

If the initiator has already asserted IRDY# (without deasserting FRAME#) by the time it observes the target's STOP#, it is already committed to an additional data phase. The target must wait through an additional data phase, holding STOP# asserted without TRDY#, before the transaction can end.


If the initiator ends the burst at the same time as the target requests disconnection, there is no additional bus cycle.
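The target-initiated terminations above can be summarized as a decision on the three signals the target drives during a data phase. A sketch (True models an asserted, i.e. electrically low, signal):

```python
def classify_termination(stop: bool, trdy: bool, devsel: bool) -> str:
    """Classify a data phase from the target-driven signals:
    STOP# with TRDY# = disconnect with data, STOP# alone = disconnect
    without data (a retry if no data was transferred yet), and STOP#
    with DEVSEL# deasserted = fatal target abort."""
    if not stop:
        return "normal data phase" if trdy else "wait state"
    if not devsel:
        return "target abort"
    return "disconnect with data" if trdy else "disconnect without data"
```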

Burst addressing

For memory space accesses, the words in a burst may be accessed in several orders. The unnecessary low-order address bits AD[1:0] are used to convey the initiator's requested order. A target which does not support a particular order must terminate the burst after the first word. Some of these orders depend on the cache line size, which is configurable on all PCI devices.

PCI burst ordering

A[1]  A[0]  Burst order (with 16-byte cache line)
0     0     Linear incrementing  (0x0C, 0x10, 0x14, 0x18, 0x1C, ...)
0     1     Cacheline toggle     (0x0C, 0x08, 0x04, 0x00, 0x1C, 0x18, ...)
1     0     Cacheline wrap       (0x0C, 0x00, 0x04, 0x08, 0x1C, 0x10, ...)
1     1     Reserved (disconnect after first transfer)

If the starting offset within the cache line is zero, all of these modes reduce to the same order.

Cache line toggle and cache line wrap modes are two forms of critical-word-first cache line fetching. Toggle mode XORs the supplied address with an incrementing counter. This is the native order for Intel 486 and Pentium processors. It has the advantage that it is not necessary to know the cache line size to implement it.

PCI version 2.1 obsoleted toggle mode and added the cache line wrap mode,[1] where fetching proceeds linearly, wrapping around at the end of each cache line. When one cache line is completely fetched, fetching jumps to the starting offset in the next cache line.
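The three defined orders can be sketched as address generators. This is a sketch assuming 32-bit words and the 16-byte cache line used in the table above; the function names are ours, not from the specification:

```python
WORD = 4  # bytes transferred per 32-bit data phase

def linear_order(start, n):
    """Linear incrementing: the address advances one word per data phase."""
    return [start + WORD * i for i in range(n)]

def toggle_order(start, n):
    """Cacheline toggle: XOR the start address with an incrementing counter.
    Note no knowledge of the cache line size is needed."""
    return [start ^ (WORD * i) for i in range(n)]

def wrap_order(start, n, line=16):
    """Cacheline wrap: advance linearly, wrap at the end of each cache line,
    then continue at the same offset in the next line."""
    base, off, per_line = start - start % line, start % line, line // WORD
    return [base + (i // per_line) * line + (off + WORD * i) % line
            for i in range(n)]
```

Running `toggle_order(0x0C, 6)` and `wrap_order(0x0C, 6)` reproduces the toggle and wrap rows of the table; with a starting offset of zero, all three generators emit the same sequence, as the text notes.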

Note that most PCI devices only support a limited range of typical cache line sizes; if the cache line size is programmed to an unexpected value, they force single-word access.

PCI also supports burst access to I/O and configuration space, but only linear mode is supported. (This is rarely used, and may be buggy in some devices; they may not support it, but not properly force single-word access either.)

[edit] Transaction examples

This is the highest-possible speed four-word write burst, terminated by the master:

             0_  1_  2_  3_  4_  5_  6_  7_
CLK        _/ \_/ \_/ \_/ \_/ \_/ \_/ \_/ \
              ___ ___ ___ ___ ___
AD[31:0]   ---<___X___X___X___X___>---<___>
              ___ ___ ___ ___ ___
C/BE[3:0]# ---<___X___X___X___X___>---<___>
                  |   |   |   |       ___
IRDY#      ^^^^^^^^\______________/  ^^^^^
                  |   |   |   |       ___
TRDY#      ^^^^^^^^\______________/  ^^^^^
                  |   |   |   |       ___
DEVSEL#    ^^^^^^^^\______________/  ^^^^^
           ___                |       ___
FRAME#        \_______________/   ^^^^\____
             _   _  |_  |_  |_  |_   _   _
CLK        _/ \_/ \_/ \_/ \_/ \_/ \_/ \_/ \
             0   1   2   3   4   5   6   7

On clock edge 1, the initiator starts a transaction by driving an address, a command, and asserting FRAME#. The other signals are idle (indicated by ^^^), pulled high by the motherboard's pull-up resistors; clock 0 may be their turnaround cycle. On clock edge 2, the target asserts both DEVSEL# and TRDY#. As the initiator is also ready, a data transfer occurs. This repeats for three more cycles, but before the last one (clock edge 5), the master deasserts FRAME#, indicating that this is the end. On clock edge 6, the AD bus and FRAME# are undriven (turnaround cycle) and the other control lines are driven high for one cycle. On clock edge 7, another initiator can start a different transaction. This is also the turnaround cycle for the other control lines.

The equivalent read burst takes one more cycle, because the target must wait 1 cycle for the AD bus to turn around before it may assert TRDY#:

             0_  1_  2_  3_  4_  5_  6_  7_  8_
CLK        _/ \_/ \_/ \_/ \_/ \_/ \_/ \_/ \_/ \
              ___         ___ ___ ___ ___
AD[31:0]   ---<___>---<___X___X___X___>---<___>
              ___ _______ ___ ___ ___
C/BE[3:0]# ---<___X_______X___X___X___>---<___>
                          |   |   |   |   ___
IRDY#      ^^^^\___________________/     ^^^^^
           _____                          ___
TRDY#      ^^^^     \______________/     ^^^^^
                                          ___
DEVSEL#    ^^^^\___________________/     ^^^^^
           ___                            ___
FRAME#        \___________________/   ^^^^\____
             _   _   _  |_  |_  |_  |_   _   _
CLK        _/ \_/ \_/ \_/ \_/ \_/ \_/ \_/ \_/ \
             0   1   2   3   4   5   6   7   8

A high-speed burst terminated by the target will have an extra cycle at the end:

             0_  1_  2_  3_  4_  5_  6_  7_  8_
CLK        _/ \_/ \_/ \_/ \_/ \_/ \_/ \_/ \_/ \
              ___         ___ ___ ___ ___
AD[31:0]   ---<___>---<___X___X___X___XXXX>----
              ___ _______ ___ ___ ___ ___
C/BE[3:0]# ---<___X_______X___X___X___X___>----
                          |   |   |   |   ___
IRDY#      ^^^^^^^\_______________________/
           _____          |   |   |   |   _______
TRDY#      ^^^^^^^    \______________/
           ________________               ___
STOP#      ^^^^^^^        |   |   |   \_______/
                          |   |   |   |   ___
DEVSEL#    ^^^^^^^\_______________________/
           ___                            ___
FRAME#        \_______________________/  ^^^^
             _   _   _  |_  |_  |_  |_   _   _
CLK        _/ \_/ \_/ \_/ \_/ \_/ \_/ \_/ \_/ \
             0   1   2   3   4   5   6   7   8

On clock edge 6, the target indicates that it wants to stop (with data), but the initiator is already holding IRDY# low, so there is a fifth data phase (clock edge 7), during which no data is transferred.

[edit] Parity

The PCI bus detects parity errors, but does not attempt to correct them by retrying operations; it is purely a failure indication. Because of this, there is no need to detect a parity error before the corresponding transfer has completed, and the PCI bus actually detects it a few cycles later. During a data phase, whichever device is driving the AD[31:0] lines computes even parity over them and the C/BE[3:0]# lines, and sends that out the PAR line one cycle later. All access rules and turnaround cycles for the AD bus apply to the PAR line, just one cycle later. The device listening on the AD bus checks the received parity and asserts the PERR# (parity error) line one cycle after that. This generally generates a processor interrupt, and the processor can search the PCI bus for the device which detected the error.

The PERR# line is only used during data phases, once a target has been selected. If a parity error is detected during an address phase (or the data phase of a Special Cycle), the devices which observe it assert the SERR# (System error) line.

Even when some bytes are masked by the C/BE# lines and not in use, they must still have some defined value, and this value must be used to compute the parity.
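The even-parity calculation a driving device performs over AD[31:0] and C/BE[3:0]# can be sketched as follows (the function name is ours; masked byte lanes are deliberately included, per the rule above):

```python
def pci_par(ad: int, cbe_n: int) -> int:
    """Return the PAR bit: chosen so that AD[31:0], C/BE[3:0]# and PAR
    together contain an even number of 1 bits.  Byte lanes masked by the
    C/BE# lines still carry defined values and are included."""
    ones = bin(ad & 0xFFFF_FFFF).count("1") + bin(cbe_n & 0xF).count("1")
    return ones & 1
```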

[edit] Fast back-to-back transactions

Due to the need for a turnaround cycle between different devices driving PCI bus signals, in general it is necessary to have an idle cycle between PCI bus transactions. However, in some circumstances it is permitted to skip this idle cycle, going directly from the final cycle of one transfer (IRDY# asserted, FRAME# deasserted) to the first cycle of the next (FRAME# asserted, IRDY# deasserted).

An initiator may only perform back-to-back transactions when:

- they are by the same initiator (or there would be no time to turn around the C/BE# and FRAME# lines),
- the first transaction was a write (so there is no need to turn around the AD bus), and
- the initiator still has permission (from its GNT# input) to use the PCI bus.
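These conditions combine into a simple predicate, sketched here (the function and argument names are ours):

```python
def may_skip_idle_cycle(same_initiator: bool,
                        prev_was_write: bool,
                        gnt_still_asserted: bool) -> bool:
    """Fast back-to-back: the idle (turnaround) cycle between transactions
    may be skipped only when all three conditions from the text hold."""
    return same_initiator and prev_was_write and gnt_still_asserted
```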

Additional timing constraints may come from the need to turn around the target control lines, particularly DEVSEL#. The target deasserts DEVSEL#, driving it high, in the cycle following the final data phase, which in the case of back-to-back transactions is the first cycle of the address phase. The second cycle of the address phase is then reserved for DEVSEL# turnaround, so if the target is different from the previous one, it must not assert DEVSEL# until the third cycle (medium DEVSEL speed).

One case where this problem cannot arise is if the initiator knows somehow (presumably because the addresses share sufficient high-order bits) that the second transfer is addressed to the same target as the previous one. In that case, it may perform back-to-back transactions. All PCI targets must support this.

It is also possible for the target to keep track of the requirements. If it never does fast DEVSEL, they are met trivially. If it does, it must wait until medium DEVSEL time unless:

- the current transaction was preceded by an idle cycle (is not back-to-back), or
- the previous transaction was to the same target, or
- the current transaction began with a double address cycle.

Targets which have this capability indicate it by a special bit in a PCI configuration register, and if all targets on a bus have it, all initiators may use back-to-back transfers freely.

A subtractive decoding bus bridge must know to expect this extra delay in the event of back-to-back cycles in order to advertise back-to-back support.

[edit] 64-bit PCI

This section explains only basic 64-bit PCI; the full PCI-X protocol extension is much more extensive.

The PCI specification includes optional 64-bit support. This is provided via an extended connector which provides the 64-bit bus extensions AD[63:32], C/BE[7:4]#, and PAR64. (It also provides a number of additional power and ground pins.)

Memory transactions between 64-bit devices may use all 64 bits to double the data transfer rate. Non-memory transactions (including configuration and I/O space accesses) may not use the 64-bit extension. During a 64-bit burst, burst addressing works just as in a 32-bit transfer, but the address is incremented twice per data phase. The starting address must be 64-bit aligned; i.e. AD2 must be 0. The data corresponding to the intervening addresses (with AD2 = 1) is carried on the upper half of the AD bus.

To initiate a 64-bit transaction, the initiator drives the starting address on the AD bus and asserts REQ64# at the same time as FRAME#. If the selected target can support a 64-bit transfer for this transaction, it replies by asserting ACK64# at the same time as DEVSEL#. Note that a target may decide on a per-transaction basis whether to allow a 64-bit transfer.

If REQ64# is asserted during the address phase, the initiator also drives the high 32 bits of the address and a copy of the bus command on the high half of the bus. If the address requires 64 bits, a dual address cycle is still required, but the high half of the bus carries the upper half of the address and the final command code during both address phase cycles; this allows a 64-bit target to see the entire address and begin responding earlier.

If the initiator sees DEVSEL# asserted without ACK64#, it performs 32-bit data phases. The data which would have been transferred on the upper half of the bus during the first data phase is instead transferred during the second data phase. Typically, the initiator drives all 64 bits of data before seeing DEVSEL#. If ACK64# is missing, it may cease driving the upper half of the data bus.
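The demotion to 32-bit data phases can be sketched as splitting each intended 64-bit datum into two words, the one for the low half of the bus first (the helper name is ours):

```python
def demote_to_32bit(qwords):
    """Split 64-bit data (one entry per intended 64-bit phase) into the
    sequence of 32-bit data phases actually performed when ACK64# is
    missing: the AD[31:0] word first, then the word that would have gone
    on the upper half of the bus."""
    words = []
    for q in qwords:
        words.append(q & 0xFFFF_FFFF)          # first data phase (AD2 = 0)
        words.append((q >> 32) & 0xFFFF_FFFF)  # second data phase (AD2 = 1)
    return words
```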

The REQ64# and ACK64# lines are held asserted for the entire transaction save the last data phase, and deasserted at the same time as FRAME# and DEVSEL#, respectively.

The PAR64 line operates just like the PAR line, but provides even parity over AD[63:32] and C/BE[7:4]#. It is only valid for address phases if REQ64# is asserted. PAR64 is only valid for data phases if both REQ64# and ACK64# are asserted.

[edit] Cache snooping (obsolete)

PCI originally included optional support for write-back cache coherence. This required support by cacheable memory targets, which would listen to two pins from the cache on the bus, SDONE (snoop done) and SBO# (snoop backoff).[8]

Because this was rarely implemented in practice, it was deleted from revision 2.2 of the PCI specification,[5][9] and the pins re-used for SMBus access in revision 2.3.[4]

The cache would watch all memory accesses, without asserting DEVSEL#. If it noticed an access that might be cached, it would drive SDONE low (snoop not done). A coherence-supporting target would avoid completing a data phase (asserting TRDY#) until it observed SDONE high.

In the case of a write to data that was clean in the cache, the cache would only have to invalidate its copy, and would assert SDONE as soon as this was established. However, if the cache contained dirty data, the cache would have to write it back before the access could proceed, so it would assert SBO# when raising SDONE. This would signal the active target to assert STOP# rather than TRDY#, causing the initiator to disconnect and retry the operation later. In the meantime, the cache would arbitrate for the bus and write its data back to memory.

Targets supporting cache coherency are also required to terminate bursts before they cross cache lines.

[edit] Development tools

When developing and/or troubleshooting the PCI bus, examination of hardware signals can be very important. Logic analyzers and bus analyzers are tools which collect, analyze, and decode signals for users to view in useful ways.

[edit] See also

List of device bandwidths (a useful listing of device bandwidths that includes PCI)
Advanced Microcontroller Bus Architecture
Industry Standard Architecture (ISA)
Extended Industry Standard Architecture (EISA)
Micro Channel architecture (MCA)
NuBus


In personal computers, the front-side bus (FSB) is the bus that carries data between the CPU and the northbridge.

Depending on the processor used, some computers may also have a back-side bus that connects the CPU to the cache. This bus and the cache connected to it are faster than accessing the system memory (or RAM) via the front-side bus.

The bandwidth or maximum theoretical throughput of the front-side bus is determined by the product of the width of its data path, its clock frequency (cycles per second) and the number of data transfers it performs per clock cycle. For example, a 64-bit (8-byte) wide FSB operating at a frequency of 100 MHz that performs 4 transfers per cycle has a bandwidth of 3200 megabytes per second (MB/s):

8 B x 100 MHz x 4 transfers/cycle = 8 B x 100,000,000 cycles/s x 4 transfers/cycle = 3200 MB/s
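The calculation above can be written directly (a sketch; the function name is ours, values from the example):

```python
def fsb_bandwidth_mbs(width_bytes: int, clock_mhz: int,
                      transfers_per_cycle: int) -> int:
    """Peak FSB throughput in MB/s: data path width x clock frequency
    x transfers per clock cycle."""
    return width_bytes * clock_mhz * transfers_per_cycle

print(fsb_bandwidth_mbs(8, 100, 4))  # 3200, as in the example
```

The same formula reproduces the per-processor figures later in the article, e.g. a 133 MHz quad-pumped 64-bit bus gives 4256 MB/s.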

The number of transfers per clock cycle is dependent on the technology used. For example, GTL+ performs 1 transfer/cycle, EV6 2 transfers/cycle, and AGTL+ 4 transfers/cycle. Intel calls the technique of four transfers per cycle Quad Pumping.

Many manufacturers publish the speed of the FSB in MHz, but often do not use the actual physical clock frequency but the theoretical effective data rate (which is commonly called megatransfers per second or MT/s). This is because the actual speed is determined by how many transfers can be performed by each clock cycle as well as by the clock frequency. For example, if a motherboard (or processor) has a FSB clocked at 200 MHz and performs 4 transfers per clock cycle, the FSB is rated at 800 MT/s.


[edit] History and current usage


The front-side bus is an alternative name for the data and address buses of the CPU as defined by the manufacturer's datasheet. The term is mostly associated with the various CPU buses used on PC-related motherboards (including servers etc), seldom with the data and address buses used in embedded systems and similar small computers.

Front-side buses serve as a connection between the CPU and the rest of the hardware via a chipset. This chipset is usually divided into a northbridge and a southbridge part, and is the connection point for all other buses in the system. Buses like the PCI, AGP, and memory buses all connect to the chipset in order for data to flow between the connected devices. These secondary system buses usually run at speeds derived from the front-side bus clock, but are not necessarily synchronous to it.

In response to AMD's Torrenza initiative, Intel has opened its FSB CPU socket to third party devices [1] [2]. Prior to this announcement, made in Spring 2007 at the Intel Developer Forum in Beijing, Intel had very closely guarded who had access to the FSB, only allowing Intel processors in the CPU socket. This is now changing; the first example is FPGA co-processors, a result of collaboration between Intel-Xilinx-Nallatech [3] and Intel-Altera-XtremeData [4] [5].

[edit] Related component speeds

[edit] CPU

The frequency at which a processor (CPU) operates is determined by applying a clock multiplier to the front-side bus (FSB) speed in some cases. For example, a processor running at 3200 MHz might be using a 400 MHz FSB. This means there is an internal clock multiplier setting (also called bus/core ratio) of 8. That is, the CPU is set to run at 8 times the frequency of the front-side bus: 400 MHz × 8 = 3200 MHz. By varying either the FSB or the multiplier, different CPU speeds can be achieved.
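The multiplier relationship above is simple arithmetic; a sketch (function name is ours, values from the example):

```python
def cpu_clock_mhz(fsb_mhz: float, multiplier: float) -> float:
    """Core clock = FSB clock x clock multiplier (bus/core ratio)."""
    return fsb_mhz * multiplier

print(cpu_clock_mhz(400, 8))  # 3200.0 MHz, as in the example
```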

[edit] Memory

See also: Memory divider

Setting an FSB speed is related directly to the speed grade of memory a system must use. The memory bus connects the northbridge and RAM, just as the front-side bus connects the CPU and northbridge. Often, these two buses must operate at the same frequency. Increasing the front-side bus to 450 MHz in most cases also means running the memory at 450 MHz.

In newer systems, it is possible to see memory ratios of "4:5" and the like. The memory will run 5/4 times as fast as the FSB in this situation, meaning a 400 MHz bus can run with the memory at 500 MHz. This is often referred to as an 'asynchronous' system. It is important to realize that due to differences in CPU and system architecture, overall system performance can vary in unexpected ways with different FSB-to-memory ratios.
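The ratio arithmetic from the "4:5" example can be sketched as follows (the helper name is ours; exact fractions avoid rounding artifacts):

```python
from fractions import Fraction

def memory_clock_mhz(fsb_mhz, ratio=Fraction(5, 4)):
    """Memory clock when running asynchronously: FSB clock x ratio.
    A '4:5' setting means the memory runs 5/4 times as fast as the FSB."""
    return fsb_mhz * ratio

print(memory_clock_mhz(400))  # 500: a 400 MHz bus runs the memory at 500 MHz
```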

In image, audio, video, gaming, FPGA synthesis and scientific applications that perform a small amount of work on each element of a large data set, FSB speed becomes a major performance issue. A slow FSB will cause the CPU to spend significant amounts of time waiting for data to arrive from system memory. However, if the computations involving each element are more complex, the processor will spend longer performing these; therefore, the FSB will be able to keep pace because the rate at which the memory is accessed is reduced.

[edit] Peripheral buses

Similar to the memory bus, the PCI and AGP buses can also be run asynchronously from the front-side bus. In older systems, these buses are operated at a set fraction of the front-side bus frequency. This fraction was set by the BIOS. In newer systems, the PCI, AGP, and PCI Express peripheral buses often receive their own clock signals, which eliminates their dependence on the front-side bus for timing.

[edit] Overclocking

Main article: Overclocking

Overclocking is the practice of making computer components operate beyond their stock performance levels.

Many motherboards allow the user to manually set the clock multiplier and FSB settings by changing jumpers or BIOS settings. Almost all CPU manufacturers now "lock" a preset multiplier setting into the chip. It is possible to unlock some locked CPUs; for instance, some Athlons can be unlocked by connecting electrical contacts across points on the CPU's surface. For all processors, increasing the FSB speed can be done to boost processing speed by reducing latency between CPU and the northbridge.

This practice pushes components beyond their specifications and may cause erratic behavior, overheating or premature failure. Even if the computer appears to run normally, problems may appear under a heavy load. Most PCs purchased from retailers or manufacturers, such as Hewlett-Packard or Dell, do not allow the user to change the multiplier or FSB settings due to the probability of erratic behavior or failure. Motherboards purchased separately to build a custom machine are more likely to allow the user to edit the multiplier and FSB settings in the PC's BIOS.

[edit] Pros and cons

[edit] Pros

Although the front-side bus architecture is an aging technology, it does have the advantage of high flexibility and low cost. There is no theoretical limit to the number of CPUs that can be placed on a FSB, though performance will not scale linearly across additional CPUs (due to the architecture's bandwidth bottleneck).

[edit] Cons

The front-side bus as it is traditionally known may be disappearing, but it is still used in all of Intel's Atom, Celeron, Pentium, and Core 2 processor models. Originally, this bus was a central connecting point for all system devices and the CPU. In recent years, this has been breaking down with the increasing use of individual point-to-point connections like AMD's HyperTransport and Intel's QuickPath Interconnect. The front-side bus has been criticized by AMD as being an old and slow technology that bottlenecks today's computer systems. While a faster CPU can execute individual instructions faster, this is wasted if it cannot fetch instructions and data as fast as it can execute them; when this happens, the CPU must wait for one or more clock cycles until the memory returns its value. Furthermore, a fast CPU can be delayed when it must access other devices attached to the FSB. Thus, a slow FSB can become a bottleneck that slows down a fast CPU. The FSB's fastest transfer speed is currently 1.6 GT/s, which provides only 80% of the theoretical bandwidth of a 16-bit HyperTransport 3.0 link as implemented on AM3 Phenom II CPUs, only half of the bandwidth of a 6.4 GT/s QuickPath Interconnect link, and only 25% of the bandwidth of a 32-bit HyperTransport 3.1 link. In addition, in an FSB-based architecture, the memory must be accessed via the FSB. In HT- and QPI-based systems, the memory is accessed independently by means of a memory controller on the CPU itself, freeing bandwidth on the HyperTransport or QPI link for other uses.

[edit] Transfer rates


[edit] Intel processors

CPU                           FSB Clock        Transfers/Cycle  Bus Width  Transfer Rate
Pentium                       50 MHz-66 MHz    1                64-bit     400 MB/s-528 MB/s
Pentium Overdrive             25 MHz-66 MHz    1                64-bit     200 MB/s-528 MB/s
Pentium MMX                   60 MHz-66 MHz    1                64-bit     480 MB/s-528 MB/s
Pentium MMX Overdrive         50 MHz-66 MHz    1                64-bit     400 MB/s-528 MB/s
Pentium II                    66 MHz-100 MHz   1                64-bit     528 MB/s-800 MB/s
Pentium II Overdrive          60 MHz-66 MHz    1                64-bit     480 MB/s-528 MB/s
Pentium III                   100 MHz-133 MHz  1                64-bit     800 MB/s-1064 MB/s
Pentium III-M                 100 MHz-133 MHz  1                64-bit     800 MB/s-1064 MB/s
Pentium 4                     100 MHz-133 MHz  4                64-bit     3200 MB/s-4256 MB/s
Pentium 4-M                   100 MHz          4                64-bit     3200 MB/s
Pentium 4 HT                  133 MHz-200 MHz  4                64-bit     4256 MB/s-6400 MB/s
Pentium 4 HT Extreme Edition  200 MHz-266 MHz  4                64-bit     6400 MB/s-8512 MB/s
Pentium D                     133 MHz-200 MHz  4                64-bit     4256 MB/s-6400 MB/s
Pentium Extreme Edition       200 MHz-266 MHz  4                64-bit     6400 MB/s-8512 MB/s
Pentium M                     100 MHz-133 MHz  4                64-bit     3200 MB/s-4256 MB/s
Core Solo                     133 MHz-166 MHz  4                64-bit     4256 MB/s-5312 MB/s
Core Duo                      133 MHz-166 MHz  4                64-bit     4256 MB/s-5312 MB/s
Core 2 Solo                   133 MHz-200 MHz  4                64-bit     4256 MB/s-6400 MB/s
Core 2 Duo                    133 MHz-333 MHz  4                64-bit     4256 MB/s-10656 MB/s
Core 2 Quad                   266 MHz-333 MHz  4                64-bit     8512 MB/s-10656 MB/s
Core 2 Extreme                200 MHz-400 MHz  4                64-bit     6400 MB/s-12800 MB/s
Atom                          133 MHz-166 MHz  4                64-bit     4256 MB/s-5312 MB/s
Celeron                       66 MHz-266 MHz   1-4              64-bit     528 MB/s-8512 MB/s
Celeron D                     133 MHz          4                64-bit     4256 MB/s
Celeron M                     100 MHz-200 MHz  4                64-bit     3200 MB/s-6400 MB/s
Celeron Dual-Core             133 MHz-200 MHz  4                64-bit     4256 MB/s-6400 MB/s
Pentium Dual-Core             133 MHz-266 MHz  4                64-bit     4256 MB/s-8512 MB/s
Pentium Pro                   60 MHz-66 MHz    1                64-bit     480 MB/s-528 MB/s
Pentium II Xeon               100 MHz          1                64-bit     800 MB/s
Pentium III Xeon              100 MHz-133 MHz  1                64-bit     800 MB/s-1064 MB/s
Xeon                          100 MHz-400 MHz  4                64-bit     3200 MB/s-12800 MB/s
Itanium                       100 MHz-133 MHz  1                64-bit     800 MB/s-1064 MB/s
Itanium 2                     100 MHz-166 MHz  4                64-bit     3200 MB/s-5312 MB/s

[edit] AMD processors

CPU              FSB Clock        Transfers/Cycle  Bus Width  Transfer Rate
K5               50 MHz-66 MHz    1                64-bit     400 MB/s-528 MB/s
K6               66 MHz           1                64-bit     528 MB/s
K6-II            66 MHz-100 MHz   1                64-bit     528 MB/s-800 MB/s
K6-III           66 MHz-100 MHz   1                64-bit     528 MB/s-800 MB/s
Athlon           100 MHz-133 MHz  2                64-bit     1600 MB/s-2128 MB/s
Athlon XP        100 MHz-200 MHz  2                64-bit     1600 MB/s-3200 MB/s
Mobile Athlon 4  100 MHz          2                64-bit     1600 MB/s
Athlon XP-M      100 MHz-133 MHz  2                64-bit     1600 MB/s-2128 MB/s
Duron            100 MHz-133 MHz  2                64-bit     1600 MB/s-2128 MB/s
Sempron          166 MHz-200 MHz  2                64-bit     2656 MB/s-3200 MB/s
Athlon MP        100 MHz-133 MHz  2                64-bit     1600 MB/s-2128 MB/s

Memory-mapped I/O


For more generic meanings of input/output port, see Computer port (hardware).


Memory-mapped I/O (MMIO) and port I/O (also called port-mapped I/O (PMIO) or isolated I/O) are two complementary methods of performing input/output between the CPU and peripheral devices in a computer. Another method, not discussed in this article, is using dedicated I/O processors — commonly known as channels on mainframe computers — that execute their own instructions.

Memory-mapped I/O (not to be confused with memory-mapped file I/O) uses the same address bus to address both memory and I/O devices, and the CPU instructions used to access the memory are also used for accessing devices. In order to accommodate the I/O devices, areas of the CPU's addressable space must be reserved for I/O. The reservation might be temporary — the Commodore 64 could bank switch between its I/O devices and regular memory — or permanent. Each I/O device monitors the CPU's address bus and responds to any of the CPU's access of device-assigned address space, connecting the data bus to a desirable device's hardware register.

Port-mapped I/O uses a special class of CPU instructions specifically for performing I/O. This is generally found on Intel microprocessors, specifically the IN and OUT instructions which can read and write one to four bytes (outb, outw, outl) to an I/O device. I/O devices have a separate address space from general memory, either accomplished by an extra "I/O" pin on the CPU's physical interface, or an entire bus dedicated to I/O. Because the address space for I/O is isolated from that for main memory, this is sometimes referred to as isolated I/O.

A device's direct memory access (DMA) is not affected by these CPU-to-device communication methods; in particular, it is not affected by memory mapping. This is because, by definition, DMA is a memory-to-device communication method that bypasses the CPU.

Hardware interrupt is yet another communication method between CPU and peripheral devices. However, it is always treated separately for a number of reasons. It is device-initiated, as opposed to the methods mentioned above, which are CPU-initiated. It is also unidirectional, as information flows only from device to CPU. Lastly, each interrupt line carries only one bit of information with a fixed meaning, namely "an event that requires attention has occurred in a device on this interrupt line".


[edit] Relative merits of the two I/O methods


The main advantage of using port-mapped I/O is on CPUs with a limited addressing capability. Because port-mapped I/O separates I/O access from memory access, the full address space can be used for memory. It is also obvious to a person reading an assembly language program listing (or even, in rare instances, analyzing machine language) when I/O is being performed, due to the special instructions that can only be used for that purpose.

I/O operations can slow the memory access, if the address and data buses are shared. This is because the peripheral device is usually much slower than main memory. In some architectures, port-mapped I/O operates via a dedicated I/O bus, alleviating the problem.

There are two major advantages of using memory-mapped I/O. One of them is that, by discarding the extra complexity that port I/O brings, a CPU requires less internal logic and is thus cheaper, faster, easier to build, consumes less power and can be physically smaller; this follows the basic tenets of reduced instruction set computing, and is also advantageous in embedded systems. The other advantage is that, because regular memory instructions are used to address devices, all of the CPU's addressing modes are available for the I/O as well as the memory, and instructions that perform an ALU operation directly on a memory operand — loading an operand from a memory location, storing the result to a memory location, or both — can be used with I/O device registers as well. In contrast, port-mapped I/O instructions are often very limited, often providing only for plain load and store operations between CPU registers and I/O ports, so that, for example, to add a constant to a port-mapped device register would require three instructions: read the port to a CPU register, add the constant to the CPU register, and write the result back to the port.

As 16-bit processors have become obsolete and replaced with 32-bit and 64-bit in general use, reserving ranges of memory address space for I/O is less of a problem, as the memory address space of the processor is usually much larger than the required space for all memory and I/O devices in a system. Therefore, it has become more frequently practical to take advantage of the benefits of memory-mapped I/O. However, even with address space being no longer a major concern, neither I/O mapping method is universally superior to the other, and there will be cases where using port-mapped I/O is still preferable.

A final reason that memory-mapped I/O is preferred in x86-based architectures is that the instructions that perform port-based I/O are limited to one or two registers: EAX, AX, and AL are the only registers that data can be moved into or out of, and either a byte-sized immediate value in the instruction or a value in register DX determines which port is the source or destination port of the transfer[1][2]. Since any general purpose register can send or receive data to or from memory and memory-mapped I/O, memory-mapped I/O needs fewer instructions and can run faster than port I/O. AMD did not extend the port I/O instructions when defining the x86-64 architecture to support 64-bit ports, so 64-bit transfers cannot be performed using port I/O[3].

[edit] Memory barriers

Memory-mapped I/O is the cause of memory barriers in older generations of computers – the 640 KiB barrier is due to the IBM PC placing the Upper Memory Area in the 640–1024 KiB range (of its 20-bit memory addressing), while the 3 GB barrier is due to similar memory-mapping in 32-bit architectures in the 3–4 GB range.


[edit] Example

Consider a simple system built around an 8-bit microprocessor. Such a CPU might provide 16-bit address lines, allowing it to address up to 64 kibibytes (KiB) of memory. On such a system, perhaps the first 32 KiB of address space would be allotted to random access memory (RAM), another 16 KiB to read-only memory (ROM) and the remainder to a variety of other devices such as timers, counters, video display chips, sound generating devices, and so forth. The hardware of the system is arranged so that devices on the address bus will only respond to particular addresses which are intended for them; all other addresses are ignored. This is the job of the address decoding circuitry, and it is this that establishes the memory map of the system.

Thus we might end up with a memory map like so:

Device                                    Address range (hex)  Size
RAM                                       0000-7FFF            32 KiB
General purpose I/O                       8000-80FF            256 bytes
Sound controller                          9000-90FF            256 bytes
Video controller/text-mapped display RAM  A000-A7FF            2 KiB
ROM                                       C000-FFFF            16 KiB

Note that this memory map contains gaps; that is also quite common.

Assuming the fourth register of the video controller sets the background colour of the screen, the CPU can set this colour by writing a value to the memory location A003 using its standard memory write instruction. Using the same method, text can be displayed on a screen by writing character values into a special area of RAM within the video controller. Prior to cheap RAM that enabled bit-mapped displays, this character cell method was a popular technique for computer video displays (see Text user interface).
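The memory map above can be modeled as a tiny address decoder: only the device whose range contains the address responds to a write. This is a sketch; the Device class and device names are ours, with ranges taken from the example map:

```python
class Device:
    def __init__(self, name, base, size):
        self.name, self.base, self.size = name, base, size
        self.regs = {}  # register offset -> last value written

    def write(self, addr, value):
        self.regs[addr - self.base] = value

# The example memory map (video controller at A000-A7FF)
devices = [
    Device("RAM",   0x0000, 0x8000),
    Device("GPIO",  0x8000, 0x0100),
    Device("Sound", 0x9000, 0x0100),
    Device("Video", 0xA000, 0x0800),
    Device("ROM",   0xC000, 0x4000),
]

def bus_write(addr, value):
    """Address decoding: the one device whose range contains addr responds."""
    for d in devices:
        if d.base <= addr < d.base + d.size:
            d.write(addr, value)
            return d.name
    return None  # a gap in the memory map: no device responds

bus_write(0xA003, 0x01)  # set the background colour via the video controller
```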

[edit] Basic types of address decoding

Exhaustive — 1:1 mapping of unique addresses to one hardware register (physical memory location)

Partial — n:1 mapping of n unique addresses to one hardware register. Partial decoding allows a memory location to have more than one address, allowing the programmer to reference a memory location using n different addresses. It may also be done just to simplify the decoding hardware, when not all of the CPU's address space is needed. Synonyms: foldback, multiply-mapped, partially-mapped.

Linear — Address lines are used directly without any decoding logic. This is done with devices such as RAMs and ROMs that have a sequence of address inputs, and with peripheral chips that have a similar sequence of inputs for addressing a bank of registers. Linear addressing is rarely used alone (only when there are few devices on the bus, as using purely linear addressing for more than one device usually wastes a lot of address space) but instead is combined with one of the other methods to select a device or group of devices within which the linear addressing selects a single register or memory location.


Incomplete address decoding

Addresses may be decoded completely or incompletely by a device.

Complete decoding involves checking every line of the address bus, causing an open data bus when the CPU accesses an unmapped region of memory. (Note that even with incomplete decoding, decoded partial regions may not be associated with any device, leaving the data bus open when those regions are accessed.)

Incomplete decoding, or partial decoding, uses simpler and often cheaper logic that examines only some address lines. Such simple decoding circuitry might allow a device to respond to several different addresses, effectively creating virtual copies of the device at different places in the memory map. All of these copies refer to the same real device, so there is no particular advantage in doing this, except to simplify the decoder (or possibly the software that uses the device). This is also known as address aliasing;[4][5] aliasing has other meanings in computing. Commonly, the decoding itself is programmable, so the system can reconfigure its own memory map as required, though programmable decoding is a later development and largely defeats the goal of keeping the decoding logic cheap.

PDP-11
From Wikipedia, the free encyclopedia

This article is about the PDP-11 series of minicomputers. For the PDP-11 processor architecture, see PDP-11 architecture.


PDP-11/40 with TU56 dual DECtape drive.

The PDP-11 was a series of 16-bit minicomputers sold by Digital Equipment Corp. from 1970[1][2] into the 1990s.[2] Though not explicitly conceived as a successor to DEC's PDP-8 computer in the PDP (for Programmed Data Processor) series of computers (both product lines lived in parallel for more than 10 years), the PDP-11 replaced the PDP-8 in many real-time applications. It had several uniquely innovative features, and was easier to program than its predecessors with its use of general registers. It was replaced in the mid-range minicomputer niche by the VAX-11 32-bit extension of the PDP-11.

Smaller personal computers such as the IBM PC, and workstations and servers from vendors such as Sun Microsystems, became popular, based on standard microprocessors such as the Intel x86 and Motorola 68000. After DEC's PDP-11 based personal computer offering failed in the marketplace, these computers, running standard operating systems such as MS-DOS and Unix, eventually evolved to full 32-bit memory addressing and replaced most proprietary minicomputers and midrange computers, including the VAX-11.

Design features of the PDP-11 influenced the design of other microprocessors such as the Motorola 68000; design features of its operating systems, as well as other operating systems from Digital Equipment, influenced the design of other operating systems such as CP/M[3] and hence also MS-DOS.[4] The first officially named version of Unix ran on the PDP-11/20 in 1970. The C programming language was written to take advantage of PDP-11 programming features such as byte addressing when Unix was rewritten in a high-level language.

Contents

1 Unique features
    1.1 Instruction set orthogonality
    1.2 No dedicated I/O bus
    1.3 Interrupts
    1.4 Designed for mass production
2 LSI-11
3 Decline
4 Models
    4.1 Unibus models
    4.2 Q-bus models
    4.3 Models without standard bus
    4.4 Models that were planned but never introduced
    4.5 Special purpose versions
    4.6 Unauthorized clones
5 Operating systems
    5.1 From Digital
    5.2 From third parties
6 Peripherals
7 See also
8 Use
9 Notes
10 References
11 Further reading
12 External links

Unique features

Instruction set orthogonality
See also: PDP-11 architecture

The PDP-11 processor architecture had a mostly orthogonal instruction set. For example, instead of instructions such as load and store, the PDP-11 had a move instruction for which either operand (source and destination) could be memory or register. There were no specific input or output instructions; the PDP-11 used memory-mapped I/O and so the same move instruction was used; orthogonality even allowed data to be moved directly from an input device to an output device. More complex instructions such as add likewise could have memory, register, input, or output as source or destination.

Generally, any operand could apply any of eight addressing modes to eight registers. The addressing modes provided register, immediate, absolute, relative, deferred (indirect), and indexed addressing, and could specify autoincrementation and autodecrementation of a register by one (byte instructions) or two (word instructions). Use of relative addressing let a machine-language program be position-independent.
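As a toy illustration of the autoincrement behaviour just described, the operand fetch for PDP-11 mode 2, (Rn)+, might be modelled as below. This is a deliberately simplified sketch, not a faithful emulator; the function name and data structures are invented for the example.

```python
def operand_autoincrement(regs, memory, reg_num, word=True):
    """Fetch an operand via (Rn)+: use Rn as the address, then advance Rn
    by two for word instructions or by one for byte instructions."""
    addr = regs[reg_num]
    value = memory[addr]
    regs[reg_num] += 2 if word else 1
    return value
```

Stepping a register through memory this way is part of what made tight copy loops so natural to write on the PDP-11.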

For these reasons, PDP-11 programmers viewed the assembly language as easy to learn and uniquely elegant.[citation needed]

No dedicated I/O bus

In the most radical departure from earlier computers, the initial models of the PDP-11 had no dedicated bus for input/output; they had only a memory bus, called the Unibus. All input and output devices were mapped to memory addresses, so no special I/O instructions were needed.

An input/output device determined the memory addresses to which it would respond and the interrupt priority it would request, and specified its own interrupt vector. This loose framework provided by the processor architecture made it unusually easy to invent new bus devices, including devices to control hardware that had not been contemplated when the processor was designed.

This Unibus was one reason why the PDP-11 became so appreciated for specific usages. One of the predecessors of Alcatel-Lucent, the Bell Telephone Manufacturing Company, developed the BTMC DPS-1500 packet-switching (X.25) network and used PDP-11s in the regional and national network management system, with the Unibus directly connected to the DPS-1500 hardware.

Higher-performance members of the PDP-11 family, starting with the PDP-11/45, departed from the single bus approach. Instead, memory was interfaced by dedicated circuitry and space in the CPU cabinet, while the Unibus continued to be used for I/O only. In the PDP-11/70 this was taken a step further, with the addition of a dedicated interface from disks and tapes, via the Massbus, to memory. Use of different buses was not visible to the programmer, however, and the orthogonality of the assembly language was preserved.

Interrupts

The PDP-11 supported hardware interrupts at four priority levels. Interrupts were serviced by software service routines, which could specify whether they themselves could be interrupted (achieving interrupt nesting). The event that caused the interrupt was indicated by the device itself, as it informed the processor of the address of its own interrupt vector.

Interrupt vectors were blocks of two 16-bit words in low kernel address space (which normally corresponded to low physical memory) between 0 and 776 (octal). The first word of the interrupt vector contained the address of the interrupt service routine and the second word the value to be loaded into the PSW (priority level) on entry to the service routine.
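The two-word vector layout can be sketched in Python as below. PDP-11 words are little-endian 16-bit quantities; the particular vector (60 octal) and the ISR address and PSW values used here are placeholders chosen for illustration.

```python
import struct

low_memory = bytearray(0o1000)  # low kernel address space

def set_vector(vec, isr_addr, psw):
    # each vector is two consecutive 16-bit words: ISR address, then PSW
    struct.pack_into('<HH', low_memory, vec, isr_addr, psw)

def take_interrupt(vec):
    # the device supplies vec; the CPU fetches the new PC and PSW from it
    isr_addr, psw = struct.unpack_from('<HH', low_memory, vec)
    return isr_addr, psw

set_vector(0o60, 0o2000, 0o340)  # hypothetical ISR at 2000, priority 7
```

Because the device itself supplies the vector number, dispatch is a pair of word fetches rather than a poll of every possible source.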

The article on PDP-11 architecture provides more details on interrupts.

Designed for mass production

Finally, the PDP-11 was designed to be produced in a factory by semiskilled labor. All of the dimensions of its pieces were relatively non-critical. It used a wire-wrapped backplane. That is, the printed circuit boards plugged into a backplane connector. The backplane connectors had square pins that could be connected to by wrapping wires around them. The corners of the pins would bite into the wire to form a gas-tight (i.e. corrosion-proof, therefore reliable) connection.

LSI-11

Q-Bus board with LSI-11/2 CPU


DEC "Jaws-11" (J11) Chipset

DEC "Fonz-11" (F11) Chipset

The LSI-11 (PDP-11/03) was the first PDP-11 model produced using large-scale integration; the entire CPU was contained on 4 LSI chips made by Western Digital (the MCP-1600 chip set; a fifth chip could be added to extend the instruction set, as pictured on the right). It used a bus which was a close variant of the Unibus called the Q-Bus; it differed from the Unibus primarily in that addresses and data were multiplexed onto a shared set of wires, as opposed to having separate sets of wires, as in the Unibus. It also differed slightly in how it addressed I/O devices and it eventually allowed a 22-bit physical address (whereas the Unibus only allowed an 18-bit physical address) and block-mode operations for significantly improved bandwidth (which the Unibus did not support).

The CPU's microcode includes a debugger: firmware with a direct serial interface (RS-232 or current loop) to a terminal. This let the operator do debugging by typing and reading octal numbers, rather than operating switches and reading lights, the typical debugging method at the time. The operator could thus examine and modify the computer's registers, memory, and input/output devices, diagnosing and correcting failures in software and peripherals except a failure that disabled the microcode itself. On a failure that kept the LSI-11 from booting, the operator could command it to boot from a different disk.

Both innovations increased the reliability and decreased the cost of the LSI-11.

Later Q-Bus based systems such as the LSI-11/23, /73, and /83 were based upon chip sets designed in house by Digital Equipment Corporation. Later PDP-11 Unibus systems were designed to use similar Q-Bus processor cards, sometimes with a special memory bus for improved speed, using a Unibus adapter to support existing Unibus peripherals.

There were significant other innovations in the Q-Bus lineup. A system variant of the PDP-11/03 introduced full system Power-On Self-Test (POST), and the 11/83 introduced a primitive (by today's standards) anticipatory CPU cache pre-load as well as a high-speed private memory interconnect (bus). On the other hand, this was among the last computers to allow the use of magnetic core memory, configured as up to three quad-wide modules providing 8 KB of memory each.[citation needed]

The chip set[clarification needed] was not restricted to implementing a PDP-11; a published book[5] gives the complete source listing of microcode that instead implements the APL programming language.

Decline

The basic design of the PDP-11 was sound and was continually updated to use newer technologies. However, in the 1980s, inexpensive VLSI memory chips made large amounts of memory affordable and the PDP-11's 16-bit limit on logical addresses proved insurmountable.

The article on PDP-11 architecture describes the hardware and software techniques used to work around this limitation.

DEC's successor to the PDP-11, the VAX (for "Virtual Address Extension") overcame the 16-bit limitation, but was initially a superminicomputer aimed at the high-end time-sharing market. The early VAXes provided a PDP-11 compatibility mode under which much existing software could be immediately used.

Microprocessor chips such as the Motorola 68000 and Intel 80386 began to support 32-bit logical addresses as well. The eventual mass-production of those chips eliminated any cost advantage for the 16-bit PDP-11. A line of personal computers based on the PDP-11, the DEC Professional series, failed commercially, along with other non-PDP-11 PC offerings from DEC.

DEC discontinued PDP-11 production in 1997;[citation needed] in 1994[6] it had sold the PDP-11 system software rights to Mentec Inc., an Irish producer of LSI-11 based boards for Q-Bus and ISA architecture personal computers. For several years, Mentec produced new PDP-11 processors.

By the late 1990s, not only DEC but most of the New England computer industry, which had been built around minicomputers similar to the PDP-11, had collapsed in the face of microcomputer-based workstations and servers.

Models

The PDP-11 processors tended to fall into several natural groups depending on the original design on which they were based and which I/O bus they used. Within each group, most models were offered in two versions, one intended for OEMs and one intended for end users.


Unibus models

PDP-11/70 Front Panel

PDP-11/70, running

PDP-11/84

The following models used the Unibus as their principal bus:

PDP-11 (later renamed the PDP-11/20) and PDP-11/15 — The original, non-microprogrammed processor; designed by Jim O'Loughlin.

PDP-11/35 and PDP-11/40 — A microprogrammed successor to the /20; the design team was led by Jim O'Loughlin.

PDP-11/45, PDP-11/50, and PDP-11/55 — A much faster microprogrammed processor that could use semiconductor memory instead of or in addition to core memory.

PDP-11/70 — The 11/45 architecture expanded to allow 4 MB of physical memory segregated onto a private memory bus, 2 KB of cache memory, and much faster I/O devices connected via the Massbus.[7]

PDP-11/05 and PDP-11/10 — A cost-reduced successor to the 11/20.


PDP-11/34 and PDP-11/04 — Cost-reduced follow-on products to the 11/35 and 11/05. The PDP-11/09 and 11/39 model names were documented internally to DEC but never produced for sale. The PDP-11/34 concept was created by Bob Armstrong.

PDP-11/44 — Replacement for the 11/45 and 11/70 that included cache memory as a standard feature and an optional floating point unit. This machine also included a sophisticated serial console interface and support for 4 MB of physical memory. The design team was managed by John Sofio. This was the last PDP-11 processor to be constructed using discrete logic gates; later models were all microprogrammed.

PDP-11/60 — A PDP-11 with user-writable microcontrol store; this was designed by another team led by Jim O'Loughlin.

PDP-11/24 — First VLSI PDP-11 for Unibus, using the "Fonz-11" (F11) chip set

PDP-11/84 — Using the VLSI "Jaws-11" (J11) chip set

PDP-11/94 — J11-based, faster than 11/84

Q-bus models

LSI-11/23, cover removed

The following models used the Q-Bus as their principal bus:

PDP-11/03 (also known as the LSI-11/03) — The first LSI PDP-11, this system used a chipset from Western Digital.

PDP-11/23 — 2nd generation of LSI (F-11), early units only supported 248 KB memory, but could be modified for 4 MB support

PDP-11/23+ /MicroPDP-11/23 — Improved 11/23 with more functions on processor card (physically a quad-size card rather than dual)

MicroPDP-11/73 — The third generation LSI PDP, this system used the "Jaws-11" (J-11) chip set.

MicroPDP-11/53 — slower 11/73 with on-board memory

MicroPDP-11/83 — faster 11/73 with PMI (private memory interconnect)

MicroPDP-11/93 — faster 11/83; final DEC Q-Bus PDP-11 model.


KXJ11 - QBUS card (M7616) with PDP-11 based peripheral processor and DMA controller. Based on a J11 CPU equipped with 512kBytes RAM, 64kBytes ROM and parallel and serial interfaces.

Mentec M100 — Mentec redesign of the 11/93, with J-11 chipset at 19.66 MHz, 4 onboard serial ports, 1-4 MB on-board memory, and optional FPU.

Mentec M11 — processor upgrade board; microcode implementation of PDP-11 instruction set by Mentec, using the TI 8832 ALU and TI 8818 microsequencer from Texas Instruments

Mentec M1 — processor upgrade board; microcode implementation of the PDP-11 instruction set by Mentec, using an Atmel 0.35 micron ASIC.

Quickware QED-993 — high performance PDP-11/93 processor upgrade board

Models without standard bus

PDT-11/150

PDT-11/110
PDT-11/130
PDT-11/150

The PDT series were desktop systems marketed as "smart terminals". The /110 and /130 were housed in a VT100 terminal enclosure. The /150 was housed in a table-top unit which included two 8" floppy drives, 3 asynchronous serial ports, 1 printer port, 1 modem port and 1 synchronous serial port and required an external terminal. All three employed the same chipset as used on the LSI-11/03 and LSI-11/2 in four "microm"s. There was an option which combined two of the microms into one dual carrier, freeing one socket for an EIS/FIS chip. The /150 in combination with a VT-105 terminal was also sold as MiniMINC, a budget version of the MINC-11.

PRO-325
PRO-350
PRO-380

The DEC Professional series were desktop PCs intended to compete with IBM's earlier 8088 and 80286 based personal computers. The models were equipped with 5 1/4" floppy disk drives and hard disks, except the 325, which had no hard disk. The original operating system was P/OS, which was essentially RSX-11M+ with a menu system on top. As the design was intended to avoid software exchange with existing PDP-11 models, their ill fate in the market was no surprise for anyone except DEC. RT-11 was eventually ported to the PRO series. A port of RSTS/E to the PRO was also done internally at DEC, but was not released. The PRO-325 and -350 units were based on the DCF-11 ("Fonz") chipset, the same as found in the 11/23, 11/23+ and 11/24. The PRO-380 was based on the DCJ-11 ("Jaws") chipset, the same as found in the 11/53, 11/73, 11/83 and others, though running only at 10 MHz because of limitations in the support chipset.

Models that were planned but never introduced

PDP-11/27 — A Jaws-11 implementation that would have used the VAXBI Bus as its principal I/O bus.

PDP-11/68 — A follow-on to the PDP-11/60 that would have supported 4 MB of physical memory.

PDP-11/74 — A PDP-11/70 extended with multiprocessing features. Up to four processors could be interconnected, although the physical cable management became unwieldy. Another variation on the 11/74 contained both the multiprocessing features and the Commercial Instruction Set. A substantial number of prototype 11/74s (of various types) were built and at least two multiprocessor systems were sent to customers for beta testing, but no systems were ever officially sold. A four-processor system was maintained by the RSX-11 operating system development team for testing, and a uniprocessor system served PDP-11 engineering for general purpose timesharing. The 11/74 was due to be introduced around the same time as the announcement of the new 32-bit product line and its first model, the VAX 11/780. Rumour held that the 11/74 was cancelled because of its higher performance compared to the 11/780 (see, for example [1]): marketing was said to be concerned that the availability of a higher-performing PDP-11 would slow migration to the new VAX. This was not the case; rather, the ability to maintain the product in the field was the issue. Either way, DEC was never able to migrate its entire PDP-11 customer base to the VAX. The primary reason was not performance, but the PDP-11's superior real-time responsiveness.

Special purpose versions

DEC GT40 running Lunar Lander


MINC-23

GT40 — VT11 vector graphics terminal using a PDP-11/05

GT42 — VT11 vector graphics terminal using a PDP-11/10

GT44 — VT11 vector graphics terminal using a PDP-11/40

GT62 — VS60 vector graphics workstation using a PDP-11/34a

H-11 — Heathkit OEM version of the LSI-11/03

VT20 — Terminal with PDP-11/05 with direct mapped character display for text editing and typesetting (predecessor of the VT71)

VT71 — Terminal with LSI-11/03 and QBUS backplane with direct mapped character display for text editing and typesetting

VT103 — VT100 with backplane to host an LSI-11

VT173 — A high-end typeset terminal containing an 11/03

MINC-11 — Laboratory system based on 11/03 or 11/23; when based on the 11/23, it was sold as a 'MINC-23', but many MINC-11 machines were field-upgraded with the 11/23 processor. Early versions of the MINC-specific software package would not run on the 11/23 processor because of subtle changes in the instruction set; MINC 1.2 is documented as compatible with the later processor. The /150 in combination with a VT-105 terminal was also sold as MiniMINC, a budget version of the MINC-11.

C.mmp — Multiprocessor system from Carnegie Mellon University

SBC 11/21 (boardname KXT11) Falcon and Falcon Plus — single board computer on a Qbus card implementing the basic PDP-11 instruction set, based on the T11 chipset, containing 32 KB static RAM, 2 ROM sockets, 3 serial lines, 20 bits of parallel I/O, 3 interval timers and a 2-channel DMA controller. Up to 14 Falcons could be placed into one Qbus system.

Unauthorized clones

The PDP-11 was sufficiently popular that many unauthorized PDP-11-compatible minicomputers and microcomputers were produced in Eastern Bloc countries. At least some of these were pin-compatible with DEC's PDP-11s and could share peripherals and system software. These include:

SM-4, SM-1420, SM-1600, Elektronika BK series, Elektronika 60, Elektronika 85, DVK and UKNC (in the Soviet Union)

SM-4, SM-1420, IZOT-1016 and peripherals (in Bulgaria)

SM-1420 (in East Germany)

SM-4 (in Hungary)

Independent and Coral (in Romania)[citation needed]


CalData — made in the US; ran all DEC operating systems; not much faster than real DEC machines.[citation needed]

Operating systems

Several operating systems were available for the PDP-11.

From Digital

BATCH-11/DOS-11
CAPS-11 (Cassette Based Programme development System)[8]
GAMMA-11[8]
DSM-11
IAS
P/OS
RSTS/E
RSX-11
RT-11
Ultrix-11

From third parties

ANDOS
CSI-DOS
DEMOS (Soviet Union)
Duress (University of Illinois at Urbana-Champaign/Datalogics)[8]
Fuzzball
MERT[8]
Micropower Pascal[8]
MK-DOS
MONECS
MTS (Multi-Tasking System written in RTL/2 by SPL)[8]
MUMPS
PC11 (Decus 11-501/Pilkington)[8]
Sphere (Infosphere - Portland Oregon 1981-87)[8]
Softech Microsystems UCSD System with UCSD Pascal[8]
TRAX (Transaction Processing system)[8]
TRIPOS
TSX-Plus
Unix (many versions, including Version 6 Unix, Version 7 Unix, UNIX System III, and 2BSD)
Xinu (OS for instructional purposes)
Venix (implementation/port of Unix developed by VenturCom)[8]

Peripherals

A wide range of peripherals was available, some of which were also used in other DEC systems such as the PDP-8 or PDP-10.

TU56 — block addressed tape system
TU11 — 9-track tape drive
RX01/RX02 — 8 inch floppy disk
RL01/RL02 — hard disk with exchangeable platter
RK drives — hard disk with exchangeable platter
RA drives — fixed hard disk
PC11 — high speed papertape reader/punch

See also

Delimiterless input
MACRO-11, the PDP-11's native assembly language
SIMH, a multiple minicomputer architecture emulator written in portable C

Use

The PDP-11 family of computers was used for many purposes: as a general-purpose computer, but also often to control complex systems such as traffic-light systems, medical systems, or network management. One example was the management of the packet-switched network Datanet 1. In the 1980s, the UK's air traffic radar processing was conducted on a PDP-11/34 system known as PRDS (Processed Radar Display System) at RAF West Drayton.[citation needed] The software for the Therac-25 medical linear accelerator also ran on a 32K PDP-11/23.[9]

Another use was for storage of test programs for Teradyne ATE equipment, in a system known as the TSD (Test System Director). As such they were in use until their software was rendered inoperable by the Year 2000 problem.

Universal asynchronous receiver/transmitter
From Wikipedia, the free encyclopedia

A universal asynchronous receiver/transmitter (usually abbreviated UART and pronounced /ˈuːɑrt/) is a type of "asynchronous receiver/transmitter", a piece of computer hardware that translates data between parallel and serial forms. UARTs are commonly used in conjunction with other communication standards such as EIA RS-232.

A UART is usually an individual (or part of an) integrated circuit used for serial communications over a computer or peripheral device serial port. UARTs are now commonly included in microcontrollers. A dual UART, or DUART, combines two UARTs into a single chip. Many modern ICs now come with a UART that can also communicate synchronously; these devices are called USARTs (universal synchronous/asynchronous receiver/transmitter).

Contents

1 Definition
    1.1 Transmitting and receiving serial data
    1.2 Asynchronous receiving and transmitting
2 History
    2.1 UART models
3 Structure
4 Special Receiver Conditions
    4.1 Overrun Error
    4.2 Underrun Error
    4.3 Framing Error
    4.4 Parity Error
    4.5 Break Condition
5 Baudrate
6 See also
7 References
8 External links

Definition

Transmitting and receiving serial data

The Universal Asynchronous Receiver/Transmitter (UART) controller is the key component of the serial communications subsystem of a computer. The UART takes bytes of data and transmits the individual bits in a sequential fashion. At the destination, a second UART re-assembles the bits into complete bytes. Serial transmission of digital information (bits) through a single wire or other medium is much more cost effective than parallel transmission through multiple wires. A UART is used to convert the transmitted information between its sequential and parallel form at each end of the link. Each UART contains a shift register which is the fundamental method of conversion between serial and parallel forms.

The UART usually does not directly generate or receive the external signals used between different items of equipment. Typically, separate interface devices are used to convert the logic level signals of the UART to and from the external signaling levels.

External signals may be of many different forms. Examples of standards for voltage signaling are RS-232, RS-422 and RS-485 from the EIA. Historically, the presence or absence of current (in current loops) was used in telegraph circuits. Some signaling schemes do not use electrical wires. Examples of such are optical fiber, IrDA (infrared), and (wireless) Bluetooth in its Serial Port Profile (SPP). Some signaling schemes use modulation of a carrier signal (with or without wires). Examples are modulation of audio signals with phone line modems, RF modulation with data radios, and the DC-LIN for power line communication.

Communication may be "full duplex" (both send and receive at the same time) or "half duplex" (devices take turns transmitting and receiving).

As of 2008, UARTs are commonly used with RS-232 for embedded systems communications. It is useful to communicate between microcontrollers and also with PCs. Many chips provide UART functionality in silicon, and low-cost chips exist to convert logic level signals (such as TTL voltages) to RS-232 level signals (for example, Maxim's MAX232).

Asynchronous receiving and transmitting


In asynchronous transmitting, teletype-style UARTs send a "start" bit, five to eight data bits, least-significant-bit first, an optional "parity" bit, and then one, one and a half, or two "stop" bits. The start bit is the opposite polarity of the data-line's idle state. The stop bit is the data-line's idle state, and provides a delay before the next character can start. (This is called asynchronous start-stop transmission). In mechanical teletypes, the "stop" bit was often stretched to two bit times to give the mechanism more time to finish printing a character. A stretched "stop" bit also helps resynchronization.

The parity bit can either make the number of "one" bits between any start/stop pair odd, or even, or it can be omitted. Odd parity is more reliable because it assures that there will always be at least one data transition, and this permits many UARTs to resynchronize.
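The framing just described — start bit, LSB-first data, optional parity, stop bit(s) — can be sketched as a Python function. The defaults (8 data bits, even parity, one stop bit) are arbitrary choices for the example, not a claim about any particular UART's defaults.

```python
def uart_frame(byte, data_bits=8, parity='even', stop_bits=1):
    """Build the bit sequence for one asynchronous character frame."""
    data = [(byte >> i) & 1 for i in range(data_bits)]  # LSB first
    bits = [0] + data                                   # start bit is a 0
    if parity in ('even', 'odd'):
        ones = sum(data)
        # even parity: make the total count of ones even; odd: make it odd
        bits.append(ones % 2 if parity == 'even' else 1 - ones % 2)
    bits += [1] * stop_bits                             # stop bit(s) idle high
    return bits
```

For the byte 0x41 ('A') with even parity this yields a start bit of 0, the data bits 1,0,0,0,0,0,1,0, a parity bit of 0, and a stop bit of 1.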

In synchronous transmission, the clock data is recovered separately from the data stream and no start/stop bits are used. This improves the efficiency of transmission on suitable channels since more of the bits sent are usable data and not character framing. An asynchronous transmission sends no characters over the interconnection when the transmitting device has nothing to send—only idle stop bits; but a synchronous interface must send "pad" characters to maintain synchronism between the receiver and transmitter. The usual filler is the ASCII "SYN" character. This may be done automatically by the transmitting device.

USART chips have both synchronous and asynchronous modes.

Asynchronous transmission allows data to be transmitted without the sender having to send a clock signal to the receiver. Instead, the sender and receiver must agree on timing parameters in advance and special bits are added to each word which are used to synchronize the sending and receiving units.

When a word is given to the UART for Asynchronous transmissions, a bit called the "Start Bit" is added to the beginning of each word that is to be transmitted. The Start Bit is used to alert the receiver that a word of data is about to be sent, and to force the clock in the receiver into synchronization with the clock in the transmitter. These two clocks must be accurate enough to not have the frequency drift by more than 10% during the transmission of the remaining bits in the word. (This requirement was set in the days of mechanical teleprinters and is easily met by modern electronic equipment.)

After the Start Bit, the individual bits of the word of data are sent, with the Least Significant Bit (LSB) being sent first. Each bit in the transmission is transmitted for exactly the same amount of time as all of the other bits, and the receiver “looks” at the wire at approximately halfway through the period assigned to each bit to determine if the bit is a 1 or a 0. For example, if it takes two seconds to send each bit, the receiver will examine the signal to determine if it is a 1 or a 0 after one second has passed, then it will wait two seconds and then examine the value of the next bit, and so on.
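The mid-bit sampling schedule described above can be sketched as follows. Here `line_at` is a stand-in for probing the physical line, time zero is the leading edge of the start bit, and the timing model is idealized (no clock drift); all names are invented for the example.

```python
def sample_times(bit_period, n_bits):
    # the first data bit is sampled 1.5 bit periods after the start edge,
    # i.e. halfway through the bit that follows the start bit
    return [bit_period * (1.5 + i) for i in range(n_bits)]

def receive_byte(line_at, bit_period, data_bits=8):
    """Sample the line at the middle of each data bit and reassemble the
    byte, least significant bit first."""
    byte = 0
    for i, t in enumerate(sample_times(bit_period, data_bits)):
        byte |= line_at(t) << i
    return byte
```

Feeding it an idealized waveform for the character 0x41, framed with one start and one stop bit, returns 0x41.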

The sender does not know when the receiver has “looked” at the value of the bit. The sender only knows when the clock says to begin transmitting the next bit of the word.

Page 66: PC Architectural Standards Page (1)

When the entire data word has been sent, the transmitter may add a Parity Bit that the transmitter generates. The Parity Bit may be used by the receiver to perform simple error checking. Then at least one Stop Bit is sent by the transmitter.

When the receiver has received all of the bits in the data word, it may check the Parity Bit (both sender and receiver must agree on whether a Parity Bit is to be used), and then the receiver looks for a Stop Bit. If the Stop Bit does not appear when it is supposed to, the UART considers the entire word to be garbled and will report a Framing Error to the host processor when the data word is read. The usual cause of a Framing Error is that the sender and receiver clocks were not running at the same speed, or that the signal was interrupted.

Regardless of whether the data was received correctly or not, the UART automatically discards the Start, Parity and Stop bits. If the sender and receiver are configured identically, these bits are not passed to the host. If another word is ready for transmission, the Start Bit for the new word can be sent as soon as the Stop Bit for the previous word has been sent. Because asynchronous data is "self-synchronizing", the transmission line can simply sit idle when there is no data to transmit.

A data communication pulse can be in only one of two states, but there are many names for the two states. When on (circuit closed, low voltage, current flowing, or a logical zero), the pulse is said to be in the "space" condition. When off (circuit open, high voltage, current stopped, or a logical one), the pulse is said to be in the "mark" condition. A character code begins with the data communication circuit in the space condition. If the mark condition appears, a logical one is recorded; otherwise, a logical zero.

Figure 1 shows this format.

The start bit is always a 0 (logic low), which is also called a space. The start bit signals the receiving DTE that a character code is coming. The next five to eight bits, depending on the code set employed, represent the character. In the ASCII code set the eighth data bit may be a parity bit. The next one or two bits are always in the mark (logic high, i.e., '1') condition and are called the stop bit(s). They provide a "rest" interval for the receiving DTE so that it may prepare for the next character, which may arrive immediately after the stop bit(s). The rest interval was required by mechanical Teletypes, which used a motor-driven camshaft to decode each character. At the end of each character the motor needed time to strike the character bail (print the character) and reset the camshaft.

All operations of the UART hardware are controlled by a clock signal which runs at a multiple (say, 16) of the data rate: each data bit is as long as 16 clock pulses. The receiver tests the state of the incoming signal on each clock pulse, looking for the beginning of the start bit. If the apparent start bit lasts at least one-half of the bit time, it is valid and signals the start of a new character. If not, the spurious pulse is ignored. After waiting a further bit time, the state of the line is again sampled and the resulting level clocked into a shift register. After the required number of bit periods for the character length (5 to 8 bits, typically) have elapsed, the contents of the shift register are made available (in parallel fashion) to the receiving system. The UART will set a flag indicating new data is available, and may also generate a processor interrupt to request that the host processor transfer the received data. In some common types of UART, a small first-in, first-out (FIFO) buffer memory is inserted between the receiver shift register and the host system interface. This allows the host processor more time to handle an interrupt from the UART and prevents loss of received data at high rates.
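A minimal model of the 16x-oversampling receiver described above might look like this. Here `samples` is a list holding the line level at each receiver clock tick; all names are illustrative rather than a real UART's registers:

```python
def receive_word(samples, oversample=16, bits=8):
    """Decode one asynchronous word from oversampled line readings.

    Finds the falling edge of the start bit, confirms it is still low
    at the middle of the bit period (rejecting spurious pulses), then
    samples each data bit at its centre. Returns (value, framing_ok),
    where framing_ok is False if the stop bit is not a mark (1).
    """
    i = 0
    while i < len(samples) and samples[i] == 1:  # line idles at mark
        i += 1
    mid = i + oversample // 2
    if mid >= len(samples) or samples[mid] != 0:
        return None, False                       # spurious start bit
    value = 0
    for n in range(bits):                        # LSB arrives first
        value |= samples[mid + (n + 1) * oversample] << n
    stop = samples[mid + (bits + 1) * oversample]
    return value, stop == 1

# idle line, then an 8N1 frame for 0x41, each bit 16 clocks long
frame = [0] + [(0x41 >> i) & 1 for i in range(8)] + [1]
samples = [1] * 8 + [level for bit in frame for level in [bit] * 16]
assert receive_word(samples) == (0x41, True)
```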

Transmission operation is simpler since it is under the control of the transmitting system. As soon as data is deposited in the shift register, the UART hardware generates a start bit, shifts the required number of data bits out to the line, generates and appends the parity bit (if used), and appends the stop bits. Since transmission of a single character may take a long time relative to CPU speeds, the UART will maintain a flag showing busy status so that the host system does not deposit a new character for transmission until the previous one has been completed; this may also be done with an interrupt. Since full-duplex operation requires characters to be sent and received at the same time, practical UARTs use two different shift registers for transmitted characters and received characters.

Transmitting and receiving UARTs must be set for the same bit speed, character length, parity, and stop bits for proper operation. The receiving UART may detect some mismatched settings and set a "framing error" flag bit for the host system; in exceptional cases the receiving UART will produce an erratic stream of mutilated characters and transfer them to the host system.

Typical serial ports used with personal computers connected to modems use eight data bits, no parity, and one stop bit; for this configuration the number of ASCII characters per second equals the bit rate divided by 10.
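The arithmetic behind that figure: each 8N1 character costs ten bits on the wire (one start bit, eight data bits, one stop bit). A small helper (illustrative) makes the relationship explicit:

```python
def chars_per_second(bit_rate, data_bits=8, parity_bits=0, stop_bits=1):
    """Character throughput of an asynchronous link: the bit rate
    divided by the total frame length (start + data + parity + stop)."""
    frame_bits = 1 + data_bits + parity_bits + stop_bits
    return bit_rate / frame_bits

print(chars_per_second(9600))   # 8N1 -> 960.0 characters per second
```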

History

Some early telegraph schemes used variable-length pulses (as in Morse code) and rotating clockwork mechanisms to transmit alphabetic characters. The first UART-like devices (with fixed-length pulses) were rotating mechanical switches (commutators). These sent 5-bit Baudot codes for mechanical teletypewriters, and replaced Morse code. Later, ASCII required a seven-bit code. When IBM built computers in the early 1960s with 8-bit characters, it became customary to store the ASCII code in 8 bits.

Gordon Bell designed the UART for the PDP series of computers. Western Digital made the first single-chip UART, the WD1402A, around 1971; this was an early example of a medium-scale integrated circuit.

An example of an early 1980s UART was the National Semiconductor 8250. In the 1990s, newer UARTs were developed with on-chip buffers. This allowed higher transmission speeds without data loss and without requiring such frequent attention from the computer. For example, the popular National Semiconductor 16550 has a 16-byte FIFO, and spawned many variants, including the 16C550, 16C650, 16C750, and 16C850.

Depending on the manufacturer, different terms are used to identify devices that perform the UART functions. Intel called their 8251 device a "Programmable Communication Interface". The MOS Technology 6551 was known as the "Asynchronous Communications Interface Adapter" (ACIA). The term "Serial Communications Interface" (SCI) was first used at Motorola around 1975 to refer to their start-stop asynchronous serial interface device, which others were calling a UART.


Some very low-cost home computers or embedded systems dispensed with a UART and used the CPU to sample the state of an input port or directly manipulate an output port for data transmission. These schemes were very CPU-intensive, since the CPU timing was critical, but they avoided the purchase of a costly UART chip. The technique was known as a bit-banging serial port.

UART models

Model             Description
EXAR XR21V1410
Intersil 6402
Z8440             2000 kbit/s. Async, Bisync, SDLC, HDLC, X.25. CRC. 4-byte RX buffer. 2-byte TX buffer. DMA.[1]
8250              Obsolete, with 1-byte buffers
Motorola 6850
6522
6551
Rockwell 65C52
16450
16550
16550A            16-byte buffers, TL=1,4,8,14; 115.2 kbit/s standard, many support 230.4 or 460.8 kbit/s. DMA mode.[2]
16C552
16650             32-byte buffers. 460.8 kbit/s
16750             64-byte send buffer, 56-byte receive buffer. 921.6 kbit/s
16850             128-byte buffers. 460.8 kbit/s or 1500 kbit/s
16C850
16950
Hayes ESP         1 kByte buffers

Structure


A UART usually contains the following components:

- a clock generator, usually a multiple of the bit rate, to allow sampling in the middle of a bit period
- input and output shift registers
- transmit/receive control
- read/write control logic
- transmit/receive buffers (optional)
- parallel data bus buffer (optional)
- first-in, first-out (FIFO) buffer memory (optional)

Special Receiver Conditions

Overrun Error

An "overrun error" occurs when the USART receiver cannot process the character that just came in before the next one arrives. Various USART devices have different amounts of buffer space to hold received characters. The CPU must service the USART in order to remove characters from the input buffer. If the CPU does not service the USART quickly enough and the buffer becomes full, an Overrun Error will occur.

Underrun Error

An "underrun error" occurs when the UART transmitter has completed sending a character and the transmit buffer is empty. In asynchronous modes this is treated as an indication that no data remains to be transmitted, rather than an error, since additional stop bits can be appended. This error indication is commonly found in USARTs, since an underrun is more serious in synchronous systems.

Framing Error

A "framing error" occurs when the designated "start" and "stop" bits are not valid. As the "start" bit is used to identify the beginning of an incoming character, it acts as a reference for the remaining bits. If the data line is not in the expected idle state when the "stop" bit is expected, a Framing Error will occur.

Parity Error

A "parity error" occurs when the number of "active" bits does not agree with the specified parity configuration of the USART. Because the parity bit is optional, this error cannot occur if parity has been disabled. The parity error flag is set when the parity of an incoming data character does not match the expected value.

Break Condition

A "break condition" occurs when the receiver input is held at the "space" level for longer than some duration, typically more than one character time. This is not necessarily an error, but appears to the receiver as a character of all zero bits with a framing error.


Some equipment will deliberately transmit the "break" level for longer than a character as an out-of-band signal. When signaling rates are mismatched, no meaningful characters can be sent, but a long "break" signal can be a useful way to get the attention of a mismatched receiver to do something (such as resetting itself). Unix-like systems can use the long "break" level as a request to change the signaling rate, to support dial-in access at multiple signaling rates.
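A break can be detected by simply timing how long the line stays at the space level. This sketch flags a break once the space run exceeds one character time; the threshold and names are illustrative, and real UARTs vary in how they implement this:

```python
def is_break(samples, bits_per_char=10, oversample=16):
    """True if the line was held at space (0) for longer than one
    character time, measured in receiver clock ticks."""
    char_time = bits_per_char * oversample
    run = longest = 0
    for level in samples:
        run = run + 1 if level == 0 else 0
        longest = max(longest, run)
    return longest > char_time

assert is_break([0] * 200)                 # well over 160 ticks low
assert not is_break([1] * 40 + [0] * 100)  # shorter than a character
```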

Baud rate

In embedded designs, it is necessary to choose a proper oscillator to get the correct baud rate with little or no error. Some examples of common crystal frequencies and baud rates with no errors are:

- 18.432 MHz: 300, 600, 1200, 2400, 4800, 9600, 19200 Bd
- 22.1184 MHz: 300, 600, 1200, 1800, 2400, 4800, 7200, 9600, 14400, 19200, 38400, 57600, 115200 Bd
- 16 MHz: 125000, 500000 Bd
- 24 MHz: 4800 Bd
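The reason those crystals work is that a 16550-style UART derives its bit clock by dividing the crystal frequency by 16 times an integer divisor; a crystal is "error free" for a baud rate when that division is exact. A sketch, assuming the usual 16x clock:

```python
def baud_divisor(crystal_hz, baud, oversample=16):
    """Divisor a 16550-style UART would load for a target baud rate,
    and the resulting rate error in percent."""
    divisor = round(crystal_hz / (oversample * baud))
    actual = crystal_hz / (oversample * divisor)
    return divisor, 100 * (actual - baud) / baud

print(baud_divisor(18_432_000, 9600))   # -> (120, 0.0): divides exactly
```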

Plug and play

In computing, plug and play describes the characteristic of a computer bus or device specification that facilitates the discovery of a hardware component in a system without the need for physical device configuration or user intervention in resolving resource conflicts.[1][2]

Plug and play refers both to the boot-time assignment of device resources and to hot-plug systems such as USB and FireWire.[3]


History of Device Configuration


(Photo caption: IBM 402 Accounting Machine plug-board wiring. This board was labeled "profit & loss summary.")


In the early days of data processing technology, the hardware was just a collection of modules, and the functions of those modules had to be linked to accommodate different calculating operations. This linking was usually done by connecting some wires between modules and disconnecting others. For many mechanical data processing machines, such as the IBM punched card accounting machines, calculating operations were directed by the use of a quick-swap control panel wired to route signals between module sockets.

As general purpose computing devices developed, these connections and disconnections were instead used to specify locations in the system address space where an expansion device should appear, in order for the device to be accessible by the central processing unit. If two or more of the same device were installed in one computer, it would be necessary to assign the second device to a separate, non-overlapping region of the system address space so that both could be accessible at the same time.

Some early microcomputing devices such as the Apple II required the end-user to physically cut some wires and solder others together to make these configuration changes. The changes were intended to be largely permanent for the life of the hardware.

Jumpers

Over time the need developed for more frequent changes and for easier changes to be made by unskilled computer users. Rather than cutting and soldering connections, the header and jumper was developed. The header consists of two or more vertical pins arranged in an evenly-spaced grid. The jumper is a small conductive strip of metal clipped across the header pins. The conductive jumper strip is commonly encased in a plastic shell to help prevent electrical shorting between adjacent jumpers.

Page 72: PC Architectural Standards Page (1)

(Photo caption: Slide-style DIP switch.)

Jumpers have the unfortunate property of being easy to misplace when not in use, and are difficult to grasp in order to remove them from headers. To make these changes easier, the DIP switch was developed, also known as a dual in-line package switch. The DIP switch has small rocker or sliding switches enclosed in a plastic shell, usually numbered for easy reference. DIP switches usually come in units of four or eight switches; longer rows of switches can be made by combining two or more units. DIP switches are particularly useful where a long string of jumpers would be closely packed together, or where four or more jumpers would be used in combination to configure one device function. DIP switches also have a particular advantage for configuration settings which are likely to be changed more frequently than once every few years. (Because of the inconvenience of setting them, jumpers are typically used for settings that are not expected to change unless the device is removed from one computer and installed in another, an infrequent occurrence for internal devices in consumer desktop PCs.)

Early self-configuring devices

(Photo caption: Typical MCA expansion card without jumpers or DIP switches.)

As computing devices spread further into the general population, there was ever greater pressure to automate this configuration process. One of the first major industry efforts toward self-configuration was made by Commodore in 1986 with the creation of their Amiga 2000 line of computers, using the AutoConfig protocol and the Zorro II expansion bus. This was a giant leap forward, as expansion devices had no jumpers or DIP switches at all.

However, IBM's first attempt at self-configuration, the Micro Channel Architecture (MCA) used in their Personal System/2 line of computers, had a few major problems. In an attempt to simplify device setup, every piece of hardware was issued with a disk containing a special file used to auto-configure the hardware to work with the computer. (If the device required one or more drivers for specific operating systems, they were usually included on the same disk.) Without this disk the hardware was completely useless, and the computer would not boot at all until the unconfigured device was removed.

MCA also suffered from being a proprietary technology. Unlike their previous PC bus design, the AT bus, IBM did not publicly release specifications for MCA and actively pursued patents to block third parties from selling unlicensed implementations of it, and the developing PC clone market did not want to pay royalties to IBM in order to use this new technology. The PC clone makers instead developed EISA, an extension to the existing non-PnP AT bus standard, which they also further standardized and renamed ISA (to avoid IBM's "AT" trademark). With few vendors other than IBM supporting it with computers or cards, MCA eventually failed in the marketplace. Most vendors of PC compatibles stayed largely with ISA and manual configuration, while EISA offered the same type of auto-configuration featured in MCA. (EISA cards required a configuration file as well.)

In time, many ISA cards incorporated, through proprietary and varied techniques, hardware to self-configure or to provide for software configuration; often the card came with a configuration program on disk that could automatically set the software-configurable (but not itself self-configuring) hardware. Some cards had both jumpers and software configuration, with some settings controlled by each; this compromise reduced the number of jumpers that had to be set, while avoiding great expense for certain settings, e.g. nonvolatile registers for a base address setting. The problems of required jumpers continued but slowly diminished as more and more devices, both ISA and other types, included extra self-configuration hardware. However, these efforts still did not solve the problem of making sure the end-user had the appropriate software driver for the hardware.

ISA PnP, or (legacy) Plug & Play ISA, was a plug-and-play system that used a combination of modifications to hardware, the system BIOS, and operating system software to automatically manage resource allocations. It was superseded by the PCI bus during the mid-1990s.

See also

- Autodetection
- Auto-configuration
- Autoconfig (Amiga)
- Hot plugging
- Display Data Channel
- PCI configuration space
- Universal Plug and Play (UPnP)
- USB flash drive