intel core micro-architecture

Upload: sabbir-ul-haque

Post on 08-Apr-2018


  • 8/7/2019 Intel Core Micro-architecture

    1/35

    Intel Core Micro-Architecture

    By -

    Sabbir ul Haque

  • 8/7/2019 Intel Core Micro-architecture

    2/35

    This Presentation is prepared by

    Md. Sabbir ul Haque [072 091 040]

    Dept. of ECE

    PRESIDENCY UNIVERSITY

    Dhaka.

    Presented by- Group 5

    1. Md. Sabbir ul Haque

    2. Sajib

    3. Nadira Yasmin

  • 8/7/2019 Intel Core Micro-architecture

    3/35

    Outline

    History

    Microprocessor After Pentium Pro

    Intel Core Duo Processor Overview

    Architectural Features of Core 2

    Micro architecture

    Resources

  • 8/7/2019 Intel Core Micro-architecture

    4/35

    History

    (List of Intel microprocessors)

    The 4-bit processors
        4004, 4040
    The 8-bit processors
        8008, 8080, 8085
    The 16-bit processors: Origin of x86
        8086, 8088, 80186, 80188, 80286
    The 32-bit processors: Non x86
        iAPX 432, 80960, 80860, XScale
    The 32-bit processors: The 80386 Range
        80386DX, 80386SX, 80376, 80386SL, 80386EX
    The 32-bit processors: The 80486 Range
        80486DX, 80486SX, 80486DX2, 80486SL, 80486DX4
    The 32-bit processors: The Pentium (I)
        Pentium, Pentium MMX
    The 32-bit processors: P6/Pentium M
        Pentium Pro, Pentium II, Celeron, Pentium III, PII and III Xeon, Celeron (PIII), Pentium M, Celeron M, Intel Core, Dual Core Xeon LV
    The 32-bit processors: NetBurst microarchitecture
        Pentium 4, Xeon, Pentium 4 EE
    The 64-bit processors: IA-64
        Itanium, Itanium 2
    The 64-bit processors: EM64T - NetBurst
        Pentium D, Pentium Extreme Edition, Xeon
    The 64-bit processors: EM64T - Core microarchitecture
        Xeon, Intel Core 2

  • 8/7/2019 Intel Core Micro-architecture

    5/35

    Microprocessor Hall Of Fame

    1995: Intel Pentium Pro Processor

    Released in the Fall of 1995.

    5.5 million transistors.

    Designed for 32-bit server and workstation applications.

    Packaged with a second speed-enhancing cache memory chip.

  • 8/7/2019 Intel Core Micro-architecture

    6/35

    Microprocessor Hall Of Fame

    1997: Intel Pentium II Processor

    7.5 million transistors.

    Incorporates Intel MMX technology, which is designed specifically to process video, audio and graphics data efficiently.

    High-speed cache memory chip.

  • 8/7/2019 Intel Core Micro-architecture

    7/35

    Microprocessor Hall Of Fame

    1999: Intel Pentium III Processor

    9.5 million transistors.

    Using 0.25-micron technology.

    70 new instructions that enhance the performance of:

    Advanced imaging

    3D

    Streaming audio, video

  • 8/7/2019 Intel Core Micro-architecture

    8/35

    Microprocessor Hall Of Fame

    2000: Intel Pentium 4 Processor

    42 million transistors.

    Circuit lines of 0.18 microns.

    Intel's first microprocessor, the 4004, ran at 108 kHz, compared to the Intel Pentium 4 processor's initial speed of 1.5 GHz. If automobile speed had increased similarly over the same period, you could now drive from San Francisco to New York (about 4,100 km) in about 13 seconds.

  • 8/7/2019 Intel Core Micro-architecture

    9/35

    Microprocessor Hall Of Fame

    2006: The Intel Core Duo processor

    151 million transistors.

    Using 65 nm technology.

    2.33 to 2.50 GHz clock frequency.

    4-wide, 14-stage pipeline.

    Low power consumption.

  • 8/7/2019 Intel Core Micro-architecture

    10/35

    Benefits

    New microarchitecture:
        Low power.
        Higher performance.

    At home:
        Ultra-quiet.
        Sleek and low-power computing.

    For IT:
        Reduced footprints.
        Lower power.
        Energy efficiency across client and server platforms.

    For mobile users:
        Greater computer performance and battery life to enable a variety of small form factors that enable world-class computing "on the go."

  • 8/7/2019 Intel Core Micro-architecture

    11/35

    Five Key Innovations

    Intel Wide Dynamic Execution
    Intel Advanced Digital Media Boost
    Intel Intelligent Power Capability
    Intel Smart Memory Access
    Intel Advanced Smart Cache

  • 8/7/2019 Intel Core Micro-architecture

    12/35

    Five Key Innovations

    Intel Wide Dynamic Execution
        4-wide
        14-stage pipeline
        Macro-fusion
    Intel Advanced Digital Media Boost
    Intel Smart Memory Access
    Intel Advanced Smart Cache
    Intel Intelligent Power Capability

  • 8/7/2019 Intel Core Micro-architecture

    13/35

    Five Key Innovations

    Intel Advanced Digital Media Boost
        Single-cycle 128-bit SSE
    Intel Wide Dynamic Execution
    Intel Smart Memory Access
    Intel Advanced Smart Cache
    Intel Intelligent Power Capability

  • 8/7/2019 Intel Core Micro-architecture

    14/35

    Five Key Innovations

    Intel Advanced Smart Cache
        Shared L2 cache
    Intel Wide Dynamic Execution
    Intel Advanced Digital Media Boost
    Intel Smart Memory Access
    Intel Intelligent Power Capability

  • 8/7/2019 Intel Core Micro-architecture

    15/35

    Five Key Innovations

    Intel Smart Memory Access
        Advanced pre-fetch
        Memory disambiguation
    Intel Wide Dynamic Execution
    Intel Advanced Digital Media Boost
    Intel Advanced Smart Cache
    Intel Intelligent Power Capability

  • 8/7/2019 Intel Core Micro-architecture

    16/35

    Five Key Innovations

    Intel Intelligent Power Capability
        Advanced power gating
    Intel Wide Dynamic Execution
    Intel Advanced Digital Media Boost
    Intel Smart Memory Access
    Intel Advanced Smart Cache

  • 8/7/2019 Intel Core Micro-architecture

    17/35

    Intel Wide Dynamic Execution

    14-stage pipeline

    Instruction decoder: Macro-fusion

  • 8/7/2019 Intel Core Micro-architecture

    18/35

    Pipeline Concept

    Core microarchitecture uses a 14-stage pipeline. A pipeline is the list of all stages a given instruction must go through in order to be fully executed. In a pipeline, a set of data processing elements is connected in series, so that the output of one element is the input of the next one. The elements of a pipeline are often executed in parallel or in time-sliced fashion.

    Intel did not disclose the Pentium M's pipeline, and so far it has not published a description of each stage of the Core microarchitecture pipeline either, so we are unable to provide more in-depth information on that.
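The overlap of stages described on this slide can be illustrated with a small timing model. This is only a sketch under simplifying assumptions that are not from the slides: one instruction enters the pipeline per cycle (the real core is 4-wide), there are no stalls or branch mispredictions, and the function names `pipeline_cycles` and `pipeline_schedule` are hypothetical.

```python
def pipeline_cycles(num_instructions, num_stages):
    """Cycles needed by an ideal, stall-free, single-issue pipeline.

    The first instruction needs num_stages cycles to pass through every
    stage; each later instruction finishes one cycle after the previous one.
    """
    if num_instructions == 0:
        return 0
    return num_stages + num_instructions - 1


def pipeline_schedule(num_instructions, num_stages):
    """Return, for every cycle, the (instruction, stage) pairs that are active."""
    schedule = []
    for cycle in range(pipeline_cycles(num_instructions, num_stages)):
        active = []
        for instr in range(num_instructions):
            stage = cycle - instr  # instruction i enters stage 0 at cycle i
            if 0 <= stage < num_stages:
                active.append((instr, stage))
        schedule.append(active)
    return schedule


if __name__ == "__main__":
    # 100 instructions through a 14-stage pipeline, versus running each
    # instruction through all 14 stages before starting the next one.
    print("pipelined cycles:    ", pipeline_cycles(100, 14))
    print("non-pipelined cycles:", 100 * 14)
    for cycle, active in enumerate(pipeline_schedule(3, 4)):
        print("cycle", cycle, active)
```

In this model the stage count mostly affects how long the pipeline takes to fill; once it is full, one instruction completes every cycle.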

  • 8/7/2019 Intel Core Micro-architecture

    19/35

    Macro-Fusion

    Macro-fusion is the ability to join two x86 instructions together into a single micro-op.

    Consider this piece of a program:

        mov eax, [mem1]
        cmp eax, [mem2]
        jne target

    With macro-fusion, the comparison (cmp) and branching (jne) instructions are merged into a single micro-op. So after passing through the instruction decoder, this part of the program will look something like this:

        mov eax, [mem1]
        cmp eax, [mem2] + jne target

    Compare instruction + Jump instruction = one micro-op
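The fusion step on this slide can be sketched as a tiny decoder model. It is an illustration only: the function name `decode_with_macro_fusion`, the string representation of instructions, and the list of conditional jumps are assumptions made for this example, and real hardware applies further restrictions that the slide does not describe. The three-instruction listing is the slide's example, written with the x86 mnemonic `mov` for the load.

```python
# Instructions that may open a fused pair, and conditional jumps that may close it.
# Both sets are illustrative assumptions, not a complete hardware rule set.
FUSIBLE_FIRST = {"cmp", "test"}
CONDITIONAL_JUMPS = {"je", "jne", "jl", "jle", "jg", "jge", "jb", "jbe", "ja", "jae"}


def mnemonic(instruction):
    """Return the opcode mnemonic of an assembly line such as 'cmp eax, [mem2]'."""
    return instruction.split()[0].lower()


def decode_with_macro_fusion(instructions):
    """Map instructions to micro-ops, fusing compare + conditional jump pairs."""
    micro_ops = []
    i = 0
    while i < len(instructions):
        current = instructions[i]
        following = instructions[i + 1] if i + 1 < len(instructions) else None
        if (
            following is not None
            and mnemonic(current) in FUSIBLE_FIRST
            and mnemonic(following) in CONDITIONAL_JUMPS
        ):
            # Two instructions leave the decoder as one micro-op.
            micro_ops.append(current + " + " + following)
            i += 2
        else:
            micro_ops.append(current)
            i += 1
    return micro_ops


if __name__ == "__main__":
    program = ["mov eax, [mem1]", "cmp eax, [mem2]", "jne target"]
    for micro_op in decode_with_macro_fusion(program):
        print(micro_op)
```

The model treats every instruction as exactly one micro-op, as the slide does, so three instructions become two micro-ops after fusion.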

  • 8/7/2019 Intel Core Micro-architecture

    20/35

    Intel Wide Dynamic Execution

    As we can see, we saved one instruction. The fewer instructions there are to execute, the faster the computer finishes the task and the less power the CPU consumes.

    Every execution core is 33 percent wider than in previous generations (four instructions per clock instead of three), allowing each core to fetch, dispatch, execute and retire up to four full instructions simultaneously.

    Figure: Fetch unit and instruction decoder on Core microarchitecture

  • 8/7/2019 Intel Core Micro-architecture

    21/35

    Intel Advanced Digital Media Boost

    Another new feature found on Core microarchitecture is a true 128-bit internal datapath. On previous CPUs, the internal datapath was only 64 bits wide. This was a problem for SSE instructions, since the SSE registers, called XMM, are 128 bits long. So, when executing an instruction that manipulated 128-bit data, the operation had to be broken down into two 64-bit operations.

    SIMD: In computing, SIMD (Single Instruction, Multiple Data) is a technique employed to achieve data-level parallelism, as in a vector or array processor.

    SSE: Streaming SIMD Extensions.

  • 8/7/2019 Intel Core Micro-architecture

    22/35

    Intel Advanced Digital Media Boost

    Lower 64 bits in one cycle, upper 64 bits in the next

  • 8/7/2019 Intel Core Micro-architecture

    23/35

    Intel Advanced Digital Media Boost

    128-bit instruction completed in one cycle
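The difference between the two figures (two 64-bit halves versus one 128-bit pass) can be modelled roughly as follows. This is a sketch, not a description of the actual hardware: it assumes a packed add of four 32-bit integer lanes, treats list indices 0 and 1 as the lower 64 bits, and counts one "cycle" per pass through the datapath. The function names are hypothetical.

```python
LANE_MASK = 0xFFFFFFFF  # each lane is a 32-bit unsigned integer that wraps around


def packed_add_split(a_lanes, b_lanes):
    """64-bit datapath: lower two lanes in one cycle, upper two lanes in the next."""
    result = [0, 0, 0, 0]
    cycles = 0
    for half in (slice(0, 2), slice(2, 4)):
        result[half] = [(x + y) & LANE_MASK for x, y in zip(a_lanes[half], b_lanes[half])]
        cycles += 1
    return result, cycles


def packed_add_full(a_lanes, b_lanes):
    """128-bit datapath: all four lanes are processed in a single cycle."""
    result = [(x + y) & LANE_MASK for x, y in zip(a_lanes, b_lanes)]
    return result, 1


if __name__ == "__main__":
    a = [1, 2, 3, 0xFFFFFFFF]
    b = [10, 20, 30, 1]
    print("split datapath:", packed_add_split(a, b))
    print("full datapath: ", packed_add_full(a, b))
```

Both functions produce the same lanes; only the cycle count differs, which is the point the two figures make.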

  • 8/7/2019 Intel Core Micro-architecture

    24/35

    Intel Advanced Digital Media Boost

    Enables these 128-bit instructions to be completely executed at a throughput rate of one per clock cycle, effectively doubling the speed of execution for these instructions as compared to previous generations.

    This feature significantly improves performance when executing Streaming SIMD Extension (SSE/SSE2/SSE3) instructions:

    Video, Speech and Image (MPEG).

    Photo Processing.

    Encryption.

  • 8/7/2019 Intel Core Micro-architecture

    25/35

    Intel Advanced Smart Cache

    The Intel Advanced Smart Cache is a multi-core optimized cache that significantly reduces latency to frequently used data, thus improving performance and efficiency by increasing the probability that each execution core of a multi-core processor can access data from a higher-performance, more efficient cache subsystem.

  • 8/7/2019 Intel Core Micro-architecture

    26/35

    Intel Advanced Smart Cache

    Core microarchitecture was created with the multi-core concept in mind, i.e. more than one core per package. On Pentium D, which is the dual-core version of Pentium 4, each core has its own L2 memory cache. The problem with that is that at some moment one core may run out of cache while the other has unused space in its own L2 memory cache. When this happens, the first core must fetch data from main RAM, even though there is empty space in the second core's L2 memory cache that could have held that data and spared the first core the trip to main RAM.

    Higher cache hit rate
    Reduced bus traffic
    Lower latency to data

    Figure labels: Decreased traffic, Increased traffic
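The problem with separate L2 caches can be reproduced with a toy simulation. Everything here is an assumption chosen for illustration: the cache is a simple LRU model counted in lines, the capacities (4 lines per private cache, 8 lines shared) and the access pattern are made up, and the names `LRUCache`, `build_trace`, `misses_private` and `misses_shared` are hypothetical. It does not model the real cache organisation of Pentium D or Core 2.

```python
from collections import OrderedDict


class LRUCache:
    """A tiny least-recently-used cache that only counts misses."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()
        self.misses = 0

    def access(self, address):
        if address in self.lines:
            self.lines.move_to_end(address)  # mark as most recently used
            return
        self.misses += 1  # a miss means a trip to main RAM
        self.lines[address] = None
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)  # evict the least recently used line


def build_trace(rounds=10):
    """Core 0 cycles over 6 lines; core 1 only ever touches 2 lines."""
    trace = []
    for step in range(rounds * 6):
        trace.append((0, ("core0", step % 6)))
        trace.append((1, ("core1", step % 2)))
    return trace


def misses_private(trace, capacity_per_core):
    """Each core has its own L2 cache, as described for Pentium D."""
    caches = {0: LRUCache(capacity_per_core), 1: LRUCache(capacity_per_core)}
    for core, address in trace:
        caches[core].access(address)
    return caches[0].misses + caches[1].misses


def misses_shared(trace, total_capacity):
    """Both cores share one L2 cache of the same total size."""
    cache = LRUCache(total_capacity)
    for _, address in trace:
        cache.access(address)
    return cache.misses


if __name__ == "__main__":
    trace = build_trace()
    print("private 2 x 4 lines:", misses_private(trace, 4), "misses")
    print("shared  1 x 8 lines:", misses_shared(trace, 8), "misses")
```

With private caches, core 0 keeps overflowing its 4 lines while half of core 1's cache sits empty; with the shared cache, all 8 distinct lines fit and only the first access to each line misses.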

  • 8/7/2019 Intel Core Micro-architecture

    27/35

    Intel Smart Memory Access

    Optimizes the use of the available data bandwidth from the memory subsystem.

    Includes a new capability called Memory Disambiguation, which increases the efficiency of out-of-order processing by providing the execution cores with the built-in intelligence to speculatively load data for instructions that are about to execute before all previous store instructions are executed.
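The idea of loading early and checking afterwards can be sketched in a few lines. This is a simplified software analogy, not Intel's mechanism: it assumes the load is always issued speculatively (a real core predicts when that is safe), models memory as a dictionary, and uses the hypothetical function name `speculative_load`.

```python
def speculative_load(memory, pending_stores, load_address):
    """Issue a load before earlier stores have executed, then verify it.

    memory         -- dict mapping address to value (modified in place)
    pending_stores -- earlier stores in program order, as (address, value) pairs
    load_address   -- address read by the later load

    Returns (value, replayed), where replayed is True if the speculation
    was wrong and the load had to be repeated.
    """
    # The load runs ahead of the stores and reads whatever memory holds now.
    speculative_value = memory.get(load_address, 0)

    # The earlier stores execute afterwards; note whether any of them
    # wrote to the address that was already loaded.
    conflict = False
    for address, value in pending_stores:
        memory[address] = value
        if address == load_address:
            conflict = True

    if conflict:
        # Mis-speculation: discard the early value and load again.
        return memory[load_address], True
    return speculative_value, False


if __name__ == "__main__":
    # The store goes to a different address, so the early load was correct.
    print(speculative_load({0x10: 1, 0x20: 2}, [(0x30, 9)], 0x10))
    # The store hits the loaded address, so the load is replayed.
    print(speculative_load({0x10: 1, 0x20: 2}, [(0x10, 7)], 0x10))
```

When the stores turn out not to alias the load, the early value is kept and the load's latency is hidden; the replay path is the cost paid when the guess is wrong.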

  • 8/7/2019 Intel Core Micro-architecture

    30/35

    Advanced Pre-fetch

    Prefetches are shared between the cores, i.e. if the memory cache system loaded a block of data to be used by the first core, the second core can also use the data already loaded in the cache. On the previous architecture, if the second core needed data that was located in the cache of the first core, it had to access it through the external bus (which runs at the CPU external clock, far lower than the CPU internal clock) or even fetch the required data directly from system RAM.

  • 8/7/2019 Intel Core Micro-architecture

    31/35

    Execution units in Intel Core Micro-architecture

    Pentium M has five dispatch ports located on its Reservation Station, but only two ports are used to dispatch micro-ops to execution units. The other three are used by memory-related units (Load, Store Address and Store Data). Core microarchitecture has six dispatch ports, three of which are used to send micro-ops to execution units. This means that CPUs using Core microarchitecture can send three micro-ops to be executed per clock cycle, compared with only two on Pentium M.
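As a rough back-of-the-envelope model of what the extra execution port buys, the sketch below counts the cycles needed to dispatch a batch of micro-ops. It assumes the micro-ops are all independent, that any of them can go to any execution port, and that each port accepts one micro-op per cycle; real code rarely meets those conditions. The function name `cycles_to_dispatch` is hypothetical.

```python
import math


def cycles_to_dispatch(num_micro_ops, execution_ports):
    """Minimum cycles to dispatch independent micro-ops, one per port per cycle."""
    return math.ceil(num_micro_ops / execution_ports)


if __name__ == "__main__":
    micro_ops = 12
    print("Pentium M (2 execution ports):", cycles_to_dispatch(micro_ops, 2), "cycles")
    print("Core      (3 execution ports):", cycles_to_dispatch(micro_ops, 3), "cycles")
```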

  • 8/7/2019 Intel Core Micro-architecture

    32/35

    Execution units in Intel Core Micro-architecture

    Core microarchitecture provides one extra FPU and one extra IEU (a.k.a. ALU) compared to Pentium M's architecture. This means Core microarchitecture can process three integer instructions per clock cycle, compared with only two on Pentium M.

  • 8/7/2019 Intel Core Micro-architecture

    33/35

    Intel Advanced Power Gating

    This feature enables the CPU to shut down units that aren't being used at the moment. The idea goes even further, as the CPU can shut down specific parts inside each CPU unit in order to save energy, dissipate less power and provide greater battery life (in the case of mobile CPUs).

    Another power-saving capability of Core microarchitecture is to turn on only the necessary bits in the CPU internal busses.

  • 8/7/2019 Intel Core Micro-architecture

    34/35

    Resources

    Intel.com

    PCWorld.com

    ExtremeTech.com

    Wikipedia.org

    Microsoft.com

    Hardwaresecrets.com
