Department of Computer Systems
Lecture 1 - Introduction
Erno Salminen
TIE-50200 Logiikkasynteesi (Logic Synthesis)
Tampere University of Technology, 2013-2014
#2/47 Department of Computer Systems, Erno Salminen - Jan. 2014

Lecture contents
1. Course organization
2. Introduction to implementing digital systems

Course Goals
Get to know practical digital system design
Be aware of the challenges of digital system design
Design for portability
  Device independence, software independence, RTL design, parameterization
Design for large scale
  Large modules, large systems, overall development process
  Design reuse
Design for efficiency

Course Description
Web: http://www.tkt.cs.tut.fi/kurssit/50200
Note: This course is as POP-free as possible
Lectures, starting on Tue 7.1.2014
  Monday 14-16 TB220
  Tuesday 10-12 TB219 (three times in January!)
Exercises at TC221, starting on 7.1.2014
  Arto Perttula
  Tue 8-10
  Fri 10-12

Course Description (2)
Course requirements:
  Regular exam or two midterm exams (own notes are allowed)
  Successful exercises/exercise work
Course is primarily based on the book:
  Chu, Pong P. (2006). RTL Hardware Design Using VHDL: Coding for Efficiency, Portability, and Scalability.
  Can be borrowed from the lecturer
  Available at the TUT library
Snippets from other sources also
  Available from the lecturer
Lectures and lecture notes should be enough for passing the course

Course contents
I. VHDL language
  Very High Speed Integrated Circuit Hardware Description Language = VHSIC HDL = VHDL
  Familiarize with the language constructs
II. Testbenches and simulators, synthesis, guidelines for reuse
III. FPGA circuits, designing for them
IV. Advanced topics: multiple clock domains, clock synchronization, system design challenges

Preliminary schedule, spring 2014
[Figure: preliminary schedule, with course parts I-IV marked]

Mandatory exercise work
Simple audio synthesizer implemented on an FPGA development board
  Each of the four buttons produces a different tone
  Sound is heard from the external speakers
[Figures: DE2 development board; block diagram of the synthesizer]

During the exercises, you'll learn
1. to describe, synthesize, and verify digital systems using VHDL
  De facto standard in the European microelectronics industry
2. to read data sheets
3. to use the I2C bus developed by Philips
  Serial bus used e.g. in the car industry
4. to operate a Wolfson audio codec chip
  Also used e.g. in some iPods

Exercises in practice
Weekly exercises in TC221 (Windows class)
  Done in groups of two (alone only in special cases)
  Two guidance sessions per week
  Presence is not required
  Return is mandatory
  Deadline is within two weeks (due Sunday 23:59)
The first exercise is on week 2, 6-10.01.2014
Possibility to gain 6 bonus points to the passed exam by completing separate bonus tasks
You must report (avg) hours per person for each exercise

Reserve enough time
Exercises take 3-4 h/week on average
  but verification is harder than you think
  large variations between groups
Start early!
[Figure: bar chart of time spent per exercise in hours (y-axis 0-8) for exercises 1-15: tutorial, 3-b +, gen. +, hier., tied-tb, triangle, audio ctrl, audio tb, synth top, quartus, i2c, i2c tb, debug, fifo, synth. Series: avg (all) and min. Data labels in order: 1.45, 1.55, 1.85, 2.07, 4.70, 2.64, 4.28, 5.84, 1.87, 2.18, 7.46, 7.33, 3.96, 1.97, 1.71]

Getting the development boards and software
Students may borrow an FPGA kit to do exercises and their own hobby projects
  You may keep the kit if you write a BSc/MSc thesis for the Department of Pervasive Computing
  Next pickups Fri 10.1.2014 at 14-16 from the room TG218 (Riku Uusikartano)
  http://www.tkt.cs.tut.fi/Opetus/Fpga_board
Students may install the needed EDA tools on their own computer
  http://www.tkt.cs.tut.fi/tools/public/tutorials/mentor/licensing/licensing.html (to be updated)

Action points
1. Register to one of the exercise groups
2. Access rights from fall 2013 are still valid
  Otherwise, fill in and sign the access application and confidentiality agreement
  http://www.tkt.cs.tut.fi/kurssit/STUDENTS_CONFIDENTIALITY_AGREEMENT.pdf
  Return the form to Jari Salo at room TE207
3. Optional: You may install the needed SW tools on your home computer
4. Optional: You may borrow an Altera DE2 FPGA board

HW-oriented courses at TUT
Most direct follow-up is TIE-50506 System Design, which combines HW and SW on a single chip
  CPU, application software, drivers + OS
  HW accelerator, system architecture, buses
Check the details from the study guide
[Figure: course map. Group labels: MSc degree (DI-tutkinto) 30 cr; prerequisites/degree-programme-specific; BSc degree (Kandidaatin tutkinto) 25 cr. Courses shown:]
  TST-01100 JohdTietotekn, 6 cr (s1)
  TIE-50100 DigSuunn, 5 cr (s1)
  TIE-50200 LogSyn, 5 cr (k3)
  TIE-51200 TietokoneenArk, 5 cr (k3)
  TIE-05200 Mikroprosess., 4 cr (k3)
  TIE-50506 SystemDes, 5 cr (s1)
  TIE-50306 EleSysLevDes, 5 cr (k3)
  TIE-50406 Dsp Imple, 5 cr (s3)
  ELT-40606 DigCircAndPlat, 5 cr (s1)
  TIE-13100 Projektityö, 5-10 cr
  TIE01300 DI-työ semin., 1+0 cr
  TIE-11xxx Seminaari, 1-6 cr
  TIE-01606 DemolaProject, 5-10 cr
  TIE-10100 Kandisemin., 0 cr
  or TIE-05100 JohdDig, 3 cr
  ELT-23000 MIkrokontJärj, 5 cr (k3)
  ELT-21300 MIkrokont, 5 cr (s1)
  ELT-23100 VerkSulLait, 5 cr (s2)
  ELT-23050 SulJärjTuott, 5 cr (k4)

1. Introduction

Acknowledgements
TKT-1212 Dig.järj.tot., autumn 2008, A. Kulmala, TUT
Prof. Pong P. Chu provided "official" slides for the book, which is gratefully acknowledged
  See also: http://academic.csuohio.edu/chu_p/
Most slides were made by Ari Kulmala and other previous lecturers (Teemu Pitkänen, Konsta Punkka, Mikko Alho…)

Digital Circuits
Nowadays found everywhere, from washing machines to space shuttles
Digital circuits are typically integrated circuits (IC)
  Minimize the number of discrete components
Typical digital systems, such as cellular phones, contain
  (Several) processors and co-processors
  Application-specific hardware
  An on-chip interconnection between the components
  Memory: RAM, Flash, even hard disks
  RF/analog IC (out of the scope of this course)

How to implement a digital system
No two applications are identical, and every one needs a certain amount of customization
Basic methods for customization
1. "General-purpose hardware" with custom software
  General-purpose processor (GPP): e.g., performance-oriented processor (e.g., Pentium), cost-oriented processor (e.g., AVR microcontroller)
  Special-purpose processor: architecture with a specific set of functions, e.g., DSP processor (efficient multiply-add), network processor (buffering and routing), GPU (3D rendering)
2. Custom software on a custom platform (CPU + other hardware) (requires hardware-software co-design)
3. Custom hardware (no software)

How to implement a digital system (2)
Trade-off between flexibility, programmability, design effort, cost, performance, and power consumption
A complex application contains many different tasks and uses more than one customization method

1a. Device Technologies

What does an IC look like?
Intel Penryn dual core
http://www.intel.com/pressroom/kits/45nm/photos.htm
[Figure labels: Package; The IC]
http://www.namedevelopment.com/blog/archives/Intel-penryn.gif

What does an IC look like? (2)
45 nm, quad-core
Note the symmetry
Two dual-cores integrated
http://www.intel.com/pressroom/kits/45nm/photos.htm

Structure of an IC
Transistors and connections are made from many layers (typically 10 to 15 in CMOS) built on top of one another
  Ever-increasing number of layers (more layers, more cost, though)
Each layer has a special pattern defined by a mask
One important aspect of an IC is the length of the smallest feature that can be fabricated
  Feature may stand for the channel length of the transistor or the width of a wire (or something completely different…)
  Unit is micrometer (μm, 10^-6 m) or nanometer (nm, 10^-9 m)
  E.g., we may say an IC is built with a 0.35 μm process
The process continues to improve (Moore's law) even in the deep sub-micron era
  The state-of-the-art commercial process is 22 nm, and Intel has demonstrated 14 nm
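
As a rough illustration of what the shrinking feature size means, the sketch below compares the 0.35 μm and 22 nm processes mentioned on this slide. It assumes ideal scaling, where the area of a fixed design shrinks with the square of the feature size; real processes do not scale this cleanly, and the function name is made up for this example.

```python
def ideal_area_shrink(old_feature_m: float, new_feature_m: float) -> float:
    """Return the ideal area reduction factor when moving to a smaller process.

    Assumption: every dimension scales linearly with the feature size,
    so area scales with its square. Real designs only approximate this.
    """
    linear = old_feature_m / new_feature_m
    return linear ** 2


UM = 1e-6  # one micrometer in meters
NM = 1e-9  # one nanometer in meters

linear_factor = (0.35 * UM) / (22 * NM)
area_factor = ideal_area_shrink(0.35 * UM, 22 * NM)
print(f"linear shrink: {linear_factor:.1f}x, ideal area shrink: {area_factor:.0f}x")
```

Under this assumption the same circuit would need roughly 250 times less silicon area at 22 nm than at 0.35 μm.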

Structure of an IC (2)
[Figure: cross-section of an IC. Labels: silicon substrate, transistor, contact, via, metal layers M1, M2, M3, ..., insulator, cells, interconnect]

Structure of an IC (3)
Several metal layers, e.g. M1-M10
  Less congestion
  Every other layer routes wires in the X-direction, every other in Y
Hierarchical scaling
  Wires on top levels are wider and taller than on lower levels
Top layers for
  Power supply
  Clock
  Global signals
[Figure: cross-section showing the transistors and the metal layers, ITRS 2003]

Fabrication of an IC
1. Purified silicon ingot (cylinder) is sliced into wafers (e.g. 12-inch diameter)
2. Wafer is coated with photoresist
3. Light shines through the mask
4. Photoresist not hit by light is washed away
5. New layers (n-well, dielectric, copper wire, via etc.) are created on top of the silicon
6. Finally, the rectangular dies (chips, e.g. 1-200 mm²) are sawed from the wafer, tested, and packaged
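
To get a feeling for step 6, the sketch below estimates how many rectangular dies fit on a round wafer. It is not from the slides: it uses a common first-order approximation (wafer area divided by die area, minus an edge-loss term), treats the 12-inch wafer as 300 mm, and ignores scribe lines and defects. The function name is hypothetical.

```python
import math


def dies_per_wafer(wafer_diameter_mm: float, die_area_mm2: float) -> int:
    """First-order estimate of whole dies per wafer.

    gross = wafer area / die area
    edge  = pi * d / sqrt(2 * die area), partial dies lost at the round edge
    This is a rule-of-thumb formula, not an exact packing result.
    """
    gross = math.pi * (wafer_diameter_mm / 2) ** 2 / die_area_mm2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return max(0, int(gross - edge_loss))


# Die sizes within the 1-200 mm² range given on the slide
for area in (1, 50, 100, 200):
    print(f"{area:>3} mm² die: about {dies_per_wafer(300, area)} dies per wafer")
```

The estimate shows why die area matters for part cost: doubling the die area from 100 mm² to 200 mm² more than halves the number of dies obtained from one wafer.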

Example lithography machine
[Figure from: K.M. Palmer, An extremely fine line, IEEE Spectrum, Jan 2012, pp. 47-50]

Classification: Where HW customization is done
a) In a fabrication facility: ASIC (Application-Specific IC)
  Full-custom, standard-cell, and gate array ASIC
b) In the "field": non-ASIC
  Simple/complex field-programmable logic device
  Off-the-shelf SSI/MSI (Small/Medium-Scale Integration) components

Full-custom ASIC
All aspects (e.g., size of a transistor) of a circuit are tailored for a particular application
  Circuit fully optimized
  Design extremely complex
  Very time-consuming design (typically only feasible for small components)
Masks needed for all layers
  Very expensive
  Fabrication time up to months
Example: Intel, AMD, and IBM processors are (partly) full-custom
[Fig. Silicon layout editor]

Standard-Cell ASIC
Circuit made using a set of pre-defined logic components, known as standard cells
  E.g., basic logic gates, 1-bit adder, D-FF
  Library cannot be altered, albeit some basic parameters can (e.g. fan-out)
  Height of a cell is pre-determined
Layout of the complete circuit is customized
  1. The location and type of the standard cells
  2. Connections between cells
Layout created with special EDA tools
Masks needed for all layers
  Same fabrication cost as with full custom
E.g. mobile phone digital ICs
[Fig. SC-ASIC has fixed-height rows of std cells]
[Fig. Closer look at 4 standard cell rows. Power and ground lines run horizontally inside the cells]

Gate array ASIC
Circuit is built from an array of a single type of cell (known as base cell)
Base cells are pre-arranged and placed in fixed positions, aligned as a one- or two-dimensional array
  Connections customized by the designer
More sophisticated components (macro cells) can be constructed from base cells
Masks needed only for metal layers (connection wires)
  Cheaper than full custom or standard cell
Aka. channelless array or sea-of-gates array

Complex Field Programmable Logic Device
Device consists of an array of generic logic cells and a general interconnect structure
Logic cells and interconnect can be "programmed" by utilizing "semiconductor fuses" or "switches"
Customization is done "in the field"
Two categories:
1. CPLD (Complex Programmable Logic Device)
  Sea-of-gates to implement logic
2. FPGA (Field Programmable Gate Array)
  Look-up tables to implement logic
No custom mask needed
Used, for example, in Cisco 2600 series routers and on this course
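
The look-up table idea can be illustrated in software. The sketch below models a 4-input LUT as a 16-entry table: "programming" fills the table, and evaluation is just a memory read addressed by the inputs. The helper names and the example function are assumptions made for this illustration, not the structure of any specific FPGA family.

```python
from itertools import product


def make_lut(func, n_inputs):
    """'Program' a LUT: precompute the output bit for every input combination."""
    return [func(*bits) & 1 for bits in product((0, 1), repeat=n_inputs)]


def lut_read(table, *bits):
    """Evaluate the LUT: the input bits form the address into the table."""
    address = 0
    for b in bits:
        address = (address << 1) | b  # first input is the most significant bit
    return table[address]


# Example configuration: a 4-input LUT programmed as (a AND b) XOR (c OR d)
lut4 = make_lut(lambda a, b, c, d: (a & b) ^ (c | d), 4)
print(len(lut4), "configuration bits")
print(lut_read(lut4, 1, 1, 0, 0))
```

Any 4-input Boolean function fits in the same 16 bits, which is why the logic cell itself never needs a custom mask: only the table contents change.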

Simple Field Programmable Logic Device (PLD)
Programmable device with simple internal structure
  E.g., PROM (Programmable Read Only Memory), PAL (Programmable Array Logic)
No custom mask needed
Outdated technology
  Replaced by CPLD/FPGA
[Fig. 1 Example PAL (AND-OR net)]
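
The AND-OR net of a PAL computes a sum of products: programmable AND terms feed a fixed OR. The sketch below models one output this way. The data layout and the example function f are hypothetical and chosen only for illustration.

```python
def pal_output(inputs, product_terms):
    """Evaluate one PAL output as an OR of programmed AND terms.

    inputs: dict mapping signal name -> 0/1
    product_terms: list of AND terms; each term maps a signal name to the
    value it requires (1 = true input, 0 = inverted input). Signals that
    are absent from a term are not connected to that AND gate.
    """
    return int(any(
        all(inputs[name] == wanted for name, wanted in term.items())
        for term in product_terms
    ))


# Hypothetical programming: f = (a AND NOT b) OR (b AND c)
f_terms = [{"a": 1, "b": 0}, {"b": 1, "c": 1}]
print(pal_output({"a": 1, "b": 0, "c": 0}, f_terms))
```

In a real device the connections are set by fuses rather than dictionaries, but the evaluation order (AND plane first, then OR) is the same.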

SSI/MSI components
Small discrete parts with fixed, limited functionality
  E.g. a few AND gates on a Printed Circuit Board (PCB)
  E.g., the 7400 TTL series has more than 100 parts
Resources (e.g., power, board area, manufacturing cost etc.) are consumed by the package, not the silicon
No longer a viable option except for hobby projects
[Fig. 1 Example component]
[Fig. 2 TTL clock with 7400s. Rather hackish, ehh.]

Major trend: Integration
Minimize the number of discrete components
Integrate to a single chip/package
  several CPUs
  memories
  HW accelerators
  on-chip network
  also passive, RF, and MEMS components
System-on-chip
Fig: [J. Blau, Talk is cheap, IEEE Spectrum, vol. 43, iss. 10, Oct 2006, pp. 10-11.]

Major trend: Integration (2)
Actel Fusion mixed-signal FPGA
1. Integrated Analog-to-Digital Converter (ADC)
2. Fusion Supports Low Power, synchronization
3. Embedded Flash Memory
4. Advanced I/O Standards
5. Charge Pumps
6. Analog Quads
7. Flash FPGA VersaTile
8. SRAM and FIFOs
9. Integrated Oscillators: Crystal and RC
10. Routing Structure
11. JTAG
http://www.actel.com/documents/Fusion_PIB.pdf

Major problem areas
[Figure: relative performance trends with the problem areas numbered: 1. (label not legible), 2. Memory speed, 3. Designer productivity]
Trend: Shannon's law > Moore's law > productivity
Wirth's (or Reiser's) law: "Software is slowing faster than hardware is accelerating"
Unknown: "What Grove giveth, Gates taketh away"
Fig: [J.M. Rabaey - Silicon Architectures for Wireless Systems - Part 01, Tutorial, HotChips, 2001] http://bwrc.eecs.berkeley.edu/People/Faculty/jan/presentations/hotchips1.pdf

Principles of modern SoC design
Adopt system-level design
  Use models prior to implementation
  Seek global optimum instead of local
Extensive reuse (of IP components)
  Use pre-designed and pre-verified components instead of implementing from scratch
  Leads to shorter time-to-market
Hardware/software co-design
  Software can be tested with simulation/emulation before HW has been implemented
Need change in SW programming paradigm, languages and tools
  SW must be designed as concurrent instead of sequential
Need change in education
  Tenhunen's law: "The number of courses needed to understand digital systems doubles every decade"
Use formal models more (not on this course, though)

1b. Comparing the technologies
[Figure labels: Gizmotech, Kludgetech]

Comparison criteria
Area (size, silicon real estate): [mm²], [eq. gates]
Speed (performance): [MHz], operations/second [op/s]
  Time required to perform a task, [s]
Power consumption, [mW]
Cost, [€]
Design effort, [person-months]
Life cycle of COTS (commercial off-the-shelf) components, [years]

Std cell ASIC versus FPGA
1. Area [1]
  ASIC is smaller since the cells and interconnect are customized
  FPGA has overhead for programmability, and its capacity cannot be completely utilized
  Roughly: FPGA area is approximately 35x that of the ASIC when using the LUT-based logic elements
  However, that is not directly seen by FPGA end users: high volume compensates for some costs ($$)
2. Performance [1]
  Roughly: ASIC has 3.4 - 4.6x the frequency of the FPGA
3. Power [2]
  ASIC is better, the ratio is ~10x
[1] I. Kuon and J. Rose, "Measuring the Gap between FPGAs and ASICs", IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 26, no. 2, February 2007, pp. 203-215.
[2] John Blyler, Navigating the Silicon Jungle: FPGA or ASIC?, Chip Design Magazine, June/July 2005, [online]: http://chipdesignmag.com/display.php?articleId=115&issueId=11
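
The ratios on this slide can be applied as back-of-the-envelope conversion factors. The sketch below does that for a made-up ASIC block; the 2 mm², 400 MHz and 50 mW figures are purely hypothetical, and the ratios are the rough averages quoted above, so the result is an order-of-magnitude guide only.

```python
# Ratios quoted on the slide (FPGA compared with a standard-cell ASIC)
AREA_RATIO = 35.0              # FPGA area / ASIC area, LUT-based logic
FREQ_RATIO_RANGE = (3.4, 4.6)  # ASIC frequency / FPGA frequency
POWER_RATIO = 10.0             # FPGA power / ASIC power


def fpga_estimate(asic_area_mm2, asic_freq_mhz, asic_power_mw):
    """Scale ASIC figures to a rough FPGA equivalent using the slide's ratios."""
    best, worst = FREQ_RATIO_RANGE
    return {
        "area_mm2": asic_area_mm2 * AREA_RATIO,
        "freq_mhz_low": asic_freq_mhz / worst,
        "freq_mhz_high": asic_freq_mhz / best,
        "power_mw": asic_power_mw * POWER_RATIO,
    }


# Hypothetical ASIC block: 2 mm², 400 MHz, 50 mW
est = fpga_estimate(2.0, 400.0, 50.0)
print(f"area  ~{est['area_mm2']:.0f} mm²")
print(f"clock ~{est['freq_mhz_low']:.0f}-{est['freq_mhz_high']:.0f} MHz")
print(f"power ~{est['power_mw']:.0f} mW")
```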

Cost of Integrated Circuits
Types of cost:
1. Chip costs
  NRE (Non-Recurring Engineering) cost: one-time, per-design cost
  Part cost: per-unit cost
2. Indirect design costs
  Lead time: time to get the chip out of the factory
  Time-to-market "cost": loss of revenue
Standard cell: high NRE, small part cost, and long lead time
FPGA: low NRE, large part cost, and short lead time

Cost of Integrated Circuits (2)
For ASIC, first-time-right is necessary
FPGA has lower NRE, but higher RE (recurring cost)
  Suitable for low volumes
  Break-even volume is getting bigger all the time
[Figure: cost [€] versus #chips for ASIC and FPGA. FPGA is cheaper below the break-even volume, ASIC is cheaper above it. Trend (Xilinx Inc.): faster growth rate than with ASIC]
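
The break-even point in the figure follows from modelling total cost as NRE plus volume times part cost. The sketch below solves for the volume where the two lines cross. All euro amounts are invented for illustration and are not from the slides; only the cost model itself comes from the NRE/part-cost split described above.

```python
def total_cost(nre, unit_cost, volume):
    """Total cost of a design: one-time NRE plus per-unit cost times volume."""
    return nre + unit_cost * volume


def break_even_volume(asic_nre, asic_unit, fpga_nre, fpga_unit):
    """Volume at which ASIC and FPGA total costs are equal.

    Solves asic_nre + asic_unit * n == fpga_nre + fpga_unit * n for n.
    """
    if fpga_unit <= asic_unit:
        raise ValueError("model assumes the FPGA part cost is higher")
    return (asic_nre - fpga_nre) / (fpga_unit - asic_unit)


# Hypothetical figures in euros
ASIC_NRE, ASIC_UNIT = 1_000_000.0, 5.0
FPGA_NRE, FPGA_UNIT = 50_000.0, 50.0

n_star = break_even_volume(ASIC_NRE, ASIC_UNIT, FPGA_NRE, FPGA_UNIT)
print(f"break-even at about {n_star:.0f} chips")
for volume in (1_000, 100_000):
    asic = total_cost(ASIC_NRE, ASIC_UNIT, volume)
    fpga = total_cost(FPGA_NRE, FPGA_UNIT, volume)
    print(volume, "chips:", "FPGA cheaper" if fpga < asic else "ASIC cheaper")
```

A growing ASIC NRE or a falling FPGA part cost both move the crossing point to the right, which is one way to read the remark that the break-even volume keeps getting bigger.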

Summary of technologies
Trade-off between optimal use of hardware resources and design effort/cost
No single best technology

Architecture choice makes a big difference
[Figure: comparison of implementation styles, both axes in log scale. Labels: General-purpose CPU, DSP, FPGA/ASIP, std-cell ASIC, full custom ASIC; General-purpose CPU, DSP, FPGA/ASIP, ASIC, Java, Python]
Heinrich Meyr, Future Wireless Communication Systems…, VTC, 2005. (Figure data by T. Noll, RWTH Aachen) http://www.ieeevtc.org/vtc2005spring/presentations/2020_presentations/HMeyr.pdf

Conclusions
Two viable implementation technologies: std cell ASIC and FPGA
  ASICs are smaller in area and faster than FPGAs
  ASICs have low unit cost but high NRE, FPGAs vice versa
  ASICs are used in high-volume products, FPGAs in tailorable products
FPGA is a "programmable ASIC" (custom IC, actually)
  i.e. someone (Altera, Xilinx etc.) has made an IC whose application is to be an FPGA
  Extra resources are needed to provide in-field configuration
Many chips include several programmable processors

For self-study

Multiprocessor is mainstream in SoC
32-bit processors are the most popular (> 55%)
#CPUs has increased since 2005
SoC frequencies are much lower than high-end CPUs
[Herzen, Lerer, Grand Challenges…, Design, Applications, Integration and Software, 2006]
[Turley, Survey says: software tools more important than chips, Embedded Systems Design, 04/11/05]

Manycore chips are here today
Increase the number of processors on chip
  On-chip parallel computer
On market: 2, 4, 8 CPU cores per chip
  Not accounting for GPUs!
Coming: tens or hundreds of CPUs per chip
  36-, 48-, and 80-core demo chips from Intel; 64-core and 100-core chips from Tilera on the market

iPhone 4(S) teardown
[Figures: iPhone 4S; motherboard; motherboard, the other side]
Sources: http://www.appleinsider.com/print/11/10/13/teardown_of_apples_iphone_4s_reveals_larger_battery_new_baseband_chip.html

iPhone 4S SoCs
Apple A5: 45 nm, 122 mm², 800 MHz - 1 GHz, est. 15M tran, 50 mm², includes dual-core ARM Cortex A9 with NEON SIMD accelerator and PowerVR graphics processor (power <1 W??)
  + 533 MHz 512 MiB DDR2 in the same package
  Figure: http://www.electronics-lab.com/blog/?p=10110
  http://en.wikipedia.org/wiki/Apple_A45
Qualcomm MDM6600: 45 nm, 512 MHz, ARM1136JS 32+32 KB L1, 256 KB L2; 147 MHz Application DSP, 162 MHz Modem DSP, 160 MHz 16-b DDR interface (power ~ tens of mW?)
  http://www.scribd.com/doc/54154049/80-Vr001-1-Mdm6200-and-Mdm6600-Mobile-Data-Modem-Device-Specification-Advance-Information

iPhone 4 SoCs
A4: 45 nm, 800 MHz - 1 GHz, est. 15M tran, 50 mm², includes dual-issue superscalar ARM Cortex-A8 and PowerVR graphics processor
  Figure: http://techon.nikkeibp.co.jp/article/HONSHI/20100727/184585/?P=2
  http://en.wikipedia.org/wiki/Apple_A4, http://en.wikipedia.org/wiki/Apple_Ax
  http://en.wikipedia.org/wiki/Samsung_Hummingbird
  http://en.wikipedia.org/wiki/ARM_Cortex-A8, http://en.wikipedia.org/wiki/PowerVR
  http://pdadb.net/index.php?m=cpu&id=a40000&c=samsung-intrinsity_apple_a4_s5pc110a01
XMM 6180 baseband platform: 65 nm, ARM1176 @ 416 MHz, supports HSDPA/HSUPA, WCDMA, EDGE, speech (aka. Infineon 337S3394 WEDGE baseband, marked SP836175 G0822, nowadays probably called Intel XMM 6180)
  http://www.theiphonewiki.com/wiki/index.php?title=XMM_6180
  http://www.infineon.com/cms/en/corporate/press/news/releases/2008/INFCOM200805-068.html
Check out also the newer chip SDR20: [U. Ramacher et al., Architecture and implementation of a Software-Defined Radio baseband processor, ISCAS, 2011] and http://www.teknologisk.dk/_root/media/34851_3_SDR_Infineon.pdf

Table for iPhone 4
[Table images on two slides, contents not recoverable. Annotations: SoC; SoC, later replaced by Qualcomm baseband; SoC; Expensive part (on both slides)]

iPhone 4 cost teardown
Most profit is made with top models
  Together with mindless hype and (annoying) bundle sales
E.g. consider Flash memory costs
  1.2 $/GB as a chip
  4-6 $/GB inside iPhone
Development, SW etc. costs not included in table
Sources:
  http://www.isuppli.com/Teardowns/News/Pages/iPhone-4S-Carries-BOM-of-$188,-IHS-iSuppli-Teardown-Analysis-Reveals.aspx
  http://www.isuppli.com/Teardowns/News/Pages/iPhone-4-Carries-Bill-of-Materials-of-187-51-According-to-iSuppli.aspx

Cost category                    16 GB   32 GB   64 GB
Retail price w/ contract, [$]      199     299     399
Total BOM cost, [$]                188     207     245
  of which NAND Flash, [$]          19      38      77
Manufacturing cost, [$]              8       8       8
Price - BOM - manufac., [$]          3      84     146
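
The last row of the table and the Flash cost per gigabyte can be recomputed from the other rows. The sketch below uses only the numbers in the table; the variable names are made up for this example.

```python
# Figures from the table, in US dollars
MANUFACTURING = 8
models = {
    16: {"price": 199, "bom": 188, "nand": 19},
    32: {"price": 299, "bom": 207, "nand": 38},
    64: {"price": 399, "bom": 245, "nand": 77},
}

margins = {}
nand_per_gb = {}
for capacity_gb, m in models.items():
    # What is left after the bill of materials and manufacturing are paid
    margins[capacity_gb] = m["price"] - m["bom"] - MANUFACTURING
    # Component cost of the Flash chip per gigabyte
    nand_per_gb[capacity_gb] = m["nand"] / capacity_gb
    print(f"{capacity_gb} GB: margin {margins[capacity_gb]} $, "
          f"NAND {nand_per_gb[capacity_gb]:.2f} $/GB")
```

The Flash chip costs about 1.2 $/GB in every model, while each capacity step adds 100 $ to the retail price, which is why the margin grows so quickly towards the top model.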