Embedded Processors - High-Performance Processors and Systems | ECE 569
Embedded Processors
Lec 21
Embedded systems overview
Computing systems are everywhere
Most of us think of “desktop” computers
- PCs
- Laptops
- Mainframes
- Servers
But there’s another type of computing system
What are Embedded Systems?
Generic Definition:
“nearly any computing system other than a
desktop computer”
Embedded systems overview
Embedded computing systems
- Computing systems embedded within
electronic devices
- Hard to define. Nearly any computing
system other than a desktop computer
- Billions of units produced yearly,
versus millions of desktop units
- Perhaps 50 per household and per
automobile
Computers are in here... and here...
and even here...
Lots more of these, though they cost a lot less each.
Some categories…
Signal processing systems
- radar, sonar, real-time video, set-top boxes, DVD players, medical equipment, residential gateways
Mission critical systems
- avionics, space-craft control, nuclear plant control
Distributed control
- network routers & switches, mass transit systems, elevators in large buildings
“Small” systems
- cellular phones, pagers, home appliances, toys, smart cards, MP3 players, PDAs, digital cameras and camcorders, sensors, smart badges
Large dynamic range of attributes and capabilities
A Variety of Application Domains
Hybrid and embedded systems
- Aerospace, automobiles, robotics, process control, sensor networks
Multimedia
- Virtual reality, immersive environment
Consumer electronics
- Mobile phones, office electronics, digital appliances
Network components
- Bridges, routers, switches, hubs
Medical devices and instruments
- Patient monitoring, MRI, infusion pumps, artificial organs
E-business
Distributed and grid computing
- Critical infrastructure defense systems, air traffic control, intelligent highway systems, emergency response systems
Some common characteristics of
embedded systems
Single-functioned
- Executes a single program, repeatedly
Tightly-constrained
- Low cost, low power, small, fast, etc.
Reactive and real-time
- Continually reacts to changes in the system’s environment
- Must compute certain results in real-time without delay
An embedded system example -- a
digital camera
[Figure: digital camera chip block diagram. Lens and CCD feed a chip containing A2D, CCD preprocessor, pixel coprocessor, D2A, microcontroller, JPEG codec, DMA controller, memory controller, ISA bus interface, UART, LCD ctrl, display ctrl, and multiplier/accumulator]
Single-functioned -- always a digital camera
Tightly-constrained -- Low cost, low power, small, fast
Reactive and real-time -- only to a small extent
Why do we care? Some Market Tidbits...
Specialized devices replacing general PC
- set-top boxes, fixed-screen phones, smart mobile phones, iPods, PDAs, etc.
- Vast majority of Internet access devices are appliances and not PCs
- In 1997, 96% of Internet access devices sold in the US were PCs
- Now, unit shipments of just Internet-enabled cell phones exceed PCs
Traditional systems becoming dependent on computation systems
- Modern cars: up to ~100 processors running complex software
- engine & emissions control, stability & traction control, diagnostics, gearless automatic transmission
An indicator:
- where are the CPUs being used?
Why do we care?
Some Market Tidbits...in $$$$
General Purpose Computing
- PC Chipsets ($6.9B) expected to grow to $10.3B in 2009
Embedded Systems
- Wi-Fi ($140M) expected to grow 3x by 2009 (mobile PCs, residential networks, cell phones)
- Media Devices (digital TVs, MP3 players, video games consoles)
- ASIC ($209.8M) expected to grow to $2.53B in 2009 (ASSP)
- RFID ($1.3B) expected to grow 25x by 2010
[Figure: compound annual growth rates for the segments above; values shown in the chart: 57.0%, 10.5%, 90.4%, 86.3%, 31.6%]
Example: Automotive Telematics
In 2005, 30-90 processors per car
- Engine control, brake system, airbag deployment system
- Windshield wiper, door locks, entertainment systems
Example: BMW 745i
- 2,000,000 LOC
- Windows CE OS
- Over 60 microprocessors
- 53 8-bit, 11 32-bit, 7 16-bit
- Multiple networks
- Buggy? Problems?
- Disparity between the design cycle of a car and the design cycle of embedded components
- Difficult to upgrade
- Not possible to integrate the user’s own devices into a car
Source: Insup Lee, UPenn
Challenges
Three aspects of embedded system development
- Embedding for smart control
- Creating new computing gadgets
- Connecting the physical world to the computing infrastructure
The goal is to make them invisible cost-effectively!
- Trustworthy: should not fail (or should degrade gracefully), and should be safe to use. The existence of embedded software becomes apparent only when an embedded system fails.
- Context Aware: should be able to sense people, environment, and threats and to plan/notify/actuate responses to provide real-time interaction with the dynamically changing physical environment with limited resources.
- Seamless Integration: should be invisible at multiple levels of a hierarchy: home systems, metropolitan systems, regional systems, and national systems.
Design challenge –
optimizing design metrics
Common metrics
- Unit cost:
- the monetary cost of manufacturing each copy of the system, excluding NRE cost
- NRE cost (Non-Recurring Engineering cost):
- The one-time monetary cost of designing the system
- Size:
- the physical space required by the system
- Performance:
- the execution time or throughput of the system
- Power:
- the amount of power consumed by the system
- Flexibility:
- the ability to change the functionality of the system without incurring heavy NRE cost
Design challenge –
optimizing design metrics
Common metrics (continued)
- Time-to-prototype:
- the time needed to build a working version of the system
- Time-to-market:
- the time required to develop a system to the point that it can be released and sold to customers
- Maintainability:
- the ability to modify the system after its initial release
- Correctness, safety, many more
Design metric competition --
improving one may worsen others
Expertise with both software and hardware is needed to optimize design metrics
- Not just a hardware or software expert, as is common
- A designer must be comfortable with various technologies in order to choose the best for a given application and constraints
[Figure: competing design metrics (performance, size, power, NRE cost), shown alongside the digital camera chip, which combines hardware and software]
NRE and unit cost metrics
Costs:
- Unit cost:
- the monetary cost of manufacturing each copy of the system, excluding NRE cost
- NRE cost (Non-Recurring Engineering cost):
- The one-time monetary cost of designing the system
- total cost = NRE cost + unit cost * # of units
- per-product cost = total cost / # of units = (NRE cost / # of units) + unit cost
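The two cost formulas above can be tried out with a short Python sketch. The function names and the dollar figures are hypothetical, chosen only to show how the NRE share of per-product cost shrinks as volume grows.

```python
def total_cost(nre_cost, unit_cost, units):
    """Total cost = NRE cost + unit cost * number of units."""
    return nre_cost + unit_cost * units


def per_product_cost(nre_cost, unit_cost, units):
    """Per-product cost = total cost / number of units."""
    if units <= 0:
        raise ValueError("units must be positive")
    return total_cost(nre_cost, unit_cost, units) / units


if __name__ == "__main__":
    # Hypothetical numbers: $2,000 NRE and $100 unit cost.
    for units in (10, 100, 10_000):
        cost = per_product_cost(nre_cost=2000.0, unit_cost=100.0, units=units)
        print(f"{units:>6} units: per-product cost = ${cost:.2f}")
```

At low volume the NRE term dominates the per-product cost; at high volume the per-product cost approaches the unit cost.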
Losses due to delayed market entry
Simplified revenue model
- Product life = 2W, peak at W
- Time of market entry defines a triangle, representing market penetration
- Triangle area equals revenue
[Figure: revenues ($) versus time. The on-time curve rises (market rise) to peak revenue at W and falls (market fall) to zero at 2W. The delayed curve starts at D and reaches a lower peak revenue at W.]
Loss: the difference between the on-time and delayed triangle areas
Losses due to delayed market entry (cont.)
Area = 1/2 * base * height
- On-time = 1/2 * 2W * W
- Delayed = 1/2 * (W-D+W)(W-D)
Percentage revenue loss = (D(3W-D)/2W^2) * 100%
Try some examples
- Lifetime 2W=52 wks, delay D=4 wks
- (4*(3*26 - 4)/(2*26^2)) = 22%
- Lifetime 2W=52 wks, delay D=10 wks
- (10*(3*26 - 10)/(2*26^2)) = 50%
- Delays are costly!
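The simplified triangle model can be checked numerically. The sketch below is only an illustration of that model: it computes both triangle areas directly and derives the percentage loss from them, so it does not rely on the closed-form expression. The function name is hypothetical.

```python
def revenue_loss_percent(w, d):
    """Percentage revenue loss for a delay d, with product life 2*w.

    Uses the simplified triangle model: area = 1/2 * base * height.
    """
    if not 0 <= d <= w:
        raise ValueError("delay must satisfy 0 <= d <= w")
    on_time = 0.5 * (2 * w) * w          # base 2W, height W
    delayed = 0.5 * (w - d + w) * (w - d)  # base 2W-D, height W-D
    return (on_time - delayed) / on_time * 100.0


if __name__ == "__main__":
    for delay_weeks in (4, 10):
        loss = revenue_loss_percent(w=26, d=delay_weeks)
        print(f"2W=52 wks, D={delay_weeks} wks: loss = {loss:.0f}%")
```

The area-based result agrees with D(3W-D)/2W^2 for both examples above.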
The Performance design metric
Latency (or response time, or execution time)
- time to complete one task
Bandwidth (or throughput)
- tasks completed per unit time
Widely used measure of a system, and widely abused
- Clock frequency, instructions per second – not good measures
- Digital camera example – a user cares about how fast it processes images, not clock speed or instructions per second
Performance Measurement
Average rate: A > B > C
Worst-case rate: A < B < C
[Figure: processing rate (inputs/second) versus inputs for systems A, B, and C, with average rates and worst-case rates marked]
Which is best for desktop performance? _______
Which is best for hard real-time task? _______
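The difference between average and worst-case rate can be made concrete with a small Python sketch. The per-input rates below are invented so that the averages order as A > B > C while the worst cases order as A < B < C. The selection rule (rank by average rate for desktop use, by worst-case rate for hard real-time use) is an assumption of this sketch, not a statement from the slide.

```python
def average_rate(rates):
    """Mean processing rate over all observed inputs."""
    return sum(rates) / len(rates)


def worst_case_rate(rates):
    """Slowest processing rate observed on any input."""
    return min(rates)


# Hypothetical measurements in inputs/second for three systems.
systems = {
    "A": [90, 95, 20, 92],
    "B": [70, 72, 50, 68],
    "C": [60, 61, 59, 60],
}

best_average = max(systems, key=lambda name: average_rate(systems[name]))
best_worst_case = max(systems, key=lambda name: worst_case_rate(systems[name]))

if __name__ == "__main__":
    for name, rates in systems.items():
        print(name, average_rate(rates), worst_case_rate(rates))
    print("highest average rate:", best_average)
    print("highest worst-case rate:", best_worst_case)
```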
Power Impacts on Computer System
Energy consumed per task determines battery life
- Second order effect is that higher current draws decrease effective battery energy capacity (higher power also lowers battery life)
Current draw causes IR drops in power supply voltage
- Requires more power/ground pins to reduce resistance R
- Requires thick and wide on-chip metal wires or dedicated metal layers
Switching current (dI/dt) causes inductive power supply voltage bounce ∝ L dI/dt
- Requires more pins/shorter pins to reduce inductance L
- Requires on-chip/on-package decoupling capacitance to help bypass pins during switching transients
Power dissipated as heat, higher temps reduce speed and reliability
- Requires more expensive packaging and cooling systems
- Fan noise
- Laptop/handheld case temperature
Reducing Switching Power
Power ∝ activity * 1/2 CV^2 * frequency
Reduce activity
Reduce switched capacitance C
Reduce supply voltage V
Reduce frequency
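A short Python sketch shows how each of the four knobs affects the switching-power relation above. The slide gives only a proportionality, so the function assumes a proportionality constant of 1, and all parameter values are hypothetical.

```python
def switching_power(activity, capacitance_f, vdd_v, freq_hz):
    """Switching power, taking the proportionality constant as 1 (assumption).

    Power = activity * 1/2 * C * V^2 * frequency
    """
    return activity * 0.5 * capacitance_f * vdd_v ** 2 * freq_hz


if __name__ == "__main__":
    # Hypothetical baseline: 10% activity, 1 nF switched, 1.8 V, 100 MHz.
    base = dict(activity=0.1, capacitance_f=1e-9, vdd_v=1.8, freq_hz=100e6)
    p0 = switching_power(**base)
    for knob in base:
        halved = dict(base, **{knob: base[knob] / 2})
        ratio = switching_power(**halved) / p0
        print(f"halving {knob}: power falls to {ratio:.2f} of baseline")
```

Halving activity, capacitance, or frequency halves the power, while halving the supply voltage cuts it to a quarter because of the V^2 term.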
Reducing Activity
Clock Gating
- don’t clock flip-flop if not needed
- avoids transitioning downstream logic
- Pentium-4 has hundreds of gated clocks
[Figure: clock gating circuit with global clock, enable, D-Q latch (transparent on clock low), and gated local clock]
Bus Encodings
- choose encodings that minimize transitions on average (e.g., Gray code for address bus)
- compression schemes (move fewer bits)
Remove Glitches
- balance logic paths to avoid glitches during settling
- use monotonic logic (domino)
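The Gray code suggestion for address buses can be illustrated by counting bit transitions. The sketch below assumes a 4-bit bus carrying sequential addresses 0 through 15, which is a hypothetical access pattern. Gray code helps most for sequential accesses like these and may not help for random ones.

```python
def to_gray(n):
    """Convert a non-negative integer to its reflected binary Gray code."""
    return n ^ (n >> 1)


def count_transitions(words):
    """Total number of bus wires that toggle across a sequence of words."""
    return sum(bin(a ^ b).count("1") for a, b in zip(words, words[1:]))


# Hypothetical access pattern: 16 sequential addresses on a 4-bit bus.
addresses = list(range(16))
binary_toggles = count_transitions(addresses)
gray_toggles = count_transitions([to_gray(a) for a in addresses])

if __name__ == "__main__":
    print("binary encoding toggles:", binary_toggles)
    print("Gray encoding toggles:  ", gray_toggles)
```

With Gray code, consecutive addresses differ in exactly one bit, so each step toggles one wire instead of up to four.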
Reducing Switched Capacitance
Reduce switched capacitance C
- Different logic styles (logic, pass transistor, dynamic)
- Careful transistor sizing
- Tighter layout
- Segmented structures
[Figure: modules A, B, and C on a shared bus, shown before and after segmentation]
Shared bus driven by A or B when sending values to C
Insert switch to isolate bus segment when B is sending to C
Voltage Scaling for Reduced Energy
Halving the supply voltage (scaling it by 0.5) reduces energy per
transition to 0.25 of original
Performance is reduced – need to use slower clock
Can regain performance with parallel architecture
Alternatively, can trade surplus performance for lower
energy by reducing supply voltage until “just enough”
performance
Dynamic Voltage Scaling
“Just Enough” Performance
Save energy by reducing frequency and voltage to
minimum necessary (usually done in O.S.)
[Figure: frequency versus time from t=0 to t=deadline, comparing "run fast then stop" with "run slower and just meet deadline"]
[Figure: chip energy versus frequency for various supply voltages. Axes: energy per cycle (nJ), frequency (MHz, 0 to 300), VDD (V). A 2x reduction in supply voltage gives a 4x reduction in energy. MIT Scale Vector-Thread Processor, TSMC 0.18μm CMOS process, 2006]
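The "just enough" performance idea can be sketched in Python by comparing "run fast then stop" with "run slower and just meet deadline". This is a rough model under stated assumptions: energy per cycle is 1/2 * C * V^2, idle energy after the task finishes is ignored, and the required supply voltage is assumed to scale linearly with frequency. Real chips have a minimum operating voltage and a nonlinear voltage-frequency curve. All names and numbers are hypothetical.

```python
def task_energy(cycles, vdd_v, c_eff_f=1e-9):
    """Energy to run `cycles` cycles, each switching c_eff_f at vdd_v.

    Assumption: idle energy after the task completes is ignored.
    """
    return cycles * 0.5 * c_eff_f * vdd_v ** 2


def just_enough_setting(cycles, deadline_s, f_max_hz, vdd_max_v):
    """Lowest frequency that meets the deadline, and the matching voltage.

    Assumption: required voltage scales linearly with frequency.
    """
    f_needed = cycles / deadline_s
    if f_needed > f_max_hz:
        raise ValueError("deadline cannot be met even at maximum frequency")
    return f_needed, vdd_max_v * f_needed / f_max_hz


if __name__ == "__main__":
    cycles, deadline_s = 100e6, 1.0     # hypothetical task
    f_max_hz, vdd_max_v = 200e6, 1.8    # hypothetical chip limits

    fast = task_energy(cycles, vdd_max_v)  # run fast, then stop
    f_slow, vdd_slow = just_enough_setting(cycles, deadline_s, f_max_hz, vdd_max_v)
    slow = task_energy(cycles, vdd_slow)   # run slower, just meet deadline

    print(f"run fast then stop: {fast:.4f} J at {f_max_hz / 1e6:.0f} MHz")
    print(f"just meet deadline: {slow:.4f} J at {f_slow / 1e6:.0f} MHz")
    print(f"energy ratio: {slow / fast:.2f}")
```

Under these assumptions the task needs only half the maximum frequency, so the voltage can be halved and the energy falls to one quarter, matching the 2x voltage, 4x energy relation in the figure.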