Technology Mapping - Computer Aided Logic Design - Lecture Slides | ECE 474A, Study notes of Electrical and Electronics Engineering

Material Type: Notes; Professor: Lysecky; Class: Computer-Aided Logic Design; Subject: Electrical & Computer Engr; University: University of Arizona; Term: Unknown 1989;

Typology: Study notes

Uploaded on 08/31/2009


ECE 474a/575a Susan Lysecky

ECE 474A/575A

Computer-Aided Logic Design

Lecture 15

Technology Mapping

Technology Mapping

• Previously we performed logic minimization
  • Minimized size of equation and/or number of literals
  • Didn’t consider how we will implement the circuit
• Technology mapping
  • Transforms a technology-independent logic network into gates implemented with a technology library
  • Standard cell library, gate array library, look-up tables (for FPGAs)

Minimization examples (original function, then minimized form):

F(w, x, y, z) = w’x’y’z’ + w’x’yz + w’x’yz’ + w’xy’z’ + w’xyz + w’xyz’ + wxy’z + wxyz + wx’y’z + wx’yz
F = w’z’ + wz + yz

F(a, b, c) = a + a’b’c’ + bc’
F = a + c’

F(h, i, j, k) = h’i’j’k’ + hi’j’k’ + h’i’k + h’i’j + jk
F = hi’j’k’ + h’i’ + jk
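The three minimizations above can be checked by comparing truth tables. The following is a minimal sketch, not part of the original slides: the helper names `sop` and `equivalent` are hypothetical, and the complement mark is written as a straight apostrophe.

```python
import re
from itertools import product

def sop(expr, names):
    """Compile a sum-of-products string such as "w'z' + wz" into a function.

    Each product term is a run of single-letter variables, each optionally
    followed by an apostrophe that marks a complemented literal.
    """
    terms = [re.findall(r"([a-z])('?)", term) for term in expr.split("+")]

    def f(*values):
        env = dict(zip(names, values))
        # A literal is true when the variable value differs from "is complemented".
        return any(all(env[v] != (mark == "'") for v, mark in term) for term in terms)

    return f

def equivalent(expr1, expr2, names):
    """Return True if both expressions agree on every input combination."""
    f, g = sop(expr1, names), sop(expr2, names)
    return all(f(*v) == g(*v) for v in product((0, 1), repeat=len(names)))

print(equivalent(
    "w'x'y'z' + w'x'yz + w'x'yz' + w'xy'z' + w'xyz + w'xyz' + wxy'z + wxyz + wx'y'z + wx'yz",
    "w'z' + wz + yz", "wxyz"))
print(equivalent("a + a'b'c' + bc'", "a + c'", "abc"))
print(equivalent("h'i'j'k' + hi'j'k' + h'i'k + h'i'j + jk", "hi'j'k' + h'i' + jk", "hijk"))
```

An exhaustive comparison like this is only practical for a handful of variables, which is enough for the slide examples.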

Technology Mapping Example

• Standard cell ASIC technology
  • Uses a library of pre-laid-out gates or small pieces of circuitry, known as cells
  • Designer instantiates and connects these cells to implement a digital circuit

[Figure: F = abcd + a’b’ implemented with the cells of library 1 and with the cells of library 2]

Technology Mapping Phases

• Technology mapping is broken down into 3 phases
  • Decomposition
    • Translate logic network into some primitive cells to create a simple structure that aids the mapping process
  • Pattern Matching
    • Analysis on the circuit library to determine the set of matches for all nodes in the circuit
  • Covering
    • Identify the best possible matches (based on a particular cost function) for the logic network so that every node is covered at least once and the functionality is maintained

Decomposition

• Depending on the input function, circuit may need to be modified
• Cell library typically limited to a few primitive functions with few inputs
  • AND, OR, INV (2-input only)
  • NAND (2-, 3-, 4-input)
  • NOR (2-, 3-, 4-input)
• Most practical cell libraries limit primitives to 2-input NAND gates and INV
  • 2-input NOR gates work equally well

[Figure: a circuit with inputs a, b, c, d and output F, shown for Library 1 and Library 2]
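As an illustration of decomposition into 2-input NAND gates and inverters, the sketch below rebuilds F = abcd + a'b' (the function from the earlier standard cell example) from those two primitives only and checks it against the original expression. The particular gate arrangement is one possible decomposition chosen for this sketch, not necessarily the one drawn on the slide.

```python
from itertools import product

def inv(x):
    """Inverter on a 0/1 value."""
    return 1 - x

def nand2(x, y):
    """2-input NAND on 0/1 values."""
    return 1 - (x & y)

def f_reference(a, b, c, d):
    """F = abcd + a'b' written directly."""
    return (a & b & c & d) | (inv(a) & inv(b))

def f_decomposed(a, b, c, d):
    """The same function using only NAND2 and INV."""
    ab = inv(nand2(a, b))           # ab
    cd = inv(nand2(c, d))           # cd
    t1 = nand2(ab, cd)              # (abcd)'
    t2 = nand2(inv(a), inv(b))      # (a'b')'
    return nand2(t1, t2)            # abcd + a'b' by De Morgan

same = all(f_reference(*v) == f_decomposed(*v) for v in product((0, 1), repeat=4))
print(same)
```

The decomposed version uses five NAND2 gates and four inverters; other arrangements with different depth or gate count are equally valid starting points for mapping.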

Why NAND and NOR Gates?

CMOS Transistor Level Gate Implementation

• At the low level, NAND/NOR gates require fewer transistors than AND/OR gates and are more desirable to use

[Figure: CMOS transistor-level implementations of the NOT, NOR, OR, NAND, and AND gates, with inputs x, y and output F]
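The transistor advantage can be made concrete with a small calculation. This sketch assumes textbook static CMOS, where an n-input NAND or NOR uses n pull-up and n pull-down transistors, and an AND or OR is built as a NAND or NOR followed by an inverter. The function name `cmos_transistors` is hypothetical.

```python
def cmos_transistors(gate, n_inputs=2):
    """Transistor count of a gate, assuming textbook static CMOS."""
    if gate == "NOT":
        return 2                      # one pMOS, one nMOS
    if gate in ("NAND", "NOR"):
        return 2 * n_inputs           # n pMOS plus n nMOS
    if gate in ("AND", "OR"):
        return 2 * n_inputs + 2       # NAND/NOR followed by an inverter
    raise ValueError(f"unknown gate: {gate}")

for gate in ("NOT", "NAND", "NOR", "AND", "OR"):
    print(gate, cmos_transistors(gate))
```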

Decomposition

• Many ways to decompose logic network
  • Based on delay, size, power, etc.
• Typically not limited to 1 decomposition
  • Multiple versions provided to next step to allow more opportunity for pattern matching and covering phases
• Additional strategy to increase potential pattern matches: add cascading inverters
  • Potential to be helpful
  • Make sure leftover cascading inverters are covered by wires
• Decomposition phase greatly impacts resulting circuit
  • A large body of work considers timing-, layout-, and area-driven decomposition techniques
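To see how two decompositions of the same function can differ, the sketch below decomposes an 8-input AND into 2-input ANDs in two ways, as a chain and as a balanced tree, and compares gate count and depth (a rough stand-in for delay). The tuple representation and function names are assumptions made for this illustration.

```python
def chain(inputs):
    """Chain decomposition: each AND2 output feeds the next AND2."""
    tree = inputs[0]
    for x in inputs[1:]:
        tree = ("AND2", tree, x)
    return tree

def balanced(inputs):
    """Balanced decomposition: split the inputs in half at every level."""
    if len(inputs) == 1:
        return inputs[0]
    mid = len(inputs) // 2
    return ("AND2", balanced(inputs[:mid]), balanced(inputs[mid:]))

def depth(tree):
    """Number of gates on the longest input-to-output path."""
    return 0 if isinstance(tree, str) else 1 + max(depth(tree[1]), depth(tree[2]))

def gates(tree):
    """Total number of AND2 gates."""
    return 0 if isinstance(tree, str) else 1 + gates(tree[1]) + gates(tree[2])

names = list("abcdefgh")
for build in (chain, balanced):
    t = build(names)
    print(build.__name__, "gates:", gates(t), "depth:", depth(t))
```

Both versions use the same number of gates, but the balanced tree is much shallower, which is why a delay-driven decomposition would prefer it.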

Pattern Matching

• Several pattern matching techniques available
  • Combination of matchers frequently used
• Structural matcher
  • Pure structural isomorphism tests
• Boolean matcher
  • Uses BDDs to find matches based on logic functions
  • Independent of the actual decomposition
• PLA matcher
  • Similar to Boolean matcher, differs in the representation of the function
  • Utilizes truth tables of gates (AND, OR)

[Figure: example circuit with output F, and a BDD of a function f over variables a, b, c, d (T = then edge, E = else edge, terminals 1 and 0)]
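The idea behind a Boolean matcher can be sketched with plain truth tables standing in for BDDs (a simplification made here; real matchers use BDDs as the slide states). The hypothetical `boolean_match` below asks whether a sub-network computes the same function as a library cell under some ordering of the inputs, regardless of how the sub-network is structured. Input and output inversions are not considered.

```python
from itertools import permutations, product

def truth_table(f, n):
    """Output column of f over all 2**n input combinations."""
    return tuple(f(*v) for v in product((0, 1), repeat=n))

def boolean_match(target, cell, n):
    """Return an input permutation p with target(x) == cell(x[p[0]], ...), or None."""
    want = truth_table(target, n)
    for p in permutations(range(n)):
        permuted = lambda *x, p=p: cell(*(x[i] for i in p))
        if truth_table(permuted, n) == want:
            return p
    return None

# Library cells (hypothetical definitions for this sketch)
aoi21 = lambda a, b, c: 1 - ((a & b) | c)      # (ab + c)'
nand3 = lambda a, b, c: 1 - (a & b & c)        # (abc)'

# Sub-network INV(NAND2(NAND2(y, z), INV(x))), which computes (yz + x)'
subnet = lambda x, y, z: 1 - (1 - ((1 - (y & z)) & (1 - x)))

print(boolean_match(subnet, aoi21, 3))
print(boolean_match(subnet, nand3, 3))
```

Because only the function is compared, the match is found even though the sub-network is built from NAND2 and INV gates rather than an AND, an OR, and an inverter.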

Pattern Matching

PLA Matcher – AND/OR Truth Table Representation

• Programmable logic array (PLA)
  • Programmable AND array linked to programmable OR array
• AND Table mimics AND array
  • Table as wide as number of inputs
  • 1 = variable positive in term, 0 = variable negative in term, X = variable doesn’t appear in term
• OR Table mimics OR array
  • Table as wide as number of outputs
  • 1 = term belongs to output, 0 = term doesn’t belong to output

F = abc’ + b’a’
G = ac + a’c’

AND Table
term    a  b  c
abc’    1  1  0
b’a’    0  0  X
ac      1  X  1
a’c’    0  X  0

OR Table
term    F  G
abc’    1  0
b’a’    1  0
ac      0  1
a’c’    0  1
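The AND/OR table representation can be evaluated directly. In this sketch the two tables for F = abc' + b'a' and G = ac + a'c' are stored as dictionaries (a representation chosen here for illustration), and the hypothetical `pla` function computes both outputs from them.

```python
from itertools import product

# AND table: one row per product term, one column per input (a, b, c)
AND_TABLE = {"abc'": "110", "b'a'": "00X", "ac": "1X1", "a'c'": "0X0"}

# OR table: one row per product term, one column per output (F, G)
OR_TABLE = {"abc'": (1, 0), "b'a'": (1, 0), "ac": (0, 1), "a'c'": (0, 1)}

def term_active(row, inputs):
    """A term fires when every non-X column equals the corresponding input."""
    return all(r == "X" or int(r) == x for r, x in zip(row, inputs))

def pla(a, b, c):
    """Evaluate (F, G) from the two tables."""
    f = g = 0
    for term, row in AND_TABLE.items():
        if term_active(row, (a, b, c)):
            in_f, in_g = OR_TABLE[term]
            f |= in_f
            g |= in_g
    return f, g

for v in product((0, 1), repeat=3):
    print(v, pla(*v))
```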

Covering

• After the network is decomposed and patterns are generated, need to select a subset of patterns so the entire network is covered
  • Objective function indicates which subset to choose (delay, cost, reliability, power)
• Represent these choices with covering matrix
  • Rows represent nodes in graph
  • Columns represent pattern matches
  • “1” in matrix signifies node in row covered by pattern in column
• How to choose subset of columns?
  • Find set of columns so all rows are covered
  • Choose columns to obtain minimal cost
  • Each match has inputs available from output of other matches

[Figure: covering matrix with rows N1, N2, ..., Nm (nodes in logic network) and columns P1, P2, ..., Pn (patterns found)]

Does this look familiar? Binate covering problem.
We already know an exact solution is hard to find, so we look at heuristics.
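A covering matrix and the "all rows covered at minimal cost" requirement can be sketched as follows. The matrix and the pattern costs are hypothetical, and the search is an exhaustive enumeration that is only feasible for tiny examples. It also checks only the row-coverage condition; the additional requirement that each match has its inputs available from other matches, which is what makes the real problem binate, is left out.

```python
from itertools import combinations

# Hypothetical covering matrix: rows are nodes N1..N4, columns are patterns P1..P4
MATRIX = [
    [1, 0, 1, 0],  # N1
    [1, 1, 0, 0],  # N2
    [0, 1, 0, 1],  # N3
    [0, 0, 1, 1],  # N4
]
COST = [3, 2, 2, 3]  # hypothetical cost of P1..P4

def covers(columns):
    """True if every row has a 1 in at least one of the chosen columns."""
    return all(any(row[c] for c in columns) for row in MATRIX)

def min_cost_cover():
    """Try every subset of columns and keep the cheapest one that covers all rows."""
    best = None
    for k in range(1, len(COST) + 1):
        for cols in combinations(range(len(COST)), k):
            if covers(cols):
                cost = sum(COST[c] for c in cols)
                if best is None or cost < best[0]:
                    best = (cost, cols)
    return best

print(min_cost_cover())
```

The number of subsets doubles with every added pattern, which is one way to see why heuristics are used in practice.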

Covering

Dynamic Programming

• Propose mapping method based on tree covering
  • Network partitioned into forest of trees
  • Solve problem by solving for each tree
  • Stitch trees back together
• Motivated by existence of an efficient dynamic programming algorithm
• General idea of dynamic programming
  • Optimal substructure: optimal solutions of sub-problems can be used to find the optimal solution of the overall problem
  • Overlapping sub-problems: same sub-problems are used to solve many different larger problems
  • Memoization: saving and re-use of already-computed sub-solutions

Example 3

Dynamic Programming

Cover the following circuit using the dynamic programming approach.

Cell library (numbers in parentheses represent cost):
  INV (1)
  NAND2 (2)
  NAND3 (3)
  AOI21 (3)

[Figure: circuit tree with nodes a through j, annotated with the matches found so far]

  a: NAND2 [a] (2)*

• Work from leaves to root
• Given a node, look for best cover of the subtree rooted at that node
• Node a
  • NAND2 is only match, cost of 2
  • * indicates best solution for subtree
  • [] indicates which nodes are covered by pattern

Example 3

Dynamic Programming

[Figure: the same circuit and cell library, annotated with the matches found so far]

  a: NAND2 [a] (2)*
  b: INV [b] (1)*
  c: NAND2 [c] (5)*
  d: INV [d] (1)*
  e: NAND2 [e] (8)
  f: INV [f] (9), AOI21 [f, e, c, d] (6)*
  g: NAND2 [g] (2)*
  h: INV [h] (3)*
  i: NAND2 [i] (5), NAND3 [i, h, g] (3)*

• Node h
  • INV is only match, cost of 3 (2 + 1)
• Node i
  • NAND2 is a possible match, cost is 5
  • NAND3 is a possible match, cost is 3

Example 3

Dynamic Programming

[Figure: the same circuit and cell library, annotated with the matches found so far]

  a: NAND2 [a] (2)*
  b: INV [b] (1)*
  c: NAND2 [c] (5)*
  d: INV [d] (1)*
  e: NAND2 [e] (8)
  f: INV [f] (9), AOI21 [f, e, c, d] (6)*
  g: NAND2 [g] (2)*
  h: INV [h] (3)*
  i: NAND2 [i] (5), NAND3 [i, h, g] (3)*
  j: NAND2 [j] (11)*, NAND3 [j, f, e] (12)

• Node j
  • NAND2 is a possible match, cost is 11
  • NAND3 is a possible match, cost is 12

Example 3

Dynamic Programming

[Figure: the same circuit and cell library with all nodes a through j annotated; at the root, NAND2 [j] (11) and NAND3 [j, f, e] (12)]

• Now we backtrack starting from the root and choose the best covering seen
• At node j, NAND2 gives best cost

Example 3

Dynamic Programming

[Figure: the same annotated circuit, with the selected matches NAND2 [j], AOI21 [f, e, c, d], NAND2 [a], INV [b], and NAND3 [i, h, g]]

• At node f, AOI21 gives best cost
• At node a, NAND2 gives best cost
• At node b, INV gives best cost
• At node i, NAND3 gives best cost

• FINAL COST = 11 (NAND2 2 + AOI21 3 + NAND2 2 + INV 1 + NAND3 3)
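The whole example can be reproduced with a short tree-covering sketch. Several things here are assumptions rather than content of the slides: the circuit structure is inferred from the costs annotated in the example (for instance, c costing 5 implies c = NAND2(a, b)), the primary input names x1 to x7 are hypothetical, and the NAND3 and AOI21 patterns are written in one NAND2/INV form each. Memoization is done with `functools.lru_cache`.

```python
from functools import lru_cache

V = "?"  # pattern variable: matches any subtree, which becomes an input of the cell

# Cell library: name -> (cost, pattern expressed in NAND2/INV form)
LIBRARY = {
    "INV":   (1, ("INV", V)),
    "NAND2": (2, ("NAND2", V, V)),
    "NAND3": (3, ("NAND2", ("INV", ("NAND2", V, V)), V)),
    "AOI21": (3, ("INV", ("NAND2", ("NAND2", V, V), ("INV", V)))),
}

def matches(pattern, node):
    """Yield each way `pattern` fits at `node`, as a tuple of input subtrees."""
    if pattern == V:
        yield (node,)
        return
    if isinstance(node, str) or node[0] != pattern[0]:
        return  # primary input or different gate type: no match
    if pattern[0] == "INV":
        yield from matches(pattern[1], node[1])
        return
    # NAND2 is commutative, so try both child orders
    for left, right in ((node[1], node[2]), (node[2], node[1])):
        for ins_l in matches(pattern[1], left):
            for ins_r in matches(pattern[2], right):
                yield ins_l + ins_r

@lru_cache(maxsize=None)
def best(node):
    """Return (cost, cell name, inputs) of the cheapest cover of the subtree at node."""
    if isinstance(node, str):
        return (0, None, ())  # primary inputs cost nothing
    options = []
    for name, (cost, pattern) in LIBRARY.items():
        for inputs in matches(pattern, node):
            total = cost + sum(best(sub)[0] for sub in inputs)
            options.append((total, name, inputs))
    return min(options, key=lambda option: option[0])

def chosen_cells(node):
    """Backtrack from the root and list the cells of the best cover."""
    if isinstance(node, str):
        return []
    _, name, inputs = best(node)
    cells = [name]
    for sub in inputs:
        cells += chosen_cells(sub)
    return cells

# Circuit tree inferred from the annotated costs (x1..x7 are hypothetical input names)
a = ("NAND2", "x1", "x2")
b = ("INV", "x3")
c = ("NAND2", a, b)
d = ("INV", "x4")
e = ("NAND2", c, d)
f = ("INV", e)
g = ("NAND2", "x5", "x6")
h = ("INV", g)
i = ("NAND2", h, "x7")
j = ("NAND2", f, i)

print(best(j)[0], chosen_cells(j))
```

Under these assumptions the sketch arrives at the same per-node costs as the example (5 at c, 6 at f, 3 at i) and the same final cover of cost 11.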

Conclusion

• Technology mapping phases
  • Decomposition
  • Pattern Matching
  • Covering
• Only considered a few techniques
  • Many more exist with alternative constraints
  • Delay, Power, Reliability, Layout, Congestion, etc.
• Others look at two-phase technology mapping
  • Simplify technology mapping by finding a minimal area solution
  • Apply post-technology-mapping transformations to customize solution to specific interest

ECE 474A/575A

Computer-Aided Logic Design

Lecture 16

Course Summary and Additional Topics

Pipelining

[Figure: timeline of washing (W) and drying (D) three plates. Without pipelining: W1 D1 W2 D2 W3 D3 in sequence. With pipelining: “Stage 1” washes the next plate while “Stage 2” dries the previous one.]

• Intuitive example: Washing dishes with a friend, you wash, friend dries
  • You wash plate 1
  • Then friend dries plate 1, while you wash plate 2
  • Then friend dries plate 2, while you wash plate 3; and so on
  • You don’t sit and watch friend dry; you start on the next plate
• Pipelining: Break task into stages, each stage outputs data for next stage, all stages operate concurrently (if they have data)

Digital Design Copyright © 2006 Frank Vahid
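The time saved in the dishwashing example can be worked out with a small calculation. This sketch assumes every stage takes the same amount of time and that a new item can enter the pipeline as soon as the first stage is free; the function names are hypothetical.

```python
def sequential_time(items, stages, stage_time=1):
    """Each item passes through all stages before the next item starts."""
    return items * stages * stage_time

def pipelined_time(items, stages, stage_time=1):
    """The first item needs `stages` steps; each later item finishes one step after."""
    return (items + stages - 1) * stage_time

# Three plates, two stages (wash, dry), one time unit per stage
print(sequential_time(3, 2), pipelined_time(3, 2))
```

For many items the pipelined time approaches one stage time per item, so a two-stage pipeline gets close to a 2x speedup.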

Concurrency

• Concurrency: Divide task into subparts, execute subparts simultaneously
  • Dishwashing example: Divide stack into 3 substacks, give substacks to 3 neighbors, who work simultaneously; 3 times speedup (ignoring time to move dishes to neighbors' homes)
  • Concurrency does things side-by-side; pipelining instead uses stages (like a factory line)

[Figure: a task divided by concurrency and by pipelining. Can do both, too.]

Component Level Optimization and Tradeoffs

[Figure: delay vs. size plot comparing carry-ripple, carry-select, carry-lookahead, and multilevel carry-lookahead adders]

• Designer picks the adder that satisfies particular delay and size requirements
  • May use different adder types in different parts of same design
  • Faster adders on critical path, smaller adders on non-critical path

[Figure: carry-ripple adder built from full adders (FA), and carry-lookahead adder built from SPG blocks with 4-bit and 2-bit CLA logic]
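As a reminder of why the carry-ripple adder is small but slow, the sketch below builds one from full adders: the carry has to pass through every stage in turn, so the delay grows with the number of bits. This is a behavioral illustration with hypothetical function names, not a model of gate delays.

```python
def full_adder(a, b, ci):
    """One-bit full adder: returns (sum, carry out)."""
    s = a ^ b ^ ci
    co = (a & b) | (ci & (a ^ b))
    return s, co

def carry_ripple_add(a_bits, b_bits, ci=0):
    """Add two equal-length bit lists (least significant bit first)."""
    s_bits = []
    carry = ci
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)  # carry ripples into the next stage
        s_bits.append(s)
    return s_bits, carry

def to_bits(n, width):
    """Integer to bit list, least significant bit first."""
    return [(n >> k) & 1 for k in range(width)]

def from_bits(bits):
    """Bit list (least significant bit first) to integer."""
    return sum(bit << k for k, bit in enumerate(bits))

s_bits, co = carry_ripple_add(to_bits(9, 4), to_bits(5, 4))
print(from_bits(s_bits), co)
```

A carry-lookahead adder computes the same sums but derives the carries in parallel, trading extra logic for shorter delay.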

Power Optimization

• Until now, we’ve focused on size and delay
• Power is another important design criterion
  • Measured in Watts (energy/second)
  • Rate at which energy is consumed
• Increasingly important as more transistors fit on a chip
  • Power not scaling down at same rate as size
    • Means more heat per unit area; cooling is difficult
  • Coupled with batteries not improving at same rate
    • Means battery can’t supply chip’s power for as long
• CMOS technology: Switching a wire from 0 to 1 consumes power (known as dynamic power)
  • P = k * C * V^2 * f
  • k: constant; C: capacitance of wires; V: voltage; f: switching frequency
• Power reduction methods
  • Reduce voltage: But slower, and there’s a limit
  • What else?

[Figure: energy (1 = value in 2001) for the years 2001 to 2009, comparing battery energy density with energy demand]
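The effect of each term in P = k * C * V^2 * f can be tried out numerically. The values below are arbitrary illustration numbers, not data from the slides; the point is only that power falls with the square of the voltage but linearly with the switching frequency.

```python
import math

def dynamic_power(k, c, v, f):
    """Dynamic power P = k * C * V^2 * f."""
    return k * c * v ** 2 * f

base = dynamic_power(k=1, c=2, v=3, f=5)           # arbitrary reference point
half_voltage = dynamic_power(k=1, c=2, v=1.5, f=5)
half_frequency = dynamic_power(k=1, c=2, v=3, f=2.5)

print(math.isclose(half_voltage / base, 0.25))     # halving V quarters the power
print(math.isclose(half_frequency / base, 0.5))    # halving f halves the power
```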

Power Optimization using Clock Gating

• P = k * C * V^2 * f
• Much of a chip’s switching f (>30%) is due to clock signals
  • After all, clock goes to every register
  • Portion of FIR filter shown in the figure
  • Notice clock signals n1, n2, n3, n
• Solution: Disable clock switching to registers unused in a particular state
  • Achieve using AND gates
  • FSM only sets 2nd input to AND gate to 1 in those states during which register gets loaded
• Note: Advanced method, usually done by tools, not designers
  • Putting gates on clock wires creates variations in clock signal (clock skew); must be done with great care

[Figure: portion of FIR filter with registers xt0, xt1, xt2 and yreg, load signals x_ld and y_ld, and clock clk, shown without and with clock gating. Without gating: much switching on clock wires. With gating: greatly reduced switching, less power.]
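The reduction in clock switching can be illustrated by counting rising edges on a register's clock input with and without an AND gate in front of it. The waveforms below are hypothetical, with the load signal held stable for a whole clock cycle; real designs need extra care to avoid glitches and skew, as noted above.

```python
def rising_edges(signal):
    """Count 0 -> 1 transitions in a sampled waveform."""
    return sum(1 for prev, cur in zip(signal, signal[1:]) if prev == 0 and cur == 1)

def gated_clock(clk, enable):
    """Clock as seen by the register when passed through an AND gate."""
    return [c & e for c, e in zip(clk, enable)]

# Eight clock cycles, two samples per cycle (low phase, high phase)
clk = [0, 1] * 8

# Hypothetical load signal: the register is loaded only in cycles 1 and 5
x_ld = [0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0]

print(rising_edges(clk), rising_edges(gated_clock(clk, x_ld)))
```

The gated register sees a clock edge only in the cycles where it is actually loaded, so its clock wire switches far less often.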

Power Optimization

Low-Power Gates on Non-Critical Paths

• Another method: Use low-power gates
  • Multiple versions of gates may exist
    • Fast/high-power, and slow/low-power, versions
  • Use slow/low-power gates on non-critical paths
    • Reduces power, without increasing delay

[Figure: two versions of a circuit with inputs a through g and output F. First version: 26 transistors, 3 ns delay, 5 nanowatts power. Second version: 26 transistors, 3 ns delay, 4 nanowatts power. Plot labels: low-power gates, high-power gates, low-power gates on non-critical path; axes: delay, size.]
