DSP for FPGA
SYSC5603 (ELG6163) Digital Signal Processing
Microprocessors, Software and Applications
Objectives
• Comparison between PDSP and FPGA
• Virtex II Pro
• Altera Stratix FPGA
• Stratix DSP Block and its configuration
• Altera design flow
Why Use FPGAs in DSP Applications?
- 10x More DSP Throughput Than DSP Processors
- Parallel vs. Serial Architecture
- Cost-Effective for Multi-Channel Applications
- Flexible Hardware Implementation
- Single-Chip Solution
- System (Hardware/Software) Integration
Benefits
[Figure: two block diagrams. "FPGA": Software, Embedded Processor. "DSP System": FPGA, Software, DSP.]
High Speed DSP Processor
- 1-8 Multipliers
- Needs looping for more than 8 multiplications
- Needs multiple clock cycles because of serial computation
- 200 Tap FIR Filter would need 25+ clock cycles per sample with an 8 MAC unit processor
High Level of Parallel Processing in FPGA
- Can implement hundreds of MAC functions in an FPGA
- Parallel implementation allows for faster throughput
- 200 Tap FIR Filter would need 1 clock cycle per sample
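The cycle counts quoted for the 200-tap FIR filter follow from dividing the number of taps by the number of MAC units available per clock cycle. The sketch below is a simplified model, not a description of any particular device: the function name `fir_sample` and the test data are made up for illustration, and it assumes every MAC unit completes one multiply-accumulate per cycle with no loop or memory overhead (which is why the slide says "25+" rather than exactly 25).

```python
def fir_sample(coeffs, window, mac_units):
    """Compute one FIR output sample and count the clock cycles used.

    Assumption: each cycle, up to `mac_units` multiply-accumulates
    run in parallel; loop and memory overhead is ignored.
    """
    acc = 0
    cycles = 0
    for start in range(0, len(coeffs), mac_units):
        stop = start + mac_units
        # One clock cycle: all MAC units in this group fire together.
        acc += sum(c * x for c, x in zip(coeffs[start:stop], window[start:stop]))
        cycles += 1
    return acc, cycles


coeffs = [1] * 200          # illustrative coefficients
window = list(range(200))   # illustrative input samples

print(fir_sample(coeffs, window, mac_units=8))    # (19900, 25)
print(fir_sample(coeffs, window, mac_units=200))  # (19900, 1)
```

Both calls produce the same output sample; only the cycle count changes, which is the throughput argument made for the FPGA.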
DSP Processors vs. FPGAs
Programmable DSP Processors
Benefits
- Easy to Use
- Programmed via C-Code or Assembly
- Fast Development Time
Weaknesses
- Fixed Architecture
- Inefficient for Highly Recursive Algorithms Unless Hardware Accelerated
- Potential Bus Bottlenecks
- Other Devices (FPGAs) Often Used on Board for Other Functions
Reconfigurable DSP (FPGAs)
Benefits
- Easy to Use
- Programmed via C-Code, Assembly, or HDL
- Efficient for Recursive Algorithms Using DSP IP Cores
- Higher Levels of Integration
Weaknesses
- Longer Development Time (But Getting Shorter!)
Comparison of DSP Devices
Memory Bandwidth Summary
Stratix Device Family [1]
Device   Total RAM Bits   M-RAM Blocks   M4K Blocks   M512 Blocks   Maximum Bandwidth (Mbps)
EP1S10   920,448          1              60           94            1,245,
EP1S20   1,669,248        2              82           194           2,096,
EP1S25   1,944,576        2              138          224           2,894,
EP1S30   3,317,184        4              171          295           3,750,
EP1S40   3,423,744        4              183          384           4,384,
EP1S60   5,215,104        6              292          574           6,762,
EP1S80   7,427,520        9              364          767           8,784,
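The Total RAM Bits column can be cross-checked against the block counts. The sketch below assumes the per-block capacities given in the Stratix datasheet, counting parity bits: 576 bits per M512 block, 4,608 bits per M4K block, and 589,824 bits per M-RAM block. These sizes do not appear in the table itself, and the names `BLOCK_BITS`, `DEVICES`, and `total_ram_bits` are illustrative.

```python
# Bits per block, including parity (assumed from the Stratix datasheet).
BLOCK_BITS = {"M-RAM": 589_824, "M4K": 4_608, "M512": 576}

# device: (listed total RAM bits, M-RAM blocks, M4K blocks, M512 blocks)
DEVICES = {
    "EP1S10": (920_448, 1, 60, 94),
    "EP1S20": (1_669_248, 2, 82, 194),
    "EP1S25": (1_944_576, 2, 138, 224),
    "EP1S30": (3_317_184, 4, 171, 295),
    "EP1S40": (3_423_744, 4, 183, 384),
    "EP1S60": (5_215_104, 6, 292, 574),
    "EP1S80": (7_427_520, 9, 364, 767),
}


def total_ram_bits(mram, m4k, m512):
    """Sum the capacity of all embedded memory blocks, in bits."""
    return (mram * BLOCK_BITS["M-RAM"]
            + m4k * BLOCK_BITS["M4K"]
            + m512 * BLOCK_BITS["M512"])


for name, (listed, mram, m4k, m512) in DEVICES.items():
    computed = total_ram_bits(mram, m4k, m512)
    print(f"{name}: listed={listed:,} computed={computed:,} match={computed == listed}")
```

Under these assumed block sizes, the computed total matches the listed total for every device in the table.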
Logic Element (LE) [2]
[Figure: functional diagram of a logic element. Inputs: four data inputs, cin, addnsub, LUT Chain Input, Register Chain Input, Register Control Signals. Blocks: 4-Input LUT, Sync Load & Clear Logic, Register (D, DATA). Outputs: Row, Column & DirectLink Routing; Local Routing; LUT Chain Output; Register Chain Output; Register Feedback.]
Note:
- Functional diagram only. Please see the datasheet for more details.
- addnsub & data1 are connected via XOR logic.
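A 4-input LUT can be viewed as a 16-entry truth table addressed by its four data inputs. The following is a minimal behavioral sketch of that idea only; it does not model the register, the carry chain, or the addnsub logic of the LE, and the names `make_mask` and `lut4` as well as the input-to-bit ordering are assumptions made for illustration.

```python
def make_mask(fn):
    """Build a 16-bit truth table for a 4-input Boolean function.

    Bit i of the mask holds fn(a, b, c, d) for the input combination
    whose index is i = a + 2*b + 4*c + 8*d (ordering is an assumption).
    """
    mask = 0
    for index in range(16):
        a, b, c, d = (index >> 0) & 1, (index >> 1) & 1, (index >> 2) & 1, (index >> 3) & 1
        if fn(a, b, c, d):
            mask |= 1 << index
    return mask


def lut4(mask, a, b, c, d):
    """Look up the output bit selected by the four inputs."""
    index = a | (b << 1) | (c << 2) | (d << 3)
    return (mask >> index) & 1


xor_mask = make_mask(lambda a, b, c, d: a ^ b ^ c ^ d)
and_mask = make_mask(lambda a, b, c, d: a & b & c & d)

print(hex(xor_mask))               # 0x6996
print(hex(and_mask))               # 0x8000
print(lut4(xor_mask, 1, 0, 1, 1))  # 1
print(lut4(and_mask, 1, 1, 1, 0))  # 0
```

Any 4-input function fits in the same 16 bits, which is why the same LE hardware can be configured for arbitrary small logic functions.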
Logic Array Blocks (LAB) [2]
• 10 LEs
• Local Interconnect
• LAB-Wide Control Signals
[Figure: LAB structure. Ten LEs and the control signals attach to the local interconnect, with 30 LAB input lines and 10 LE feedback lines.]
Avalon Switch Fabric Contents
• Avalon Switch Fabric provides the following to the peripherals it connects
– Data-Path Multiplexing
– Address Decoding
– Wait-State Generation
– Dynamic Bus Sizing
– Interrupt-Priority Assignment
– Latent Transfer Capabilities
– Streaming Read and Write Capabilities
• Avalon Switch Fabric tailors transactions to the characteristics of the attached peripherals
Data Path Multiplexing & Slave Arbitration
[Figure: Avalon Switch Fabric block diagram. Masters: CPU 32 Bit (Inst Master, Data Master); DMA Controller With Streaming (Read Port and Write Port, both streaming masters). Slaves: DMA Control Port, Instruction Memory (32-bit data path), Data Memory (32-bit data path), UART, VGA Controller, External FLASH (1 MB, 16-bit data path), External SRAM (256 KB, 32-bit data path). Also shown: Arbiter, MUX, Avalon Tri-State Bridge.]
Callouts in the figure:
1. Data-Path Multiplexing
2. Slave Arbitration
3. Address Decoding
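Address decoding, dynamic bus sizing, and slave arbitration can be illustrated with a small behavioral model. This is a sketch under stated assumptions, not the Avalon implementation: the base addresses and the 4 KB data-memory span are hypothetical (only the FLASH and SRAM sizes and data-path widths come from the figure), the names `Slave`, `decode`, `transfers_needed`, and `RoundRobinArbiter` are invented here, and plain round-robin is swapped in as a simple stand-in for the fabric's arbitration scheme.

```python
from dataclasses import dataclass


@dataclass
class Slave:
    name: str
    base: int        # hypothetical base address
    span: int        # size of the address range in bytes
    width_bits: int  # native data-path width


SLAVES = [
    Slave("data_memory", 0x0000_0000, 0x1000, 32),   # span is an assumption
    Slave("ext_flash", 0x0010_0000, 0x10_0000, 16),  # 1 MB, 16-bit
    Slave("ext_sram", 0x0020_0000, 0x4_0000, 32),    # 256 KB, 32-bit
]


def decode(address):
    """Address decoding: pick the slave whose range contains the address."""
    for slave in SLAVES:
        if slave.base <= address < slave.base + slave.span:
            return slave
    return None


def transfers_needed(master_bits, slave):
    """Dynamic bus sizing: a wide master needs several narrow-slave transfers."""
    return max(1, master_bits // slave.width_bits)


class RoundRobinArbiter:
    """Slave arbitration: grant one requesting master at a time, in turn."""

    def __init__(self, masters):
        self.masters = list(masters)
        self.next = 0

    def grant(self, requesting):
        count = len(self.masters)
        for offset in range(count):
            idx = (self.next + offset) % count
            if self.masters[idx] in requesting:
                self.next = (idx + 1) % count
                return self.masters[idx]
        return None


flash = decode(0x0010_0004)
print(flash.name, transfers_needed(32, flash))  # ext_flash 2

arbiter = RoundRobinArbiter(["cpu_data", "dma_read"])
print(arbiter.grant({"cpu_data", "dma_read"}))  # cpu_data
print(arbiter.grant({"cpu_data", "dma_read"}))  # dma_read
```

In this model the 32-bit CPU read of the 16-bit FLASH takes two transfers, and two masters contending for the same slave are served alternately.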