Digital Signal Processing for FPGA: Altera Stratix and Virtex II Pro, Slides of Electronics engineering

An in-depth comparison between programmable DSP processors (PDSP) and FPGAs, focusing on the Virtex II Pro and Altera Stratix FPGAs. It covers the Stratix DSP block and its configuration, the Altera design flow, and the use of FPGAs in DSP applications. The document also discusses the memory architecture of the Altera Stratix FPGA and the design flow for creating designs using Altera libraries.

Typology: Slides

2012/2013

Uploaded on 03/23/2013

dhrupad

DSP for FPGA
SYSC5603 (ELG6163) Digital Signal Processing
Microprocessors, Software and Applications

Objectives

• Comparison between PDSP and FPGA

• Virtex II Pro

• Altera Stratix FPGA

• Stratix DSP Block and its configuration

• Altera design flow

Why Use FPGAs in DSP Applications?

  • 10x More DSP Throughput Than DSP Processors
  • Parallel vs. Serial Architecture
  • Cost-Effective for Multi-Channel Applications
  • Flexible Hardware Implementation
  • Single-Chip Solution
    • System (Hardware/Software) Integration Benefits

[Figure: a DSP system integrated on a single FPGA, with software running on an embedded processor inside the FPGA]

High Speed DSP Processor

  • 1-8 Multipliers
  • Needs looping for more than 8 multiplications
  • Needs multiple clock cycles because of serial computation
  • 200 Tap FIR Filter would need 25+ clock cycles per sample with an 8 MAC unit processor

High Level of Parallel Processing in FPGA

  • Can implement hundreds of MAC functions in an FPGA
  • Parallel implementation allows for faster throughput
  • 200 Tap FIR Filter would need 1 clock cycle per sample
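The cycle counts quoted above follow from dividing the number of filter taps by the number of MAC units that can work in the same clock cycle. A minimal sketch of that arithmetic, assuming each MAC unit finishes one multiply-accumulate per cycle and ignoring pipeline latency, loop overhead, and memory stalls (the function name is illustrative, not from the slides):

```python
import math


def fir_cycles_per_sample(taps: int, mac_units: int) -> int:
    """Clock cycles needed to produce one output sample of a FIR filter.

    Assumption: every MAC unit completes one multiply-accumulate per cycle,
    so the taps are processed in groups of `mac_units`.
    """
    if taps <= 0 or mac_units <= 0:
        raise ValueError("taps and mac_units must be positive")
    return math.ceil(taps / mac_units)


# DSP processor with 8 MAC units: 200 taps are processed 8 at a time.
print(fir_cycles_per_sample(200, 8))    # 25
# FPGA with one MAC per tap: all 200 products are formed in parallel.
print(fir_cycles_per_sample(200, 200))  # 1
```

The "25+" on the slide is consistent with this lower bound: a real processor adds loop and memory-access overhead on top of the 25 MAC cycles.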

DSP Processors vs. FPGAs

Comparison of DSP Devices

Programmable DSP Processors

  • Benefits
    • Easy to Use
    • Programmed Via C-Code or Assembly
    • Fast Development Time
  • Weaknesses
    • Fixed Architecture
    • Inefficient for Highly Recursive Algorithms Unless Hardware Accelerated
    • Potential Bus Bottlenecks
    • Other Devices (FPGAs) Often Used on Board for Other Functions

Reconfigurable DSP (FPGAs)

  • Benefits
    • Easy to Use
    • Programmed via C-Code, Assembly, or HDL
    • Efficient for Recursive Algorithms Using DSP IP Cores
    • Higher Levels of Integration
  • Weaknesses
    • Longer Development Time (But Getting Shorter!)

Memory Bandwidth Summary

Stratix Device Family [1]

Device    Total RAM Bits    M-RAM Blocks    M4K Blocks    M512 Blocks    Maximum Bandwidth (Mbps)
EP1S10    920,448           1               60            94             1,245,
EP1S20    1,669,248         2               82            194            2,096,
EP1S25    1,944,576         2               138           224            2,894,
EP1S30    3,317,184         4               171           295            3,750,
EP1S40    3,423,744         4               183           384            4,384,
EP1S60    5,215,104         6               292           574            6,762,
EP1S80    7,427,520         9               364           767            8,784,
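The Total RAM Bits column can be cross-checked against the block counts in the table. The sketch below assumes per-block capacities that include parity bits (576 bits for the smallest block type, 4,608 bits per M4K block, 589,824 bits per M-RAM block); these capacities are not stated on the slide and are labeled as assumptions in the code.

```python
# Assumed block capacities in bits, parity bits included (not from the slide).
SMALL_BLOCK_BITS = 576    # 512 data bits + 64 parity bits (M512-class block)
M4K_BITS = 4_608          # 4,096 data bits + 512 parity bits
MRAM_BITS = 589_824       # 524,288 data bits + 65,536 parity bits

# (device, total RAM bits, M-RAM blocks, M4K blocks, small blocks) from the table.
STRATIX_DEVICES = [
    ("EP1S10", 920_448, 1, 60, 94),
    ("EP1S20", 1_669_248, 2, 82, 194),
    ("EP1S25", 1_944_576, 2, 138, 224),
    ("EP1S30", 3_317_184, 4, 171, 295),
    ("EP1S40", 3_423_744, 4, 183, 384),
    ("EP1S60", 5_215_104, 6, 292, 574),
    ("EP1S80", 7_427_520, 9, 364, 767),
]


def total_ram_bits(mram_blocks: int, m4k_blocks: int, small_blocks: int) -> int:
    """Sum the capacity of all embedded memory blocks in a device."""
    return (mram_blocks * MRAM_BITS
            + m4k_blocks * M4K_BITS
            + small_blocks * SMALL_BLOCK_BITS)


for name, listed_total, mram, m4k, small in STRATIX_DEVICES:
    computed = total_ram_bits(mram, m4k, small)
    print(f"{name}: listed {listed_total:,}, computed {computed:,}")
```

Under these assumed capacities the computed totals reproduce every listed value, which suggests the table counts parity bits as part of the total.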

Logic Element (LE) [2]

[Figure: functional diagram of a logic element]

• Inputs: data1, data2, data3, data4, cin, addnsub (2), LUT Chain Input, Register Chain Input, Register Control Signals
• Blocks: 4-Input LUT, Sync Load & Clear Logic, Register, Register Feedback
• Outputs: Row, Column & DirectLink Routing, Local Routing, LUT Chain Output, Register Chain Output

Note:

  1. Functional Diagram Only. Please See Datasheet for more Details.
  2. addnsub & data1 connected via XOR logic
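A 4-input LUT can be modeled as a 16-entry truth table indexed by its four data inputs, and the XOR on the add/subtract control is the usual trick for turning an adder into a subtractor. The sketch below is a behavioral illustration only, not the actual LE circuit: the function names, the bit ordering of the LUT index, and the polarity of the `subtract` control (1 means subtract) are assumptions.

```python
def make_lut_mask(fn) -> int:
    """Build a 16-bit truth table from a Boolean function of four inputs."""
    mask = 0
    for index in range(16):
        bits = [(index >> k) & 1 for k in range(4)]
        if fn(*bits):
            mask |= 1 << index
    return mask


def lut4(mask: int, data1: int, data2: int, data3: int, data4: int) -> int:
    """Evaluate a 4-input LUT: the inputs select one bit of the truth table."""
    index = data1 | (data2 << 1) | (data3 << 2) | (data4 << 3)
    return (mask >> index) & 1


def add_or_sub(a: int, b: int, subtract: int, width: int = 8) -> int:
    """Ripple-carry add/subtract, one bit per stage.

    XORing each bit of b with `subtract` and using `subtract` as the initial
    carry yields a - b in two's complement when subtract is 1.
    """
    carry = subtract
    result = 0
    for i in range(width):
        a_bit = (a >> i) & 1
        b_bit = ((b >> i) & 1) ^ subtract
        result |= (a_bit ^ b_bit ^ carry) << i
        carry = (a_bit & b_bit) | (carry & (a_bit ^ b_bit))
    return result


parity_mask = make_lut_mask(lambda a, b, c, d: a ^ b ^ c ^ d)
print(hex(parity_mask))              # 0x6996
print(lut4(parity_mask, 1, 0, 1, 1))  # 1 (three inputs set, odd parity)
print(add_or_sub(5, 3, subtract=0))   # 8
print(add_or_sub(5, 3, subtract=1))   # 2
```

Any Boolean function of four inputs fits in one such table, which is why the LUT is the basic combinational building block of the LE.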

Logic Array Blocks (LAB) [2]

• 10 LEs

• Local Interconnect

• LAB-Wide Control Signals

[Figure: a LAB containing ten LEs and LAB-wide control signals; the local interconnect is fed by 30 LAB input lines and 10 LE feedback lines]

Avalon Switch Fabric Contents

• Avalon Switch Fabric provides the following to the peripherals it connects

– Data-Path Multiplexing

– Address Decoding

– Wait-State Generation

– Dynamic Bus Sizing

– Interrupt-Priority Assignment

– Latent Transfer Capabilities

– Streaming Read and Write Capabilities

• Avalon Switch Fabric tailors transactions to the characteristics of the peripherals that are attached
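Dynamic bus sizing, listed above, lets a wide master talk to a narrower slave without the master knowing the slave's width. A minimal sketch of the idea for a 32-bit read from a 16-bit slave, assuming the fabric issues two consecutive 16-bit reads and places the lower address in the low half of the word (the function name, the word ordering, and the example contents are assumptions for illustration):

```python
from typing import Sequence


def read32_from_16bit_slave(slave_words: Sequence[int], byte_address: int) -> int:
    """Assemble one 32-bit master read from two 16-bit slave reads.

    Assumption: the half-word at the lower address supplies bits 15..0.
    """
    if byte_address % 4:
        raise ValueError("this sketch handles only 32-bit-aligned addresses")
    index = byte_address // 2              # the slave is addressed in 16-bit words
    low = slave_words[index] & 0xFFFF      # first slave transfer
    high = slave_words[index + 1] & 0xFFFF # second slave transfer
    return (high << 16) | low


# Hypothetical contents of a 16-bit-wide memory.
memory_16bit = [0x5678, 0x1234, 0xBEEF, 0xDEAD]
print(hex(read32_from_16bit_slave(memory_16bit, 0)))  # 0x12345678
print(hex(read32_from_16bit_slave(memory_16bit, 4)))  # 0xdeadbeef
```

The cost of this convenience is time: the master waits for two slave transfers per 32-bit access, which is where wait-state generation comes in.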

Data Path Multiplexing & Slave Arbitration

[Figure: Avalon Switch Fabric connecting masters to slaves]

• Masters: CPU 32 Bit (Inst Master, Data Master); DMA Controller With Streaming (Read Port, Master – Streaming; Write Port, Master – Streaming)
• Slaves: Instruction Memory (32-bit Data path); Data Memory (32-bit Data path); UART; VGA Controller; DMA Controller Control Port (Slave); Avalon Tri-State Bridge
• Behind the Avalon Tri-State Bridge: External FLASH 1 MB (16-bit Datapath); External SRAM 256 KB (32-bit Datapath)
• Inside the Avalon Switch Fabric: MUX, Arbiter

1. Data-Path Multiplexing
2. Slave Arbitration
3. Address Decoding
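Address decoding and slave arbitration can be sketched in a few lines. The address map below is hypothetical (only the 1 MB FLASH and 256 KB SRAM sizes come from the diagram), and plain round-robin is assumed as the arbitration policy because the slides do not name one.

```python
# Hypothetical address map: (slave name, base address, size in bytes).
ADDRESS_MAP = [
    ("ext_flash", 0x0000_0000, 1 << 20),    # 1 MB, size taken from the diagram
    ("ext_sram", 0x0010_0000, 256 << 10),   # 256 KB, size taken from the diagram
    ("uart", 0x0020_0000, 0x20),            # size assumed for illustration
]


def decode(address: int):
    """Address decoding: pick the slave whose range contains the address."""
    for name, base, size in ADDRESS_MAP:
        if base <= address < base + size:
            return name, address - base
    raise ValueError(f"no slave mapped at {address:#x}")


class RoundRobinArbiter:
    """Slave arbitration: grant one of several requesting masters per transfer."""

    def __init__(self, masters):
        self.masters = list(masters)
        self.next_index = 0  # master that has priority on the next grant

    def grant(self, requesting):
        count = len(self.masters)
        for offset in range(count):
            i = (self.next_index + offset) % count
            if self.masters[i] in requesting:
                self.next_index = (i + 1) % count
                return self.masters[i]
        return None  # nobody is requesting this slave


print(decode(0x0010_0004))  # ('ext_sram', 4)

sram_arbiter = RoundRobinArbiter(["cpu_data", "dma_read"])
print(sram_arbiter.grant({"cpu_data", "dma_read"}))  # cpu_data
print(sram_arbiter.grant({"cpu_data", "dma_read"}))  # dma_read
```

One arbiter per shared slave, rather than one for the whole bus, is what allows two masters to access different slaves at the same time.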
