Simple Digital Camera - Embedded System Design - Lecture Slides, Slides of Computer Science

These are the Lecture Slides of Embedded System Design, which include Hardware Design, Elevator Controller, Simple Elevator Controller, Try Capturing, Unit Control, Request Resolver, Sequential Program Model, Partial English Description, System Interface etc.

Typology: Slides

2012/2013

Uploaded on 03/22/2013

dhritiman

Outline

• Introduction to a simple digital camera

• Designer’s perspective

• Requirements specification

• Design

– Four implementations

Introduction

• Putting it all together

– General-purpose processor

– Single-purpose processor

• Custom

• Standard

– Memory

– Interfacing

• Knowledge applied to designing a simple digital camera

– General-purpose vs. single-purpose processors

– Partitioning of functionality among different processor types

Designer’s perspective

• Two key tasks

– Processing images and storing in memory

• When shutter pressed:

– Image captured

– Converted to digital form by charge-coupled device (CCD)

– Compressed and archived in internal memory

– Uploading images to PC

• Digital camera attached to PC

• Special software commands camera to transmit archived images serially

CCD module

  • Simulates real CCD
  • CcdInitialize is passed name of image file
  • CcdCapture reads “image” from file
  • CcdPopPixel outputs pixels one at a time

#include <stdio.h>

#define SZ_ROW 64
#define SZ_COL (64 + 2)   /* 64 pixels + 2 blacked-out pixels per row */

static FILE *imageFileHandle;
static char buffer[SZ_ROW][SZ_COL];
static unsigned rowIndex, colIndex;

/* Open the image file; indices stay invalid until CcdCapture is called. */
void CcdInitialize(const char *imageFileName) {
    imageFileHandle = fopen(imageFileName, "r");
    rowIndex = -1;
    colIndex = -1;
}

/* Read the whole "image" from the file into the buffer. */
void CcdCapture(void) {
    int pixel;
    rewind(imageFileHandle);
    for(rowIndex=0; rowIndex<SZ_ROW; rowIndex++) {
        for(colIndex=0; colIndex<SZ_COL; colIndex++) {
            if( fscanf(imageFileHandle, "%i", &pixel) == 1 ) {
                buffer[rowIndex][colIndex] = (char)pixel;
            }
        }
    }
    rowIndex = 0;
    colIndex = 0;
}

/* Output pixels one at a time, row by row. */
char CcdPopPixel(void) {
    char pixel;
    pixel = buffer[rowIndex][colIndex];
    if( ++colIndex == SZ_COL ) {
        colIndex = 0;
        if( ++rowIndex == SZ_ROW ) {
            colIndex = -1;
            rowIndex = -1;
        }
    }
    return pixel;
}

UART module

• Actually a half UART

– Only transmits, does not receive

• UartInitialize is passed name of file to output to

• UartSend transmits (writes to output file) a byte at a time

#include <stdio.h>

static FILE *outputFileHandle;

void UartInitialize(const char *outputFileName) {
    outputFileHandle = fopen(outputFileName, "w");
}

/* "Transmit" one byte by writing it to the output file as a decimal value. */
void UartSend(char d) {
    fprintf(outputFileHandle, "%i\n", (int)d);
}

CODEC (cont.)

  • Implementing FDCT formula: $C(h) = 1/\sqrt{2}$ if $h = 0$, else $C(h) = 1$; $F(u,v) = \frac{1}{4}\, C(u)\, C(v) \sum_{x=0}^{7} \sum_{y=0}^{7} D_{xy} \cos\frac{\pi(2x+1)u}{16} \cos\frac{\pi(2y+1)v}{16}$
  • Only 64 possible inputs to COS, so a table can be used to save computation time
  • Floating-point values multiplied by 32,768 and rounded to nearest integer
  • 32,768 chosen in order to store each value in 2 bytes of memory
  • Fixed-point representation explained more later
  • FDCT unrolls inner loop of summation, implements outer summation as two consecutive for loops

/* COS_TABLE[xy][uv] = cos(pi * (2*xy + 1) * uv / 16), scaled by 32768.
   The first column is written as 32767 because 32768 does not fit
   in a 16-bit signed short. */
static const short COS_TABLE[8][8] = {
    { 32767,  32138,  30273,  27245,  23170,  18204,  12539,   6392 },
    { 32767,  27245,  12539,  -6392, -23170, -32138, -30273, -18204 },
    { 32767,  18204, -12539, -32138, -23170,   6392,  30273,  27245 },
    { 32767,   6392, -30273, -18204,  23170,  27245, -12539, -32138 },
    { 32767,  -6392, -30273,  18204,  23170, -27245, -12539,  32138 },
    { 32767, -18204, -12539,  32138, -23170,  -6392,  30273, -27245 },
    { 32767, -27245,  12539,   6392, -23170,  32138, -30273,  18204 },
    { 32767, -32138,  30273, -27245,  23170, -18204,  12539,  -6392 }
};

static short ONE_OVER_SQRT_TWO = 23170;

static double COS(int xy, int uv) {
    return COS_TABLE[xy][uv] / 32768.0;
}

static double C(int h) {
    return h ? 1.0 : ONE_OVER_SQRT_TWO / 32768.0;
}

static int FDCT(int u, int v, short img[8][8]) {
    double s[8], r = 0;
    int x;
    /* inner summation over y, unrolled */
    for(x=0; x<8; x++) {
        s[x] = img[x][0] * COS(0, v) + img[x][1] * COS(1, v) +
               img[x][2] * COS(2, v) + img[x][3] * COS(3, v) +
               img[x][4] * COS(4, v) + img[x][5] * COS(5, v) +
               img[x][6] * COS(6, v) + img[x][7] * COS(7, v);
    }
    /* outer summation over x */
    for(x=0; x<8; x++) r += s[x] * COS(x, u);
    return (short)(r * .25 * C(u) * C(v));
}
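The scaling by 32,768 described above is a fixed-point format with 15 fractional bits. The slide code converts table entries back to double before multiplying; as a minimal sketch of the integer-only alternative, the product can be formed in a wider integer and divided by the scale factor. The names Q15_ONE and q15_mul are hypothetical and do not appear in the slides.

```c
#include <stdint.h>

/* Hypothetical helpers, not part of the slides' CODEC module. */
#define Q15_ONE INT32_C(32768)   /* the integer that represents 1.0 */

/* Multiply an integer sample by a coefficient stored as value * 32768.
   The product is formed in 32 bits, then scaled back down.
   Division truncates toward zero, so the result is well defined
   for negative samples too. */
int32_t q15_mul(int16_t sample, int16_t coeff) {
    return ((int32_t)sample * (int32_t)coeff) / Q15_ONE;
}
```

For example, q15_mul(100, 23170) scales 100 by the table's representation of 1/sqrt(2) and yields 70, the truncated value of 70.7.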

CNTRL (controller) module

  • Heart of the system
  • CntrlInitialize for consistency with other modules only
  • CntrlCaptureImage uses CCDPP module to input image and place in buffer
  • CntrlCompressImage breaks the 64 x 64 buffer into 8 x 8 blocks and performs FDCT on each block using the CODEC module
    • Also performs quantization on each block
  • CntrlSendImage transmits encoded image serially using UART module

#define SZ_ROW 64
#define SZ_COL 64
#define NUM_ROW_BLOCKS (SZ_ROW / 8)
#define NUM_COL_BLOCKS (SZ_COL / 8)

static short buffer[SZ_ROW][SZ_COL], i, j, k, l, temp;

void CntrlInitialize(void) {}

void CntrlCaptureImage(void) {
    CcdppCapture();
    for(i=0; i<SZ_ROW; i++)
        for(j=0; j<SZ_COL; j++)
            buffer[i][j] = CcdppPopPixel();
}

void CntrlCompressImage(void) {
    for(i=0; i<NUM_ROW_BLOCKS; i++)
        for(j=0; j<NUM_COL_BLOCKS; j++) {
            /* push one 8 x 8 block into the CODEC */
            for(k=0; k<8; k++)
                for(l=0; l<8; l++)
                    CodecPushPixel((char)buffer[i * 8 + k][j * 8 + l]);
            CodecDoFdct(); /* part 1 - FDCT */
            for(k=0; k<8; k++)
                for(l=0; l<8; l++) {
                    buffer[i * 8 + k][j * 8 + l] = CodecPopPixel();
                    /* part 2 - quantization */
                    buffer[i * 8 + k][j * 8 + l] >>= 6;
                }
        }
}

void CntrlSendImage(void) {
    for(i=0; i<SZ_ROW; i++)
        for(j=0; j<SZ_COL; j++) {
            temp = buffer[i][j];
            UartSend(((char*)&temp)[0]); /* send upper byte */
            UartSend(((char*)&temp)[1]); /* send lower byte */
        }
}
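The block-wise indexing in CntrlCompressImage can be checked in isolation. The sketch below walks the 64 x 64 buffer in 8 x 8 blocks with the same index expression and counts visits, so that each pixel should be touched exactly once. The names visit_by_blocks and quantize_div64 are hypothetical. Division by 64 is used here as a portable stand-in for the slide's `>>= 6`; the two agree for non-negative values only, since right-shifting a negative value is implementation-defined in C.

```c
#define SZ_ROW 64
#define SZ_COL 64
#define BLOCK  8   /* block edge length handled by the CODEC */

/* Hypothetical helper: clear the visit map, then walk the image block
   by block as CntrlCompressImage does. Returns the number of pixels
   touched. */
int visit_by_blocks(unsigned char visits[SZ_ROW][SZ_COL]) {
    int i, j, k, l, total = 0;
    for (i = 0; i < SZ_ROW; i++)
        for (j = 0; j < SZ_COL; j++)
            visits[i][j] = 0;
    for (i = 0; i < SZ_ROW / BLOCK; i++)
        for (j = 0; j < SZ_COL / BLOCK; j++)
            for (k = 0; k < BLOCK; k++)
                for (l = 0; l < BLOCK; l++) {
                    visits[i * BLOCK + k][j * BLOCK + l]++;
                    total++;
                }
    return total;
}

/* Hypothetical quantization step: divide a coefficient by 64,
   truncating toward zero. */
short quantize_div64(short coeff) {
    return (short)(coeff / 64);
}
```

Calling visit_by_blocks on a 64 x 64 array returns 4096 and leaves every entry equal to 1, which confirms that the 64 blocks tile the buffer without gaps or overlap.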

Putting it all together

• Main initializes all modules, then uses CNTRL module to capture, compress, and transmit one image

• This system-level model can be used for extensive experimentation

– Bugs much easier to correct here rather than in later models

int main(int argc, char *argv[]) {
    char *uartOutputFileName = argc > 1 ? argv[1] : "uart_out.txt";
    char *imageFileName = argc > 2 ? argv[2] : "image.txt";

    /* initialize the modules */
    UartInitialize(uartOutputFileName);
    CcdInitialize(imageFileName);
    CcdppInitialize();
    CodecInitialize();
    CntrlInitialize();

    /* simulate functionality */
    CntrlCaptureImage();
    CntrlCompressImage();
    CntrlSendImage();
    return 0;
}

Implementation 1: Microcontroller alone

• Low-end processor could be Intel 8051 microcontroller

• Total IC cost including NRE about $

• Well below 200 mW power

• Time-to-market about 3 months

• However, one image per second not possible

  • 12 MHz, 12 cycles per instruction
    • Executes one million instructions per second
  • CcdppCapture has nested loops resulting in 4096 (64 x 64) iterations
    • ~100 assembly instructions each iteration
    • 409,600 (4096 x 100) instructions per image
    • Roughly 40% of the one-million-instruction budget for reading image alone
  • Would be over budget after adding compute-intensive DCT and Huffman encoding
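The budget estimate above is plain arithmetic and can be reproduced directly. The sketch below uses only the figures stated on the slide (12 MHz clock, 12 cycles per instruction, 64 x 64 iterations, about 100 instructions per iteration); the function names are hypothetical.

```c
/* Instructions the processor can execute in one second. */
long instructions_per_second(long clock_hz, long cycles_per_instr) {
    return clock_hz / cycles_per_instr;
}

/* Instructions spent in the nested capture loops for one image. */
long capture_instructions(long rows, long cols, long instr_per_iter) {
    return rows * cols * instr_per_iter;
}

/* Share of the per-second budget consumed, in whole percent
   (fraction truncated). */
long budget_percent(long used, long budget) {
    return used * 100 / budget;
}
```

With the slide's numbers, instructions_per_second(12000000, 12) gives 1,000,000 and capture_instructions(64, 64, 100) gives 409,600, so reading the image alone consumes about 40% of the one-second budget before any DCT or Huffman encoding is done.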

Implementation 2: Microcontroller and CCDPP

  • CCDPP function implemented on custom single-purpose processor
    • Improves performance – fewer microcontroller cycles
    • Increases NRE cost and time-to-market
    • Easy to implement
      • Simple datapath
      • Few states in controller
  • Simple UART easy to implement as single-purpose processor also
  • EEPROM for program memory and RAM for data memory added as well

[Figure: SOC block diagram with UART, CCDPP, EEPROM, and RAM blocks]

UART

  • UART in idle mode until invoked
    • UART invoked when 8051 executes store instruction with UART’s enable register as target address
  • Memory-mapped communication between 8051 and all single-purpose processors
  • Lower 8-bits of memory address for RAM
  • Upper 8-bits of memory address for memory-mapped I/O devices
  • Start state transmits 0 indicating start of byte transmission then transitions to Data state
  • Data state sends 8 bits serially then transitions to Stop state
  • Stop state transmits 1 indicating transmission done then transitions back to idle mode

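FSMD description of UART (figure):

  • Idle: I = 0; when invoked, go to Start
  • Start: transmit LOW, then go to Data
  • Data: transmit data(I), then I++; repeat while I < 8, go to Stop when I = 8
  • Stop: transmit HIGH, then return to Idle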
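The UART state machine described above can be sketched as a small software simulation that records the line level for each bit time. This is an illustrative model, not the hardware description from the slides. The names UartState and uart_frame are hypothetical, and treating data(0) as the least significant bit is an assumption.

```c
#include <stddef.h>

typedef enum { UART_IDLE, UART_START, UART_DATA, UART_STOP } UartState;

/* Simulate the transmission of one byte. Writes the line level for
   each bit time into line[] (room for 10 entries is needed) and
   returns the number of bit times used. */
size_t uart_frame(unsigned char data, int line[10]) {
    UartState state = UART_IDLE;
    size_t n = 0;
    int i = 0;
    int invoked = 1;            /* one pending request to send */

    for (;;) {
        switch (state) {
        case UART_IDLE:
            i = 0;
            if (!invoked)
                return n;       /* frame finished, back in idle */
            invoked = 0;
            state = UART_START;
            break;
        case UART_START:
            line[n++] = 0;      /* start bit: LOW */
            state = UART_DATA;
            break;
        case UART_DATA:
            /* Assumption: data(I) is bit I, least significant first. */
            line[n++] = (data >> i) & 1;
            i++;
            if (i == 8)
                state = UART_STOP;
            break;
        case UART_STOP:
            line[n++] = 1;      /* stop bit: HIGH */
            state = UART_IDLE;
            break;
        }
    }
}
```

Each byte therefore occupies 10 bit times on the line: one start bit, eight data bits, and one stop bit.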

CCDPP

  • Hardware implementation of zero-bias operations
  • Interacts with external CCD chip
    • CCD chip resides external to our SOC mainly because combining CCD with ordinary logic not feasible
  • Internal buffer, B , memory-mapped to 8051
  • Variables R , C are buffer’s row, column indices
  • GetRow state reads in one row from CCD to B
    • 66 bytes: 64 pixels + 2 blacked-out pixels
  • ComputeBias state computes bias for that row and stores in variable Bias
  • FixBias state iterates over same row subtracting Bias from each element
  • NextRow transitions to GetRow for repeat of process on next row or to Idle state when all 64 rows completed

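FSMD description of CCDPP (figure):

  • Idle: R and C are initialized; when invoked, go to GetRow
  • GetRow: B[R][C]=Pxl and C is advanced; repeat while C < 66, go to ComputeBias when C = 66
  • ComputeBias: Bias=(B[R][11] + B[R][10]) / 2 and C is reinitialized; go to FixBias
  • FixBias: B[R][C]=B[R][C]-Bias; repeat while C < 64, go to NextRow when C = 64
  • NextRow: R++ and C is reinitialized; go to GetRow while R < 64, return to Idle when R = 64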
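The zero-bias operation that the CCDPP performs in hardware can be sketched in software for a single row. The sketch assumes that the two blacked-out pixels are the last two of the 66 bytes in each row (columns 64 and 65); the figure text names different column indices, so this placement is an assumption. The function names are hypothetical.

```c
#define CCD_ROWS   64
#define CCD_PIXELS 64                  /* visible pixels per row */
#define CCD_COLS   (CCD_PIXELS + 2)    /* plus 2 blacked-out pixels */

/* ComputeBias + FixBias for one row: average the two blacked-out
   pixels, then subtract that bias from each visible pixel.
   Assumption: the blacked-out pixels are the last two columns. */
void zero_bias_row(signed char row[CCD_COLS]) {
    int bias = (row[CCD_COLS - 1] + row[CCD_COLS - 2]) / 2;
    int c;
    for (c = 0; c < CCD_PIXELS; c++)
        row[c] = (signed char)(row[c] - bias);
}

/* NextRow loop: repeat the correction for all 64 rows. */
void zero_bias_image(signed char b[CCD_ROWS][CCD_COLS]) {
    int r;
    for (r = 0; r < CCD_ROWS; r++)
        zero_bias_row(b[r]);
}
```

For a row whose visible pixels all read 50 and whose blacked-out pixels read 4 and 6, the bias is 5 and every visible pixel becomes 45; the blacked-out pixels themselves are left unchanged.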

Software

  • System-level model provides majority of code
    • Module hierarchy, procedure names, and main program unchanged
  • Code for UART and CCDPP modules must be redesigned
    • Simply replace with memory assignments
      • xdata used to load/store variables over external memory bus
      • _at_ specifies memory address to store these variables
      • Byte sent to U_TX_REG by processor will invoke UART
      • U_STAT_REG used by UART to indicate it's ready for next byte
        • UART may be much slower than processor
    • Similar modification for CCDPP code
  • All other modules untouched

static unsigned char xdata U_TX_REG _at_ 65535;
static unsigned char xdata U_STAT_REG _at_ 65534;

void UARTInitialize(void) {}

void UARTSend(unsigned char d) {
    while( U_STAT_REG == 1 ) {
        /* busy wait */
    }
    U_TX_REG = d;
}

Rewritten UART module

#include <stdio.h>

static FILE *outputFileHandle;

void UartInitialize(const char *outputFileName) {
    outputFileHandle = fopen(outputFileName, "w");
}

void UartSend(char d) {
    fprintf(outputFileHandle, "%i\n", (int)d);
}

Original code from system-level model
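The rewritten module relies on 8051 compiler extensions, so it cannot be compiled on a desktop machine. As a rough sketch of the same polling idea in standard C, the two registers can be modeled as volatile fields of a struct. The type UartRegs, the function uart_send_polled, and the poll limit are hypothetical additions for illustration; the slide code waits without a limit.

```c
#include <stdint.h>

/* Hypothetical stand-in for the UART's two memory-mapped registers.
   volatile tells the compiler that the values may change outside
   the program's control, so every read and write is performed. */
typedef struct {
    volatile uint8_t stat;   /* plays the role of U_STAT_REG: 1 = busy */
    volatile uint8_t tx;     /* plays the role of U_TX_REG */
} UartRegs;

/* Wait until the UART reports ready, then store the byte.
   Returns 1 if the byte was written, 0 if the UART stayed busy
   for max_polls consecutive reads. */
int uart_send_polled(UartRegs *regs, uint8_t d, unsigned max_polls) {
    unsigned polls = 0;
    while (regs->stat == 1) {
        if (++polls >= max_polls)
            return 0;        /* give up instead of waiting forever */
    }
    regs->tx = d;            /* in hardware, this store invokes the UART */
    return 1;
}
```

The bounded wait only exists so that the sketch terminates when run against a simulated register that never changes; on the real SOC the status register is cleared by the UART hardware.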

Analysis

  • Entire SOC tested on VHDL simulator
    • Interprets VHDL descriptions and functionally simulates execution of system
  • Recall program code translated to VHDL description of ROM
  • Tests for correct functionality
  • Measures clock cycles to process one image (performance)
  • Gate-level description obtained through synthesis
    • Synthesis tool like compiler for SPPs
    • Simulate gate-level models to obtain data for power analysis
  • Number of times gates switch from 1 to 0 or 0 to 1
  • Count number of gates for chip area

[Figure: Obtaining design metrics of interest. VHDL descriptions feed a VHDL simulator, which gives execution time, and a synthesis tool, which gives gates. Summing the gates gives chip area; a gate-level simulator and a power equation give power.]