Enhancements to Von Neumann Computer: Timing, Memory, Prediction, and Pipelining, Slides of Digital Logic Design and Programming

Birla Institute of Technology and Science Digital Logic Design and Programming

These slides cover techniques for improving the performance of the von Neumann stored-program computer: functional timing, memory architecture, algorithmic branch prediction, and pipelining. Functional timing adds a delay (phase) to a register's clock so that multiple register transfers fit in one system clock cycle. Memory architecture focuses on the use of cache memory to reduce memory access time. Algorithmic branch prediction lets the memory controller load a portion of cache with code that may be executed if a branch is taken. Pipelining is a method of executing instructions in parallel, reducing overall execution time.

Typology: Slides

2012/2013

Uploaded on 03/18/2013

luucky 🇮🇳


Sequential Logic Design

Lecture #32

• Agenda

1. Improvements to the von Neumann Stored Program Computer

• Announcements

1. IDEA evaluations


von Neumann Computer

• Bottleneck

- we have seen that the von Neumann computer is serial in its execution of instructions
- this is good for simplicity, but can limit performance
- there are many techniques to improve the performance of this computer:

1) Functional Timing

2) Memory Architecture

3) Algorithmic Branch Prediction

4) Pipelines

Functional Timing
- a delay (or phase) can be added to the clock that the B-register sees. This creates a single-shot structure which executes in 1 cycle

[Figure: two D-Q registers, A and B, with clocks CLK and CLKB; timing diagram of CLKA, LOAD (from controller), AQ and BQ showing the values A(0) and A(1) and the delays tphase and tCQ]

Functional Timing
- this allows multiple register transfers in one clock cycle

ex) Clock 1 : MAR <= PC

    Clock 2 : IR <= MAR
              PC <= PC + 1

- it still takes two clock edges, but due to phasing, these edges occur within one system clock cycle
- note that control signals going to the phased block need to be phased also:
  - IR_Load
  - PC_Inc
- the phase timing needs to wait long enough for combinational logic signals to propagate and for register delays (Setup/Hold/tCQ)
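The phase-timing requirement can be illustrated with a small calculation. This is only a sketch under stated assumptions: the function names and the formula (clock-to-Q delay plus combinational delay plus setup time) are a common textbook simplification rather than something given on the slides, hold time is ignored, and both transfers are assumed to see the same combinational delay.

```python
def min_phase_delay(t_cq, t_comb, t_setup):
    """Smallest phase delay (same time unit as the inputs) that lets data
    launched by the first clock edge be captured by the phased edge.
    Assumption: hold time is not modeled."""
    return t_cq + t_comb + t_setup


def phase_fits_in_cycle(t_phase, t_cq, t_comb, t_setup, t_clk):
    """True if the phased edge is late enough for the first transfer and
    early enough for the second transfer to settle before the next cycle.
    Assumption: both transfers see the same combinational delay."""
    first_ok = t_phase >= min_phase_delay(t_cq, t_comb, t_setup)
    second_ok = t_phase + t_cq + t_comb + t_setup <= t_clk
    return first_ok and second_ok
```

With illustrative values of tCQ = 2, combinational delay = 5, and setup = 1 (arbitrary units), the phase must be at least 8, and a phase of 10 fits within a system clock period of 20.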

Algorithmic Branch Prediction
- algorithms can be developed to “predict” a potential branch
- this would allow the memory controller to load up a portion of Cache with the code that could potentially be executed if the branch was taken

ex) Loop:
        DECX
        BEQ Loop
        BRA Compute

- the code for “Loop” is loaded into Cache because it will probably be executed more than once
- the code residing at “Compute” will be executed once the loop ends, so it is loaded into an unused portion of Cache in parallel with the execution of “Loop”
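The idea of loading likely-needed code into Cache ahead of time can be sketched as a toy model. Everything here is an assumption made for illustration: the `PROGRAM` dictionary, the placeholder contents of "Compute", and the function names are hypothetical, and the sketch simply prefetches every possible branch target rather than implementing a real prediction algorithm.

```python
# Hypothetical program memory: label -> block of instructions.
# "Loop" follows the example above; the "Compute" body is a placeholder.
PROGRAM = {
    "Loop": ["DECX", "BEQ Loop", "BRA Compute"],
    "Compute": ["<compute code>"],
}


def branch_targets(block):
    """Return the labels that branch instructions in a block may jump to."""
    targets = []
    for instr in block:
        parts = instr.split()
        if len(parts) == 2 and parts[0] in ("BEQ", "BRA"):
            targets.append(parts[1])
    return targets


def prefetch(cache, memory, current_label):
    """Load the current block and every block it may branch to into cache."""
    cache[current_label] = memory[current_label]
    for label in branch_targets(memory[current_label]):
        if label not in cache and label in memory:
            cache[label] = memory[label]
    return cache
```

Calling `prefetch({}, PROGRAM, "Loop")` leaves both "Loop" and "Compute" in the cache, so the code at "Compute" is already available when the loop ends.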

Multi-Level Cache
- Multiple levels of cache can be used to try to reduce memory access time even further
  - L1 (Level 1) cache – smallest and fastest
  - L2 (Level 2) cache – a little larger and a little slower than L1
    - but still faster than DRAM
- the CPU will always try to execute out of the fastest RAM
- the CPU will:

1) check whether the code to be executed is in L1 cache

2) if not, check whether the code is in L2 cache

3) if not, the memory controller will load L1 with the code from DRAM

NOTE: cache will have both Instruction and Data storage
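The lookup order above can be sketched as a short function. This is a minimal model, not the controller's actual design: the dictionaries standing in for L1, L2, and DRAM and the name `fetch` are hypothetical, cache capacity and eviction are not modeled, and copying an entry into L1 on an L2 hit is an assumption the slides do not state.

```python
def fetch(address, l1, l2, dram):
    """Return (value, source) using the lookup order L1 -> L2 -> DRAM."""
    if address in l1:
        return l1[address], "L1"
    if address in l2:
        # Assumption: an L2 hit is also copied into L1 for later accesses.
        l1[address] = l2[address]
        return l1[address], "L2"
    # Miss in both caches: load L1 with the contents from DRAM.
    l1[address] = dram[address]
    return l1[address], "DRAM"
```

A first access to an address held only in DRAM reports "DRAM" as the source; repeating the same access reports "L1", which is the faster path the CPU tries to use.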
