Boosting Superscalar Performance with Store Sets: Memory Dependence Prediction | Papers Computer Architecture and Organization

12/2006

Implementation of Store Sets for Memory Dependence Prediction

Daniel Chen and George H. Huang

ECE511: Computer Architecture

University of Illinois, Urbana-Champaign, IL

Abstract

Superscalar processors introduce a number of problems when issuing instructions out of order. Among those problems are register and memory dependencies. A memory dependence occurs when a load reads from the same address as a previous store. The value read by the load depends on the store, which means the load must execute after the store. One way to increase the throughput of memory instructions is memory dependence prediction using “store sets.” A store set is the set of stores that a load instruction has previously depended on. A processor can discover and use store sets to predict when a load may be issued with reduced risk of memory violations. We implemented a store set system based on the paper by Chrysos and Emer to investigate the advantage of using store sets and confirmed that memory dependence prediction using store sets significantly reduces the number of memory violations.

1. Introduction

While the widely accepted solution to overcome register dependencies in out-of-order processors is register renaming, there is no such standard for overcoming memory dependencies. There are two basic approaches to issuing instructions with memory dependencies: no speculation and naïve speculation.

A no-speculation approach waits until all previous store instructions have issued before a load instruction is allowed to issue.

Naïve speculation issues memory instructions as they arrive, but when stores execute, they must check whether any loads that depend on them have already executed. If so, a memory violation has occurred, and execution must restart from the misspeculated load.
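The check that a store performs under naïve speculation can be sketched as follows. This is a minimal illustration, not the simulator used in this project: the `MemOp` record, its fields, and `find_violations` are hypothetical names, and the model ignores details such as access size and intervening stores to the same address.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemOp:
    seq: int    # position in program order (hypothetical field)
    kind: str   # "load" or "store"
    addr: int   # memory address accessed


def find_violations(issue_order):
    """Return (store_seq, load_seq) pairs for loads that executed too early.

    issue_order lists MemOps in the order they execute. When a store
    executes, it scans the loads that have already executed; any load that
    is younger in program order and touches the same address has read
    stale data, which is a memory violation.
    """
    executed_loads = []
    violations = []
    for op in issue_order:
        if op.kind == "load":
            executed_loads.append(op)
        else:
            for load in executed_loads:
                if load.seq > op.seq and load.addr == op.addr:
                    violations.append((op.seq, load.seq))
    return violations


store = MemOp(seq=0, kind="store", addr=0x100)
load = MemOp(seq=1, kind="load", addr=0x100)

print(find_violations([store, load]))  # [] : in-order execution is safe
print(find_violations([load, store]))  # [(0, 1)] : load ran ahead of the store
```

In the second call the load executes before the older store to the same address, so the store detects the violation and execution would have to restart from that load.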

The problem with no speculation is that, while it eliminates memory violations, it results in poor performance. Naïve speculation, on the other hand, provides better performance but wastes time recovering from memory violations.

The goal of store sets is to use the history of previous memory violations to predict memory dependencies, issuing loads aggressively and delaying loads that have dependencies only long enough to avoid memory violations.

2. Store Sets

A store set is the set of all the stores that a particular load has ever depended on. This set of stores is not known by the processor beforehand, but must be discovered.

Every time a memory violation occurs, the store that caused the violation is added to the dependent load’s store set. The next time the load is fetched, the scheduler makes sure that all of the stores in the load’s store set have issued before it allows the load to issue.
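The learning rule above can be sketched with an unbounded table that maps each load PC to the set of store PCs it has conflicted with. This is only an idealized illustration under stated assumptions: the class name `IdealStoreSets`, its methods, and the way unissued stores are passed in are hypothetical and are not taken from the implementation described in this paper.

```python
class IdealStoreSets:
    """Unbounded store sets: one set of store PCs per load PC."""

    def __init__(self):
        self.sets = {}  # load PC -> set of store PCs

    def record_violation(self, load_pc, store_pc):
        """Add the violating store to the dependent load's store set."""
        self.sets.setdefault(load_pc, set()).add(store_pc)

    def may_issue(self, load_pc, unissued_store_pcs):
        """A load may issue once no store in its store set is still waiting."""
        return self.sets.get(load_pc, set()).isdisjoint(unissued_store_pcs)


predictor = IdealStoreSets()

# First encounter: no history, so the load is free to issue early.
print(predictor.may_issue(16, {0, 12}))  # True

# The store at PC 12 later signals a violation against the load at PC 16.
predictor.record_violation(load_pc=16, store_pc=12)

print(predictor.may_issue(16, {0, 12}))  # False: store 12 has not issued
print(predictor.may_issue(16, {0}))      # True: only an unrelated store waits
```

Because a store enters the set only after it causes a violation, the load is delayed only behind stores it has actually conflicted with.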

The following short program shows each load’s store set:

Example 2.1:

     0  Store A
     4  Store B
     8  Store C
    12  Store A
    16  Load A, store set [0, 12]
    20  Load B, store set [4]
    24  Load C, store set [8]
    28  Load D, store set [empty]

It is important to note that the example shows complete store sets. In an actual implementation, a particular store is added to a store set only when it causes a memory violation.
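The complete store sets listed in Example 2.1 can be reproduced with a short script. It is a sketch that assumes, as the listing suggests, that a load's complete store set contains every earlier store to the same address; the tuple encoding of the program and the function name `complete_store_sets` are hypothetical.

```python
# Example 2.1 encoded as (PC, operation, address) tuples.
program = [
    (0, "store", "A"),
    (4, "store", "B"),
    (8, "store", "C"),
    (12, "store", "A"),
    (16, "load", "A"),
    (20, "load", "B"),
    (24, "load", "C"),
    (28, "load", "D"),
]


def complete_store_sets(program):
    """Map each load PC to the PCs of all earlier stores to its address."""
    stores_by_addr = {}  # address -> PCs of stores seen so far
    store_sets = {}      # load PC -> list of store PCs
    for pc, kind, addr in program:
        if kind == "store":
            stores_by_addr.setdefault(addr, []).append(pc)
        else:
            store_sets[pc] = list(stores_by_addr.get(addr, []))
    return store_sets


for pc, stores in complete_store_sets(program).items():
    print(pc, stores)
# 16 [0, 12]
# 20 [4]
# 24 [8]
# 28 []
```

A predictor that learns from violations would typically end up with a subset of these sets, since it only records the stores that actually caused a violation.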

3. Implementation

The implementation of store sets we chose is modeled after the paper by Chrysos and Emer [1].

Conceptually, a store set structure would be infinitely large and hold a store set for each load executed; however, building such a structure would not be practical.

The idea is simplified by using a Store Set ID Table (SSIT) and a Last Fetched Store Table (LFST). The SSIT is addressed by a hash of the current PC for load or store instructions. The

