ECE 4100/6100 Assignment 1: Optimizing IPC with SimpleScalar - Prof. Sudhakar Yalamanchili, Assignments of Computer Architecture and Organization

An assignment for ECE 4100/6100 students in the Fall 2006 semester. The assignment uses SimpleScalar, a microprocessor model, to evaluate the performance of two benchmarks and to optimize instructions per cycle (IPC) by varying machine parameters such as the number of functional units and the branch predictor configuration. Students are required to submit their reports, code segments, and machine configurations via email by September 29, 2006.

Uploaded on 08/05/2009

ECE 4100/6100 Fall 2006
ECE 4100/6100: Assignment 1
This assignment is to be performed individually. You may discuss methodology, tool specifics,
expected behaviors, relationships between machine parameters and application parameters, etc.,
in general terms, but the work you submit must be your own. Address any questions about the
goals of the assignment to the TA and Prof. Yalamanchili.
Due date: September 29th, 6pm via email.
Submission Instructions: Mail your report, code segments, and machine configuration (if
separate from your report) to the TA ([email protected]). When submitting, please use the
following format as the TA has a filter set up to collect all homeworks into one place:
[ECEX100] Homework 1 – YOUR NAME
where 'X' is a 6 or 4 and 'YOUR NAME' is your full official name registered with Georgia Tech
(so no nicknames).
Part I: Warmup
In this project you are to use SimpleScalar, a high-level microprocessor model, to evaluate the
performance of two benchmarks. The simulator should be installed and configured for the
execution of pre-compiled binaries gcc (the GNU compiler collection) and gzip (the GNU
archiving tool). Instructions for downloading and configuring the installation to execute the
binaries can be found at http://www.ece.gatech.edu/~gth723m/ECE6100/Lab1/Lab1.html.
Additional documentation and information for SimpleScalar can be found at
http://www.simplescalar.com/.
If you just run
sim-outorder
you will get a list of all the options that can be used to configure the model, and also the default
values used (this information is also available on the SimpleScalar website). Your task is to vary
machine parameters such as number of functional units and the type and configuration of the
branch predictors so as to optimize the IPC for each of the two benchmarks.
You can run your experiments on the following machines, either in person on the 3rd floor of
the CoC or via ssh:
ccblincad##.ece.gatech.edu (where ## is from 01 to 20)
You cannot use the Solaris machines because the PISA compiler used in Part II has been
compiled for Linux.
You should provide the following.
1. Using the base machine configuration provided to you and changing only the reorder buffer
(ROB) size[1], maximize IPC.
2. Once you have fixed the ROB size, explore options with the branch predictors to further
improve IPC as much as you can (you may not use a perfect or perceptron predictor).
Compare the relative impact of the branch predictors on the IPC and give concise, plausible
explanations for the behavior.
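
When comparing many runs, it can help to pull the IPC out of each captured statistics file automatically. The following is a minimal sketch, assuming the simulator output was redirected to a text file and that the IPC is reported on a line of the form `sim_IPC <value> # instructions per cycle`. The function names `parse_ipc_line` and `read_ipc` are hypothetical helpers, not part of SimpleScalar.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Extract the IPC value from one line of simulator statistics output.
 * Assumption: the line looks like
 *   sim_IPC                      1.2345 # instructions per cycle
 * Returns 1 and stores the value in *ipc on success, 0 otherwise.
 */
int parse_ipc_line(const char *line, double *ipc)
{
    char name[32];
    double value;

    assert(line != NULL && ipc != NULL);

    /* Read the statistic name and the number that follows it. */
    if (sscanf(line, "%31s %lf", name, &value) != 2)
        return 0;
    if (strcmp(name, "sim_IPC") != 0)
        return 0;

    *ipc = value;
    return 1;
}

/*
 * Scan a captured statistics file line by line.
 * Returns the last sim_IPC value found, or -1.0 if none was found.
 */
double read_ipc(FILE *stats)
{
    char line[512];
    double ipc = -1.0;
    double value;

    while (fgets(line, sizeof line, stats) != NULL) {
        if (parse_ipc_line(line, &value))
            ipc = value;
    }
    return ipc;
}
```

Calling `read_ipc` on one file per ROB size or per branch predictor setting yields the numbers needed for the comparison in items 1 and 2.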

Part II: Modeling and Design

Now you will explore the space of architecture options and program structure in the presence of
an implementation constraint – silicon area. Write two C-based kernels: one for matrix
multiplication on 64x64 matrices and one for sorting of 1024 numbers (you can pick any sorting
algorithm you like). Your target is a chip with 200 mm^2 of silicon area. The table below provides
a model for computing the area required of your design. Using SimpleScalar, produce a design
that maximizes the value of IPC/mm^2 when averaged across both kernels.
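
As a starting point, the two kernels might look like the minimal sketch below. Several choices here are assumptions rather than requirements of the assignment: the matrix elements are `double`, the sorted values are `int`, the sort is an insertion sort, and the names `matmul` and `insertion_sort` are illustrative. The code declares variables at the top of each function on the assumption that the cross-compiler may only accept older C.

```c
#include <assert.h>

#define N 64        /* matrix dimension given in the assignment */
#define COUNT 1024  /* number of values to sort */

/* c = a * b for N x N matrices, using the plain triple loop. */
void matmul(double a[N][N], double b[N][N], double c[N][N])
{
    int i, j, k;
    double sum;

    for (i = 0; i < N; i++) {
        for (j = 0; j < N; j++) {
            sum = 0.0;
            for (k = 0; k < N; k++)
                sum += a[i][k] * b[k][j];
            c[i][j] = sum;
        }
    }
}

/* Sort v[0..n-1] into ascending order in place (insertion sort). */
void insertion_sort(int v[], int n)
{
    int i, j, key;

    assert(n >= 0);

    for (i = 1; i < n; i++) {
        key = v[i];
        j = i - 1;
        /* Shift larger elements one slot right to make room for key. */
        while (j >= 0 && v[j] > key) {
            v[j + 1] = v[j];
            j--;
        }
        v[j + 1] = key;
    }
}
```

Each kernel still needs a `main` that initializes its data and calls the function before it can be compiled and simulated. The element types matter for the design: a floating-point matrix kernel exercises the FP units, while an integer sort is dominated by integer, memory, and branch behavior.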

Table 1: Definition of Chip-Area Model Parameters

Parameter Name   Corresponding Machine Configuration File Parameter(s)
N_RS             Number of reservation stations/execution unit[1]
N_RB             Number of reorder buffer entries[1]
L (# lanes)      Superscalar factor
U_I              Number of integer units
U_FP             Number of floating point units
U_M              Number of memory units
EBR              Number of words in branch predictor (word = 16 bits)
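
To make the model concrete, the sketch below evaluates the chip-area cost functions of Table 2 for one set of Table 1 parameters and computes the IPC/mm^2 objective. It assumes that the total area is the sum of the component costs and that the objective is the mean IPC of the two kernels divided by that area. The struct, field, and function names are illustrative.

```c
#include <assert.h>
#include <stddef.h>

/* Machine parameters from Table 1 (field names are illustrative). */
struct design {
    int n_rs;   /* N_RS: reservation stations per execution unit */
    int n_rb;   /* N_RB: reorder buffer entries */
    int lanes;  /* L: superscalar factor */
    int u_i;    /* U_I: integer units */
    int u_fp;   /* U_FP: floating point units */
    int u_m;    /* U_M: memory units */
    int ebr;    /* EBR: 16-bit words in the branch predictor */
};

/* Total area in mm^2, assumed to be the sum of the Table 2 costs. */
double chip_area(const struct design *d)
{
    int units;
    double area = 0.0;

    assert(d != NULL);
    units = d->u_i + d->u_fp + d->u_m;

    area += 0.04 * d->n_rb;   /* reorder buffer */
    area += 2.0 * d->u_i;     /* integer units */
    area += 2.5 * d->u_fp;    /* FP arithmetic units */
    area += 3.0 * d->u_m;     /* memory units */
    area += 0.04 * d->ebr;    /* branch predictor */
    /* reservation stations */
    area += (0.04 * d->lanes + 0.035 + 0.065 * units) * d->n_rs;
    return area;
}

/* Objective: mean IPC of the two kernels per mm^2 of chip area. */
double ipc_per_mm2(const struct design *d, double ipc_matmul, double ipc_sort)
{
    return 0.5 * (ipc_matmul + ipc_sort) / chip_area(d);
}
```

For a hypothetical configuration with N_RS = 4, N_RB = 16, L = 4, U_I = 2, U_FP = 1, U_M = 1, and EBR = 1024, the components sum to 52.92 mm^2, of which 40.96 mm^2 is the branch predictor. Comparing `chip_area` against the 200 mm^2 target shows whether a candidate design fits.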

Table 2: Chip-Area Cost Functions for CPU Components

CPU Component          Chip-Area Cost Function (mm^2)
Reorder Buffer         0.04 * N_RB
Integer Units          2.0 * U_I
FP Arithmetic Units    2.5 * U_FP
Memory Units           3.0 * U_M
Branch Predictor       0.04 * EBR
Reservation Station    (0.04 * L + 0.035 + 0.065 * (U_I + U_FP + U_M)) * N_RS

Do not provide data from every simulation you performed to obtain an optimized design. Your report should include a concise description of each of the following elements.

  1. Outline your strategy for exploring this design space and describe the different simulations that you ran in a manner that makes clear the path you explored towards your final solution.
  2. List the design tradeoffs you made, e.g., additional functional units rather than a larger reorder buffer, and the reasons for them.
  3. Provide details of the final design, i.e., machine configuration, and submit the final code segments. Note that you may modify the sources, e.g., unroll the loops.
  4. Provide the final breakdown of the chip area components.

Your final submission for both parts is limited to 8 pages, using 11-point Times Roman font and one-inch margins on all sides. The format of the report is as follows:

  1. Title and Author
  Part I: