Architecture Of Parallel Computers - Problem Set 1 | ECE 506, Assignments of Electrical and Electronics Engineering

North Carolina State University (NCSU), Electrical and Electronics Engineering

Prof. Gehringer

Material Type: Assignment; Professor: Gehringer; Class: Architecture Of Parallel Computers; Subject: Electrical and Computer Engineering; University: North Carolina State University; Term: Unknown 1989;

Uploaded on 03/09/2009

CSC/ECE 506: Architecture of Parallel Computers

Problem Set 1

Due Friday, June 7, 2002

Problems 2, 3, and 5 will be graded. There are 60 points on these problems. Note: You must do all the problems, even the non-graded ones. If you do not do some of them, half as many points as they are worth will be subtracted from your score on the graded problems.

Problem 1. (15 points) Page 12 of the lecture notes for Lecture 3 (titled Communication architectures, cont.) presents an assembly-language version of the inner loop of a vector-processor matrix-multiplication program whose pseudocode is given on page 10 of the same lecture. Write the assembly code for the entire program (i.e., the program on page 10), using the code already given on page 12. Please make sure that your program is well commented; however, you do not have to comment portions that you use without change from page 12.

Problem 2. (15 points) Suppose a program that was being run on one processor is now run on a 100-processor machine. If a speedup of 80 is desired (on the 100-processor machine, as compared to the single processor), what fraction of the program can be serial? Use Amdahl’s law.
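One way to check an answer to Problem 2 numerically is to invert Amdahl's law. The sketch below assumes the usual form of the law, speedup = 1 / (s + (1 - s)/N), where s is the serial fraction and N is the processor count. The function names are illustrative and do not come from the course notes.

```python
def amdahl_speedup(serial_fraction, processors):
    """Speedup predicted by Amdahl's law for a given serial fraction."""
    parallel_fraction = 1.0 - serial_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


def max_serial_fraction(target_speedup, processors):
    """Largest serial fraction that still reaches target_speedup.

    Rearranging 1/S = s + (1 - s)/N gives s = (N/S - 1) / (N - 1).
    """
    return (processors / target_speedup - 1.0) / (processors - 1.0)


if __name__ == "__main__":
    s = max_serial_fraction(target_speedup=80, processors=100)
    print(f"serial fraction: {s:.6f}")
    print(f"speedup check:   {amdahl_speedup(s, 100):.2f}")
```

Feeding the result back into `amdahl_speedup` is a quick sanity check that the algebraic rearrangement was done correctly.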

Problem 3. (25 points) Main memory for a processor consists of 1,024K words (2^20 addresses). We would like to design a cache that contains 16K words (2^14 addresses) that minimizes the average access time (AAT) for a given program. There are 16 words per cache line.
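The sizes given above fix how a word address splits into tag, index, and offset fields. The following is a minimal sketch of that arithmetic, assuming word-addressed memory and power-of-two sizes. The helper name `cache_geometry` and its `ways` parameter (1 for direct-mapped) are illustrative assumptions, not part of the assignment.

```python
def log2_exact(n):
    """Return log2(n) for a positive power of two, else raise ValueError."""
    if n <= 0 or n & (n - 1):
        raise ValueError(f"{n} is not a positive power of two")
    return n.bit_length() - 1


def cache_geometry(memory_words, cache_words, words_per_line, ways=1):
    """Compute line/set counts and address field widths for a cache."""
    address_bits = log2_exact(memory_words)
    offset_bits = log2_exact(words_per_line)
    num_lines = cache_words // words_per_line
    num_sets = num_lines // ways
    index_bits = log2_exact(num_sets)
    tag_bits = address_bits - index_bits - offset_bits
    return {
        "lines": num_lines,
        "sets": num_sets,
        "offset_bits": offset_bits,
        "index_bits": index_bits,
        "tag_bits": tag_bits,
    }


if __name__ == "__main__":
    # Sizes from the problem statement: 2^20-word memory, 2^14-word cache,
    # 16 words per line.
    print(cache_geometry(2**20, 2**14, 16))
    print(cache_geometry(2**20, 2**14, 16, ways=2))
```

Raising `ways` halves the number of sets each time it doubles, which moves one bit from the index field into the tag field.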

(a) How many cache lines are there?

(b) If a direct-mapped cache is used, how many bits make up the tag field?

(c) Consider a program that accesses the following 16 hexadecimal memory words in sequence:

5A000  5A010  5A020  5A030
4A000  4A010  4A020  4A030
5A000  5A010  5A020  5A030
4A000  4A010  4A020  4A030

Fill in the following table with the tag values and the first word on the line for each of the first four cache lines, after the program makes the above 16 memory accesses.

Line | Tag (binary) | First Word (hexadecimal)

(d) How many cache misses occur in this sequence of accesses, assuming the cache is initially empty?

(e) How many of the above cache misses are cold (compulsory) misses, how many are conflict misses, and how many are capacity misses?

(f) If the hit time is 10 ns and the miss penalty is 200 ns, what is the average access time for this direct-mapped cache?

(g) When memory references have the same low-order bits, they are constantly in contention for the same four cache lines. This makes very poor use of the 1024-line cache. What would be a more efficient cache organization for this program? Why is this scheme better than direct-mapped? Please give the number of sets and the number of tag bits for this scheme.

(h) How many cache misses would occur with your new cache scheme, assuming the cache was initially empty?

(i) How many of the above cache misses are cold (compulsory) misses, how many are conflict misses, and how many are capacity misses?

(j) What is the average access time for your new cache organization?
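Answers to the miss-count and access-time parts of Problem 3 can be cross-checked with a small trace-driven simulation. The sketch below makes several assumptions that are not stated in the assignment: LRU replacement, a two-way set-associative cache as one possible alternative organization, and AAT computed as hit time + miss rate × miss penalty. It separates cold misses from all other misses but does not distinguish conflict from capacity misses. All names are illustrative.

```python
from collections import OrderedDict

WORDS_PER_LINE = 16
CACHE_LINES = 1024  # 16K words / 16 words per line


def simulate(trace, ways):
    """Return (misses, cold_misses) for an LRU set-associative cache.

    ways=1 models a direct-mapped cache.
    """
    num_sets = CACHE_LINES // ways
    sets = [OrderedDict() for _ in range(num_sets)]
    seen_blocks = set()  # blocks referenced at least once so far
    misses = cold = 0
    for addr in trace:
        block = addr // WORDS_PER_LINE
        index = block % num_sets
        tag = block // num_sets
        lines = sets[index]
        if tag in lines:
            lines.move_to_end(tag)  # mark as most recently used
        else:
            misses += 1
            if block not in seen_blocks:
                cold += 1  # first-ever reference to this block
            if len(lines) >= ways:
                lines.popitem(last=False)  # evict least recently used
            lines[tag] = True
        seen_blocks.add(block)
    return misses, cold


def average_access_time(hit_ns, penalty_ns, misses, accesses):
    """AAT under the assumed model: hit time + miss rate * miss penalty."""
    return hit_ns + (misses / accesses) * penalty_ns


GROUP = [0x5A000, 0x5A010, 0x5A020, 0x5A030,
         0x4A000, 0x4A010, 0x4A020, 0x4A030]
TRACE = GROUP * 2  # the 16 accesses listed in part (c)

if __name__ == "__main__":
    for ways in (1, 2):
        misses, cold = simulate(TRACE, ways)
        aat = average_access_time(10, 200, misses, len(TRACE))
        print(f"ways={ways}: misses={misses}, cold={cold}, AAT={aat} ns")
```

If the course defines AAT differently (for example, charging the hit time only on hits), `average_access_time` is the only function that needs to change.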
