CA226: Advanced Computer Architecture - Module Code: ca226, Study notes of Advanced Computer Architecture

University of Babylon (UB), Advanced Computer Architecture

Information about the Advanced Computer Architecture module (ca226) offered by the School of Computing at DCU. It includes details on lab schedules, exams, and course content. Topics covered include interrupts, processor speed, 64-bit systems, CISC versus RISC, multi-core systems, and computer performance.

Typology: Study notes

2021/2022

Uploaded on 09/07/2022

adnan_95 🇮🇶


CA226 — Advanced Computer Architecture

Stephen Blott <[email protected]>


Preliminaries

Contacting me:

1. before or after lectures, or during labs

2. in my office: L1.11

3. by email at [email protected]

please put the module code (ca226) in the subject line


More Preliminaries

Course web site:

• http://ca226.computing.dcu.ie/

use your School of Computing credentials

There’s a link to this site on Moodle [http://moodle.dcu.ie/].


Still More Preliminaries

Labs:

• begin week five

Lab exams:

• weeks eight and twelve

in the regular lab slot (Fridays at 14:00)


Starters for 10.

List the powers of 2?
What is 2^{32}?
What is 2^{64}?
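
As a quick check on the starter questions above, here is a minimal Python sketch. The variable names are illustrative only.

```python
# The first few powers of 2 (2^0 up to 2^10), then the two values asked about.
powers = [2 ** n for n in range(11)]
two_32 = 2 ** 32
two_64 = 2 ** 64

print(powers)
print(two_32)  # 4294967296
print(two_64)  # 18446744073709551616
```

Python integers are arbitrary precision, so 2^{64} is computed exactly rather than overflowing.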


Starters for 10..

What is a register?
What is a bus?
What does USB stand for?
What is a frame buffer?
What is an interrupt?


Starters for 10…

What’s special about this IP address: 127.0.0.1?
What’s special about this IP address: 192.168.3.3?
What’s special about this IP address: 192.168.3.255?
Could every person on earth be allocated a unique IP address?
Old versions of the Linux ext2 file system had a 2GB limit on file sizes. Why?
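
One way to explore these questions is with Python's standard ipaddress module. This is a sketch under stated assumptions: the /24 network is an assumption (the questions do not give a netmask), and the signed 32-bit figure is offered only as one common explanation for a 2GB limit, not as a statement about the ext2 source code.

```python
import ipaddress

loopback = ipaddress.ip_address("127.0.0.1")
private = ipaddress.ip_address("192.168.3.3")
# Assumption: the host sits on a /24 network (netmask 255.255.255.0).
network = ipaddress.ip_network("192.168.3.0/24")

is_loopback = loopback.is_loopback    # the local machine itself
is_private = private.is_private       # reserved for private networks
is_broadcast = ipaddress.ip_address("192.168.3.255") == network.broadcast_address

# IPv4 addresses are 32 bits wide, so there are 2^32 of them in total.
ipv4_addresses = 2 ** 32

# Largest value of a signed 32-bit integer: one byte short of 2GB.
max_signed_32 = 2 ** 31 - 1

print(is_loopback, is_private, is_broadcast)
print(ipv4_addresses)
print(max_signed_32)
```

Since 2^{32} is roughly 4.3 billion, IPv4 alone cannot give every person on earth a unique address.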


Observations on Processor Speed


CISC versus RISC

Memory constraints influenced early processor designs:

with small memories, high code density [http://en.wikipedia.org/wiki/Instruction_set#Code_density] was necessary
this led to the development of processors with complex instruction sets:
- a single instruction might implement a high-level programming-language operation
- complex addressing modes
- e.g. b = a[i] + 1


CISC versus RISC

As memory costs reduced:

memory size constraints lessened
code did not need to be so dense
reduced instruction sets became viable
- a single high-level programming-language operation might be implemented by several instructions

Almost all modern processors implement reduced instruction sets.


A simple computer…

Note

Source [http://www-cs-faculty.stanford.edu/~eroberts/courses/soco/projects/risc/risccisc/].


Example — The Problem

The problem:

a = a * b;
so: multiply memory locations 5:2 and 2:3 (say)


Example — CISC Approach

CISC approach:

MULT 5:2 2:

a single, complex instruction
load both memory locations into registers
multiply
store the result back in the appropriate memory location say 5:

Just one instruction encodes a commonly-occurring programming operation which, at the hardware level, involves several steps.


Example — RISC Approach

RISC approach:

LOAD A, 2:
LOAD B, 5:
MULT A, B
STORE 2:3, A

Four steps are required:

so the program memory required is (well, may be) four times larger
so this approach was only possible when cheaper/larger memory systems became more widespread
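
The four RISC steps can be traced with a toy load/store machine. This is a minimal sketch, not a model of any real processor: the `run` function is hypothetical, the addresses 2:3 and 5:2 come from the problem statement, and both the pairing of addresses with registers A and B and the initial values 6 and 7 are assumptions.

```python
def run(program, memory):
    """Execute a list of (opcode, operand, operand) tuples on a toy machine."""
    registers = {}
    for op, first, second in program:
        if op == "LOAD":        # LOAD reg, addr: copy memory into a register
            registers[first] = memory[second]
        elif op == "MULT":      # MULT dst, src: operates on registers only
            registers[first] = registers[first] * registers[second]
        elif op == "STORE":     # STORE addr, reg: copy a register into memory
            memory[first] = registers[second]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


memory = {"2:3": 6, "5:2": 7}   # assumed initial contents
program = [
    ("LOAD", "A", "2:3"),
    ("LOAD", "B", "5:2"),
    ("MULT", "A", "B"),
    ("STORE", "2:3", "A"),
]
result = run(program, memory)
print(result["2:3"])  # 42
```

Only LOAD and STORE touch memory here; MULT works purely on register contents, which is the defining restriction of the RISC style.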


RISC

RISC:

reduced instruction set computing
computations are performed only on register contents
the only memory operations are LOAD and STORE
few, uniformly-sized instructions


RISC Advantages

Both approaches are likely to require roughly the same number of computational steps.

RISC advantages:

moves complexity from hardware to software (compilers)
smarter compilers make better use of registers
fewer transistors:
- so smaller, can be clocked faster, reduced power consumption, less heat
pipelining (and super-scalar processing)


Answer?

It depends.


Answer?

Usually:

we’re interested in how long it takes to get some work done

So:

wall-clock time might be a good measure


However …

It depends how/why we’re measuring.

Wall-clock time includes:

user CPU time
system CPU time
interrupt handling time
I/O time (to/from terminal, disk, network)


CPU Architectures

If we’re interested in comparing processors:

we may be more interested in the number of clock cycles necessary to complete some task


Clock Rate

Clock rate:

the number of clock cycles per unit time (usually, per second)
say, 2GHz


CPU Clock Cycles

CPU clock cycles:

the number of clock cycles necessary to complete some job


Alternatively

But that approach:

is too dependent on a single job


Alternatively

Better:

derive a metric which is (somewhat) independent of any particular job
let IC be the instruction count: the number of instructions needed to complete some job

Say:

IC is 2 \times 10^8


Then …

Then:

cycles per instruction (CPI): \text{CPI} = \frac{\text{CPU clock cycles}}{\text{IC}}

Example:

\text{CPI} = \frac{4 \times 10^8}{2 \times 10^8} = 2
so, two cycles per instruction
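
The same calculation as a short Python sketch. The function name `cycles_per_instruction` is illustrative, and the inputs are the figures used in the example above.

```python
def cycles_per_instruction(cpu_clock_cycles, instruction_count):
    """CPI = CPU clock cycles / IC."""
    return cpu_clock_cycles / instruction_count


cpi = cycles_per_instruction(4 * 10**8, 2 * 10**8)
print(cpi)  # 2.0
```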


Then again …

Then:

CPU time: \text{CPU time} = \frac{\text{IC} \times \text{CPI}}{\text{clock rate}}

Example:

\text{CPU time} = \frac{2 \times 10^8 \times 2}{2 \times 10^9} = 0.2s


So …

\text{CPU time} = \frac{\text{IC} \times \text{CPI}}{\text{clock rate}}

So, to make things go faster (reduce CPU time):

reduce the instruction count (IC)
reduce the number of cycles per instruction (CPI), or
increase the clock rate
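
To see the three levers side by side, here is a minimal sketch using the running example (IC of 2 × 10^8, CPI of 2, clock rate of 2GHz). The function name `cpu_time` and the choice to halve or double each factor are illustrative assumptions.

```python
def cpu_time(instruction_count, cpi, clock_rate_hz):
    """CPU time in seconds = (IC * CPI) / clock rate."""
    return instruction_count * cpi / clock_rate_hz


baseline = cpu_time(2e8, 2, 2e9)
fewer_instructions = cpu_time(1e8, 2, 2e9)   # halve IC
lower_cpi = cpu_time(2e8, 1, 2e9)            # halve CPI
faster_clock = cpu_time(2e8, 2, 4e9)         # double the clock rate

print(baseline, fewer_instructions, lower_cpi, faster_clock)
```

Under this formula each change halves the CPU time, so the three levers are interchangeable in principle; in practice they interact, since a design change that improves one can worsen another.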


Improvements in CPI

The Intel 8086 instruction PUSH AX:

8086 — 11 clock cycles
80286 — 3 clock cycles
80386 — 2 clock cycles
80486 — 1 clock cycle

So:

it is not just clock speed that has improved over the years
in fact, it is now commonplace to see \text{CPI} \le 1


Example

Example:

two machines (A and B) implementing the same instruction set architecture
- A has a cycle time of 10ns and a CPI of 2.0 (for some program P)
- B has a cycle time of 20ns and a CPI of 1.2 (for the same P)

Which is faster?
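
One way to work this out: both machines implement the same instruction set and run the same program, so the instruction count is the same on each and only the average time per instruction (cycle time × CPI) differs. A minimal sketch, with an illustrative function name:

```python
def time_per_instruction_ns(cycle_time_ns, cpi):
    """Average time per instruction in nanoseconds: cycle time * CPI."""
    return cycle_time_ns * cpi


machine_a = time_per_instruction_ns(10, 2.0)
machine_b = time_per_instruction_ns(20, 1.2)
ratio = machine_b / machine_a   # how many times faster A is than B

print(machine_a, machine_b, ratio)
```

A averages about 20ns per instruction against about 24ns for B, so A is roughly 1.2 times faster on this program despite its higher CPI.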


Aside

Note

The cycle time (in seconds) is just the reciprocal of the clock speed (in Hertz), and vice versa.


More Common Metrics

MIPS:

\text{MIPS} = \frac{\text{clock rate}}{\text{CPI} \times 10^6}

MFLOPS:

\text{MFLOPS} = \frac{\text{clock rate}}{\text{C-per-FPI} \times 10^6}
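
A short sketch of both metrics. The MIPS inputs reuse the 2GHz clock rate and CPI of 2 from the earlier example; the figure of 4 cycles per floating-point instruction is a hypothetical value chosen only for illustration, and the function names are not from the notes.

```python
def mips(clock_rate_hz, cpi):
    """Millions of instructions per second."""
    return clock_rate_hz / (cpi * 10**6)


def mflops(clock_rate_hz, cycles_per_fp_instruction):
    """Millions of floating-point operations per second."""
    return clock_rate_hz / (cycles_per_fp_instruction * 10**6)


mips_value = mips(2e9, 2)
mflops_value = mflops(2e9, 4)   # 4 cycles per FP instruction is hypothetical

print(mips_value)    # 1000.0
print(mflops_value)  # 500.0
```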


MIPS and MFLOPS

These can be poor metrics for comparing different processors:

some implement FP division (e.g. Pentium)
some don’t (e.g. SPARC)

Instruction counts:

they may have different instruction sets (so the ICs will be different)
the instruction count for complex operations like sine and cosine may be quite large
so these differences can be significant


Improving Performance

Generally:

optimise for the common case


Improving Performance

However, (particularly) with computer hardware:

optimisation is expensive (it requires substantial investment)

So:

we need to decide where to invest in optimisation, and
we need to know that the payback is going to be worth it


Speedup

Consider some possible hardware or software enhancement.

Speedup:

\text{speedup} = \frac{\text{performance without enhancement}}{\text{performance with enhancement}}

Note

"Performance", here, might be response time (say). With speedup, larger values are better.


Speedup — Example

Example:

a baseline implementation might execute a job in 3 seconds
with some enhancement, that might be reduced to 2 seconds

Speedup:

3/2 = 1.5


Important Gotcha!

Typically:

only a portion of an entire job will be sped up by any proposed enhancement

Example:

sort the contents of a disk file, storing the sorted results back in a new file on disk
so: read data in, sort it, write data out
an enhanced sorting algorithm can only improve the CPU costs, not the IO costs
an enhanced IO subsystem can only improve the IO costs, not the sorting costs


Example

Assume:

some job involving sub-jobs A and B
B accounts for 70% of the execution time, A the rest

Given a proposed enhancement:

running B 20 times faster

How much faster would our job run overall?


Amdahl’s Law — Example

Overall speedup:

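\text{speedup} = \frac{1}{(1-P) + P/S}, where P is the fraction of execution time affected by the enhancement and S is the speedup of that fraction
= \frac{1}{(1-0.7) + 0.7/20}
= \frac{1}{0.3 + 0.035}
\approx 2.985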
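
The calculation above can be reproduced with a small helper. This is a minimal sketch: the function name `amdahl_speedup` is illustrative, with P taken as the fraction of execution time that is enhanced and S as the speedup of that fraction.

```python
def amdahl_speedup(p, s):
    """Overall speedup when a fraction p of the run time is made s times faster."""
    return 1 / ((1 - p) + p / s)


overall = amdahl_speedup(0.7, 20)
# Upper bound: even if B took no time at all, A's 30% would remain.
limit = 1 / (1 - 0.7)

print(round(overall, 3))  # 2.985
print(round(limit, 3))    # 3.333
```

The bound shows why a 20-fold improvement to B yields less than a 3-fold gain overall: the untouched 30% of the job caps the speedup at about 3.33.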


Example

Given a proposed enhancement:

running B 20 times faster

How much faster would our job run overall?

It will run about three times faster:

this may be less than you intuitively expected.


Another Example

Amdahl’s law also allows comparison between two or more design alternatives.


Another Example

Example:

a program spends:
- half its time doing floating-point operations
- including 20% of its time calculating floating-point square roots

Alternative optimisations:

Add floating-point square root hardware which speeds up such operations by a factor of 10.
Make all floating-point operations run twice as fast.


Engineering

Assuming we can only choose one:

in which of these optimisations should we invest?


Engineering — First Case

Optimisations:

Add floating-point square root hardware which speeds up such operations by a factor of 10.

Amdahl’s law:

\text{speedup} = \frac{1}{0.8 + 0.2 / 10} \approx 1.22


Engineering — Second Case

Optimisations:

Make all floating-point operations run twice as fast.

Amdahl’s law:

\text{speedup} = \frac{1}{0.5 + 0.5 / 2} \approx 1.33

So, under these assumptions, the second approach looks like the better investment.
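
The comparison between the two design alternatives can be sketched as follows. The dictionary keys and the function name are illustrative; the fractions and speedup factors are those stated in the example.

```python
def amdahl_speedup(p, s):
    """Overall speedup when a fraction p of the run time is made s times faster."""
    return 1 / ((1 - p) + p / s)


alternatives = {
    "square-root hardware": amdahl_speedup(0.2, 10),   # 20% of time, 10x faster
    "all floating point": amdahl_speedup(0.5, 2),      # 50% of time, 2x faster
}
best = max(alternatives, key=alternatives.get)

for name, speedup in alternatives.items():
    print(f"{name}: {speedup:.2f}")
print(best)
```

The modest 2x improvement wins because it applies to a larger share of the run time, which is the point the corollary below makes.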


Corollary

Amdahl’s law tells us to:

make the common case fast!

Or:

we can never see a big speedup by optimising the uncommon case
