














Some concepts of High Performance Computing are Addressing Modes, Program Execution, Basic Computer Organization, Control Hazard Solutions, Least Recently Used, Memory Hierarchy Progression. Main points of this lecture are: Reality Check, Real Caches, Physical Addresses, Virtual Addresses, Pipelining, Caches, Address Translation, Physical Addressed Cache, Virtual Addressed Cache
Typology: Slides















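Virtual addressed cache: the same virtual address refers to different data in different processes, so either
Flush cache on context switch, or
Include Process id as part of each cache directory entry
Virtual addresses that translate to the same physical address:
More than one copy of a block in cache …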
Another possibility: Overlapped operation
Indexing into cache directory using virtual address
Tag comparison using physical address
Virtual indexed physical tagged cache
Physical Cache
16KB page size, 64KB direct mapped cache with 32B block size
Virtual Address: Virtual Page No (18 bits), Page Offset (14 bits)
Physical Address: Physical Page No (18 bits), Page Offset (14 bits)
Physical Address as seen by the cache: Cache Tag (16 bits), C-Index (11 bits), C offset (5 bits)
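Virtual indexed physical tagged cache
Virtual Address: VPN (18 bits), Page Offset (14 bits)
Cache lookup uses C-Index (11 bits) and Block offset (5 bits)
Physical Address: PPN (18 bits), Page Offset (14 bits)
Tag comparison gives Hit/Miss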
Pipelining
Current processors use more aggressive techniques for more performance
Some exploit Instruction Level Parallelism: often, many consecutive instructions are independent of each other and can be executed in parallel (at the same time)
Challenge: identifying which instructions are independent
Approach 1: build processor hardware to analyze and keep track of dependences
Approach 2: compiler does analysis and packs suitable instructions together for parallel execution by processor
and return, Address space, Data & its representation (4)
architecture, Instruction processing (6)
Process management (6)
impact on programming (4)
Protection (4)
Synchronization, Mutual exclusion, Parallel architecture,
Programming with message passing using MPI (5)
Reports Real/Elapsed/Wallclock time, CPU time in user mode, CPU time in system mode
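Your C program:
#include <stdio.h>
#include <sys/time.h>

int main(void)
{
    struct timeval before, after;
    gettimeofday(&before, NULL);
    /* region of program you want to time */
    gettimeofday(&after, NULL);
    printf("%ld\n", (long)(after.tv_sec - before.tv_sec)); /* elapsed whole seconds */
    return 0;
}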