











These notes cover techniques for optimizing procedure placement in compiled programs to reduce cache misses and improve performance. K. Pettis and R. Hansen introduce profile-guided code positioning and describe algorithms for procedure positioning based on a weighted call graph (WCG) and node merging. The notes also discuss the limitations of weighted call graphs and the use of temporal ordering information to construct a temporal relationship graph (TRG) that supports better placement decisions.












Procedure Positioning
Node = procedure
Edge from A → B means procedure A calls B (perhaps multiple times)
Weight on edge is the total number of dynamic calls
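As a rough illustration of the definition above, the sketch below builds a weighted call graph from a list of dynamic call events. The `profile` data and the `build_wcg` and `undirected` helper names are hypothetical, chosen only for this example; folding both call directions into one weight is an assumption that is convenient when deciding which procedures to place near each other.

```python
from collections import Counter

def build_wcg(calls):
    """Return {(caller, callee): number of dynamic calls}."""
    return Counter(calls)

def undirected(wcg):
    """Fold A->B and B->A into a single weight per pair of procedures."""
    folded = Counter()
    for (caller, callee), count in wcg.items():
        folded[frozenset((caller, callee))] += count
    return folded

# Hypothetical profile: each tuple is one dynamic call (caller, callee).
profile = [("A", "B")] * 3 + [("B", "A")] + [("B", "C")] * 2
wcg = build_wcg(profile)
pairs = undirected(wcg)
print(wcg[("A", "B")], pairs[frozenset(("A", "B"))])
```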
Example Call Graph
Merging Nodes (continued)
» A-D-C-F or F-C-D-A
» A-D-F-C or C-F-D-A
» D-A-C-F or F-C-A-D
» D-A-F-C or C-F-A-D
F is not connected to either A or D, and C is more strongly connected to A than to D
» Thus, C and A should be adjacent: pick D-A-C-F or F-C-A-D
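The choice among the four candidate orders can be sketched in code. The version below tries both orientations of each chain and keeps the one whose junction has the heaviest edge in the original graph. The edge weights are hypothetical (the example graph's numbers are not reproduced here); they only encode that C is tied more strongly to A than to D and that F touches neither.

```python
def weight(w, x, y):
    """Undirected lookup of the original call-graph weight between x and y."""
    return w.get((x, y), 0) + w.get((y, x), 0)

def merge_chains(left, right, w):
    """Join two chains, choosing the orientation whose junction
    (last procedure of the first chain, first of the second) is heaviest."""
    candidates = [l + r
                  for l in (left, left[::-1])
                  for r in (right, right[::-1])]
    junction = len(left)
    return max(candidates,
               key=lambda c: weight(w, c[junction - 1], c[junction]))

# Hypothetical original weights: A-C is stronger than D-C, F is unconnected.
original = {("A", "C"): 4, ("D", "C"): 3}
merged = merge_chains(["A", "D"], ["C", "F"], original)
print("-".join(merged))  # D-A-C-F
```

The reverse order F-C-A-D is equivalent for this purpose, since only adjacency matters.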
Basic Block Ordering
Define the order of the basic blocks in each procedure
First, identify chains of blocks
» Top-down
» Bottom-up
Second, define a precedence relation between the chains
Class Problem
Create the BB chains using top-down positioning
Bottom-up Chain Formation
Form BB chains using the following steps
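A minimal sketch of bottom-up chain formation as it is commonly described for Pettis and Hansen, under stated assumptions: every block starts as its own chain, edges are visited from heaviest to lightest, and two chains are joined only when the edge runs from the tail of one chain to the head of another. The control-flow graph and its weights are hypothetical, and this is not necessarily the exact step list used in the lecture.

```python
def bottom_up_chains(blocks, edges):
    """blocks: list of block names; edges: {(src, dst): execution count}."""
    chain_of = {b: [b] for b in blocks}          # block -> its chain (shared list)
    for (src, dst), _ in sorted(edges.items(), key=lambda kv: -kv[1]):
        a, b = chain_of[src], chain_of[dst]
        # Join only tail-to-head, and never a chain with itself.
        if a is not b and a[-1] == src and b[0] == dst:
            a.extend(b)
            for blk in b:
                chain_of[blk] = a
    chains, seen = [], set()
    for b in blocks:                             # collect each chain once
        c = chain_of[b]
        if id(c) not in seen:
            seen.add(id(c))
            chains.append(c)
    return chains

# Hypothetical CFG: A branches through B to C (cold) or D (hot), rejoining at E.
cfg = {("A", "B"): 10, ("B", "C"): 3, ("B", "D"): 7,
       ("C", "E"): 3, ("D", "E"): 7}
bb_chains = bottom_up_chains(["A", "B", "C", "D", "E"], cfg)
print(bb_chains)  # [['A', 'B', 'D', 'E'], ['C']]
```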
Define Precedence Relation Among Chains
6 conditional branches in our example
» B to C/O
» C to D/G
» D to E/F
» F to H/I
» I to J/M
» J to K/L
Branch B to C/O
» Yields chain 2(B) before chain 4(O)
Branch C to D/G
» Yields chain 2(C) before chain 4(G)
Branch D to E/F
» Nothing, all in same chain
Branch F to H/I
» Yields chain 2(F) before chain 3(I)
Branch I to J/M
» Yields chain 3(I) before chain 6(M)
Branch J to K/L
» Yields chain 3(J) before chain 5(K)
Chains:
1) A
2) E-N-B-C-D-F-H
3) I-J-L
4) G-O
5) K
6) M
This yields a partial order. To get a full order, choose a chain to follow another by the highest inter-chain weight
Final order: A, E-N-B-C-D-F-H, I-J-L, G-O, K, M
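The precedence relation for this example can be checked mechanically. The sketch below uses the six chains and six conditional branches listed in these notes and assumes the rule illustrated by them: the chain holding the branch comes before the chain holding any target that lies in a different chain. The function names are hypothetical, and the tie-breaking by inter-chain weight is not modeled because the weights are not given here.

```python
chains = {1: ["A"], 2: ["E", "N", "B", "C", "D", "F", "H"],
          3: ["I", "J", "L"], 4: ["G", "O"], 5: ["K"], 6: ["M"]}
branches = {"B": ("C", "O"), "C": ("D", "G"), "D": ("E", "F"),
            "F": ("H", "I"), "I": ("J", "M"), "J": ("K", "L")}

def precedence(chains, branches):
    """Return the set of (earlier_chain, later_chain) constraints."""
    chain_id = {blk: cid for cid, blks in chains.items() for blk in blks}
    before = set()
    for src, targets in branches.items():
        for dst in targets:
            if chain_id[src] != chain_id[dst]:   # same chain yields nothing
                before.add((chain_id[src], chain_id[dst]))
    return before

def respects(order, before):
    """True if every constraint is satisfied by the given chain order."""
    pos = {cid: i for i, cid in enumerate(order)}
    return all(pos[a] < pos[b] for a, b in before)

constraints = precedence(chains, branches)
print(sorted(constraints))                        # [(2, 3), (2, 4), (3, 5), (3, 6)]
print(respects([1, 2, 3, 4, 5, 6], constraints))  # True
```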
Temporal Information
This is more information than provided in the WCG
Thus, hopefully can make better placement decisions
Walk the procedure trace, appending each procedure to Q
Q is bounded by cache size
A procedure is moved to the newest end if it is already in Q
Temporal Relationship Graph Construction
If procedure p already exists in Q, increment TRG edge weights from the old p to the newest procedures in Q
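[Figure: TRG construction example. Trace: …A B C A B. Q: A B C, then B C A (OLDEST end marked). TRG over nodes A, B, C, with edge weights 10 and 1 updated to 11 and 2.]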
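The construction can be sketched as follows, under stated assumptions: Q is kept as a list with the oldest entry first, the bound is applied to the summed sizes of the procedures in Q, and a repeated procedure p adds one to the edge between p and every procedure that entered Q after p's previous occurrence. The `build_trg` name, the unit sizes, and the bound value are hypothetical.

```python
from collections import defaultdict

def build_trg(trace, sizes, bound):
    """Return (edge weights, final Q) for a procedure trace."""
    q = []                              # oldest first, newest last
    trg = defaultdict(int)              # frozenset({p, r}) -> weight
    for p in trace:
        if p in q:
            i = q.index(p)
            for r in q[i + 1:]:         # procedures seen since the old p
                trg[frozenset((p, r))] += 1
            del q[i]                    # p moves to the newest end
        q.append(p)
        # Evict oldest entries while Q exceeds the (assumed) size bound.
        while len(q) > 1 and sum(sizes[x] for x in q) > bound:
            q.pop(0)
    return trg, q

sizes = {"A": 1, "B": 1, "C": 1}        # hypothetical unit sizes
trg, q = build_trg("ABCAB", sizes, bound=3)
print(q)                                # ['C', 'A', 'B']
print(trg[frozenset("AB")], trg[frozenset("AC")], trg[frozenset("BC")])  # 2 1 1
```

Unlike the WCG, A and C gain an edge here even though neither calls the other, because they are interleaved in time.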
Generate Code Layout
Place code of merged nodes adjacent in memory
Sum weights of common edges into new node
Repeat until TRG is empty
Basically, same as Pettis/Hansen but on TRG rather than WCG
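The merge loop can be sketched as below: repeatedly take the heaviest remaining edge, merge its two nodes, and sum the weights of edges that the two nodes had in common. This is a simplified sketch with hypothetical weights; it assumes no self-edges, simply concatenates the two groups, and leaves out the choice of orientation when joining them.

```python
def greedy_layout(edges):
    """edges: {(x, y): weight}, treated as undirected, no self-edges.
    Returns the merged groups, each a tuple of procedures placed adjacently."""
    w, nodes = {}, set()
    for (x, y), wt in edges.items():
        a, b = (x,), (y,)
        nodes |= {a, b}
        key = frozenset((a, b))
        w[key] = w.get(key, 0) + wt
    while w:                                        # repeat until no edges remain
        heaviest = max(w, key=lambda e: (w[e], sorted(e)))
        u, v = sorted(heaviest)
        merged = u + v                              # place the groups adjacently
        nodes -= {u, v}
        nodes.add(merged)
        rest = {}
        for e, wt in w.items():
            if e == heaviest:
                continue
            others = e - heaviest
            key = e if len(others) == 2 else frozenset((merged, *others))
            rest[key] = rest.get(key, 0) + wt       # sum weights of common edges
        w = rest
    return sorted(nodes)

# Hypothetical graph weights.
graph = {("A", "B"): 10, ("A", "C"): 2, ("B", "C"): 1, ("C", "D"): 5}
layout = greedy_layout(graph)
print(layout)  # [('A', 'B', 'C', 'D')]
```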
Create a 2nd TRG at a finer granularity than the procedure
Chunk size of 256 bytes found to be a good number
Currently use the 2nd TRG for alignment only; full procedures are placed as a unit