Optimizing Procedure Placement with Profile-Guided Code Positioning - Prof. Scott Mahlke, Study notes of Electrical and Electronics Engineering

These notes cover techniques for optimizing procedure placement in compiled programs to reduce cache misses and improve performance. K. Pettis and R. Hansen introduce profile-guided code positioning and describe algorithms for procedure positioning based on a weighted call graph and node merging. The notes also cover the limitations of weighted call graphs and the proposal by N. Gloy, T. Blackwell, M. Smith, and B. Calder to use temporal ordering information to build a temporal relationship graph for better placement decisions.

Typology: Study notes

Pre 2010

Uploaded on 09/02/2009

EECS 583 - Profile-guided Code Layout (Supplementary Notes)

University of Michigan


Profile Guided Code Positioning

K. Pettis and R. Hansen, PLDI-90, 1990.


Procedure Positioning

• “Closest is best” strategy: if a procedure calls another frequently, we want the two procedures to wind up close to one another. This increases the chances that they will land on the same page, reducing the working set.

• Construct a weighted call graph
  » Node = procedure
  » Edge from A → B means procedure A calls B (perhaps multiple times)
  » Weight on edge is the total number of dynamic calls

• Use the weighted call graph to build the link order for the procedures
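The graph construction described above can be sketched in a few lines of Python. This is an illustrative sketch only: the trace format (one `(caller, callee)` pair per dynamic call) and the function names are assumptions, not part of the notes. Folding the two directions of an edge together is also an assumption made here because placement only depends on how often two procedures interact.

```python
from collections import Counter


def build_weighted_call_graph(call_trace):
    """Count dynamic calls per directed edge.

    call_trace: iterable of (caller, callee) pairs, one per dynamic call
    (a hypothetical profile format chosen for this sketch).
    """
    return Counter(call_trace)


def undirected_weights(wcg):
    """Fold A->B and B->A into one undirected weight per procedure pair."""
    merged = Counter()
    for (caller, callee), count in wcg.items():
        if caller != callee:  # a recursive call does not constrain placement
            merged[frozenset((caller, callee))] += count
    return merged


# Hypothetical profile: A calls B three times, B calls A once, A calls C twice.
trace = [("A", "B")] * 3 + [("B", "A")] + [("A", "C")] * 2
wcg = build_weighted_call_graph(trace)
weights = undirected_weights(wcg)
```

Here `wcg` keeps the directed counts that the slide describes, while `weights` is the form a placement pass would typically consume.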


Example Call Graph


Merging Nodes (continued)

• Procedures in a node are organized as a linear list called a chain
  » This is the layout order of the procedures

• Next, we merge the AD and CF nodes. 4 choices result:
  » A-D-C-F or F-C-D-A
  » A-D-F-C or C-F-D-A
  » D-A-C-F or F-C-A-D
  » D-A-F-C or C-F-A-D

• Use the “Closest is best” strategy
  » F is not connected to either A or D, and C is more strongly connected to A than to D
  » Thus, C and A should be adjacent: pick D-A-C-F or F-C-A-D

• Repeat the merging process until no edges remain in the graph
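The merge step above can be sketched as follows. The edge weights in `example` are hypothetical (the example call graph figure is not reproduced in these notes); they are chosen only so that A-D merges first, C-F second, and C is tied more strongly to A than to D. The orientation rule used here, which picks the arrangement whose two touching procedures share the heaviest original edge, is a simplified reading of "closest is best" and not necessarily the exact rule from the paper.

```python
from itertools import combinations


def weight(orig, p, q):
    """Original (undirected) call-graph weight between procedures p and q."""
    return orig.get(frozenset((p, q)), 0)


def merge_chains(orig, a, b):
    """Try the four orientations of chain a followed by chain b.

    Keep the one whose junction (last of a, first of b) has the heaviest
    original edge. The other four layouts are mirror images of these.
    """
    candidates = [x + y for x in (a, a[::-1]) for y in (b, b[::-1])]
    return max(candidates, key=lambda c: weight(orig, c[len(a) - 1], c[len(a)]))


def procedure_order(orig):
    """Merge the two most strongly connected chains until no edges remain."""
    chains = [[p] for p in sorted({p for pair in orig for p in pair})]
    while True:
        best = None
        for a, b in combinations(chains, 2):
            w = sum(weight(orig, p, q) for p in a for q in b)
            if w > 0 and (best is None or w > best[0]):
                best = (w, a, b)
        if best is None:  # no edges left between any two chains
            return chains
        _, a, b = best
        chains.remove(a)
        chains.remove(b)
        chains.append(merge_chains(orig, a, b))


# Hypothetical weights consistent with the description in the notes.
example = {
    frozenset(("A", "D")): 10,
    frozenset(("C", "F")): 8,
    frozenset(("A", "C")): 4,
    frozenset(("C", "D")): 1,
}
layout = procedure_order(example)
```

With these weights the final merge of the AD and CF chains selects D-A-C-F, the same choice the slide arrives at.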


Basic Block Ordering

• Define the order of the basic blocks in each procedure

• First, identify chains of blocks
  » 2 methods: top-down and bottom-up

• Second, define a precedence relation between the chains
  » Want non-taken conditional branches to be forward as much as possible


Class Problem

Create the BB chains using top-down positioning


Bottom-up Chain Formation

Form BB chains using the following steps:

  » Initially, consider each basic block as the head and tail of a chain.
  » Looking at the edges from largest to smallest weight, merge two different chains if the arc connects the tail of one chain to the head of the other.
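The two steps above can be sketched as a short Python function. The control-flow graph and its edge weights in the example are hypothetical, and the representation (a dict from `(source, destination)` to execution count) is an assumption made for illustration.

```python
def bottom_up_chains(blocks, edges):
    """Greedy bottom-up chain formation.

    blocks: list of basic-block names.
    edges:  dict mapping (src, dst) -> profile weight of that CFG arc.
    """
    # Every block starts out as both head and tail of its own chain.
    chain_of = {b: [b] for b in blocks}
    # Visit arcs from largest to smallest weight.
    for (src, dst), _ in sorted(edges.items(), key=lambda kv: -kv[1]):
        a, b = chain_of[src], chain_of[dst]
        # Merge only if src is the tail of one chain and dst the head of another.
        if a is not b and a[-1] == src and b[0] == dst:
            a.extend(b)
            for blk in b:
                chain_of[blk] = a
    # Collect each distinct chain once, in order of first appearance.
    seen, result = set(), []
    for blk in blocks:
        chain = chain_of[blk]
        if id(chain) not in seen:
            seen.add(id(chain))
            result.append(chain)
    return result


# Hypothetical CFG: A branches to B (hot) or C (cold); both rejoin at D.
cfg_blocks = ["A", "B", "C", "D"]
cfg_edges = {("A", "B"): 90, ("B", "D"): 90, ("A", "C"): 10, ("C", "D"): 10}
bb_chains = bottom_up_chains(cfg_blocks, cfg_edges)
```

In this example the hot path A-B-D becomes one chain. The arc A→C is rejected because A is no longer a tail, and C→D is rejected because D is no longer a head, so C stays in a chain of its own.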


Define Precedence Relation Among Chains

• Chains from the example:
  1) A
  2) E-N-B-C-D-F-H
  3) I-J-L
  4) G-O
  5) K
  6) M

• 6 conditional branches in our example
  » B to C/O
  » C to D/G
  » D to E/F
  » F to H/I
  » I to J/M
  » J to K/L

• Branch B to C/O
  » Yields chain 2 (B) before chain 4 (O)

• Branch C to D/G
  » Yields chain 2 (C) before chain 4 (G)

• Branch D to E/F
  » Nothing, all in the same chain

• Branch F to H/I
  » Yields chain 2 (F) before chain 3 (I)

• Branch I to J/M
  » Yields chain 3 (I) before chain 6 (M)

• Branch J to K/L
  » Yields chain 3 (J) before chain 5 (K)

• This yields a partial order. To get a full order, choose a chain to follow another by the highest inter-chain weight.

• Final order: A, E-N-B-C-D-F-H, I-J-L, G-O, K, M
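The precedence constraints listed above can be derived mechanically from the chains and the conditional branches given in the notes. The sketch below uses a simplified rule, stated here as an assumption: whenever a branch target lies in a different chain from the branch itself, the branch's chain should come first so that the branch is a forward one. The function names are hypothetical.

```python
# Chains and conditional branches exactly as listed in the notes.
chains = {
    1: ["A"],
    2: ["E", "N", "B", "C", "D", "F", "H"],
    3: ["I", "J", "L"],
    4: ["G", "O"],
    5: ["K"],
    6: ["M"],
}
branches = {
    "B": ("C", "O"),
    "C": ("D", "G"),
    "D": ("E", "F"),
    "F": ("H", "I"),
    "I": ("J", "M"),
    "J": ("K", "L"),
}


def precedence_pairs(chains, branches):
    """Return the set of (earlier_chain, later_chain) constraints."""
    chain_of = {blk: cid for cid, blks in chains.items() for blk in blks}
    pairs = set()
    for src, targets in branches.items():
        for dst in targets:
            if chain_of[src] != chain_of[dst]:  # target lives in another chain
                pairs.add((chain_of[src], chain_of[dst]))
    return pairs


def respects(order, pairs):
    """Check that a full chain order satisfies every precedence constraint."""
    pos = {cid: i for i, cid in enumerate(order)}
    return all(pos[a] < pos[b] for a, b in pairs)


pairs = precedence_pairs(chains, branches)
final_order_ok = respects([1, 2, 3, 4, 5, 6], pairs)
```

The resulting set is only a partial order. Choosing among the orders that satisfy it still requires the inter-chain weights, which this sketch does not model.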

Procedure Placement Using Temporal Ordering Information

N. Gloy, T. Blackwell, M. Smith, and B. Calder, MICRO-30, 1997.


Temporal Information

• Use temporal ordering to accurately order code blocks

• Define a structure that summarizes, for each pair of procedures p and q, the frequency of finding q between 2 successive references to p
  » This is more information than provided in the WCG (weighted call graph)
  » Thus, hopefully can make better placement decisions

• Represent with a queue Q, listing procedures/cache line usage
  » Walk the procedure trace, appending to Q
  » Bounded by cache size
  » Procedures moved to the newest end if already in Q


Temporal Relationship Graph Construction

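• Generate TRG
  » If procedure p already exists in Q, increment the TRG edge weights from the old p to the newest procedures in Q

[Figure: worked example on the trace "…A B C A B". Q holds A B C (A oldest) and becomes B C A after the second reference to A. In the TRG over nodes A, B, and C, the two edge weights shown change from 10 and 1 to 11 and 2.]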
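The queue walk and edge updates described above can be sketched as follows. This is a simplified illustration: it bounds Q by a number of entries (`capacity`) instead of by cache size in bytes, and it starts every edge weight at zero, both of which are assumptions made to keep the example small.

```python
from collections import defaultdict


def build_trg(trace, capacity):
    """Build a temporal relationship graph from a procedure trace.

    trace:    iterable of procedure names in reference order.
    capacity: maximum number of entries kept in Q (a stand-in for the
              cache-size bound mentioned in the notes).
    Returns a dict mapping frozenset({p, q}) -> edge weight.
    """
    q = []  # ordered oldest -> newest
    trg = defaultdict(int)
    for p in trace:
        if p in q:
            i = q.index(p)
            # Everything newer than the old p was referenced between two
            # successive references to p, so strengthen those edges.
            for other in q[i + 1:]:
                trg[frozenset((p, other))] += 1
            q.pop(i)
        q.append(p)  # p is now the newest entry
        if len(q) > capacity:
            q.pop(0)  # drop the oldest entry to respect the bound
    return dict(trg)


trg = build_trg(["A", "B", "C", "A", "B"], capacity=3)
```

On this short trace, the second reference to A increments the A-B and A-C edges and moves A to the newest end of Q, and the second reference to B then increments B-C and A-B.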


Generate Code Layout

• Pick the highest weight edge and merge its nodes
  » Place code of merged nodes adjacent in memory
  » Sum weights of common edges into the new node
  » Repeat until the TRG is empty
  » Basically the same as Pettis/Hansen, but on the TRG rather than the WCG

• Rather than just placing procedures sequentially, can leave a small amount of whitespace to reduce conflicts

• Chunking
  » Create a 2nd TRG that is smaller granularity than a procedure
  » Chunk size of 256 bytes found to be a good number
  » Currently use the 2nd TRG for alignment only; place full procedures as a unit
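The merge loop in the first bullet group can be sketched as a small graph-coarsening routine. This is only an outline under stated assumptions: the input format and function name are hypothetical, merged procedures are simply concatenated without considering orientation, and the whitespace and chunk-based alignment steps are not modelled.

```python
def greedy_layout(edges):
    """Repeatedly merge the endpoints of the heaviest edge.

    edges: dict mapping (p, q) -> weight for distinct procedures p and q.
    Returns a sorted list of tuples; each tuple is a group of procedures
    to be placed adjacently in memory.
    """
    nodes = {(p,) for pair in edges for p in pair}
    graph = {frozenset([(p,), (q,)]): w for (p, q), w in edges.items()}
    while graph:
        # Heaviest edge first; ties broken by node names for determinism.
        heaviest = max(graph, key=lambda e: (graph[e], sorted(e)))
        u, v = sorted(heaviest)
        merged = u + v
        nodes -= {u, v}
        nodes.add(merged)
        new_graph = {}
        for edge, w in graph.items():
            if edge != heaviest:
                # Redirect edges that touched u or v to the merged node and
                # sum the weights of edges that now coincide.
                renamed = frozenset(merged if n in (u, v) else n for n in edge)
                new_graph[renamed] = new_graph.get(renamed, 0) + w
        graph = new_graph
    return sorted(nodes)


# Hypothetical TRG weights for three procedures.
placement = greedy_layout({("A", "B"): 2, ("A", "C"): 1, ("B", "C"): 1})
```

After A and B merge, the A-C and B-C edges collapse into a single edge of weight 2 between the merged node and C, which is the "sum weights of common edges" step.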