Static Semantics and Compiler Error Recovery
by
Robert Paul Corbett
June 1985
Sponsored by Defense Advanced Research Projects Agency (DoD), Arpa Order No. 4871. Monitored by Naval Electronic Systems Command under Contract No. N00039-84-C-
- REPORT DATE: JUN 1985
- DATES COVERED: 00-00-1985 to 00-00-
- TITLE AND SUBTITLE: Static Semantics and Compiler Error Recovery
- PERFORMING ORGANIZATION NAME(S) AND ADDRESS(ES): University of California at Berkeley, Department of Electrical Engineering and Computer Sciences, Berkeley, CA
- DISTRIBUTION/AVAILABILITY STATEMENT: Approved for public release; distribution unlimited
- ABSTRACT
Good error recovery for compilers depends on accurate diagnosis of errors. When an error is
misdiagnosed, the error message issued for it is apt to be misleading. Worse, the error recovery system may
leave the compiler in a configuration that will cause spurious errors to be reported later. This dissertation
presents new error recovery techniques for compilers that generally diagnose errors more accurately than
earlier techniques. The major innovation embodied in the new error recovery techniques is the use of
general static semantic information to help detect and diagnose syntactic errors. There are usually many
possible ways of recovering from an error. Testing if a potential recovery leads to semantic problems later
involves executing the semantic actions associated with that recovery. If a potential recovery is rejected, the
semantic actions that were performed while testing it must have no apparent effect on later compilation.
Thus, it must be possible to undo the effects of semantic actions. For conventional compilers, the
mechanisms needed to reverse the effects of semantic actions are too slow to be practical. A new compiler
organization that permits semantic actions to be undone efficiently is presented. This new organization is
suited for compiling languages, such as C, Pascal, and Ada, that require declarations to precede uses. Two
further ways of improving the performance of error recovery systems are considered. Error recovery
systems sometimes fail to accurately diagnose an error because the parser has performed reductions based
on erroneous input. A variety of techniques for avoiding the adverse effects of such reductions are
presented and compared. Also, a new panic mode algorithm for use with LR parsers is presented. The new
error recovery techniques have been applied in an error checking program for Pascal. The recoveries
produced by that program are shown to compare favorably with those produced by two well known error
recovery systems. Finally, some drawbacks of the new techniques and some directions for future work are
discussed.
Robert Paul Corbett
Department of Electrical Engineering and Computer Science
Computer Science Division
University of California
Berkeley, California 94720
Acknowledgements
I would like to express my deep gratitude to my dissertation advisor, Professor Susan L. Graham, for her encouragement and understanding, and for her insights and assistance which contributed greatly to this work. I would also like to express my gratitude to the other members of my doctoral committee, Professors Paul N. Hilfinger and Robert M. Solovay.

I would like to offer special thanks to my fellow student Michael C. Shebanow, whose implementation of the algorithms presented in Chapter 7 was invaluable. I would like to thank Gerald Fisher and Michael Burke for freely sharing their ideas and their codes with me. I would like to thank Peter B. Kessler and Marshall K. McKusick for providing the graph profiler gprof. I would like to thank Benjamin Zorn, Eduardo Pelegrí-Llopart, and Phillip Garrison for many discussions that helped clarify the concepts presented herein. I would like to thank all those who graciously offered their counsel, but especially Professors W. Kahan and Robert Fabry.

Particular thanks must be paid to my parents, Jeanette and Harvey Corbett, without whose support and encouragement this work would not have been possible.

Finally, I would like to offer thanks to the National Science Foundation, Grant MCS80-05144, and the Defense Advanced Research Projects Agency, Contracts N00039-82-C-0235 and N00039-84-C-0089, for their financial support over the years.
Table of Contents
- 1 Introduction
- 2 Terminology
- 3 Previous Proposals for Semantics-directed Error Recovery
- 4 Semantics-directed Error Recovery
  - 4.1 Local Recovery Algorithms for LR Parsers
  - 4.2 Applying Semantics to Repairs
  - 4.3 Semantics for Semantics-directed Repairs
- 5 A Model of Compilation for Semantics-directed Error Recovery
  - 5.1 Attribute Grammars
  - 5.2 LL- and LR-attributed Grammars
  - 5.3 A Practical Organization that Supports Semantics-directed Error Recovery
  - 5.4 Symbol Tables
- 6 Erroneous Reductions
  - 6.1 General Backtracking
  - 6.2 Suppressing Default Reductions
  - 6.3 Pretesting
  - 6.4 LR(k) Error Checking via Stack Restoration
  - 6.5 Limited Backtracking
  - 6.6 Comparing the Techniques
- 7 Panic Mode for LR Parsers
  - 7.1 Desirable Characteristics for Panic Mode Algorithms
  - 7.2 Some Earlier Panic Mode Algorithms
    - 7.2.1 Aho and Ullman's algorithm
    - 7.2.2 Pai and Kieburtz' algorithm
    - 7.2.3 Hartmann's algorithm
    - 7.2.4 The Yacc algorithm
    - 7.2.5 Burke and Fisher's algorithm
    - 7.2.6 Sippu and Soisalon-Soininen's algorithm
    - 7.2.7 Properties that lead to good panic mode recoveries
  - 7.3 Panic Declarations
  - 7.4 The New Panic Mode Algorithm
  - 7.5 Semantics and Panic Mode
- 8 An Implementation and Empirical Results
  - 8.1 The Bison Parser Generator
  - 8.2 The Parser
  - 8.3 The Pascal Auditor's Error Recovery System
  - 8.4 The Repairs
  - 8.5 Reporting Errors
  - 8.6 Space and Time
  - 8.7 Examples of Use
  - 8.8 Comparisons
- 9 Implementation Notes
  - 9.1 Error Messages for Insertions
  - 9.2 The Lexical Analyzer
  - 9.3 Assigning Costs to Syntactic Repairs
  - 9.4 Recording Repairs
  - 9.5 The Spelling Matcher
- 10 Future Work
  - 10.1 New Test Suites for Error Recovery
  - 10.2 Error Productions
  - 10.3 Improving the Parser Generator
  - 10.4 Enhancing the Local Recovery Algorithm
  - 10.5 Other Languages
- Appendix A: The Grammar for the Pascal Auditor
- Appendix B: Recoveries Produced with and without Semantics
- Appendix C: Programs for which Berkeley Pascal or the Burke-Fisher System Outperform the Pascal Auditor
- Appendix D: Some Examples for which the Pascal Auditor Produces Better Recoveries than Berkeley Pascal
- References
Table of Figures
- 4.1 Example illustrating backtracking in Levy's algorithm
- 5.1 A sample LR-attributed grammar
- 5.2 Grammar for function calls without using inherited attributes
- 5.3 Grammar for function calls using inherited attributes
- 5.4 The look up algorithm
- 5.5 The algorithm for popping the scope
- 5.6 The back up routine
- 5.7 The restore routine
- 5.8 The reset routine
- 6.1 The function Shiftable
- 6.2 A semantic error requiring backtracking
- 7.1 Panic declarations for Pascal
- 7.2 The new panic mode algorithm
- 8.1 The Graham-Rhodes example
- 8.2 P. J. Brown's example
Introduction
Ideally, a compiler should detect and correctly identify every error in every program submitted to it. That goal, regrettably, is unattainable. Many errors either cannot be detected at compile time or are so difficult to detect that it is not practical to check for them. Even when an error is detected, it is, in general, impossible to correctly diagnose the error. Diagnosing an error involves guessing how a program deviates from the programmer's intent. Through heuristics, those guesses can be made highly accurate. Still, some failures must be expected.

The problem of providing good error recovery in a practical compiler is compounded by the need for efficiency. To handle errors well, a compiler must record information and perform tests that would otherwise be unnecessary. For example, to be able to associate locations with errors, the position of each symbol in the source text must be recorded. Those additional operations will cause the compiler to be slower. A slow compiler is as undesirable as one that does not handle errors well. A practical compiler must strike a balance between the efficiency and the power of its error recovery system.

Error recovery is a four step process. The four steps are detection, diagnosis, reporting, and patching. Detection consists of discovering the presence of an error. Errors are often classified according to the part of the compiler by which they are detected. Thus, errors detected by the lexical analyzer are called lexical errors, those detected by the parser are called syntax errors, and those detected by semantic action routines are called semantic errors. Diagnosis consists of guessing the location and nature of the error. The results of the diagnosis are used when reporting and patching the error. Reporting consists of providing the programmer with information to help him identify the error. Patching consists of modifying the state of the compiler so that compilation can continue.

Many error recovery techniques have been proposed. Among the most commonly used techniques are
1. Error Productions. If a compiler writer anticipates that certain syntax errors may occur, he can extend his grammar for the language to be compiled to include the erroneous constructs. Rules that are part of such extensions are called error productions. The compiler writer must provide for reporting errors handled by error productions.
2. Local Recoveries. A local recovery is a recovery that is determined by the immediate context in which the error was detected. Most local recovery algorithms consider only simple recovery actions such as insertion, deletion or replacement of single symbols. Local recovery algorithms usually do not require the compiler writer to supply any special information; any necessary information is inferred from the parser. Many algorithms allow the compiler writer to supply a small amount of information that is used to fine-tune the choice of recoveries.
3. Panic Mode. A panic mode recovery consists of deleting symbols from the remaining input until a recognized symbol or sequence of symbols is at the head of the input. The parse stack is then reconfigured so that parsing can continue over the remaining input.

Good error recovery systems typically incorporate the three recovery techniques mentioned above and perhaps others as well. Recoveries that involve changes to the program text at the token level only are customarily called repairs.
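As a rough illustration of the second and third techniques, the following Python sketch enumerates the single-symbol insertions, deletions, and replacements that a local recovery algorithm might consider, and performs the input-skipping half of a panic mode recovery. The token representation, the function names, and the choice of synchronizing symbols are assumptions made for this example only; reconfiguring the parse stack is not shown.

```python
from typing import Iterator, List, Set, Tuple

# (kind, position, symbol); symbol is "" for deletions. Hypothetical representation.
Repair = Tuple[str, int, str]


def single_symbol_repairs(tokens: List[str], pos: int,
                          vocabulary: List[str]) -> Iterator[Tuple[Repair, List[str]]]:
    """Yield each single-symbol repair at pos with the repaired token list."""
    # Insertions: place one vocabulary symbol in front of the token at pos.
    for sym in vocabulary:
        yield ("insert", pos, sym), tokens[:pos] + [sym] + tokens[pos:]
    if pos < len(tokens):
        # Deletion: drop the token at pos.
        yield ("delete", pos, ""), tokens[:pos] + tokens[pos + 1:]
        # Replacements: substitute a different symbol for the token at pos.
        for sym in vocabulary:
            if sym != tokens[pos]:
                yield ("replace", pos, sym), tokens[:pos] + [sym] + tokens[pos + 1:]


def panic_mode_skip(tokens: List[str], pos: int, sync: Set[str]) -> int:
    """Return the index of the first token at or after pos that is in sync.

    Returns len(tokens) if no synchronizing symbol remains in the input.
    """
    while pos < len(tokens) and tokens[pos] not in sync:
        pos += 1
    return pos


if __name__ == "__main__":
    tokens = ["a", ":=", "m", ")", ";"]
    for repair, repaired in single_symbol_repairs(tokens, 3, ["(", ";"]):
        print(repair, " ".join(repaired))
    print(panic_mode_skip(tokens, 3, {";"}))
```

A real recovery system would test each candidate against the parser (and, in this dissertation's approach, against the semantic actions) before accepting it; the sketch only generates the candidates.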
In theory, all syntax errors could be handled by error productions. If the grammar for a language is extended to accept all possible input strings, no other syntactic error recovery capabilities need be provided. However, a grammar capable of distinguishing erroneous syntax from legal syntax for any input is apt to be large and of a form for which efficient parsers cannot be constructed. Therefore, error productions are normally used only for errors that cannot be handled well using other recovery techniques.

Error productions are often used to relax restrictions in the language to be compiled. For example, in Pascal [ANS83], declarations must appear in a fixed order. If a declaration occurs out of order, there is little chance that the error could be patched by a local recovery. A panic mode recovery for such an error would be tantamount to deleting the declaration. Many spurious semantic errors result from such a recovery. Therefore, extending the grammar to allow declarations to appear in any order appears to be the only way to handle such errors gracefully.

Local recoveries work best for simple errors. For example, consider the erroneous Pascal code fragment
    i := i + 1
    j := 0;
where i and j are integer variables. The likely error is that a semicolon has been omitted from the end of the first line. A good repair algorithm should determine that a semicolon should be inserted between the two lines. A statistical study of errors in Pascal programs [RD78] has shown that, for Pascal at least, local recovery techniques should be effective for most common errors.

There are often many different local recoveries that could be used to patch an error. For example, suppose the erroneous statement
    a := m);
appears in a Pascal program. The apparent error is that the statement contains an unmatched right parenthesis. The error could be patched by inserting a left parenthesis before the identifier m or by deleting the right parenthesis. Either repair would seem reasonable. However, the error could be patched just as effectively by replacing the right parenthesis with a semicolon. That repair is apt to seem unreasonable to most programmers. Many local recovery algorithms allow a compiler writer to bias the choice of recoveries in favor of those he feels are desirable. The compiler writer is allowed to assign costs to each possible recovery. Whenever there is a choice of local recoveries that patch an error, the recovery whose cost is the lowest is selected.

Panic mode recoveries are useful when an error deviates so far from a legal text that no simple correction can patch the error. Suppose, for example, that the Algol-like statement
of the presence of the comma. Unless the effects of the erroneous reductions can be reversed, it is unlikely a good recovery will be found. An analogous problem for top-down parsers is discussed by Burke and Fisher [BF82].

Many good panic mode algorithms have been developed for top-down parsers. The panic mode algorithms that have been proposed for LR parsers do not work nearly so well. As a part of this work, an improved panic mode algorithm for LR parsers has been developed.
An implementation is the best test of an error handling system. Many impractical error handling techniques have been described in the literature. With but few exceptions, those techniques either have not been implemented or have been implemented only for unrealistic languages that do not expose their flaws. The new error recovery techniques described in later chapters have been implemented as part of an error checking program for Pascal called the Pascal auditor. Measurements of the Pascal auditor's speed and space requirements show the practicality of the new techniques.

A new parser generator named Bison has been written to assist construction of compilers using the new error recovery techniques. Bison was designed to support experiments with a variety of error recovery techniques. The parsers produced by Bison are faster than those produced by most other parser generators. Furthermore, Bison itself is faster than most other parser generators because it is based on more modern algorithms.
As a demonstration of the power of the new error recovery techniques, the Pascal auditor has been compared with two well-known error handling systems. Ripley and Druseikis [RD78] have created a sample of erroneous Pascal programs that has become a standard test suite for error handling systems. The recoveries produced by the Pascal auditor for that test suite have been compared with those produced by the other systems.

The remaining chapters are organized as follows. The next chapter introduces the terminology and notation used in later chapters. Chapters 3 through 5 describe schemes for using semantics to help detect and recover from errors. Chapter 6 explores techniques for preventing or reversing the effects of erroneous reductions. Chapter 7 presents the new panic mode algorithm. Chapter 8 describes the Pascal auditor and the empirical data obtained from it. The final chapters discuss lessons learned from the implementation, directions for future work, and conclusions. All examples of errors presented in the remaining chapters are taken from Pascal programs unless stated otherwise.
Terminology
Let $S$ be a set of symbols. A string over $S$ is a finite sequence of symbols in $S$. The empty sequence is called the empty string and is denoted by the Greek letter $\lambda$. The length of a string $a_1 \ldots a_n$ is $n$. For any symbol $a$, $a^k$ is the string consisting of $k$ instances of $a$. $S^k$ is the set of all strings over $S$ of length $k$. $S^*$ is the set of all strings over $S$ (including $\lambda$). Let $x = a_1 \ldots a_m$ and $y = b_1 \ldots b_n$ be any two strings. The concatenation of $x$ and $y$ (in that order) is the string $a_1 \ldots a_m b_1 \ldots b_n$. Concatenation is indicated by adjacency. For example, the concatenation of $x$, $y$, and $z$ is denoted as $xyz$. A string $x$ is a prefix of a string $y$ if and only if $y = xz$ for some $z \in S^*$.
Let $V$ be a finite set of symbols, and let $\Sigma$ be a proper subset of $V$. Let $N$ denote $V - \Sigma$. A production or rule over $V$ and $\Sigma$ is an ordered pair $(A, x)$ where $A \in N$ and $x \in V^*$. A production $(A, x)$ is denoted as $A \to x$. For any rule $A \to x$, $A$ is its left-hand side (lhs) and $x$ is its right-hand side (rhs).

A context-free grammar $G$ is a 4-tuple $(V, \Sigma, P, S)$, where $V$ is a finite set of symbols, $\Sigma$ is a proper subset of $V$, $P$ is a finite set of productions over $V$ and $\Sigma$, and $S \in V$. $N$ denotes the set $V - \Sigma$. A symbol in $\Sigma$ is a terminal symbol, and a symbol in $N$ is a nonterminal symbol. The symbol $S$ is the start symbol.
For any two strings $x, y \in V^*$, the relation $x \Rightarrow y$ is true if and only if $x = sZt$, $y = szt$, and $Z \to z \in P$, for some $Z \in N$ and $s, t, z \in V^*$. The relation $x \Rightarrow_{rm} y$ is true if and only if $x = sZt$, $y = szt$, and $Z \to z \in P$, for some $Z \in N$, $t \in \Sigma^*$, and $s, z \in V^*$. The symbol $\Rightarrow^*$ denotes the reflexive transitive closure of $\Rightarrow$, and $\Rightarrow^*_{rm}$ denotes the reflexive transitive closure of $\Rightarrow_{rm}$. A string $x$ derives a string $y$ if and only if $x \Rightarrow^* y$.

A string $x \in V^*$ is a sentential form of $G$ if and only if $S \Rightarrow^* x$. A sentence of $G$ is a sentential form $x$ such that $x \in \Sigma^*$. The language defined by $G$ is the set of all sentences of $G$ and is denoted as $L(G)$. A string $x$ is a correct prefix if and only if $x$ is the prefix of a sentential form. A string $x$ is a right sentential form if and only if $S \Rightarrow^*_{rm} x$. Let $x = szt$ be a string such that $S \Rightarrow^*_{rm} sZt \Rightarrow_{rm} szt$. Then $z$ is a handle of $x$.
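The two derivation relations defined above can be made concrete with a small Python sketch. The toy grammar, the tuple representation of strings, and the function names are assumptions introduced for this illustration. The bounded search in `derives` prunes strings longer than the target, which is only sound for grammars without $\lambda$-productions, such as the one used here.

```python
from typing import Dict, List, Set, Tuple

Form = Tuple[str, ...]
Grammar = Dict[str, List[Form]]  # nonterminal -> list of right-hand sides

# Hypothetical toy grammar: E -> E + T | T,  T -> id
GRAMMAR: Grammar = {
    "E": [("E", "+", "T"), ("T",)],
    "T": [("id",)],
}


def derive_step(x: Form, g: Grammar, rightmost: bool = False) -> Set[Form]:
    """Return every y such that x => y (or x =>rm y when rightmost is True)."""
    results: Set[Form] = set()
    for i, sym in enumerate(x):
        if sym not in g:
            continue  # terminals cannot be rewritten
        # For a rightmost step, the suffix t after Z must contain only terminals.
        if rightmost and any(s in g for s in x[i + 1:]):
            continue
        for rhs in g[sym]:
            results.add(x[:i] + rhs + x[i + 1:])
    return results


def derives(x: Form, y: Form, g: Grammar, rightmost: bool = False) -> bool:
    """Test x =>* y by breadth-first search over strings no longer than y.

    Assumes g has no lambda-productions, so derivation steps never shrink a string.
    """
    seen = {x}
    level = [x]
    while level:
        next_level = []
        for form in level:
            if form == y:
                return True
            for z in derive_step(form, g, rightmost):
                if len(z) <= len(y) and z not in seen:
                    seen.add(z)
                    next_level.append(z)
        level = next_level
    return False


if __name__ == "__main__":
    start = ("E",)
    print(derives(start, ("id", "+", "id"), GRAMMAR))           # a sentence
    print(derives(start, ("id", "+", "T"), GRAMMAR))            # a sentential form
    print(derives(start, ("id", "+", "T"), GRAMMAR, True))      # not a right sentential form
```

In this grammar the string `id + T` is a sentential form but not a right sentential form, because a rightmost derivation must rewrite `T` before it rewrites the `E` to its left.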
A derivation tree $T$ of $G$ is a labeled ordered tree such that
1. Each interior node is labeled with a nonterminal symbol.
2. Each leaf node is labeled with a terminal symbol or $\lambda$.
3. For each interior node $v$, let $v_1, \ldots, v_n$ be the immediate descendants of $v$. Let $A$ be the symbol labeling $v$. Then either
   a) $n = 1$, $v_1$ is labeled with $\lambda$, and $A \to \lambda \in P$, or
   b) $v_1, \ldots, v_n$ are labeled with the symbols $a_1, \ldots, a_n$ respectively, and $A \to a_1 \ldots a_n \in P$.

A parse tree is a derivation tree whose root node is labeled with the start symbol $S$. The frontier of a derivation tree $T$ is the string formed by concatenating the symbols labeling the leaves of $T$ in left to right order. The frontier of every parse tree is a sentence of $G$.
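The three conditions on derivation trees and the definition of the frontier translate directly into code. The following Python sketch is one possible rendering; the `Node` class, the representation of productions as (lhs, rhs) pairs with an empty tuple for a $\lambda$-production, and the sample grammar are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple

LAMBDA = "λ"  # label used here for the empty string

Production = Tuple[str, Tuple[str, ...]]  # (lhs, rhs); rhs == () for A -> λ


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)


def frontier(t: Node) -> List[str]:
    """Concatenate the leaf labels of t from left to right, dropping λ leaves."""
    if not t.children:
        return [] if t.label == LAMBDA else [t.label]
    out: List[str] = []
    for child in t.children:
        out.extend(frontier(child))
    return out


def is_derivation_tree(t: Node, productions: Set[Production],
                       terminals: Set[str]) -> bool:
    """Check conditions 1 to 3 of the definition for the tree rooted at t."""
    if not t.children:
        # Condition 2: a leaf carries a terminal symbol or λ.
        return t.label in terminals or t.label == LAMBDA
    labels = tuple(child.label for child in t.children)
    if labels == (LAMBDA,):
        ok = (t.label, ()) in productions        # condition 3a
    else:
        ok = (t.label, labels) in productions    # condition 3b
    # Conditions 1 and 3 together: an interior label must be the lhs of a rule.
    return ok and all(is_derivation_tree(c, productions, terminals)
                      for c in t.children)


if __name__ == "__main__":
    prods = {("E", ("E", "+", "T")), ("E", ("T",)), ("T", ("id",))}
    tree = Node("E", [
        Node("E", [Node("T", [Node("id")])]),
        Node("+"),
        Node("T", [Node("id")]),
    ])
    print(is_derivation_tree(tree, prods, {"id", "+"}))
    print(frontier(tree))
```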
where $m > 0$, and $a_i \in A$ for $1 \le i \le m$. The arity of each semantic function may be different.

Each semantic function $f^p_{ka}$ is paired with a dependency vector $d^p_{ka}$. A dependency vector indicates which values of the attributes of the symbols $X_0, \ldots, X_{n_p}$ are to be the arguments of the matching semantic function. The number of elements in each dependency vector must equal the arity of the corresponding semantic function. An attribute $a$ of the symbol $X_i$, $1 \le i \le n_p$, can be represented by the ordered pair $(i, a)$. Each element of a dependency vector is an ordered pair of that form. If the $i$-th element of $d^p_{ka}$ is $(j, a')$, then the domain of the $i$-th argument of $f^p_{ka}$ must be $U_{a'}$. The dependency set $D^p_{ka}$ is the union of the elements of $d^p_{ka}$.
Let $AG$ be an attribute grammar, and let $G$ be its underlying context-free grammar. An attributed parse tree $APT$ of $AG$ is a parse tree $T$ of $G$ together with a function $\mu$. The function $\mu$ is the meaning of the tree. The domain of $\mu$ is the set

$$S = \{\, (v, a) \mid v \text{ is a node of } T, \text{ and } a \in A(X) \text{ where } X \text{ is the symbol labeling } v \,\}.$$

For $(v, a) \in S$, $\mu(v, a) \in U_a$. If $(v, a) \in S$, then $\mu(v, a)$ is the value of $a$ at $v$. An $APT$ is an evaluation of the parse tree $T$ if and only if
1. $T$ is the underlying parse tree of the $APT$.
2. For each interior node $v$ of $T$ whose sole descendant is labeled with $\lambda$, let $X$ be the symbol labeling $v$ and let $p = X \to \lambda$. For each $\sigma \in S(X)$, $\mu(v, \sigma)$ must equal $f^p_{0\sigma}(\mu(v, a_1), \ldots, \mu(v, a_m))$, where $m$ is the arity of $f^p_{0\sigma}$ and $d^p_{0\sigma} = ((0, a_1), \ldots, (0, a_m))$.
3. For each interior node $v$ of $T$ whose immediate descendants $v_1, \ldots, v_n$ are labeled with symbols in $V$, let $v_0 = v$, and let $X_0, \ldots, X_n$ be the symbols labeling $v_0, \ldots, v_n$ respectively. Let $p = X_0 \to X_1 \ldots X_n$. For each attribute $\sigma \in S(X_0)$, $\mu(v, \sigma)$ must equal $f^p_{0\sigma}(\mu(v_{i_1}, a_1), \ldots, \mu(v_{i_m}, a_m))$, where $m$ is the arity of $f^p_{0\sigma}$ and $d^p_{0\sigma} = ((i_1, a_1), \ldots, (i_m, a_m))$. Similarly, for $1 \le k \le n$ and for each inherited attribute $\iota \in I(X_k)$, $\mu(v_k, \iota)$ must equal $f^p_{k\iota}(\mu(v_{i_1}, a_1), \ldots, \mu(v_{i_m}, a_m))$, where $m$ is the arity of $f^p_{k\iota}$ and $d^p_{k\iota} = ((i_1, a_1), \ldots, (i_m, a_m))$.

In other words, an $APT$ is an evaluation if and only if the values assigned to the attributes are consistent with the values of the semantic functions for those attributes.

Let $p = X_0 \to X_1 \ldots X_n$. Let $a$ be an attribute of $X_k$, where $1 \le k \le n$. The local closure $\bar{D}^p_{ka}$ of $D^p_{ka}$ is the smallest set such that
1. $D^p_{ka} \subseteq \bar{D}^p_{ka}$, and
2. if $(i, a') \in \bar{D}^p_{ka}$, then $D^p_{ia'} \subseteq \bar{D}^p_{ka}$.
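The local closure is a least fixed point and can be computed with a simple worklist. The Python sketch below assumes that the dependency sets of a single production are given as a dictionary keyed by attribute occurrences $(k, a)$, and that an occurrence with no semantic function in the production has an empty dependency set; both the representation and the attribute names are invented for the example.

```python
from typing import Dict, Set, Tuple

Occurrence = Tuple[int, str]  # (position of the symbol in the rule, attribute name)


def local_closure(deps: Dict[Occurrence, Set[Occurrence]],
                  target: Occurrence) -> Set[Occurrence]:
    """Return the smallest set containing deps[target] that is closed under deps."""
    closure: Set[Occurrence] = set()
    work = list(deps.get(target, ()))
    while work:
        occ = work.pop()
        if occ not in closure:
            closure.add(occ)
            # Condition 2: everything occ depends on also belongs to the closure.
            work.extend(deps.get(occ, ()))
    return closure


if __name__ == "__main__":
    # Hypothetical dependency sets for one rule X0 -> X1 X2.
    deps = {
        (1, "env"): {(0, "env")},
        (2, "env"): {(1, "env"), (1, "decls")},
        (0, "val"): {(2, "val")},
    }
    print(sorted(local_closure(deps, (2, "env"))))
```

Here the closure for `(2, "env")` picks up `(0, "env")` indirectly through `(1, "env")`, which is exactly what the second condition of the definition requires.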
An L-attributed grammar is an attribute grammar $AG$ such that for every rule $p = X_0 \to X_1 \ldots X_n$ of the underlying context-free grammar of $AG$
1. if $\sigma, \tau \in S(X_0)$, then if $(0, \tau) \in \bar{D}^p_{0\sigma}$, $(0, \sigma) \notin \bar{D}^p_{0\tau}$,
2. if $\iota \in I(X_k)$, $1 \le k \le n$, then for all $(i, a) \in \bar{D}^p_{k\iota}$, $i \le k$, and
3. if $\iota \in I(X_k)$, $1 \le k \le n$, then for all $(k, a) \in \bar{D}^p_{k\iota}$, $a \in I(X_k)$ and $(k, \iota) \notin \bar{D}^p_{ka}$.

These restrictions ensure that it is possible to evaluate the attributes of any parse tree in a single top-down left-to-right pass over that tree.
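The single top-down left-to-right pass can be sketched as a recursive tree walk. The Python example below is a simplified illustration, not the evaluator described in this dissertation: it assumes that each child's inherited attributes are computed from the parent's inherited attributes and the synthesized attributes of siblings to its left, which is a narrower discipline than the definition above allows. The declaration grammar, the `Node` class, and the rule table are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional, Tuple

Attrs = Dict[str, Any]


@dataclass
class Node:
    production: Optional[str]            # None for leaves (terminal symbols)
    children: List["Node"] = field(default_factory=list)
    syn: Attrs = field(default_factory=dict)   # synthesized attributes
    inh: Attrs = field(default_factory=dict)   # inherited attributes


# For each production: (inherited rules by child index, synthesized rule).
Rules = Dict[str, Tuple[Dict[int, Callable[[Attrs, List[Attrs]], Attrs]],
                        Callable[[Attrs, List[Attrs]], Attrs]]]

RULES: Rules = {
    "decl -> type idlist": (
        {1: lambda inh, left: {"type": left[0]["name"]}},
        lambda inh, kids: {"decls": kids[1]["decls"]},
    ),
    "idlist -> id idlist": (
        {1: lambda inh, left: {"type": inh["type"]}},
        lambda inh, kids: {"decls": [(kids[0]["name"], inh["type"])] + kids[1]["decls"]},
    ),
    "idlist -> id": (
        {},
        lambda inh, kids: {"decls": [(kids[0]["name"], inh["type"])]},
    ),
}


def evaluate(node: Node, inherited: Attrs, rules: Rules) -> Attrs:
    """Evaluate all attributes below node in one top-down, left-to-right pass."""
    node.inh = inherited
    if not node.children:
        return node.syn  # leaf attributes (e.g. the lexeme) are preset
    inh_rules, syn_rule = rules[node.production]
    child_syn: List[Attrs] = []
    for k, child in enumerate(node.children):
        # Only the parent's inherited attributes and left siblings are visible here.
        child_inh = inh_rules[k](inherited, child_syn) if k in inh_rules else {}
        child_syn.append(evaluate(child, child_inh, rules))
    node.syn = syn_rule(inherited, child_syn)
    return node.syn


def leaf(name: str) -> Node:
    return Node(None, syn={"name": name})


if __name__ == "__main__":
    tree = Node("decl -> type idlist", [
        leaf("int"),
        Node("idlist -> id idlist", [leaf("a"), Node("idlist -> id", [leaf("b")])]),
    ])
    print(evaluate(tree, {}, RULES)["decls"])
```

The type name flows from the left sibling down into the identifier list as an inherited attribute, and the list of declared names flows back up as a synthesized attribute, so no node is visited more than once.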