cs480(Prasad) L2Syntax

Computer Language Engineering
• How to give instructions to a computer?
  – Programming Languages.
• How to make the computer carry out the instructions efficiently?
  – Compilers.

Anatomy of a Compiler
  Program (character stream)
    → Lexical Analyzer (Scanner)
  Token Stream
    → Syntax Analyzer (Parser)
  Parse Tree
    → Intermediate Code Generator
  Intermediate Representation
    → Intermediate Code Optimizer
  Optimized Intermediate Representation
    → Code Generator
  Assembly code

What is a Lexical Analyzer?
  Source program text → Tokens
• Examples of tokens
  • Operators: = + - > ( { := == <>
  • Keywords: if while for int double
  • Numeric literals: 43 6.035 -3.6e10 0x13F3A
  • Character literals: 'a' '~' '\''
  • String literals: "3.142" "Fall" "\"\" = empty"
• Examples of non-tokens
  • White space: space (' '), tab ('\t'), end-of-line ('\n')
  • Comments: /*this is not a token*/

Lexical Analyzer in Action
• Partition the input program text into a sequence of tokens, attaching the corresponding attributes.
  – E.g., the C lexeme "015" yields token NUM with attribute 13 (a leading 0 marks an octal literal).
• Eliminate white space and comments.

  Input characters:  for var1 = 10 var1 <=
  Output tokens:     for  ID("var1")  eq_op  Num(10)  ID("var1")  leq_op

Syntax and Semantics of a Programming Language
• Syntax
  – What is the structure of the program?
  – Textual representation.
  • Formally defined using context-free grammars (Backus-Naur Form)
• Semantics
  – What is the meaning of a program?
  • Harder to give a mathematical definition.
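The scanning step from "Lexical Analyzer in Action" can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not the course's scanner: the class name LexerSketch, the method tokenize, and the choice to represent tokens as plain strings are all hypothetical, and only the lexemes that appear in the slide example (keywords, identifiers, integer literals, = and <=) are handled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical hand-written scanner; names and token spellings are assumptions.
final class LexerSketch {
    // Keywords taken from the slide's "Keywords" example list.
    private static final List<String> KEYWORDS =
            Arrays.asList("if", "while", "for", "int", "double");

    static List<String> tokenize(String src) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (Character.isWhitespace(c)) {          // white space is not a token
                i++;
            } else if (src.startsWith("/*", i)) {     // comments are not tokens
                int end = src.indexOf("*/", i + 2);
                i = (end < 0) ? src.length() : end + 2;
            } else if (Character.isLetter(c)) {       // keyword or identifier
                int start = i;
                while (i < src.length() && Character.isLetterOrDigit(src.charAt(i))) i++;
                String word = src.substring(start, i);
                tokens.add(KEYWORDS.contains(word) ? word : "ID(\"" + word + "\")");
            } else if (Character.isDigit(c)) {        // integer literal
                int start = i;
                while (i < src.length() && Character.isDigit(src.charAt(i))) i++;
                String lexeme = src.substring(start, i);
                // C convention: a leading 0 marks an octal literal, so "015" has value 13.
                int radix = (lexeme.length() > 1 && lexeme.charAt(0) == '0') ? 8 : 10;
                tokens.add("Num(" + Integer.parseInt(lexeme, radix) + ")");
            } else if (src.startsWith("<=", i)) {     // try the longer operator first
                tokens.add("leq_op");
                i += 2;
            } else if (c == '=') {
                tokens.add("eq_op");
                i++;
            } else {
                throw new IllegalArgumentException(
                        "Unexpected character '" + c + "' at offset " + i);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("for var1 = 10 var1 <="));
        System.out.println(tokenize("015 /*this is not a token*/"));
    }
}
```

Checking "<=" before "=" is the usual longest-match rule: the scanner prefers the longest lexeme that forms a valid token. A production scanner would normally be generated from regular expressions rather than written by hand like this.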

Input to and output of a parser
  Input: - (123.3 + 23.6)
  Token stream: minus_op left_paren_op num(123.3) plus_op num(23.6) right_paren_op
    → Syntax Analyzer (Parser)
  Parse tree: [figure: tree over the leaves - ( 123.3 + 23.6 )]

Example: A CFG for expressions
• Simple arithmetic expressions with + and *
  – 8.2 + 35.6
  – 8.32 + 86 * 45.3
  – (6.001 + 6.004) * (6.035 * -(6.042 + 6.046))
• Terminals (or tokens)
  – num for all the numbers
  – plus_op ('+'), minus_op ('-'), times_op ('*'), left_paren_op ('('), right_paren_op (')')
• What is the grammar for all possible expressions?

Example: A CFG for expressions
  <expr> → <expr> <op> <expr>
  <expr> → '(' <expr> ')'
  <expr> → '-' <expr>
  <expr> → num
  <op> → '+'
  <op> → '*'

Removing Ambiguity
• Sometimes rewriting a grammar with additional nonterminals that reflect operator precedence eliminates the ambiguity.
  – * binds more tightly than +.
  – && has precedence over ||.

Eliminating Ambiguity
  Ambiguous grammar:
  <expr> → <expr> <op> <expr>
  <expr> → '(' <expr> ')'
  <expr> → num
  <op> → '+'
  <op> → '*'

  Unambiguous grammar:
  <expr> → <term> '+' <expr>
  <expr> → <term>
  <term> → <unit> '*' <term>
  <term> → <unit>
  <unit> → num
  <unit> → '(' <expr> ')'

Extended BNF
  Ambiguous grammar:
  <expr> → <expr> ( <op> <expr> ) *
  <expr> → '(' <expr> ')'
  <expr> → num
  <op> → '+' | '*'

  Unambiguous grammar:
  <expr> → <term> ( '+' <term> ) *
  <term> → <unit> ( '*' <unit> ) *
  <unit> → num | '(' <expr> ')'

Expression Parser Fragment
  Node expr() {
      // PRE:  Expects lookahead token.
      // POST: Consumes an expression
      //       and updates the lookahead token.
      Node temp = term();
      // Each further '+' wraps the tree built so far,
      // so the resulting tree is left-associative.
      while (inTok.ttype == '+') {
          inTok.nextToken();
          Node temp1 = term();
          temp = new OpNode(temp, '+', temp1);
      }
      return temp;
  }

Arithmetic Expressions
  <expr> → <expr> '+' <expr>
         | <expr> '*' <expr>
         | '(' <expr> ')'
         | <variable>
         | <constant>
  <variable> → 'x' | 'y' | 'z'
  <constant> → '0' | '1' | '2'
  (ambiguous)

Resolving Ambiguity in Expressions
• Different operators: precedence relation
• Same operator: associativity relation

  5-2-3
    (5-2)-3 = 0
    5-(2-3) = 6
    Left associative: ((5-2)-3)

  2**3**4
    (2**3)**4 = 2^12
    2**(3**4) = 2^81
    Right associative: (2**(3**4))

C++ Operator Precedence and Associativity
(Higher level binds more tightly; L = left associative, R = right associative.)

  Level  Operator              Function
  17R    ::                    global scope (unary)
  17L    ::                    class scope (binary)
  16L    -> , .                member selectors
  16L    []                    array index
  16L    ()                    function call
  16L    ()                    type construction
  15R    sizeof                size in bytes
  15R    ++ , --               increment, decrement
  15R    ~                     bitwise NOT
  15R    !                     logical NOT
  15R    + , -                 unary plus, minus
  15R    * , &                 dereference, address-of
  15R    ()                    type conversion (cast)
  15R    new , delete          free store management
  14L    ->* , .*              member pointer select
  13L    * , / , %             multiplicative operators
  12L    + , -                 arithmetic operators
  11L    << , >>               bitwise shift
  10L    < , <= , > , >=       relational operators
  9L     == , !=               equality, inequality
  8L     &                     bitwise AND
  7L     ^                     bitwise XOR
  6L     |                     bitwise OR
  5L     &&                    logical AND
  4L     ||                    logical OR
  3R     ?:                    arithmetic if
  2R     = , *= , /= , %= , += , -= , <<= , >>= , &= , |= , ^=
                               assignment operators
  1L     ,                     comma operator
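
The precedence and associativity rules above can be tried out with a small recursive-descent evaluator in the style of the expression parser fragment. The following is a minimal sketch under stated assumptions, not the course's parser: the class name PrecedenceSketch and the method eval are hypothetical, it computes a double directly instead of building Node objects, and it adds a left-associative '-' so that the 5-2-3 example can be checked. It uses java.io.StreamTokenizer, whose ttype field and nextToken() method resemble the inTok used in the fragment.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical evaluator for:
//   <expr> -> <term> ( ('+' | '-') <term> )*
//   <term> -> <unit> ( '*' <unit> )*
//   <unit> -> num | '(' <expr> ')'
final class PrecedenceSketch {
    private final StreamTokenizer inTok;

    private PrecedenceSketch(String src) throws IOException {
        inTok = new StreamTokenizer(new StringReader(src));
        // By default '-' is treated as part of a number; make it an operator token.
        inTok.ordinaryChar('-');
        inTok.nextToken();                    // load the first lookahead token
    }

    static double eval(String src) {
        try {
            PrecedenceSketch p = new PrecedenceSketch(src);
            double value = p.expr();
            if (p.inTok.ttype != StreamTokenizer.TT_EOF) {
                throw new IllegalArgumentException("Unexpected input after expression");
            }
            return value;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Lowest precedence level: '+' and '-', grouped left to right by the loop.
    private double expr() throws IOException {
        double value = term();
        while (inTok.ttype == '+' || inTok.ttype == '-') {
            int op = inTok.ttype;
            inTok.nextToken();
            double rhs = term();
            value = (op == '+') ? value + rhs : value - rhs;
        }
        return value;
    }

    // Higher precedence level: '*' is handled one nonterminal deeper than '+'.
    private double term() throws IOException {
        double value = unit();
        while (inTok.ttype == '*') {
            inTok.nextToken();
            value *= unit();
        }
        return value;
    }

    private double unit() throws IOException {
        if (inTok.ttype == StreamTokenizer.TT_NUMBER) {
            double value = inTok.nval;
            inTok.nextToken();
            return value;
        }
        if (inTok.ttype == '(') {
            inTok.nextToken();
            double value = expr();
            if (inTok.ttype != ')') {
                throw new IllegalArgumentException("Expected ')'");
            }
            inTok.nextToken();
            return value;
        }
        throw new IllegalArgumentException("Expected a number or '('");
    }

    public static void main(String[] args) {
        System.out.println(eval("2 + 3 * 4"));
        System.out.println(eval("(2 + 3) * 4"));
        System.out.println(eval("5 - 2 - 3"));
    }
}
```

Precedence comes from the nesting of the grammar (an operator handled deeper in the call chain binds more tightly), while left associativity comes from the while loop that folds each new operand into the value accumulated so far. A right-associative operator such as ** would instead recurse on its right operand.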