Compiler Fundamentals: Understanding the Compilation Process, Study notes of Computer Science

Notes on compiler fundamentals, including the role of the class projects, new tools such as JavaCC, managing large-scale projects, and advanced programming techniques. The notes cover what a compiler is, its components (lexical analyzer, parser, semantic analyzer, code generator, assembler, and linker), and why decomposition matters in software engineering. They also introduce lexical analysis, deterministic finite automata (DFAs), and formal languages in the context of compiler design.

Typology: Study notes

Pre 2010

Uploaded on 07/30/2009

Compilers
CS414-2008S-01
Compiler Basics & Lexical Analysis
David Galles
Department of Computer Science
University of San Francisco


01-0:

Syllabus
  Office Hours
  Course Text
  Prerequisites
  Test Dates & Testing Policies
  Projects
    Teams of up to 2
  Grading Policies
Questions?

01-3:

Notes on the Class
Projects are non-trivial
  Using new tools (JavaCC)
  Managing a large scale project
  Lots of complex classes & advanced programming techniques
START EARLY!
  Projects will take longer than you think (especially starting with the semantic analyzer project)
ASK QUESTIONS!

01-5:

What is a compiler?

[Figure: compiler pipeline, "More Accurate View"]
Source File -> Lexical Analyzer -> Token Stream -> Parser -> Abstract Syntax Tree -> Semantic Analyzer -> Assembly Tree Generator -> Abstract Assembly Tree -> Code Generator -> Assembly -> Assembler -> Relocatable Object Code -> Linker (with Libraries) -> Machine code

01-6:

What is a compiler?

[Figure: the compiler pipeline from Source File through Lexical Analyzer, Parser, Semantic Analyzer, Assembly Tree Generator, Code Generator, Assembler and Linker to Machine code, with the stages divided into a Front End and a Back End]

01-9:

Why Use Decomposition?

Software Engineering!
  Smaller units are easier to write, test and debug
Code Reuse
  Writing a suite of compilers (C, Fortran, C++, etc.) for a new architecture
  Create a new language: want compilers available for several platforms

01-11:

Lexical Analysis
Converting input file to stream of tokens

Input:

  void main() {
     print(4);
  }

Token stream:

  IDENTIFIER(void)
  IDENTIFIER(main)
  LEFT-PARENTHESIS
  RIGHT-PARENTHESIS
  LEFT-BRACE
  IDENTIFIER(print)
  LEFT-PARENTHESIS
  INTEGER-LITERAL(4)
  RIGHT-PARENTHESIS
  SEMICOLON
  RIGHT-BRACE
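The conversion from characters to tokens can be sketched in a few lines of C. This is only an illustrative sketch, not the course's lexer: the names `Token`, `TokenKind` and `next_token` are hypothetical, and the rules for identifiers (a letter or underscore followed by letters, digits or underscores) and integer literals (a run of digits) are assumptions. As in the slide's example, `void` is reported as a plain identifier rather than a keyword.

```c
#include <ctype.h>
#include <stddef.h>

/* Token kinds, named after the labels in the slide's example. */
typedef enum {
    TOK_IDENTIFIER, TOK_INTEGER_LITERAL,
    TOK_LEFT_PARENTHESIS, TOK_RIGHT_PARENTHESIS,
    TOK_LEFT_BRACE, TOK_RIGHT_BRACE,
    TOK_SEMICOLON, TOK_ERROR, TOK_EOF
} TokenKind;

typedef struct {
    TokenKind kind;
    char text[64];   /* lexeme; empty for end of input */
} Token;

/* Hypothetical helper: reads one token starting at *src and
   advances *src past it. Over-long lexemes are truncated. */
Token next_token(const char **src) {
    Token tok = { TOK_EOF, "" };
    const char *p = *src;
    size_t n = 0;

    while (isspace((unsigned char)*p)) p++;        /* skip whitespace */

    if (*p == '\0') {
        tok.kind = TOK_EOF;
    } else if (isalpha((unsigned char)*p) || *p == '_') {
        /* assumed identifier rule: letter/underscore, then alphanumerics */
        while (isalnum((unsigned char)*p) || *p == '_') {
            if (n < sizeof tok.text - 1) tok.text[n++] = *p;
            p++;
        }
        tok.kind = TOK_IDENTIFIER;
    } else if (isdigit((unsigned char)*p)) {
        /* assumed integer rule: one or more decimal digits */
        while (isdigit((unsigned char)*p)) {
            if (n < sizeof tok.text - 1) tok.text[n++] = *p;
            p++;
        }
        tok.kind = TOK_INTEGER_LITERAL;
    } else {
        switch (*p) {
        case '(': tok.kind = TOK_LEFT_PARENTHESIS;  break;
        case ')': tok.kind = TOK_RIGHT_PARENTHESIS; break;
        case '{': tok.kind = TOK_LEFT_BRACE;        break;
        case '}': tok.kind = TOK_RIGHT_BRACE;       break;
        case ';': tok.kind = TOK_SEMICOLON;         break;
        default:  tok.kind = TOK_ERROR;             break;
        }
        tok.text[n++] = *p++;                      /* single-character lexeme */
    }
    tok.text[n] = '\0';
    *src = p;
    return tok;
}
```

Calling `next_token` repeatedly on a pointer to the input text yields the tokens one at a time until `TOK_EOF` is returned.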

01-12:

Lexical Analysis

Brute-Force Approach

Lots of nested if statements

  if ((c = nextchar()) == 'P') {
    if ((c = nextchar()) == 'R') {
      if ((c = nextchar()) == 'O') {
        if ((c = nextchar()) == 'G') {
          /* Code to handle the rest of either PROGRAM or any
             identifier that starts with PROG */
        } else if (c == 'C') {
          /* Code to handle the rest of either PROCEDURE or any
             identifier that starts with PROC */
        }
  ...

01-14:

Deterministic Finite Automata
  Set of states
  Initial State
  Final State(s)
  Transitions

[Figures: DFA for else, end, identifiers; combined DFA]

01-15:

DFAs and Lexical Analyzers
  Given a DFA, it is easy to create C code to implement it
  DFAs are easier to understand than C code
    Visual, almost like structure charts
  ... However, creating a DFA for a complete lexical analyzer is still complex
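To illustrate how a DFA can be turned into C code, here is a minimal sketch of a combined DFA for `else`, `end` and identifiers. The state names, the `step` and `classify` functions, and the identifier rule (a letter followed by letters or digits) are assumptions made for this example; the DFA drawn on the original slides may differ in detail.

```c
#include <ctype.h>

typedef enum { T_ELSE, T_END, T_IDENTIFIER, T_NONE } TokenType;

/* One state per prefix of "else" / "end", plus a generic identifier
   state and a dead state for input no token can match. */
enum { S_START, S_E, S_EL, S_ELS, S_ELSE, S_EN, S_END, S_ID, S_DEAD };

/* Transition function: the next state for (state, c). */
static int step(int state, char c) {
    switch (state) {
    case S_START:
        if (c == 'e') return S_E;
        return isalpha((unsigned char)c) ? S_ID : S_DEAD;
    case S_E:
        if (c == 'l') return S_EL;
        if (c == 'n') return S_EN;
        break;
    case S_EL:  if (c == 's') return S_ELS;  break;
    case S_ELS: if (c == 'e') return S_ELSE; break;
    case S_EN:  if (c == 'd') return S_END;  break;
    case S_ELSE: case S_END: case S_ID:
        break;
    default:
        return S_DEAD;                 /* the dead state is a trap */
    }
    /* Any other letter or digit falls back to "ordinary identifier". */
    return isalnum((unsigned char)c) ? S_ID : S_DEAD;
}

/* Run the DFA over the whole string and report which final state,
   if any, it stops in. */
TokenType classify(const char *s) {
    int state = S_START;
    for (; *s != '\0'; s++)
        state = step(state, *s);
    switch (state) {
    case S_ELSE: return T_ELSE;
    case S_END:  return T_END;
    case S_E: case S_EL: case S_ELS: case S_EN: case S_ID:
        return T_IDENTIFIER;           /* proper prefixes such as "el" are identifiers */
    default:
        return T_NONE;
    }
}
```

With these assumptions, `classify("else")` yields `T_ELSE`, `classify("ending")` yields `T_IDENTIFIER`, and `classify("1x")` yields `T_NONE`. Even for three token types the transition function needs one case per state, which hints at why a hand-built DFA for a complete lexical analyzer becomes complex.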

01-17:

Formal Languages
  Alphabet Σ: Set of all possible symbols (characters) in the input file
    Think of Σ as the set of symbols on the keyboard
  String w: Sequence of symbols from an alphabet
  String length |w|: Number of characters in a string: |car| = 3, |abba| = 4
  Empty String ε: String of length 0: |ε| = 0
  Formal Language: Set of strings over an alphabet
  Formal Language ≠ Programming language. A Formal Language is only a set of strings.
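Because a formal language is only a set of strings, one way to picture it in code is as a membership test. The sketch below is purely illustrative: the alphabet {a, b}, the language "strings over {a, b} of even length", and the function names are all assumptions chosen for the example, not taken from the slides.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical alphabet: Sigma = {a, b}. */
static const char SIGMA[] = "ab";

/* |w|: the number of characters in w. */
size_t string_length(const char *w) {
    return strlen(w);
}

/* True when every symbol of w comes from SIGMA.
   The empty string qualifies trivially. */
int is_string_over_sigma(const char *w) {
    return strspn(w, SIGMA) == strlen(w);
}

/* Hypothetical language L = { w over {a, b} : |w| is even }.
   The language is nothing more than this yes/no membership test. */
int in_language(const char *w) {
    return is_string_over_sigma(w) && string_length(w) % 2 == 0;
}
```

Here `string_length("car")` is 3 and `string_length("abba")` is 4, matching the examples above; `in_language("abba")` and `in_language("")` hold, while `in_language("aba")` does not.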

01-18:

Formal Languages

Example formal languages:

Integers {0, 23, 44, ...}

Floating Point Numbers {3.4, ...}

Identifiers {foo, bar,