Data Structures, Recursion, and Java Design, Slides of Programming Languages

Programming Languages and Techniques. Lecture Notes for CIS 120. Steve Zdancewic. Stephanie Weirich. University of Pennsylvania.

Programming Languages and Techniques
Lecture Notes for CIS 120
Steve Zdancewic
Stephanie Weirich
University of Pennsylvania
October 7, 2019


  • 1 Overview and Program Design
    • 1.1 Introduction and Prerequisites
    • 1.2 Course Philosophy
    • 1.3 How the different parts of CIS 120 fit together
    • 1.4 Course History
    • 1.5 Program Design
  • 2 Introductory OCaml
    • 2.1 OCaml in CIS
    • 2.2 Primitive Types and Expressions
    • 2.3 Value-oriented programming
    • 2.4 let declarations
    • 2.5 Local let declarations
    • 2.6 Function Declarations
    • 2.7 Types
    • 2.8 Failwith
    • 2.9 Commands
    • 2.10 A complete example
    • 2.11 Notes
  • 3 Lists and Recursion
    • 3.1 Lists
    • 3.2 Recursion
  • 4 Tuples and Nested Patterns
    • 4.1 Tuples
    • 4.2 Nested patterns
    • 4.3 Exhaustiveness
    • 4.4 Wildcard (underscore) patterns
    • 4.5 Examples
  • 5 User-defined Datatypes
    • 5.1 Atomic datatypes: enumerations
    • 5.2 Datatypes that carry more information
    • 5.3 Type abbreviations
    • 5.4 Recursive types: lists
  • 6 Binary Trees
  • 7 Binary Search Trees
    • 7.1 Creating Binary Search Trees
  • 8 Generic Functions and Datatypes
    • 8.1 User-defined generic datatypes
    • 8.2 Why use generics?
  • 9 First-class Functions
    • 9.1 Partial Application and Anonymous Functions
    • 9.2 List transformation
    • 9.3 List fold
  • 10 Modularity and Abstraction
    • 10.1 A motivating example: finite sets
    • 10.2 Abstract types and modularity
    • 10.3 Another example: Finite Maps
  • 11 Partial Functions: option types
  • 12 Unit and Sequencing Commands
    • 12.1 The use of ‘ ; ’
  • 13 Records of Named Fields
    • 13.1 Immutable Records
  • 14 Mutable State and Aliasing
    • 14.1 Mutable Records
    • 14.2 Aliasing: The Blessing and Curse of Mutable State
  • 15 The Abstract Stack Machine
    • 15.1 Parts of the ASM
    • 15.2 Values and References to the Heap
    • 15.3 Simplification in the ASM
    • 15.4 Reference Equality
  • 16 Linked Structures: Queues
    • 16.1 Representing Queues
    • 16.2 The Queue Invariants
    • 16.3 Implementing the basic Queue operations
    • 16.4 Iteration and Tail Calls
    • 16.5 Loop-the-loop: Examples of Iteration
    • 16.6 Infinite Loops
  • 17 Local State
    • 17.1 Closures
    • 17.2 Objects
    • 17.3 The generic 'a ref type
    • 17.4 Reference (==) vs. Structural Equality (=)
  • 18 Wrapping up OCaml: Designing a GUI Library
    • 18.1 Taking Stock
    • 18.2 The Paint Application
    • 18.3 OCaml’s Graphics Library
    • 18.4 Design of the GUI Library
    • 18.5 Localizing Coordinates
    • 18.6 Simple Widgets & Layout
    • 18.7 The widget hierarchy and the run function
    • 18.8 The Event Loop
    • 18.9 GUI Events
    • 18.10 Event Handlers
    • 18.11 Stateful Widgets
    • 18.12 Listeners and Notifiers
    • 18.13 Buttons (at last!)
    • 18.14 Building a GUI App: Lightswitch
  • 19 Transition to Java
    • 19.1 Farewell to OCaml
    • 19.2 Three programming paradigms
    • 19.3 Functional Programming in OCaml
    • 19.4 Object-oriented programming in Java
    • 19.5 Imperative programming
    • 19.6 Types and Interfaces
  • 20 Connecting OCaml to Java
    • 20.1 Core Java
    • 20.2 Static vs. Dynamic Methods
  • 21 Arrays
    • 21.1 Arrays
  • 22 The Java ASM
    • 22.1 Differences between OCaml and Java Abstract Stack Machines
  • 23 Subtyping, Extension and Inheritance
    • 23.1 Interface Recap
    • 23.2 Subtyping
    • 23.3 Multiple Interfaces
    • 23.4 Interface Extension
    • 23.5 Inheritance
    • 23.6 Object
    • 23.7 Static Types vs. Dynamic Classes
  • 24 The Java ASM and dynamic methods
    • 24.1 Refinements to the Abstract Stack Machine
    • 24.2 The revised ASM in action
  • 25 Generics, Collections, and Iteration
    • 25.1 Polymorphism and Generics
    • 25.2 Subtyping and Generics
    • 25.3 The Java Collections Framework
    • 25.4 Iterating over Collections
  • 26 Overriding and Equality
    • 26.1 Method Overriding and the Java ASM
    • 26.2 Overriding and Equality
      • 26.2.1 When to override equals
      • 26.2.2 How to override equals
    • 26.3 Equals and subtyping
      • 26.3.1 Restoring symmetry
  • 27 Exceptions
    • 27.1 Ways to handle failure
    • 27.2 Exceptions in Java
    • 27.3 Exceptions and the abstract stack machine
    • 27.4 Catching multiple exceptions
    • 27.5 Finally
    • 27.6 The Exception Class Hierarchy
    • 27.7 Checked exceptions
    • 27.8 Undeclared exceptions
    • 27.9 Good style for exceptions
  • 28 IO
    • 28.1 Working with (Binary) Files
    • 28.2 PrintStream
    • 28.3 Reading text
    • 28.4 Writing text
    • 28.5 Histogram demo
  • 29 Swing: GUI programming in Java
    • 29.1 Drawing with Swing
    • 29.2 User Interaction
    • 29.3 Action Listeners
    • 29.4 Timer
  • 30 Swing: Layout and Drawing
    • 30.1 Layout
    • 30.2 An extended example
  • 31 Swing: Interaction and Paint Demo
    • 31.1 Version A: Basic structure
    • 31.2 Version B: Drawing Modes
    • 31.3 Version C: Basic Mouse Interaction
    • 31.4 Version D: Drag and Drop
    • 31.5 Version E: Keyboard Interaction
    • 31.6 Interlude: Datatypes and enums vs. objects
    • 31.7 Version F: OO-based Refactoring
  • 32 Java Design Exercise: Resizable Arrays
    • 32.1 Resizable Arrays
  • 33 Encapsulation and Queues
    • 33.1 Queues in ML
    • 33.2 Queues in Java
    • 33.3 Implementing Java Queues


10 Overview and Program Design

...programming components and those of a more theoretical nature. The science of computing can be thought of as a modern-day version of logic and critical thinking. However, this is a more concrete, more potent form of logic: logic grounded in computation.

Like any other skill, learning to program takes plenty of practice to master. The tools involved (languages, compilers, IDEs, libraries, and frameworks) are large and complex. Furthermore, many of these tools are tuned for the demands of rigorous software engineering, including extensibility, efficiency and security.

The general philosophy for introductory computer science at Penn is to develop programming skills in stages. We start with basic skills of “algorithmic thinking” in our intro CIS 110 course, though many students enter Penn already with this ability through exposure to AP Computer Science classes in high school, through summer camps and courses on programming, or through independent study. At this stage, students can write short programs, but may have less fluency with putting them together to form larger applications. The first part of CIS 120 continues this process by developing design and analysis skills in the context of larger and more challenging problems. In particular, we teach a systematic process for program design, a rigorous model for thinking about computation, and a rich vocabulary of computational structures. The last stage (the second part of CIS 120 and beyond) is to translate those design skills to the context of industrial-strength tools and design processes.

This philosophy influences our choice of tools. To facilitate practice, we prefer mature platforms that have demonstrated their utility and stability. For the first part of CIS 120, where the goal is to develop design and analysis skills, we use the OCaml programming language. In the second half of the semester, we switch to the Java language. This dual-language approach allows us to teach program design in a relatively simple environment, make comparisons between different programming paradigms, and motivate sophisticated features such as objects and classes.

OCaml is the most widely used dialect of the ML family of languages. Such languages are not new: the first version of ML was designed by Robin Milner in the 1970s, and the first version of OCaml was released in 1996. The OCaml implementation is a free, open source project developed and maintained by researchers at INRIA, the French national laboratory for computing research. Although OCaml has its origins as a research language, it has also attracted significant attention in industry. For example, Microsoft’s F# language is strongly inspired by OCaml and other ML variants. Scala and Haskell, two other strongly typed functional programming languages, also share many common traits with OCaml.

Java is currently one of the most widely used languages in the software industry and is representative of object-oriented software development. It was originally developed by James Gosling and others at Sun Microsystems in the early nineties and first released in 1995. Like OCaml, Java was released as free, open source software and all of the core code is still available for free. Popular languages related to Java include C# and, to a lesser extent, C++.

Goals

There are four main, interdependent goals for CIS 120.

Increased independence in programming. While we expect some familiarity with programming, we don’t expect entering students to be full-blown programmers. The first goal of 120 is to extend their programming skills, going from the ability to write programs that are 10s of lines long to programs that are 1000s of lines long. Furthermore, as the semester progresses, the assignments become less constrained, moving from the application of simple recipes to independent problem decomposition.

Fluency in program design. The ability to write longer programs is founded on the process of program design. We teach necessary skills, such as test-driven development, interface specification, modular decomposition, and multiple programming idioms, that extend individual problem-solving skills to system development.

Firm grasp of fundamental principles. CIS 120 is not just an introductory programming course; it is primarily an introductory computer science course. It covers fundamental principles of computing, such as recursion, lists and trees, interfaces, semantic models, mutable data structures, references, invariants, objects, and types.

Fluency in core Java. We aim to provide CIS 120 students with sufficient core skills in a popular programming language to enable further development in a variety of contexts, including advanced CIS core courses and electives, summer internships, start-up companies, contributions to open-source projects, and individual exploration. The Java development environment, including the availability of libraries, tools, communities, and job opportunities, satisfies this requirement. CIS 120 includes enough detail about the core Java language and common libraries for this purpose, though it is not an exhaustive overview of Java or object-oriented software engineering. There are many details about the Java language that CIS 120 does not cover; the goal is to provide enough information for future self-study.


mathematical programming model. Programs can be thought of in terms of “transformations” of data instead of “modifications,” leading to simpler, more explicit interfaces.

1.3 How the different parts of CIS 120 fit together

Homework assignments provide the core learning experience for CIS 120. Programming is a skill, one that is developed through practice. The homework assignments themselves include both “etudes”, which cover basic concepts directly, and “applications”, which develop those concepts in the context of a larger purpose. The homework assignments take time.

Lectures. Lectures serve many roles: they introduce, motivate, and contextualize concepts; they demonstrate code development; they personalize learning; and they moderate the speed of information acquisition.

To augment your learning, we make lecture slides, demonstration code and lecture notes available on the course website. Given the availability of these resources, why come to class at all?

  • To see code development in action. Many lectures include live demonstrations of code design, and only the resulting code is posted on the website. That means the thought process of code development is lost if you only read the posted code: in class, your instructors will show some typical wrong turns, discuss various design trade-offs and demonstrate debugging strategies. Things will not always go according to plan, and observing what to do when this happens is also a valuable lesson.
  • To interact with the new material as it is presented, by asking questions. Sometimes asking a question right at the beginning can save a lot of later confusion. You also get to hear the questions that your classmates have about the new material. Sometimes it is difficult to realize that you don’t fully understand something until someone else raises a subtle point.
  • To regulate the timing of information flow for difficult concepts. Sure, you can read the lecture notes in less than fifty minutes. However, sometimes slowing down, working through examples, and thinking deeply is required to internalize a topic. You may have more patience for this during lecture than while reading lecture notes on your own.
  • For completeness. We cannot promise that everything covered in the lectures will appear in the lecture notes. Instructors are only human, with limited time to prepare lecture notes, particularly at the end of the semester.


Labs. The labs (or “recitations”) provide a small-group setting for coding practice and individual attention from the course staff. The lab grade comes primarily from lab participation, because the lab is there to get you to write code in a low-stress environment. In this environment, we use pair programming, teaming you up with your classmates to find the solutions to the lab problems together. The purpose of the teams is not to divide the work; in fact, we expect that in many cases it will take longer to complete the lab work as a team than by doing it yourself! The benefit of teamwork lies in discussing the entire exercise with your partner: often you will find that you and your partner have different ideas about how to solve the problem and find different aspects difficult.

Furthermore, labs are another avenue for coding practice, adding to the quantity of code that you write during the semester. The following parable illustrates the value of this:

The ceramics teacher announced he was dividing his class into two groups. All those on the left side of the studio would be graded solely on the quantity of work they produced, all those on the right graded solely on its quality. His procedure was simple: on the final day of class he would weigh the work of the “quantity” group: 50 pounds of pots rated an A, 40 pounds a B, and so on. Those being graded on “quality”, however, needed to produce only one pot — albeit a perfect one — to get an A. Well, come grading time and a curious fact emerged: the works of highest quality were all produced by the group being graded for quantity! It seems that while the “quantity” group was busily churning out piles of work — and learning from their mistakes — the “quality” group had sat theorizing about perfection, and in the end had little more to show for their efforts than grandiose theories and a pile of dead clay.

David Bayles and Ted Orland, from “Art and Fear” [1].

Exams. The exams provide the fundamental assessment of course concepts, including both knowledge and application. Some students have difficulty seeing how pencil-and-paper problems relate to writing computer programs. Indeed, when completing homework assignments, powerful tools such as IDEs, type checkers, compilers, top-levels, and online documentation are available. None of these may be used in exams. Rather, the purpose of exams is to assess both your understanding of these tools and, more importantly, the way you think about programming problems.


We will demonstrate this process by considering the following design problem:

Imagine an owner of a movie theater who wants to know how much he should charge for tickets. The more he charges, the fewer people can afford tickets. After some experiments, the owner has determined a precise relationship between the price of a ticket and average attendance. At a price of $5.00 per ticket, 120 people attend a performance. Decreasing the price by a dime ($.10) increases attendance by 15. However, the increased attendance also comes at an increased cost: Each attendee costs another four cents ($0.04), on top of the fixed per-performance cost of $180. The owner would like to be able to calculate, for any given ticket price, exactly how much profit he will make.

We will develop a solution to this problem by following the design recipe in the context of the OCaml programming language. In the process, we’ll introduce OCaml by example. In the next chapter, we give a more systematic overview of its syntax and semantics.

Step 1: Understand the problem. In the scenario above, there are five relevant concepts: the ticket price, (number of) attendees, revenue, cost and profit. Among these entities, we can define several relationships. From basic economics we know the basic relationships between profit, cost and revenue. In other words, we have

profit = revenue − cost

and

revenue = price × attendees

Also, the scenario tells us how to compute the cost of a performance

cost = $180 + attendees × $0.04

but does not directly specify how the ticket price determines the number of attendees. However, because revenue and cost (and profit, indirectly) depend on the number of attendees, they are also determined by the ticket price.

Our goal is to determine the precise relationship between ticket price and profit. In programming terms, we would like to define a function that, given the ticket price, calculates the expected profit.


Step 2: Formalize the Interface. Most of the relevant concepts (cost, ticket price, revenue and profit) are dollar amounts. That raises a design choice about how to represent money. Like most programming languages, OCaml can calculate with integer and floating point values, and floating point looks attractive for this problem, as we need to represent fractional dollars. However, the binary representation of floating point values makes them a poor choice for money, since some numeric values, such as 0.1, cannot be represented exactly, leading to rounding errors. (This “feature” is not unique to OCaml: try calculating 0.1 + 0.1 + 0.1 in your favorite programming language.) So let’s represent money in cents and use integer arithmetic for calculations.

Our goal is to define a function that computes the profit given the ticket price, so let us begin by writing down a skeleton for this function (let’s call it profit) and noting that it takes a single input, called price. Note that the first line of this definition uses type annotations to enforce that the input and output of the function are both integers.^2

let profit (price : int) : int = ...
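The rounding concern raised in Step 2 can be checked directly. The following is a minimal sketch, assuming a standard OCaml toolchain with IEEE 754 double-precision floats; the names `float_sum`, `floats_agree`, `cent_sum`, and `cents_agree` are illustrative and do not come from the notes.

```ocaml
(* Sketch only: all names here are hypothetical, chosen for illustration. *)

(* Three dimes added as floating point dollars. *)
let float_sum : float = 0.1 +. 0.1 +. 0.1

(* Structural comparison against the literal 0.3: false with IEEE 754
   doubles, because neither 0.1 nor 0.3 is exactly representable. *)
let floats_agree : bool = (float_sum = 0.3)

(* The same three dimes as integer cents. *)
let cent_sum : int = 10 + 10 + 10

(* Integer arithmetic is exact, so this comparison is true. *)
let cents_agree : bool = (cent_sum = 30)

let () =
  Printf.printf "float sum = %.17g, equals 0.3? %b\n" float_sum floats_agree;
  Printf.printf "cent sum  = %d, equals 30? %b\n" cent_sum cents_agree
```

Printing with 17 significant digits makes the accumulated error visible, which is one way to see why the notes settle on integer cents.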

Step 3: Write test cases. The next step is to write test cases. Writing test cases before writing any of the interesting code (the fundamental rule of test-driven program development) has several benefits. First, it ensures that you understand the problem: if you cannot determine the answer for one specific case, you will find it difficult to solve the more general problem. Thinking about tests also influences the code that you will write later. In particular, thinking about the behavior of the program on a range of inputs will help ensure that the implementation does the right thing for each of them. Finally, having test cases around is a way of “futureproofing” your code. It allows you to make changes to the code later and automatically check that they have not broken the existing functionality.

In the situation at hand, the informal specification suggests a couple of specific test cases: when the ticket price is either $5.00 or $4.90 (and the number of attendees is accordingly either 120 or 135). We can use OCaml itself to help compute what the expected values of these test cases should be. The OCaml let-expression gives a name to values that we compute, and we can use these values to compute others with the let-in expression form.

^2 OCaml will let you omit these type annotations, but including them is mandatory for CIS 120. Using type annotations is good documentation; they also improve the error messages you get from the compiler. When you get a type error message, the first thing you should do is check that your type annotations are correct.
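As a rough illustration of the let-in form described in Step 3, the expected profits for the two suggested test cases can be worked out as follows. This is a sketch, not the authors' code: the names `expected_profit_500` and `expected_profit_490` are hypothetical, and all amounts are in cents, following the representation chosen in Step 2.

```ocaml
(* Sketch only: hypothetical names, all money amounts in cents. *)

(* Expected profit at a ticket price of $5.00, where 120 people attend. *)
let expected_profit_500 : int =
  let price = 500 in
  let attendees = 120 in
  let revenue = price * attendees in      (* 500 * 120 = 60000 *)
  let cost = 18000 + 4 * attendees in     (* 18000 + 480 = 18480 *)
  revenue - cost

(* Expected profit at a ticket price of $4.90, where 135 people attend. *)
let expected_profit_490 : int =
  let price = 490 in
  let attendees = 135 in
  let revenue = price * attendees in      (* 490 * 135 = 66150 *)
  let cost = 18000 + 4 * attendees in     (* 18000 + 540 = 18540 *)
  revenue - cost

let () =
  Printf.printf "profit at $5.00: %d cents\n" expected_profit_500;
  Printf.printf "profit at $4.90: %d cents\n" expected_profit_490
```

Under these assumptions the two values come out to 41520 and 47610 cents, which could then serve as the expected results in test cases for profit.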


let profit (price : int) : int = (revenue price) - (cost price)

Next, we can fill these in, in terms of another function attendees that we will write in a minute.

let attendees (price : int) : int = ...

let revenue (price : int) : int = price * (attendees price)

let cost (price : int) : int = 18000 + 4 * (attendees price)

For attendees, we can apply the design recipe again. We have the same concepts as before, and the interface for attendees is determined by the code above. Furthermore, we can define the test cases for attendees from the problem statement.

let test () : bool = (attendees 500) = 120
;; run_test "atts. at $5.00" test

let test () : bool = (attendees 490) = 135
;; run_test "atts. at $4.90" test

To finish implementing attendees, we make the assumption that there is a linear relationship between the ticket price and the number of attendees. We can graph this relationship by drawing a line given the two points specified by the test cases in the problem statement.


[Figure: Attendees vs. Ticket Price. Vertical axis: number of attendees, 0 to 160. Horizontal axis: ticket price, from $4.75 to about $5.10.]

We can determine what the function should be with a little high-school algebra. The equation for a line

y = mx + b

says that the number of attendees y is equal to the slope of the line m, times the ticket price x, plus some constant value b. Furthermore, we can determine the slope of a line given two points:

m = (difference in attendance) / (difference in price)

Once we know the slope, we can determine the constant b by solving the equation for the line for b and plugging in the numbers from either test case. Therefore b = 120 − (−15/10) × 500 = 870.

Putting these values together gives us a mathematical formula specifying attendees in terms of the ticket price.

attendees = (−15/10) × price + 870
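For readers who want the intermediate arithmetic, the slope and intercept can be worked out from the two test points, with prices in cents: (500, 120) and (490, 135). The last line is a sanity check against the second point.

```latex
\begin{align*}
m &= \frac{135 - 120}{490 - 500} = \frac{15}{-10} = -\frac{15}{10} \\
b &= y - m x = 120 - \left(-\frac{15}{10}\right) \times 500 = 120 + 750 = 870 \\
\text{attendees} &= -\frac{15}{10} \times \text{price} + 870 \\
\text{check:}\quad & -\frac{15}{10} \times 490 + 870 = -735 + 870 = 135
\end{align*}
```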

Translating that math into OCaml nearly completes the program design.

let attendees (price : int) : int = (-15 / 10) * price + 870
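One thing worth checking before running the tests on this direct translation is how OCaml evaluates `/` on integers: it truncates toward zero, so `-15 / 10` evaluates to `-1` rather than −1.5. The sketch below compares the direct translation with one possible adjustment that multiplies before dividing. Both function names are hypothetical, and the adjustment is an assumption made here for illustration, not necessarily the version the notes arrive at.

```ocaml
(* Sketch only: hypothetical names; prices are in cents. *)

(* Direct translation of the formula. Integer division happens first,
   so the slope (-15 / 10) is truncated to -1. *)
let attendees_direct (price : int) : int =
  (-15 / 10) * price + 870

(* Assumed adjustment: multiply before dividing, so the division is
   applied to -15 * price and is exact whenever price is a multiple
   of 10 cents. *)
let attendees_scaled (price : int) : int =
  (-15 * price) / 10 + 870

let () =
  Printf.printf "direct: %d attendees at $5.00\n" (attendees_direct 500);
  Printf.printf "scaled: %d at $5.00, %d at $4.90\n"
    (attendees_scaled 500) (attendees_scaled 490)
```

With these definitions the direct version yields 370 attendees at $5.00, which fails the first test case, while the reordered version yields 120 and 135 for the two test prices. For prices that are not multiples of 10 cents the reordered version still truncates, so it approximates the line rather than matching it exactly.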