Programming Languages And
Correctness By Construction
Implementation
Fault Avoidance
Correctness By Construction © John C. Knight 2009, All Rights Reserved 2
Pedagogical Goals
Be aware of the problem areas that
arise in implementation
Be familiar with the limitations in
various programming languages
Become familiar with the concept of
correctness by construction
Understand the principles of static
analysis
Implementation
Starting with a specification, hopefully
formal and probably in a declarative
language, we have to develop an
implementation.
The implementation is typically written in a
high-level programming language
But the program is the bits that go
into memory...
Never forget that
Implementation Of Bits
Specification
-> Implementation in high-level language (humans: algorithms & data structures)
-> Relocatable binary implementation (compiler: machine and library descriptions)
-> Relocated binary implementation (linker & loader: libraries)
(The figure also carries the label "Our Focus".)
What Goes Wrong?
Wrong functionality (commission)
Missing functionality (omission)
Incorrect management of resources:
Processors in concurrent systems
Storage in all systems
Files and I/O devices in all systems
Unanticipated exceptions
Translation and library faults
Fault in generated code, wrong library, defective
library, wrong implementation instance, defective
linker or loader
Programming Languages
Good languages do not guarantee good software
but:
Good languages can help the developer
Poor languages can hinder the developer
Many classes of mistakes have long been recognized to
derive from language facilities, which led to:
“Safe” subsets of some languages
Standards that ban known problematic practices
The programming language is the most fundamental
tool that the developer has
Language choice should support dependability
goals
Examples Of The Problem
Taken from “A Software Fault Prevention
Approach in Coding and Root Cause
Analysis” by Weider D. Yu
Obtained from analysis of several million
lines of code in the Lucent 5ESS switching
system—mostly C
Project goal—fault avoidance with large
systems written in C
Examples Of The Problem
Nearly 50% of the errors found were low-level coding errors
PHASE               Fault Den/KNCSL   Total %
Requirements        1.2               12.
Data Design         0.7               6.
High-level Design   1.9               18.
Low-level Design    1.5               15.
Coding              4.7               47.
Total               10.0              100
Good Grief!
Operator Precedence
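if (blkptr->rpthead.fltdesc & HMFLTCLAS == HWMATEFLT)
Should be corrected to:
if ((blkptr->rpthead.fltdesc & HMFLTCLAS) == HWMATEFLT)
== has higher precedence than &, so the original tests
fltdesc & (HMFLTCLAS == HWMATEFLT)
*numretry++;
Should be corrected to:
(*numretry)++;
Postfix ++ binds more tightly than unary * (older descriptions
say equal precedence, associating right to left), so the original
parses as *(numretry++): it increments the pointer rather than the
count it points to, and the dereferenced value is discarded.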
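The two precedence faults above can be reproduced in a small, self-contained sketch. The mask and fault values (FAULT_CLASS_MASK, MATE_FAULT) and all function names are hypothetical stand-ins, not the 5ESS definitions. The faulty forms are written with explicit parentheses (and a void cast) so that they show the parse the compiler actually uses without triggering warnings.

```c
#include <stdbool.h>

/* Hypothetical mask and fault code, chosen only for illustration. */
#define FAULT_CLASS_MASK 0x0Fu
#define MATE_FAULT       0x03u

/* The faulty test as the compiler parses it: == is evaluated first,
   so the mask is ANDed with the result of a comparison (0 or 1). */
bool is_mate_fault_as_parsed(unsigned desc)
{
    return (desc & (FAULT_CLASS_MASK == MATE_FAULT)) != 0u;
}

/* The intended test: mask first, then compare. */
bool is_mate_fault_intended(unsigned desc)
{
    return (desc & FAULT_CLASS_MASK) == MATE_FAULT;
}

/* *numretry++ parses as *(numretry++): the pointer advances and the
   counter it pointed to is left unchanged. Returns the moved pointer. */
unsigned *bump_as_parsed(unsigned *numretry)
{
    (void)*numretry++;
    return numretry;
}

/* The intended operation: increment the counter itself. */
void bump_intended(unsigned *numretry)
{
    (*numretry)++;
}
```

With these illustrative values, a descriptor of 0x13 matches under the intended test but not under the test as parsed, and only bump_intended changes the counter.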
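For Loop Control
for (idx = 0; idx < 40;
     dispstring[idx] = COTsuccess[idx++]);
Should be corrected to:
for (idx = 0; idx < 40; idx++) {
    dispstring[idx] = COTsuccess[idx];
}
The original is actually compiler dependent! In standard C its
behavior is undefined, because idx is read and modified in the same
expression with no sequence point between the two.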
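As a minimal sketch of the corrected loop in a form that compiles on its own: the function name copy_display and the constant DISPLAY_LEN are hypothetical, and the bound of 40 is taken from the slide. Each iteration reads and writes through the same unmodified index, and the increment happens only in the loop header.

```c
#include <stddef.h>

#define DISPLAY_LEN 40  /* the bound used in the slide's loop */

/* Copy a fixed-length display string element by element.
   The index is changed only in the loop header, so the body
   has a single, well-defined meaning on every compiler. */
void copy_display(char dst[DISPLAY_LEN], const char src[DISPLAY_LEN])
{
    for (size_t idx = 0; idx < DISPLAY_LEN; idx++) {
        dst[idx] = src[idx];
    }
}
```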
Logic Faults on Lucent 5ESS
Use of uninitialized
variables
Misuse of break and
continue statements
Operator precedence
Loop boundaries
Indexing outside arrays
Truncation of values
Misuse of pointers
Incorrect AND and OR tests
Assignment/equality
Bit fields not unsigned or
enum
Incorrect logical AND and
mask operators
Preprocessor conditional
errors
Comment delimiters
Unsigned variables and
comparisons
Misuse of type casting
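Two of the fault classes listed above, unsigned comparisons and truncation of values, can be illustrated with a short sketch. The function names are hypothetical, the casts make explicit the conversions that C otherwise applies silently, and the truncation result assumes an 8-bit unsigned char.

```c
#include <stdbool.h>
#include <limits.h>

/* Unsigned variables and comparisons: when an int meets an unsigned,
   the int is converted to unsigned, so -1 becomes UINT_MAX.
   The cast spells out what the compiler does implicitly. */
bool less_than_as_converted(int a, unsigned b)
{
    return (unsigned)a < b;
}

/* A comparison that matches the mathematical meaning. */
bool less_than_checked(int a, unsigned b)
{
    return a < 0 || (unsigned)a < b;
}

/* Truncation of values: storing into a narrower unsigned type keeps
   only the low-order bits (300 becomes 44 with an 8-bit char). */
unsigned char truncate_to_byte(unsigned value)
{
    return (unsigned char)value;
}

/* A range check that detects the loss before it happens. */
bool fits_in_byte(unsigned value)
{
    return value <= UCHAR_MAX;
}
```

Under these assumptions, -1 is not "less than" 1 after conversion, while the checked comparison gives the expected answer.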
Project Results
Techniques developed to help reduce number
of coding defects of this type
Effect was a 34.5% reduction in rate between
releaseT and releaseT+
Corresponding reduction in test costs was
18.3% (testing cost per source code line)
Cost of techniques: $100K; cost savings in
rework and testing: $7M. Yes, you can do it.
Is this good enough?
From The Ada 2005 LRM
“The need for languages that promote reliability and simplify
maintenance is well established. Hence emphasis was placed on
program readability over ease of writing. For example, the rules of
the language require that program variables be explicitly declared
and that their type be specified. Since the type of a variable is
invariant, compilers can ensure that operations on variables are
compatible with the properties intended for objects of the type.
Furthermore, error-prone notations have been avoided, and the
syntax of the language avoids the use of encoded forms in favor of
more English-like constructs. Finally, the language offers support
for separate compilation of program units in a way that facilitates
program development and maintenance, and which provides the
same degree of checking between units as within a unit.”
What Is Ada?
Targets include:
Embedded systems
Long-lived, complex software
Eliminate as many known “dangerous” elements
of existing languages as possible
Object-based and object-oriented
Parameterized types (generics)
Real-time, concurrent programming
Comprehensive exception handling
Powerful type system
What Is Ada?
Systems programming:
Precise control of machine representation
Access to system-dependent elements
Bare machine targets (no operating system)
Standard packages for many purposes
Comprehensive support for separate compilation
Comprehensive support for numeric types, in
particular representation on target
What Is Ada?
No language subsets permitted
Language extends to link time
Compiler validation process:
All compilers required to “pass” very comprehensive test suite
Ada Compiler Validation Capability (ACVC)
ANSI and ISO language standards
Has a complete infrastructure:
Compilers and other tools
Careful and detailed definitions
Textbooks & training
Cannot cover much of language here
E.g.: Separate Compilation
Essential capability in any development
environment
What are the semantics?
Typically, language semantics:
Different for separate compilation
Parameter types and number not checked
Relies on system linker and utilities like “make”
Ada semantics:
Lazy compilation, identical to file include
Linking is part of compilation
Dependency checking is part of compilation
Correctness By Construction
Could we do better than languages like Ada?
Yes, build implementation so that correctness
accrues from the construction process
Small part of this is to use a programming
language that precludes as many fault sources as
possible
What next? Need an approach or approaches that
help the engineer to the extent possible
C By C—Three Approaches
Each approach goes from Specification (abstract) to Implementation (concrete):
Approach     Example    How the implementation is obtained
Synthesis    Simulink   Human-guided synthesis
Refinement   B          Human-created refinement and proof
Analysis     SPARK      Human-created synthesis and static analysis
Correctness By Construction
Synthesis is clearly best, but limited at present
Refinement is good but very restrictive
Analysis is fairly general and works well
Analysis tends to produce:
Vastly better software
Cheaper software (surprisingly)
Basic thesis of the SPARK Ada project
Basic approach also developed for C# and Java
SPARK is not a toy; it is used in many industrial
development projects
Resources
http://www.sparkada.com
Technical materials
Publications, especially:
http://www.sparkada.com/downloads/SPARK95.pdf
Etc.
“High Integrity Software: The SPARK Approach
to Safety and Security”,
John Barnes, Addison Wesley (2003)
“Correctness by Construction”, Peter Amey
Overview
Programming language with precise, well-defined syntax and
semantics:
SPARK uses an Ada subset
Annotation mechanism designed to allow low-level
specification of software:
SPARK uses stylized comments
Code built from low-level design, not directly from high-level
specification
Analysis tools to establish important properties about code
relative to low-level specification
Verification that low-level design implements high-level
specification is a separate issue
SPARK provides:
Examiner, Simplifier, Prover
SPARK Application Example
Lockheed C130J avionics upgrade
Civil certification to DO-178B level A
Military certification to UK Def Std 00-
Coding in SPARK Ada
Cost of MC/DC testing reduced by 80%
Lockheed claims:
Code quality improved by 10x over industry norms
Productivity improved by 4x
Development costs were 50% of typical
This is cost effective
SPARK Application Example
Significant errors found by static analysis in
code already passing DO-178B level A
SPARK code had only 10% of the residual errors
found in full Ada
Ada code had only 10% of the residual
errors found in C
Programming language does matter
Remember, the use of C in safety-critical
systems is…
SPARK Ada Subset
Ada: carefully designed programming language
Select language subset: eliminate features for which proof
would be problematic
Ada subset: precise syntax & semantic definitions
Items removed include:
Generics
Access types
Goto statements
Most tasking
Exceptions
Dynamically sized arrays
Implicit subtypes
Result remains a powerful and useful language
SPARK Ada Subset
A subset of Ada:
Any correct SPARK Ada program is also a correct Ada program
True Ada but impossible to write “erroneous” program in the
sense of the Ada LRM
Logical soundness—no language ambiguities
Formal definition of syntax and logical semantics
Subset removes the troublesome parts of language
Formal semantics
Security—all language semantic rules are efficiently
machine-checkable
SPARK Ada Properties
Bounded time and space—resources required calculated
statically
Expressive power—industrial strength applications
Full Ada:
Designed to run on a bare machine
Requires its own run-time support
SPARK Ada has minimal run-time requirements:
No tasking originally, now relaxed
No exception handling
No need to verify complex run-time system
SPARK Ada
Includes but is not limited to:
Scalar types—enumeration, character, Boolean, float, fixed
Array types and strings
Records
Full range of expressions
Statements—assignment, case, loop, if, exit, procedure call, entry
call, return, delay
Subprograms
Packages (i.e., classes)
Complete separation of specification and implementation
Inheritance
Tasks including entries and entry calls
Separate compilation
What Did SPARK Ada Omit?
Tasks:
Much omitted, Ravenscar profile remains
Many aspects of tasking inhibit proof
Exceptions:
Complex dynamic control flow inhibits proof
Generics (templates):
No fundamental expressive power
Requires quantification over types
Access types (pointers)
Dynamically sized arrays
Implicit subtypes
Goto statements
Access types are needed for dynamic storage management
Dynamic allocation is a real problem for dependability:
Need to prove that storage is never exhausted during execution
Generally infeasible
Also need to ensure no memory leaks, no dangling pointers, etc.
Note: The Java approach is not a solution
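One common alternative to dynamic allocation, sketched here in C rather than SPARK, is to fix all storage at compile time so that the space bound is known statically. The type BoundedLog, the capacity LOG_CAPACITY, and the function names are hypothetical, chosen only to illustrate the idea. No pointer to heap storage is ever created, so leaks and dangling pointers cannot arise in this code.

```c
#include <stdbool.h>
#include <stddef.h>

#define LOG_CAPACITY 8  /* hypothetical bound, fixed at compile time */

/* All storage lives inside the structure: no heap, no access types. */
typedef struct {
    int    items[LOG_CAPACITY];
    size_t count;
} BoundedLog;

void log_init(BoundedLog *log)
{
    log->count = 0;
}

/* Append a value. Exhaustion is an ordinary, testable outcome
   (false) rather than a run-time allocation failure. */
bool log_append(BoundedLog *log, int value)
{
    if (log->count >= LOG_CAPACITY) {
        return false;
    }
    log->items[log->count] = value;
    log->count++;
    return true;
}
```

The cost of this design is that the capacity must be chosen in advance; the benefit is that the worst-case storage requirement can be read directly from the declarations.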
Formal Semantics
Language features -> Model -> Axioms
Example—“while” statement:
What exactly does it mean?
Model:
Rewrite “while” using model:
Conditions
Branches
Assignment
Axioms:
Basic building blocks of computation
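The idea of rewriting "while" in terms of conditions, branches, and assignment can be sketched in C. This is only an informal illustration, not the formal model used by SPARK; the function names are hypothetical, and goto appears solely to expose the branches that the while statement hides.

```c
/* Sum 1..n written with a while statement. */
unsigned sum_with_while(unsigned n)
{
    unsigned total = 0;
    unsigned i = 1;
    while (i <= n) {
        total += i;
        i++;
    }
    return total;
}

/* The same computation using only a condition, branches,
   and assignments. */
unsigned sum_with_branches(unsigned n)
{
    unsigned total = 0;
    unsigned i = 1;
test:
    if (!(i <= n)) goto done;   /* condition + branch out of the loop */
    total += i;                 /* assignment */
    i++;                        /* assignment */
    goto test;                  /* branch back to the test */
done:
    return total;
}
```

Both functions compute the same result for any n below the maximum unsigned value, which is what it means for the second to serve as a model of the first.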
Loop Invariant
while B loop
   S;
end loop;
How many times is S executed?
We have no idea: it depends on the data during execution
So how can we prove anything?
Invariant:
Documents loop, number of iterations does not matter
True before and after each execution of the loop body
Not necessarily true during the loop body
An algebraic, logical expression
Allows logic statement after loop no matter how many times executed
Documents the computation of the loop
Loop Invariant Example
Simple program to do integer division x/y:
q := 0;
r := x;
loop
   exit when r < y;
   q := q + 1;
   r := r - y;
end loop;
Loop invariant:
x = qy + r ∧ r >= 0
True for:
- No iterations
- One iteration
- Any number of iterations
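The division loop and its invariant can be checked at run time with a small C sketch. This is a dynamic check, not the static proof that SPARK performs. The type DivResult and the function divide are hypothetical names, and the preconditions x >= 0 and y > 0 are assumptions added here because the slide does not state them.

```c
#include <assert.h>

typedef struct {
    int q;  /* quotient  */
    int r;  /* remainder */
} DivResult;

/* Integer division by repeated subtraction, with the loop
   invariant x == q*y + r && r >= 0 asserted at each point
   where it is required to hold. */
DivResult divide(int x, int y)
{
    assert(x >= 0 && y > 0);           /* assumed preconditions */
    DivResult d = { 0, x };
    assert(x == d.q * y + d.r && d.r >= 0);   /* holds before the loop */
    while (d.r >= y) {                 /* same as: exit when r < y */
        d.q = d.q + 1;
        d.r = d.r - y;
        assert(x == d.q * y + d.r && d.r >= 0);   /* holds after each body */
    }
    /* On exit: the invariant plus the exit condition r < y. */
    assert(d.r < y);
    return d;
}
```

The invariant together with the exit condition gives exactly the specification of integer division: x = qy + r with 0 <= r < y.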
The SPARK Annotations
Language in its own right
Embedded in comments that begin --#
SPARK is the Ada subset combined with the annotation language
Annotations define:
Software specification
Intended dependencies between data items
Thus, two types of annotations:
Proof
Specification for proof of correctness properties
Data flow
Informing the tools about various source-code relationships
Those that constitute formal specification of the code make code more
understandable
Annotations should be written before code is written
Annotation Example (Barnes)
Basic procedure specification (says very little):
procedure Add (X : in Integer);
Merely a procedural statement
No notion of declarative specification in the code
Annotated procedure specification (says a lot):
procedure Add (X : in Integer);
--# global in out Total;
--# derives Total from Total, X;
--# pre X > 0;
--# post Total = Total~ + X;
Data flow annotations: global, derives
Proof annotations: pre, post
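For readers more familiar with C, the proof annotations on Add can be approximated with run-time assertions. This is only an analogy: SPARK checks the annotations statically, whereas assert checks them during execution. The names total, total_old, add, and read_total are hypothetical, and the sketch assumes the sum does not overflow an int.

```c
#include <assert.h>

static int total = 0;   /* plays the role of the global Total */

int read_total(void)
{
    return total;
}

void add(int x)
{
    assert(x > 0);                    /* pre X > 0 */
    const int total_old = total;      /* Total~ : value on entry */
    total = total + x;                /* assumes no int overflow */
    assert(total == total_old + x);   /* post Total = Total~ + X */
}
```

The data flow annotations have no direct C counterpart; here they correspond to the observation that add reads and writes total and that the new total depends on the old total and on x.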
Simple Example (Barnes)
package Odometer
--# own Trip, Total : Integer;
is
   procedure Zero_Trip;
   --# global out Trip;
   --# derives Trip from ;
   --# post Trip = 0;
   function Read_Trip return Integer;
   --# global in Trip;
   function Read_Total return Integer;
   --# global in Total;
   procedure Inc;
   --# global in out Trip, Total;
   --# derives Trip from Trip & Total from Total;
   --# post Trip = Trip~ + 1 and Total = Total~ + 1;
end Odometer;
Specification: properly separated and annotated
Think of this as an “interface”
Simple Example (Barnes)
package body Odometer is
   Trip, Total : Integer;
   procedure Zero_Trip is
   begin
      Trip := 0;
   end Zero_Trip;
   function Read_Trip return Integer is
   begin
      return Trip;
   end Read_Trip;
   function Read_Total return Integer is
   begin
      return Total;
   end Read_Total;
   procedure Inc is
   begin
      Trip := Trip + 1;
      Total := Total + 1;
   end Inc;
end Odometer;
Implementation
Think of this as a “class”