
















This presentation covers the concepts of software reliability and redundancy, focusing on fault-tolerant computing and the differences between software and hardware. It discusses software unreliability causes, software aging, and software reliability models. The presentation also introduces software verification and validation methods and strategies for software flaw tolerance, such as n-version programming, masking redundancy, and self-checking design.

















Nov. 2007
Software Reliability and Redundancy
Edition: First. Released: Nov. 2006. Revised: Nov. 2007.
This presentation has been prepared for the graduate course ECE 257A (Fault-Tolerant Computing) by Behrooz Parhami, Professor of Electrical and Computer Engineering at University of California, Santa Barbara. The material contained herein can be used freely in classroom teaching or any other educational setting. Unauthorized uses are prohibited. © Behrooz Parhami
“We are neither hardware nor software; we are your parents.”
“I haven’t the slightest idea who he is. He came bundled with the software.”
“Well, what’s a piece of software without a bug or two?”
[Figure: multilevel model of system states. Abstraction levels: component, logic, information, system, service, result. States: ideal (unimpaired); defective and faulty (low-level impaired); erroneous and malfunctioning (mid-level impaired); degraded and failed (high-level impaired). Entry legend: deviation, remedy, tolerance.]
Software life-cycle phases:
  Project initiation
  Needs
  Requirements
  Specifications
  Prototype design
  Prototype test
  Revision of specs
  Final design
  Coding
  Unit test
  Integration test
  System test
  Acceptance test
  Field deployment
  Field maintenance
  System redesign
  Software discard
Annotations on individual phases (in life-cycle order):
  Evaluation by both the developer and customer
  Implementation or programming
  Separate testing of each major unit (module)
  Test modules within pretested control structure
  Customer or third-party conformance-to-specs test
  New contract for changes and additional features
  Obsolete software is discarded (perhaps replaced)
Software flaws may arise at several points within these life-cycle phases.
Major structural and logical problems are removed very early in the process of software testing.
What remains after extensive verification and validation is a collection of tiny flaws which surface under rare conditions or particular combinations of circumstances, thus giving software failure a statistical nature.
Software usually contains one or more flaws per thousand lines of code, with < 1 flaw considered good (Linux has been estimated to have 0.1).
If there are $f$ flaws in a software component, the hazard rate, that is, the rate of failure occurrence per hour, is $kf$, with $k$ being the constant of proportionality, which is determined experimentally (e.g., $k = 0.0001$).
Software reliability: $R(t) = e^{-kft}$
The only way to improve software reliability is to reduce the number of residual flaws through more rigorous verification and/or testing.
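The constant-hazard model above is easy to evaluate numerically. The following is a minimal sketch, assuming the exponential form $R(t) = e^{-kft}$ and the example value $k = 0.0001$ from the text; the function names and the flaw counts in the demo loop are illustrative choices, not part of the original slides.

```python
import math


def hazard_rate(k: float, flaws: int) -> float:
    """Constant hazard rate z = k * f, in failures per hour."""
    return k * flaws


def reliability(k: float, flaws: int, hours: float) -> float:
    """R(t) = exp(-k * f * t) for a constant hazard rate."""
    return math.exp(-hazard_rate(k, flaws) * hours)


def mttf(k: float, flaws: int) -> float:
    """Mean time to failure of the exponential model: 1 / (k * f)."""
    return 1.0 / hazard_rate(k, flaws)


if __name__ == "__main__":
    K = 0.0001  # example proportionality constant quoted in the text
    for flaws in (20, 10, 5):  # illustrative flaw counts (assumption)
        print(flaws, mttf(K, flaws), reliability(K, flaws, 100.0))
```

Halving the number of residual flaws doubles the MTTF in this model, which is the quantitative content of the claim that reliability improves only by removing flaws.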
Terminology: software flaw/bug → operational error → software-induced failure
“Software failure” is used informally to denote any software-related problem.
[Figure: number of flaws versus time, from start of testing to software release. First plot: removing flaws without generating new ones; the initial flaws divide into removed flaws and residual flaws. Second plot: flaw removal that also adds new flaws; the plot shows initial, removed, added, and residual flaws.]
The rate of flaw removal decreases with time.
New flaws introduced are proportional to the removal rate.
[Figure: number of flaws versus testing time, from start of testing to software release; the initial flaws divide into removed flaws and residual flaws.]
Assume a linearly decreasing flaw removal rate ($F$ = residual flaws, $\tau$ = testing time, in months):
$dF(\tau)/d\tau = -(a - b\tau)$
$F(\tau) = F_0 - a\tau(1 - b\tau/(2a))$
Example: $F(\tau) = 130 - 30\tau(1 - \tau/16)$
Hazard function: $z(\tau) = k(F_0 - a\tau(1 - b\tau/(2a)))$
In our example, let $k = 0.000132$:
$R(t) = \exp(-0.000132(130 - 30\tau(1 - \tau/16))t)$
Assume testing for $\tau = 8$ months: $R(t) = e^{-0.00132t}$
[Table: MTTF (hr) as a function of testing time $\tau$]
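The worked example can be checked numerically. The sketch below assumes the parameters implied by the example, $F_0 = 130$, $a = 30$, $k = 0.000132$, and $b = 3.75$ (derived from $b/(2a) = 1/16$); the function names are illustrative. It restricts $\tau$ to the range where the removal rate $a - b\tau$ is non-negative, which is an assumption about the model's domain rather than something stated on the slide.

```python
import math

F0 = 130.0    # initial flaws, from the example
A = 30.0      # initial removal rate (flaws per month), from the example
B = 3.75      # derived from b / (2a) = 1/16 with a = 30
K = 0.000132  # hazard-rate constant per flaw per hour, from the example


def residual_flaws(tau: float) -> float:
    """F(tau) = F0 - a*tau*(1 - b*tau/(2a)), with tau in months."""
    if not 0.0 <= tau <= A / B:
        # Beyond tau = a/b the removal rate a - b*tau would turn negative.
        raise ValueError("tau outside the range where the model applies")
    return F0 - A * tau * (1.0 - B * tau / (2.0 * A))


def hazard(tau: float) -> float:
    """z(tau) = k * F(tau), in failures per hour."""
    return K * residual_flaws(tau)


def reliability(t: float, tau: float) -> float:
    """R(t) = exp(-z(tau) * t) after tau months of testing."""
    return math.exp(-hazard(tau) * t)


def mttf(tau: float) -> float:
    """MTTF in hours for the constant hazard reached after tau months."""
    return 1.0 / hazard(tau)


if __name__ == "__main__":
    for tau in (0, 2, 4, 6, 8):
        print(tau, residual_flaws(tau), round(mttf(tau), 1))
```

With these values, 8 months of testing leaves 10 residual flaws, which reproduces the exponent 0.00132 in the reliability expression for $\tau = 8$.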
... 2nd moment, ...) to flaw removal data
Verification: “Are we building the system right?” (meets specifications)
Validation: “Are we building the right system?” (meets requirements)
Both verification and validation use testing as well as formal methods.
Software testing:
  Exhaustive testing is impossible
  Test with many typical inputs
  Identify and test fringe cases
  Example: overlap of rectangles
Formal methods:
  Program correctness proof
  Formal specification
  Model checking
  Examples (safety/security-critical):
    Smart cards [Requet 2000]
    Cryptography device [Kirby 1999]
    Railway interlocking system [Hlavaty 2001]
    Automated lab analysis test equipment [Bicarregui 1997]
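The rectangle-overlap example named above lends itself to a small illustration of fringe-case testing. This is a sketch under stated assumptions: rectangles are axis-aligned with x1 < x2 and y1 < y2, and rectangles that merely touch along an edge or at a corner are treated as not overlapping. The slides do not specify either convention, and the `Rect` type and `overlaps` function are hypothetical names.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """Axis-aligned rectangle with x1 < x2 and y1 < y2 (assumed convention)."""
    x1: float
    y1: float
    x2: float
    y2: float


def overlaps(a: Rect, b: Rect) -> bool:
    """True if the two rectangles share a region of positive area."""
    return a.x1 < b.x2 and b.x1 < a.x2 and a.y1 < b.y2 and b.y1 < a.y2


if __name__ == "__main__":
    unit = Rect(0, 0, 2, 2)
    # Typical inputs
    print(overlaps(unit, Rect(1, 1, 3, 3)))    # partial overlap
    print(overlaps(unit, Rect(5, 5, 6, 6)))    # far apart
    # Fringe cases
    print(overlaps(unit, Rect(2, 0, 4, 2)))    # shared edge only
    print(overlaps(unit, Rect(2, 2, 3, 3)))    # shared corner only
    print(overlaps(unit, Rect(0.5, 0.5, 1, 1)))  # one inside the other
    print(overlaps(Rect(0, 2, 6, 4), Rect(2, 0, 4, 6)))  # cross shape
```

The cross-shaped case is the kind of input a naive "is any corner of one rectangle inside the other" check gets wrong, which is why fringe cases deserve tests of their own.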
Given that a complex piece of software will contain bugs, can we use redundancy to reduce the probability of software-induced failures?
Flaw avoidance strategies include (structured) design methodologies, software reuse, and formal methods.
The ideas of masking redundancy, standby redundancy, and self-checking design have been shown to be applicable to software, leading to various types of fault-tolerant software.
“Flaw tolerance” is a better term; “fault tolerance” has been overused.
  Masking redundancy: N-version programming
  Standby redundancy: the recovery-block scheme
  Self-checking design: N-self-checking programming
Sources: Software Fault Tolerance, ed. by M.R. Lyu, Wiley, 2005 (on-line book at http://www.cse.cuhk.edu.hk/~lyu/book/sft/index.html). Also, “Software Fault Tolerance: A Tutorial,” 2000 (NASA report, available on-line).
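As an illustration of standby redundancy, the recovery-block scheme can be sketched as follows: a primary routine runs on a checkpointed copy of the input, an acceptance test judges its result, and an alternate is tried only if the primary fails. The slides name the scheme without giving details, so everything here (the `recovery_block` helper, the sorting variants, and the acceptance test) is a hypothetical example, not the presentation's own code.

```python
import copy
from collections import Counter
from typing import Callable, List, Sequence


def recovery_block(
    data: List[int],
    variants: Sequence[Callable[[List[int]], List[int]]],
    acceptance_test: Callable[[List[int], List[int]], bool],
) -> List[int]:
    """Try each variant in order; return the first result that is accepted."""
    for variant in variants:
        checkpoint = copy.deepcopy(data)  # each try starts from a clean state
        try:
            result = variant(checkpoint)
        except Exception:
            continue  # a crash counts as a failed try; move to the next one
        if acceptance_test(data, result):
            return result
    raise RuntimeError("all variants failed the acceptance test")


def sort_primary(xs: List[int]) -> List[int]:
    """Flawed primary (deliberate bug for the demo): drops duplicates."""
    return sorted(set(xs))


def sort_alternate(xs: List[int]) -> List[int]:
    """Simpler alternate, used only when the primary is rejected."""
    return sorted(xs)


def is_sorted_permutation(original: List[int], result: List[int]) -> bool:
    """Acceptance test: result is ordered and has the same elements."""
    ordered = all(x <= y for x, y in zip(result, result[1:]))
    return ordered and Counter(original) == Counter(result)


if __name__ == "__main__":
    variants = [sort_primary, sort_alternate]
    print(recovery_block([3, 1, 2], variants, is_sorted_permutation))
    print(recovery_block([3, 1, 2, 1], variants, is_sorted_permutation))
```

The scheme is only as good as its acceptance test: a flaw that produces a plausible but wrong result passes through undetected.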
Independently develop N different programs (known as “versions”) from the same initial specification.
The greater the diversity in the N versions, the less likely that they will have flaws that produce correlated errors.
Diversity in:
  Programming teams (personnel and structure)
  Software architecture
  Algorithms used
  Programming languages
  Verification tools and methods
  Data (input re-expression and output adjustment)
[Figure: the input feeds Version 1, Version 2, and Version 3, whose results go to a voter (also called adjudicator, decider, or data fuser) that produces the output.]
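The voting arrangement in the figure can be sketched in a few lines. This is a minimal illustration, assuming exact-match majority voting over hashable outputs; real adjudicators may use inexact or weighted voting. The three integer-square-root versions, including the deliberately flawed one, are hypothetical stand-ins for independently developed programs.

```python
import math
from collections import Counter
from typing import Callable, Sequence


def run_n_versions(versions: Sequence[Callable[[int], int]], x: int) -> int:
    """Run every version on x and return the strict-majority output."""
    outputs = []
    for version in versions:
        try:
            outputs.append(version(x))
        except Exception:
            pass  # a crashed version simply casts no vote
    if outputs:
        winner, count = Counter(outputs).most_common(1)[0]
        if 2 * count > len(versions):  # majority of all N versions
            return winner
    raise RuntimeError("no majority among versions")


def isqrt_library(x: int) -> int:
    """Version 1: standard-library integer square root."""
    return math.isqrt(x)


def isqrt_newton(x: int) -> int:
    """Version 2: integer Newton iteration."""
    if x < 2:
        return x
    r = x
    while r * r > x:
        r = (r + x // r) // 2
    return r


def isqrt_flawed(x: int) -> int:
    """Version 3 (deliberate flaw): rounds up for non-perfect squares."""
    r = 0
    while r * r < x:
        r += 1
    return r


if __name__ == "__main__":
    versions = [isqrt_library, isqrt_newton, isqrt_flawed]
    for x in (10, 16):
        print(x, run_n_versions(versions, x))
```

The flaw in the third version is masked only because the other two do not share it, which is the point of the diversity requirement: correlated flaws defeat the voter.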
Source: Dugan & Lyu, 1994 and 1995
Back-to-back testing: multiple versions can help in the testing process
Source: P. Bishop, 1995
Some experiments in N-version programming
B777 flight computer: 3 diverse processors running diverse software
Airbus A320/330/340 flight control: 4 dissimilar hardware/software modules drive two independent sets of actuators
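Back-to-back testing, mentioned above, can be sketched as running several versions on the same inputs and recording every input on which their outputs differ. The example below is an assumption-laden illustration: the `back_to_back` helper and the two leap-year versions (one with a deliberately planted flaw) are hypothetical and not taken from the cited source.

```python
from typing import Callable, Iterable, List, Sequence, Tuple


def back_to_back(
    versions: Sequence[Callable[[int], bool]],
    inputs: Iterable[int],
) -> List[Tuple[int, List[bool]]]:
    """Return (input, outputs) for every input on which versions disagree."""
    disagreements = []
    for x in inputs:
        outputs = [version(x) for version in versions]
        if len(set(outputs)) > 1:
            disagreements.append((x, outputs))
    return disagreements


def leap_year_a(year: int) -> bool:
    """Version A: full Gregorian rule."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def leap_year_b(year: int) -> bool:
    """Version B (deliberate flaw): forgets the 400-year exception."""
    return year % 4 == 0 and year % 100 != 0


if __name__ == "__main__":
    print(back_to_back([leap_year_a, leap_year_b], range(1896, 2105)))
```

A disagreement shows that at least one version is wrong but not which one, and a flaw shared by all versions produces no disagreement at all, so this technique complements rather than replaces testing against the specification.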