Terminology, Models, and Measures for Fault-Tolerant Computing - Prof. B. Parhami, Assignments of Electrical and Electronics Engineering

This presentation covers the terminology, models, and measures related to dependability in fault-tolerant computing. It includes concepts such as impairments to dependability, the fault-error-failure cycle, the four-universe model, and multilevel models. The document also discusses the importance of dependability and various types of dependable computer systems.

Oct. 2007 Terminology, Models, and Measures Slide 1
Fault-Tolerant Computing
Basic Concepts
and Tools

Slide 2

About This Presentation

Edition   Released    Revised     Revised
First     Oct. 2006   Oct. 2007

This presentation has been prepared for the graduate course ECE 257A (Fault-Tolerant Computing) by Behrooz Parhami, Professor of Electrical and Computer Engineering at University of California, Santa Barbara. The material contained herein can be used freely in classroom teaching or any other educational setting. Unauthorized uses are prohibited. © Behrooz Parhami

Slide 5

Impairments to Dependability

ERROR

Malfunction

Degradation

Failure

Fault

Intrusion

Hazard

Defect

Flaw

Bug

Crash

Slide 7

The Four-Universe Model

Cause-effect diagram for Avižienis' four-universe model of impairments to dependability.

Universe        Impairment
Physical        Failure
Logical         Fault
Informational   Error
External        Crash

Slide 8

Unrolling the Fault-Error-Failure Cycle

Cause-effect diagram for an extended six-level view of impairments to dependability.

Abstraction   Impairment
Component     Defect
Logic         Fault
Information   Error
System        Malfunction
Service       Degradation
Result        Failure

Diagram labels: Low-Level, Mid-Level, High-Level; First Cycle, Second Cycle

Aspect      Impairment
Structure   Fault
            ⇓
State       Error
            ⇓
Behavior    Failure

Slide 10

Analogy for the Multilevel Model

An analogy for our multi-level model of dependable computing. Defects, faults, errors, malfunctions, degradations, and failures are represented by pouring water from above. Valves represent avoidance and tolerance techniques. The goal is to avoid overflow.

Figure labels:
Concentric reservoirs are analogs of the six model levels, with defect being innermost
Wall heights represent inter-level latencies
Inlet valves represent avoidance techniques
Drain valves represent tolerance techniques

Slide 11

Why Our Concern with Dependability?

Reliability of an n-transistor system, each transistor having failure rate λ:

$R(t) = e^{-n \lambda t}$

There are only 3 ways of making systems more reliable:
Reduce λ
Reduce n
Reduce t

[Plot: $e^{-n \lambda t}$ (vertical axis, 0 to 1.0) versus $n \lambda t$ (horizontal axis, marked in powers of 10)]

Alternative: Change the reliability formula by introducing redundancy in the system
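To make the effect of n, λ, and t concrete, the following is a minimal sketch that evaluates the reliability formula from this slide. It assumes, as the formula does, that transistor failures are independent, that each has a constant failure rate, and that any single failure brings the system down. The function name and the numeric values are illustrative assumptions, not figures taken from the presentation.

```python
import math


def series_reliability(n: int, failure_rate: float, t: float) -> float:
    """R(t) = exp(-n * lambda * t) for n components that must all work."""
    return math.exp(-n * failure_rate * t)


# Hypothetical values, chosen only for illustration.
n = 10**8        # number of transistors
lam = 1e-10      # failures per transistor per hour (assumed)

for hours in (1, 10, 100, 1000):
    print(hours, series_reliability(n, lam, hours))
```

Because only the product nλt appears in the exponent, halving any one of the three factors has the same effect on R(t), which is why the slide lists them as the only three levers short of redundancy.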

Slide 13

Aspects of Dependability

RELIABILITY

Maintainability

Availability

Performability

Security

Integrity

Serviceability

Testability

Safety

Robustness

Resilience

Associated measures (callouts on the slide):
Reliability, MTTF = MTFF
Risk, consequence
Controllability, observability
Performability, MCBF
Pointwise av., Interval av., MTBF, MTTR

Slide 14

Concepts from Probability Theory

Cumulative distribution function (CDF): $F(t) = \mathrm{prob}[x \le t] = \int_0^t f(x)\, dx$

Probability density function (pdf): $f(t) = \mathrm{prob}[t \le x \le t + dt] / dt = dF(t)/dt$

[Figure: three panels plotted against Time (0 to 50): lifetimes of 20 identical systems; the CDF $F(t)$, on a 0 to 1.0 scale; and the pdf $f(t)$, on a 0 to 0.05 scale]

Expected value of x: $E_x = \int_{-\infty}^{+\infty} x\, f(x)\, dx = \sum_k x_k\, f(x_k)$

Variance of x: $\sigma_x^2 = \int_{-\infty}^{+\infty} (x - E_x)^2 f(x)\, dx = \sum_k (x_k - E_x)^2 f(x_k)$

Covariance of x and y: $\psi_{x,y} = E[(x - E_x)(y - E_y)] = E[xy] - E_x E_y$
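As a rough illustration of these definitions, the sketch below estimates the CDF, expected value, variance, and covariance from a sample of lifetimes. The 20 lifetimes are synthetic, drawn from an exponential distribution with an assumed mean of 20 time units; they are not the data behind the slide's plot, and all function names are made up for this example.

```python
import random
import statistics

random.seed(1)  # fixed seed so the synthetic sample is repeatable

# Synthetic lifetimes for 20 systems (assumed exponential, mean 20).
lifetimes = sorted(random.expovariate(1 / 20) for _ in range(20))


def empirical_cdf(samples, t):
    """Fraction of samples with lifetime <= t, an estimate of F(t)."""
    return sum(1 for x in samples if x <= t) / len(samples)


def covariance(xs, ys):
    """E[(x - E_x)(y - E_y)] over paired samples."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    return statistics.fmean([(x - mx) * (y - my) for x, y in zip(xs, ys)])


def covariance_alt(xs, ys):
    """The equivalent form E[xy] - E_x * E_y."""
    mean_xy = statistics.fmean([x * y for x, y in zip(xs, ys)])
    return mean_xy - statistics.fmean(xs) * statistics.fmean(ys)


mean = statistics.fmean(lifetimes)          # estimate of E_x
variance = statistics.pvariance(lifetimes)  # estimate of sigma_x^2

for t in (0, 10, 20, 30, 40, 50):
    print(t, empirical_cdf(lifetimes, t))
print(mean, variance)
```

The two covariance functions correspond to the two sides of the covariance identity and agree up to floating-point rounding.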

Slide 16

Reliability and MTTF

Reliability: $R(t)$
Probability that system remains in the "Good" state through the interval $[0, t]$

Two-state nonrepairable system
[State diagram: Up (start state) goes to Down on Failure]

$R(t + dt) = R(t)\,[1 - z(t)\,dt]$, where $z(t)$ is the hazard function

Constant hazard function: $z(t) = \lambda \Rightarrow R(t) = e^{-\lambda t}$ (system failure rate is independent of its age). This is the exponential reliability law.

$R(t) = 1 - F(t)$, where $F(t)$ is the CDF of the system lifetime, or its unreliability

Mean time to failure: MTTF $= \int_0^{+\infty} t\, f(t)\, dt = \int_0^{+\infty} R(t)\, dt$
The first integral is the expected value of the lifetime; the second is the area under the reliability curve (easily provable).
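The slide calls the area-under-the-curve form of MTTF "easily provable" without giving the steps. One possible derivation is sketched below, using only the relations on this slide. It assumes $R(0) = 1$ and that $t\,R(t) \to 0$ as $t \to +\infty$, which holds whenever the MTTF is finite; the first two lines also show how the exponential law follows from the hazard-function relation.

```latex
\begin{align*}
R(t + dt) = R(t)\,[1 - z(t)\,dt]
  &\;\Rightarrow\; \frac{dR(t)}{dt} = -z(t)\,R(t)
   \;\Rightarrow\; R(t) = \exp\!\Bigl(-\int_0^t z(x)\,dx\Bigr) \\
z(t) = \lambda
  &\;\Rightarrow\; R(t) = e^{-\lambda t} \\[1ex]
\text{MTTF} = \int_0^{+\infty} t\,f(t)\,dt
  &= -\int_0^{+\infty} t\,\frac{dR(t)}{dt}\,dt
   && \text{since } f(t) = \frac{dF(t)}{dt} = -\frac{dR(t)}{dt} \\
  &= \Bigl[-t\,R(t)\Bigr]_0^{+\infty} + \int_0^{+\infty} R(t)\,dt
   && \text{integration by parts} \\
  &= \int_0^{+\infty} R(t)\,dt
   && \text{boundary term vanishes}
\end{align*}
```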

Slide 17

Failure Distributions of Interest

Exponential: $z(t) = \lambda$, $R(t) = e^{-\lambda t}$, MTTF $= 1/\lambda$

Weibull: $z(t) = \alpha \lambda (\lambda t)^{\alpha - 1}$, $R(t) = e^{-(\lambda t)^{\alpha}}$, MTTF $= (1/\lambda)\, \Gamma(1 + 1/\alpha)$

Rayleigh: $z(t) = 2 \lambda (\lambda t)$, $R(t) = e^{-(\lambda t)^2}$, MTTF $= (1/\lambda) \sqrt{\pi} / 2$

Erlang: MTTF $= k / \lambda$

Gamma: Erlang and exponential are special cases

Normal: Reliability and MTTF formulas are complicated

Discrete versions: Geometric, $R(k) = q^k$; Discrete Weibull; Binomial
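The Weibull formulas on this slide can be checked numerically. The sketch below is one way to do that, not part of the original presentation: it evaluates the closed-form MTTF with `math.gamma` and compares it with a trapezoidal estimate of the area under R(t). Setting α = 1 reproduces the exponential case and α = 2 the Rayleigh case. The function names, the integration limit, and the step count are all assumptions made for illustration.

```python
import math


def weibull_hazard(t, lam, alpha):
    """z(t) = alpha * lam * (lam * t)^(alpha - 1); use t > 0 when alpha < 1."""
    return alpha * lam * (lam * t) ** (alpha - 1)


def weibull_reliability(t, lam, alpha):
    """R(t) = exp(-(lam * t)^alpha)."""
    return math.exp(-((lam * t) ** alpha))


def weibull_mttf(lam, alpha):
    """Closed form: (1 / lam) * Gamma(1 + 1 / alpha)."""
    return (1 / lam) * math.gamma(1 + 1 / alpha)


def mttf_by_integration(reliability, upper, steps=200_000):
    """Trapezoidal estimate of the area under R(t) on [0, upper]."""
    h = upper / steps
    total = 0.5 * (reliability(0.0) + reliability(upper))
    total += sum(reliability(i * h) for i in range(1, steps))
    return total * h


lam = 0.5  # assumed rate parameter, arbitrary units
for alpha in (1, 2):  # 1: exponential, 2: Rayleigh
    closed = weibull_mttf(lam, alpha)
    numeric = mttf_by_integration(
        lambda t: weibull_reliability(t, lam, alpha), upper=80.0
    )
    print(alpha, closed, numeric)
```

The upper limit of 80 is large enough here that the truncated tail of R(t) is negligible for both values of α; a heavier-tailed choice of parameters would need a larger limit.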

Slide 19

Availability, MTTR, and MTBF

(Interval) Availability: $A(t)$
Fraction of time that system is in the "Up" state during the interval $[0, t]$

Two-state repairable system
[State diagram: two states, Up (start state) and Down; Failure takes Up to Down, Repair takes Down to Up]

Availability = Reliability, when there is no repair.
Availability is a function not only of how rarely a system fails (reliability) but also of how quickly it can be repaired (time to repair).

Pointwise availability: $a(t)$
Probability that system is available at time $t$

$A(t) = (1/t) \int_0^t a(x)\, dx$

Steady-state availability: $A = \lim_{t \to \infty} A(t)$

$A = \dfrac{\text{MTTF}}{\text{MTTF} + \text{MTTR}} = \dfrac{\text{MTTF}}{\text{MTBF}} = \dfrac{\mu}{\lambda + \mu}$ (Will justify this equation later)

Repair rate: $\mu$, with $1/\mu$ = MTTR

In general, $\mu \gg \lambda$, leading to $A \cong 1$
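The steady-state availability formula can be evaluated in either of its two forms. The sketch below shows both and confirms that they agree when λ = 1/MTTF and μ = 1/MTTR; the function names and the sample values are assumptions made for illustration only.

```python
def availability_from_times(mttf: float, mttr: float) -> float:
    """A = MTTF / (MTTF + MTTR), where MTTF + MTTR = MTBF."""
    return mttf / (mttf + mttr)


def availability_from_rates(lam: float, mu: float) -> float:
    """A = mu / (lam + mu), with failure rate lam and repair rate mu."""
    return mu / (lam + mu)


# Hypothetical system: fails on average every 999 hours, 1 hour to repair.
mttf, mttr = 999.0, 1.0
print(availability_from_times(mttf, mttr))
print(availability_from_rates(1 / mttf, 1 / mttr))
```

With a repair rate far larger than the failure rate, as in this example, both forms give a value close to 1, matching the slide's closing remark.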

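Slide 20

System Up and Down Times

[Figure: timeline from Time 0 to t showing the system alternating between Up and Down, with marked times $t_1$, $t_2$, $t'_1$, $t'_2$ and the labels "Time to first failure", "Time between failures", and "Repair time"]

Short repair time implies good maintainability (serviceability)

[State diagram: two states, Up (start state) and Down; Failure takes Up to Down, Repair takes Down to Up]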
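Given a log of alternating up and down periods like the timeline on this slide, the availability measures can be estimated directly. The following is a minimal sketch under the assumption that each up period is followed by exactly one repair period; the durations are invented for illustration and do not correspond to the times marked in the figure.

```python
def summarize_up_down(up_durations, down_durations):
    """Estimate MTTF, MTTR, MTBF, and availability from observed durations."""
    mttf = sum(up_durations) / len(up_durations)
    mttr = sum(down_durations) / len(down_durations)
    total_time = sum(up_durations) + sum(down_durations)
    return {
        "mttf": mttf,
        "mttr": mttr,
        "mtbf": mttf + mttr,
        "availability": sum(up_durations) / total_time,
    }


# Hypothetical observation log, in hours.
up = [120.0, 95.0, 140.0]    # lengths of successive Up periods
down = [2.0, 5.0, 3.0]       # lengths of the repairs that followed

print(summarize_up_down(up, down))
```

When the two lists have the same length, the observed fraction of time spent Up equals MTTF / (MTTF + MTTR) computed from the sample means, so shorter repair periods raise availability directly.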