Reliability, Study notes of Microcomputers

Important documents

Typology: Study notes

2017/2018

Uploaded on 11/23/2018

laraibshakeel01
Chapter 26 Page 1
5/28/02

Reliability

Although the technological achievements of the last 50 years can hardly be disputed, there is one weakness in all mankind's devices: the possibility of failure. What person has not experienced the frustration of an automobile that fails to start or a malfunction of a household appliance? The introduction of every new device must be accompanied by provision for maintenance, repair parts, and protection against failure. This is certainly apparent to the military, where the life-cycle maintenance costs of systems far exceed the original purchase costs. The problem pervades modern society, from the homeowner who faces the annoyances of appliance failures, to electric utility companies faced with the potentially disastrous consequences of nuclear reactor failures. The insurance industry would not exist without the possibility of one type of failure or another.

A subject that is so important to many decisions in this world could hardly escape quantitative analysis. The name reliability is given to the field of study that attempts to assign numbers to the propensity of systems to fail. In a more restrictive sense, the term reliability is defined to be the probability that a system performs its mission successfully. Because the mission is often specified in terms of time, reliability is often defined as the probability that a system will operate satisfactorily for a given period of time. Thus reliability may be a function of time.

Estimating reliability is essentially a problem in probability modeling. A system consists of a number of components. In the simplest case, each component has two states, operating or failed. When the set of operating components and the set of failed components is specified, it is possible to discern the status of the system. The problem is to compute the probability that the system is operating -- the reliability of the system.

We use the concepts and methods of probability theory to compute the reliability of a complex system. In addition, we provide bounds on the probability of success that are often much easier to compute than the exact reliability. Although the chapter particularly relates to reliability, the methods described here are appropriate to a much larger class of problems associated with computing the probability of occurrence of complex events.

26.1 Reliability Models

A device or system is described as a collection of parts or components. The system operates successfully if all its components operate successfully (do not fail), but it may also operate if a subset of components has failed. The structure function is a model that determines the status of the system given the status of its components. We use the structure function to compute the system reliability.

The Structure Function

The system is a collection of n identifiable components performing some function. We define two operating states that relate to the system's ability to perform its function.

- Success: The system performs its function satisfactorily for a given period of time, where the criterion for success is clearly defined.
- Failure: The system fails to perform its function satisfactorily.

The system reliability is the probability that a system performs its function satisfactorily (i.e., the probability of success). To provide a mathematical model of system reliability, we first consider the components. Like the system we also allow two possible states for each component. The success indicator for component i is the binary random variable $X_i$ that indicates the status of component i.

$$X_i = \begin{cases} 1, & \text{if component } i \text{ is working} \\ 0, & \text{if component } i \text{ has failed} \end{cases}$$

The status vector is the vector of component status indicators.

$$\mathbf{X} = (X_1, X_2, \ldots, X_n)$$

There are $2^n$ possible realizations of this vector. The structure function is a binary function that indicates the status of the system (success or failure) given the status of each component.

$$\phi(X_1, X_2, \ldots, X_n) \quad \text{or} \quad \phi(\mathbf{X})$$

is the structure function, which has a value of 1 or 0 for each of the $2^n$ possible vectors $\mathbf{X}$. The structure function is a complete model of the failure and success characteristics of the system.

Reliability

Given the structure function of the system, one can compute its reliability. The component reliability, $p_i$, is the probability that component i is operating correctly. The component failure probability, $q_i$, is the probability that a component has failed. In terms of the success indicators, $p_i = P\{X_i = 1\}$ and $q_i = P\{X_i = 0\} = 1 - p_i$.

When the probability of success or failure of a component does not depend on the status of some other component, the components are said to be independent. The assumption of this chapter is that all components are independent. The probability that the system is operating correctly is the system reliability, R. It is the probability that the structure function is 1.

$$R = P\{\phi(\mathbf{X}) = 1\} = E[\phi(\mathbf{X})] \qquad (1)$$
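As an illustration of Eq. (1), the expectation can be evaluated by brute force: sum the probability of every status vector for which the structure function equals 1. The sketch below assumes independent components; the function name `system_reliability` and the two-component series example with reliabilities 0.9 and 0.8 are hypothetical choices made only for this illustration.

```python
from itertools import product


def system_reliability(phi, p):
    """Return E[phi(X)] by enumerating all 2^n status vectors.

    phi: structure function taking a tuple of 0/1 values, returning 0 or 1
    p:   list of component reliabilities (independence is assumed)
    """
    total = 0.0
    for x in product((0, 1), repeat=len(p)):
        if phi(x) == 1:
            prob = 1.0
            for xi, pi in zip(x, p):
                # P{X_i = 1} = p_i, P{X_i = 0} = 1 - p_i
                prob *= pi if xi == 1 else 1.0 - pi
            total += prob
    return total


def series(x):
    """Hypothetical example: the system works only if both components work."""
    return x[0] * x[1]


r = system_reliability(series, [0.9, 0.8])
print(round(r, 4))  # 0.72
```

Enumeration is only practical for small n because the number of status vectors doubles with each added component.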

$$R = 1 - (1 - p_1)(1 - p_2) \cdots (1 - p_n) = 1 - q_1 q_2 \cdots q_n. \qquad (5)$$

Consider again the stereo system with a revised criterion for success. For successful operation the CD player, amplifier and at least one of the speakers must work. To construct the function first note that the two speakers comprise a parallel system. Then the structure function for the speaker combination is

$$\phi_s = 1 - (1 - X_3)(1 - X_4).$$

The speaker combination forms a series system with the CD player and amplifier, so the complete structure function is

$$\phi(\mathbf{X}) = X_1 X_2 \phi_s = X_1 X_2 [1 - (1 - X_3)(1 - X_4)].$$

To compute the reliability of the system, first compute the reliability of the parallel system of speakers.

$$R_s = 1 - (1 - p_3)(1 - p_4) = 1 - (0.02)(0.02) = 0.9996.$$

The speaker combination forms a series system with the CD player and amplifier, so the total reliability is

$$R = p_1 p_2 R_s = (0.97)(0.99)(0.9996) = 0.9599.$$

With the more liberal definition of success, the reliability has increased over the series system.
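The stereo calculation can be reproduced with two small helpers, one for series and one for parallel arrangements of independent components. This is a minimal sketch; the helper names are hypothetical, and the component values are the ones used in the text (0.97, 0.99, and 0.98 for each speaker).

```python
def series_reliability(ps):
    """Product of component reliabilities: every component must work."""
    r = 1.0
    for p in ps:
        r *= p
    return r


def parallel_reliability(ps):
    """One minus the product of failure probabilities: at least one must work."""
    q = 1.0
    for p in ps:
        q *= 1.0 - p
    return 1.0 - q


p1, p2, p3, p4 = 0.97, 0.99, 0.98, 0.98

r_speakers = parallel_reliability([p3, p4])            # speakers in parallel
r_system = series_reliability([p1, p2, r_speakers])    # in series with components 1 and 2
r_all_four = series_reliability([p1, p2, p3, p4])      # stricter criterion: all four must work

print(round(r_speakers, 4), round(r_system, 4), round(r_all_four, 4))
```

Replacing the speaker pair with a single equivalent component of reliability `r_speakers` is the same reduction step the text performs by hand.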

k-out-of-n System

This system is successful if at least k of the n components are successful.

$$\phi(\mathbf{X}) = \begin{cases} 1, & \text{if } \sum_{i=1}^{n} X_i \ge k \\ 0, & \text{if } \sum_{i=1}^{n} X_i < k \end{cases}$$

This illustrates a structure function that is not a simple polynomial expression but involves a logical condition. To compute the reliability, assume that all components have the same reliability, $p_i = p$ for all i. Then the reliability of the system is the probability that k or more components are successful. Because of the independence assumption, we can use the binomial distribution to compute the reliability.

$$R = \sum_{i=k}^{n} \binom{n}{i} p^i (1 - p)^{n-i}$$

For example, a space vehicle has three identical computers operating simultaneously and solving the same problems. The outputs of the three computers are compared, and if two or three of them are identical, that result is used. This is called a majority vote system, and in this mode one of the three computers can fail without causing the system to fail. This is a two out of three system. Identifying the success or failure of each of the computers with the variables $X_1$, $X_2$, and $X_3$, the function is written:

$$\phi(\mathbf{X}) = \begin{cases} 1, & \text{if } X_1 + X_2 + X_3 \ge 2 \\ 0, & \text{if } X_1 + X_2 + X_3 < 2 \end{cases}$$

Alternatively, the structure function written in polynomial form is

$$\phi(\mathbf{X}) = 1 - (1 - X_1 X_2)(1 - X_1 X_3)(1 - X_2 X_3)$$

Any combination of two or more $X_i$ set equal to 1 will cause this function to assume the value 1, while fewer than two will result in a 0 value. With the reliability of each computer equal to 0.9, we use the binomial distribution to compute the system reliability.

$$R = \sum_{i=2}^{3} \binom{3}{i} (0.9)^i (0.1)^{3-i} = 3(0.9)^2(0.1) + (0.9)^3 = 0.972.$$

The reliability of the combination of three computers is much greater than that of an individual. This is an example of the use of redundancy to increase reliability. Since only one computer is required to perform the function, the other two are redundant from a functional point of view. They do play an important role, however, in increasing the reliability of the system. The independence of failures is important here. If some failure mechanism causes all three computers to fail simultaneously, the reliability improvement will not be realized. For example, if all three computers used the same program and the program had an error, the combination of results will certainly be no better than any one of the individual results. The use of redundancy is common in systems for which failure has particularly severe consequences, such as in the space program or for very complex systems with many components.
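The binomial formula for a k-out-of-n system with identical, independent components translates directly into code. The following is a small sketch; the function name `k_out_of_n_reliability` is a hypothetical label, and `math.comb` requires Python 3.8 or later.

```python
from math import comb


def k_out_of_n_reliability(k, n, p):
    """Probability that at least k of n independent components work,
    each with the same reliability p (binomial tail sum)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Majority-vote arrangement from the text: 2 out of 3 computers, p = 0.9
r_vote = k_out_of_n_reliability(2, 3, 0.9)
print(round(r_vote, 4))  # 0.972

# The two extremes of the same formula
r_parallel = k_out_of_n_reliability(1, 3, 0.9)  # any one of three suffices
r_series = k_out_of_n_reliability(3, 3, 0.9)    # all three are required
print(round(r_parallel, 4), round(r_series, 4))  # 0.999 0.729
```

Setting k = 1 recovers the parallel system and k = n recovers the series system, which is a convenient check on the implementation.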

Reliability as a Function of Time

Often the reliability of a component is given as a function of time. For example, a common assumption is that components have an exponential distribution for time to failure. In this case the component reliability is

$$p(t) = 1 - P\{\text{failure time} \le t\} = e^{-\lambda t},$$

where $\lambda$ is the failure rate of the component.
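To see how a time-dependent component reliability feeds into a system reliability, the sketch below evaluates the exponential model at a few times and substitutes the result into the two-out-of-three expression R = 3p^2 - 2p^3 used elsewhere in the chapter. The failure rate of 0.01 per hour and the sample times are hypothetical values chosen only for illustration.

```python
from math import exp


def component_reliability(t, failure_rate):
    """Exponential time to failure: p(t) = exp(-failure_rate * t)."""
    return exp(-failure_rate * t)


def two_out_of_three(p):
    """Reliability of a 2-out-of-3 system of identical independent components."""
    return 3 * p**2 - 2 * p**3


failure_rate = 0.01  # hypothetical value, failures per hour

for t in (0, 10, 50, 100):
    p_t = component_reliability(t, failure_rate)
    print(t, round(p_t, 4), round(two_out_of_three(p_t), 4))
```

One property worth noticing in the output: the voting arrangement beats a single component only while p(t) is above 0.5, so redundancy of this kind helps most over short missions.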

26.3 Complex Systems

If the system structure is not one of the simple forms, it becomes difficult to compute the exact reliability. To deal with the more general situation, we introduce a graphical network model in which it is possible to determine whether a system is working correctly by determining whether a successful path exists through the system. The system fails when no such path exists. We present both exact and approximate methods for computing the reliability. The methods are based on the dual concepts of minimal cuts or minimal paths of the network.

Coherent System

A system characteristic that plays an important role in the subsequent analysis is coherency. A coherent system has the property that when the system is successful for some status vector $\mathbf{X}$, it remains successful if some components of $\mathbf{X}$ change from 0 to 1. Alternatively, if the system is failed for some status vector $\mathbf{X}$, it remains failed if some components of $\mathbf{X}$ are changed from 1 to 0. More formally, a coherent system has the property that when $\mathbf{X}$ and $\mathbf{Y}$ are two status vectors such that $\mathbf{Y} \ge \mathbf{X}$ (component by component),

$$\phi(\mathbf{Y}) \ge \phi(\mathbf{X}).$$

For a coherent system, repairing a failed component cannot cause a working system to fail. Most real systems have this characteristic including the simple systems given in the previous section.
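For a small system, the coherency property as defined here can be checked numerically. The sketch below tests every status vector and every single repair (one component changed from 0 to 1); checking single changes is sufficient because any larger change is a sequence of single ones. The function name `is_coherent` and both test functions are hypothetical illustrations.

```python
from itertools import product


def is_coherent(phi, n):
    """Return True if repairing any one failed component never lowers phi."""
    for x in product((0, 1), repeat=n):
        for i in range(n):
            if x[i] == 0:
                y = x[:i] + (1,) + x[i + 1:]  # same vector with component i repaired
                if phi(y) < phi(x):
                    return False
    return True


def two_of_three(x):
    """Majority-vote structure function."""
    return 1 if sum(x) >= 2 else 0


def works_only_if_first_fails(x):
    """Deliberately non-coherent function, used as a counterexample."""
    return 1 - x[0]


print(is_coherent(two_of_three, 3))               # True
print(is_coherent(works_only_if_first_fails, 2))  # False
```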

The Network Model

We describe the system as a directed network consisting of nodes and arcs, as illustrated in Fig. 1. One node is defined as the source (node A in the figure), and a second node is defined as a sink (node D). Each component of the network is identified as an arc passing from one node to another. The arcs are numbered for identification. A failure of a component is equivalent to an arc being removed or cut from the network. The system is successful if there exists a successful path from the source to the sink. The system is failed if no such path exists. The reliability of the system is the probability that there exist one or more successful paths from the source to the sink.

[Figure 1: directed network with source A, intermediate nodes B and C, sink D, and five numbered arcs]

Figure 1. Network describing a system of five components

To describe the reliability of this system, we define the concepts of path, minimal path, cut, and minimal cut for the network. A path for the network is a set of components, such that if all the components in the set are successful, the system will be successful. For example, the set of all components is a path. A minimal path is a set of components that comprise a path, but the removal of any one component will cause the resulting set to not be a path. In other words, if all the components in a minimal path are successful while all other components have failed, the system will be successful. If any one of the components in the minimal path subsequently fails, the system will fail. In terms of the network model, the minimal path corresponds to a simple path from the source to the sink in the network. In the example the sets {1, 4}, {1, 3, 5}, {2, 5} are minimal paths. The set {1, 3, 4} is a path, but not a minimal path. Arc 3 can be removed from the set and the set will still be a path.

A cut is a set of components such that if all the components in the cut fail, while all other components are successful, the system will fail. Again, the set of all components is a cut. The minimal cut is a set of components that comprise a cut, but the removal of any one component from the set causes the resulting set to not be a cut. In the network a minimal cut breaks all simple paths from the source to the sink. From Fig. 1 we observe that the minimal cuts are: {1, 2}, {1, 5}, {2, 3, 4}, and {4, 5}.
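Because a cut must break every minimal path, the minimal cuts can be derived from the minimal paths: they are the smallest sets of components that share at least one member with each minimal path. The sketch below finds them by brute force for the five-component example. The function name `minimal_cuts` is hypothetical, and the exhaustive search is an assumption made for clarity, suitable only for small networks.

```python
from itertools import combinations


def minimal_cuts(min_paths, components):
    """Return all minimal sets that intersect every minimal path."""
    paths = [frozenset(p) for p in min_paths]
    cuts = []
    # Candidates are tried in order of increasing size, so any candidate that
    # contains an already accepted cut cannot be minimal and is skipped.
    for size in range(1, len(components) + 1):
        for candidate in combinations(sorted(components), size):
            c = frozenset(candidate)
            if any(found <= c for found in cuts):
                continue
            if all(c & p for p in paths):
                cuts.append(c)
    return [sorted(c) for c in cuts]


paths = [{1, 4}, {1, 3, 5}, {2, 5}]
cuts = minimal_cuts(paths, {1, 2, 3, 4, 5})
print(sorted(cuts))  # [[1, 2], [1, 5], [2, 3, 4], [4, 5]]
```

The result agrees with the minimal cuts read from Fig. 1 by observation.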

Structure Function in Terms of Minimal Paths

Knowledge of the complete set of minimal cuts or minimal paths makes it possible to derive the structure function of a complex system represented by the network model. Once we have the structure function we show how to obtain exact and approximate estimates of system reliability. We first determine the structure function in terms of the set of minimal paths. Let P be a set of components comprising a minimal path. Using $X_i$ as an indicator of the success of component i, the event of a successful path is the binary function

$$\prod_{i \in P} X_i$$

where $\prod_{i \in P}$ means product over the set P. The event of a failed path is

$$1 - \prod_{i \in P} X_i$$

Let $P_1, P_2, \ldots, P_k$ be the collection of all minimal paths of the network. The system is successful if at least one of the minimal paths does not fail. Then the structure function is

$$\phi_p(\mathbf{X}) = 1 - \left(1 - \prod_{i \in P_1} X_i\right)\left(1 - \prod_{i \in P_2} X_i\right) \cdots \left(1 - \prod_{i \in P_k} X_i\right) \qquad (9)$$

Computing the Exact Reliability

Starting from either of the structure functions defined above, one can multiply through the factors to obtain a sum of terms such that each term is a product of factors having the form $X_i$ or $(1 - X_i)$. The manipulation of the formulas uses binary arithmetic which recognizes that

$$X_i X_i = X_i, \quad X_i + X_i = X_i, \quad X_i (1 - X_i) = 0, \quad X_i + (1 - X_i) = 1.$$

Manipulating Eq. (10) obtained using the minimal paths for the example, we have the following result.

$$\phi_p(\mathbf{X}) = 1 - (1 - X_1 X_4)(1 - X_1 X_3 X_5)(1 - X_2 X_5)$$

$$= X_1 X_4 + X_1 X_3 X_5 + X_2 X_5 - X_1 X_3 X_4 X_5 - X_1 X_2 X_4 X_5 - X_1 X_2 X_3 X_5 + X_1 X_2 X_3 X_4 X_5$$

Once in the expanded format, we substitute $p_i$ for $X_i$ and $q_i$ for $(1 - X_i)$ to obtain an exact expression for the system reliability. Substituting $p_i$ for $X_i$ in the preceding expression, the reliability equation for the system is

$$R = p_1 p_4 + p_1 p_3 p_5 + p_2 p_5 - p_1 p_3 p_4 p_5 - p_1 p_2 p_4 p_5 - p_1 p_2 p_3 p_5 + p_1 p_2 p_3 p_4 p_5.$$
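The expanded reliability expression has one term for each non-empty group of minimal paths: the term is the product of the reliabilities of all components in the union of the group, with a plus sign for groups of odd size and a minus sign for groups of even size. The sketch below automates that expansion. The function name `reliability_from_paths` is hypothetical, and the common component reliability of 0.9 is an assumed value used only to produce a number.

```python
from itertools import combinations
from math import prod


def reliability_from_paths(min_paths, p):
    """Exact reliability of a system with independent components.

    min_paths: iterable of sets of component labels (the minimal paths)
    p:         dict mapping component label -> reliability
    """
    paths = [frozenset(s) for s in min_paths]
    total = 0.0
    for size in range(1, len(paths) + 1):
        sign = 1 if size % 2 == 1 else -1
        for group in combinations(paths, size):
            # X_i * X_i = X_i, so each component appears once in the product
            union = frozenset().union(*group)
            total += sign * prod(p[i] for i in union)
    return total


paths = [{1, 4}, {1, 3, 5}, {2, 5}]
p = {i: 0.9 for i in range(1, 6)}  # assumed equal reliabilities
r_exact = reliability_from_paths(paths, p)
print(round(r_exact, 5))  # 0.97119
```

With k minimal paths the loop visits 2^k - 1 groups, so the cost of the exact calculation grows quickly as the network gets larger.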

The problem of finding the exact reliability is made difficult by the large number of minimal cuts and paths for most networks.

Examples

We illustrate the minimal cut and path approach for determining the system reliability with the simple systems of Section 26.2. More complex cases are in the exercises at the end of the chapter.

Example 1: Series System

A system has four components in series with reliabilities $p_1 = 0.97$, $p_2 = 0.99$, $p_3 = p_4 = 0.98$.

We will find the system reliability with both the cut and path approaches. The network for the series system is shown in Fig. 2. The source is A and the sink is E. The component reliabilities are shown in parentheses.

A --1 (0.97)--> B --2 (0.99)--> C --3 (0.98)--> D --4 (0.98)--> E

Figure 2. Network for a series system

By observation we note that the system has a single minimal path.

$$P_1 = \{1, 2, 3, 4\}$$

Using this path, the structure function can be written according to Eq. (9).

$$\phi_p(\mathbf{X}) = 1 - (1 - X_1 X_2 X_3 X_4) = X_1 X_2 X_3 X_4.$$

The reliability of the system is obtained by substituting $p_i$ for $X_i$ in this expression.

$$R = p_1 p_2 p_3 p_4 = (0.97)(0.99)(0.98)(0.98) = 0.9223$$

Alternatively, we can derive the reliability from the cuts of the system. Again by observation we note that there are four minimal cuts.

$$C_1 = \{1\}, \quad C_2 = \{2\}, \quad C_3 = \{3\}, \quad C_4 = \{4\}$$

Using Eq. (11), the system structure function is given below.

$$\phi_c(\mathbf{X}) = [1 - (1 - X_1)][1 - (1 - X_2)][1 - (1 - X_3)][1 - (1 - X_4)] = X_1 X_2 X_3 X_4$$

As expected, the same structure function is obtained.

Example 2: Series - Parallel System

Now we only require that components 1, 2, and either component 3 or 4 must work for system success. The network for this case is shown in Fig. 3. The network graphically illustrates that components 3 and 4 are now in parallel and the pair is in series with components 1 and 2. Node A is the source and node D is the sink.

A --1 (0.97)--> B --2 (0.99)--> C, followed by C --3 (0.98)--> D in parallel with C --4 (0.98)--> D

Figure 3. Network for a series-parallel system

To derive the structure function we note that the minimal paths are $P_1 = \{1, 2, 3\}$ and $P_2 = \{1, 2, 4\}$.

Using Eq. (9), the structure function is

$$\phi_p(\mathbf{X}) = 1 - (1 - X_1 X_2 X_3)(1 - X_1 X_2 X_4).$$

requirement for system operation is that two of three computers must work, so the minimal paths are

$$P_1 = \{1, 2\}, \quad P_2 = \{1, 3\}, \quad \text{and} \quad P_3 = \{2, 3\}.$$

The minimum requirement for system failure is that two components fail, so the minimal cuts are

$$C_1 = \{1, 2\}, \quad C_2 = \{1, 3\}, \quad \text{and} \quad C_3 = \{2, 3\}.$$

In this interesting case, the sets defining the minimal cuts and the minimal paths are the same. Using Eq. (9) we derive the structure function

$$\phi_p(\mathbf{X}) = 1 - (1 - X_1 X_2)(1 - X_1 X_3)(1 - X_2 X_3).$$

From this expression, we obtain the structure function and the reliability.

$$\phi_p(\mathbf{X}) = X_1 X_2 + X_1 X_3 + X_2 X_3 - 2 X_1 X_2 X_3$$

$$R = p_1 p_2 + p_1 p_3 + p_2 p_3 - 2 p_1 p_2 p_3$$

When all three components have the same reliability, p,

$$R = 3p^2 - 2p^3.$$

This expression is equivalent to the one obtained earlier for the two out of three system. A different but equivalent expression can be derived using the minimal cut approach.

Computational Considerations

This procedure can be used to derive the structure function and associated reliability function for any system for which the set of minimal paths or minimal cuts can be identified. There are two difficulties with this approach. The first is determining the set of minimal paths or cuts. In general, a system with many components will have many cuts, and it is a difficult computational problem to determine the complete set. In our examples, we have simply used observation; however, that procedure will hardly be satisfactory for a network of reasonable size. The second problem is to construct and evaluate the structure and reliability functions. In general, if there are k cuts or paths, the corresponding reliability equation will have $2^k - 1$ terms. Because of these difficulties it is common to approximate the reliability function of complicated systems. This is the subject of the next section.

For a complicated network it is often beneficial to look for subsystems of components that form simple structures such as the series, parallel, or k of n structures. The reliabilities of these subsystems can be determined first and the subsystem replaced with a single equivalent component. Even subsystems not having a simple structure can be analyzed with the methods of this section with the subsystem replaced by a single equivalent component whose reliability is the reliability of the subsystem. When as many subsystems as possible have been reduced in this fashion, the resultant network will be much smaller and will perhaps have a simple structure or be amenable to further reduction. This decomposition approach is often effective in significantly reducing computational effort.

Complex structures will have different numbers of minimal cuts and paths. Because an exact analysis can be done using either, it is best to use the method with the smallest number. For instance, a series system with n components has n cuts but only one path. A parallel system with n components has n paths but only one cut.

$$R_L = (1 - q_1 q_2)(1 - q_1 q_3)(1 - q_2 q_3).$$

Assuming all components have equal reliabilities of 0.9 ($p = 0.9$, $q = 0.1$), the bounds become

$$R_U = 1 - (1 - p^2)^3 = 1 - (0.19)^3 = 0.9931$$

$$R_L = (1 - q^2)^3 = (0.99)^3 = 0.9703.$$

This compares to the exact reliability calculated earlier, R = 0.972.

Modeling

In the preceding example, the lower bound appears to be closer to the exact reliability than the upper bound. The lower bound will usually be a better approximation when component reliabilities are high (> 0.9). With high component reliabilities it is more likely that a single cut will cause failure rather than a collection of two or more. The assumption of independence of cuts will cause less inaccuracy in this case. The cut approximation converges to the true reliability as the component reliability approaches 1. The upper bound approximation will usually be a better approximation when the component reliability is very low. In most studies, component reliabilities are high implying that a lower bound is a conservative measure of reliability. Therefore, from a practical point of view, the minimal cut approximation is the more important of the two.
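The two approximations can be computed directly from the lists of minimal paths and minimal cuts. The sketch below generalizes the pattern of the two-out-of-three formulas shown above to arbitrary sets, which is an assumption on our part: the upper bound treats the minimal paths as if they failed independently, and the lower bound treats the minimal cuts the same way. The function names are hypothetical.

```python
from math import prod


def path_upper_bound(min_paths, p):
    """R_U = 1 - product over paths of (1 - product of p_i on the path)."""
    return 1.0 - prod(1.0 - prod(p[i] for i in path) for path in min_paths)


def cut_lower_bound(min_cuts, p):
    """R_L = product over cuts of (1 - product of q_i on the cut), q_i = 1 - p_i."""
    return prod(1.0 - prod(1.0 - p[i] for i in cut) for cut in min_cuts)


# Two-out-of-three system: minimal paths and minimal cuts coincide
sets_2_of_3 = [{1, 2}, {1, 3}, {2, 3}]
p = {1: 0.9, 2: 0.9, 3: 0.9}

r_upper = path_upper_bound(sets_2_of_3, p)
r_lower = cut_lower_bound(sets_2_of_3, p)
r_exact = 3 * 0.9**2 - 2 * 0.9**3

print(round(r_lower, 4), round(r_exact, 4), round(r_upper, 4))  # 0.9703 0.972 0.9931
```

For these values the lower bound misses the exact reliability by less than 0.002, while the upper bound is off by about 0.021, consistent with the remark that the cut approximation is the better one when component reliabilities are high.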

26.5 Exercises

  1. A missile complex has four subsystems: the radars, the missile, the computer control devices, and the human operators. Four radars are provided, of which three are required for successful operation. The complex has only a single missile. There are three computers operating in a majority vote arrangement. There are two human operators, one of whom must be capable of firing the missile. Write the structure function for this system consisting of 10 components.
  2. A student drives to school each day over the same route. She prides herself on her ability to control the speed of her car so she never has to stop at a traffic signal. She calls her trip successful if she can accomplish that feat. If there are six traffic signals on her route, show the structure function for the system of traffic signals for this particular driver. How would the structure function change if the student were willing to change her criterion of success to allow a pause at no more than one signal?
  3. To increase the likelihood that a vaccine for a disease will be discovered, the government awards independent study contracts to four drug firms. Surely, they say, one of the companies will make the discovery. If success is defined as the discovery of the vaccine, what is the structure function for this system?
  4. Compute the reliability of the system described in Exercise 1 when the reliabilities of the various components are given in the following table.

Component     Radar   Missile   Computer   Human
Reliability   0.9     0.96      0.98       0.

  5. The Defense Department would like to increase the reliability of the missile system described in Exercise 4. Evaluate the system reliabilities for the changes proposed below. The changes are not cumulative.
     a. Add another radar with the same reliability. Three radars are required for successful operation.
     b. Add a third human operator. Only one is required.
     c. Add an entire duplicate missile system at a nearby location. Only one of the two systems must work for mission success.
  6. Three computers are operated in parallel with a majority vote taken of their outputs to determine the proper action. Find the reliability of the system as a function of time. Assume the time to failure of each of the computers has an exponential distribution with a mean time between failures of 50 hours. The failure rate ($\lambda$) is the reciprocal of the mean time between failures. Plot a curve of the reliability as a function of time over the range 0 to 100 hours. Plot the curves for the system and for a single computer. Comment on the effects of this arrangement of redundancy over time.
  7. Assume that all three computers in Exercise 6 must be in working order for system success. With the failure rates used in Exercise 6, plot system reliability over the range in time of 0 to 100 hours. What is the failure rate of the system?

  1. For the 2-out-of-3 system considered in Example 3, compute the upper and lower bound approximations as a function of time. Assume each component has an exponential distribution for time to failure with a failure rate of 0.02/hour. Plot a curve showing the upper and lower bound approximations together with the exact reliability curve over the time range from 0 to 100 hours.
  2. Repeat Exercise 13 if the system is a 1-out-of-3 system.
  3. Repeat Exercise 12 if the system is a 3-out-of-3 system.
  4. The figure shows a system with nine components and corresponding reliabilities. Components 3, 4, and 5 form a 2-out-of-3 system. Write the structure function for this system. Compute the exact probability that a successful path will exist from A to B. Compute the minimal cut and minimal path approximations.

[Figure: network of nine numbered components (1 to 9) connecting A to B]

  1. A fuse is used to protect an electrical circuit from overload. The fuse can fail in two modes, short and open. If it fails in the short mode, the fuse will not interrupt the circuit when it is activated. If it fails in the open mode, the circuit will be interrupted by the fuse itself. Assume a single fuse has probabilities of open and short of $q_o$ and $q_s$, respectively. The events of open and short are mutually exclusive. Define success in two ways: the system does not fail because of shorts, and the system does not fail because of opens.

For the given arrangements shown in the figure, write structure functions for both definitions. Let the reliability be the probability that the system does not fail in either mode. Compute accurate reliabilities for each definition for success using $q_o$ = 0. and $q_s$ = 0.1.

[Figure: five fuse arrangements, labeled a through e, each connecting A to B]

  1. The figure shows roads between two towns, A and B, in a mountainous area. During the winter, travel is difficult because of the threat of snow. The probability that any given road will be impassable is 0.6. The conditions on the roads are independent. Find the probability that there will be a passable route from A to B. Roads can be traveled in either direction. Find the accurate probability and also the minimal cut and minimal path approximations.

[Figure: network of eight numbered roads (1 to 8) between towns A and B]

  1. The figure shows the pipe layout in a lawn sprinkler system. Sprayers are located at every intersection of the pipe and also at the corners of the system. Thus there are 16 sprayers. Two things can happen to a sprayer -- it can break off or it can get clogged. If any one of the sprayers breaks off, a great stream of water will pour forth and the system will have failed. If one of the sprayers clogs, it will fail to water the lawn in the immediate vicinity; however, adjacent sprayers can reach the affected areas. The