Numerical Methods Lecture 7: Statistics, Probability and Reliability, Study notes of Civil Engineering

A portion of lecture notes from a numerical methods course, specifically covering statistics, probability and reliability. It includes explanations of statistical analysis, probability methods, and reliability analysis concepts. Topics include mean, standard deviation, variance, covariance, correlation coefficient, and probability density functions. Examples and applications to engineering are provided.

Typology: Study notes

Pre 2010

Uploaded on 03/18/2009

CGN 3421 - Computer Methods Gurley
Numerical Methods Lecture 7 - Statistics, Probability and Reliability page 112 of 125

Numerical Methods Lecture 7 - Statistics, Probability and Reliability

Topics

  • A summary of statistical analysis
  • A summary of probability methods
  • A summary of reliability analysis concepts

Statistical Analysis

The value of a measured quantity can often vary from one measurement to the next, and from one sample to the next (e.g. student grades on an exam, strength of concrete cylinders). We will refer to such a changing quantity as a ‘random variable’. Statistical analysis allows us to view important characteristics of the random variable without having full information. That is, we won’t know what the exact strength of the next concrete cylinder to be tested is, but we can take a good guess based on previous measurements and statistical analysis.

Mean and Standard Deviation of a Single Variable

The most fundamental statistics are the mean $\mu$ and standard deviation $\sigma$.

Given: A single random variable ‘X’ sampled ‘N’ times

The mean of X, denoted $\mu_x$: the average value of the measured quantity

$$\mu_x = E[x] = \frac{1}{N}\sum_{i=1}^{N} x_i$$

The standard deviation, denoted $\sigma_x$: the average distance from the mean, or the average spread

$$\sigma_x^2 = \mathrm{VAR}_x = E[(x-\mu_x)(x-\mu_x)] = \frac{1}{N}\sum_{i=1}^{N}(x_i-\mu_x)^2$$

‘$\mathrm{VAR}_x$’ is the variance of x. The standard deviation is the square root of the variance.

An equivalent expression is

$$\sigma_x^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i)^2 - \mu_x^2$$

A higher standard deviation increases the odds of being far away from the mean.
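As a minimal sketch of the definitions above, the following Python computes the mean, the variance in both of its forms, and the standard deviation. It assumes the 1/N normalization shown in the formulas; the sample strengths are hypothetical values chosen for illustration, not course data.

```python
import math

def mean(samples):
    """Mean: (1/N) * sum of x_i."""
    return sum(samples) / len(samples)

def variance(samples):
    """Variance as the mean squared distance from the mean (1/N form)."""
    mu = mean(samples)
    return sum((x - mu) ** 2 for x in samples) / len(samples)

def variance_equivalent(samples):
    """Equivalent form: mean of x^2 minus the square of the mean."""
    mu = mean(samples)
    return sum(x * x for x in samples) / len(samples) - mu ** 2

def std_dev(samples):
    """Standard deviation: square root of the variance."""
    return math.sqrt(variance(samples))

# Hypothetical cylinder strengths in psi (illustrative only)
strengths = [5500.0, 6000.0, 6500.0, 6000.0]
print(mean(strengths), variance(strengths), std_dev(strengths))
```

Both variance functions return the same number for the same data, which is a quick check on the equivalent expression. Many libraries default to a 1/(N-1) normalization instead, so results can differ slightly from the formulas in these notes.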

Example: Two different sets of exam grades

Class #1 and class #2 have about the same mean value (red line)

Class #1 has a small standard deviation: most students are near the mean (blue line borders)

Class #2 has a larger standard deviation, so students have a higher probability of being well over or well under the class average grade.

We can use the mean and standard deviation to estimate the likelihood of deviating from the mean value.

Higher $\sigma$ = higher probability of being further from the mean. We will get into quantifying this probability in a few pages.

The mean and standard deviation are classified as first- and second-order statistics (involving the mean of $X$, and mean of $X^2$, respectively). If we stick with using these two stats to describe data, we are making assumptions about the form of its probability. We assume the fluctuations about the mean are equally likely to be above or below the mean. That is, the probability behavior is SYMMETRIC about the mean. This will not always be realistic. For example, if I give an easy test, the class average may be 100, but the standard deviation may be 15. If we assume the distribution of grades is symmetric about the mean, that would result in scores above 100, which is out of bounds. So there are cases when just the mean and standard deviation are not enough.

We can look at higher-order statistics to help. That is, look beyond mean and standard deviation to explain non-symmetric data.

[Figure: exam grades for Class #1 and Class #2, marking the mean value and the standard deviation as the distance from the mean.]
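One higher-order statistic that captures this lack of symmetry is the skewness, a third-order statistic. The sketch below assumes the common normalized definition (mean cubed deviation divided by the standard deviation cubed), which is not spelled out in these notes; both grade lists are hypothetical.

```python
def skewness(samples):
    """Normalized third-order statistic: 0 for data symmetric about the mean."""
    n = len(samples)
    mu = sum(samples) / n
    var = sum((x - mu) ** 2 for x in samples) / n
    third_moment = sum((x - mu) ** 3 for x in samples) / n
    return third_moment / var ** 1.5

# Hypothetical grades: an easy test piles scores up against the 100-point cap
easy_test = [100, 100, 95, 95, 90, 85, 60]
# Hypothetical grades spread evenly about their mean
symmetric = [60, 70, 80, 90, 100]

print(skewness(easy_test), skewness(symmetric))
```

The easy test gives a negative skewness (a long tail toward low scores), while the evenly spread grades give zero, so the sign and size of this number indicate how far the data depart from the symmetric assumption.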

Statistics of Two Variables

The relationship between two quantities is often needed, e.g. the relation between the aggregate/water ratio and concrete strength.

Given: Two quantities X and Y sampled N times

The Covariance between two random variables X and Y is

$$\mathrm{COV}_{xy} = E[(x-\mu_x)(y-\mu_y)] = \frac{1}{N}\sum_{i=1}^{N}(x_i-\mu_x)(y_i-\mu_y)$$

(note: no term is squared here, the two deviations are multiplied together)

Note that if X and Y are the same process, the covariance becomes the variance, where $\mathrm{COV}_{xx} = \mathrm{VAR}_{xx}$, and the relationship between variance and standard deviation becomes

$$\mathrm{COV}_{xx} = \mathrm{VAR}_{xx} = \sigma_x^2$$

Covariance can be used to measure how much X and Y are related to each other by defining a correlation coefficient.

Correlation Coefficient - a number that measures the linear relationship between X and Y

$$\rho_{xy} = \frac{\mathrm{COV}_{xy}}{\sigma_x \sigma_y}$$

bounded by $-1 \le \rho_{xy} \le 1$

The boundaries indicate the following property

$$|\mathrm{COV}_{xy}| \le \sigma_x \sigma_y$$

Meaning of the correlation coefficient

$\rho_{xy} = 1$ : perfect linear correlation (identical processes)

$\rho_{xy} = 0$ : no linear correlation between x and y

$\rho_{xy}$ near $-1$ : strong negative linear correlation (if x increases, y decreases)
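The covariance and correlation coefficient can be sketched in a few lines of Python. This assumes the same 1/N normalization as the formulas above; the data series are made up so that the perfectly correlated and perfectly anti-correlated cases are easy to recognize.

```python
import math

def covariance(xs, ys):
    """COV_xy = (1/N) * sum of (x_i - mu_x) * (y_i - mu_y)."""
    n = len(xs)
    mu_x = sum(xs) / n
    mu_y = sum(ys) / n
    return sum((x - mu_x) * (y - mu_y) for x, y in zip(xs, ys)) / n

def correlation(xs, ys):
    """rho_xy = COV_xy / (sigma_x * sigma_y), with sigma^2 = COV of a series with itself."""
    sigma_x = math.sqrt(covariance(xs, xs))
    sigma_y = math.sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (sigma_x * sigma_y)

# Hypothetical data: y_up rises linearly with x, y_down falls linearly with x
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
y_up = [2.0 * x + 1.0 for x in xs]
y_down = [-x for x in xs]

print(correlation(xs, y_up), correlation(xs, y_down))
```

Calling `covariance(xs, xs)` returns the variance of `xs`, which mirrors the statement that the covariance of a process with itself is its variance.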

Application - two different random variables x and y measured at the same time

[Figure: four scatter plots of y versus x, labeled perfect correlation, strong correlation, no correlation, and strong negative correlation, each with its value of $\rho_{yx}$.]

Probability analysis - A formal framework for using statistical descriptions

Quantities provided in common engineering applications are often not exact; some uncertainty is usually associated with them. Probability analysis incorporates this uncertainty when providing a solution.

Key Issue : providing a statistical description of the uncertainty

  1. take many samples
  2. estimate statistics
  3. fit a distribution using these statistics

Theme example : strength of concrete cylinders

10 concrete cylinders are tested for compressive strength, rated in psi.

The number of cylinders that break within each of a series of 500 psi ranges is plotted below.

[Figure: histogram of concrete strength; horizontal axis in x1000 psi.]

Probability Density Function

How can we get a more accurate picture of the probability? If more samples are available, say 100, then the histogram can use smaller bins, providing:

[Figure: normalized histogram of concrete strength; horizontal axis in x1000 psi, vertical axis ticks at 3/(100*dx), 6/(100*dx) and 9/(100*dx), where dx is the bin width.]

What if we could take 1000 samples, 10,000 samples? As the number of samples goes to infinity, the size of the bins approaches zero, and the histogram becomes the probability density function (PDF).

probability density function (PDF) - p(x)

The area under p(x) represents the probability of the variable x occurring between the integration limits.

$$\mathrm{prob}(a < x < b) = \int_a^b p(x)\,dx$$

Again, the total area under the entire curve is unity ( = 1 )

$$\mathrm{prob}(-\infty < x < \infty) = \int_{-\infty}^{\infty} p(x)\,dx = 1$$

Usually we’ll have an equation to describe p(x)

Any equation for p(x) must satisfy the above restriction, total area = 1

Q: Usually we don’t have an infinite # of samples to numerically generate a PDF p(x). So how do we get an equation for p(x)?

A: To get this curve $p(x)$, we’ll assume a certain form for the probability of the variable x.

That is, we will curve fit a probability distribution model.
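The normalization that turns a histogram into an estimate of p(x) can be sketched as follows: each bin count is divided by (N * dx), so the bar areas sum to 1. The function name and the sample data are assumptions for illustration; the samples are drawn from a Gaussian only to have something to bin.

```python
import random

def normalized_histogram(samples, lo, hi, bins):
    """Return (heights, dx), where each height is count / (N * dx)."""
    dx = (hi - lo) / bins
    counts = [0] * bins
    for x in samples:
        if lo <= x < hi:
            k = min(int((x - lo) / dx), bins - 1)  # guard against round-off
            counts[k] += 1
    n = len(samples)
    return [c / (n * dx) for c in counts], dx

rng = random.Random(1)
# Hypothetical strengths in x1000 psi (illustrative only)
samples = [rng.gauss(6.0, 0.5) for _ in range(10_000)]
heights, dx = normalized_histogram(samples, 3.0, 9.0, 60)

# Total area of the bars; close to 1 when the range covers the samples
area = sum(h * dx for h in heights)
print(area)
```

Increasing the number of samples and bins together makes the bar tops trace a smoother curve, which is the limiting process described above.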

Gaussian distribution

The most common assumed form is the ‘bell curve’, also called the Gaussian distribution. The Gaussian p(x) is completely defined by the mean and standard deviation.

Gaussian distribution:

$$p(x) = \frac{1}{\sigma_x\sqrt{2\pi}}\exp\left[-\frac{1}{2}\left(\frac{x-\mu_x}{\sigma_x}\right)^2\right]$$

This distribution is symmetric about the mean, and tapers in the tails following an exponential shape.

Several examples are given below as the mean ($\mu_x$) and standard deviation ($\sigma_x$) vary.

[Figure: Gaussian PDFs for several values of $\mu_x$ and $\sigma_x$.]

Properties of Gaussian distribution (Normal distribution)

  • Fully described by $\mu_x$ and $\sigma_x$. Referred to as $N(\mu_x, \sigma_x)$
  • Third-order statistic skewness = 0, i.e. symmetry about the mean
  • Most commonly used distribution. Many natural random events follow the Gaussian form (e.g. wind speed, rainfall, financial markets...)
  • Mathematical limits: $-\infty < x < \infty$
  • Practical limits are plus/minus 5 standard deviations from the mean: $[\mu - 5\sigma, \mu + 5\sigma]$. Beyond this range, area under p(x) ~ 0
  • Linear operations on a Gaussian variable yield a Gaussian result. If $x$ is a Gaussian random variable, so is any linear function of it, e.g. $z = ax + b$ for constants $a$ and $b$

Problem : There is no closed-form analytical solution to the integration of the Gaussian function...

Option #1: Use Numerical Integration to get the solution!! We’ve done this already.

Option #2: Use tables for probability of standard normal Gaussian PDF

standard normal = Gaussian PDF with $\mu = 0$, $\sigma = 1$, or N(0,1).

We can ‘normalize’ any quantity associated with the Gaussian PDF by removing the mean value and dividing by the standard deviation:

Example: If $x$ is a Gaussian random variable with mean of 5 and standard deviation of 15

$$x = N(5, 15)$$

then we can produce a standard normal variable $y$ with the following operation

$$y = \frac{x - \mu_x}{\sigma_x} = N(0, 1)$$

where $y$ is just a manipulation of each of the $x$ values

For problem #1,

find: $\mathrm{prob}(x < \ldots)$ for $N(\ldots, \ldots)$

In standard normal space this is equal to:

$\mathrm{prob}(x < -\ldots)$ for $N(0, 1)$

Tables for N(0,1) say the answer is $\mathrm{prob}(x < -\ldots) = \ldots$

There is a built-in Mathcad function with this table information in there (nice!!)

For problem #2,

find: $\mathrm{prob}(\ldots < x < \ldots)$ for $N(\ldots, \ldots)$

Look at the red area on the graph on the previous page, this is calculated by:

$\mathrm{prob}(x < \ldots) - \mathrm{prob}(x < \ldots)$ for $N(\ldots, \ldots)$

In standard normal space this is equal to:

$\mathrm{prob}(x < \ldots) - \mathrm{prob}(x < -\ldots)$ for $N(0, 1)$

Tables for N(0,1) say the answer is .7106 - .1446 = .5660 = 56.6%

Example #3 ) How do we use these probability models?

Once put into service, the maximum weight placed on a single concrete column is expected to be 200 kips. The concrete column has a cross-sectional area of 50 in^2. The $\mu$ and $\sigma$ of the concrete strength are the same as the previous example. What is the probability that the column will fail under the maximum expected load?

Answer:

$\mu$ = 6000 psi x 50 in^2 = 300,000 lbs.

$\sigma$ = 500 psi x 50 in^2 = 25,000 lbs.

load = 200,000 lbs.

Find:

$$\mathrm{prob}(\text{strength} < \text{load}) = \int_{\mu - 5\sigma}^{\text{load}} \frac{1}{\sigma_x\sqrt{2\pi}}\exp\left[-\frac{1}{2}\left(\frac{x-\mu_x}{\sigma_x}\right)^2\right]dx$$

From N(300000,25000) space to N(0,1) space, the equivalent is

$$\mathrm{prob}\left(x < \frac{200000 - 300000}{25000}\right) = \mathrm{prob}(x < -4) \text{ for } N(0, 1)$$
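Example #3 can be checked with both options mentioned earlier. The sketch below is only an illustration: it uses the complementary error function from Python's standard library in place of the N(0,1) table (or the Mathcad built-in), and a plain trapezoid rule for the numerical integration. The function names are made up for this example, and the lower integration limit of $\mu - 5\sigma$ is taken from the practical limits of the Gaussian.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Gaussian p(x) for mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def standard_normal_cdf(y):
    """prob(Y < y) for Y ~ N(0, 1); plays the role of the N(0,1) table."""
    return 0.5 * math.erfc(-y / math.sqrt(2.0))

def prob_less_than(x, mu, sigma):
    """prob(X < x) for X ~ N(mu, sigma), by shifting to N(0, 1) space."""
    return standard_normal_cdf((x - mu) / sigma)

def trapezoid(f, a, b, n=4000):
    """Composite trapezoid rule with n equal panels."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

# Example #3 values, converted from psi to lbs with the 50 in^2 area
area = 50.0            # in^2
mu = 6000.0 * area     # 300,000 lbs
sigma = 500.0 * area   # 25,000 lbs
load = 200_000.0       # lbs

y = (load - mu) / sigma                      # load in N(0,1) space
p_table = prob_less_than(load, mu, sigma)    # Option #2: table lookup
p_numeric = trapezoid(lambda x: gaussian_pdf(x, mu, sigma),
                      mu - 5.0 * sigma, load)  # Option #1: integration
print(y, p_table, p_numeric)
```

The two results are both about 3 x 10^-5 and differ only by the tiny tail area below $\mu - 5\sigma$ that the numerical integration leaves out. The same `prob_less_than` helper handles problems of the "probability between two limits" type by subtracting two calls.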

Analytical

Under certain circumstances, the probability of failure can be determined without numerical methods. For example, if our system variables are both Gaussian, the description for the limit state $G$ is also Gaussian. Combining Gaussian variables is done as follows:

Say we want to combine four Gaussian random variables $a, c, d, f$ like so:

$$z = a + c - f + d$$

The resultant $z$ is also Gaussian with the following properties:

mean: $\mu_z = \mu_a + \mu_c - \mu_f + \mu_d$

standard deviation (square root of the sum of the squares, SRSS): $\sigma_z = \sqrt{\sigma_a^2 + \sigma_c^2 + \sigma_f^2 + \sigma_d^2}$

Note that all $\sigma^2$ terms are added, including $\sigma_f^2$, regardless of the minus sign in the equation, while the mean combination includes the minus sign in front of the $\mu_f$ term.

Applying this concept to the limit state equation, the probability of failure can be determined as the probability that the resultant Gaussian variable $G$ is less than zero:

prob(failure) = $\int_{-\infty}^{0} p_G(g)\,dg$.

This can be evaluated as we did in the probability section. We are assuming all Gaussian variables are uncorrelated for simplicity.

We will do an example or two in class on the board. I’ve left some room below to take notes.

Simulation

Sometimes an analytical solution is not possible (or difficult) because we can’t easily find the resultant distribution of $G$. Formulas like the nice ones on the previous page for $\mu_G$ and $\sigma_G$ are not easily obtained. Further, if the distribution for $G$ is not Gaussian (like we’ve been using so far), then we need more information beyond $\mu_G$ and $\sigma_G$ to get a suitable description of $G$.

case 1

We can combine as many different Gaussian variables as we like so long as we are adding and subtracting, and still use the analytical approach. However, often there is a need to combine variables in forms other than adding and subtracting. For example, recall that the tip deflection of a cantilevered beam is

$$\text{deflection} = \frac{PL^3}{3EI}$$

Suppose our system is such a beam with a random load on its tip. Suppose also that the length, E, and I are random with their own Gaussian distributions. If we define failure as deflection exceeding a limit, the limit state would look like this:

$$G = \text{threshold} - \frac{PL^3}{3EI}$$

where threshold is the limit to deflection.

If the deflection exceeds the threshold, then G < 0, indicating system failure as we define it. Now we are not combining Gaussian variables through addition / subtraction. The nice formulas on the previous page to find $\mu_G$ and $\sigma_G$ do not apply here. We’ll look at how to solve this after case 2.

case 2

The probability of load may differ from Gaussian. This can also be true for the resistance. If the variables in the limit state vary from a Gaussian distribution, we can count on $G$ also differing from Gaussian. For the simple case of $G$ = resistance - load, where either the resistance or the load or both aren’t Gaussian, again our nice formulas don’t apply.

Some special cases can still be solved analytically, but in general it takes great effort to do this.

Solution - Simulation

In both cases above it can be important to account for this change in the distribution of $G$ from Gaussian. To handle these cases, we employ simulation methods. This is a pure numerical brute force method of simulating many random numbers. Each set of random numbers fits the distribution of one of the variables in the limit state equation. We then run many experiments on the computer by seeing how many times out of, say 1,000,000, our limit state is < 0. Probability of failure is then

$$\mathrm{prob}(\text{fail}) = \frac{\text{failures}}{\text{total simulations}}$$

The more experiments we perform, the more reliable our final answer becomes. This is because it takes many random numbers to accurately describe a given distribution. Examples given on the board.
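The sketch below puts the analytical and simulation approaches side by side. It first treats a limit state built only from adding and subtracting Gaussian variables, where the analytical result is available to check the simulation, and then the cantilever beam limit state, where only simulation applies. All means, standard deviations, the deflection threshold and the function names are hypothetical values chosen for illustration; the variables are assumed uncorrelated, as in the notes.

```python
import math
import random

def analytical_failure_probability(terms):
    """prob(G < 0) for G = sum(sign * X) with independent Gaussian X.

    terms is a list of (sign, mean, sigma) tuples.
    """
    mu_g = sum(sign * mu for sign, mu, _ in terms)
    sigma_g = math.sqrt(sum(sigma ** 2 for _, _, sigma in terms))  # SRSS
    return 0.5 * math.erfc(mu_g / (sigma_g * math.sqrt(2.0)))

def simulated_failure_probability(limit_state, n_trials, seed=0):
    """Fraction of n_trials in which limit_state(rng) is below zero."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(n_trials):
        if limit_state(rng) < 0.0:
            failures += 1
    return failures / n_trials

def linear_limit_state(rng):
    """G = resistance - load, both Gaussian (hypothetical values, lbs)."""
    resistance = rng.gauss(300_000.0, 25_000.0)
    load = rng.gauss(250_000.0, 10_000.0)
    return resistance - load

def beam_limit_state(rng, threshold=0.25):
    """G = threshold - P L^3 / (3 E I); every input here is hypothetical."""
    P = rng.gauss(1000.0, 100.0)    # tip load, lbs
    L = rng.gauss(120.0, 1.0)       # length, in
    E = rng.gauss(29.0e6, 1.0e6)    # modulus of elasticity, psi
    I = rng.gauss(100.0, 5.0)       # moment of inertia, in^4
    return threshold - P * L ** 3 / (3.0 * E * I)

p_exact = analytical_failure_probability(
    [(+1, 300_000.0, 25_000.0), (-1, 250_000.0, 10_000.0)])
p_linear = simulated_failure_probability(linear_limit_state, 200_000)
p_beam = simulated_failure_probability(beam_limit_state, 200_000)
print(p_exact, p_linear, p_beam)
```

For the linear case the simulated estimate should land close to the analytical value, and the gap shrinks as the number of trials grows. Swapping `rng.gauss` for another generator in a limit state function is one way to explore case 2, where a variable is not Gaussian.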