








A portion of lecture notes from a numerical methods course, specifically covering statistics, probability and reliability. It includes explanations of statistical analysis, probability methods, and reliability analysis concepts. Topics include mean, standard deviation, variance, covariance, correlation coefficient, and probability density functions. Examples and applications to engineering are provided.









Topics
- A summary of statistical analysis
- A summary of probability methods
- A summary of reliability analysis concepts
The value of a measured quantity can often vary from one measurement to the next, and from one sample to the next (e.g. student grades on an exam, strength of concrete cylinders). We will refer to such a changing quantity as a ‘random variable’. Statistical analysis allows us to view important characteristics of the random variable without having full information. That is, we won’t know what the exact strength of the next concrete cylinder to be tested is, but we can take a good guess based on previous measurements and statistical analysis.
The most fundamental statistics are the mean and standard deviation.
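Given: A single random variable ‘X’ sampled ‘N’ times, with samples $x_1, \dots, x_N$
The mean of X - denoted $\mu_x$ : average value of the measured quantity
$$\mu_x = \frac{1}{N}\sum_{i=1}^{N} x_i$$
The standard deviation - denoted $\sigma_x$ : the root-mean-square distance from the mean, or the average spread
$$\sigma_x = \sqrt{\mathrm{var}_x}, \qquad \mathrm{var}_x = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu_x)^2$$
‘$\mathrm{var}_x$’ is the variance of x. The standard deviation is the square root of the variance. (The variance is written here with the 1/N normalization; some texts divide by N - 1 instead.)
An equivalent expression is
$$\mathrm{var}_x = \frac{1}{N}\sum_{i=1}^{N} x_i^2 - \mu_x^2$$
A higher standard deviation increases the odds of being far away from the mean.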
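The definitions above can be checked numerically. The following is a minimal sketch, assuming the 1/N form of the variance; the cylinder strengths are made-up illustrative values, not data from these notes.

```python
import math

def mean(values):
    """Average value of the samples."""
    return sum(values) / len(values)

def variance(values):
    """Mean squared distance from the mean (1/N normalization assumed)."""
    mu = mean(values)
    return sum((x - mu) ** 2 for x in values) / len(values)

def std_dev(values):
    """Standard deviation: square root of the variance."""
    return math.sqrt(variance(values))

# Hypothetical cylinder strengths in psi (illustrative values only)
strengths = [5800.0, 6100.0, 6400.0, 5500.0, 6200.0]

mu = mean(strengths)
sigma = std_dev(strengths)

# Equivalent expression: mean of x^2 minus the square of the mean
var_alt = mean([x * x for x in strengths]) - mu ** 2

print(mu, sigma, var_alt)
```

Both routes to the variance should agree up to round-off, which is a quick sanity check on either formula.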
Example: Two different sets of exam grades
Class #1 and class #2 have about the same mean value (red line)
Class #1 has a small standard deviation: most students are near the mean (blue line borders)
Class #2 has a larger standard deviation, so students have a higher probability of being well over or well under the class average grade.
We can use the mean and standard deviation to estimate the likelihood of deviating from the mean value.
Higher $\sigma$ = higher probability of being further from the mean. We will get into quantifying this probability in a few pages.
The mean and standard deviation are classified as first- and second-order statistics (involving the mean of X and the mean of X^2, respectively). If we stick with using these two statistics to describe data, we are making assumptions about the form of its probability distribution. We assume the fluctuations about the mean are equally likely to be above or below the mean. That is, the probability behavior is SYMMETRIC about the mean. This will not always be realistic. For example, if I give an easy test, the class average may be 100, but the standard deviation may be 15. If we assume the distribution of grades is symmetric about the mean, that would result in scores above 100, which is out of bounds. So there are cases when just the mean and standard deviation are not enough.
We can look at higher-order statistics to help. That is, look beyond the mean and standard deviation to explain non-symmetric data.
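One common third-order statistic is the skewness, which is zero for data that are symmetric about the mean and negative when the data have a long tail below the mean. The notes do not name a specific higher-order statistic, so treat the sketch below as one possible choice; the grade lists are invented for illustration.

```python
import math

def skewness(values):
    """Third central moment divided by sigma^3 (1/N normalization assumed)."""
    n = len(values)
    mu = sum(values) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in values) / n)
    return sum((x - mu) ** 3 for x in values) / (n * sigma ** 3)

# Hypothetical grades: an easy test piles scores up against the 100-point cap
easy_test = [100, 100, 98, 97, 95, 95, 90, 85, 70, 50]
# Hypothetical grades spread evenly on both sides of the mean
symmetric = [60, 70, 80, 90, 100]

print(skewness(easy_test))   # negative: long tail below the mean
print(skewness(symmetric))   # zero: symmetric about the mean
```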
The relationship between two quantities is an often-needed characteristic, e.g. the relation between the aggregate/water ratio and concrete strength.
Given: Two quantities X and Y sampled N times
The covariance between two random variables X and Y is
$$\mathrm{cov}_{xy} = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu_x)(y_i - \mu_y)$$
(note: no square root is taken)
Note that if X and Y are the same process, the covariance becomes the variance, where
$$\mathrm{cov}_{xx} = \mathrm{var}_x,$$
and the relationship between variance and standard deviation becomes
$$\mathrm{var}_x = \sigma_x^2$$
Covariance can be used to measure how much X and Y are related to each other by defining a correlation coefficient.
Correlation Coefficient - a number that measures the linear relationship between X and Y
$$\rho_{xy} = \frac{\mathrm{cov}_{xy}}{\sigma_x \sigma_y}, \quad \text{bounded by} \quad -1 \le \rho_{xy} \le 1$$
The boundaries indicate the following property
Meaning of the correlation coefficient
$\rho_{xy} = 1$ : perfect linear correlation (identical processes)
$\rho_{xy} = 0$ : no linear correlation between x and y
$\rho_{xy}$ near $-1$ : strong negative linear correlation (if x increases, y decreases)
Application - two different random variables x and y measured at the same time
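As a quick numerical illustration of covariance and the correlation coefficient, here is a minimal sketch assuming the 1/N normalization. The function names and the sample values are ours, chosen only to show the limiting cases.

```python
import math

def covariance(xs, ys):
    """Mean product of deviations from the two means (1/N normalization assumed)."""
    n = len(xs)
    mu_x = sum(xs) / n
    mu_y = sum(ys) / n
    return sum((x - mu_x) * (y - mu_y) for x, y in zip(xs, ys)) / n

def correlation(xs, ys):
    """Covariance scaled by both standard deviations; always between -1 and 1."""
    sigma_x = math.sqrt(covariance(xs, xs))  # covariance of x with itself is the variance
    sigma_y = math.sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (sigma_x * sigma_y)

# Hypothetical measurements taken at the same times (illustrative values only)
x = [1.0, 2.0, 3.0, 4.0]
y_up = [2.0, 4.0, 6.0, 8.0]      # rises linearly with x
y_down = [8.0, 6.0, 4.0, 2.0]    # falls linearly with x

print(correlation(x, y_up))
print(correlation(x, y_down))
```

The two exactly linear data sets hit the bounds of the coefficient; real measurements with scatter land somewhere in between.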
Probability analysis - A formal framework for using statistical descriptions
Quantities provided in common engineering applications often are not exact. An uncertainty is often associated. Probability analysis incorporates this uncertainty when providing a solution.
Key Issue : providing a statistical description of the uncertainty
Theme example : strength of concrete cylinders
10 concrete cylinders are tested for compressive strength rated in psi
The number of cylinders that break within a series of 500 psi ranges are plotted below
How can we get a more accurate picture of the probability? If more samples are available, say 100, then the histogram can use smaller bins, providing a finer picture of how the strengths are distributed.
What if we could take 1000 samples, or 10,000 samples? As the number of samples goes to infinity, the size of the bins approaches zero, and the histogram (scaled so that its total area is 1) becomes the probability density function (PDF).
probability density function (PDF) - p(x)
The area under p(x) represents the probability of the variable x occurring between the integration limits:
$$P(a \le x \le b) = \int_a^b p(x)\,dx$$
Again, the total area under the entire curve is unity:
$$\int_{-\infty}^{\infty} p(x)\,dx = 1$$
Usually we’ll have an equation to describe p(x).
Any equation for p(x) must satisfy the above restriction, total area = 1.
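The step from a histogram of counts to an estimate of p(x) is just a rescaling: divide each bin count by (number of samples x bin width) so the bar areas sum to 1. A minimal sketch follows; the sample generator and its parameters (6000 psi mean, 500 psi standard deviation, 500 psi bins) are illustrative assumptions.

```python
import random

def density_histogram(samples, bin_width):
    """Map each bin's left edge to a density so that sum(density * bin_width) is 1."""
    counts = {}
    for s in samples:
        left_edge = (s // bin_width) * bin_width
        counts[left_edge] = counts.get(left_edge, 0) + 1
    n = len(samples)
    return {edge: c / (n * bin_width) for edge, c in sorted(counts.items())}

random.seed(0)  # fixed seed so the run is repeatable
samples = [random.gauss(6000.0, 500.0) for _ in range(1000)]  # hypothetical strengths, psi

bin_width = 500.0
hist = density_histogram(samples, bin_width)
area = sum(density * bin_width for density in hist.values())

print(area)  # total area under the scaled histogram
```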
Q: Usually we don’t have an infinite number of samples to numerically generate a PDF p(x). So how do we get an equation for p(x)?
A: To get this curve, we’ll assume a certain form for the probability of the variable x.
That is, we will curve fit a probability distribution model.
Gaussian distribution
The most common assumed form is the ‘bell curve’, also called the Gaussian distribution. The Gaussian p(x) is completely defined by the mean and standard deviation.
Gaussian distribution:
$$p(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$$
This distribution is symmetric about the mean, and tapers in the tails following an exponential shape.
Several examples are given below as the mean ($\mu$) and standard deviation ($\sigma$) vary.
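To see that the Gaussian form satisfies the total-area restriction, we can code the standard Gaussian density and integrate it numerically. This is a sketch only: the trapezoid rule and the plus or minus 8 standard deviation limits are our choices, and the mean and standard deviation are illustrative.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Gaussian probability density with mean mu and standard deviation sigma."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

def trapezoid_area(f, a, b, steps=10_000):
    """Composite trapezoid rule for the area under f between a and b."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h

mu, sigma = 6000.0, 500.0  # illustrative values

# Integrate far enough into both tails that the neglected area is negligible
area = trapezoid_area(lambda x: gaussian_pdf(x, mu, sigma),
                      mu - 8.0 * sigma, mu + 8.0 * sigma)
print(area)
```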
Problem : There is no closed-form (analytical) solution to the integral of the Gaussian function between arbitrary limits.
Option #1: Use Numerical Integration to get the solution!! We’ve done this already.
Option #2: Use tables for probability of standard normal Gaussian PDF
standard normal = Gaussian PDF with $\mu = 0$, $\sigma = 1$, or N(0,1).
We can ‘normalize’ any quantity associated with the Gaussian PDF by removing the mean value and dividing by the standard deviation:
$$z = \frac{x - \mu_x}{\sigma_x}$$
Example: If $x$ is a Gaussian random variable with mean of 5 and standard deviation of 15
then we can produce a standard normal variable $z$ with the following operation
$$z = \frac{x - 5}{15}$$
where $z$ is just a manipulation of each of the $x$ values
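In place of a printed N(0,1) table, the standard normal probability can be evaluated with the error function from Python's math module. This is a stand-in for the table lookup, not the method used in these notes; the function names and the test value x = 20 are our own.

```python
import math

def standardize(x, mu, sigma):
    """Remove the mean and divide by the standard deviation."""
    return (x - mu) / sigma

def standard_normal_cdf(z):
    """P(Z < z) for Z ~ N(0, 1), written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu, sigma = 5.0, 15.0                   # the Gaussian variable from the example above
z = standardize(20.0, mu, sigma)        # x = 20 sits one standard deviation above the mean
p = standard_normal_cdf(z)              # probability that x is below 20

print(z, p)
```

Any probability statement about x can be answered this way: convert the limit to standard normal space first, then evaluate the N(0,1) probability.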
For problem #1,
find: for
In standard normal space this is equal to:
for
Tables for N(0,1) say the answer is
There is a built-in Mathcad function that contains this table information (nice!!)
For problem #2,
find: for
Look at the red area on the graph on the previous page; it is calculated by:
for
In standard normal space this is equal to:
for
Example #3) How do we use these probability models?
Once put into service, the maximum weight placed on a single concrete column is expected to be 200 kips. The concrete column has a cross-sectional area of 50 in^2. The $\mu$ and $\sigma$ of the concrete strength are the same as in the previous example. What is the probability that the column will fail under the maximum expected load?
Answer:
$\mu$ x 50 in^2 = 300,000 lbs.
$\sigma$ x 50 in^2 = 25,000 lbs.
load = 200,000 lbs.
Find: $P(x < 200{,}000)$, where $x$ is the load the column can carry, in lbs
from N(300000,25000) space to N(0,1) space, the equivalent is
$$P\left(z < \frac{200000 - 300000}{25000}\right) = P(z < -4)$$
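The numbers in this example can be pushed through to a probability. A minimal sketch, assuming the error-function form of the N(0,1) probability in place of the table; the strength values of 6000 psi and 500 psi are implied by 300,000 lbs / 50 in^2 and 25,000 lbs / 50 in^2.

```python
import math

def standard_normal_cdf(z):
    """P(Z < z) for Z ~ N(0, 1); erfc keeps precision far out in the lower tail."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

area = 50.0               # in^2
mu_strength = 6000.0      # psi, implied by 300,000 lbs / 50 in^2
sigma_strength = 500.0    # psi, implied by 25,000 lbs / 50 in^2
load = 200_000.0          # lbs (200 kips)

mu_capacity = mu_strength * area        # 300,000 lbs
sigma_capacity = sigma_strength * area  # 25,000 lbs

# Move the load from N(300000, 25000) space to N(0, 1) space
z = (load - mu_capacity) / sigma_capacity
p_fail = standard_normal_cdf(z)

print(z, p_fail)
```

The load sits four standard deviations below the mean capacity, so the failure probability is on the order of a few in 100,000.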
Under certain circumstances, the probability of $G < 0$ can be determined without numerical methods. For example, if our system variables (resistance and load) are both Gaussian, the description for $G$ is also Gaussian. Combining Gaussian variables is done as follows:
Say we want to combine four Gaussian random variables like so:
$$y = a + c - f + d$$
The resultant $y$ is also Gaussian with the following properties:
mean:
$$\mu_y = \mu_a + \mu_c - \mu_f + \mu_d$$
standard deviation (SRSS, the square root of the sum of the squares):
$$\sigma_y = \sqrt{\sigma_a^2 + \sigma_c^2 + \sigma_f^2 + \sigma_d^2}$$
Note that all $\sigma^2$ terms are added, regardless of the minus sign in the equation, while the mean combination includes the minus sign in front of the $\mu_f$ term.
Applying this concept to the limit state equation $G = \text{resistance} - \text{load}$, the probability of failure can be determined as the probability that the resultant Gaussian variable is less than zero:
$$\text{prob(failure)} = \int_{-\infty}^{0} p(G)\,dG.$$
This can be evaluated as we did in the probability section. We are assuming all Gaussian variables are uncorrelated for simplicity.
We will do an example or two in class on the board. I’ve left some room below to take notes.
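The combination rule for uncorrelated Gaussian variables (means add with their signs, standard deviations combine as the square root of the sum of the squares) is short enough to code directly. In this sketch the resistance values are taken from Example #3, while the 20,000 lbs standard deviation on the load is a made-up number for illustration.

```python
import math

def combine_gaussians(terms):
    """Combine uncorrelated Gaussian terms given as (sign, mean, std), sign = +1 or -1."""
    mean = sum(sign * mu for sign, mu, _ in terms)
    std = math.sqrt(sum(sd ** 2 for _, _, sd in terms))  # signs do not matter here
    return mean, std

def standard_normal_cdf(z):
    """P(Z < z) for Z ~ N(0, 1)."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

# Limit state G = resistance - load, both Gaussian and uncorrelated
resistance = (+1, 300_000.0, 25_000.0)  # lbs, from Example #3
load = (-1, 200_000.0, 20_000.0)        # lbs; the 20,000 standard deviation is hypothetical

mu_g, sigma_g = combine_gaussians([resistance, load])

# prob(failure) = P(G < 0), evaluated in standard normal space
p_fail = standard_normal_cdf((0.0 - mu_g) / sigma_g)

print(mu_g, sigma_g, p_fail)
```

Making the load uncertain raises the failure probability well above the fixed-load value, even though the mean load is unchanged.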
Sometimes an analytical solution is not possible (or is difficult) because we can’t easily find the resultant distribution of $G$. The nice formulas on the previous page to find $\mu_G$ and $\sigma_G$ are not easily obtained. Further, if the distribution for $G$ is not Gaussian (like we’ve been using so far), then we need more information beyond $\mu_G$ and $\sigma_G$ to get a suitable description of $G$.
case 1
We can combine as many different Gaussian variables as we like so long as we are adding and subtracting, and still use the analytical approach. However, often there is a need to combine variables in forms other than adding and subtracting. For example, recall that the tip deflection of a cantilevered beam is
$$\text{deflection} = \frac{P L^3}{3 E I},$$
where $P$ is the tip load and $L$ is the beam length.
Suppose our system is such a beam with a random load on its tip. Suppose also that the length, E, and I are random with their own Gaussian distributions. If we define failure as deflection exceeding a limit, the limit state would look like this:
$$G = \text{threshold} - \frac{P L^3}{3 E I},$$
where threshold is the limit to deflection.
If the deflection exceeds the threshold, then G < 0, indicating system failure as we define it. Now we are not combining Gaussian variables through addition / subtraction. The nice formulas on the previous page to find $\mu_G$ and $\sigma_G$ do not apply here. We’ll look at how to solve this after case 2.
case 2
The probability distribution of the load may differ from Gaussian. This can also be true for the resistance. If the variables in the limit state vary from a Gaussian distribution, we can count on $G$ also differing from Gaussian. For the simple case of $G = \text{resistance} - \text{load}$ where either the resistance or the load or both aren’t Gaussian, again our nice formulas don’t apply.
Some special cases can still be solved analytically, but in general it takes great effort to do this.
Solution - Simulation
In both cases above it can be important to account for this change in the distribution of $G$ from Gaussian. To handle these cases, we employ simulation methods. This is a purely numerical, brute-force method of simulating many random numbers. Each set of random numbers fits the distribution of one of the variables in the limit state equation. We then run many experiments on the computer and count how many times out of, say, 1,000,000 our limit state is < 0. Probability of failure is then
$$\text{prob(failure)} = \frac{\text{number of experiments with } G < 0}{\text{total number of experiments}}$$
The more experiments we perform, the more reliable our final answer becomes. This is because it takes many random numbers to accurately describe a given distribution. Examples given on the board.
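The simulation idea can be sketched for the cantilever beam of case 1. Every number below (means, standard deviations, the 0.25 in threshold) is a hypothetical value chosen for illustration, and the function name is ours; only the limit state form, threshold minus P L^3 / (3 E I), comes from the notes.

```python
import random

def simulate_failure_probability(n_trials, threshold, seed=0):
    """Fraction of random experiments in which G = threshold - deflection is < 0."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    failures = 0
    for _ in range(n_trials):
        # Draw one random value per variable (all Gaussian, all hypothetical)
        P = rng.gauss(1000.0, 200.0)    # tip load, lbs
        L = rng.gauss(120.0, 1.0)       # beam length, in
        E = rng.gauss(29.0e6, 1.5e6)    # elastic modulus, psi
        I = rng.gauss(100.0, 5.0)       # moment of inertia, in^4

        deflection = P * L ** 3 / (3.0 * E * I)
        G = threshold - deflection      # limit state
        if G < 0:
            failures += 1
    return failures / n_trials

p_fail = simulate_failure_probability(100_000, threshold=0.25)
print(p_fail)
```

Swapping a non-Gaussian generator in for any of the draws (case 2) changes nothing else in the loop, which is the main appeal of the brute-force approach. Rerunning with more trials or a different seed shows how much the estimate still wobbles.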