Stats: Population vs. Sample, Sampling Techniques, Central Tendency, Probability, Study notes of Statistics

An overview of the concepts of population and sample, different sampling techniques, measures of central tendency, and probability distributions. It covers topics such as voluntary and convenience sampling, nominal, ordinal, interval, and ratio data, stratified and cluster sampling, systematic sampling, randomized response technique, skewness, mean, median, trimmed mean, interquartile range, five number summary, box and whisker plot, probability represented by a histogram, binomial distributions, long run average, standard deviation of probability, normal distribution, empirical rule, standard normal distribution, converting z-score to raw score, and converting r to z. It also includes examples and exercises.

Typology: Study notes

Pre 2010

Uploaded on 09/21/2008

rachelhoagland

* Population (the whole group we're interested in) vs. sample (a subset of the population). Example: the National Rifle Association wants to know how urban residents feel about proposed laws limiting handgun possession. They send a professional survey team to interview 100 people selected at random. (population = all urban residents, sample = the 100 selected).
* Voluntary and convenience samples are biased.
* Levels of measurement: nominal (names), ordinal (rankings), interval (numeric values; the difference between two values is meaningful), ratio (ratios have to mean something, which requires an ABSOLUTE ZERO, e.g. kg). 4 degrees F is not really 2 times warmer than 2 degrees, so Fahrenheit temperature is interval, not ratio.
* Stratified sample (ALL and then SOME: some members from all of the strata) vs. cluster sample (SOME and then ALL: all members from some of the clusters).
* Systematic sample (every 10th person).
* Randomized response technique. (Example: )
* Skewed right (long tail to the right) vs. skewed left (long tail to the left).
* Mean: x bar, $\bar{x} = \frac{\sum x}{n}$ (total divided by # of values).
* Median rank: $\frac{n + 1}{2}$, the position of the median in the sorted data.
* Trimmed mean: cut off the highest and lowest values, vs. Winsorized mean: change the outliers to the closest remaining extremes.
* Deviation: $x - \bar{x}$, each x minus the mean, which shows how far each value is from the mean.
* SAMPLE VARIANCE (use when there is not much spread): $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$. MAKE SURE TO SQUARE EACH DEVIATION FIRST SO THE POSITIVE AND NEGATIVE DEVIATIONS DO NOT CANCEL OUT.
* Coefficient of variation (used when the numbers are larger, to show the spread between values in relation to the size of the values): $CV = 100 \cdot \frac{s}{\bar{x}}$. It has no units, so it can be used to compare spread for 2 different kinds of units.
* Q1 = median of lower half, Q2 = median, Q3 = median of upper half.
* IQR: difference between Q3 and Q1 (use this when there is a lot of spread).
* 5 number summary: lowest number, Q1, Q2, Q3, highest number.
* Box and whisker plot: first line is the lowest number, beginning of box is Q1, waist of the box is Q2, end of box is Q3, last line is the highest number.
* Probability represented by a histogram: height of each bar is the probability of obtaining the corresponding value of the variable.
* Continuous probability: the probability of any single value is zero.
* Binomial distributions: use when you see percentages. Must satisfy 4 conditions (fixed number of trials, only success or failure, probability of success is the same each trial, trials are independent). USE THE BINOMIAL TABLE TO FIND PROBABILITY. r = number of successes. Find p (probability of success) and n (number of trials), then find r (how many successes it is asking for) and look at the chart. (If p < .5 it is skewed right, if p > .5 it is skewed left.)
* Long run average: the mean (expected value of the random variable) = n times p, $\mu = np$.
* Standard deviation of a binomial distribution: $\sigma = \sqrt{npq}$, where q = 1 - p = probability of not success.
* Normal Distribution: highest point is at the mean, it is symmetrical, the curve approaches the horizontal axis but never touches or crosses it, and the inflection points between cupping upward and downward occur above the mean plus the standard deviation and the mean minus the standard deviation. If the standard deviation is larger, the curve will be more spread out. Probability corresponds to the area under the curve.
* EMPIRICAL RULE: about 68% of the observations lie within one standard deviation of the mean, 95% lie within 2 standard deviations of the mean, and 99.7% lie within three standard deviations of the mean.
* Z-SCORE: an observation x is converted to a z-score by subtracting the mean and dividing by the standard deviation, $z = \frac{x - \mu}{\sigma}$. The z value tells how many standard deviations above or below the mean the observation lies.
* CONVERTING Z-SCORE TO RAW SCORE: solving the z-score formula for x yields $x = \mu + z\sigma$.
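The descriptive measures in these notes (mean, trimmed and Winsorized means, sample variance, coefficient of variation, five number summary, IQR) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data set is invented, the helper names `trimmed_mean`, `winsorized_mean`, and `five_number_summary` are hypothetical, and the quartiles follow the "median of the lower/upper half" rule from the notes (other textbooks and software use slightly different quartile rules).

```python
from statistics import mean, median, stdev, variance

# Hypothetical data set, invented for illustration (note the outlier 39).
data = [2, 4, 4, 5, 7, 8, 9, 10, 12, 39]


def trimmed_mean(values, k=1):
    """Mean after cutting off the k lowest and k highest values."""
    ordered = sorted(values)
    return mean(ordered[k:len(ordered) - k])


def winsorized_mean(values, k=1):
    """Mean after replacing the k lowest/highest values with the closest kept value."""
    ordered = sorted(values)
    low, high = ordered[k], ordered[-k - 1]
    return mean(min(max(v, low), high) for v in ordered)


def five_number_summary(values):
    """(lowest, Q1, Q2, Q3, highest); needs at least two values."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]    # lower half (excludes the middle value when n is odd)
    upper = ordered[-half:]   # upper half
    return ordered[0], median(lower), median(ordered), median(upper), ordered[-1]


x_bar = mean(data)                 # 10
s2 = variance(data)                # sample variance, divides by n - 1
s = stdev(data)                    # sample standard deviation, original units
cv = 100 * s / x_bar               # coefficient of variation, in percent
lowest, q1, q2, q3, highest = five_number_summary(data)
iqr = q3 - q1

print("mean:", x_bar)
print("trimmed mean:", trimmed_mean(data))        # 7.375
print("Winsorized mean:", winsorized_mean(data))  # 7.5
print("variance:", s2, "std dev:", s, "CV %:", cv)
print("five number summary:", (lowest, q1, q2, q3, highest), "IQR:", iqr)
```

With one large outlier in the data, the trimmed and Winsorized means and the IQR move much less than the mean and the standard deviation, which is why the notes suggest the IQR when there is a lot of spread.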
* Standard normal distribution: has mean = 0 and standard deviation = 1. If the X values have a normal distribution, then the z-scores have a standard normal distribution.
* TO FIND THE Z-SCORE FROM THE PERCENTILE, LOOK AT THE CHART BACKWARDS: find the percentile (area) in the body of the chart and read off the z-score that goes with it.
* For large sample sizes, the binomial distribution approaches the normal distribution, so R = number of successes is approximately normal.
* Converting r to z: $z = \frac{r - np}{\sqrt{npq}}$
* sample variance is in squared units, sample standard deviation is in original units
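* EXAMPLE (continuity correction): 59 ≤ R, i.e. R ≥ 59, is changed to r = 58.5. (A strict R > 59 means R ≥ 60, so it would be changed to r = 59.5.)
* if R is greater than or equal to a value, subtract .5 from that value
* if R is less than or equal to a value, add .5 to that value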
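Reading the chart backwards (percentile to z-score) and converting a z-score back to a raw score can be sketched with Python's standard library instead of a printed table. The function names `z_from_percentile` and `raw_from_z` are made up for this illustration, and the mean 100 and standard deviation 16 in the usage lines are simply borrowed from the Stanford-Binet example in these notes.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def z_from_percentile(area):
    """Z-score whose area to the left equals `area` (the table read backwards)."""
    return standard_normal.inv_cdf(area)


def raw_from_z(z, mu, sigma):
    """Raw score x = mu + z * sigma."""
    return mu + z * sigma


z = z_from_percentile(0.8980)      # close to 1.27, the table entry for area .8980
x = raw_from_z(1.27, mu=100, sigma=16)
print(round(z, 2), x)
```

Software returns the z-score to many decimal places, while a printed table only resolves it to two, so small differences from table answers are expected.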
P(.45 < Z < 1.27) = P(Z < 1.27) - P(Z < .45)
= .8980 - .6736 = .2244
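The table lookups in this calculation can be checked numerically. The sketch below assumes the usual closed form for the standard normal area, Φ(z) = ½(1 + erf(z/√2)); the function name `phi` is just a label chosen here, not something from the notes.

```python
from math import erf, sqrt


def phi(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1 + erf(z / sqrt(2)))


upper = phi(1.27)   # table value .8980
lower = phi(0.45)   # table value .6736
print(round(upper, 4), round(lower, 4), round(upper - lower, 4))
```

The unrounded difference can disagree with the table answer in the last decimal place, because the table values are already rounded to four places before subtracting.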
EXAMPLE: Scores on the Stanford-Binet Intelligence Test are known to be normally distributed with mean μ = 100 and standard deviation σ = 16. Find the probability that a randomly selected person has a score between 88 and 104.
Convert the raw scores (x values) to z-scores, then use the table of areas for the standard normal distribution.
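One way to work this example is sketched below: convert 88 and 104 to z-scores, then take the difference of the two areas. The code uses `statistics.NormalDist` in place of the printed table, so the variable names are illustrative and the result may differ from a table answer in the last decimal place.

```python
from statistics import NormalDist

mu, sigma = 100, 16

# Step 1: convert raw scores to z-scores, z = (x - mu) / sigma.
z_low = (88 - mu) / sigma     # -0.75
z_high = (104 - mu) / sigma   # 0.25

# Step 2: area to the left of each z-score (table values .2266 and .5987).
standard_normal = NormalDist()
area_low = standard_normal.cdf(z_low)
area_high = standard_normal.cdf(z_high)

# Step 3: the probability of a score between 88 and 104 is the difference.
probability = area_high - area_low
print(z_low, z_high, round(probability, 4))   # about .3721
```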
EXAMPLE: In Richmond, only 40% of all children live with both their natural parents. Smog City, which is a typical area, has 150 children enrolled this year.
* What is the probability that fewer than 56 of these children live with both of their natural parents?
* What is the probability that between 59 and 70 of these children live with both of their natural parents?
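A hedged sketch of both questions follows, using the normal approximation to the binomial with the continuity correction and comparing it to the exact binomial sum. It assumes n = 150 and p = 0.40 as stated, reads "fewer than 56" as R ≤ 55, and assumes "between 59 and 70" is inclusive (59 ≤ R ≤ 70); the function names are hypothetical.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 150, 0.40
q = 1 - p
mu = n * p                 # long run average, about 60
sigma = sqrt(n * p * q)    # binomial standard deviation, about 6


def exact_probability(low, high):
    """Exact binomial P(low <= R <= high)."""
    return sum(comb(n, r) * p**r * q**(n - r) for r in range(low, high + 1))


def normal_approximation(low, high):
    """P(low <= R <= high) with the continuity correction:
    subtract .5 from the lower bound, add .5 to the upper bound."""
    standard_normal = NormalDist()
    z_low = (low - 0.5 - mu) / sigma
    z_high = (high + 0.5 - mu) / sigma
    return standard_normal.cdf(z_high) - standard_normal.cdf(z_low)


# Fewer than 56 means R <= 55, so the corrected boundary is 55.5 (z = -0.75).
fewer_than_56 = normal_approximation(0, 55)
# Between 59 and 70 (assumed inclusive): boundaries 58.5 and 70.5 (z = -0.25, 1.75).
between_59_and_70 = normal_approximation(59, 70)

print(round(fewer_than_56, 4), round(exact_probability(0, 55), 4))
print(round(between_59_and_70, 4), round(exact_probability(59, 70), 4))
```

With the table, the first answer is P(Z < -0.75) = .2266 and the second is P(Z < 1.75) - P(Z < -0.25) = .9599 - .4013 = .5586. The exact binomial sums are printed alongside only to show how close the approximation is for a sample this large.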
