Introduction to Statistics: Key Concepts and Formulas

A concise overview of fundamental statistical concepts, including data types (categorical and quantitative; nominal, ordinal, interval, and ratio), methods for data visualization (pie charts, bar charts, histograms, stemplots, scatterplots), and key statistical rules such as the standard deviation rule and the interquartile range (IQR). It also covers sampling techniques (simple random, cluster, stratified, multistage), study types (observational, experimental), and probability rules (complement, addition, multiplication, conditional probability). The document further explains random variables, binomial experiments, and inference methods for students learning introductory statistics.

Typology: Study Guides, Projects, Research

2024/2025

Available from 06/24/2025

mariebless0

Straighterline Introduction to Statistics

  1. Four steps in the process of statistics: (1) Producing Data, (2) Exploratory Data Analysis, (3) Probability, (4) Inference
  2. Categorical variable: places individuals into one of several groups. Two types: nominal and ordinal
  3. Quantitative variable: represents a measurement or a count. Two types: interval and ratio
  4. Nominal variable: a categorical variable with no natural order among the categories
  5. Ordinal variable: a categorical variable with a natural order among the categories
  6. Interval variable: a quantitative variable for which it makes sense to talk about the difference between values, but not about the ratio between values; 0 does not represent the absence of quantity
  7. Ratio variable: a quantitative variable for which it makes sense to talk about the difference between values AND the ratio between values; 0 represents the absence of quantity
  8. What type of variable?: eye color: nominal
  9. What type of variable?: socioeconomic status with categories low, medium, high: ordinal
  10. What type of variable?: temperature (in °C or °F): interval
  11. What type of variable?: income: ratio
  12. Visual display and numerical summary for a single categorical variable: pie chart or bar chart

  1. Interquartile range (IQR): the range covered by the middle 50% of the data; IQR = Q3 - Q1
  2. Finding an outlier using IQR: an observation is considered a suspected outlier if it is less than Q1 - 1.5(IQR) or more than Q3 + 1.5(IQR)
  3. Interpreting scatterplots: a positive relationship displays as an upward slope; a negative relationship displays as a downward slope
  4. Interpreting scatterplots: how to tell if a linear relationship is strong or weak: a correlation coefficient r close to -1 indicates a strong negative linear relationship, r close to +1 indicates a strong positive linear relationship, and r close to 0 indicates a weak linear relationship
  5. Interpreting scatterplots: linear regression: finding the line that best fits the pattern of the linear relationship (the line that describes how the response variable linearly depends on the explanatory variable)
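
The 1.5(IQR) rule for suspected outliers can be tried out on a small data set. The sketch below is only an illustration: the `scores` values are made up, the helper name `iqr_outliers` is hypothetical, and the "inclusive" quartile method is an assumption (textbooks and software use several slightly different quartile conventions, so Q1 and Q3 may differ a little from a hand calculation).

```python
import statistics

def iqr_outliers(data):
    """Return (q1, q3, iqr, outliers) using the 1.5 * IQR rule."""
    # quantiles(n=4) returns the three quartile cut points: Q1, median, Q3.
    # method="inclusive" is an assumed convention, not one stated in the guide.
    q1, _median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Anything beyond either fence is a suspected outlier.
    outliers = [x for x in data if x < lower_fence or x > upper_fence]
    return q1, q3, iqr, outliers

# Made-up example data with one unusually large value.
scores = [2, 4, 4, 5, 6, 7, 8, 9, 30]
q1, q3, iqr, outliers = iqr_outliers(scores)
print("Q1 =", q1, "Q3 =", q3, "IQR =", iqr)
print("suspected outliers:", outliers)
```

With these values the fences fall at -2 and 14, so only the observation 30 is flagged.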

  1. Interpreting Scatterplots: Least Squares Regression Line: has the smallest sum of squared vertical deviations of the data points from the line.
  2. Interpreting Scatterplots: Extrapolation: prediction for values of the explanatory variable outside the range of the data; it is not reliable and should be avoided
  3. Association (does/does not) imply causation.: Does not
  4. Lurking Variable: a variable that is not among the explanatory or response variables in a study, but could substantially affect your interpretation of the relationship among those variables
  5. Simpson's paradox: When a lurking variable causes you to rethink the direction of an association
  6. Probability sampling plan: any sampling plan that relies on random selection (avoids bias).
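
To make the least squares regression line, the correlation coefficient, and the warning about extrapolation concrete, here is a minimal sketch. The data (`hours`, `scores`) are invented, and the function names `fit_least_squares` and `predict` are hypothetical helpers rather than anything defined in the course. It uses the standard closed-form formulas for the slope and intercept.

```python
import math

def fit_least_squares(xs, ys):
    """Return (slope, intercept, r) for the line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and cross-products around the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                   # minimizes the sum of squared vertical deviations
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)      # correlation coefficient, between -1 and +1
    return slope, intercept, r

def predict(x, xs, slope, intercept):
    """Predict y at x, refusing to extrapolate beyond the observed x range."""
    if not min(xs) <= x <= max(xs):
        raise ValueError("x is outside the observed range (extrapolation)")
    return intercept + slope * x

# Made-up data: explanatory variable (hours) and response variable (scores).
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]
slope, intercept, r = fit_least_squares(hours, scores)
print("slope:", slope, "intercept:", intercept, "r:", round(r, 3))
print("prediction at x = 3:", predict(3, hours, slope, intercept))
```

Raising an error for out-of-range inputs is one possible design choice; a warning would also work. It simply encodes the advice that extrapolated predictions should not be trusted.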

  1. Experiment: researchers "take control" of the values of the explanatory variable because they want to see how changes in the value of the explanatory variable affect the response variable
  2. The Complement Rule: P(not A) = 1 - P(A); useful for finding the probability of events of the type "at least one of..."
  3. General Addition Rule: P(A or B) = P(A) + P(B) - P(A and B); used to find the probability of events of the type "A or B"
  4. General Multiplication Rule: P(A and B) = P(A) * P(B | A); used for events of the type "A and B". When A and B are independent, this reduces to P(A and B) = P(A) * P(B)
  5. Conditional probability P(B | A): the probability of event B occurring given that event A has occurred; P(B | A) = P(A and B) / P(A)
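
The probability rules above can be checked by counting outcomes in a small sample space. The sketch below assumes a standard 52-card deck with one card drawn at random, with A = "the card is a heart" and B = "the card is a face card (J, Q, K)". The example and the helper names are illustrative choices, not part of the original guide.

```python
from fractions import Fraction
from itertools import product

# Build the sample space: every (rank, suit) pair is equally likely.
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))

def prob(event):
    """Probability of an event = favorable outcomes / total outcomes."""
    favorable = sum(1 for card in deck if event(card))
    return Fraction(favorable, len(deck))

def is_heart(card):      # event A
    return card[1] == "hearts"

def is_face(card):       # event B
    return card[0] in {"J", "Q", "K"}

p_a = prob(is_heart)
p_b = prob(is_face)
p_a_and_b = prob(lambda card: is_heart(card) and is_face(card))

p_not_a = 1 - p_a                    # complement rule
p_a_or_b = p_a + p_b - p_a_and_b     # general addition rule
p_b_given_a = p_a_and_b / p_a        # conditional probability

print("P(not A) =", p_not_a)
print("P(A or B) =", p_a_or_b)
print("P(B | A) =", p_b_given_a)
# General multiplication rule: P(A and B) = P(A) * P(B | A)
print("multiplication rule holds:", p_a * p_b_given_a == p_a_and_b)
```

Using `Fraction` keeps every probability exact, so the rules can be compared with `==` instead of a floating-point tolerance.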

  1. Methods for checking for independence of two events: A and B are independent if any one of the following holds: P(A | B) = P(A); P(B | A) = P(B); P(B | A) = P(B | not A); P(A and B) = P(A) * P(B)
  2. Discrete vs. continuous random variable: discrete random variables are things we count; continuous random variables are things we measure
  3. Probability distribution: a list of a variable's possible values and their corresponding probabilities
  4. Center of a random variable distribution is measured by its: mean
  5. Spread of a random variable distribution is measured by its: variance or standard deviation
  6. Rules for the linear transformation of one random variable: $\mu_{a+bX} = a + b\mu_X$
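
As a worked illustration of a probability distribution, its mean and variance, and the linear transformation rule for the mean, consider X = the number of heads in two fair coin flips. This example, the constants a = 3 and b = 2, and the helper names are assumptions chosen for illustration.

```python
from fractions import Fraction

# Probability distribution of X: possible values mapped to their probabilities.
dist_x = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def mean(dist):
    """Center: sum of value * probability."""
    return sum(value * p for value, p in dist.items())

def variance(dist):
    """Spread: probability-weighted squared deviation from the mean."""
    mu = mean(dist)
    return sum((value - mu) ** 2 * p for value, p in dist.items())

# Linear transformation Y = a + bX (a and b chosen arbitrarily here).
a, b = 3, 2
dist_y = {a + b * value: p for value, p in dist_x.items()}

print("mean of X:", mean(dist_x))
print("variance of X:", variance(dist_x))
print("mean of Y computed directly:", mean(dist_y))
print("a + b * mean of X:", a + b * mean(dist_x))
```

The last two printed values agree, which is what the rule for the mean of a + bX predicts.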

  1. Parameter vs. statistic: a parameter is a number that describes the population; a statistic is a number that is computed from the sample
  2. Standard deviation of all sample proportions: $\sqrt{p(1-p)/n}$
  3. Standard deviation of all sample means: $\sigma/\sqrt{n}$
  4. 3 types of inference in this course: point estimation, interval estimation, hypothesis testing
  5. In point estimation, estimate the population proportion using the ____, and the population mean using the ____: sample proportion, sample mean
  6. General formula of confidence intervals: point estimate ± margin of error
  7. Confidence intervals for the population mean: $\bar{x} \pm z \cdot \frac{\sigma}{\sqrt{n}}$
  8. Various values of z for different levels of confidence: 90% = 1.645 times the standard deviation of the sample mean; 95% = 2 (or, more precisely, 1.96) times the standard deviation of the sample mean; 99% = 2.576 times the standard deviation of the sample mean
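
The standard deviation formulas and the confidence interval for a population mean can be sketched in a few lines. The function names and the numbers in the example (sample mean 100, population standard deviation 15, sample size 36) are hypothetical. The z multipliers are recovered from the standard normal distribution rather than typed in, which assumes the population standard deviation is known.

```python
import math
from statistics import NormalDist

def sd_sample_proportion(p, n):
    """Standard deviation of all sample proportions: sqrt(p(1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)

def sd_sample_mean(sigma, n):
    """Standard deviation of all sample means: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

def z_for_confidence(level):
    """z multiplier that leaves (1 - level) / 2 in each tail of the standard normal."""
    return NormalDist().inv_cdf(1 - (1 - level) / 2)

def mean_confidence_interval(x_bar, sigma, n, level=0.95):
    """Point estimate plus or minus margin of error, with known sigma."""
    margin = z_for_confidence(level) * sd_sample_mean(sigma, n)
    return x_bar - margin, x_bar + margin

for level in (0.90, 0.95, 0.99):
    print(level, "->", round(z_for_confidence(level), 3))

# Made-up example: sample mean 100, sigma 15, n = 36.
low, high = mean_confidence_interval(100, 15, 36, level=0.95)
print("95% CI:", round(low, 2), "to", round(high, 2))
```

Raising the confidence level widens the interval, because the z multiplier grows while the standard deviation of the sample mean stays the same.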