Chapter 1

Descriptive statistics: utilizes numerical and graphical methods to look for patterns in a data set, to summarize the information revealed in a data set, and to present that information in a convenient form

Inferential statistics: utilizes sample data to make estimates, decisions, predictions, and other generalizations about a large set of data

Population: a set of units that we are interested in studying

Sample: a subset of the units of a population

Census and Sample: Measurement is a process we use to assign numerical values to variables of individual population units. When we measure a variable for every unit of a population, the result is a census of the population. If we measure only some of the units in a population, the result is a sample of the population.

Elements of descriptive statistical problems:

The population or sample of interest

One or more variables that are to be investigated

Tables, graphs, or numerical summary tools

Identification of patterns in the data

Elements of inferential statistical problems:

The population of interest

One or more variables that are to be investigated

The sample of population units

The inference about the population based on information contained in the sample

A measure of the reliability of the inference

Quantitative data: measurements that are recorded on a naturally occurring numerical scale

Qualitative data: measurements that cannot be recorded on a natural numerical scale; they can only be classified into one of a group of categories

Representative sample: exhibits characteristics typical of those possessed by the target population

Measure of reliability: a statement (usually quantitative) about the degree of uncertainty associated with a statistical inference

Chapter 2

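Box plot: a graph built from the quartiles; a measurement that lies more than 1.5*IQR below the lower quartile or above the upper quartile is flagged as a potential outlier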

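Quantile-Quantile Plot: a chart of dots that plots the quantiles of the data against the quantiles of a reference distribution (often the normal); dots that fall close to a straight line suggest the data follow that distribution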

Stem-and-Leaf Display: lays out all of the numbers, with each one split into a stem (its leading digits) and a leaf (its final digit)

Mean: the average

Median: the middle number when the numbers are in order

Mode: the number that occurs the most

Range: largest number minus smallest number
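The four summaries above can be illustrated with a small sketch using Python's standard `statistics` module. The data set is made up for illustration and does not come from the notes.

```python
import statistics

# Hypothetical sample of eight measurements (not from the notes).
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)          # sum of the values divided by n
median = statistics.median(data)      # middle value; with an even n, the average of the two middle values
mode = statistics.mode(data)          # value that occurs most often
value_range = max(data) - min(data)   # largest number minus smallest number

print(mean, median, mode, value_range)
```

Because n is even here, the median is the average of the two middle values (4 and 5).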

Standard Deviation (s): defined as the positive square root of the sample variance (s^2), or $s = \sqrt{s^2}$

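Variance (s^2): equal to the sum of the squared distances from the mean, divided by (n-1), or

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} = \frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n-1}$$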
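As a quick check, the sketch below computes the sample variance two ways, from the squared distances from the mean and from the equivalent shortcut form based on the sums of x and x squared, and then takes the square root to get s. The data values are hypothetical.

```python
import math
import statistics

# Hypothetical sample (not from the notes).
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)
x_bar = sum(data) / n

# Definition: sum of squared distances from the mean, divided by (n - 1).
var_definition = sum((x - x_bar) ** 2 for x in data) / (n - 1)

# Shortcut: (sum of x^2 - (sum of x)^2 / n), divided by (n - 1).
var_shortcut = (sum(x * x for x in data) - sum(data) ** 2 / n) / (n - 1)

# Standard deviation: positive square root of the sample variance.
s = math.sqrt(var_definition)

print(var_definition, var_shortcut, s)
print(statistics.variance(data), statistics.stdev(data))  # library values for comparison
```

Both forms give the same value; the shortcut avoids computing the mean first, which is convenient for hand calculation.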

Upper Quartile: the 75th percentile

Lower Quartile: the 25th percentile

IQR: distance between the lower and upper quartile
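A minimal sketch of the quartiles, the IQR, and the 1.5*IQR cutoff used with box plots is shown below. It assumes the default interpolation method of `statistics.quantiles`; textbooks and software differ slightly in how they compute quartiles, so hand-calculated values may not match exactly. The data set is hypothetical.

```python
import statistics

# Hypothetical sample with one unusually large value (not from the notes).
data = [2, 4, 4, 5, 7, 9, 30]

# n=4 splits the data into quarters and returns the three cut points.
q_lower, q_median, q_upper = statistics.quantiles(data, n=4)
iqr = q_upper - q_lower

# Values more than 1.5 * IQR beyond the quartiles are flagged as potential outliers.
lower_fence = q_lower - 1.5 * iqr
upper_fence = q_upper + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(q_lower, q_upper, iqr, outliers)
```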

Chebyshev's Rule: Generally, at least 1 − 1/k^2 of the measurements will fall within k standard deviations of the mean for any number k greater than 1, regardless of the shape of the frequency distribution. (a) At least 3/4 of the measurements will fall within the interval (x̄−2s, x̄+2s) for samples and (μ−2σ, μ+2σ) for populations. (b) At least 8/9 of the measurements will fall within the interval (x̄−3s, x̄+3s) for samples and (μ−3σ, μ+3σ) for populations.
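The bound in Chebyshev's Rule can be checked numerically. The sketch below uses hypothetical helper names (`chebyshev_bound`, `fraction_within`) and a made-up skewed sample to show that the observed fraction is at least the guaranteed minimum.

```python
import statistics

def chebyshev_bound(k):
    """Minimum fraction of measurements within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's Rule applies only for k > 1")
    return 1 - 1 / k ** 2

def fraction_within(data, k):
    """Observed fraction of the sample inside (x_bar - k*s, x_bar + k*s)."""
    x_bar = statistics.mean(data)
    s = statistics.stdev(data)
    inside = [x for x in data if x_bar - k * s < x < x_bar + k * s]
    return len(inside) / len(data)

# Hypothetical, strongly skewed sample (not from the notes).
data = [2, 4, 4, 5, 7, 9, 30]

for k in (2, 3):
    print(k, chebyshev_bound(k), fraction_within(data, k))
```

The rule gives only a lower bound, so the observed fraction is usually larger than 1 − 1/k^2.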

Empirical Rule: the empirical rule is a rule of thumb that applies to samples or populations with frequency distributions that are mound-shaped. (a) Approximately 68% of the measurements will fall within the interval (x̄−s, x̄+s) for samples and (μ−σ, μ+σ) for populations. (b) Approximately 95% of the measurements will fall within the interval (x̄−2s, x̄+2s) for samples and (μ−2σ, μ+2σ) for populations. (c) Approximately 99.7% of the measurements will fall within the interval (x̄−3s, x̄+3s) for samples and (μ−3σ, μ+3σ) for populations.
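One way to see the Empirical Rule at work is to simulate mound-shaped data and count how much of it falls within 1, 2, and 3 standard deviations of the mean. The sketch below is only an illustration: the normal distribution, its parameters (mean 50, standard deviation 10), the sample size, and the seed are all assumptions.

```python
import random
import statistics

random.seed(0)  # fixed seed so the simulation is repeatable

# Simulated mound-shaped sample: 10,000 draws from a normal distribution.
data = [random.gauss(50, 10) for _ in range(10_000)]
x_bar = statistics.mean(data)
s = statistics.stdev(data)

def fraction_within(k):
    """Fraction of the sample inside (x_bar - k*s, x_bar + k*s)."""
    return sum(x_bar - k * s < x < x_bar + k * s for x in data) / len(data)

for k in (1, 2, 3):
    print(k, round(fraction_within(k), 3))
```

The printed fractions should land close to 68%, 95%, and 99.7%, though not exactly, because the rule is an approximation.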

Z-Score: suppose x is a measurement from a sample with mean x̄ and standard deviation s. The sample z-score of x is $z = \frac{x - \bar{x}}{s}$
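The sample z-score can be sketched in a few lines. The function name `z_score` and the data are hypothetical; the function simply measures how many sample standard deviations a measurement lies from the sample mean.

```python
import statistics

def z_score(x, sample):
    """Distance of x from the sample mean, in units of the sample standard deviation."""
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)
    return (x - x_bar) / s

# Hypothetical sample (not from the notes); its mean is 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(z_score(9, data))  # positive: 9 lies above the mean
print(z_score(2, data))  # negative: 2 lies below the mean
print(z_score(5, data))  # zero: 5 equals the mean
```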

Symmetric: the left and right halves of the distribution are mirror images of each other

Skewed: one tail of the distribution has more extreme observations than the other tail; if the bulk of the data is on the left and the long tail extends to the right, it is a right skew, and vice versa

Mound-Shaped distribution: the mean, median, and mode are all about the same

Chapter 3

Definitions:

Experiment: An experiment is an act or process of observation that leads to a single outcome that cannot be predicted with certainty.


Sample Point: A sample point is the most basic outcome of an experiment.

Sample Space: The sample space is the collection of all possible sample points of an experiment.

Event: An event is a specific collection of sample points.

Union: The union of two events A and B is the event that either A or B or both occur in a single trial of the experiment. We denote the union of A and B by the symbol A ∪ B.

Intersection: The intersection of A and B is the event that both A and B occur in a single trial of the experiment. We denote the intersection of A and B by the symbol A ∩ B.

Complementary Events: The complement of an event A is the event that A does not occur; that is, the complement of A consists of all sample points that are not in event A. We denote the complement of event A by the symbol A^c.

Mutually Exclusive Events: Events A and B are mutually exclusive events if event A and event B have no sample points in common.

Independent and Dependent: Events A and B are independent events if the occurrence of B does not alter the probability that A has occurred. That is, P(A|B) = P(A) and P(B|A) = P(B). Events A and B are dependent if they are not independent.

Additive Rule of Probability: The probability of the union of events A and B is the sum of the probability of event A and the probability of event B, minus the probability of the intersection of events A and B, i.e., P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
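The additive rule can be verified by counting sample points. The sketch below assumes a single roll of a fair six-sided die, with events A ("even number") and B ("at most 3") chosen purely for illustration; exact fractions avoid rounding error.

```python
from fractions import Fraction

# Sample space for one roll of a fair die: six equally likely sample points.
sample_space = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event when all sample points are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

A = {2, 4, 6}   # roll is even
B = {1, 2, 3}   # roll is at most 3

union_direct = prob(A | B)                           # count the union directly
union_additive = prob(A) + prob(B) - prob(A & B)     # additive rule

A_complement = sample_space - A                      # sample points not in A

print(union_direct, union_additive)
print(prob(A) + prob(A_complement), prob(A & A_complement))
```

Subtracting P(A ∩ B) corrects for the sample point 2, which would otherwise be counted twice.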

Events A and B are mutually exclusive if A ∩ B contains no sample points, that is, A and B have no sample points in common. A and A^c have no sample points in common, i.e., A and A^c are mutually exclusive.

Note: P(A ∪ A^c) = P(A) + P(A^c) and P(A ∩ A^c) = 0.

The sum of the probabilities of complementary events equals one; that is, P(A) + P(A^c) = 1. This means that event A and event A^c together cover the entire sample space.

A Venn diagram is a good graphical tool for understanding the concepts of compound events, complementary events, and mutually exclusive events.

The multiplicative rule of probability is used to find the probability of the intersection of two events. Let A and B be two events; the multiplicative rule of probability is P(A ∩ B) = P(A|B) * P(B) = P(B|A) * P(A)
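A small enumeration can illustrate the multiplicative rule. The example below is hypothetical: two marbles are drawn without replacement from an urn holding three red and two blue marbles, A is "the first marble is red", and B is "the second marble is red".

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical urn: three red marbles and two blue marbles.
marbles = ["R1", "R2", "R3", "B1", "B2"]

# All ordered draws of two marbles without replacement (equally likely).
outcomes = list(permutations(marbles, 2))

def prob(event):
    """P(event) by counting equally likely sample points."""
    return Fraction(len(event), len(outcomes))

def cond_prob(event, given):
    """P(event | given): share of the sample points in `given` that are also in `event`."""
    return Fraction(len(event & given), len(given))

A = {o for o in outcomes if o[0].startswith("R")}   # first marble is red
B = {o for o in outcomes if o[1].startswith("R")}   # second marble is red

print(prob(A & B))                 # direct count of the intersection
print(cond_prob(B, A) * prob(A))   # P(B|A) * P(A)
print(cond_prob(A, B) * prob(B))   # P(A|B) * P(B)
```

All three printed values agree, which is what the rule states.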

We say two events A and B are independent events if the occurrence of one event (say A) does not alter the probability that the other event (say B) has occurred. When A and B are independent events, P(A|B) = P(A) and P(B|A) = P(B).

If events A and B are independent, the multiplicative rule of probability becomes P(A ∩ B) = P(A) * P(B). We say events A and B are dependent events (def 3.9) if A and B are not independent.

Note: (1) P(A ∪ B) = P(A) + P(B) if events A and B are mutually exclusive, because P(A ∩ B) = 0. (2) P(A ∩ B) = P(A) * P(B) if events A and B are independent events.

The Multiplicative Rule: You are drawing one element from each of k sets of elements with sizes n_1, n_2, …, n_k, respectively. The number of different sample points of this experiment is equal to the product n_1 × n_2 × … × n_k.

Permutations Rule: You are drawing n elements from a set of N elements and arranging these n elements in a distinct order. The number of different outcomes of this experiment is equal to N! / (N−n)!.

Note: (1) N! = N * (N−1) * (N−2) * … * 2 * 1. (2) 5! = 5 * 4 * 3 * 2 * 1 = 120.

Combinations Rule: You are drawing n elements from a set of N elements. The number of different outcomes of this experiment is equal to N! / [(N−n)! * n!], since the order of the elements is not important.

Partitions Rule (pg. 160): You are partitioning a set of N elements into k groups consisting of n_1, n_2, …, n_k elements (N = n_1 + n_2 + … + n_k). The number of different outcomes of this experiment is N! / (n_1! * n_2! * … * n_k!).
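Independence can be tested by comparing P(A ∩ B) with P(A) * P(B). The sketch below again assumes one roll of a fair die; the events and the helper name `independent` are illustrative choices only.

```python
from fractions import Fraction

# Sample space for one roll of a fair die.
sample_space = frozenset(range(1, 7))

def prob(event):
    """P(event) by counting equally likely sample points."""
    return Fraction(len(event), len(sample_space))

def independent(a, b):
    """True when P(A and B) equals P(A) * P(B)."""
    return prob(a & b) == prob(a) * prob(b)

even = {2, 4, 6}
at_most_four = {1, 2, 3, 4}
at_most_three = {1, 2, 3}

print(independent(even, at_most_four))   # P = 1/3 on both sides
print(independent(even, at_most_three))  # 1/6 versus 1/4
```

Knowing the roll is at most 4 leaves the chance of an even number at 1/2, while knowing it is at most 3 lowers that chance to 1/3, so only the first pair is independent.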
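The counting rules above can be sketched with Python's `math` module (`math.prod`, `math.perm`, and `math.comb` require Python 3.8 or later). The function names are hypothetical, and the partitions count uses the standard formula N! / (n_1! * n_2! * … * n_k!).

```python
import math

def multiplicative_rule(sizes):
    """One element from each of k sets: n_1 * n_2 * ... * n_k sample points."""
    return math.prod(sizes)

def permutations_rule(N, n):
    """Ordered arrangements of n elements drawn from N: N! / (N - n)!."""
    return math.perm(N, n)

def combinations_rule(N, n):
    """Unordered selections of n elements drawn from N: N! / ((N - n)! * n!)."""
    return math.comb(N, n)

def partitions_rule(group_sizes):
    """Ways to split N = n_1 + ... + n_k elements into groups of the given sizes."""
    N = sum(group_sizes)
    return math.factorial(N) // math.prod(math.factorial(n) for n in group_sizes)

print(multiplicative_rule([2, 3, 4]))  # e.g. 2 shirts, 3 trousers, 4 hats
print(permutations_rule(5, 3))
print(combinations_rule(5, 3))
print(partitions_rule([2, 2, 1]))
```

For the same N and n, the permutations count is n! times the combinations count, because each unordered selection can be arranged in n! different orders.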

Statistics: Descriptive and Inferential Methods and Distributions - Prof. Chung-Ching Wang, Study notes of Data Analysis & Statistical Methods
