An introduction to statistics covering descriptive and inferential statistics, probability, populations, samples, parameters, and statistics. These notes explain statistical inference, the difference between a census and a sample, observational units, variables, frequency and relative-frequency distributions, measures of center (mean and median), measures of variation (range, quartiles, and interquartile range), and the five-number summary.
Typology: Study notes
Statistics Test # 1

Chapter 1 ---

Statistics – Science of collecting, summarizing, analyzing, and interpreting data (information, usually in the form of numbers)
o Descriptive Statistics – Summarizing the data
o Inferential Statistics – Analyzing and interpreting data. Inductive thought process
Probability – Field of mathematics concerned with determining the relative frequency of certain events. Deductive thought process
Population – Collection of all elements (individuals or items) under consideration
o Example: All the possible tosses of a coin
Sample – A part of the population from which information is obtained
Parameter – A numerical characteristic (descriptive measure) of the population
Statistic – A numerical characteristic (descriptive measure) of a sample
Statistical Inference – Inductive thought process – Based on observation
o Either experimental or hypothesis testing
Census – 100% enumeration of a population
Why take a sample instead of a census?
o Convenience, Necessity, and Accuracy
o Random Sample – Elements are drawn at random with replacement
o Simple Random Sample – Elements are drawn at random without replacement
o Sampling Frame – List of the elements in the population from which the sample will be drawn (N = # of elements in population)

Chapter 2 ---

Observational Unit – The element or object described
Variable – The measured characteristic for an observational unit
o Qualitative – Simply classifies elements into categories or classes
   Example: non-numeric values, numeric values used only to name an object, ordinal values
o Quantitative – Numerical data for which differences in the values have meaning
   Example: possible values are finite or countably infinite (discrete), or the possible values form an interval (continuous)
Frequency Distribution – Two-column table listing the values of the variable and the frequency of each value
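A frequency distribution can be tallied in a few lines. This is a minimal sketch only: the `siblings` data and the `frequency_distribution` helper are made up for illustration and do not come from the notes.

```python
from collections import Counter


def frequency_distribution(values):
    """Return (value, frequency) rows sorted by value: the two-column table."""
    counts = Counter(values)  # tallies how often each value occurs
    return sorted(counts.items())


# Hypothetical data: number of siblings reported by 10 students.
siblings = [1, 0, 2, 1, 3, 1, 0, 2, 1, 1]
table = frequency_distribution(siblings)

print("value  frequency")
for value, freq in table:
    print(f"{value:>5}  {freq:>9}")
```

The frequencies always add up to the number of observations, which is a quick check that no value was dropped.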
Relative Frequency Distribution – Two-column table listing the values of the variable and the relative frequency (frequency divided by the number of observations) of each value
o Relative frequencies always total 1; write them as decimals
Bar Chart – Graph with values of a qualitative variable on the horizontal axis and either the frequencies or relative frequencies on the vertical axis
o *** Bars don’t touch each other!!
Stem and Leaf Diagram – Used for displaying the distribution of the values of a quantitative variable. Leading digits are used as stems and the next digits as leaves
o Gives a quick way to look at the main features of the distribution: Center, Spread, Shape, Outliers
o To determine shape (skewed to right/left or symmetric), turn the graph on its side; the skewness is toward the side with the longest tail
o Use for small sets of data
Classes – Categories for grouping data
Frequency – The number of observations that fall in a class
Frequency Distribution – A listing of all classes and their frequencies
Relative Frequency – The ratio of the frequency of a class to the total number of observations
Relative-Frequency Distribution – A listing of all classes and their relative frequencies
Lower Cutpoint – The smallest value that could go in a class
Upper Cutpoint – The smallest value that could go in the next higher class (equivalent to the lower cutpoint of the next higher class)
Midpoint – The middle of the class, found by averaging its cutpoints
Width – The difference between the cutpoints of a class
Histogram – Graph with values of a quantitative variable on the horizontal axis and either the frequencies or the relative frequencies on the vertical axis
o **** Bars touch each other!!!!
o Use for large data sets
o Want 5-12 bars
Measures of center of a distribution
o Mean – Arithmetic average of the measurements (Notation: $\bar{x}$). More susceptible to extreme measurements than the median
o Median – The number such that half of the measurements are greater and half are smaller than the number (Notation: m)
Measures of Variation
o Range – The distance between the smallest and the largest measurements (Notation: range = largest observation – smallest observation)
o First Quartile – The value such that one fourth of the measurements are less than that value (Notation: Q1)
o Third Quartile – The value such that one fourth of the measurements are greater than that value (Notation: Q3)
o IQR – The distance between the first and third quartiles (Notation: IQR = Q3 – Q1)
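The measures of center and variation can be checked on a small data set. This is a rough sketch, assuming quartiles are taken as the medians of the lower and upper halves of the sorted data. That is one common textbook convention, and others give slightly different values. The `scores` data and the `quartiles` helper are made up for illustration.

```python
from statistics import mean, median


def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumed convention: when the number of values is odd, the middle
    value is left out of both halves.
    """
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two measurements")
    half = len(data) // 2
    lower = data[:half]   # smallest half of the sorted data
    upper = data[-half:]  # largest half of the sorted data
    return median(lower), median(upper)


# Hypothetical measurements with one extreme value (30).
scores = [2, 4, 4, 5, 7, 8, 9, 10, 12, 30]

x_bar = mean(scores)                     # 9.1, pulled upward by the 30
m = median(scores)                       # 7.5, barely affected by the 30
data_range = max(scores) - min(scores)   # 30 - 2 = 28
q1, q3 = quartiles(scores)               # 4 and 10
iqr = q3 - q1                            # 6

print(x_bar, m, data_range, q1, q3, iqr)
```

The mean (9.1) lands well above the median (7.5) because of the single extreme measurement, which is the sense in which the mean is more susceptible to extreme values.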
Rule of thumb for outliers: Less than Q1 – 1.5 x IQR, or greater than Q3 + 1.5 x IQR
Five-Number Summary – Describes the distribution with the numbers: minimum, Q1, m, Q3, maximum
Boxplot – Graphical display of the five-number summary
o Good for comparing several distributions
Measures of Spread
o Variance – Measures spread as a function of the distance of measurements from the mean (Notation: $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$, where n is the number of measurements in the sample)
o Standard Deviation – The square root of the variance (Notation: $s = \sqrt{s^2}$)
Measuring Center and Spread in populations
o Population Mean – $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$, where N is the number of elements in the population
o Population Variance – $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2$
o Population Standard Deviation – $\sigma = \sqrt{\sigma^2}$
Chebyshev’s Theorem – For any distribution, a proportion of at least 1 – 1/k^2 of the measurements must lie within k standard deviations of the mean (for k > 1)
Empirical Rule – For the normal distribution, approximately 68% of the measurements are within one standard deviation of the mean, approximately 95% are within two standard deviations, and approximately 99.7% are within three standard deviations of the mean
Z Score – Measures the number of standard deviations a measurement, x, is from the mean (Notation: $z = \frac{x - \bar{x}}{s}$)
Scatterplot – Plot of each element as a point on two-dimensional axes (x on the horizontal axis, y on the vertical)
o Linear relationship – Points fall along a line with a slope not equal to 0
o Positive linear relationship – Points fall along a line with a slope greater than 0
o Negative linear relationship – Points fall along a line with a slope less than 0
Linear Correlation Coefficient – Measures the strength of the linear relationship between two variables x and y (Notation: $r = \frac{1}{n-1}\sum_{i=1}^{n}\frac{(x_i - \bar{x})(y_i - \bar{y})}{s_x s_y}$)
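The outlier rule, variance, z score, Chebyshev bound, and correlation coefficient can all be tried numerically. The sketch below is illustrative only: the helper names and the data are made up, and the correlation uses the usual sample form, which divides by n – 1 and by the two sample standard deviations. In Python's `statistics` module, `variance` and `stdev` divide by n – 1 (sample), while `pvariance` divides by N (population).

```python
from statistics import mean, pvariance, stdev, variance


def outlier_fences(q1, q3):
    """Lower and upper fences from the 1.5 x IQR rule of thumb."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def z_score(x, center, spread):
    """Number of standard deviations that x lies from the mean."""
    return (x - center) / spread


def chebyshev_bound(k):
    """Minimum proportion of measurements within k standard deviations."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires k > 1")
    return 1 - 1 / k**2


def correlation(xs, ys):
    """Linear correlation coefficient r for paired measurements."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    cross = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return cross / ((n - 1) * s_x * s_y)


# Hypothetical data, chosen so the arithmetic is easy to check by hand.
data = [2, 4, 4, 4, 5, 5, 7, 9]
sample_var = variance(data)       # 32 / 7, divides by n - 1
population_var = pvariance(data)  # 32 / 8 = 4, divides by N

low, high = outlier_fences(q1=4, q3=10)  # fences at -5.0 and 19.0
z = z_score(85, center=70, spread=10)    # 1.5 standard deviations above
bound = chebyshev_bound(2)               # at least 0.75 within 2 SDs

hours = [1, 2, 3, 4, 5]
marks = [2, 4, 6, 8, 10]          # exactly marks = 2 * hours
r = correlation(hours, marks)     # perfect positive linear relationship

print(sample_var, population_var, low, high, z, bound, r)
```

With Q1 = 4 and Q3 = 10, any measurement below -5 or above 19 would be flagged as a possible outlier. The Chebyshev bound of 0.75 for k = 2 holds for any distribution, so it is weaker than the roughly 95% that the Empirical Rule gives for the normal distribution.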