









An introduction to the concepts of parameters and statistics in the context of probability and statistics. It covers the difference between parameters and statistics, probability theory, continuous variables, frequency distributions, grouped frequency distributions, and various graphical methods for representing interval/ratio data. It also discusses unimodal, bimodal, symmetric, positively skewed, and negatively skewed distributions, as well as the normal distribution and various measures of central tendency and variation.
Typology: Exams










Inductive Statement -
-A statement whose truth is assessed by observing a series of examples, by collecting and analyzing data
-ex. I choose to fly because I believe air travel is safer than driving
-ex. The surgeon general says that smoking cigarettes causes cancer

What is research design and how does it relate to statistical reasoning? -
-Research design is the science of collecting data, making observations about the real world, and considering how many observations to make and under what conditions to make them
-Statistical reasoning begins with the collected data and prescribes the rules by which rational statements about those data can be made
-Statistical reasoning and research design are intertwined, dependent skills; one cannot be a good research designer without being a good statistician, and vice versa

Pygmalion Effect -
-people act in accordance with others' expectations
-experimenter bias
-ex. If rat handlers expect quick learning, they get quick learning ("maze-bright" rats vs. "maze-dull" rats)
-ex. At Oak School, some children were labeled "bloomers" and the others were labeled "other". Many bloomers scored higher, but not all, so the result was inconclusive.

What three characteristics of the experimental statistical method do experimental explorations of the Pygmalion effect illustrate? -
-experiments are based on pre-experimental observations
-Population: includes all the members of the group under consideration (can be large or small)
-Sample: a subset of a population, a group that is interesting to us not on its own merits but because it somehow represents the larger population
-ex. the population of our class, or a sample from the school

What is the difference between a parameter and a statistic? -
-Parameter: any measured (or assumed) characteristic of a population
-Statistic: any measurement on a sample

Probability -
-a measure of our ignorance or uncertainty about the outcomes of events in the world

How do we determine a probability that an event will occur? -
-The number of outcomes favorable to that event divided by the total number of possible outcomes

How do we determine Conditional Probability? -
-The probability of an event given that another event has occurred
-P(A|B) = (# of outcomes in which A and B occur together) / (# of outcomes in which B occurs)
-ex. the conditional probability of drawing a heart given that the card is red: P(heart|red card) = 13/26 = .5

How to determine how often an event will occur -
-find the probability
-multiply the probability by the number of occasions
-ex. if the probability of drawing a heart is .25 and you perform 200 draws: .25 x 200 = 50 times

Probabilities of unusual events, very likely events, and events that occur half the time -
-unusual events: probabilities close to 0 (less than .05)
-very likely events: probabilities close to 1
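
The probability rules above (favorable outcomes over total outcomes, conditional probability, and expected number of occurrences) can be sketched in a few lines of Python. This is only an illustrative sketch: the deck representation and the helper names probability, conditional_probability, is_heart and is_red are assumptions made for the example, not part of the original notes.

```python
from fractions import Fraction
from itertools import product

# Build a standard 52-card deck as (rank, suit) pairs (illustrative representation).
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))


def probability(event, outcomes):
    """Number of favorable outcomes divided by the total number of outcomes."""
    favorable = [o for o in outcomes if event(o)]
    return Fraction(len(favorable), len(outcomes))


def conditional_probability(event_a, event_b, outcomes):
    """P(A|B): restrict the outcomes to those where B occurred, then count A."""
    given_b = [o for o in outcomes if event_b(o)]
    return probability(event_a, given_b)


def is_heart(card):
    return card[1] == "hearts"


def is_red(card):
    return card[1] in ("hearts", "diamonds")


p_heart = probability(is_heart, deck)                                # 13/52
p_heart_given_red = conditional_probability(is_heart, is_red, deck)  # 13/26
expected_hearts = p_heart * 200   # probability times number of draws

print(p_heart, p_heart_given_red, expected_hearts)
```

Fractions are used instead of floats so that 13/52 and 13/26 reduce exactly to 1/4 and 1/2.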
-ex. the temperature difference between 34 C and 35 C is the same as that between 77 C and 78 C
-if a variable has negative numbers then it is interval, not ratio

What is the characteristic of a Ratio Scale? -
-has all the characteristics of the interval level
-furthermore, it requires that the scale have a true zero point
-ex. weight, height, time (# of min)
-a true zero point means that the thing being measured actually vanishes when the scale reads zero
-Celsius is not a ratio scale because 0 C does not mean the absence of heat

What is a continuous variable? -
-A variable that has an infinite number of possible values between any two adjacent scale values
-ex. height: someone's height could be measured more precisely at 64.4 inches or 64.37 inches and so on

What is a discrete variable? -
-One that has no possible intermediate values between two adjacent points
-ex. number of children a woman has; the possible values are whole numbers only

What are the real limits of a measurement? -
-the points that are half the measuring unit above and below the measured value
-measurements made on continuous variables are only approximations because of the infinite number of possible measured values
-ex. we say that Mary is 64 inches tall but her real limits of measurement are 63.5 and 64.5, or 64 plus or minus .5

How do you round if the remainder is less than 500 -
-discard the remainder
-ex. 8.347, the remainder is 470 which is smaller than 500 so discard the remainder and round 8.347 to 8.3

How do you round if the remainder is more than 500 -
-add 1 to the last digit before the remainder and then discard the remainder
-ex. 4.86524, the remainder is 6524 so increase the 8 by 1 and discard the remainder, thus rounding 4.86524 to 4.9

How do you round if the remainder is equal to 500 -
-look at the final digit before the remainder, then discard the remainder
-ex. 4.850, the remainder is 500 and the final digit is even so stay at 4.8
-ex. 2.35, the remainder is 500 and the final digit is odd so round up to 2.4
-"Even-leave it. Odd-up"

Summation Notation -
-used when summing all the entries in a data set
-the Greek capital letter sigma is used, read as "sum"
-(Sigma)Xi = X1 + X2 + X3 + X4 + X5 + X6 = 7 + 2 + 4 + 2 + 3 + 6 = 24
-ex. SigmaX^2 = square each value of X, then sum the squares
-ex. (SigmaX)^2 = sum all the values of X and then square the sum
-follow the order of operations (PEMDAS)

How is a tabular frequency distribution constructed? -
-It is a table that lists the numerical values of a variable in a logical order along with the frequency of each value
-values with 0 frequency are omitted

Frequency -
-the number of times a particular value of the variable occurs
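
The rounding rules, the summation notation, and the frequency table described above can be checked with a short Python sketch. It assumes rounding to one decimal place and uses the decimal module's ROUND_HALF_EVEN mode as a stand-in for the "Even-leave it. Odd-up" rule; the helper name round_half_even and the variable names are illustrative assumptions.

```python
from collections import Counter
from decimal import Decimal, ROUND_HALF_EVEN


def round_half_even(value: str, step: str = "0.1") -> Decimal:
    """Round a decimal string to the precision of `step`.

    Remainders below half are dropped, remainders above half round up,
    and exact halves go to the nearest even final digit.
    Values are passed as strings so no binary floating-point error creeps in.
    """
    return Decimal(value).quantize(Decimal(step), rounding=ROUND_HALF_EVEN)


# The four rounding examples from the notes, rounded to one decimal place.
rounded = {v: round_half_even(v) for v in ["8.347", "4.86524", "4.850", "2.35"]}

# Summation notation on the data set used in the notes.
X = [7, 2, 4, 2, 3, 6]
sum_x = sum(X)                         # Sigma X
sum_of_squares = sum(x ** 2 for x in X)  # Sigma X^2: square first, then sum
square_of_sum = sum(X) ** 2            # (Sigma X)^2: sum first, then square

# Tabular frequency distribution: each observed value with its frequency,
# listed in ascending order; values that never occur simply do not appear.
frequency_table = sorted(Counter(X).items())

print(rounded)
print(sum_x, sum_of_squares, square_of_sum)
print(frequency_table)
```

Python's built-in round() also rounds halves to even, but it operates on binary floats, so values such as 4.85 are not stored as exact halves; that is why the sketch uses Decimal.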

Relative frequency -
-for histograms and frequency polygons it is sometimes desirable to plot a measure of relative frequency, such as a proportion or percentage, on the vertical axis
-it is frequency divided by the size of the group, expressed as a proportion or percentage
-ex. if there are 25 men in the data, divide a frequency of 1 by 25 to get 4%, and plot that value where the frequency previously was
-conveys the same information as the raw frequency

What are two graphical methods of representing interval/ratio data from grouped frequency distributions? -
-histogram
-frequency polygon

Unimodal distribution -
-A distribution that has one most frequently occurring value

Bimodal distribution -
-A distribution that has two most frequently occurring values

Symmetric distribution -
-A distribution whose left side is a mirror image of its right side

Positively skewed distribution -
-A distribution whose right tail is longer than its left tail

Negatively skewed distribution -
-A distribution whose left tail is longer than its right tail

Asymptotic -
-gradually approaching the X-axis
-looks like it is touching the axis, but never quite reaches it

Normal distribution -
-Unimodal
-symmetric
-asymptotic

Bar Graph -
-used for nominal and ordinal data
-bars are separated from one another
-makes it clear that no intermediate values were measured

Three important values we can use to measure central tendency -
-mode
-median
-mean

What is the mode? -
-The most frequently occurring value in a distribution
-a value of the variable (ex. Democrats), not the number of times it occurs (frequency)
-finding the mode using class intervals: add the lowest and highest point of the most frequent interval and then divide by 2

With what types of data can the mode be used? -
-it can be used for any variable/data (nominal, ordinal, and interval/ratio)

How is the mode eyeball-estimated? -
-find the point on the X-axis that lies directly below the highest point of the frequency distribution
-range method
-most useful when a numerical, not graphical, frequency distribution is available
-add the smallest value and the largest value and divide by 2

Where are the measures of central tendency in a symmetric distribution? -
-mode, median, and mean are in the same position

Where are the measures of central tendency when a distribution is positively skewed? -
-mean is pulled to the right by the long tail
-moves away from the mode
-skew has little effect on the median (moves towards the mean a tiny bit) and even less effect on the mode (doesn't move)

Where are the measures of central tendency when a distribution is negatively skewed? -
-mean is pulled to the left by the long tail
-moves away from the mode
-skew has little effect on the median (moves towards the mean a tiny bit) and even less effect on the mode (doesn't move)

What are the three major characteristics of distributions? -
-shape
-central tendency
-variation

What are three measures of variation? -
-the range
-the standard deviation
-the variance
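
The effect of skew on the mode, median, and mean described above can be illustrated with a small Python sketch. The data set is hypothetical, made up only to have a long right tail; mirroring it produces a negatively skewed version of the same data.

```python
import statistics

# Hypothetical positively skewed data: most scores are low, a few are high.
scores = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]

mode = statistics.mode(scores)      # most frequently occurring value
median = statistics.median(scores)  # middle value of the sorted scores
mean = statistics.mean(scores)      # sum of the scores divided by their count

# Mirror the data around 15 to get a negatively skewed distribution.
mirrored = [15 - x for x in scores]
neg_mode = statistics.mode(mirrored)
neg_median = statistics.median(mirrored)
neg_mean = statistics.mean(mirrored)

# Positive skew: the long right tail pulls the mean above the median and mode.
print("positive skew:", mode, median, mean)
# Negative skew: the long left tail pulls the mean below the median and mode.
print("negative skew:", neg_mode, neg_median, neg_mean)
```

In the positively skewed data the order is mode, median, mean from left to right; in the mirrored data the order reverses.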

What type of information do the three measures of variation provide? -
-they convey something about how wide a distribution is (how far it is from the smallest point to the largest point, how far it is from the mean to a representative point, etc.)

Which measure of variation can be used for nominal data? -
-no measure of variation is appropriate when the data are nominal

Which measure of variation can be used for ordinal data? -
-only the range can be used for ordinal data

Which measure of variation can be used for interval/ratio data? -
-the range, standard deviation, and variance can all be used

What is the range? -
-a measure of the width of a distribution equal to the highest value minus the lowest value

What are its advantages and disadvantages as a measure of variation? -
Advantage:
-it is the simplest measure of variation
Disadvantage:
-crudest measure because it depends on only two points in the entire distribution
-it is only affected by the outermost points in a distribution; the positions of the inner points are immaterial
-it is impossible to define for important distributions, like the normal distribution
-because the normal distribution is asymptotic, it never reaches 0, so its lowest and highest points are infinitely far out in the tails; the range is in principle infinite and therefore not informative

Deviation -
-The distance any point is from the mean
-deviations always sum to zero. Therefore the mean of the deviations is always zero
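
The three measures of variation and the zero-sum property of deviations can be sketched as follows. The data set is hypothetical, and the sketch assumes the population formulas (dividing by N); the notes do not say which formula is intended, and the sample versions (statistics.variance and statistics.stdev) divide by N - 1 instead.

```python
import statistics

# Hypothetical interval/ratio data.
values = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(values)

# Deviation: the distance of each point from the mean (signed).
deviations = [x - mean for x in values]
sum_of_deviations = sum(deviations)   # always zero, up to rounding error

# Range: highest value minus lowest value; uses only the two outermost points.
value_range = max(values) - min(values)

# Variance: mean of the squared deviations (population formula, an assumption).
variance = statistics.pvariance(values)

# Standard deviation: square root of the variance.
std_dev = statistics.pstdev(values)

print(mean, deviations, sum_of_deviations)
print(value_range, variance, std_dev)
```

Because the raw deviations cancel out, the variance squares them before averaging; that is what keeps it from always being zero.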
-The score at or below which a specified percentage of scores in the distribution fall
-It is a score
-ex. My score is at the 75th percentile
-to get a percentile, count the total number of scores at or below the score you received and divide by the total number of scores

What are percentile ranks? -
-The percentage of scores equal to or less than the given score
-This is a percentage
-ex. the percentile rank of my score is 75%
-use the same steps to get the percentile rank, but express the result as a percentage

What is a standard score or z score? -
-A variable whose value counts the number of standard deviations a score is above or below its mean
-z=0 denotes the mean, z=1 is one standard deviation above the mean, z=2 is two standard deviations above the mean, z=-1 is one standard deviation below the mean

Equation for transforming a raw score to a standard score in a population -
-z = (X - μ)/σ, i.e. z = (X - population/sample mean) / (standard deviation of population/sample)
-X = raw score
-ex. of a raw score: 68 on an exam

Relative area -
-The proportional (or fractional) area under a frequency distribution
-the proportion of values in the shaded region

What can be said about the areas of regions under the normal distribution? -
-relative areas under regions of the normal curve are always the same, regardless of the values of their mean and standard deviation
-the area between the mean and the point one standard deviation above (or below) the mean is approximately 34%
-the area between the points one and two standard deviations above (or below) the mean is approx. 14%
-the area beyond two standard deviations is approx. 2%
-from left tail to right tail: 2%, 14%, 34%, 34%, 14%, 2%

What z score is 5% of the distribution? -

If you have the z score but want the area under distribution -
-enter Table A.1 starting in the z column

If you have the area under distribution but want the z-score -
-enter Table A.1 starting in column A or B

If you have the z score but want X, the raw score, or the standardized score -
-use X = μ + zσ, i.e. X = (population/sample mean) + z(standard deviation)

What are representative samples? -
-A sample of a population that reflects the characteristics of the parent population

Two ways that representative samples can be chosen -
-Simple random sampling
-Stratified random sampling

Simple random sampling -
-A random sampling technique whereby all members of the population are treated equally regardless of their characteristics

Stratified random sampling -
-A sampling procedure whereby the population is divided into subgroups (strata) whose members have the same or similar characteristics, and then simple random samples are taken from each stratum

In what ways do samples usually differ from a parent population? -
-the shape of the distribution of a sample does not necessarily reflect the shape of the distribution of the population from which it is drawn (when sample size is small)
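
The z-score conversions and table lookups described above (raw score to z, z back to raw score, z to area, area to z) and the percentile rank calculation can be sketched in Python. This is a sketch under stated assumptions: the exam mean of 60 and standard deviation of 8 are hypothetical values chosen for illustration, the list of exam scores is made up, and statistics.NormalDist stands in for the printed Table A.1, whose layout is not reproduced here.

```python
from statistics import NormalDist

# Hypothetical exam parameters (not from the notes).
mu, sigma = 60.0, 8.0


def z_score(x, mu, sigma):
    """z = (X - mean) / standard deviation."""
    return (x - mu) / sigma


def raw_score(z, mu, sigma):
    """X = mean + z * standard deviation."""
    return mu + z * sigma


def percentile_rank(score, scores):
    """Percentage of scores that are equal to or less than the given score."""
    at_or_below = sum(1 for s in scores if s <= score)
    return 100 * at_or_below / len(scores)


z = z_score(68, mu, sigma)        # raw score 68 expressed in standard deviations
x_back = raw_score(z, mu, sigma)  # converting back recovers the raw score

# Areas under the standard normal curve, in place of a table lookup.
std_normal = NormalDist()  # mean 0, standard deviation 1
area_mean_to_1sd = std_normal.cdf(1) - std_normal.cdf(0)
area_1sd_to_2sd = std_normal.cdf(2) - std_normal.cdf(1)
area_beyond_2sd = 1 - std_normal.cdf(2)

# Going the other way: from a cumulative area back to a z score.
z_from_area = std_normal.inv_cdf(std_normal.cdf(1.0))

exam_scores = [55, 60, 62, 68, 70, 75, 80, 90]  # hypothetical class scores
rank_of_68 = percentile_rank(68, exam_scores)

print(z, x_back, rank_of_68)
print(area_mean_to_1sd, area_1sd_to_2sd, area_beyond_2sd, z_from_area)
```

The three areas come out close to the rounded 34%, 14%, and 2% figures, and they do not depend on mu or sigma, which is why a single table of z values serves every normal distribution.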

What is the sampling distribution of the means? -
-the distribution formed by taking repeated samples from the same population, computing the mean of each sample, and forming the distribution of those sample means
-ex. every element in the parent population is an individual boy's weight
-every element in the distribution of means is the mean of one sample

Where is the mean of the sampling distribution of the means in relation to the parent population distribution's mean? -
-they are equal in theory; with a finite number of samples they are almost equal

What is the sampling distribution's standard error in relation to the parent population's standard deviation? -
-the sampling distribution's standard error is smaller than the population standard deviation (it equals the population standard deviation divided by the square root of the sample size)
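
The two claims above about the sampling distribution of the means can be checked with a small simulation. This is a minimal sketch: the population of 10,000 normally distributed "weights" with mean 100 and standard deviation 15, the sample size of 25, and the 2,000 repetitions are all hypothetical choices made for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical parent population: 10,000 individual weights.
population = [random.gauss(100, 15) for _ in range(10_000)]
pop_mean = statistics.mean(population)
pop_sd = statistics.pstdev(population)

# Take repeated simple random samples and record the mean of each one.
n = 25
sample_means = [
    statistics.mean(random.sample(population, n)) for _ in range(2_000)
]

# The distribution of those means is the sampling distribution of the means.
mean_of_means = statistics.mean(sample_means)
standard_error = statistics.pstdev(sample_means)

# Theoretical standard error: population standard deviation over sqrt(n).
predicted_standard_error = pop_sd / n ** 0.5

print(pop_mean, mean_of_means)
print(pop_sd, standard_error, predicted_standard_error)
```

The mean of the sample means lands very close to the population mean, while the spread of the sample means is much narrower than the spread of individual weights, by roughly a factor of the square root of the sample size.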