Descriptive and Inferential Statistics - Lecture Slides | PSYC 2950

Chapters 14 and 15: Descriptive and Inferential Statistics

Two Main Types of Stats
• Descriptive: describes or summarizes data
• Inferential: makes inferences about populations based on sample data

Bar Graphs
• Use vertical bars to visually represent data
• Used when you have categorical variables!

Histograms
• Frequency distribution in a bar format
• Used with quantitative variables

Line Graphs
• Can mirror histograms
• Great to visually report results!

Measures of Central Tendency
• A single numerical value to represent a data set or distribution
• How could I use one number to gauge the type of student you are? GPA (Grade Point Average)

Mean
• Yes, the average. Duh.
• BUT the arithmetic mean is calculated like this:
• X̄ = sum of scores / # of scores
• 1 1 2 3 3 3 3 4 4 5 6 9 9
• X̄ = 53/13 = 4.08 (which I'm sure you also already knew)

Median
• Middle value in a series
• 1 1 2 3 3 3 3 4 4 5 6 9 9
• Median = 3

Measure of Variance 1: Range
• Very crude/basic measurement
• Highest number - lowest number
• Using the following exam scores:
  Group 1: 73, 74, 75, 75, 74, 81, 83, 76, 72
  Group 2: 56, 65, 78, 79, 84, 91, 75, 87, 94
• G1 Range = 11
• G2 Range = 38

Measure of Variance 2 & 3: Variance & Standard Deviation
• Listed together because they're related
• Take into account all values in the set
• Variance: the average squared deviation of values from the mean (so it is in squared units)
• Standard Deviation: the square root of the variance. It's an APPROXIMATE indicator of the average deviation from the mean.

Given data 2, 4, 6, 8, 10, you know M = 6.
Variance is easy
  X     M     (X-M)   (X-M)²
  2     6     -4      16
  4     6     -2      4
  6     6     0       0
  8     6     2       4
  10    6     4       16
                      Sum = 40
• Now divide the sum of squared deviations by n: 40/5 = 8
• The standard deviation is the square root of that: √8 ≈ 2.83

• Note the mean = median = mode
• About 68% of scores fall within 1 standard deviation of the mean
• About 95% fall within 2 standard deviations
• About 99.7% fall within 3
• For grades: 68% get C's, 14% each get B's & D's, 2% each get A's & F's

Examining Relationships
• In psychology we're usually interested in how two variables relate (or how groups differ), not in one variable alone
• So how do we describe these relationships?

The Difference Between Means
• When you have a quantitative variable and a categorical variable, you can examine the difference between the group means on the quantitative variable
• To judge how large that difference is compared with relationships between other variables, we need to standardize the difference
• This gives the size of the effect between two groups (Cohen's d = effect size)
• Same as before, just divide your mean difference by the SD:
• d = mean difference / standard deviation = (M1 - M2) / SD

Assessing Strength
• The further away from zero, the stronger the relationship
• Ignore the sign of the number, just go by absolute value
• Very straightforward

Assessing Directionality
• A negative correlation means a negative relationship: the variables move in opposite directions
• Example: hours of being awake and test performance
• A positive correlation means the variables move in the SAME direction

What happens when things take a turn?
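The arithmetic on the slides above can be checked with a short Python sketch. It reuses the slide data for the mean, median, range, variance, and standard deviation. Two parts are assumptions rather than slide content: the slides give d = (M1 - M2) / SD without saying which SD to use, so the sketch uses a pooled SD (one common choice), and the hours-awake and test-score values are hypothetical numbers invented to show a negative correlation.

```python
import math
import statistics

# Central tendency for the 13 scores on the slides
scores = [1, 1, 2, 3, 3, 3, 3, 4, 4, 5, 6, 9, 9]
mean_score = statistics.mean(scores)      # sum of scores / number of scores
median_score = statistics.median(scores)  # middle value of the sorted series

# Range = highest number - lowest number
group1 = [73, 74, 75, 75, 74, 81, 83, 76, 72]
group2 = [56, 65, 78, 79, 84, 91, 75, 87, 94]
range1 = max(group1) - min(group1)
range2 = max(group2) - min(group2)

# Variance and SD for 2, 4, 6, 8, 10, dividing by n as the slide does
data = [2, 4, 6, 8, 10]
m = statistics.mean(data)
variance = sum((x - m) ** 2 for x in data) / len(data)
sd = math.sqrt(variance)


def cohens_d(a, b):
    """Mean difference divided by a pooled SD (ASSUMPTION: pooled SD)."""
    n1, n2 = len(a), len(b)
    pooled_var = ((n1 - 1) * statistics.variance(a)
                  + (n2 - 1) * statistics.variance(b)) / (n1 + n2 - 2)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled_var)


def pearson_r(xs, ys):
    """Pearson correlation: sign gives direction, absolute value gives strength."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# HYPOTHETICAL data: more hours awake, lower test performance
hours_awake = [2, 6, 10, 14, 18]
test_score = [90, 85, 80, 70, 60]
r = pearson_r(hours_awake, test_score)
d = cohens_d(group1, group2)

print(f"mean={mean_score:.2f} median={median_score}")
print(f"ranges: G1={range1} G2={range2}")
print(f"variance={variance} sd={sd:.2f}")
print(f"Cohen's d (G1 vs G2)={d:.2f}  r={r:.2f}")
```

Judging strength by `abs(r)` and direction by the sign of `r` mirrors the two "Assessing" slides.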
Partial Correlation Coefficient
• The correlation between two quantitative variables, controlling for one or more other variables
• There are statistical ways to control for other variables

Regression
• Simple regression: a single independent (predictor) variable
• Multiple regression: two or more independent (predictor) variables
• Regression equations and regression lines
• Partial regression coefficient

Interval Estimation
• Puts a confidence interval around a point estimate
• CI: a range of numbers inferred from the sample that has a specified probability of including the true population value
• A 95% CI means that 95% of intervals built this way would contain the true population parameter

• Changing that interval (to 99%) means less precision
• I could broaden my range to capture the population parameter more of the time (99 times out of 100 vs. 95)
• It would cost me, though, because I've broadened the range: 87-104 instead of 93-97
• 95% is a reasonably acceptable level

• Confidence intervals are affected by:
  • Level of confidence (95%, 99%, 60%, etc.)
  • Sample size
• Higher N gives us a more precise (narrower) confidence interval

Critical Regions & p Values
• We mathematically generate an overall score for our sample (t, F, z, etc.)
• We test that score against a distribution based on the null hypothesis, that is, an "unchanged" population

• We compare to that testing or "comparison distribution" of unchanged individuals
• We see if the mathematically generated score for our sample is extreme: extreme enough to conclude it wouldn't happen just by chance
• Illustration on next slide
[Figure: normal probability curve. The right-tail area shaded dark blue, beyond 1.645, is .05 of the total area under the curve.]
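The interval-estimation and critical-region slides above both rest on cutoffs from the normal curve, which the sketch below computes with Python's standard library. The sample mean (95), standard deviation (10), and sample sizes are hypothetical values chosen for illustration, and treating the standard deviation as known (a z-based interval) is a simplifying assumption.

```python
import math
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, SD 1

# Cutoff that leaves .05 of the area in the right tail (the shaded region)
one_tailed_cutoff = standard_normal.inv_cdf(0.95)
area_beyond_cutoff = 1 - standard_normal.cdf(one_tailed_cutoff)


def confidence_interval(sample_mean, sd, n, level=0.95):
    """z-based CI: point estimate plus or minus a margin of error.

    ASSUMPTION: sd is treated as the known population SD.
    """
    z_crit = standard_normal.inv_cdf(1 - (1 - level) / 2)
    margin = z_crit * sd / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin


def width(interval):
    low, high = interval
    return high - low


# HYPOTHETICAL sample: mean 95, SD 10
ci_95 = confidence_interval(95, 10, n=100, level=0.95)
ci_99 = confidence_interval(95, 10, n=100, level=0.99)
ci_95_big_n = confidence_interval(95, 10, n=400, level=0.95)

print(f"one-tailed .05 cutoff: {one_tailed_cutoff:.3f}")
print(f"95% CI (n=100): {ci_95[0]:.2f} to {ci_95[1]:.2f}")
print(f"99% CI (n=100): {ci_99[0]:.2f} to {ci_99[1]:.2f}")
print(f"95% CI (n=400): {ci_95_big_n[0]:.2f} to {ci_95_big_n[1]:.2f}")
```

Raising the confidence level widens the interval, and raising N narrows it, which is the trade-off the slides describe.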
Related topic: p-value
• Probability value (p-value)
• The likelihood of an observed (or more extreme) test statistic if the null were true
• Ranges from 0 to 1
• The closer to zero it is, the less likely you'd have that result if the null were true

Ways to think
• "If the p-value is less than alpha, you reject the null hypothesis."
• Choosing alpha = .05: if the p-value is less than .05, then reject the null

To WHAT degree?
• Statistical significance vs. practical significance
• Practical: is the observed difference big enough to actually matter?
  • Independent of statistical significance
  • Small-sample weight loss example
• Measured in terms of effect size
  • Represents the magnitude or strength of a relationship

Hypothesis Testing Errors
• You're testing hypotheses
• That deals with probabilities
• That means potential for error (like weather)
• Must consider the possibility of error (consider and label!)

Types of Errors
• Reject a true null (Type I)
  • Concluding your treatment (tx) works when it doesn't
  • False positive
• Fail to reject a false null (Type II)
  • Concluding your tx doesn't work when it does
  • False negative
  • Flushing a good tx down the

An easy way to remember
[Figure: hypothesis testing outcomes table. Columns (reality): the null hypothesis is true; the alternative hypothesis is true. Rows: the null hypothesis is true; the alternative hypothesis is true. Legible cells: "Accurate" (1 - α) and "Type II Error".]
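The decision rule and the two error types described above can be written out as a small Python sketch. The function names are hypothetical, the z-scores are illustrative values rather than results from the slides, and a one-tailed test on a standard normal comparison distribution is assumed.

```python
from statistics import NormalDist


def one_tailed_p(z_score):
    """Probability of a z-score this large or larger if the null were true."""
    return 1 - NormalDist().cdf(z_score)


def decide(p_value, alpha=0.05):
    """If the p-value is less than alpha, reject the null hypothesis."""
    return "reject the null" if p_value < alpha else "fail to reject the null"


def label_outcome(null_is_true, rejected_null):
    """Name the outcome of a decision once the (usually unknown) reality is given."""
    if null_is_true and rejected_null:
        return "Type I error (false positive)"
    if not null_is_true and not rejected_null:
        return "Type II error (false negative)"
    return "correct decision"


# ILLUSTRATIVE z-scores, not taken from the slides
p_extreme = one_tailed_p(2.0)
p_ordinary = one_tailed_p(1.0)

print(f"z=2.0: p={p_extreme:.4f} -> {decide(p_extreme)}")
print(f"z=1.0: p={p_ordinary:.4f} -> {decide(p_ordinary)}")
print(label_outcome(null_is_true=True, rejected_null=True))
print(label_outcome(null_is_true=False, rejected_null=False))
```

In practice reality is unknown, so `label_outcome` is only a memory aid for which mistake goes with which label.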
Correlation Designs
• Simply check for a relationship or association between two variables
  • Do higher-paid quarterbacks throw more touchdowns than lower-paid quarterbacks?
  • Does studying more lead to better grades?

• Evaluating the differences between two groups
• 3 types
  • One-sample t-test
  • Independent samples
  • Matched pairs

• Comparison of one sample or group to the population
  • Ex: Psych student GPA (sample) compared to undergraduate GPA (population)
• Problem: we rarely know the mean and standard deviation of the population
• So... use of the one-sample t-test is rare.
• Normality
• Homogeneity of Variance
• Random Sampling
• Think about hypothesis testing: determining if differences are due to more than just chance
• So the t-test helps us detect the signal from the noise.
• t = group mean difference (signal) / average within-group variability (noise)

What does the t-value mean?
• The greater the t-value (in absolute value), the less likely it is that the between-group difference is due to chance alone

• You get an F-value rather than a t-value
• Same idea: finding significant differences between groups
• EXCEPT! This time you have 3+ groups, and unfortunately the ANOVA does not reveal which groups are different

• Determining where differences exist
• Like doing lots of t-tests

• 2 independent variables (caffeine and sleep)
• 1 dependent variable (stress)

                   No Caffeine Mean   Low Caffeine Mean   High Caffeine Mean
  4 hrs of sleep   50                 65                  70
  8 hrs of sleep   75                 89                  80

Regression Models
• Predictive models
  • Does the number of touchdowns thrown influence a quarterback's salary for the next year?
  • Does his completion percentage?
  • Which has a bigger effect?
• If it's touchdowns, then just bomb it every play
• If it's completion percentage, go for short easy passes

PLEASE NOTE!!!
• When something is statistically significant, it is not necessarily practically significant.
• Even though the numbers differ between treatment groups, the difference may not amount to a hill of beans in the real world!!!!
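As a rough check on the signal-to-noise idea behind the t-test, the sketch below computes an independent-samples t for the two exam-score groups from the range slide and averages the caffeine-by-sleep table across its rows and columns. Using the pooled standard error as the "noise" term is an assumption (the slides say only "average within group variability"), and the variable names are hypothetical.

```python
import math
import statistics

# Exam scores from the range slide
group1 = [73, 74, 75, 75, 74, 81, 83, 76, 72]
group2 = [56, 65, 78, 79, 84, 91, 75, 87, 94]


def independent_t(a, b):
    """t = signal / noise, with noise as the pooled standard error (ASSUMPTION)."""
    n1, n2 = len(a), len(b)
    signal = statistics.mean(a) - statistics.mean(b)
    pooled_var = ((n1 - 1) * statistics.variance(a)
                  + (n2 - 1) * statistics.variance(b)) / (n1 + n2 - 2)
    noise = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return signal / noise


t_value = independent_t(group1, group2)

# Cell means for stress from the caffeine-by-sleep table
cell_means = {
    "4 hrs of sleep": {"No Caffeine": 50, "Low Caffeine": 65, "High Caffeine": 70},
    "8 hrs of sleep": {"No Caffeine": 75, "Low Caffeine": 89, "High Caffeine": 80},
}

# Average across caffeine levels for each sleep condition
sleep_means = {
    sleep: statistics.mean(list(cells.values()))
    for sleep, cells in cell_means.items()
}

# Average across sleep conditions for each caffeine level
caffeine_levels = ["No Caffeine", "Low Caffeine", "High Caffeine"]
caffeine_means = {
    level: statistics.mean([cell_means[sleep][level] for sleep in cell_means])
    for level in caffeine_levels
}

print(f"t = {t_value:.2f}")
print("sleep means:", {k: round(v, 2) for k, v in sleep_means.items()})
print("caffeine means:", caffeine_means)
```

A t-value close to zero, as the two exam groups give here, means the mean difference is small relative to the spread within the groups.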