Preparation for the Story Problem portion of Quiz #2 (and a few conceptual things)

1. Consider this output from a step in an algorithm-driven modeling (criterion = wellness) and answer the questions.

   Variables in Equation              Variables not in Equation
   variable   b         p (of t)      variable   partial   p of t (if entered)
   Weight     3.26      .43           Pred4      .48       .01
   Gender     14.73     .06           Pred1      .51       .02
   Pred2      .02       .02           Pred3      .46       .01
   Pred5      123.44    .04           Age        .15       .53

   a. If this were a forward selection -- what would happen next? Would the R² change significantly?
   b. If this were a backward selection -- what would happen next? Would the R² change significantly?
   c. If this were a forward stepwise selection -- what would happen next? Would the R² change significantly?
   d. What would happen to the R² if I added Pred3 to the model? Explain your answer.
   e. What would happen to the R² if I added Age to the model? Explain your answer.
   f. What would happen to the R² if I deleted Weight from the model? Explain your answer.
   g. What would happen to the R² if I deleted Pred2 from the model? Explain your answer.
   h. What would happen to the R² if I deleted both Weight and Gender from the model? Explain your answer.
   i. What would happen to the R² if I deleted both Pred2 and Pred5 from the model? Explain your answer.
   j. What would happen to the R² if I deleted both Weight and Pred2 from the model? Explain your answer.

2. Identifying types of research questions, correlation models and tests used

   a.
Conceptually speaking: For each of the following, select the appropriate correlational statistic(s):
      a) simple corr/regression
      b) comparing two (or more) correlations
      c) partial correlation
      d) semi-partial correlation
      e) multiple correlation/regression
      f) comparing two nested multiple regression models
      g) comparing two non-nested multiple regression models
      h) statistical regression modeling

      -- What is the best individual predictor?
      -- What is the relationship between predictor #1 and the criterion?
      -- What is the relationship between predictor #1 and the criterion, controlling the latter for predictor #2?
      -- Will a specific subset of predictors do as well as the full set of predictors?
      -- What combination of predictors will make up the best predictive model?
      -- Will one subset of predictors predict the criterion better than will the other set?
      -- What is the relationship between predictor #1 and the criterion, controlling both for predictor #2?
      -- How well will the combination of predictors do to predict the criterion?

   b. With variable names (find one of each). The criterion is social anxiety; the predictors are age, gender, social skills, depression & prior "public trauma".

      -- Which is the best predictor of social anxiety: age, social skills, or prior public trauma?
      -- Is social anxiety related to depression?
      -- Is there a relationship between social anxiety and social skills, when controlling social skills for age?
      -- Reliable depression and social skills scores are difficult to obtain inexpensively. Do I really need them in this model?
      -- How might I efficiently predict social anxiety?
      -- Will combining age and gender predict as well as combining depression and social skills?
      -- What is the relationship between social anxiety and depression, controlling both for social skills?
      -- I can get scores on all of these variables from all my clients. How well will I do at predicting their social anxiety?

5.
Consider this output from a step in an algorithm-driven modeling (criterion = wellness) and answer the questions.

   Variables in Equation              Variables not in Equation
   variable   b         p (of t)      variable   partial   p of t (if entered)
   Weight     3.26      .03           Pred4      .28       .01
   Gender     14.73     .01           Pred1      .31       .01
   Pred2      .02       .02           Pred3      .25       .01
   Pred5      123.44    .04           Age        .09       .53

   a. If this were a forward selection -- what would happen next?
   b. If this were a backward selection -- what would happen next?
   c. If this were a forward stepwise selection -- what would happen next?
   d. What would happen to the R² if I added Pred3 to the model? Explain your answer.
   e. What would happen to the R² if I added Age to the model? Explain your answer.
   f. What would happen to the R² if I deleted Weight from the model? Explain your answer.
   g. What would happen to the R² if I dropped Gender and Pred2 from the model? Explain your answer.

3. Tell the type of statistical analysis(es) (yep, there might be more than one -- be complete) required for each of the following and express the statement/hypothesis symbolically.

   a. It was once “discovered” that shoe size was a good predictor of spelling ability in grade school children. However, it was later discovered that this correlation “went away” when controlling for age (years old).
   b. Performance in Psyc 941 is a good predictor of performance in Psyc 942, but an even better predictor of performance in the advanced class is a combination of 941 performance and how much research experience the student has.
   c. Social skills are strongly correlated with dating success, and this correlation actually gets stronger if one controls the measure of dating success for income.
   d. The number of pets one has owned is a good predictor of the likelihood that one will help a person who is in distress; in fact, it is a better predictor than is how well one knows the person who is in distress.

Answers

1.
Consider this output from a step in an algorithm-driven modeling (criterion = wellness) and answer the questions.

   a. Add Pred1 -- has the highest partial of the viable candidates for inclusion. R² will increase significantly.
   b. Toss Weight -- has the largest p-value of the viable candidates for exclusion.
   c. Same as "b".
   d. Adding a significant (contributing) predictor to the model would increase R² significantly.
   e. Adding a non-significant (non-contributing) predictor to the model would increase R² numerically, but not significantly.
   f. Dropping a non-significant (non-contributing) predictor from the model would decrease R² numerically, but not significantly.
   g. Dropping a significant (contributing) predictor from the model would decrease R² significantly.
   h. Neither predictor is contributing and we could drop either one without R² decreasing significantly. It is unlikely that dropping both will lead to a significant decrease in R². Why? Of the two, the larger R² decrease isn't significant, and so the average R² difference from dropping the two of them is unlikely to be significant.
   i. Because both are contributing, and dropping either would lead to a significant decrease in R², it is likely that dropping both would lead to a significant average drop in R².
   j. This is the "iffy one" -- dropping one would lead to a significant decrease, dropping the other wouldn't, and we have no way of estimating the result of the "average drop" -- so for this one we should plead "I don't know" and do the nested model comparison test.

2. Identifying types of research questions, correlation models and tests used

   Conceptually speaking -- b, a, d, f, h, g, c, e
   With variable names -- b, a, d, f, h, g, c, e

3a. RH: r(sentest,rdg) > r(sentest,sci) & r(sentest,rdg) > r(sentest,ses)
    non-nested (single-predictor) models (correlated correlations) using Hotelling's t-test or M,R&R Z-test
b.
The simplest test of this is that the regression weight for SCTYP will not be significant in the full model
    RH: b = 0
    We could also test this by comparing nested models R(sentest.ses,rdg,sci,sctyp) vs. R(sentest.ses,rdg,sci) using the R²-change F-test
    Or we could examine the multiple semi-partial r_sentest(sctyp.rdg,sci,ses)
c. RH: R(sentest.sctyp,ses) > R(sentest.rdg,math)    non-nested models using Hotelling's t-test or M,R&R Z-test
   RH: R(sentest.sctyp,ses,rdg,math) = R(sentest.sctyp,ses)    nested models using the R²-change F-test

4a. Pred1 would be added -- of the viable candidates for inclusion it has the highest partial corr
 b. would stop -- all the variables in the model are contributing
 c. same as "a"
 d. R² would increase significantly -- would be adding a significant contributor
 e. R² would increase numerically, but not significantly -- would be adding a non-significant contributor
 f. R² would decrease significantly -- would be dropping a significant contributor
 g. R² would drop significantly -- note that dropping either one would lead to a significant drop in R²; if they are both dropped the average R² drop will be significant, because it is larger than the smaller R² drop of the two single-predictor deletions (the really tricky one is when one, but not both, of the variables being dropped is contributing)

5. Most mistakes involved not examining/testing the claim about one or both models/correlations before comparing them. Saying "The first model works and the second model is better" is different from just saying "The second model is better than the first."

   a. r_spell,shoesize > 0, however r_spell,shoesize.age = 0
   b. r_942,941 > 0, but R_942.941,research > r_942,941    comparing nested models
   c. r_ss,date > 0 and r_ss(date.income) > r_ss,date
   d. r_help,pets > 0, in fact r_help,pets > r_help,know    comparing non-nested models

Practice with collinearity, etc.

1.
Answer the following questions based on the information in the correlation matrix -- pay careful attention to how the answers change and don't change as the correlations change! Here are the correlations from a sample of therapy patients -- wellness is the criterion variable.

                                   Initial     Amount Prior   Number of
              Wellness    Age      Wellness    Therapy        Current Sessions
   Well        1.00       .42       .38         .41            .39
   Age          .42      1.00       .40         .61            .23
   Initial      .38       .40      1.00         .15            .23
   Prior        .41       .61       .15        1.00           -.63
   Current      .39       .23       .23        -.63           1.00

   a. What is the best single predictor of wellness?
   b. What predictor would you add to the variable you chose in "a" to produce the largest increase in R²? Explain your answer.
   c. Reconsider the information in the correlation matrix. Is the two-predictor model you chose in "a & b" likely to be the best 2-predictor model available from these variables? If not, what do you think will likely be the best two-predictor model? Explain your answer.

                                   Initial     Amount Prior   Number of
              Wellness    Age      Wellness    Therapy        Current Sessions
   Well        1.00       .40       .38         .41            .39
   Age          .40      1.00       .40         .61            .33
   Initial      .38       .40      1.00         .15            .33
   Prior        .41       .61       .15        1.00           -.63
   Current      .39       .33       .33        -.63           1.00

   d. What is the best single predictor of wellness?
   e. What predictor would you add to the variable you chose in "d" to produce the largest increase in R²? Explain your answer.
   f. Reconsider the information in the correlation matrix. Is the two-predictor model you chose in "d & e" likely to be the best 2-predictor model available from these variables? If not, what do you think will likely be the best two-predictor model?

                                   Initial     Amount Prior   Number of
              Wellness    Age      Wellness    Therapy        Current Sessions
   Well        1.00       .42       .18         .21            .39
   Age          .42      1.00       .20         .21            .23
   Initial      .18       .20      1.00         .15            .13
   Prior        .21       .21       .15        1.00            .16
   Current      .39       .23       .13         .16           1.00

   g. What is the best single predictor of wellness?
   h. What predictor would you add to the variable you chose in "g" to produce the largest increase in R²?
   i. Reconsider the information in the correlation matrix.
Is the two-predictor model you chose in "g & h" likely to be the best 2-predictor model available from these variables? If not, what do you think will likely be the best two-predictor model?

A couple more “tell the analysis” problems

Identify the questions asked in the following paragraph and …
   -- Show the symbolic representation of the hypothesis
   -- If the hypothesis involves a comparison, specify the statistical test that will be used
   -- If there is more than one way to test that hypothesis tell 'em all

The overall purpose of the study is to examine relationships between school grades, prior work experience and interview process scores to predict job performance (jp; derived from work performance during the 1st six months of work). Here are the predictors:

   high school grades (hsg)
   # school absences (has)
   food service classes in high school (hsvs)
   # prior jobs (jobs)
   why left last job (quit or fired; last)
   how long on prior job (long)
   interview score (int)
   interview performance score (perf)

My colleagues and I have several questions and hypotheses! Are high school grades correlated with job performance? What about after we control both for number of prior jobs, or both number of prior jobs and whether or not they have had food service classes? Will the model with all these predictors help us identify better workers? Will dropping the interview performance test hurt how well we can predict? What about if we drop both interview measures? I think that high school information will predict worker performance better than information from the interview and as well as using all the predictors together. What is the best predictor of job performance among the high school variables, and is it a better predictor than the interview score? Not everyone has had a job before; does the interview score work better for those who have or those who haven't had a job before?
For that matter, does the combination of interview score and interview performance score work better for those who have or those who haven't had a job before? I think that when we consider the relationship between interview score and job performance we need to control interview scores for whether or not they've had a food service class in high school -- you know they practice interviewing in those classes.

Identify the questions asked in the following paragraph and …
   -- Show the symbolic representation of the hypothesis
   -- If the hypothesis involves a comparison, specify the statistical test that will be used
   -- If there is more than one way to test that hypothesis tell 'em all

In an effort to evaluate what makes some print advertisements successful and others not, we generated several ads that had various combinations of the following attributes. When collecting the data, each participant viewed one ad, rated how much they thought it would "influence their opinion of the product" (inop), and completed a questionnaire involving some self-descriptive questions. The predictor variable set included:

   ad size (size)
   color or black & white (col)
   people in ad (peo)
   rater's age (age)
   rater's gender (sex)
   rater's familiarity with the product (fam)

Once we include the ad variables I'm not sure that the rater variables will add anything. Except it might be that adding just rater familiarity with the product to the ad variables will help. A related question to ask is whether ad attributes are more important than viewer attributes. Either way, I think that the way the ad variables relate to the criterion will be different for males and females. Well, I think that size and color will correlate with ability to influence equally for males and females, but that people being in the ad is more correlated with ability to influence for females than for males.
I think that whether ads use color or black & white is related to their ability to influence opinion (even if you control for gender or gender and age) and that this is a more important predictor than the ad size or whether or not people are in the ad.
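
Several of the hypotheses in these practice paragraphs come down to a partial or semi-partial correlation, or to a nested-model comparison using the R²-change F-test. The sketch below is a minimal illustration of those three calculations from summary statistics, not a worked answer to any problem above. The function names are hypothetical, and the numbers in the usage section are made up for illustration; they do not come from this handout.

```python
import math


def partial_r(r_xy, r_xz, r_yz):
    """Partial correlation of x and y, controlling BOTH for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))


def semipartial_r(r_xy, r_xz, r_yz):
    """Semi-partial correlation of y with x, controlling ONLY x for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt(1 - r_xz**2)


def r2_change_f(r2_full, r2_reduced, k_full, k_reduced, n):
    """R²-change F-test for two nested models.

    k_full and k_reduced are the numbers of predictors in each model,
    n is the sample size. Returns (F, df_numerator, df_denominator).
    """
    df_num = k_full - k_reduced
    df_den = n - k_full - 1
    f = ((r2_full - r2_reduced) / df_num) / ((1 - r2_full) / df_den)
    return f, df_num, df_den


if __name__ == "__main__":
    # Made-up correlations (illustration only): x with y, x with z, y with z.
    print("partial r:      ", round(partial_r(0.40, 0.20, 0.30), 3))
    print("semi-partial r: ", round(semipartial_r(0.40, 0.20, 0.30), 3))

    # Made-up nested comparison: 6-predictor full model vs. 3-predictor subset.
    f, df1, df2 = r2_change_f(r2_full=0.35, r2_reduced=0.30,
                              k_full=6, k_reduced=3, n=120)
    print(f"R²-change F({df1}, {df2}) = {f:.2f}")
```

The F value would still need to be compared against a critical value (or converted to a p-value) for the stated degrees of freedom. Comparing non-nested models or correlated correlations calls for a different test, such as Hotelling's t-test, which is not sketched here.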