FINAL EXAM ISDS

What is ANOVA (analysis of variance) used for? - correct answer ✔✔Used to determine whether differences exist between the means of three or more populations under independent sampling. One-way ANOVA compares population means based on one categorical variable, or factor.

What type of test is ANOVA? - correct answer ✔✔A generalization of the two-sample t test with equal but unknown variances, applied to a total of c populations rather than just two.

One-way ANOVA hypotheses - correct answer ✔✔H0: µ1 = µ2 = ... = µc
HA: Not all population means are equal, meaning at least one mean differs from the others.

One-way ANOVA BETWEEN-TREATMENTS estimate of variance - correct answer ✔✔Calculate SSTR, then MSTR:
MSTR = SSTR / (c - 1)
c is the number of populations (treatments).

One-way ANOVA WITHIN-TREATMENTS estimate of variance - correct answer ✔✔First calculate the error sum of squares, SSE, then MSE:
MSE = SSE / (nT - c)
nT is the total sample size.

One-way ANOVA F test - correct answer ✔✔The test statistic for testing whether differences exist between the population means is
F(df1, df2) = MSTR / MSE
df1 = c - 1
df2 = nT - c

One-way ANOVA SST - correct answer ✔✔The total sum of squares, SST, is the sum of the squared differences of each observation from the grand mean.
SST = SSTR + SSE

Comparison methods - correct answer ✔✔Fisher's LSD method and Tukey's HSD method are used to find which population means differ, after ANOVA finds significant differences between the population means.

Fisher's confidence interval - correct answer ✔✔The n's in the interval are the sizes of the two samples being compared. If zero falls inside the interval, there is no statistically significant difference between the two means.
(-63, 137): zero falls inside the interval = NO DIFFERENCE between Boston and New York
(794, 1009): zero is not inside the interval = SIGNIFICANTLY DIFFERENT

Tukey's HSD method - correct answer ✔✔Protects against an inflated risk of a Type I error.
Each Tukey interval is slightly wider than the corresponding Fisher interval, which is how the method protects against a Type I error.
Balanced vs. unbalanced data: with balanced data, each sample size is the same.

Dummy variables - correct answer ✔✔Represent categorical (qualitative) variables. The value is 1 or 0 according to whether the category is present or not.
1 = male, 0 = otherwise
1 = at least 60, 0 = otherwise

What is a dummy variable? - correct answer ✔✔A dummy variable d for a qualitative variable with two categories assigns a value of 1 for one of the categories and a value of 0 for the other.

Dummy variable equation - correct answer ✔✔y = β0 + β1x + β2d + ε
β0 is the constant (intercept)
x is a quantitative explanatory variable
d is a dummy variable representing a qualitative state (a category)

ANOVA assumptions - correct answer ✔✔
1. The populations are normally distributed.
2. The population standard deviations are unknown but assumed equal.
3. The samples are selected independently.

Covariance - correct answer ✔✔Measures the direction of the linear relationship between two variables x and y. Its sign tells you whether the relationship is positive or negative; it does not tell you how strong the relationship is.

Sample correlation coefficient - correct answer ✔✔Describes both the direction and the strength of the linear relationship.
The range is between -1 and 1.
0 = no linear relationship
-1 = perfect negative linear relationship
Rule of thumb: anything greater than 0.7 in absolute value is a strong relationship.
THE SIGN TELLS YOU THE DIRECTION; THE NUMBER TELLS YOU THE STRENGTH.

Sample correlation coefficient formula - correct answer ✔✔rxy = sxy / (sx sy)
where sxy is the sample covariance, and sx and sy are the sample standard deviations of x and y.

Test statistic for the correlation coefficient ρxy - correct answer ✔✔t(df) = rxy √(n - 2) / √(1 - rxy²), with df = n - 2

What type of relationship does the correlation coefficient capture? - correct answer ✔✔The correlation coefficient captures only a linear relationship.
- Outliers can give you faulty values.
- Correlation does not imply causation.
Even if two variables are highly correlated, one does not necessarily cause the other.

Regression analysis - correct answer ✔✔With regression analysis we assume that one variable, the response variable, is influenced by other variables, called explanatory variables.

Deterministic - correct answer ✔✔The value of the response variable is uniquely determined by the explanatory variables.

Stochastic - correct answer ✔✔The relationship is inexact because relevant factors are omitted. The variables are related, but the explanatory variables do not predict the response exactly.

The simple linear regression model - correct answer ✔✔y = β0 + β1x + ε
y is the response variable
x is the explanatory variable
β0 and β1 are the parameters that need to be estimated

Residual - correct answer ✔✔residual = observed - predicted
e = y - ŷ
A positive residual means the model UNDERestimates the response variable.

Method of least squares - correct answer ✔✔Ordinary least squares (OLS) chooses the line that minimizes the sum of the squared residuals (SSE).

The linear regression model chap 14 - correct answer ✔✔

Sample regression equation chap 14 - correct answer ✔✔debt = 210 + 10 × income
The form is constant + slope × income (x).
If you make no money, you are still predicted to pay 210 a month. For each one-unit increase in income, the predicted payment goes up by 10.
Pay attention to the units, for example if the problem states income in thousands.

More than one explanatory variable - correct answer ✔✔Multiple linear regression:
y = β0 + β1x1 + β2x2 + ... + βkxk + ε
x1, ..., xk are the explanatory variables.

Test statistic for a test of individual significance - correct answer ✔✔t(df) = (bj - βj0) / se(bj), with df = n - k - 1
βj0 is the hypothesized value of βj (zero in the usual test of significance).

Confidence interval for βj - correct answer ✔✔bj ± t(α/2, df) × se(bj), with df = n - k - 1

Hypothesis test for JOINT SIGNIFICANCE - correct answer ✔✔H0: β1 = β2 = ... = βk = 0
H1: At least one βj ≠ 0.
In other words, AT LEAST ONE of the slope coefficients IS NOT ZERO.

Test statistic for test of joint significance - correct answer ✔✔F(df1, df2) = MSR / MSE
df1 = k
df2 = n - k - 1
Test for joint significance = F test
Test for individual significance = t test

Restricted and unrestricted models - correct answer ✔✔Restricted: do NOT estimate the coefficients that are restricted under the null hypothesis.
Unrestricted: the complete model, which imposes no restrictions on the coefficients.

What are the two types of predictions? chapter 15 - correct answer ✔✔Confidence interval and prediction interval.

Confidence interval vs. prediction interval - correct answer ✔✔Confidence interval: for the MEAN OF Y. There is less error when predicting a mean.
Prediction interval: for an INDIVIDUAL VALUE OF Y. It also takes the error term into account, therefore it is WIDER.

Prediction interval equation - correct answer ✔✔ŷ0 ± t(α/2, df) × √((se(ŷ0))² + s²), with df = n - k - 1
se(ŷ0) is the standard error of the predicted value and s is the standard error of the estimate.

Model assumptions and common violations - correct answer ✔✔
1. The model is linear in the parameters (coefficients).
2. The expected value of the error term ε, given the explanatory variables, is zero.
3. There is no exact linear relationship among the explanatory variables (no perfect multicollinearity).
4. The variance of the error term ε is the same for all observations (no heteroscedasticity).
5. The error term ε is uncorrelated across observations (no serial correlation).
6. The error term ε is not correlated with the explanatory variables (no endogeneity).
7. The error term ε is normally distributed.
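
Worked sketch for the regression cards - a minimal Python example (standard library only) that ties together the least squares estimates, the residuals, the correlation coefficient, and the F statistic MSR / MSE with df1 = k and df2 = n - k - 1. The data and the function name fit_simple_regression are made up for illustration and do not come from the course material. It covers only the simple case of one explanatory variable (k = 1), where the F statistic equals the square of the slope's t statistic.

```python
import math


def fit_simple_regression(x, y):
    """Fit y-hat = b0 + b1*x by ordinary least squares (hypothetical helper)."""
    n = len(x)
    k = 1  # number of explanatory variables in simple regression
    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Sums of squares and cross-products around the sample means.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sst = sum((yi - y_bar) ** 2 for yi in y)  # total sum of squares
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    # Least squares estimates of the slope and the intercept.
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar

    # Sample correlation; the (n - 1) divisors in sxy, sx, sy cancel out.
    r = sxy / math.sqrt(sxx * sst)

    # Residual = observed - predicted; positive means the model underestimates.
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    sse = sum(e ** 2 for e in residuals)
    ssr = sst - sse

    # Joint significance: F = MSR / MSE with df1 = k and df2 = n - k - 1.
    msr = ssr / k
    mse = sse / (n - k - 1)
    f_stat = msr / mse

    # Individual significance of the slope (H0: beta1 = 0).
    t_slope = b1 / math.sqrt(mse / sxx)

    return {
        "b0": b0, "b1": b1, "r": r, "residuals": residuals,
        "sse": sse, "ssr": ssr, "sst": sst,
        "f_stat": f_stat, "t_slope": t_slope,
        "df1": k, "df2": n - k - 1,
    }


# Made-up data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
fit = fit_simple_regression(x, y)
print(f"y-hat = {fit['b0']:.2f} + {fit['b1']:.2f} x")
print(f"r = {fit['r']:.3f}")
print(f"F({fit['df1']}, {fit['df2']}) = {fit['f_stat']:.2f}")
print(f"t squared = {fit['t_slope'] ** 2:.2f}")
```

For these made-up numbers the fitted line works out by hand to ŷ = 2.2 + 0.6x, with SSE = 2.4, SSR = 3.6, and F(1, 3) = 4.5. The F statistic would still have to be compared with a critical value or converted to a p-value before drawing a conclusion; that step is left out here.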