




An overview of the concepts of validity and reliability in psychological research. It discusses various types of validity, including construct, internal, and external validity, and their importance in ensuring the accuracy and generalizability of research findings. The document also covers threats to validity, such as instrumentation, history, selection, experimenter bias, and confounding, and strategies for controlling these threats. Additionally, it touches upon the concept of reliability and its relationship to validity.
Typology: Study notes





TYPES OF VALIDITY OF MEASUREMENTS
1. Construct Validity
4. Criterion Validity of a Test - a test should relate closely to other measures of the same theoretical construct. EG: A valid test of intelligence should correlate highly with other intelligence tests. It should also correlate with behaviors that are considered to require intelligence, such as doing well in school. If the criterion of an intelligence test is whether it correlates with how well a child is doing in school at the time the test is given, it is called concurrent validity. If the criterion is how well the test can predict some future performance of the child, such as graduation from college, it is called predictive validity.
VARIABILITY AND MEASUREMENT ERRORS
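As a rough illustration of the concurrent versus predictive distinction, the sketch below correlates a test score with a criterion measured at the same time and with a criterion measured later. All scores are invented for illustration, and the names `iq_test`, `gpa_now`, and `graduated_later` are hypothetical; the Pearson correlation is one common way to express "relates closely", not the only one.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical data for eight children (invented, not from any real study).
iq_test = [95, 110, 102, 130, 88, 121, 99, 115]
gpa_now = [2.6, 3.2, 2.9, 3.8, 2.4, 3.5, 3.0, 3.1]  # criterion at time of testing
graduated_later = [0, 1, 1, 1, 0, 1, 0, 1]          # criterion years later (1 = yes)

# Concurrent validity: test vs. a criterion measured at the same time.
concurrent_validity = pearson_r(iq_test, gpa_now)
# Predictive validity: test vs. a criterion measured in the future.
predictive_validity = pearson_r(iq_test, graduated_later)

print(f"concurrent: {concurrent_validity:.2f}, predictive: {predictive_validity:.2f}")
```

The same function serves both cases; what changes is only when the criterion is measured.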
TYPES OF MEASUREMENT ERRORS - two types
1. Systematic Error (also known as Constant Error) - measurement error that is associated with a consistent bias. EG: Variation in body weight that is associated with IVs (loss of water during the night, thirst induced by salt, overeating, etc.) is not considered error. Weighing subjects in the morning as opposed to at night, or clothed as opposed to unclothed, introduces systematic error. However, this is not a problem as long as the procedure is kept consistent for all groups.
2. Random Error - variability in the DV that is not associated with the IV. EG: Precisely how a subject is weighed on a floor scale. Random error is introduced by exactly where the subject places their feet or how they lean on the scale. This error is a threat to the reliability of measurement because it reduces the precision with which the effects of the IV can be assessed.
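The weighing example can be sketched as a small simulation. This is only an illustration under stated assumptions: the true weights, the 2 kg effect of the IV, the 1.5 kg clothing bias, and the noise level are all invented, and `weigh` is a hypothetical helper.

```python
import random
from statistics import mean, stdev

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical true body weights in kg; the 2 kg treatment effect is assumed.
control_true = [rng.gauss(70, 5) for _ in range(2000)]
treated_true = [w - 2.0 for w in control_true]

def weigh(true_weights, constant_bias=0.0, noise_sd=0.0):
    """Observed weight = true weight + constant bias + random noise."""
    return [w + constant_bias + rng.gauss(0, noise_sd) for w in true_weights]

# Systematic error: every subject in both groups is weighed clothed (+1.5 kg).
control_clothed = weigh(control_true, constant_bias=1.5)
treated_clothed = weigh(treated_true, constant_bias=1.5)
effect_clothed = mean(control_clothed) - mean(treated_clothed)

# Random error: foot placement and leaning add unpredictable noise.
control_noisy = weigh(control_true, noise_sd=3.0)
spread_true = stdev(control_true)
spread_noisy = stdev(control_noisy)

print(f"group difference with constant bias: {effect_clothed:.2f} kg")
print(f"SD without noise: {spread_true:.2f}, with noise: {spread_noisy:.2f}")
```

The constant bias cancels out of the group comparison, whereas the random noise inflates the spread of scores and so makes the same effect harder to detect.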
RELIABILITY OF MEASURES - two types
1. Test-Retest - the same result can be obtained over time. EG: Time-dependent changes in the accuracy of a floor scale; retaking the SAT or GRE.
2. Internal Consistency - whether the various items on a test are measures of the same thing. EG: In tests of the internal consistency of DVs (Split-Half Reliability), the items on a test are divided into two separate tests. Scores on the two halves are correlated to see how closely individuals' scores agree on both halves (a good test has a high split-half correlation). The Kuder-Richardson 20 (KR-20) formula, used for tests whose items are scored right or wrong (such as multiple-choice tests), gives a single estimate that takes all possible split halves into account.
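A minimal sketch of both internal-consistency checks follows. The response matrix is invented, the odd/even split is just one of many possible splits, and the KR-20 computation uses the standard textbook formula with the population variance of total scores, which is an assumption about the variant intended here.

```python
from math import sqrt
from statistics import pvariance

# Hypothetical right/wrong (1/0) answers: one row per test taker, one column per item.
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 1, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

def split_half_r(rows):
    """Correlate each person's score on odd-position items with even-position items."""
    first_half = [sum(row[0::2]) for row in rows]
    second_half = [sum(row[1::2]) for row in rows]
    return pearson_r(first_half, second_half)

def kr20(rows):
    """KR-20 = k/(k-1) * (1 - sum(p*q) / variance of total scores)."""
    n, k = len(rows), len(rows[0])
    totals = [sum(row) for row in rows]
    pq_sum = 0.0
    for item in range(k):
        p = sum(row[item] for row in rows) / n  # proportion answering correctly
        pq_sum += p * (1 - p)
    return (k / (k - 1)) * (1 - pq_sum / pvariance(totals))

print(f"split-half r: {split_half_r(responses):.2f}, KR-20: {kr20(responses):.2f}")
```

A single split depends on which items land in which half; KR-20 avoids that arbitrary choice by working from item-level statistics.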
VALIDITY OF RESEARCH - Problems that threaten validity
THREATS TO INTERNAL VALIDITY
Eight Major Threats to Internal Validity:
1. History (Subject Traits and Outside Influences)
History can be divided into two parts: A. Proactive History - refers to learned and inherent differences that subjects bring with them to the study (height, weight, sex, etc.). EG: Random assignment of subjects to conditions usually controls for this. B. Retroactive History - refers to events that occur between the 1st and 2nd measurement.
EG: Subjects are given an "attitudes toward police" scale during the 1st week of a study. Between the 1st and 2nd rating sessions, subjects hear a news story about two students who were killed by police. Identifying and removing subjects who have been contaminated by this event is a common way to control for this.
2. Maturation - a source of error related to the amount of time between measurements. It is a more critical problem with children because they change more rapidly over time than adults. EG: A study examining motor learning in children would need to take into account significant lapses of time in which changes could occur in motor coordination, knowledge, and the like that could influence results. EG: Maturation can also be the critical feature of a study, such as attitudes toward alternative lifestyles as a function of age.
3. Testing - effects that may occur on scores when a test is repeated. Being tested influences performance in later experiments or later administrations of the test. Subjects become sophisticated about the testing procedure or may learn how to take tests, so that their later behavior is changed by earlier experience (Practice Effect). EG: Students generally do better on second and later tests in a course after experience with the style of testing. The phenomenon is similar to maturation in that subjects change over time, but differs in that the change is caused by the testing procedure itself rather than by processes unrelated to the test.
4. Statistical Regression - operates when groups are selected on the basis of extreme scores. It is the tendency of subjects with extreme scores on a first measure to score closer to the mean on a second testing. EG: A regression effect can occur when 2 different variables are correlated, such as SAT score and college GPA. It may also occur when the same variable is measured twice, such as when a student repeats the SAT. It arises when there is error associated with the unreliability of the measuring device (i.e., the test itself is not a perfect measure of the construct being measured). Another example is blood pressure. A doctor may take many blood pressure readings from a patient, but the doctor knows that the first reading is generally high. Most readings after this decrease toward the individual's normal blood pressure. This is regression toward the mean and is typical of most subject responses. It is controlled by retesting subjects or taking more than one measurement. EG: Classic example: a teacher notices that students who scored highest on the first test usually do less well on the second, whereas those who did the worst improve. The teacher often concludes that the ones who did well the first time rested on their laurels for the second test, whereas the ones who did poorly worked harder. In reality, this is not what happened. Whenever random error exists in the measurement of a variable, individuals will deviate from their true score by chance. Solution: test them repeatedly until they bleed. On retest, errors will tend to average out, and the scores of these previously extreme individuals tend to return toward their true value, closer to the mean.
5. Selection - Except for random selection, any procedure used to choose subjects may result in a sample carrying traits that are not representative of the population as a whole. EG: Many studies compare 2 or more groups. Neo-Nazi skinheads would not be a good group to compare with the Society to Protect Baby Seals (or would it?). But would skinheads from Detroit be a good choice to compare with skinheads from Dresden? Control by matching subjects or by making the characteristic an IV.
conducted their own experiment instead of the one you thought they were conducting. Whenever people are aware that they are participating in an experiment, their behavior may be different from their everyday behavior. A common example is the reaction of people to having a movie camera pointed at them.
These preconceived ideas lead to several subject tendencies: A. Good-Subject Tendency: the tendency of experimental participants to act according to what they think the experimenter wants. Subjects may pretend to be fooled by instructions in order to be "good subjects". EG: Subjects may deliberately feign a naive attitude about expected results even though they can guess the true purpose of the study (because they heard about it or learned of similar studies elsewhere).
B. Evaluation Apprehension: also known as social desirability - concern on the subject's part about the impression their behavior will make on the experimenter (subjects try to appear as socially desirable as possible). EG: Some subjects are convinced that the experiment is a carefully disguised measure of intelligence or emotional adjustment. This expectancy gives rise to evaluation apprehension, in which participants tailor their behavior to make themselves look as normal as possible. One control is to develop attitude scales on which the various responses appear equally socially desirable, so that subjects will not damage results by concealing their true attitudes. EG: Effects of pornography on sexual behavior. Participants are asked to keep a diary of all sexual activity for a week before and after they are shown a pornographic movie. People would hesitate to volunteer information about deviant activities. Even if they were honest, they might modify their behavior in the direction of social desirability.
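Returning to the statistical regression threat (item 4 above), the teacher example can be reproduced with a short simulation. This is a sketch under assumed numbers: the score scale, error size, and 10% cutoffs are invented, and `administer` is a hypothetical helper. No student's true ability changes between the two tests.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical true abilities; each observed score adds independent random error.
true_scores = [rng.gauss(500, 80) for _ in range(5000)]

def administer(abilities, error_sd=60):
    """One test administration: observed score = true score + random error."""
    return [t + rng.gauss(0, error_sd) for t in abilities]

test1 = administer(true_scores)
test2 = administer(true_scores)  # same students, same abilities, new errors

# Select the extreme groups on the basis of the first test only.
ranked = sorted(test1)
low_cut = ranked[int(0.1 * len(ranked))]
high_cut = ranked[int(0.9 * len(ranked))]
top = [i for i, s in enumerate(test1) if s >= high_cut]
bottom = [i for i, s in enumerate(test1) if s <= low_cut]

top_first = mean(test1[i] for i in top)
top_retest = mean(test2[i] for i in top)
bottom_first = mean(test1[i] for i in bottom)
bottom_retest = mean(test2[i] for i in bottom)
overall = mean(test1)

print(f"top 10%:    first {top_first:.0f} -> retest {top_retest:.0f}")
print(f"bottom 10%: first {bottom_first:.0f} -> retest {bottom_retest:.0f}")
```

The top group drops and the bottom group rises on retest even though nobody rested on their laurels or worked harder, because part of each extreme first score was chance error that does not repeat.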
THREATS TO EXTERNAL VALIDITY
Three Important Threats to External Validity:
1. Other Participants - We must not assume that any animal can be substituted for any other in all situations. EG: College students and white rats are readily accessible, but they are also presumed to be representative. Although we are interested primarily in human behavior, the degree to which common principles of behavior operate across species is impressive. Skinner (1956) showed that the behavior of pigeons, rats, and monkeys under certain experimental conditions is identical in all important respects. Regardless, human subjects should be chosen with attention to their representativeness relative to some larger population.
2. Other Times - Many historical trends render particular research findings invalid, whether they concern the use of language, attitudes toward foreign countries, or the perception of deviant groups. EG: Perceptions of and attitudes toward sex have changed over time (becoming less or more stringent).
3. Other Settings - How a phenomenon observed in one laboratory can be related to a similar phenomenon observed in another laboratory or in the real world. EG: Although laboratory research ensures a higher level of control, it is sometimes not easy to decide whether a certain effect is simply a laboratory effect or whether it would survive transplantation to the outside world.
THREATS TO STATISTICAL VALIDITY
VALIDITY in a NUTSHELL (17 points)