







An in-depth exploration of validity and reliability in psychometric assessments. It covers various types of validity, including face, content, criterion, concurrent, and predictive validity, as well as construct validity. Additionally, it discusses different types of reliability, such as test-retest, multiple forms, and internal consistency. The document also addresses threats to internal and external validity and ways to improve validity and reliability.
Typology: Lecture notes








Content Validity
● A less subjective form of validity than face validity, although it extends from it. Content validity relies on an assessment of whether the proposed measure incorporates all content of a particular construct.
Criterion Validity
● A much more objective form of validity than both face validity and content validity. Criterion validity relies on comparing the proposed measure with a measure already established as valid for the variable of interest.
Concurrent Validity
● Concurrent validity is demonstrated when your proposed measure and a measure with established validity are applied to the same data set and their results correlate strongly.
● A correlation of r = 0.50 is the minimum acceptable for declaring concurrent validity.
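The correlation check described above can be sketched in a few lines of Python. This is a minimal illustration, assuming Pearson's r is the correlation in question (the notes do not name the coefficient). The function name `pearson_r` and the score lists are hypothetical and made up for the example.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of the same length (at least 2 scores)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a list has no variation")
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical scores for the same six subjects (illustrative data only)
proposed = [12, 15, 9, 20, 17, 11]      # scores on the proposed measure
established = [14, 16, 10, 22, 15, 12]  # scores on the established measure

r = pearson_r(proposed, established)
meets_threshold = r >= 0.50  # minimum stated in these notes
print(f"r = {r:.2f}, meets 0.50 minimum: {meets_threshold}")
```

The same function could be reused for the reliability checks later in these notes by swapping in the r ≥ 0.80 cutoff.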
Predictive Validity
● An evaluation of your proposed measure's ability to predict future events.
● Unlike concurrent validity, predictive validity first requires a collection of scores on the measure, followed by a collection of scores on the criterion measure at a later time.
Construct Validity
● A complex form of validity, first identified by Lee J. Cronbach and Paul E. Meehl in 1955. Construct validity evaluates whether your proposed measure correlates with all concepts related to the theory under investigation.
● Reliability can be broken down into the following sub-types:
Test-Retest Reliability
● The most common form of reliability measure, test-retest reliability refers to the correlation between multiple applications of a single measure conducted at different times.
● Generally, r ≥ 0.80 is indicative of a reliable measure.
Multiple Forms Reliability
● Similar to test-retest reliability, multiple forms reliability refers to the correlation between equivalent forms of a measure (e.g. the same measure with differently worded questions).
● As before, r ≥ 0.80 is indicative of a reliable measure.
● Multiple forms reliability has an advantage over test-retest reliability, in that it can be applied in a single session of testing. In this circumstance, subjects are given both forms of the measure concurrently.
● Unfortunately, subjects may come to realize that certain questions measure similar concepts and modify their answers accordingly. This phenomenon is called "multiple testing effects" and also occurs with test-retest reliability.
● Cronbach's alpha is another internal consistency approach, used to overcome disadvantages seen with the split-half reliability approach.
● Cronbach's alpha is, in essence, the average of all possible split-half correlations within a measure.
● Calculating Cronbach's alpha by hand is beyond the scope of this tutorial, but it can easily be done using statistical software packages such as SPSS.
● Cronbach's alpha is simply another technique used to establish reliability for a measurement scale.
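For readers without access to a statistics package, the calculation can be sketched in Python. This is a minimal illustration, assuming the commonly used variance form of the coefficient, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores), rather than a literal average over split halves. The function name `cronbach_alpha` and the response data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    if len(responses) < 2:
        raise ValueError("need at least two respondents")
    k = len(responses[0])
    if k < 2 or any(len(row) != k for row in responses):
        raise ValueError("every respondent needs the same number (>= 2) of items")
    # Sample variance of each item (column) across respondents
    item_variances = [variance(column) for column in zip(*responses)]
    # Sample variance of each respondent's total score
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical answers of five respondents to a four-item scale (1-5 ratings)
responses = [
    [4, 5, 4, 4],
    [2, 3, 2, 3],
    [5, 5, 4, 5],
    [3, 3, 3, 2],
    [1, 2, 1, 2],
]

alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha = {alpha:.2f}")
```

Values closer to 1 indicate that the items vary together, which is what internal consistency is meant to capture.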
What is Internal Validity?
● Internal validity seeks to answer the question: Is the observed effect of the independent variable on the dependent variable real?
● Put another way, internal validity asks whether the observed change in the measured variable (dependent variable) is due to the stimulus (independent variable) or to extraneous sources.
● If the observed effect is determined to be due to the stimulus, then the measure is said to be internally valid.
● Threats to internal validity include the following:
History
● Most prevalent in field experiments, history refers to events external to the study that may affect the dependent variable.
Maturation
● Refers to changes in an experimental environment strictly attributable to the passage of time.
Testing
● Refers to threats created by a subject having multiple exposures to the same measure.
Instrumentation
● Refers to a threat that occurs when a researcher records observations differently at the start of a study than at the end of the same study.
Selection
● This threat to internal validity refers to the recruitment of different types of people for different groups within a study, which makes comparisons between the groups largely uninformative.
Experimental Attrition
● This threat refers to differing drop-out rates between groups within a study. As with selection, this results in comparison between the groups being mostly useless.
● When a measure has insufficient validity or reliability, researchers often attempt to redesign the measure, in an effort to reach acceptable levels.
Improved training of those who apply the measure
● Especially useful for subjective measures, researchers can train users of the measure to detect and avoid the introduction of bias.
Interview subjects who have experienced the measuring device
● Receiving feedback from unbiased subjects can illuminate previously unperceived shortcomings of the measure. For instance, an item the researcher felt was crystal clear may have been viewed as ambiguous by the subject.
Assess each item on a multiple-item questionnaire
● Taking this step may allow for redundant or faulty items to be identified and removed.
Improve testing on minority populations
● Often, researchers do not consider the uniqueness of minority populations in the development of a measure. Proper translation, if applicable, is a must.
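The item-by-item assessment mentioned above can be illustrated with a short Python sketch. It assumes one common approach that these notes do not prescribe: correlating each item with the total of the remaining items (a "corrected item-total correlation") and flagging items that fall below a cutoff. The function names, the 0.30 cutoff, and the response data are all hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a list has no variation")
    return cov / math.sqrt(ss_x * ss_y)


def corrected_item_total(responses):
    """Correlate each item with the sum of all other items, per respondent."""
    k = len(responses[0])
    correlations = []
    for i in range(k):
        item = [row[i] for row in responses]
        rest = [sum(row) - row[i] for row in responses]  # total without item i
        correlations.append(pearson_r(item, rest))
    return correlations


# Hypothetical answers of five respondents to a four-item questionnaire
responses = [
    [4, 5, 4, 2],
    [2, 3, 2, 4],
    [5, 5, 4, 1],
    [3, 3, 3, 5],
    [1, 2, 1, 3],
]

CUTOFF = 0.30  # assumed cutoff for illustration, not taken from these notes
correlations = corrected_item_total(responses)
flagged = [i for i, r in enumerate(correlations) if r < CUTOFF]
print("item-rest correlations:", [round(r, 2) for r in correlations])
print("items to review (0-based index):", flagged)
```

In this made-up data set the last item runs against the other three, so it is the one flagged for review.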
measures what it claims to measure.
measure's results for each application of the measure.
which associates the variable of interest with the proposed study variable, by relying heavily on logic and common sense.
Monette, D.R., Sullivan, T.J. & DeJong, C.R. Applied Social Research: A Tool for the Human Services, Sixth Edition. 2005. Thomson Learning Inc: Toronto, Canada.