Methods and Legal Implications of Behavioral Assessment and Psychological Testing, Exams of Psychology

An in-depth exploration of behavioral assessment and psychological testing, focusing on the development of tests, their applications, and the legal framework surrounding their use. Topics covered include behavior rating scales, single area behavior rating scales, behavioral assessment, classical conditioning, and the DSM-5. The document also discusses the qualifications required for test users and the sources of laws governing psychological testing. Additionally, it touches upon the Education of the Handicapped Act, the Family Educational Rights and Privacy Act, the No Child Left Behind Act, and the Every Student Succeeds Act.

Typology: Exams

2023/2024

Available from 05/23/2024

CarlyBlair

Testing and Measurement - TEST 3
4 major uses of objective personality tests -
1) Clinical (diagnosis, tx planning, forensic)
2) Counseling (ex - couple's therapy)
3) Personnel selection (to ID traits predictive of success or to ID "problem" traits)
4) Research (structure of personality; reliability/validity of tests for diff populations; determine how
personality variables relate to other variables)
Response set/style -
A person's tendency (either conscious or unconscious) to respond to items in a certain way,
independent of the person's true feelings about the items
- yea-sayer: tendency to agree w/ almost any statement
- nay-sayer: tendency to disagree w/ almost any statement
Response distortion: a person's true feelings are distorted in some way by the response set
Impression management & Faking -
A person wants to create a certain impression by their responses
Faking: deliberate attempt to create a favorable or unfavorable impression
- faking good: to create a favorable impression (Socially desirable responses > generally approved by
society, ex: being friendly/industrious)
- faking bad: to create an unfavorable impression (sometimes called malingering)
- Basic prob: disentangling true personality traits from response sets
4 Strategies for Dealing w/ Response Set and Faking -
1) Checking responses to items with extreme empirical frequencies for normal groups (ex:
"I like every person I meet" > faking good; "Newspapers are full of lies" > faking bad)
2) Checking on response consistency on same or similar items ("I rarely lose my temper" and "I get
mad easily")
3) Balancing direction of items (controls for yea-saying/nay-saying response tendencies)
4) Using forced-choice method for items matched on relevant variable (have to pick statement that best
describes you, ex: "I usually work hard" or "I like most people I meet"; requires examinees to choose
between statements matched on SOCIAL DESIRABILITY)
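Strategy 2 (checking response consistency) and the yea-/nay-saying idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the item keys, the second item pair, and both function names are hypothetical and do not come from any published inventory.

```python
from typing import Dict, List, Tuple

# Hypothetical item pairs for illustration. The first pair comes from the notes
# ("I rarely lose my temper" / "I get mad easily"); the second is made up.
# "opposite" pairs should receive different True/False answers,
# "similar" pairs should receive the same answer.
ITEM_PAIRS: List[Tuple[str, str, str]] = [
    ("rarely_lose_temper", "get_mad_easily", "opposite"),
    ("usually_work_hard", "put_effort_into_tasks", "similar"),
]


def inconsistency_count(responses: Dict[str, bool],
                        pairs: List[Tuple[str, str, str]] = ITEM_PAIRS) -> int:
    """Count item pairs answered in a logically inconsistent way."""
    count = 0
    for item_a, item_b, relation in pairs:
        same_answer = responses[item_a] == responses[item_b]
        if relation == "opposite" and same_answer:
            count += 1
        elif relation == "similar" and not same_answer:
            count += 1
    return count


def proportion_true(responses: Dict[str, bool]) -> float:
    """Share of items answered True; on a balanced scale, values near 1.0 or
    0.0 may point to yea-saying or nay-saying."""
    return sum(responses.values()) / len(responses)


# A respondent who agrees with everything, including contradictory items.
yea_sayer = {
    "rarely_lose_temper": True,
    "get_mad_easily": True,
    "usually_work_hard": True,
    "put_effort_into_tasks": True,
}
print(inconsistency_count(yea_sayer))  # 1
print(proportion_true(yea_sayer))      # 1.0
```

A count like this is the kind of separate score the notes call a validity index; where to set a cutoff is an empirical question that this sketch does not address.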
>>> some of these strategies result in separate scores called "validity indexes"
Content Method (Logical/Rational) - Approach to Objective Personality Test Development -
Develop test items/scales based on simple, straightforward understanding of what you
want to measure
PROs: Simple; easy to generate items
CONs: High face validity >> susceptible to distortion through response styles (faking good/bad)
Criterion-Keying Approach - Approach to Objective Personality Test Development -
Items selected strictly in terms of their ability to discriminate between 2 well-defined
groups of examinees (ex: non-depressed vs depressed) > ex: MMPI; Strong Interest Inventory


PROs: directness/simplicity encourage new research; focuses attention on exactly what we want the test to do
CONs: very nontheoretical orientation limits generalizability of score interpretations; applicable only when there are very well-defined criterion groups; there is always overlap in group distributions
Factor Analysis - Approach to Objective Personality Test Development -
Goal: to ID the dimensions underlying a large number of items in a personality measure; factors are ID'd by examining the correlations among all the items
PRO: Brings order to an undifferentiated mass of items
CONs: Final results depend on the content of the initial pool of items; doesn't yield a definitive set of factors (more fluid)
Theory Driven Approach to Objective Personality Test Development -
References a specific personality theory (items are created to reflect the theory)
PROs: provides a good operational def of the theory; a good test of a good theory can have great utility
CONs: the test's utility is limited by the theory's validity; how well does the test actually reflect the theory?
Combinations Approach to Objective Personality Test Development -
Most tests employ multiple approaches
Classification Scheme for Objective Tests -
Orientation: Normal/Abnormal
Scope of Coverage: Comprehensive/Specific

  • Normal/Comprehensive: NEO-PI3 (NEO Personality Inventory)
  • Normal/Specific: Piers-Harris-2 (children's self-concept)
  • Abnormal/Comprehensive: MMPI-
  • Abnormal/Specific: Beck Depression Inventory - 2
6 common characteristics of COMPREHENSIVE Inventories -
  1. Many (hundreds) of items
  2. Admin time: 30-90 min
  3. Many scores
  4. Applications: wide variety
  5. Norm groups: good representativeness
  6. Reports: elaborate, often narrative
6 common characteristics of SPECIFIC domain tests -
  1. Few items (usually around 20-80)
  2. Admin time: brief (~10-15 min)
  3. Few scores (often only one)
  4. Applications: narrow/targeted
  5. Norm groups: more limited
  6. Scoring/reports: simple
Trait theories of personality -
Traits: emotional, cognitive and behavioral tendencies that constitute underlying personality dimensions on which individuals vary

Piers-Harris 2 Psychometrics -
  • Norms: 1400 youth representative of US pop
  • Reliability (Internal Consistency): .90 for total score and .78 for domain scores
  • Validity (factor analysis): median intercorrelation among domains is .64 (some in the .80s) >> too HIGH for domains purported to be relatively independent
Trends in Objective Personality Tests -
  1. Many new tests are being published, and refinements of existing instruments continually appear
  2. Methods of test development have matured (dealing with response sets/faking; demonstrating test validity empirically through criterion-keying and research on group differences)
  3. Managed care prefers objective tests due to convenience and lower cost
  4. Narrative reports are now common (caution needed in interpreting them bc ultimate responsibility for interpretation rests w/ the psychologist, not w/ the computer generating the report)
  5. Online administration is developing rapidly
Behavior Rating Scales -
  • Widely used for assessment of attention disorders, hyperactivity, depression and various emotional problems
  • essential features: 1) usually completed by someone other than the person being evaluated (teacher/parent), using Likert scales, and 2) items list very specific behaviors
  • 2 broad categories: 1) multi-score systems and 2) single-area scales
Multi-score Behavior Rating Scales -
3 main multi-score systems:
  1. Behavior Assessment System for Children (BASC-3)
  2. Child Behavior Checklist (CBCL)
  3. Conners' Rating Scale (Conners 3)
Each system has separate scales to be completed by parents, teachers, and the child
  • covers many problem areas and yields many scores (aggression, anger, hyperactivity, inattention, etc); some scales measure positive behavioral competencies
  • takes about 20-30 min to complete
Single Area Behavior Rating Scales -
  • Same type of items/response scales as multi-score systems but concentrate on ONLY ONE area (ex: inattention; ADHD Rating scales)
  • Tend to be much shorter than the multi-score systems (takes about 10 min to complete)
Behavioral Assessment (BA) -
General approach to gathering info about human characteristics (categories of 6 techniques rather than specific tests)
  • observe behavior directly rather than asking test questions about it (contrasted w/ traditional assessment, i.e. tests)
3 main principles:
  1. Focus on overt behaviors rather than underlying traits
  2. To measure a particular kind of bx, get as close to that bx as possible
  3. Relate the measurement as much as possible to what you want to do with the information
Behavioral Assessment Development -
Reaction against: Psychodynamic theory (remote roots); Projective techniques (ex - Rorschach); Measurement of very generalized personality traits w/ paper-pencil tests
  • Developed in conjunction w/ behavior therapy, which applied the learning theories of Pavlov and Skinner to clinical problems (needed new measures, ex: counts of specific behaviors)
Classical (Pavlovian) Conditioning (incl acquisition, extinction and spontaneous recovery) -
Reflexive salivation in dogs could be elicited by neutral stimuli associated w/ feeding
  • conditioning: form of associative learning
  • unconditioned: automatic/reflexive/unlearned
  • UCS > UCR; eventually CS > CR (same response)
  • Acquisition: Learning of a response based on the pairing between a CS and a UCS (usually takes 3- pairings)
  • Extinction: weakening of a CR when the CS is presented repeatedly w/o the UCS
  • Spontaneous recovery: the reemergence of a previously extinguished CR (usually short-lived)
  • Best temporal order: CS presented first, then UCS
Instrumental Conditioning -
Thorndike's Law of Effect (behavior is controlled by its consequences) > when bx is followed by positive consequences, it tends to be repeated, and vice versa
  • called instrumental conditioning bc the behavior is INSTRUMENTAL to achieving a more satisfying state of affairs (environment is INSTRUMENTAL to changing bx); ex: cat/box/food
Operant Conditioning -
Skinner, who spent years experimenting w/ the ways in which bx is controlled by its consequences, called instrumental bx operant conditioning
  • learning that results when an organism associates a behavior w/ a particular environmental event >> our behaviors OPERATE on the environment
  • operants: behaviors that are emitted by an organism (rather than elicited by the environment)
  • Skinner box; ABCs
Reinforcement -
An environmental consequence occurring after a behavior, which increases the probability that the behavior will recur
  • Positive reinforcement: PRESENTATION of a reward after a behavior makes the behavior more likely to recur
  • Negative Reinforcement: REMOVAL of an aversive event after a behavior makes the behavior more likely to recur
  • Extinction: removal of the reinforcer
Punishment -
An environmental consequence occurring after a behavior, which decreases the probability that the behavior will recur
  • Positive punishment: PRESENTATION of an aversive event after a behavior makes the bx less likely to recur
  • Negative punishment: REMOVAL of a reward after a behavior makes the bx less likely to recur
Naturalistic Observation: Behavior Assessment Technique -
  1. Settings for administration: group vs individual in clinical setting
  2. Clinical tests emphasize diagnosis, treatment planning and follow-up evaluation
Clinical Interview -
Frequency of use: every clinician conducts an interview, but interviews differ immensely
Degree of structure (continuum):
  • Unstructured: "traditional" interview that follows no pattern, varies by clinician and client
  • Semi-structured: Some standard questions but is tailored partly to the individual client
  • Structured: covers the same topics with the same questions with every client
DSM-5; mental disorder; syndrome -
Diagnostic and Statistical Manual of Mental Disorders
  • mental disorder: "a SYNDROME characterized by clinically significant disturbance in an individual's COGNITION, EMOTION regulation, or BEHAVIOR that reflects a dysfunction in the psychological, biological or developmental processes underlying mental functioning."
  • CATEGORICAL: provides specific diagnostic criteria for each mental disorder
  • based on a disease (medical) model > psychopathology reflected in discrete symptoms called syndromes (assumed to have discrete causes, ETIOLOGIES, and to be treated w/ different therapies)
  • syndrome: constellation of symptoms given a label
SCID-5 types -
Structured Clinical Interview for DSM-5 Disorders (specifically aligned w/ DSM- disorders). Wasn't published until Oct 2015, even though DSM-5 was published in May 2013
Several versions:
  • SCID-5-CV: Clinical Version
  • SCID-5-RV: Research Version
  • SCID-5-CT: Clinical Trials Version
  • SCID-5-PD: Personality Disorders
SCID-5 Organization -
Organized into modules > begins w/ an open-ended overview of the present illness (to form tentative diagnoses, to be ruled out by modules); then a systematic inquiry about the presence/absence of particular DSM-5 criterion modules
  • ratings run from 1 (absent/false) to 3 (threshold/true)
Interviewer then fills out Summary Score Sheet; ends with optional Social and Occupational Functioning Assessment Scale
MMPI/MMPI-2 -
  • Original edition (1942): pioneer in criterion-keying (comparing scores of "normals" to diagnosed criterion clinical groups) and use of validity indexes (to look at response biases)
  • MMPI-2 (1989): 567 T/F items, 60-90 minutes, numerous scores, very widely used/researched
Changes from MMPI to MMPI-2 -
  1. New norms developed
  2. Some items revised/replaced
  3. Several new validity indexes were added
  4. Clinical scales are now referenced by number rather than diagnostic category
  5. The T-score to signal high (clinically significant) scores was lowered from 70 to 65
  6. Unlike the MMPI, which was used w/ all ages, the MMPI-2 is to be used only with adults 18+ (an adolescent version, MMPI-A, was also created)
MMPI-2 Validity Indexes (4 traditional, from MMPI) -
  1. ? (Cannot Say): a simple count of the number of OMITTED or double-marked items
  2. L (Lie): 15 items that assess the denial of commonplace weaknesses > may indicate a tendency to FAKE GOOD (naive; keyed false, so susceptible to nay-sayers; negatively correlated w/ education). Ex: "I do not always tell the truth"
  3. F (Infrequency): 60 items w/ low endorsement rates in normals (<10%); may indicate an attempt to FAKE BAD or may indicate severe psychopathology (ex: "there is something wrong w/ my mind")
  4. K (Correction): 30 items that may indicate a FAKE-GOOD response tendency, but at a more SUBTLE level than the L-Scale items. Positively correlated w/ education. Ex: "People often disappoint me."
New Validity Indexes - MMPI- TRIN; VRIN; Fb; Fp -
  1. TRIN (True Response Inconsistency): # of true or false responses on 23 pairs of items opposite in content (a high TRIN score may indicate a YEA- or NAY-sayer)
  2. VRIN (Variable Response Inconsistency): # of inconsistent responses to 67 pairs of items similar OR opposite in content
  3. Fb (Back Infrequency): 40 F-type items (endorsed by <10% of normals) that occur in the second half of the test >> helps determine if later response patterns are similar to EARLIER ones
  4. Fp (Infrequency-Psychopathology): 27 items that are endorsed rarely (<20%) by psychiatric inpatients >> less contaminated by psychopathology than F; may be a better measure of FAKING BAD
MMPI-2 Clinical Scales (10) -
  1. Hypochondriasis > somatic complaints; excessive concern about health status
  2. Depression > depressive symptoms (pessimism, helplessness, etc)
  3. Hysteria > conversion disorder (turning psych symptoms into somatic symptoms)
  4. Psychopathic Deviate (antisocial personality disorder) >> difficulty incorporating social standards/values
  5. *** Masc/Fem
  6. Paranoia >> Oversensitivity to paranoid delusions
  7. Psychasthenia >> anxiety/agitation
  8. Schizophrenia
  9. Hypomania >> high energy/narcissism/mania
  10. *** Social Introversion
MMPI-2 Codetypes -
Codetype: 2 highest elevated clinical scales at a T-score of 65 or higher, highest listed first
  • lots of research determining characteristics of codetypes, over 90 possible
  • WELL-DEFINED codetype: 5 T-score points higher than the other clinical scales
  • SPIKE: only one clinical scale is elevated
  • Within-Normal-Limits: no clinical scales are above 65
MMPI-2: Content Scales -
Developed by rational analysis of the items (not criterion keying) >> experts grouped together items that seemed to measure the same construct, which were refined by item-total correlations to increase homogeneity
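The codetype rules listed under MMPI-2 Codetypes can be sketched as a small Python function. This is a rough illustration, not a scoring program: the function name, the hyphenated code format, and treating a T-score of exactly 65 as elevated are assumptions, and "well-defined" is read here as the lower scale in the codetype sitting at least 5 T-score points above the next-highest clinical scale.

```python
from typing import Dict

ELEVATION_CUTOFF = 65   # T-score cutoff given in the notes
WELL_DEFINED_GAP = 5    # T-score gap given in the notes


def classify_profile(t_scores: Dict[str, int]) -> dict:
    """Classify a clinical-scale profile as WNL, spike, or two-point codetype."""
    # Rank scales from highest to lowest T-score.
    ranked = sorted(t_scores.items(), key=lambda kv: kv[1], reverse=True)
    elevated = [(scale, t) for scale, t in ranked if t >= ELEVATION_CUTOFF]

    if not elevated:
        return {"pattern": "within-normal-limits", "code": None, "well_defined": False}
    if len(elevated) == 1:
        return {"pattern": "spike", "code": elevated[0][0], "well_defined": False}

    # Two highest elevated scales, highest listed first.
    code = f"{elevated[0][0]}-{elevated[1][0]}"
    # Assumed reading of "well-defined": the second scale in the code is at
    # least 5 points above the third-highest scale (if there is one).
    if len(ranked) > 2:
        well_defined = elevated[1][1] - ranked[2][1] >= WELL_DEFINED_GAP
    else:
        well_defined = True
    return {"pattern": "codetype", "code": code, "well_defined": well_defined}


# Made-up T-scores keyed by clinical scale number (0 = Social Introversion).
profile = {"1": 55, "2": 78, "3": 60, "4": 50, "5": 48,
           "6": 58, "7": 72, "8": 64, "9": 45, "0": 52}
print(classify_profile(profile))
# {'pattern': 'codetype', 'code': '2-7', 'well_defined': True}
```

Tied T-scores are not handled specially here; published interpretive guides may apply additional rules that this sketch does not attempt to reproduce.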

50 scales total

  • validity: 15 total (8 from MMPI-2)
  • Higher order: from factor analysis of RC, incl emotional dysfunction (2-7), thought dysfunction (6-8), behavioral dysfunction (4-9)
  • Restructured Clinical Scales - 9 scores
  • Specific Problems Scales, w/ 4 subcategories: Somatic/Cognitive; Internalizing (anxiety, etc); Externalizing (aggression, etc); Interpersonal
  • Interest Scales (aesthetic-literary; mechanical-physical)
  • Personality Psychopathology Five
Beck Depression Inventory (BDI-II) -
  • very widely used self-report instrument for measuring the severity of depression in adolescents and adults (ages 13-80)
  • 21 items, 4-point response scale (choose one statement in each group that best describes the way you feel during the past 2 weeks) >> summing responses leads to raw score from 0 to 63
  • no traditional norms (eg Standard Scores)
  • Cut-points for the degree of depression were derived empirically to distinguish among several clinically evaluated groups (20-28 is moderate, 29-63 is severe)
  • Reliability: Internal consistency is .92, test-retest is.
  • Validity: Support for both convergent and discriminant validity; 2 factors: Somatic-Affective and Cognitive
5 trends in Clinical Instruments -
  1. Influence of DSM: nearly all clinical instruments attempt to link results to DSM diagnostic categories
  2. Emphasis on treatment planning and follow-up evaluation: not just diagnosis
  3. Use of briefer instruments (due to managed care and need for follow-up evaluation)
  4. Growth in number of instruments: especially briefer instruments
  5. Increased use of online administration/scoring and interpretive reporting
Projective Technique -
  • 2 major characteristics: ambiguous stimuli and constructed-response (free-response) format
  • Projective hypothesis: if the stimulus for a response is ambiguous, then the response itself will be determined by the examinee's personality dynamics (desires, fantasies, inclinations, fears, motives, etc)
Unconscious motivation -
Historical connection w/ Freud and the psychodynamic approach (ppl may be unaware of the motives for their bxs)
  • unconscious motivation may be assessed using projective techniques in which a person is asked to describe a vague stimulus >> idea is that the examinee's verbal descriptions of the scene will reflect his or her unconscious motivation
Rorschach -
  • Most widely used projective technique
  • Materials: 10 bilaterally symmetrical inkblots presented in standard order and in a standard orientation (diff colors)
  • Admin and scoring: Exner's Comprehensive System (CS) is the "industry standard"

Exner's Comprehensive System (CS) -
Admin: 2 phases
  1. Response Phase: card handed to examinee, examiner asks "what might this be?"; responses recorded verbatim
  2. Inquiry Phase: Each of the 10 inkblots is presented again; examiner asks examinee to explain/elaborate on answers given in the response phase
  • "protocol" = record of responses
  • "Coding" = scoring the responses (interpretation of coded responses is based on a norm-referenced approach)
Examples of CS Codes -
Location: part of the card the examinee referenced
  • W: whole response
  • D: Common detail
  • Dd: unusual detail (<5%)
  • S: Space (white space) > "rebellious toward authority"
Determinants: what features of the inkblot influenced the responses (ie color, movement, animal/human figures)
  • CS codes summarized in the Sequence of Scores and Structural Summary
Evaluation of the Rorschach -
  • the use of Exner's CS norms leads to overperception of psychopathology (bad for forensics and clinical practice)
  • Response Frequency (R): clients who give more responses appear more pathological on various CS indexes
  • Factor Structure: various scores do not intercorrelate in a way consistent w/ theories
  • Reliability: Inter-rater reliability is poor for about 50% of CS variables
  • Validity: meta-analyses give validity coefficients of .30 (+/- .05) >> low but not awful, BUT this may be an overestimate bc of publication bias
  • Incremental Validity: NO evidence for nearly all Rorschach scores
Future of the Rorschach: R-PAS -
Rorschach Performance Assessment System: approach to using the Rorschach inkblot test in applied practice; seeks to further the psychometric development of Rorschach interpretations by:
  • grounding the Rorschach in its evidence base
  • improving the normative foundation
  • integrating international findings
  • reducing examiner variability
  • increasing utility
Thematic Apperception Test -
Very widely used projective technique (but some decline recently)
  • 31 cards, every person sees 20 cards (based on gender/age)
  • asked to tell a story, including thoughts/feelings; past/present/future; response recorded verbatim.
  • NO standard administration or scoring, so v varied
  • respectable reliability/validity for well-defined constructs (achievement themes and success in business)
  • most reviews unfavorable bc not systematic

291 items in six categories > Occupations, Subject Areas, Activities, Leisure Activities, People, and Your Characteristics.

  • 5-point response scale, takes 35-40 min to complete, intended for HS, college and adults
Strong Interest Inventory SCORES -
  1. General Occupational Themes (GOT): based on RIASEC Holland hexagon
  2. Basic Interest Scales (BIS): based on factor analysis (grouped w/in GOTs)
  3. Occupational Scales (OS): classic Strong scales, criterion keyed to various occupational groups
  4. Personal Style Scales: e.g., work style, leadership style, risk taking, team orientation
  5. Administrative (Validity) Indexes: provide info about whether the inventory was answered sincerely (e.g., total responses, typicality index, item response percentage)
(over 200 scores for an individual)
Strong Interest Inventory NORMS -
  • General representative sample (GRS) w/ 2250 cases divided by gender > basis for T-score system with M=50, SD=10 for General Occupational Themes and Basic Interest Scales
  • Separate T-scores for occupational scales (go beyond GRS)
  • people w/in each occupational norm group had to 1) perform main duties of the job, 2) have at least 3 years of experience in the job, 3) express satisfaction with the job
Strong Interest Inventory PSYCHOMETRICS -
  • Internal consistency (alpha): Median for GOTs (.92), for BIS (.87), for personal style scales (.86)
  • Test-retest: GOTs and BISs (mid- .8s), Occupational Scales (from .85 to .90); Personal style scales (mid .8s)
  • Validity: very large effect size (1.51); people tend to enter occupations where they score high; factor analysis and correlational support
Common methods for looking at validity of interest measures -
2 common methods: showing that test results differentiate between existing occupational groups in predictable directions OR that scores are predictive of the occupation that people ultimately select
Effect size -
  • measure of the magnitude of a statistical phenomenon, independent of sample size
  • Cohen's d >> difference between 2 means divided by a SD for the data (small: d=.2, medium: d=.5, large: d=.8)
Kuder Career Interests Assessments + Psychometrics -
  • 32 forced-choice triads (marks activities 1, 2, and 3)
  • Reliability: Internal consistency (from .73 to .87)
  • content validity: only items w/ higher than 80% consensus based on content representation and relevance were retained
Conclusion: reliability/validity somewhat limited
Self-Directed Search -
Very popular instrument that aims at simplicity: self-admin, self-scored, self-interpreted, based on Holland's RIASEC
  • 6 personality types and 6 work environments
  • items: 228 items in 4 major parts (Activities, Competencies, Occupations, Self-estimates); w/in each part, grouped by RIASEC code
  • Scoring: add number w/in each area across 4 parts; yields 6 raw scores referred to as SUMMARY SCALE SCORES, then determine 3 highest raw scores (to get 3-letter RIASEC code) >> criterion-referenced but manual also provides percentile rank norms
  • Occupations Finder (booklet listing hundreds of codes)
  • takes about 20 min to complete
Self-Directed Search Psychometrics -
  • Reliability: internal consistency (low .90s) and test-retest (.76 to .89)
  • Validity: Scales are relatively independent of one another and consistent w/ the Holland hexagonal model
  • also has consequential validity, in terms of making a diff in the lives of people who take the test
5 Generalizations about Career Interests Measures -
  1. People's career-related interests seem to be quite reliable as measured by these tests, from teens onwards
  2. Career interest tests have a respectable degree of (predictive) validity (if ultimate career choice is used as the criterion)
  3. Often devoid of references to more modern psychometric techniques
  4. Career interest inventories are being completed online
  5. Movement towards assessing abilities along with interests
3 components of Attitude Measures -
  1. Cognitive: thoughts about the object (most attitude measures concentrate on cognitive)
  2. Emotional: feelings about the object
  3. Behavioral: actions taken or likely to be taken regarding the object
(most are paper-pencil tests where person answers questions about the object OR reacts to statements about the object)
Thurstone Scale Basic Features -
  1. Numerous items, varying shades of opinion
  2. Mark agree/disagree to each item
  3. Have judges sort the statements into 11 categories from least to most favorable >> aim for EQUAL APPEARING INTERVALS
  4. Determine for each statement the average category placement and a measure of variation (i.e., determine means and SD)
  5. Eliminate statements w/ large SDs (i.e. judges don't show good agreement for these statements)
  6. Group statements together w/ similar average category values
  7. Select a few statements to represent attitudinal positions along the continuum from least to most favorable
Likert Scales -
Most widely used method for attitude scales, often called the method of SUMMATIVE RATINGS
6 basic features:
  1. ID target object
  2. Start w/ large # of items, expressing some aspect of an attitude toward the target object
  3. Use 5-point scale (Strongly agree to strongly disagree)
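The "summative ratings" idea behind Likert scales can be illustrated with a short Python sketch. Several things here are assumptions rather than part of the notes: ratings are coded 1 (strongly disagree) to 5 (strongly agree), some items are reverse-keyed (in line with balancing the direction of items), and the function and item names are hypothetical.

```python
from typing import Dict, FrozenSet

SCALE_POINTS = 5  # 5-point scale, as in the notes


def likert_total(ratings: Dict[str, int],
                 reverse_keyed: FrozenSet[str] = frozenset()) -> int:
    """Sum item ratings into a total score, flipping reverse-keyed items."""
    total = 0
    for item, rating in ratings.items():
        if not 1 <= rating <= SCALE_POINTS:
            raise ValueError(f"rating for {item!r} must be 1..{SCALE_POINTS}")
        if item in reverse_keyed:
            # On a 1..5 scale, 1 becomes 5, 2 becomes 4, and so on.
            rating = SCALE_POINTS + 1 - rating
        total += rating
    return total


# Made-up responses: q2 is worded negatively toward the target object.
ratings = {"q1": 5, "q2": 1, "q3": 4}
print(likert_total(ratings, reverse_keyed=frozenset({"q2"})))  # 14
```

The total is only meaningful relative to the number of items and the direction of keying, so both would need to be reported alongside it.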
  • B: requires some knowledge of technical characteristics of tests, usually Masters level (ex: group-admin mental ability tests)
  • C: requires advanced training in test theory, usually doctoral level (ex: individually administered Wechsler IQ tests)
3 Sources of Laws -
  1. Statutory Law (legislation): laws originate w/ a legislative body (ex - state legislature) and are endorsed by an executive
  2. Administrative Law (regulations): prepared by an administrative agency to provide details on implementing a particular law
  3. Case Law (court decisions): Courts interpret meaning of laws when applied in particular circumstances (a court's decision, once rendered, has the force of law within the jurisdiction of the court)
IDEA -
Series of laws (starting 1970s) addressing individuals w/ disabilities in educational settings (Education of the Handicapped, then Education for All Handicapped Children, now INDIVIDUALS with DISABILITIES EDUCATION ACT)
  • calls for dev of an "individualized education program" (IEP) for each child with a disability
  • essential point of these laws is to provide a FREE APPROPRIATE PUBLIC EDUCATION (FAPE) for all kids, with special attention to kids w/ disabilities
FERPA -
Family Educational Rights and Privacy Act > aka the Buckley Amendment
3 Purposes: to guarantee that...
  1. individuals have open access to info about themselves
  2. individuals can challenge the validity of info in agency files
  3. unwarranted other parties don't have access to personal info
No Child Left Behind (NCLB) Act -
2002; major testing implications
  1. Educational accountability and standards-based education translated into detailed legal provisions
  2. Required specification of educational standards and extensive testing to determine that standards had been met
  3. Emphasized that all students must show adequate test performance
  4. Precise requirements for schools to report test scores to public
  5. Required demonstration of improvement in average scores from year to year
Every Student Succeeds Act (ESSA) -
  1. Holding all students to high academic standards to prepare them for success in COLLEGE and CAREERS
  2. Accountability > when students fall behind, states redirect resources to support them/their schools, esp schools w/ high dropout rates/achievement gaps
  3. Empowering state/local decisions to develop systems for school improvement based on evidence
  4. Reducing testing
  5. Providing more kids w/ access to Pre-K
  6. More resources for proven strategies that'll spur reform
Forensic Application of Tests: insanity vs competency to stand trial -
  • insanity: a mental disorder/incapacity at the time a CRIME is committed (inability to distinguish right from wrong or to control bx)
  • Competency to stand trial: a person's mental capacity at the time of TRIAL for a crime
Assessment and Forensic Applications -
  1. Child custody cases - attempt to determine qualifications of each parent to serve the best interest of the child
  2. Prediction of future dangerousness - to determine eligibility for parole, incarceration vs release under supervision, etc
  3. Abuse: may help determine the nature and/or extent of abuse