




Polytomous IRT Models, Dichotomous IRT, Difference Models, Graded Response Model, Operating Characteristic Curve, Category Response Curve, Direct Models, Sufficient Statistics are learning points of this lecture.





Ch. 5. Polytomous IRT Models (Embretson & Reise)
I. Introduction
A. CTT: The main concern is how a test works. CTT computes test-level
indices: reliability and validity.
B. Dichotomous IRT: The main issue is how each item works. Dichotomous IRT
computes item parameters (a-, b-, and c-parameters) and a person parameter ($\theta$).
C. Polytomous IRT: The main issue is how each alternative works. Polytomous IRT
computes slope and threshold parameters ($\alpha_i$, $\beta_{ij}$) and a trait parameter ($\theta$).
II. Different models
A. Indirect (Difference) models
1. Graded Response Model (GRM): for ordered categorical responses. A generalization of the 2PL model.
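a) Operating Characteristic Curve (OCC)
\[ P^{*}_{ix}(\theta) = \frac{\exp[\alpha_i(\theta - \beta_{ij})]}{1 + \exp[\alpha_i(\theta - \beta_{ij})]} \]
where
x = j = 1, 2, ..., $m_i$ is the category score,
$\alpha_i$ = common item slope parameter, and
$\beta_{ij}$ = category threshold parameter.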
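b) Category Response Curve (CRC)
\[ P_{ix}(\theta) = P^{*}_{ix}(\theta) - P^{*}_{i(x+1)}(\theta) \]
The probability of responding in or above the lowest category is
$P^{*}_{i0}(\theta) = 1.0$, and the probability of responding above the highest
category is $P^{*}_{i5}(\theta) = 0.0$.
Thus, for a five-category item (x = 0, ..., 4),
\[ P_{i0}(\theta) = 1.0 - P^{*}_{i1}(\theta) \]
\[ P_{i1}(\theta) = P^{*}_{i1}(\theta) - P^{*}_{i2}(\theta) \]
\[ P_{i2}(\theta) = P^{*}_{i2}(\theta) - P^{*}_{i3}(\theta) \]
\[ P_{i3}(\theta) = P^{*}_{i3}(\theta) - P^{*}_{i4}(\theta) \]
\[ P_{i4}(\theta) = P^{*}_{i4}(\theta) - P^{*}_{i5}(\theta) = P^{*}_{i4}(\theta) - 0 \]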
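The two GRM steps above (compute the OCCs, then difference adjacent OCCs) can be sketched in Python. This is a minimal illustration, not code from the text: the function names and the item parameter values are hypothetical, and the OCC is assumed to be the logistic form exp[a(theta - b)] / (1 + exp[a(theta - b)]).

```python
import math

def grm_occ(theta, alpha, betas):
    """Operating characteristic curves P*_ix(theta) for one item.

    The value for the lowest category is fixed at 1.0 and the value
    above the highest category at 0.0, as in the outline above.
    """
    inner = [1.0 / (1.0 + math.exp(-alpha * (theta - b))) for b in betas]
    return [1.0] + inner + [0.0]

def grm_crc(theta, alpha, betas):
    """Category response curves: P_ix = P*_ix - P*_i(x+1)."""
    star = grm_occ(theta, alpha, betas)
    return [star[x] - star[x + 1] for x in range(len(star) - 1)]

# Hypothetical five-category item (scores 0..4) with four ordered thresholds.
probs = grm_crc(theta=0.0, alpha=1.5, betas=[-1.5, -0.5, 0.5, 1.5])
print([round(p, 3) for p in probs])
```

The category probabilities sum to 1 by construction, because the differences telescope from 1.0 down to 0.0. They are non-negative only when the thresholds are in increasing order.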
2. Modified Graded Response Model (M-GRM): a restricted case of the GRM in that it has a location parameter for each item ($b_i$) and a set
of category threshold parameters ($c_j$) for the entire scale, partitioned from the threshold
parameter in the GRM ($\beta_{ij}$).
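a) OCC
\[ P^{*}_{ix}(\theta) = \frac{\exp[\alpha_i(\theta - (b_i - c_j))]}{1 + \exp[\alpha_i(\theta - (b_i - c_j))]} \]
b) CRC
\[ P_{ix}(\theta) = P^{*}_{ix}(\theta) - P^{*}_{i(x+1)}(\theta) \]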
c) In the GRM one set of category threshold parameters ($\beta_{ij}$) is
estimated for each item, while in the M-GRM one set of category
threshold parameters ($c_j$) is estimated for the entire scale, and one
location parameter ($b_i$) is estimated for each item.
d) Developed to describe responses from a Likert format attitude
scale.
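The difference between the GRM and the M-GRM can be made concrete with a short Python sketch. It assumes the M-GRM boundary for category j sits at b_i - c_j, so items sharing one set of c_j values differ only in slope and location. The function name and all parameter values are hypothetical.

```python
import math

def mgrm_crc(theta, alpha, b, cs):
    """M-GRM category response curves for one item.

    b  : item location parameter (one per item)
    cs : category threshold parameters shared by the whole scale;
         the boundary for category j is assumed to be b - cs[j - 1]
    """
    star = [1.0]
    for c in cs:
        star.append(1.0 / (1.0 + math.exp(-alpha * (theta - (b - c)))))
    star.append(0.0)
    return [star[x] - star[x + 1] for x in range(len(star) - 1)]

# One hypothetical set of scale-level thresholds for a 5-point Likert scale.
# They decrease so that the boundaries b - c_j increase with j.
scale_cs = [1.5, 0.5, -0.5, -1.5]

easy_item = mgrm_crc(theta=0.0, alpha=1.2, b=0.0, cs=scale_cs)
hard_item = mgrm_crc(theta=1.0, alpha=1.2, b=1.0, cs=scale_cs)
print([round(p, 3) for p in easy_item])
print([round(p, 3) for p in hard_item])
```

With equal slopes, moving an item's location by one unit simply shifts its curves by one unit on the trait scale, so the two printed lines match.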
B. Direct (Divided-by-total) models
1. Partial Credit Model (PCM): originally developed for partial-credit scoring of achievement tests. The CRC is computed directly by dividing an
exponential function by the sum of the exponential functions. A
generalization of the Rasch model.
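a) CRC
\[ P_{ix}(\theta) = \frac{\exp\left[\sum_{j=0}^{x} (\theta - \delta_{ij})\right]}{\sum_{r=0}^{m_i} \exp\left[\sum_{j=0}^{r} (\theta - \delta_{ij})\right]} \]
where $\sum_{j=0}^{0} (\theta - \delta_{ij}) = 0$.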
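b) $\delta_{ij}$: Step difficulty parameter associated with a category score of j.
(1) The higher the value of $\delta_{ij}$, the more difficult that step is
relative to other steps within an item.
(2) $\delta_{ij}$ is the point on the latent trait scale at which two consecutive category response
curves intersect.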
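c) Sufficient statistics
(1) The raw scale score is a sufficient statistic for estimating
examinee trait level.
(2) Examinees with the same raw scale score
are estimated to have equivalent positions on the latent
trait.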
d) In order to estimate the parameters, the constraint $\sum_x \alpha_{ix} = \sum_x c_{ix} = 0$ is imposed or, in some
cases, the parameters for the lowest response category are set to
zero.
e) One $\alpha_{ix}$ and one $c_{ix}$ parameter must be estimated for each of the
($m_i$ + 1) response categories within an item.
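The divided-by-total computation for the partial credit CRC described above can be sketched in Python. This is an illustrative sketch only: it assumes the exponent for category x is the cumulative sum of (theta - step difficulty) over the steps up to x, with the exponent for category 0 defined as 0. The function name and the step difficulties are hypothetical.

```python
import math

def pcm_crc(theta, deltas):
    """Partial credit category probabilities for scores 0..m.

    deltas[j - 1] is the step difficulty for moving from score j - 1 to j.
    """
    cumulative = [0.0]                      # exponent for category 0 is 0
    for d in deltas:
        cumulative.append(cumulative[-1] + (theta - d))
    top = max(cumulative)                   # subtract the max for stability
    numerators = [math.exp(c - top) for c in cumulative]
    total = sum(numerators)
    return [n / total for n in numerators]

steps = [-1.0, 0.0, 1.2]                    # hypothetical four-category item
for theta in steps:
    probs = pcm_crc(theta, steps)
    print(theta, [round(p, 3) for p in probs])
```

Evaluating the curves at each step difficulty shows the intersection property: at theta equal to the j-th step difficulty, categories j - 1 and j have equal probability.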
III. Number of subjects for polytomous IRT
A. Through a Monte Carlo study, Reise and Yu (1990) showed that the GRM could
be estimated with as few as 250 examinees, but they recommended 500.
B. In the current example, some of the threshold parameters were not well established
with 350 examinees.
Multidimensional Item Response Theory (MIRT, Reckase, 2009)
I. Situation
A. The unidimensionality assumption is violated either because items measure
multiple factors (internal factors) or because external factors such as
cognitive condition, personality, and test-taking attitudes influence test
performance.
B. IRT models should include these factors in analysis.
II. Two related fields (Factor Analysis, FA, and Unidimensional IRT, UIRT)
A. Similarities
3. Both methods allow rotation of axes for latent traits.
B. Dissimilarities
important latent traits to be estimated in UIRT.
in mean and SD while UIRT starts with item scores.
technique for the interaction between person and item.
III. Mathematical Models
A. Unidimensional IRT (UIRT)
\[ P(U_i = 1 \mid \theta) = c_i + \frac{1 - c_i}{1 + e^{-D a_i(\theta - b_i)}} \]
where
$P(U_i = 1 \mid \theta)$ = the probability of getting item i correct for a given $\theta$,
$\theta$ = latent trait (ability or proficiency),
$b_i$ = item difficulty parameter,
$a_i$ = item discrimination parameter,
$c_i$ = pseudo-chance parameter, and
D = 1.701, a scaling factor to fit the normal ogive model (NOM).
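As a quick numerical check of the unidimensional model, the sketch below evaluates the three-parameter logistic form c + (1 - c) / (1 + exp(-D a (theta - b))). The function name and the parameter values are hypothetical. D defaults to the value given in these notes.

```python
import math

def p_correct_3pl(theta, a, b, c, D=1.701):
    """Probability of a correct response under the 3PL model."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))

# Hypothetical item: moderate discrimination, average difficulty,
# and a pseudo-chance level of 0.2 (e.g., a five-option item).
for theta in (-3.0, 0.0, 3.0):
    print(theta, round(p_correct_3pl(theta, a=1.2, b=0.0, c=0.2), 3))
```

At theta = b the probability is halfway between c and 1, and for very low theta it approaches c rather than 0, which is what the pseudo-chance parameter is meant to capture.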
B. Multidimensional IRT (MIRT)
\[ P(U_i = 1 \mid \theta_j) = c_i + \frac{1 - c_i}{1 + e^{-D a_{ij}(\theta_j - b_{ij})}} \]
where
$P(U_i = 1 \mid \theta_j)$ = the probability of getting item i correct for latent trait j, $\theta_j$,
$\theta_j$ = latent trait j (ability or proficiency),
$b_{ij}$ = item difficulty parameter for latent trait j,
$a_{ij}$ = item discrimination parameter for latent trait j,
$c_i$ = pseudo-chance parameter, and
D = 1.701, a scaling factor to fit the normal ogive model (NOM).
C. Factor Analysis (FA)
\[ x = \Lambda_X \xi + \delta \]
where
x = a (q x 1) vector of observed exogenous indicators,
$\Lambda_X$ = a (q x n) matrix of structural coefficients,
$\xi$ = an (n x 1) vector of exogenous concepts,
$\delta$ = a (q x 1) vector of errors in the measurement model,
q = the number of x-variables, and
n = the number of $\xi$-variables.
D. Mathematical relationship between FA and UIRT
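\[ a_i = \frac{\lambda_i}{\sqrt{1 - \lambda_i' \Phi \lambda_i}} \]
where
$\lambda_i$ = factor loading vector for item i, and
$\Phi$ = covariance matrix of factors.
\[ b_i = \frac{\tau_i}{\sqrt{1 - \lambda_i' \Phi \lambda_i}} \]
where $\tau_i$ = the threshold of item i.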
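The conversion from factor-analytic quantities to IRT parameters can be sketched numerically. The sketch assumes the relationship takes the form a = loading / sqrt(1 - loading' Phi loading) and b = threshold / sqrt(1 - loading' Phi loading), on the normal-ogive metric. The sign convention for b, the function name, and the example values are all assumptions made for illustration.

```python
import math

def fa_to_irt(loadings, phi, tau):
    """Convert one item's factor loadings and threshold to IRT parameters.

    loadings : list of factor loadings for the item
    phi      : factor covariance matrix as a nested list
    tau      : item threshold
    """
    n = len(loadings)
    # Quadratic form loading' * Phi * loading (the item's communality).
    quad = sum(loadings[r] * phi[r][s] * loadings[s]
               for r in range(n) for s in range(n))
    if not 0.0 <= quad < 1.0:
        raise ValueError("communality must lie in [0, 1)")
    scale = math.sqrt(1.0 - quad)           # unique standard deviation
    a = [lam / scale for lam in loadings]
    b = tau / scale
    return a, b

# Hypothetical item loading on the first of two uncorrelated factors.
a, b = fa_to_irt([0.6, 0.0], [[1.0, 0.0], [0.0, 1.0]], tau=0.4)
print([round(v, 3) for v in a], round(b, 3))
```

The denominator is the square root of the item's unique variance, so items with high communality get large discrimination values.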
the data from an English usage test (a = 1 for both models) and that the COM
fits the PCOM data as well as the PCOM model does, but the PCOM model does
not fit the COM data.
V. Polytomous MIRT Models (all compensatory)
A. GRM for polytomous MIRT (indirect model)
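\[ P^{*}(u_{ij} \ge k \mid \theta_j) = \frac{e^{a_i' \theta_j + d_{ik}}}{1 + e^{a_i' \theta_j + d_{ik}}} \]
\[ P(u_{ij} = k \mid \theta_j) = P^{*}(u_{ij} \ge k \mid \theta_j) - P^{*}(u_{ij} \ge k + 1 \mid \theta_j) \]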
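A minimal Python sketch of the multidimensional graded response model follows. It assumes each cumulative boundary probability is a logistic function of a'theta + d_k, and that category probabilities are differences of adjacent boundary probabilities, mirroring the unidimensional GRM. The function names, the intercept notation d_k, and the parameter values are hypothetical.

```python
import math

def boundary_prob(theta, a, d):
    """Probability of scoring in category k or above (logistic form)."""
    z = sum(a_l * t_l for a_l, t_l in zip(a, theta)) + d
    return 1.0 / (1.0 + math.exp(-z))

def mirt_grm_probs(theta, a, ds):
    """Category probabilities for scores 0..K.

    ds holds one intercept per boundary k = 1..K, in decreasing order
    so that the boundary probabilities decrease with k.
    """
    star = [1.0] + [boundary_prob(theta, a, d) for d in ds] + [0.0]
    return [star[k] - star[k + 1] for k in range(len(star) - 1)]

a = [1.0, 1.0]                 # hypothetical slopes on two dimensions
ds = [1.0, 0.0, -1.0]          # hypothetical intercepts, four categories
print([round(p, 3) for p in mirt_grm_probs([0.0, 0.0], a, ds)])
print([round(p, 3) for p in mirt_grm_probs([1.0, -1.0], a, ds)])
```

The two printed lines are identical because the model is compensatory: a high standing on one dimension offsets a low standing on the other whenever the weighted sum a'theta is the same.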
B. Partial Credit Model for polytomous MIRT (direct model)
\[ P(u_{ij} = k \mid \theta_j) = \frac{\exp\left[\sum_{l=1}^{m} (\theta_{jl} - b_{ilk}) W_{ilk}\right]}{\sum_{r=0}^{K_i} \exp\left[\sum_{l=1}^{m} (\theta_{jl} - b_{ilr}) W_{ilr}\right]} \]
where
$b_{ilk}$ = item difficulty parameter for item i, dimension l, and category k,
$W_{ilk}$ = predefined scoring weight for item i, dimension l, and category k, and
k = the category score.
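The role of the predefined scoring weights can be illustrated with a short Python sketch. It assumes the exponent for category k is the sum over dimensions of (theta_l - b_lk) * W_lk, normalized over all categories. The function name, the weight matrix, and the difficulty values are hypothetical.

```python
import math

def mirt_pcm_probs(theta, b, W):
    """Category probabilities for one item and one person.

    theta   : list of trait values, one per dimension
    b[k][l] : difficulty for category k on dimension l
    W[k][l] : predefined scoring weight for category k on dimension l
    """
    exponents = [
        sum((theta[l] - b[k][l]) * W[k][l] for l in range(len(theta)))
        for k in range(len(b))
    ]
    top = max(exponents)                    # subtract the max for stability
    numerators = [math.exp(z - top) for z in exponents]
    total = sum(numerators)
    return [n / total for n in numerators]

# Hypothetical three-category item on two dimensions: score 1 draws on
# dimension 1 only, score 2 draws on both dimensions.
W = [[0, 0], [1, 0], [1, 1]]
b = [[0.0, 0.0], [-0.5, 0.0], [-0.5, 0.5]]
print([round(p, 3) for p in mirt_pcm_probs([0.2, -0.3], b, W)])
```

Setting all weights for category 0 to zero fixes its exponent at 0, which plays the same identifying role as the zero term in the unidimensional partial credit model.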
C. Generalized Partial Credit Model for polytomous MIRT (direct model)
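\[ P(u_{ij} = k \mid \theta_j) = \frac{\exp\left(k\, a_i' \theta_j - \sum_{u=1}^{k} \beta_{iu}\right)}{\sum_{v=0}^{K_i} \exp\left(v\, a_i' \theta_j - \sum_{u=1}^{v} \beta_{iu}\right)} \]
where
$\beta_{iu}$ = the threshold parameter for item i and category u, and
k = the category score.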
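A minimal Python sketch of the multidimensional generalized partial credit model follows. It assumes the exponent for category k is k * a'theta minus the sum of the threshold parameters up to k, with an exponent of 0 for category 0. The function name and parameter values are hypothetical.

```python
import math

def mirt_gpcm_probs(theta, a, betas):
    """Category probabilities for scores 0..K.

    theta : list of trait values, one per dimension
    a     : list of slopes, one per dimension
    betas : threshold parameters for categories 1..K
    """
    composite = sum(a_l * t_l for a_l, t_l in zip(a, theta))
    exponents = [0.0]                       # category 0
    running = 0.0
    for k, beta in enumerate(betas, start=1):
        running += beta
        exponents.append(k * composite - running)
    top = max(exponents)                    # subtract the max for stability
    numerators = [math.exp(z - top) for z in exponents]
    total = sum(numerators)
    return [n / total for n in numerators]

# Hypothetical three-category item measured on two dimensions.
print([round(p, 3) for p in mirt_gpcm_probs([0.5, -0.2], [1.1, 0.7], [-0.5, 0.7])])
```

With a single dimension and a slope of 1, the exponents reduce to cumulative sums of (theta - beta), which is the unidimensional partial credit form.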
VI. Future Directions of MIRT
A. More models will be proposed concerning the interaction between persons and
items (e.g., assessment of writing). Vectors of the a-parameter for each score
boundary rather than a single a-parameter vector for an item will be more feasible.
B. Students with different scores for dichotomous responses may use different
solution strategies, which calls for introducing latent classes in addition to latent skills.