Biopsies: discrete variable latent class model

Spiegelhalter and Stovin (1983) presented data on repeated biopsies of transplanted hearts, in which a total of 414 biopsies had been taken at 157 sessions. Each biopsy was graded on evidence of rejection using a 4 category scale of none (O), minimal (M), mild (+) and moderate-severe (++). Part of the data is shown below.


[biopsies1]
The sampling procedure may not detect the area of maximum rejection, which is considered the true underlying state at the time of the session and denoted ti --- the underlying probability distribution of the four true states is denoted by the vector p. It is then assumed that each of the observed biopsies are conditionally independent given this truestate with the restriction that there are no`false positives': i.e. one cannot observe a biopsy worse than the true state. We then have the sampling model   

   bi ~ Multinomial(eti, ni)

   ti ~ Categorical(p)


where bi denotes the multinomial response at session i where ni biopsies have been taken, and ejk is the probability that a true state ti = j generates a biopsy in state k.The no-false-positive restriction means that e12 = e13 = e14 = e23 = e24 = e34 = 0. Spiegelhalter and Stovin (1983) estimated the parameters ej and p using the EM algorithm, with some smoothing to avoid zero estimates.

The appropriate graph is shown below, where the role of the true state ti is simply to pick the appropriate row from the 4 x 4 error matrix e. Here the probability vectors ej (j = 1,...,4) and p are assumed to have uniform priors on the unit simplex, which correspond to Dirichlet priors with all parameters being 1.

The BUGS code for this model is given below. No initial values are provided for the latent states, since the forward sampling procedure will find a configuration of starting values that is compatible with the expressed constraints. We also note the apparent ``cycle'' in the graph created by the expression nbiops[i] <- sum(biopsies[i,]). This will lead Such ``cycles'' are permitted provided that they are only data transformation statements, since this does not affect the essential probability model.


   model
   {
      for (i in 1 : ns){
         nbiops[i] <- sum(biopsies[i, ])
         true[i] ~ dcat(p[])
         biopsies[i, 1 : 4] ~ dmulti(error[true[i], ], nbiops[i])
      }
      error[2,1 : 2] ~ ddirich(prior[1 : 2])
      error[3,1 : 3] ~ ddirich(prior[1 : 3])
      error[4,1 : 4] ~ ddirich(prior[1 : 4])
      p[1 : 4] ~ ddirich(prior[]); # prior for p
   }

Data
list(ns = 157,
   error = structure(.Data = c(1, 0, 0, 0,
NA, NA, 0, 0,
            NA, NA, NA, 0,
                              NA, NA, NA, NA), .Dim = c(4, 4)),
   prior = c(1, 1, 1, 1),                              
biopsies = structure(.Data = c(
         2,0,0,0,
         2,0,0,0,
         2,0,0,0,
         2,0,0,0,
         2,0,0,0,
         2,0,0,0,
         2,0,0,0,
         2,0,0,0,
         2,0,0,0,
         2,0,0,0,
         2,0,0,0,
         2,0,0,0,
         1,1,0,0,
         1,1,0,0,
         1,1,0,0,
         1,1,0,0,
         1,1,0,0,
         1,1,0,0,
         1,1,0,0,
         1,1,0,0,
         1,1,0,0,
         1,1,0,0,
         0,2,0,0,
         0,2,0,0,
         0,2,0,0,
         0,2,0,0,
         1,0,1,0,
         1,0,1,0,
         1,0,1,0,
         1,0,1,0,
         1,0,1,0,
         1,0,1,0,
         1,0,1,0,
         1,0,1,0,
         1,0,1,0,
         0,0,2,0,
         0,0,2,0,
         0,0,2,0,
         0,0,2,0,
         0,0,2,0,
         0,0,2,0,
         0,0,2,0,
         0,0,2,0,
         0,0,2,0,
         0,0,2,0,
         0,0,2,0,
         0,0,2,0,
         1,0,0,1,
         0,0,1,1,
         0,0,1,1,
         0,0,1,1,
         0,0,0,2,
         0,0,0,2,
         0,0,0,2,
         0,0,0,2,
         0,0,0,2,
         0,0,0,2,
         3,0,0,0,
         3,0,0,0,
         3,0,0,0,
         3,0,0,0,
         3,0,0,0,
         3,0,0,0,
         3,0,0,0,
         3,0,0,0,
         3,0,0,0,
         3,0,0,0,
         3,0,0,0,
         3,0,0,0,
         3,0,0,0,
         3,0,0,0,
         3,0,0,0,
         3,0,0,0,
         3,0,0,0,
         3,0,0,0,
         3,0,0,0,
         3,0,0,0,
         3,0,0,0,
         3,0,0,0,
         3,0,0,0,
         3,0,0,0,
         3,0,0,0,
         3,0,0,0,
         3,0,0,0,
         3,0,0,0,
         2,1,0,0,
         2,1,0,0,
         2,1,0,0,
         2,1,0,0,
         2,1,0,0,
         2,1,0,0,
         2,1,0,0,
         2,1,0,0,
         2,1,0,0,
         2,1,0,0,
         2,1,0,0,
         2,1,0,0,
         2,1,0,0,
         1,2,0,0,
         1,2,0,0,
         1,2,0,0,
         1,2,0,0,
         1,2,0,0,
         1,2,0,0,
         1,2,0,0,
         1,2,0,0,
         1,2,0,0,
         1,2,0,0,
         0,3,0,0,
         2,0,1,0,
         2,0,1,0,
         2,0,1,0,
         2,0,1,0,
         2,0,1,0,
         2,0,1,0,
         2,0,1,0,
         2,0,1,0,
         2,0,1,0,
         1,1,1,0,
         0,2,1,0,
         1,0,2,0,
         1,0,2,0,
         1,0,2,0,
         1,0,2,0,
         1,0,2,0,
         1,0,2,0,
         1,0,2,0,
         1,0,2,0,
         1,0,2,0,
         1,0,2,0,
         1,0,2,0,
         1,0,2,0,
         1,0,2,0,
         1,0,2,0,
         1,0,2,0,
         1,0,2,0,
         1,0,2,0,
         0,1,2,0,
         0,0,3,0,
         0,0,3,0,
         0,0,3,0,
         0,0,3,0,
         0,0,3,0,
         0,0,3,0,
         0,0,3,0,
         0,0,3,0,
         2,0,0,1,
         1,0,1,1,
         0,0,2,1,
         0,0,1,2,
         0,0,1,2,
         0,0,1,2,
         0,0,0,3,
         0,0,0,3,
         0,0,0,3,
         0,0,0,3,
         0,0,0,3
), .Dim = c(157, 4)))
Inits for chain 1
list(p = c(0.25, 0.25, 0.25, 0.25),
error = structure(.Data = c(NA, NA, NA, NA,
0.5, 0.5, NA, NA,
                           0.33333333, 0.33333333,0.33333333, NA,
                           0.25, 0.25, 0.25, 0.25), .Dim = c(4, 4)))
   
Inits for chain 2
list(p = c(0.4, 0.1, 0.1, 0.4),
error = structure(.Data = c(NA, NA, NA, NA,
0.7, 0.3, NA, NA,
                           0.2, 0.3,0.5, NA,
                           0.1, 0.1, 0.1, 0.7), .Dim = c(4, 4)))

Results

[biopsies2]