#### Leuk: Cox regression

Several authors have discussed Bayesian inference for censored survival data where the integrated baseline hazard function is to be estimated non-parametrically Kalbfleisch (1978), Kalbfleisch and Prentice (1980), Clayton (1991), Clayton (1994). Clayton (1994) formulates the Cox model using the counting process notation introduced by Andersen and Gill (1982) and discusses estimation of the baseline hazard and regression parameters using MCMC methods. Although his approach may appear somewhat contrived, it forms the basis for extensions to random effect (frailty) models, time-dependent covariates, smoothed hazards, multiple events and so on. We show below how to implement this formulation of the Cox model in

*BUGS*.

For subjects

*i* = 1,...,n, we observe processes N

_{i}(t) which count the number of failures which have occurred up to time t. The corresponding intensity process I

_{i}(t) is given by

I

_{i}(t)dt = E(dN

_{i}(t) | F

_{t-})

where dN

_{i}(t) is the increment of N

_{i} over the small time interval [t, t+dt), and F

_{t-} represents the available data just before time t. If subject

*i* is observed to fail during this time interval, dN

_{i}(t) will take the value 1; otherwise dN

_{i}(t) = 0. Hence E(dN

_{i}(t) | F

_{t-}) corresponds to the probability of subject

* i* failing in the interval [t, t+dt). As dt -> 0 (assuming time to be continuous) then this probability becomes the instantaneous hazard at time t for subject

*i*. This is assumed to have the proportional hazards form

I

_{i}(t) = Y

_{i}(t)

λ_{0}(t)exp(

**β****z**_{i})

where Y

_{i}(t) is an observed process taking the value 1 or 0 according to whether or not subject

*i* is observed at time t and

λ_{0}(t)exp(

**β****z**_{i}) is the familiar Cox regression model. Thus we have observed data D = N

_{i}(t), Y

_{i}(t),

**z**_{i}; i = 1,..n and unknown parameters

**β** and

Λ_{0}(t) = Integral(

λ_{0}(u), u, t, 0), the latter to be estimated non-parametrically. n

The joint posterior distribution for the above model is defined by

P(

**β**,

Λ_{0}() | D) ~ P(D |

**β**,

Λ_{0}()) P(

**β**) P(

Λ_{0}())

For

*BUGS*, we need to specify the form of the likelihood P(D |

**β**,

Λ_{0}()) and prior distributions for

**β** and

Λ_{0}(). Under non-informative censoring, the likelihood of the data is proportional to

Π_{i=1,...,n}[Π_{t>=0}Ι_{i}(t)

^{dN}^{i}^{(t)}] exp(- I

_{i}(t)dt)

This is essentially as if the counting process increments dN

_{i}(t) in the time interval [t, t+dt) are independent Poisson random variables with means I

_{i}(t)dt:

dN

_{i}(t) ~ Poisson(I

_{i}(t)dt)

We may write

I

_{i}(t)dt = Y

_{i}(t)exp(

**β****z**_{i})d

Λ_{0}(t)

where d

Λ_{0}(t) =

Λ_{0}(t)dt is the increment or jump in the integrated baseline hazard function occurring during the time interval [t, t+dt). Since the conjugate prior for the Poisson mean is the gamma distribution, it would be convenient if

Λ_{0}() were a process in which the increments d

Λ_{0}(t) are distributed according to gamma distributions. We assume the conjugate independent increments prior suggested by Kalbfleisch (1978), namely

d

Λ_{0}(t) ~ Gamma(cd

Λ^{*}_{0}(t), c)

Here, d

Λ^{*}_{0}(t) can be thought of as a prior guess at the unknown hazard function, with c representing the degree of confidence in this guess. Small values of c correspond to weak prior beliefs. In the example below, we set d

Λ^{*}_{0}(t) = r dt where r is a guess at the failure rate per unit time, and dt is the size of the time interval.

The above formulation is appropriate when genuine prior information exists concerning the underlying hazard function. Alternatively, if we wish to reproduce a Cox analysis but with, say, additional hierarchical structure, we may use the multinomial-Poisson trick described in the

*BUGS* manual. This is equivalent to assuming independent increments in the cumulative `non-informative' priors. This formulation is also shown below.

The fixed effect regression coefficients

**β** are assigned a vague prior

β ~ Normal(0.0, 0.000001)

BUGS* language for the Leuk example:*

model

{

# Set up data

for(i in 1:N) {

for(j in 1:T) {

# risk set = 1 if obs.t >= t

Y[i,j] <- step(obs.t[i] - t[j] + eps)

# counting process jump = 1 if obs.t in [ t[j], t[j+1] )

# i.e. if t[j] <= obs.t < t[j+1]

dN[i, j] <- Y[i, j] * step(t[j + 1] - obs.t[i] - eps) * fail[i]

}

}

# Model

for(j in 1:T) {

for(i in 1:N) {

dN[i, j] ~ dpois(Idt[i, j]) # Likelihood

Idt[i, j] <- Y[i, j] * exp(beta * Z[i]) * dL0[j] # Intensity

}

dL0[j] ~ dgamma(mu[j], c)

mu[j] <- dL0.star[j] * c # prior mean hazard

# Survivor function = exp(-Integral{l0(u)du})^exp(beta*z)

S.treat[j] <- pow(exp(-sum(dL0[1 : j])), exp(beta * -0.5));

S.placebo[j] <- pow(exp(-sum(dL0[1 : j])), exp(beta * 0.5));

}

c <- 0.001

r <- 0.1

for (j in 1 : T) {

dL0.star[j] <- r * (t[j + 1] - t[j])

}

beta ~ dnorm(0.0,0.000001)

}

##### Data

```
list(N = 42, T = 17, eps = 1.0E-10,
```

obs.t = c(1, 1, 2, 2, 3, 4, 4, 5, 5, 8, 8, 8, 8, 11, 11, 12, 12, 15, 17, 22, 23, 6,

6, 6, 6, 7, 9, 10, 10, 11, 13, 16, 17, 19, 20, 22, 23, 25, 32, 32, 34, 35),

fail = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0),

Z = c(0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5,

-0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5),

t = c(1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 12, 13, 15, 16, 17, 22, 23, 35))

##### Inits for chain 1

```
list( beta = 0.0,
```

dL0 = c(1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,

1.0,1.0,1.0,1.0,1.0,1.0, 1.0,1.0))

##### Inits for chain 2

```
list( beta = 1.0,
```

dL0 = c(2.0,2.0,2.0,2.0,2.0,2.0,2.0,2.0,2.0,

2.0,2.0,2.0,2.0,2.0,2.0, 2.0,2.0))

Results