Simulation Model Notes

This simulation is the same as "Marshall_notes_recentered_dists", except that the marker's mixture of normals is no longer centered at zero. Instead, its mean is set to the crossing point of the two risk curves (i.e., the marker value at which the treatment effect given Y is zero).

link to the notes for the previous model

plot of interest

Recall that:

the old model: marker distributions are centered at zero
the new model: marker distributions are centered at \( y_0 \) s.t. \( \Delta(y = y_0) = 0 \). (See below for more details regarding the simulation model)
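As a short worked step, the location of \( y_0 \) follows from the logistic risk model stated under Modeling assumptions together with the definition of \( \Delta(y) \) used in the treatment effect section. The only added assumption is \( \psi_2 \neq 0 \):

\[
\Delta(y) = \mathrm{expit}(\beta_0 + \psi_1 y) - \mathrm{expit}\big(\beta_0 + \beta_1 + (\psi_1 + \psi_2) y\big)
\]

Because expit is strictly increasing, \( \Delta(y) = 0 \) exactly when the two linear predictors are equal:

\[
\beta_0 + \psi_1 y_0 = \beta_0 + \beta_1 + (\psi_1 + \psi_2) y_0
\quad\Longleftrightarrow\quad
\beta_1 + \psi_2 y_0 = 0
\quad\Longleftrightarrow\quad
y_0 = -\frac{\beta_1}{\psi_2}
\]

With \( \psi_2 = 1 \) this reduces to \( y_0 = -\beta_1 = logit(q_0) - logit(q_1) \), so the shift away from zero grows with the gap between \( q_0 \) and \( q_1 \).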

This figure shows \( p_0 = Pr(D=1|T=0) \) and \( p_1 = Pr(D=1|T=1) \) for the new model (red dot) compared with \( p_0 \) and \( p_1 \) for the old model (black vertical and horizontal lines). Shifting the marker mean changes the marginals most when the difference between \( q_0 \) and \( q_1 \) is large. The shift also pulls \( p_0 \) and \( p_1 \) closer together, which defeats the purpose of fixing the marginals and then examining coverage.

Pros of the new method: changing \( p \) now has the desired effect of varying the proportion of observations with small treatment effects (see "Proportion of markers near the boundary" below).
Cons: we can no longer set the marginals \( p_0 \) and \( p_1 \) directly.

For the figure below, \( k = 4 \) and \( p = 0.1 \). The choice matters little, because the plots look very similar for all values of \( k \) and \( p \).

plot of chunk oldvsnew

Modeling assumptions

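Y: mixture of normals with density \( f(y) = p\,\phi(y; -\beta_1/\psi_2, c_1) + (1-p)\,\phi(y; -\beta_1/\psi_2, c_2) \), where both components share the mean \( -\beta_1/\psi_2 \) and have spread parameters \( c_1 \) and \( c_2 \)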
- \( p\in(0,1) \)
- \( k = c_2/c_1 \)
- \( var(Y) = 1 \).
\( logit(Pr(D = 1 | T,Y)) = \beta_0 + \beta_1T +\psi_1Y + \psi_2YT \)
- \( \psi_1 = \psi_2=1 \) and \( \beta_0, \beta_1 \) chosen to fix:
- \( q_0 = expit(\beta_0) \)
- \( q_1 = expit(\beta_0 + \beta_1) \)

and

\( p_0 = Pr(D=1|T=0) \)
\( p_1 = Pr(D=1|T=1) \)
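The assumptions above can be turned into a small numerical check of how recentering moves the marginals. The following is a minimal sketch, not the code used for these notes. It assumes that \( c_1 \) and \( c_2 \) are variances (the notes do not say whether they are variances or standard deviations), so that \( var(Y) = 1 \) gives \( c_1 = 1/(p + (1-p)k) \) and \( c_2 = k c_1 \). The function names, the `recenter` flag, and the trapezoid-rule integration are all illustrative choices.

```python
import math


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def logit(q):
    return math.log(q / (1.0 - q))


def normal_pdf(y, mean, var):
    return math.exp(-((y - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def component_variances(p, k):
    # ASSUMPTION: c1 and c2 are variances and both components share one mean,
    # so var(Y) = p*c1 + (1-p)*c2 = 1 with c2 = k*c1.
    c1 = 1.0 / (p + (1.0 - p) * k)
    return c1, k * c1


def marginal_risks(q0, q1, p, k, recenter=True, psi1=1.0, psi2=1.0, n=20001):
    """Return (p0, p1) = (Pr(D=1|T=0), Pr(D=1|T=1)) by trapezoid integration."""
    beta0 = logit(q0)
    beta1 = logit(q1) - beta0
    c1, c2 = component_variances(p, k)
    # New model: center at the crossing point -beta1/psi2. Old model: center at 0.
    center = -beta1 / psi2 if recenter else 0.0
    half_width = 10.0 * math.sqrt(max(c1, c2))
    step = 2.0 * half_width / (n - 1)
    p0 = p1 = 0.0
    for i in range(n):
        y = center - half_width + i * step
        weight = step * (0.5 if i in (0, n - 1) else 1.0)
        f = p * normal_pdf(y, center, c1) + (1.0 - p) * normal_pdf(y, center, c2)
        p0 += weight * f * expit(beta0 + psi1 * y)
        p1 += weight * f * expit(beta0 + beta1 + (psi1 + psi2) * y)
    return p0, p1


old = marginal_risks(0.5, 0.2, p=0.1, k=4, recenter=False)
new = marginal_risks(0.5, 0.2, p=0.1, k=4, recenter=True)
print("old model p0, p1:", old)
print("new model p0, p1:", new)
```

Under these assumptions the recentered model should give marginals that are closer to each other than the zero-centered model does, which is the behavior described for the old-versus-new figure.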

Additional plots

Risk curves
marker distribution
the proportion of observations with 'small' treatment effects (i.e., the proportion of observations near the boundary)
the distribution of treatment effects
the value of theta

Risk Curves

Risk plotted against \( F(y) \) for each treatment arm, for each combination of \( q_0 \) and \( q_1 \). The true value of theta is shown in grey.

\( p = 0.1 \), \( k = 30 \).

plot of chunk riskcurves

\( p = 0.1 \), \( k = 4 \).

plot of chunk riscurvessmallk

Treatment Effect Distribution

Distribution of \( \Delta(y) = Pr(D=1|T=0, Y=y) - Pr(D=1|T=1, Y=y) \) by \( F(y) \).

\( k = 30 \)

plot of chunk trteffect

\( k = 4 \)

plot of chunk trteffectsmallk
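The quantity plotted in this section, \( \Delta(y) \) against \( F(y) \), can be sketched as follows. This is an illustrative reconstruction, not the code behind the figures. It assumes that \( c_1 \) and \( c_2 \) are variances with \( var(Y) = 1 \), and it inverts the mixture CDF by bisection; all function names are hypothetical.

```python
import math


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def logit(q):
    return math.log(q / (1.0 - q))


def normal_cdf(y, mean, var):
    return 0.5 * (1.0 + math.erf((y - mean) / math.sqrt(2.0 * var)))


def mixture_cdf(y, center, p, k):
    # ASSUMPTION: c1, c2 are variances with p*c1 + (1-p)*c2 = 1 and c2 = k*c1.
    c1 = 1.0 / (p + (1.0 - p) * k)
    c2 = k * c1
    return p * normal_cdf(y, center, c1) + (1.0 - p) * normal_cdf(y, center, c2)


def mixture_quantile(u, center, p, k):
    """Invert the mixture CDF by bisection (it has no closed-form inverse)."""
    c1 = 1.0 / (p + (1.0 - p) * k)
    spread = 20.0 * math.sqrt(max(c1, k * c1))
    lo, hi = center - spread, center + spread
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mixture_cdf(mid, center, p, k) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def treatment_effect(y, q0, q1, psi1=1.0, psi2=1.0):
    """Delta(y) = Pr(D=1|T=0,Y=y) - Pr(D=1|T=1,Y=y) under the logistic model."""
    beta0 = logit(q0)
    beta1 = logit(q1) - beta0
    risk_untreated = expit(beta0 + psi1 * y)
    risk_treated = expit(beta0 + beta1 + (psi1 + psi2) * y)
    return risk_untreated - risk_treated


def effect_by_quantile(u, q0, q1, p, k, psi2=1.0):
    """Delta evaluated at the marker value y with F(y) = u (recentered model)."""
    beta1 = logit(q1) - logit(q0)
    y = mixture_quantile(u, -beta1 / psi2, p, k)
    return treatment_effect(y, q0, q1, psi2=psi2)


for u in (0.1, 0.25, 0.5, 0.75, 0.9):
    print(u, effect_by_quantile(u, q0=0.3, q1=0.1, p=0.1, k=30))
```

Because the mixture is symmetric about its center in this sketch, the median marker value coincides with the crossing point, so the effect changes sign at \( F(y) = 0.5 \).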

Theta by p

\( k \) is denoted by color.

plot of chunk theta

Marker Distribution

Density (pdf) of \( Y \), colored by \( p \). The vertical black line indicates where \( \Delta(y)=0 \).

\( k=30 \)

plot of chunk markerdist

\( k=4 \)

plot of chunk markerdistsmallk

Proportion of markers near the boundary

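This plot shows \( p \) (x-axis) against \( F(y_{\Delta=0} + 0.01) - F(y_{\Delta=0} - 0.01) \), where \( y_{\Delta=0} \) is the marker value with \( \Delta(y_{\Delta=0}) = 0 \) (the \( y_0 \) defined above). The half-width of 0.01 is arbitrary, so this is only a rough measure of the proportion of observations near the boundary of interest.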

\( k \) is denoted by color.

plot of chunk smalldelta
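The boundary measure in this section can be computed directly from the mixture CDF. The sketch below is illustrative only and assumes, as a labeled guess, that \( c_1 \) and \( c_2 \) are variances with \( var(Y) = 1 \). The function names and the `eps` parameter are hypothetical.

```python
import math


def normal_cdf(y, mean, var):
    return 0.5 * (1.0 + math.erf((y - mean) / math.sqrt(2.0 * var)))


def mixture_cdf(y, center, p, k):
    # ASSUMPTION: c1, c2 are variances with p*c1 + (1-p)*c2 = 1 and c2 = k*c1.
    c1 = 1.0 / (p + (1.0 - p) * k)
    c2 = k * c1
    return p * normal_cdf(y, center, c1) + (1.0 - p) * normal_cdf(y, center, c2)


def proportion_near_boundary(p, k, y0=0.0, eps=0.01):
    """F(y0 + eps) - F(y0 - eps) for a mixture centered at the crossing point y0."""
    return mixture_cdf(y0 + eps, y0, p, k) - mixture_cdf(y0 - eps, y0, p, k)


for k in (4, 30):
    for p in (0.1, 0.5, 0.9):
        print(k, p, proportion_near_boundary(p, k))
```

Since the marker distribution is centered at the crossing point in the new model, the result in this sketch depends only on \( p \), \( k \), and the window width, not on \( y_0 \) itself.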