A note on DID with noisy treatment times

In my current project on Airbnb’s effect on residential investment (joint with Ron Bekkerman, Maxime Cohen, John Maiden, and Davide Proserpio), we are using the staggered difference-in-differences estimator of Callaway and Sant’Anna (2020) (CSDID). As explained in this very helpful post by Fernando Rios-Avila, CSDID consistently estimates DID treatment effects when there are multiple treated units with multiple treatment times. By comparison, the popular two-way fixed effects estimator (TWFE) can produce biased estimates in this setting when treatment effects are heterogeneous.

One issue we’re running into is that CSDID seems sensitive to noise in the observed treatment time. The purpose of this note is to explore this sensitivity using simulation evidence.

Simulation Framework

Unit \(i\) at time \(t\) can have two possible outcomes, the untreated outcome \(y_{0it}\) and the treated outcome \(y_{1it}\). Let \(g^*_i\) denote the time that unit \(i\) is first treated. The observed outcome for unit \(i\) at time \(t\) is:

\[ y_{it} = \left\{ \begin{array}{ll} y_{0it} & \mbox{if } t < g_i^* \\ y_{1it} & \mbox{if } t \geq g_i^* \end{array} \right. \]

\(g_i^*\) is a latent variable that is not observed. Instead, \(g_i\) is observed, where:

\[ g_i = g_i^* + d_i^* \]

where \(d_i^*\) is an unobserved, integer-valued noise term. So \(g_i\) is simply the true treatment date perturbed by noise.

I simulate \(N=500\) units for \(T=20\) periods each. The untreated outcomes are:

\[ y_{0it} = \delta_i + \gamma_t + \epsilon_{it} \]

where \(\delta_i, \gamma_t \sim N(0,1)\) and \(\epsilon_{it} \sim N(0, 0.1^2)\). \(\delta_i\) are unit fixed effects, \(\gamma_t\) are time fixed effects, and \(\epsilon_{it}\) is white noise.

The treated outcomes are:

\[ y_{1it} = y_{0it} + \theta \]

with \(\theta=1\). So I am assuming a constant and homogeneous treatment effect.

For each \(i\), \(g^*_{i}\) is uniformly distributed over \(\{5, \dots, 25\}\). Since \(T=20\), units with \(g^*_i > 20\) are never treated during the observation period.

In the simulations, I will experiment with different distributions of \(d_i^*\).
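For concreteness, here is a minimal Python sketch of this data-generating process. The variable names (g_star, d_star, g_obs) are mine; the actual simulations below use R.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, theta = 500, 20, 1.0

delta = rng.normal(0, 1, N)        # unit fixed effects
gamma = rng.normal(0, 1, T)        # time fixed effects
eps = rng.normal(0, 0.1, (N, T))   # white noise with sd = 0.1

g_star = rng.integers(5, 26, size=N)   # latent true treatment times, U{5,...,25}
d_star = rng.integers(-3, 4, size=N)   # integer noise in the observed date
g_obs = g_star + d_star                # what the researcher actually sees

t = np.arange(1, T + 1)
y0 = delta[:, None] + gamma[None, :] + eps   # untreated outcome
y1 = y0 + theta                              # treated outcome (constant effect)
treated = t[None, :] >= g_star[:, None]      # treatment switches on at the TRUE date
y = np.where(treated, y1, y0)                # observed outcome
```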

No noise

First, I report the results of CSDID when there is no noise in the treatment times (\(d_i^*=0\)). I use the R package did to compute the estimates. First, att_gt is called with the option control_group="notyettreated". Then, aggte is called with the options type="dynamic", min_e=-10, max_e=10.

The results are reported as an event study and can be interpreted as the estimated average treatment effect by length of exposure.

library(did)
library(dplyr)
library(tidyr)
library(stargazer)
library(ggplot2)
library(lfe)

set.seed(500)
N = 500
T = 20

dbase <- merge(data.frame(i=1:N), data.frame(t=1:T), all=T)
d.i <- data.frame(i=1:N, ife=rnorm(N), g=sample(5:(T+5), N, replace=T))  # unit fixed effects and observed treatment times
d.t <- data.frame(t=1:T, tfe=rnorm(T))  # time fixed effects

simulate <- function(diff_min=0, diff_max=0, ant=0, shift=0, theta=1, noise_level=0.1) {
  set.seed(100)
  # diff plays the role of d_i^*: the true treatment time is g - diff
  d.i.diff <- data.frame(i=1:N, diff=sample(diff_min:diff_max, N, replace=T))
  d <- dbase %>%
    left_join(d.i, by="i") %>%
    left_join(d.t, by="t") %>%
    left_join(d.i.diff, by="i")
  d$noise <- noise_level*rnorm(N*T)
  d$y0 <- d$ife + d$tfe + d$noise   # untreated outcome
  d$y1 <- d$y0 + theta              # treated outcome
  d$treated <- (d$g>0 & d$t >= d$g - d$diff)
  d$y <- (d$treated)*d$y1 + (!d$treated)*d$y0
  if (diff_min==diff_max) {
    title = paste0("di = ", diff_min)
  } else {
    title = paste0("di = U(", diff_min, ",", diff_max, ")")
  }
  if (ant!=0) { title = paste0(title, ", Anticipation = ", ant) }
  if (shift!=0) { title = paste0(title, ", Shift = ", shift) }
  d$g2 <- d$g - shift
  r <- att_gt(yname="y", tname="t", idname="i", gname="g2", data=d,
              control_group="notyettreated", anticipation=ant)
  ggdid(aggte(r, type="dynamic", min_e=-10, max_e=10)) +
    theme_bw() +
    scale_y_continuous(breaks=seq(0,1,0.25)) +
    expand_limits(y=c(0,1)) +
    theme(legend.position="bottom") +
    ggtitle(title)
}
simulate(diff_min=0, diff_max=0, ant=0)

When there is no noise, we can see that CSDID does a good job of estimating the treatment effects by length of exposure.

True date later than observed date

Now I consider what happens when the true treatment date is later than the observed treatment date by one period. That is, \(d_i^*=-1\) for all \(i\).

simulate(diff_min=-1, diff_max=-1, ant=0)

CSDID did a fine job here. As expected, the treatment effects don’t start until 1 period after the observed treatment date.

True date earlier than observed date

Now I consider what happens if the true treatment date is earlier than the observed treatment date by one period. That is, \(d_i^*=1\) for all \(i\).

simulate(diff_min=1, diff_max=1, ant=0)

Doesn’t look good at all. This happens because CSDID always takes time \(g_i-1\) as the comparison period. Since the true effect already started at time \(g_i-1\), comparing outcomes at \(t=g_i\) to \(t=g_i-1\) results in a null estimate. CSDID is estimating a positive effect at \(t=g_i-1\) because at that period it is simply comparing the outcomes of units one period away from observed treatment (thus actually already treated) to the outcomes of units more than one period away from treatment (thus not yet treated).
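A noiseless 2x2 example makes this mechanism concrete. The numbers are hypothetical, and period effects are dropped since they cancel in the differencing:

```python
theta = 1.0
# Observed date g = 10, but the true effect began at g - 1 = 9 (d* = 1).
# The treated unit is already at its treated level in BOTH periods.
y_treat = {9: theta, 10: theta}
y_ctrl = {9: 0.0, 10: 0.0}   # a not-yet-treated comparison unit

# CSDID-style 2x2 comparison of period g against base period g - 1:
att_g = (y_treat[10] - y_treat[9]) - (y_ctrl[10] - y_ctrl[9])
# The effect is differenced away, giving a null estimate at period g
```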

CSDID allows one to specify a number of anticipation periods. That is, you can tell CSDID that the effects may start before the observed treatment time. If the allowed anticipation is \(\delta\) periods, then the comparison period is \(g_i - \delta - 1\). Since the true treatment is 1 period before the observed treatment in our simulation, it makes sense to set an anticipation of 1 period. Doing so results in this:

simulate(diff_min=1, diff_max=1, ant=1)

Much better, but still not correct. I am not sure why the estimates for exposure lengths ≥ 0 are less than 1. I am also not sure why the estimates for exposure lengths ≤ -2 are less than zero. It seems like it might be an issue related to normalization and/or choice of comparison period. Unfortunately, I do not fully understand how the did package works under the hood.
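In the simple 2x2 logic, at least, moving the base period back past the true start should recover the effect, which is consistent with the improvement we see. Extending the hypothetical noiseless example from above:

```python
theta = 1.0
# Observed g = 10, true start 9, anticipation = 1:
# the base period moves to g - 1 - 1 = 8, which predates the true start.
y_treat = {8: 0.0, 9: theta, 10: theta}
y_ctrl = {8: 0.0, 9: 0.0, 10: 0.0}

att = (y_treat[10] - y_treat[8]) - (y_ctrl[10] - y_ctrl[8])
# With a clean base period, the full effect is recovered
```

Whatever is driving the residual discrepancies in the full estimator must come from steps beyond this simple comparison.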

Instead of using the built-in anticipation option, one can manually shift the observed treatment time. If we shift the observed treatment time back by 1 we get:

simulate(diff_min=1, diff_max=1, shift=1)

This looks correct. So if we know exactly what \(d_i^*\) is we can simply shift the treatment time manually. Of course, in practice we won’t know what it is.

Noisy treatment times

Now let’s consider what happens if \(g_i^*\) is measured with noise. We let \(d_i^*\) be a random integer uniformly distributed between -3 and 3. If we then run CSDID without anticipation we get:

simulate(diff_min=-3, diff_max=3, shift=0)

The estimated treatment effects are heavily attenuated. This is somewhat expected: the comparison period is always \(t=g_i-1\), at which point units are actually in various stages of treatment exposure. Intuitively, this is like attenuation bias due to measurement error.
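One way to preview the attenuation is to compute, for each observed event time \(e\), the fraction of units that are actually treated under the \(d_i^* \sim U\{-3,\dots,3\}\) assumption. This is a rough intuition-builder, not a derivation of the CSDID estimand:

```python
import numpy as np

d_star = np.arange(-3, 4)   # equally likely noise values in U{-3,...,3}

# At observed event time e (i.e., t = g + e), a unit is actually treated
# iff t >= g - d_star, which reduces to e >= -d_star.
frac_treated = {e: float(np.mean(e >= -d_star)) for e in range(-4, 7)}
# Partial treatment over e in [-3, 2] blends treated and untreated units
```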

If we allow for an anticipation of up to 3 periods, we get:

simulate(diff_min=-3, diff_max=3, ant=3)

And if we shift the observed treatment time back by 3 periods (so that all of the actual treatments happen after the shifted treatment time), we get:

simulate(diff_min=-3, diff_max=3, shift=3)

Allowing for anticipation and shifting the timing both improve the estimates significantly, with shifting producing the cleaner result. There is still attenuation bias over the window of event times in which it is uncertain whether a unit is actually treated. However, by the time the length of exposure hits 6, we can be sure all units are treated, and a treatment effect of 1 is correctly estimated.
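Why exposure 6? Relative to the shifted date \(g_i - 3\), the true start falls at event time \(3 - d_i^*\), which a quick check confirms spans 0 through 6:

```python
import numpy as np

d_star = np.arange(-3, 4)   # noise values in U{-3,...,3}
shift = 3
# True start relative to the shifted date: (g - d_star) - (g - shift) = shift - d_star
true_start_event = shift - d_star
# Every unit has started treatment by event time 6
```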

Conclusion

The CSDID estimator is quite sensitive to noise in the observed treatment time. The built-in anticipation option seems to help, though there appear to be some issues that I don’t fully understand. If one suspects noise in the observed treatment time, the safest option when using CSDID seems to be to shift the treatment times back by enough periods that all true treatment dates fall after the shifted dates.