Estimate the DiD for every possible pair of cohort and event time (g,e), as well as the average across cohorts for each event time (e).

Usage

DiD(
  inputdata,
  varnames,
  control_group = "all",
  base_event = -1,
  min_event = NULL,
  max_event = NULL,
  Esets = NULL,
  return_ATTs_only = TRUE,
  parallel_cores = 1
)

Arguments

inputdata

A data.table.

varnames

A list of the form varnames = list(id_name, time_name, outcome_name, cohort_name), where each of the four elements must be a character string that matches a variable name in inputdata.
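Before calling DiD, it can help to confirm that every entry of varnames names a column that actually exists. The sketch below is not part of the package: the toy data and the name missing_vars are hypothetical and only illustrate the check. It uses a plain data.frame for portability; the same check works on a data.table, since names() applies to both.

```r
# Toy stand-in for inputdata (hypothetical values; DiD itself expects a data.table)
toy <- data.frame(
  id     = c(1, 1, 2, 2),
  year   = c(2006, 2007, 2006, 2007),
  Y      = c(0.5, 1.2, 0.3, 0.4),
  cohort = c(2007, 2007, 2010, 2010)
)

# The four entries that the varnames argument is documented to require
varnames <- list(
  id_name      = "id",
  time_name    = "year",
  outcome_name = "Y",
  cohort_name  = "cohort"
)

# Any entry of varnames that does not match a column of the data
missing_vars <- setdiff(unlist(varnames), names(toy))

# Stop early if an entry is not a character string or names a missing column
stopifnot(is.character(unlist(varnames)), length(missing_vars) == 0)
```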

control_group

There are three possibilities: control_group="never-treated" uses the never-treated control group only; control_group="future-treated" uses those units that will receive treatment in the future as the control group; and control_group="all" uses both the never-treated and the future-treated in the control group. Default is control_group="all".
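As an illustration, the call below restricts the comparison to never-treated units. It is a sketch that assumes the package exporting DiD and SimDiD is DiDforBigData and that it is installed; the object name res_never is hypothetical.

```r
library(DiDforBigData)  # assumption: the package that exports DiD() and SimDiD()

# Simulated data and variable names, as in the Examples section
simdata <- SimDiD(sample_size = 200, ATTcohortdiff = 2)$simdata
varnames <- list(
  id_name      = "id",
  time_name    = "year",
  outcome_name = "Y",
  cohort_name  = "cohort"
)

# Use only never-treated units as the control group, at event time 1
res_never <- DiD(
  simdata, varnames,
  control_group = "never-treated",
  min_event = 1, max_event = 1
)

# Ncontrol now counts never-treated units only
res_never$results_cohort
```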

base_event

This is the base pre-period that is normalized to zero in the DiD estimation. Default is base_event=-1.
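To make the normalization concrete, the estimand can be sketched in the standard difference-in-differences form. The notation is an assumption introduced here, not taken from the package source: b denotes base_event, and \bar{Y} denotes a sample mean of the outcome for the treated cohort g or for its control group at the indicated calendar time.

```latex
\mathrm{DiD}_{g,e}
  = \left( \bar{Y}^{\,\text{treated}}_{g,\; g+e} - \bar{Y}^{\,\text{treated}}_{g,\; g+b} \right)
  - \left( \bar{Y}^{\,\text{control}}_{g,\; g+e} - \bar{Y}^{\,\text{control}}_{g,\; g+b} \right)
```

Setting e = b makes both differences vanish, which is the sense in which the base pre-period is normalized to zero.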

min_event

This is the minimum event time (e) to estimate. Default is NULL, in which case, no minimum is imposed.

max_event

This is the maximum event time (e) to estimate. Default is NULL, in which case, no maximum is imposed.

Esets

If a list of sets of event times is provided, DiD loops over those sets and computes, for each set, the average of ATT_e across the event times e in that set. Default is NULL.
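A minimal sketch of an Esets call follows. It assumes the package exporting DiD and SimDiD is DiDforBigData and that it is installed; the particular sets and the object name res_sets are illustrative choices.

```r
library(DiDforBigData)  # assumption: the package that exports DiD() and SimDiD()

# Simulated data and variable names, as in the Examples section
simdata <- SimDiD(sample_size = 200, ATTcohortdiff = 2)$simdata
varnames <- list(
  id_name      = "id",
  time_name    = "year",
  outcome_name = "Y",
  cohort_name  = "cohort"
)

# Two sets of event times: one average over {1, 2, 3}, another over {1, 3}
res_sets <- DiD(
  simdata, varnames,
  min_event = 1, max_event = 3,
  Esets = list(c(1, 2, 3), c(1, 3))
)

# The averages over each set are returned in the extra component
res_sets$results_Esets
```

The sets here stay within min_event and max_event so that every requested event time is also estimated.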

return_ATTs_only

Return only the ATT estimates and sample sizes. Default is TRUE.

parallel_cores

Number of cores to use in parallel processing. If greater than 1, it will try to run library(parallel), so the "parallel" package must be installed. Default is 1.

Value

A list with two components: results_cohort is a data.table with the DiDge estimates (by event time e and cohort g), and results_average is a data.table with the DiDe estimates (by event time e, averaged across cohorts g). If the Esets argument is specified, a third component called results_Esets is included in the output list.
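The components can be pulled out of the returned list by name. The sketch below assumes the package exporting DiD and SimDiD is DiDforBigData and that it is installed; the object names res, by_cohort and by_event are hypothetical.

```r
library(DiDforBigData)  # assumption: the package that exports DiD() and SimDiD()

# Simulated data and variable names, as in the Examples section
simdata <- SimDiD(sample_size = 200, ATTcohortdiff = 2)$simdata
varnames <- list(
  id_name      = "id",
  time_name    = "year",
  outcome_name = "Y",
  cohort_name  = "cohort"
)

res <- DiD(simdata, varnames, min_event = 1, max_event = 1)

# Cohort-by-event estimates and the across-cohort averages
by_cohort <- res$results_cohort
by_event  <- res$results_average

# Which components were returned (results_Esets appears only if Esets was given)
names(res)
```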

Examples

# simulate some data
simdata = SimDiD(sample_size=200, ATTcohortdiff = 2)$simdata

# define the variable names as a list()
varnames = list()
varnames$time_name = "year"
varnames$outcome_name = "Y"
varnames$cohort_name = "cohort"
varnames$id_name = "id"

# estimate the ATT for all cohorts at event time 1 only
DiD(simdata, varnames, min_event=1, max_event=1)
#> $results_cohort
#>    Cohort EventTime BaseEvent CalendarTime    ATTge  ATTge_SE Ncontrol Ntreated
#> 1:   2007         1        -1         2008 1.674411 0.2372155      151       49
#> 2:   2010         1        -1         2011 4.167621 0.2547631      101       50
#> 3:   2012         1        -1         2013 5.959456 0.2787505       51       50
#> 
#> $results_average
#>    EventTime BaseEvent     ATTe   ATTe_SE Ncontrol Ntreated
#> 1:         1        -1 3.948993 0.1535922      303      149
#>
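The printed output above can be used to see how the cohort-level rows relate to the average row. The base R sketch below copies the numbers from that output; the vector names are hypothetical. It shows that the printed ATTe matches a mean of the ATTge values weighted by Ntreated. This is an observation about this particular output, not a statement about how the package computes the average internally.

```r
# Values copied from $results_cohort in the printed output
att_ge    <- c(1.674411, 4.167621, 5.959456)
n_treated <- c(49, 50, 50)
n_control <- c(151, 101, 51)

# Mean of the cohort-level estimates, weighted by the number of treated units
att_e <- weighted.mean(att_ge, w = n_treated)
att_e  # about 3.949, in line with the printed ATTe

# The sample sizes in $results_average are the column sums
sum(n_control)  # 303
sum(n_treated)  # 149
```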