Estimate DiD for all possible cohort and event-time pairs (g,e), as well as the average across cohorts for each event time (e).
Usage
DiD(
  inputdata,
  varnames,
  control_group = "all",
  base_event = -1,
  min_event = NULL,
  max_event = NULL,
  Esets = NULL,
  return_ATTs_only = TRUE,
  parallel_cores = 1
)
Arguments
- inputdata
A data.table.
- varnames
A list of the form varnames = list(id_name, time_name, outcome_name, cohort_name), where each of the four elements must be a character string that names a variable in inputdata.
- control_group
There are three possibilities: control_group="never-treated" uses the never-treated control group only; control_group="future-treated" uses those units that will receive treatment in the future as the control group; and control_group="all" uses both the never-treated and the future-treated in the control group. Default is control_group="all".
- base_event
This is the base pre-period that is normalized to zero in the DiD estimation. Default is base_event=-1.
- min_event
This is the minimum event time (e) to estimate. Default is NULL, in which case no minimum is imposed.
- max_event
This is the maximum event time (e) to estimate. Default is NULL, in which case no maximum is imposed.
- Esets
If a list of sets of event times is provided, the function loops over those sets and computes, for each set, the average of ATT_e across the event times e in that set. Default is NULL.
- return_ATTs_only
Return only the ATT estimates and sample sizes. Default is TRUE.
- parallel_cores
Number of cores to use in parallel processing. If greater than 1, it will try to run library(parallel), so the "parallel" package must be installed. Default is 1.
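As a minimal sketch of how the arguments above fit together, the snippet below builds varnames in a single call, checks each name against a toy data set, and writes out an Esets value. The toy data frame and the helper check_varnames are hypothetical and are not part of the package; only the list shapes follow the argument descriptions.

```r
# Toy panel for illustration only (a plain data.frame; DiD() itself expects a data.table)
toy <- data.frame(
  id     = rep(1:2, each = 3),
  year   = rep(2006:2008, times = 2),
  Y      = c(0.1, 0.4, 1.9, 0.2, 0.3, 0.5),
  cohort = rep(c(2007, 2008), each = 3)
)

# The four required elements, each a character string naming a column
varnames <- list(id_name = "id", time_name = "year",
                 outcome_name = "Y", cohort_name = "cohort")

# Hypothetical helper (not part of the package): fail early if an element is
# missing, is not a single character string, or does not name a column
check_varnames <- function(varnames, inputdata) {
  required <- c("id_name", "time_name", "outcome_name", "cohort_name")
  stopifnot(all(required %in% names(varnames)))
  is_string <- vapply(varnames[required],
                      function(x) is.character(x) && length(x) == 1L,
                      logical(1))
  stopifnot(all(is_string))
  missing_cols <- setdiff(unlist(varnames[required]), names(inputdata))
  if (length(missing_cols) > 0) {
    stop("Not found in inputdata: ", paste(missing_cols, collapse = ", "))
  }
  invisible(TRUE)
}

check_varnames(varnames, toy)

# Esets is a list whose elements are sets (vectors) of event times
Esets <- list(c(1, 2, 3), c(1, 3))
```

Running a check like this before calling DiD() turns a misspelled column name into an immediate, readable error.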
Value
A list with two components: results_cohort is a data.table with the DiDge estimates (by event time e and cohort g), and results_average is a data.table with the DiDe estimates (by event time e, averaged across cohorts g). If the Esets argument is specified, a third component, results_Esets, is included in the returned list.
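A minimal sketch of reading the returned list. It assumes that DiD() and SimDiD() come from an installed package named DiDforBigData; the package name is not stated on this page and is an assumption here. The Esets value follows the description under Arguments. No output is shown because the simulated data are random.

```r
# Assumption: DiD() and SimDiD() are provided by a package named DiDforBigData
if (requireNamespace("DiDforBigData", quietly = TRUE)) {
  simdata <- DiDforBigData::SimDiD(sample_size = 200, ATTcohortdiff = 2)$simdata
  varnames <- list(id_name = "id", time_name = "year",
                   outcome_name = "Y", cohort_name = "cohort")

  # Event times 1 to 3, plus averages over the sets {1, 2, 3} and {1, 3}
  res <- DiDforBigData::DiD(simdata, varnames, min_event = 1, max_event = 3,
                            Esets = list(c(1, 2, 3), c(1, 3)))

  print(names(res))                    # components of the returned list
  by_cohort <- res$results_cohort      # estimates by cohort g and event time e
  by_event  <- res$results_average     # estimates by event time e, averaged across cohorts
  by_set    <- res$results_Esets       # included because Esets was supplied
}
```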
Examples
# simulate some data
simdata = SimDiD(sample_size=200, ATTcohortdiff = 2)$simdata
# define the variable names as a list()
varnames = list()
varnames$time_name = "year"
varnames$outcome_name = "Y"
varnames$cohort_name = "cohort"
varnames$id_name = "id"
# estimate the ATT for all cohorts at event time 1 only
DiD(simdata, varnames, min_event=1, max_event=1)
#> $results_cohort
#>    Cohort EventTime BaseEvent CalendarTime    ATTge  ATTge_SE Ncontrol Ntreated
#>     <num>     <num>     <num>        <num>    <num>     <num>    <int>    <int>
#> 1:   2007         1        -1         2008 1.674411 0.2372155      151       49
#> 2:   2010         1        -1         2011 4.167621 0.2547631      101       50
#> 3:   2012         1        -1         2013 5.959456 0.2787505       51       50
#>
#> $results_average
#> Key: <EventTime>
#>    EventTime BaseEvent     ATTe   ATTe_SE Ncontrol Ntreated
#>        <num>     <num>    <num>     <num>    <int>    <int>
#> 1:         1        -1 3.948993 0.1535922      303      149
#>
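The two tables printed above can be related by simple arithmetic. In this output, ATTe matches the mean of the cohort-level ATTge values weighted by Ntreated, and the sample sizes in results_average are the column sums from results_cohort. This is an observation about these printed numbers, sketched in base R below, and not a documented description of the estimator.

```r
# Values copied from $results_cohort in the printed output above
att_ge    <- c(1.674411, 4.167621, 5.959456)
n_treated <- c(49, 50, 50)
n_control <- c(151, 101, 51)

# Mean of ATTge weighted by the number of treated units in each cohort
att_e <- sum(att_ge * n_treated) / sum(n_treated)
round(att_e, 6)   # 3.948993, the ATTe shown in $results_average

# Sample sizes in $results_average are the sums over cohorts
sum(n_treated)    # 149
sum(n_control)    # 303
```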