Title: | Finding Heterogeneous Treatment Effects |
---|---|
Description: | The heterogeneous treatment effect estimation procedure proposed by Imai and Ratkovic (2013)<DOI: 10.1214/12-AOAS593>. The proposed method is applicable, for example, when selecting a small number of most (or least) efficacious treatments from a large number of alternative treatments as well as when identifying subsets of the population who benefit from (or are harmed by) a treatment of interest. The method adapts the Support Vector Machine classifier by placing separate LASSO constraints over the pre-treatment parameters and causal heterogeneity parameters of interest. This allows for the qualitative distinction between causal and other parameters, thereby making the variable selection suitable for the exploration of causal heterogeneity. The package also contains a class of functions, CausalANOVA, which estimates the average marginal interaction effects (AMIEs) by a regularized ANOVA as proposed by Egami and Imai (2019)<DOI:10.1080/01621459.2018.1476246>. It contains a variety of regularization techniques to facilitate analysis of large factorial experiments. |
Authors: | Naoki Egami, Marc Ratkovic, Kosuke Imai |
Maintainer: | Naoki Egami <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.2.0 |
Built: | 2024-11-03 04:51:19 UTC |
Source: | https://github.com/cran/FindIt |
Package: | factorEx |
Type: | Package |
Version: | 1.1.5 |
Date: | 2019-11-19 |
Naoki Egami, Marc Ratkovic and Kosuke Imai.
Maintainer: Naoki Egami [email protected]
Imai, Kosuke and Marc Ratkovic. 2013. “Estimating Treatment Effect Heterogeneity in Randomized Program Evaluation.” Annals of Applied Statistics, Vol.7, No.1(March), pp. 443-470. http://imai.fas.harvard.edu/research/files/svm.pdf
Egami, Naoki and Kosuke Imai. 2019. Causal Interaction in Factorial Experiments: Application to Conjoint Analysis. Journal of the American Statistical Association, Vol.114, No.526 (June), pp. 529–540. http://imai.fas.harvard.edu/research/files/int.pdf
This data set gives the outcomes as well as treatment assignments of the conjoint analysis in Carlson (2015). Please see Carlson (2015) and Egami and Imai (2019) for more details.
A data frame consisting of 7 columns (including a treatment assignment vector) and 3232 observations.
outcome | integer | whether a profile is chosen | 0,1 |
newRecordF | factor | record as a politician | 7 levels |
promise | factor | platform | 3 levels (job, clinic, education) |
coeth_voting | factor | whether a profile is coethnic to a respondent | Yes, No |
Degree | factor | whether a profile has relevant degrees | Yes, No |
Data from Carlson (2015).
Carlson, E. 2015. “Ethnic voting and accountability in Africa: A choice experiment in Uganda.” World Politics 67, 02, 353–385.
CausalANOVA estimates coefficients of the specified ANOVA with regularization. By taking differences in coefficients, the function recovers the AMEs and AMIEs.
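As a rough illustration of how differences in ANOVA coefficients give AMEs, the sketch below fits an unregularized sum-to-zero ANOVA with base R's lm() on simulated data. It is not the package's estimator: the data, the names A, B and y, and the use of lm() are assumptions made purely for illustration.

```r
set.seed(1)

# Hypothetical balanced 3 x 2 factorial design with 50 replicates per cell
design <- expand.grid(A = c("a1", "a2", "a3"), B = c("b1", "b2"), rep = 1:50)
design$A <- factor(design$A, levels = c("a1", "a2", "a3"))
design$B <- factor(design$B, levels = c("b1", "b2"))
design$y <- rnorm(nrow(design),
                  mean = as.integer(design$A) + 0.5 * as.integer(design$B))

# Main-effects ANOVA with sum-to-zero contrasts (unregularized stand-in)
fit <- lm(y ~ A + B, data = design,
          contrasts = list(A = "contr.sum", B = "contr.sum"))
b <- coef(fit)

# Recover one coefficient per level of A; the last level is minus the sum
coef_A <- c(b[["A1"]], b[["A2"]], -(b[["A1"]] + b[["A2"]]))
names(coef_A) <- levels(design$A)

# Effect of each level relative to the baseline "a1" = difference in coefficients
ame_vs_a1 <- coef_A - coef_A[["a1"]]

# In a balanced design this matches the difference in marginal means
marg <- tapply(design$y, design$A, mean)
marg_diff <- marg - marg[["a1"]]
print(round(cbind(ame_vs_a1, marg_diff), 3))
```

In this balanced toy design the two columns agree, which is why differencing coefficients is a natural way to read off effects relative to a baseline level.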
CausalANOVA( formula, int2.formula = NULL, int3.formula = NULL, data, nway = 1, pair.id = NULL, diff = FALSE, screen = FALSE, screen.type = "fixed", screen.num.int = 3, collapse = FALSE, collapse.type = "fixed", collapse.cost = 0.3, family = "binomial", cluster = NULL, maxIter = 50, eps = 1e-05, fac.level = NULL, ord.fac = NULL, select.prob = FALSE, boot = 100, seed = 1234, verbose = TRUE )
formula | A formula that specifies outcome and treatment variables. |
int2.formula | (optional). A formula that specifies two-way interactions. |
int3.formula | (optional). A formula that specifies three-way interactions. |
data | An optional data frame, list or environment (or object coercible by 'as.data.frame' to a data frame) containing the variables in the model. If not found in 'data', the variables are taken from 'environment(formula)', typically the environment from which 'CausalANOVA' is called. |
nway | With |
pair.id | (optional). Unique identifiers for each pair of comparisons. This option is used when |
diff | A logical indicating whether the outcome is the choice between a pair. If |
screen | A logical indicating whether to select significant factor interactions with |
screen.type | Type for screening factor interactions. (1) |
screen.num.int | (optional). The number of factor interactions to select. This option is used when and |
collapse | A logical indicating whether to collapse insignificant levels within factors. With |
collapse.type | Type for collapsing levels within factors. (1) |
collapse.cost | (optional). A cost parameter ranging from 0 to 1. 1 corresponds to no collapsing. The closer to 0, the stronger the regularization. Default is 0.3. |
family | A family of outcome variables. |
cluster | Unique identifiers with which cluster standard errors are computed. |
maxIter | The maximum number of iterations for |
eps | A tolerance parameter in the internal optimization algorithm. |
fac.level | (optional). A vector containing the number of levels in each factor. The order of |
ord.fac | (optional). Logical vectors indicating whether each factor has ordered ( |
select.prob | (optional). A logical indicating whether selection probabilities are computed. This option might take time. |
boot | The number of bootstrap replicates for |
seed | Seed for bootstrap. |
verbose | Whether it prints the value of a cost parameter used. |
Regularization: screen and collapse.
Users can implement regularization in order to reduce the false discovery rate and facilitate interpretation. This is particularly useful when analyzing factorial experiments with a large number of factors, each having many levels.
When screen=TRUE, the function selects significant factor interactions with glinternet (Lim and Hastie 2015) before estimating the AMEs and AMIEs. This option is recommended when there are many factors, e.g., more than 6 factors. Alternatively, users can pre-specify interactions of interest using int2.formula and int3.formula.
When collapse=TRUE, the function collapses insignificant levels within each factor by GashANOVA (Post and Bondell 2013) before estimating the AMEs and AMIEs. This option is recommended when there are many levels within some factors, e.g., more than 6 levels.
Inference after Regularization:
When screen=TRUE or collapse=TRUE, in order to make valid inference after regularization, we recommend using the test.CausalANOVA function. It takes the output from the CausalANOVA function, estimates the AMEs and AMIEs with newdata, and provides confidence intervals. Ideally, users should split the sample into two halves: use one half for regularization with the CausalANOVA function and the other half for inference with test.CausalANOVA.
If users do not need regularization, specify screen=FALSE and collapse=FALSE. The function estimates the AMEs and AMIEs and computes confidence intervals with the full sample.
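The sample-splitting advice can be sketched in base R. The example below splits by respondent rather than by row, so that all profiles rated by one respondent land in the same half. The data frame dat and its columns resp_id and y are hypothetical, and splitting at the respondent level is an assumption that mirrors clustered conjoint data rather than a requirement stated here.

```r
set.seed(1234)

# Hypothetical conjoint-style data: 20 respondents with 4 rows each
dat <- data.frame(resp_id = rep(1:20, each = 4),
                  y = rbinom(80, 1, 0.5))

# Draw half of the respondent identifiers for the regularization half
ids <- unique(dat$resp_id)
train_ids <- sample(ids, size = floor(length(ids) / 2), replace = FALSE)
test_ids <- setdiff(ids, train_ids)

# Keep every row of a respondent in the same half
dat_train <- dat[dat$resp_id %in% train_ids, ]
dat_test <- dat[dat$resp_id %in% test_ids, ]

print(c(train = nrow(dat_train), test = nrow(dat_test)))
```

The first half would then be used for regularization and the second half passed as newdata for inference.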
Suggested Workflow: (See Examples below as well)
Specify the order of levels within each factor using levels(). When collapse=TRUE, the function places penalties on the differences between adjacent levels when levels are ordered, so it is crucial to specify the order of levels within each factor carefully.
Run CausalANOVA.
Specify formula to indicate outcomes and treatment variables and nway to indicate the order of interactions.
Specify diff=TRUE and pair.id if the outcome is the choice between a pair.
Specify screen. Use screen=TRUE to implement data-driven selection of factor interactions. Use screen=FALSE to specify interactions through int2.formula and int3.formula by hand.
Specify collapse. Use collapse=TRUE to implement data-driven collapsing of insignificant levels. Use collapse=FALSE to use the original number of levels.
Run test.CausalANOVA when screen=TRUE or collapse=TRUE.
Run summary and plot to explore the AMEs and AMIEs.
Estimate conditional effects using the ConditionalEffect function and visualize them using the plot function.
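The first workflow step, fixing the order of levels, can be sketched in base R as follows. The vector x is hypothetical; the level names are borrowed from the promise factor used in the Examples. The sketch only shows which pairs of levels count as adjacent once an order is set, which is where the text above says penalties are placed for ordered factors.

```r
# Hypothetical raw values, stored in no particular order
x <- c("education", "jobs", "clinic", "jobs")

# Without explicit levels, factor() sorts the levels alphabetically
default_levels <- levels(factor(x))

# Set the intended order explicitly
promise <- factor(x, ordered = TRUE,
                  levels = c("jobs", "clinic", "education"))
lev <- levels(promise)

# Adjacent pairs implied by that order
adjacent <- data.frame(lower = head(lev, -1), upper = tail(lev, -1))

print(default_levels)
print(adjacent)
```

Because the default alphabetical order differs from the intended one, checking levels() before fitting is a cheap safeguard.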
intercept | An intercept of the estimated ANOVA model. If |
formula | The |
coefs | A named vector of coefficients of the estimated ANOVA model. |
vcov | The variance-covariance matrix for |
CI.table | The summary of AMEs and AMIEs with confidence intervals. Only when |
AME | The estimated AMEs with the grand-mean as baselines. |
AMIE2 | The estimated two-way AMIEs with the grand-mean as baselines. |
AMIE3 | The estimated three-way AMIEs with the grand-mean as baselines. |
... | arguments passed to the function or arguments only for the internal use. |
Naoki Egami and Kosuke Imai.
Egami, Naoki and Kosuke Imai. 2019. Causal Interaction in Factorial Experiments: Application to Conjoint Analysis, Journal of the American Statistical Association. http://imai.fas.harvard.edu/research/files/int.pdf
Lim, M. and Hastie, T. 2015. Learning interactions via hierarchical group-lasso regularization. Journal of Computational and Graphical Statistics 24, 3, 627–654.
Post, J. B. and Bondell, H. D. 2013. Factor selection and structural identification in the interaction anova model. Biometrics 69, 1, 70–79.
library(FindIt)
data(Carlson)
## Specify the order of each factor
Carlson$newRecordF <- factor(Carlson$newRecordF, ordered=TRUE,
                             levels=c("YesLC", "YesDis", "YesMP",
                                      "noLC", "noDis", "noMP", "noBusi"))
Carlson$promise <- factor(Carlson$promise, ordered=TRUE,
                          levels=c("jobs", "clinic", "education"))
Carlson$coeth_voting <- factor(Carlson$coeth_voting, ordered=FALSE, levels=c("0", "1"))
Carlson$relevantdegree <- factor(Carlson$relevantdegree, ordered=FALSE, levels=c("0", "1"))

## #######################################
## Without Screening and Collapsing
## #######################################
#################### only AMEs ####################
fit1 <- CausalANOVA(formula=won ~ newRecordF + promise + coeth_voting + relevantdegree,
                    data=Carlson, pair.id=Carlson$contestresp, diff=TRUE,
                    cluster=Carlson$respcodeS, nway=1)
summary(fit1)
plot(fit1)

#################### AMEs and two-way AMIEs ####################
fit2 <- CausalANOVA(formula=won ~ newRecordF + promise + coeth_voting + relevantdegree,
                    int2.formula = ~ newRecordF:coeth_voting,
                    data=Carlson, pair.id=Carlson$contestresp, diff=TRUE,
                    cluster=Carlson$respcodeS, nway=2)
summary(fit2)
plot(fit2, type="ConditionalEffect", fac.name=c("newRecordF", "coeth_voting"))
ConditionalEffect(fit2, treat.fac="newRecordF", cond.fac="coeth_voting")

## Not run:
#################### AMEs and two-way and three-way AMIEs ####################
## Note: All pairs within three-way interactions should show up in int2.formula (Strong Hierarchy).
fit3 <- CausalANOVA(formula=won ~ newRecordF + promise + coeth_voting + relevantdegree,
                    int2.formula = ~ newRecordF:promise + newRecordF:coeth_voting
                    + promise:coeth_voting,
                    int3.formula = ~ newRecordF:promise:coeth_voting,
                    data=Carlson, pair.id=Carlson$contestresp, diff=TRUE,
                    cluster=Carlson$respcodeS, nway=3)
summary(fit3)
plot(fit3, type="AMIE", fac.name=c("newRecordF", "promise", "coeth_voting"),
     space=25, adj.p=2.2)
## End(Not run)

## #######################################
## With Screening and Collapsing
## #######################################
## Sample Splitting
train.ind <- sample(unique(Carlson$respcodeS), 272, replace=FALSE)
test.ind <- setdiff(unique(Carlson$respcodeS), train.ind)
Carlson.train <- Carlson[is.element(Carlson$respcodeS, train.ind), ]
Carlson.test <- Carlson[is.element(Carlson$respcodeS, test.ind), ]

#################### AMEs and two-way AMIEs ####################
fit.r2 <- CausalANOVA(formula=won ~ newRecordF + promise + coeth_voting + relevantdegree,
                      data=Carlson.train, pair.id=Carlson.train$contestresp, diff=TRUE,
                      screen=TRUE, collapse=TRUE,
                      cluster=Carlson.train$respcodeS, nway=2)
summary(fit.r2)

## refit with test.CausalANOVA
fit.r2.new <- test.CausalANOVA(fit.r2, newdata=Carlson.test, diff=TRUE,
                               pair.id=Carlson.test$contestresp,
                               cluster=Carlson.test$respcodeS)
summary(fit.r2.new)
plot(fit.r2.new)
plot(fit.r2.new, type="ConditionalEffect", fac.name=c("newRecordF", "coeth_voting"))
ConditionalEffect(fit.r2.new, treat.fac="newRecordF", cond.fac="coeth_voting")
ConditionalEffect estimates a variety of conditional effects using the output from CausalANOVA.
ConditionalEffect( object, treat.fac = NULL, cond.fac = NULL, base.ind = 1, round = 3, inference = NULL, verbose = TRUE )
object | The output from |
treat.fac | The name of factor acting as the main treatment variable. |
cond.fac | The name of factor acting as the conditioning (moderating) variable. |
base.ind | An indicator for the baseline of the treatment factor. Default is 1. |
round | Digits to round estimates. Default is 3. |
inference | (optional). This argument is mainly for internal use. It indicates whether |
verbose | Whether it prints the progress. |
See Details in CausalANOVA.
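To make the idea of a conditional effect concrete, the base R sketch below computes, from plain cell means, the effect of each treatment level relative to a baseline level within each level of a conditioning factor. This is a simplified stand-in, not what ConditionalEffect computes from a fitted CausalANOVA object; the data frame d and its columns are hypothetical, and base.ind is reused only by analogy with the argument of the same name.

```r
# Hypothetical data: a 3-level treatment factor and a 2-level conditioning factor
d <- data.frame(
  treat = factor(rep(c("t1", "t2", "t3"), times = 4)),
  cond  = factor(rep(c("c1", "c2"), each = 6)),
  y     = c(1, 2, 4, 1, 2, 4,
            2, 2, 5, 2, 2, 5)
)

# Mean outcome in each treat x cond cell (rows: treat, columns: cond)
cell <- tapply(d$y, list(d$treat, d$cond), mean)

# Effect of each treatment level relative to the baseline level,
# computed separately within each level of the conditioning factor
base.ind <- 1
cond_eff <- sweep(cell, 2, cell[base.ind, ], "-")
print(round(cond_eff, 3))
```

Comparing the columns of cond_eff shows how the effect of a treatment level can change across levels of the moderating factor.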
CondtionalEffects | The summary of estimated conditional effects. |
... | Arguments for the internal use. |
Naoki Egami and Kosuke Imai.
Egami, Naoki and Kosuke Imai. 2019. Causal Interaction in Factorial Experiments: Application to Conjoint Analysis, Journal of the American Statistical Association. http://imai.fas.harvard.edu/research/files/int.pdf
Lim, M. and Hastie, T. 2015. Learning interactions via hierarchical group-lasso regularization. Journal of Computational and Graphical Statistics 24, 3, 627–654.
Post, J. B. and Bondell, H. D. 2013. “Factor selection and structural identification in the interaction anova model.” Biometrics 69, 1, 70–79.
library(FindIt)
data(Carlson)
## Specify the order of each factor
Carlson$newRecordF <- factor(Carlson$newRecordF, ordered=TRUE,
                             levels=c("YesLC", "YesDis", "YesMP",
                                      "noLC", "noDis", "noMP", "noBusi"))
Carlson$promise <- factor(Carlson$promise, ordered=TRUE,
                          levels=c("jobs", "clinic", "education"))
Carlson$coeth_voting <- factor(Carlson$coeth_voting, ordered=FALSE, levels=c("0", "1"))
Carlson$relevantdegree <- factor(Carlson$relevantdegree, ordered=FALSE, levels=c("0", "1"))

## #######################################
## Without Screening and Collapsing
## #######################################
#################### AMEs and two-way AMIEs ####################
fit2 <- CausalANOVA(formula=won ~ newRecordF + promise + coeth_voting + relevantdegree,
                    int2.formula = ~ newRecordF:coeth_voting,
                    data=Carlson, pair.id=Carlson$contestresp, diff=TRUE,
                    cluster=Carlson$respcodeS, nway=2)
summary(fit2)
plot(fit2, type="ConditionalEffect", fac.name=c("newRecordF", "coeth_voting"))
ConditionalEffect(fit2, treat.fac="newRecordF", cond.fac="coeth_voting")
cv.CausalANOVA implements cross-validation for CausalANOVA to select the collapse.cost parameter. CausalANOVA runs this function internally when collapse.type=cv.min or collapse.type=cv.1Std.
cv.CausalANOVA( formula, int2.formula = NULL, int3.formula = NULL, data, nway = 1, pair.id = NULL, diff = FALSE, cv.collapse.cost = c(0.1, 0.3, 0.7), nfolds = 5, screen = FALSE, screen.type = "fixed", screen.num.int = 3, family = "binomial", cluster = NULL, maxIter = 50, eps = 1e-05, seed = 1234, fac.level = NULL, ord.fac = NULL, verbose = TRUE )
formula | A formula that specifies outcome and treatment variables. |
int2.formula | (optional). A formula that specifies two-way interactions. |
int3.formula | (optional). A formula that specifies three-way interactions. |
data | An optional data frame, list or environment (or object coercible by 'as.data.frame' to a data frame) containing the variables in the model. If not found in 'data', the variables are taken from 'environment(formula)', typically the environment from which 'CausalANOVA' is called. |
nway | With |
pair.id | (optional). Unique identifiers for each pair of comparisons. This option is used when |
diff | A logical indicating whether the outcome is the choice between a pair. If |
cv.collapse.cost | A vector containing candidates for a cost parameter ranging from 0 to 1. 1 corresponds to no regularization, and smaller values correspond to stronger regularization. Default is |
nfolds | Number of folds. Default is 5. Although nfolds can be as large as the sample size (leave-one-out CV), it is not recommended for large datasets. |
screen | A logical indicating whether to select significant factor interactions with |
screen.type | Type for screening factor interactions. (1) |
screen.num.int | (optional). The number of factor interactions to select. This option is used when and |
family | A family of outcome variables. |
cluster | Unique identifiers with which cluster standard errors are computed. |
maxIter | The maximum number of iterations for |
eps | A tolerance parameter in the internal optimization algorithm. |
seed | An argument for |
fac.level | (optional). A vector containing the number of levels in each factor. The order of |
ord.fac | (optional). Logical vectors indicating whether each factor has ordered ( |
verbose | Whether it prints the value of a cost parameter used. |
See Details in CausalANOVA.
cv.error | The mean cross-validated error - a vector of length |
cv.min | A value of |
cv.1Std | The largest value of |
cv.each.mat | A matrix containing cross-validation errors for each fold and cost parameter. |
cv.cost | The |
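The relationship between a fold-by-cost error matrix and the two selection rules can be sketched in base R. The definitions of cv.min and cv.1Std are truncated above, so the rules used here (the candidate with the smallest mean error, and the largest candidate whose mean error is within one standard error of that minimum) are assumptions borrowed from the usual one-standard-error convention, not a statement of what cv.CausalANOVA does. The error values are made up.

```r
# Hypothetical cross-validation errors: rows are folds, columns are cost candidates
cost <- c(0.1, 0.3, 0.7)
cv_each <- matrix(c(0.30, 0.28, 0.31,   # errors for cost = 0.1
                    0.25, 0.24, 0.27,   # errors for cost = 0.3
                    0.26, 0.25, 0.26),  # errors for cost = 0.7
                  nrow = 3,
                  dimnames = list(paste0("fold", 1:3), as.character(cost)))

# Mean error and its standard error for each candidate
cv_error <- colMeans(cv_each)
cv_se <- apply(cv_each, 2, sd) / sqrt(nrow(cv_each))

# Assumed rule 1: candidate with the smallest mean error
i_min <- which.min(cv_error)
cv_min <- cost[i_min]

# Assumed rule 2: largest candidate within one standard error of the minimum
within_1se <- cv_error <= cv_error[i_min] + cv_se[i_min]
cv_1std <- max(cost[within_1se])

print(c(cv.min = cv_min, cv.1Std = cv_1std))
```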
Naoki Egami and Kosuke Imai.
Post, J. B. and Bondell, H. D. 2013. “Factor selection and structural identification in the interaction anova model.” Biometrics 69, 1, 70–79.
Egami, Naoki and Kosuke Imai. 2019. Causal Interaction in Factorial Experiments: Application to Conjoint Analysis, Journal of the American Statistical Association. http://imai.fas.harvard.edu/research/files/int.pdf
library(FindIt)
data(Carlson)
## Specify the order of each factor
Carlson$newRecordF <- factor(Carlson$newRecordF, ordered=TRUE,
                             levels=c("YesLC", "YesDis", "YesMP",
                                      "noLC", "noDis", "noMP", "noBusi"))
Carlson$promise <- factor(Carlson$promise, ordered=TRUE,
                          levels=c("jobs", "clinic", "education"))
Carlson$coeth_voting <- factor(Carlson$coeth_voting, ordered=FALSE, levels=c("0", "1"))
Carlson$relevantdegree <- factor(Carlson$relevantdegree, ordered=FALSE, levels=c("0", "1"))

## #######################################
## Collapsing Without Screening
## #######################################
#################### AMEs and two-way AMIEs ####################
## We show a very small example for illustration.
## Recommended to use cv.collapse.cost=c(0.1,0.3,0.5) and nfolds=10 in practice.
fit.cv <- cv.CausalANOVA(formula=won ~ newRecordF + promise + coeth_voting + relevantdegree,
                         int2.formula = ~ newRecordF:coeth_voting,
                         data=Carlson, pair.id=Carlson$contestresp, diff=TRUE,
                         cv.collapse.cost=c(0.1, 0.3), nfolds=2,
                         cluster=Carlson$respcodeS, nway=2)
fit.cv
FindIt returns a model with the most predictive treatment-treatment interactions or treatment-covariate interactions.
FindIt( model.treat, model.main, model.int, data = NULL, type = "binary", treat.type = "multiple", nway, search.lambdas = TRUE, lambdas = NULL, make.twoway = TRUE, make.allway = TRUE, wts = 1, scale.c = 1, scale.int = 1, fit.glmnet = TRUE, make.reference = TRUE, reference.main = NULL, threshold = 0.999999 )
model.treat | A formula that specifies outcome and treatment variables. |
model.main | An optional formula that specifies pre-treatment covariates to be adjusted. |
model.int | A formula specifying pre-treatment covariates to be interacted with treatment assignments when |
data | An optional data frame, list or environment (or object coercible by 'as.data.frame' to a data frame) containing the variables in the model. If not found in 'data', the variables are taken from 'environment(formula)', typically the environment from which 'FindIt' is called. |
type | "binary" for a binary outcome variable, which needs to be |
treat.type | "single" for interactions between a single treatment variable, which needs to be |
nway | An argument passed to |
search.lambdas | Whether to search for the tuning parameters for the LASSO constraints. If |
lambdas | Tuning parameters to be given to |
make.twoway | If |
make.allway | If |
wts | An optional set of scaling weights. The default is 1. |
scale.c | A set of weights for rescaling the pre-treatment covariates; only used if |
scale.int | A set of weights for rescaling the covariates to be interacted with treatment variables; only used if |
fit.glmnet | Whether to fit using the coordinate descent method in glmnet (TRUE) or the regularization path method of LARS (FALSE). |
make.reference | Whether to make a reference matrix to check which columns are dropped when |
reference.main | If |
threshold | An argument passed to |
Implements the alternating line search algorithm for estimating the tuning parameters, as described in Imai and Ratkovic (2013).
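The general shape of an alternating line search over two tuning parameters can be sketched in base R: hold one parameter fixed, minimize a criterion over the other along a line, then swap, and repeat until the values stop moving. The criterion crit below is a toy quadratic with a known minimum, chosen only so the sketch is checkable; it is not the criterion the package uses, and the search range, starting values and stopping rule are likewise assumptions.

```r
# Toy criterion in two tuning parameters (hypothetical stand-in for a fit criterion);
# its minimum is at l1 = -2, l2 = 1
crit <- function(l1, l2) (l1 + 2)^2 + (l2 - 1)^2 + 0.5 * (l1 + 2) * (l2 - 1)

lam <- c(0, 0)              # starting values (assumed)
search_range <- c(-15, 10)  # range searched for each parameter (assumed)

for (iter in 1:50) {
  old <- lam
  # Line search over the first parameter, holding the second fixed
  lam[1] <- optimize(function(l) crit(l, lam[2]), interval = search_range)$minimum
  # Line search over the second parameter, holding the first fixed
  lam[2] <- optimize(function(l) crit(lam[1], l), interval = search_range)$minimum
  # Stop once neither parameter moves appreciably
  if (max(abs(lam - old)) < 1e-4) break
}

print(round(lam, 3))
```

Each pass is a pair of one-dimensional searches, which is cheaper than a full two-dimensional grid when every criterion evaluation requires refitting a model.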
coefs | A named vector of scaled coefficients |
coefs.orig | A vector of coefficients on the original scale, if scale.c and scale.t were used |
fit | Fitted values on an SVM scale |
names.out | Names of the coefficients |
y | A vector of observed outcomes |
X.c | A matrix of pre-treatment covariates to be adjusted |
X.t | A matrix of treatments and treatment-treatment interactions, or treatment-covariate interactions |
GCV | GCV statistic at the minimum |
ATE | When |
lambdas | Tuning parameters used for the fit |
reference | When |
Naoki Egami, Marc Ratkovic and Kosuke Imai.
Imai, Kosuke and Marc Ratkovic. 2013. “Estimating Treatment Effect Heterogeneity in Randomized Program Evaluation.” Annals of Applied Statistics, Vol.7, No.1(March), pp. 443-470. http://imai.fas.harvard.edu/research/files/svm.pdf
Egami, Naoki and Kosuke Imai. 2019. Causal Interaction in Factorial Experiments: Application to Conjoint Analysis, Journal of the American Statistical Association. http://imai.fas.harvard.edu/research/files/int.pdf
library(FindIt)
###################################################
## Example 1: Treatment-Covariate Interaction
###################################################
data(LaLonde)

## The model includes a treatment variable,
## nine covariates to be interacted with the treatment variable,
## and the same nine covariates to be adjusted.

## Not run:
## Run to find the LASSO parameters
F1 <- FindIt(model.treat = outcome ~ treat,
             model.main = ~ age+educ+black+hisp+white+marr+nodegr+log.re75+u75,
             model.int = ~ age+educ+black+hisp+white+marr+nodegr+log.re75+u75,
             data = LaLonde, type="binary", treat.type="single")
## End(Not run)

## Fit with uncovered lambda parameters.
F1 <- FindIt(model.treat = outcome ~ treat,
             model.main = ~ age+educ+black+hisp+white+marr+nodegr+log.re75+u75,
             model.int = ~ age+educ+black+hisp+white+marr+nodegr+log.re75+u75,
             data = LaLonde, type="binary", treat.type="single",
             search.lambdas=FALSE, lambdas = c(-3.8760, -4.0025))
summary(F1)

## Returns all the estimated treatment effects.
pred1 <- predict(F1)
## Top 10
head(pred1$data, n=10)
## Bottom 10
tail(pred1$data, n=10)

## Visualize all the estimated treatment effects.
## Not run:
plot(pred1)
## End(Not run)

###################################################
## Example 2: Treatment-Treatment Interaction
###################################################
## Not run:
data(GerberGreen)

## The model includes four factorial treatments and
## all two, three, four-way interactions between them.
## Four pre-treatment covariates are adjusted.

## Run to search for lambdas.
F2 <- FindIt(model.treat = voted98 ~ persngrp+phnscrpt+mailings+appeal,
             nway=4,
             model.main = ~ age+majorpty+vote96.1+vote96.0,
             data = GerberGreen, type="binary", treat.type="multiple")

## Fit, given selected lambdas.
F2 <- FindIt(model.treat = voted98 ~ persngrp+phnscrpt+mailings+appeal,
             nway=4,
             model.main = ~ age+majorpty+vote96.1+vote96.0,
             data = GerberGreen, type="binary", treat.type="multiple",
             search.lambdas=FALSE, lambdas=c(-15.000, -6.237))

## Returns coefficient estimates.
summary(F2)

## Returns predicted values for unique treatment combinations.
pred2 <- predict(F2, unique=TRUE)
## Top 10
head(pred2$data, n=10)
## Bottom 10
tail(pred2$data, n=10)

## Visualize predicted values for each treatment combination.
plot(pred2)
## End(Not run)
This data set contains the most recent corrected data from the field experiment analyzed in Gerber and Green (2000).
A data frame consisting of 9 columns and 29,380 observations.
voted98 | integer | voted in 1998 | 0,1 |
persngrp | factor | personal contact attempted | 0,1 |
phnscrpt | factor | script read to phone respondents | 7 levels |
mailings | factor | number of mailings sent | 0 - 3 |
appeal | factor | content of message | 3 levels |
age | integer | age of respondent | |
majorpty | factor | Democratic or Republican | |
vote96.1 | factor | voted in 1996 | 0,1 |
vote96.0 | factor | abstained in 1996 | 0,1 |
Note: The levels of phnscrpt and appeal are as follows.
phnscrpt: Script read to phone respondents
0 | No phone |
1 | Civic-Blood |
2 | Civic |
3 | Civic or Blood-Civic |
4 | Neighbor |
5 | Neighbor or Civic-Neighbor |
6 | Close |
appeal: Content of message
1 | Civic Duty |
2 | Neighborhood Solidarity |
3 | Close Election |
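The code tables above can be attached to the factors as readable labels. The following is a minimal sketch in base R on a small made-up data frame: the object `toy` and its values are hypothetical, and it assumes the factors store the numeric codes shown in the tables. Only the codes and labels themselves come from this page.

```r
# Hypothetical toy data using the level codes documented above.
toy <- data.frame(
  phnscrpt = factor(c(0, 2, 6, 4), levels = 0:6),
  appeal   = factor(c(1, 3, 2, 1), levels = 1:3)
)

# Labels copied from the level tables, in code order.
phnscrpt_labels <- c("No phone", "Civic-Blood", "Civic",
                     "Civic or Blood-Civic", "Neighbor",
                     "Neighbor or Civic-Neighbor", "Close")
appeal_labels <- c("Civic Duty", "Neighborhood Solidarity", "Close Election")

# Recode: keep the original code order, swap in readable labels.
toy$phnscrpt_lab <- factor(toy$phnscrpt, levels = 0:6,
                           labels = phnscrpt_labels)
toy$appeal_lab <- factor(toy$appeal, levels = 1:3,
                         labels = appeal_labels)

print(toy)
```

Storing the labelled versions in new columns leaves the original coded factors untouched, so model formulas that refer to `phnscrpt` and `appeal` keep working.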
Gerber, A. S. and Green, D. P. 2000. “The effects of canvassing, telephone calls, and direct mail on voter turnout: A field experiment.” American Political Science Review, Vol.94, No.3, pp. 653-663.
Imai, K. 2005. “Do get-out-the-vote calls reduce turnout?: The importance of statistical methods for field experiments.” American Political Science Review, Vol.99, No.2, pp. 283-300.
This data set gives the outcomes as well as treatment assignments and covariates for the National Supported Work Study, as analyzed in LaLonde (1986).
A data frame consisting of 12 columns (including a treatment assignment vector) and 2787 observations.
outcome | integer | whether earnings in 1978 are larger than in 1975 | 0,1 |
treat | integer | whether the individual received the treatment | 0,1 |
age | numeric | age in years | |
educ | numeric | education in years | |
black | factor | black or not | 0,1 |
hisp | factor | hispanic or not | 0,1 |
white | factor | white or not | 0,1 |
marr | factor | married or not | 0,1 |
nodegr | factor | an indicator for no high school degree | 0,1 |
log.re75 | numeric | log of earnings in 1975 | |
u75 | factor | unemployed in 1975 | 0,1 |
wts.extrap | numeric | extrapolation weights to the 1978 Panel Study for Income Dynamics dataset | |
Data from the National Supported Work Study. A benchmark matching dataset. 1975 earnings are pre-treatment.
LaLonde, R.J. 1986. “Evaluating the econometric evaluations of training programs with experimental data.” American Economic Review, Vol.76, No.4, pp. 604-620.
Plotting CausalANOVA
## S3 method for class 'CausalANOVA'
plot(
  x,
  fac.name,
  treat.ind = 1,
  type = "ConditionalEffect",
  space = 15,
  xlim,
  ...
)
x |
An output from |
fac.name |
Factor names to plot. Length should be 2. |
treat.ind |
Which factor serves as the main treatment. Should be 1 (the first element of |
type |
What types of effects to plot. Should be one of |
space |
Space on the left side of the plot. |
xlim |
Range for the x-axis |
... |
Other graphical parameters |
Plot estimated treatment effects when treat.type="single" and predicted outcomes for each treatment combination when treat.type="multiple".
## S3 method for class 'PredictFindIt'
plot(x, main, xlab, ylab, interactive = FALSE, ...)
x |
output from |
main |
the argument specifying the main title of the plot. |
xlab |
the argument specifying the name of x axis. |
ylab |
the argument specifying the name of y axis. |
interactive |
whether to make a plot interactive; default is FALSE. |
... |
further arguments passed to or from other methods. |
plot |
Plot estimated treatment effects when
|
Naoki Egami, Marc Ratkovic and Kosuke Imai.
## See the help page for FindIt() for an example.
predict.FindIt takes an output from FindIt and returns estimated treatment effects when treat.type="single" and predicted outcomes for each treatment combination when treat.type="multiple".
## S3 method for class 'FindIt'
predict(
  object,
  newdata,
  sort = TRUE,
  decreasing = TRUE,
  wts = 1,
  unique = FALSE,
  ...
)
object |
An output object from |
newdata |
An optional data frame in which to look for variables with
which to predict. If omitted, the data used in |
sort |
Whether to sort samples according to estimated treatment effects. |
decreasing |
When |
wts |
Weights. |
unique |
If |
... |
further arguments passed to or from other methods. |
Useful for computing estimated treatment effects or predicted outcomes for each treatment combination. By supplying newdata, researchers can compute them for any sample, not only the one used to fit the model.
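The effect of the `sort`, `decreasing`, and `unique` arguments on the returned rows can be pictured with plain data-frame operations. The sketch below works on a hypothetical table of estimates (`est` and its values are made up) and mimics the documented behaviour in base R; it is not the package's internal code, and the column names are assumptions.

```r
# Hypothetical per-observation estimates; in practice these would come
# from predict() applied to a FindIt fit.
est <- data.frame(
  id = 1:6,
  treatment.effect = c(0.12, -0.05, 0.30, 0.12, -0.05, 0.08),
  age = c(25, 40, 31, 25, 40, 52)
)

# Analogue of sort = TRUE, decreasing = TRUE: largest estimates first.
ord <- order(est$treatment.effect, decreasing = TRUE)
sorted <- est[ord, ]

# Top and bottom of the ranking, as head()/tail() show in the FindIt example.
top2 <- head(sorted, n = 2)
bottom2 <- tail(sorted, n = 2)

# Analogue of unique = TRUE: keep one row per distinct profile.
profiles <- sorted[!duplicated(sorted[, c("treatment.effect", "age")]), ]

print(top2)
print(bottom2)
print(profiles)
```

Sorting before calling `head()` and `tail()` is what makes the "Top 10" and "Bottom 10" listings in the FindIt example meaningful.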
data |
A matrix of estimated treatment effects when
|
Naoki Egami, Marc Ratkovic and Kosuke Imai.
## See the help page for FindIt() for an example.
Summarizing CausalANOVA output
## S3 method for class 'CausalANOVA'
summary(object, digit = 4, verbose = TRUE, verbose.full = TRUE, ...)
object |
An object from |
digit |
the number of digits |
verbose |
report additional summary results |
verbose.full |
report full summary results |
... |
Other parameters |
Summarizing FindIt output
## S3 method for class 'FindIt'
summary(object, ...)
object |
An object from |
... |
Other parameters |
test.CausalANOVA estimates the AMEs and AMIEs with confidence intervals after regularization with the CausalANOVA function.
test.CausalANOVA(
  fit,
  newdata,
  collapse.level = TRUE,
  diff = FALSE,
  pair.id = NULL,
  cluster = NULL
)
fit |
The output from |
newdata |
A data frame to use for re-estimating the AMEs and AMIEs with confidence intervals. |
collapse.level |
A logical indicating whether to collapse insignificant levels within factors as suggested by the |
diff |
A logical indicating whether the outcome is the choice between a pair. If |
pair.id |
(optional). Unique identifiers for each pair of comparison. This option is used when |
cluster |
Unique identifiers with which cluster standard errors are computed. |
See Details in CausalANOVA.
fit |
The output of class |
Naoki Egami and Kosuke Imai.
Egami, Naoki and Kosuke Imai. 2019. “Causal Interaction in Factorial Experiments: Application to Conjoint Analysis.” Journal of the American Statistical Association, Vol.114, No.526 (June), pp. 529–540. http://imai.fas.harvard.edu/research/files/int.pdf
Lim, M. and Hastie, T. 2015. “Learning interactions via hierarchical group-lasso regularization.” Journal of Computational and Graphical Statistics 24, 3, 627–654.
Post, J. B. and Bondell, H. D. 2013. “Factor selection and structural identification in the interaction anova model.” Biometrics 69, 1, 70–79.
## #######################################
## With Screening and Collapsing
## #######################################
data(Carlson)

## Specify the order of each factor
Carlson$newRecordF <- factor(Carlson$newRecordF, ordered = TRUE,
                             levels = c("YesLC", "YesDis", "YesMP",
                                        "noLC", "noDis", "noMP", "noBusi"))
Carlson$promise <- factor(Carlson$promise, ordered = TRUE,
                          levels = c("jobs", "clinic", "education"))
Carlson$coeth_voting <- factor(Carlson$coeth_voting, ordered = FALSE,
                               levels = c("0", "1"))
Carlson$relevantdegree <- factor(Carlson$relevantdegree, ordered = FALSE,
                                 levels = c("0", "1"))

## Sample Splitting
train.ind <- sample(unique(Carlson$respcodeS), 272, replace = FALSE)
test.ind <- setdiff(unique(Carlson$respcodeS), train.ind)
Carlson.train <- Carlson[is.element(Carlson$respcodeS, train.ind), ]
Carlson.test <- Carlson[is.element(Carlson$respcodeS, test.ind), ]

#################### AMEs and two-way AMIEs ####################
fit.r2 <- CausalANOVA(formula = won ~ newRecordF + promise + coeth_voting + relevantdegree,
                      data = Carlson.train,
                      pair.id = Carlson.train$contestresp, diff = TRUE,
                      screen = TRUE, collapse = TRUE,
                      cluster = Carlson.train$respcodeS, nway = 2)
summary(fit.r2)

## refit with test.CausalANOVA
fit.r2.new <- test.CausalANOVA(fit.r2, newdata = Carlson.test, diff = TRUE,
                               pair.id = Carlson.test$contestresp,
                               cluster = Carlson.test$respcodeS)
summary(fit.r2.new)
plot(fit.r2.new)
plot(fit.r2.new, type = "ConditionalEffect",
     fac.name = c("newRecordF", "coeth_voting"))
ConditionalEffect(fit.r2.new, treat.fac = "newRecordF", cond.fac = "coeth_voting")
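The sample-splitting step in the example draws respondent identifiers rather than rows, and the split changes on every run because no seed is set. A minimal sketch of the same idea on made-up data, with a fixed seed for reproducibility, might look like this. The data frame `toy`, its column names, and the seed value are all hypothetical; only the split-by-respondent pattern is taken from the example.

```r
set.seed(2019)  # arbitrary seed, fixed only so the split is reproducible

# Hypothetical conjoint-style data: 10 respondents, 4 profile rows each.
toy <- data.frame(
  resp = rep(sprintf("r%02d", 1:10), each = 4),
  won  = rbinom(40, 1, 0.5)
)

# Sample respondents (not rows) so that every row from one respondent
# ends up on the same side of the split.
ids <- unique(toy$resp)
train.ind <- sample(ids, size = length(ids) / 2, replace = FALSE)
test.ind <- setdiff(ids, train.ind)

toy.train <- toy[toy$resp %in% train.ind, ]
toy.test <- toy[toy$resp %in% test.ind, ]

# No respondent appears in both halves.
print(intersect(toy.train$resp, toy.test$resp))
```

Splitting at the respondent level keeps the test half independent of the half used for screening and collapsing, which is the purpose of refitting with test.CausalANOVA.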