Zelig September 2011

zelig@lists.gking.harvard.edu

4 participants
6 discussions

Re: MatchIt Zelig and multiple imputation

by Kosuke Imai

Unfortunately, I don't think we have an automated procedure for everything. You would have to multiply impute the data, do matching on each imputed data set, and then combine the results in zelig() using the mi() function. But this does not require any programming. You can simply run the same matching procedure on each data set via matchit() and then feed the resulting multiple matched data sets into zelig().

Good luck,
Kosuke

Department of Politics
Princeton University
http://imai.princeton.edu

On Sep 13, 2011, at 6:02 PM, Pingaul jb wrote:

> Dear Professor,
> I’m a post-doctoral student at Montreal University. I’m actually in Columbia, working on propensity scores with a colleague and using MatchIt and Zelig. First, congratulations on your packages, which are very flexible.
>
> My question is about multiple imputation and propensity scores with this software. From what I understand, combining both approaches would include:
>
> 1/ Doing multiple imputation and testing which variables to include.
>
> 2/ Propensity score analysis on each imputed data set and pooling the overall balance to check if it is ok (or on each data set?).
>
> 3/ Calculation of the quantities of interest for each data set.
>
> 4/ Pooling the quantities across data sets.
>
> I would like to know if there is a written syntax to perform the MatchIt analysis for all of the imputed data sets without having to do it manually, and to check the overall balance. Also, in theory, the number of individuals retained after propensity score matching and the weights can be different for each imputed data set. So do we have to perform the final analysis on each one and then pool the results with a specific procedure to take into account the possibly varying Ns? I normally use the Mice package for multiple imputation but it seems that Zelig handles Amelia. My colleague seems to be able to do all that in Stata, but I’m not sure how to make all three R packages work together.
>
> I would be very happy if you could point me to a reference or a place where I can find the syntax to do that (I’ve been using R for some time so I can use packages easily, but I have no programming skills).
>
> Best Regards!
>
> Jean-Baptiste

- Zelig Mailing List, served by Harvard-MIT Data Center
Send messages: zelig(a)lists.gking.harvard.edu
[un]subscribe Options: http://lists.gking.harvard.edu/?info=zelig
Zelig program information: http://gking.harvard.edu/zelig/
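Steps 2 and 3 of the question (match within each imputed data set, then estimate on each matched sample) can be sketched as a loop. This is only an illustration under stated assumptions, not the procedure Kosuke describes: it swaps plain lm() in for zelig() with mi(), the object `imputed` and the helper `fit_one` are hypothetical names, and three copies of the lalonde data stand in for real imputations from Amelia or mice. Pooling (step 4) is left out.

```r
library(MatchIt)  # provides matchit(), match.data() and the lalonde data

data(lalonde)

# Stand-in for the output of an imputation package: a plain list of
# completed data frames. Copies of lalonde are used only so this runs.
imputed <- list(lalonde, lalonde, lalonde)

# Hypothetical helper: match and estimate within one completed data set.
fit_one <- function(d) {
  # Step 2: nearest-neighbour matching on this data set.
  m.out <- matchit(treat ~ age + educ + married + nodegree + re74 + re75,
                   data = d, method = "nearest")
  md <- match.data(m.out)  # matched sample, includes a "weights" column

  # Step 3: outcome model on the matched sample, using the matching weights.
  fit <- lm(re78 ~ treat + age + educ + married + nodegree + re74 + re75,
            data = md, weights = weights)

  c(estimate = unname(coef(fit)["treat"]),
    variance = unname(vcov(fit)["treat", "treat"]))
}

# One row per imputed data set: treatment coefficient and its variance.
per_dataset <- t(sapply(imputed, fit_one))
per_dataset
```

Because matching is redone inside each data set, the number of retained units and the weights are free to differ from one imputation to the next, which is the situation the question anticipates.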

10 years, 8 months

summary with imputations bug ... any workarounds?

by Peter Flom

Good morning

If I run

<<<
susan.lsmixed.out <- zelig(formula = unprot_vag_sex ~ married + age + TREATMENT.ARM*time + highest_grade + income + tag(1|id),
                           data = susanMI.out$imputations,
                           model = "ls.mixed")
summary(susan.lsmixed.out)
>>>>

I get an error

Error in x$coef : $ operator is invalid for atomic vectors

Searching the archives, I see that others have had similar problems. Is there a workaround?

summary(susan.lsmixed.out[[1]]) works fine; should I then average across the five imputed data sets?

thanks!

Peter

Peter L. Flom, PhD
Statistical Consultant
Website: http://www DOT statisticalanalysisconsulting DOT com/
Writing: http://www.associatedcontent.com/user/582880/peter_flom.html
Twitter: @peterflom
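On the question of averaging across the five imputed data sets: the usual way to combine per-imputation results is Rubin's rules, where the point estimates are averaged but the variance also includes a between-imputation term. The sketch below uses base R only; `pool_rubin` is a hypothetical helper name and the numbers are made up for illustration. It is not a fix for the summary() error itself.

```r
# Combine per-imputation results with Rubin's rules.
# q: point estimates, one per imputed data set
# u: squared standard errors (variances), one per imputed data set
pool_rubin <- function(q, u) {
  m     <- length(q)               # number of imputations
  qbar  <- mean(q)                 # pooled point estimate
  ubar  <- mean(u)                 # within-imputation variance
  b     <- var(q)                  # between-imputation variance
  total <- ubar + (1 + 1 / m) * b  # total variance
  df    <- (m - 1) * (1 + ubar / ((1 + 1 / m) * b))^2
  list(estimate = qbar, variance = total, se = sqrt(total), df = df)
}

# Made-up values for one coefficient across three imputations.
pooled <- pool_rubin(q = c(1, 2, 3), u = c(0.5, 0.5, 0.5))
pooled$estimate
pooled$se
```

In practice q and u would be read off each per-imputation fit, for example from summary(susan.lsmixed.out[[i]]), one coefficient at a time.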

12 years, 1 month

mixed.probit, gamma specification

by Jason McMann

Hello,

I am trying to run a multilevel probit model using Zelig, but keep receiving the following error message:

" in .deparseTag(TT.vars[[vind]]) : wrong use of tag function!!"

A simplified version of the model I am trying to run is:

z.out <- zelig(formula = list(mu = investment.binary ~ edlevel + tag(1 + edlevel, gamma | country),
                              gamma = ~ tag(GDPpc06.full | country)),
               data = data2006.mod1, model = "probit.mixed")

What I would like to do is allow the intercept and the edlevel variable listed within the first tag() to vary by country as a function of the GDPpc06.full variable, all of which are included in the same data frame. I followed the syntax here - http://cran.r-project.org/web/packages/Zelig/vignettes/probit.mixed.pdf - but I think that I am incorrectly specifying the gamma part of the syntax, which may be causing the error.

I *am* able to get the model to run when I allow the intercept and edlevel variable to vary using the following syntax:

z.out <- zelig(investment.binary ~ edlevel + tag(1 + edlevel | country),
               data = data2006.mod1, model = "probit.mixed")

However, this syntax does not allow me to specify that the intercept and edlevel variable should vary as a function of GDPpc06.full, as in the first model specified above. I have tried including multiple tags at the non-group level of the model specification - i.e. one for the intercept and one for the edlevel variable - but this does not seem to work either.

Do you have any suggestions for how to fix the syntax?

Sincerely,
Jason

--
Jason I. McMann
PhD Student | Department of Politics
Princeton University | jmcmann(a)princeton.edu

12 years, 5 months

Re: MatchIt Zelig and multiple imputation

by Kosuke Imai

The argument "weights" takes the name of the variable. So, you should try something like:

weights = "weights"

Best,
Kosuke

Department of Politics
Princeton University
http://imai.princeton.edu

On Sep 19, 2011, at 11:45 AM, Pingaul jb wrote:

> Dear professor,
> Thanks for your answer! I finally built on MatchIt to write quick functions to help with matching under multiple imputation (equivalents of matchit, summary and match.data). I don't think they are very elegant, but I am sending them to you anyway now that they are done (with a csv file of data as an example).
> More importantly, I get an error with a syntax adapted from Ho et al. (2011) that uses MatchIt to calculate the ATT. The syntax in the article uses method="nearest" without replacement. I tried with replacement. Therefore, it seems I need to introduce weights when estimating the model on the controls. But when I apply the resulting model to the treated, I get a problem with differing variable lengths for the weights. To make sure the control group is well matched I think I must introduce the weights anyway, but I’m unsure how to do it. Below is my syntax with the lalonde data.
>
> library(MatchIt)
> library(Zelig)
> data(lalonde)
> m.out0 <- matchit(treat ~ age + educ + black + hispan + nodegree + married + re74 + re75, method = "nearest",replace=T, data = lalonde)
> datacontrol= match.data(m.out0, "control")
> summary(m.out0)
> datatreat=match.data(m.out0, "treat")
> z.out1 <- zelig(re78 ~ age + educ + black + hispan + nodegree + married + re74 + re75, data = datacontrol,weights=datacontrol$weights, model = "ls")
> x.out1 <- setx(z.out1, data = datatreat, cond = TRUE)
> s.out1 <- sim(z.out1, x = x.out1)
>
> Best regards,
>
> Jean-Baptiste
>
> --- On Thu 15.9.11, Kosuke Imai <kimai(a)Princeton.EDU> wrote:
>
> From: Kosuke Imai <kimai(a)Princeton.EDU>
> Subject: Re: MatchIt Zelig and multiple imputation
> To: "Pingaul jb" <pingaultjb(a)yahoo.fr>
> Cc: "matchit" <matchit(a)lists.gking.harvard.edu>, "zelig(a)lists.gking.harvard.edu" <zelig(a)lists.gking.harvard.edu>
> Date: Thursday, 15 September 2011, 5:03
>
> Unfortunately, I don't think we have an automated procedure for everything. You would have to multiply impute the data, do matching on each imputed data set, and then combine it in zelig() using mi() function. But this does not require any programming. You can simply run the same matching procedure on each data set via matchit() and then feed the resulting multiple matched data sets into zelig().
>
> Good luck,
> Kosuke
>
> Department of Politics
> Princeton University
> http://imai.princeton.edu

<MatchItMI.txt><DataExample.csv>

12 years, 7 months

Re: Problem with bprobit

by Kosuke Imai

It's possible that as it is implemented, the bprobit in R does not take an endogenous variable...

Best,
Kosuke

Department of Politics
Princeton University
http://imai.princeton.edu

On Sep 20, 2011, at 2:06 AM, Stefanie Schurer wrote:

> Hi Kosuke,
>
> I have a problem with Zelig’s bivariate probit model. I am trying to replicate a study by Carrasco 2001, JBES, which jointly estimates labour supply and fertility with a bivariate probit model. I can replicate her binary response results with R, but once I use Zelig’s bprobit command I get 14 error messages and totally nonsensical results. The problem is that when estimating the exact same model with Stata’s biprobit command, I am able to replicate Carrasco’s results. Is there any known bug in bprobit that I am not aware of?
>
> This is what I programmed (please note that this is a recursive model in which f = fertility is an endogenous RHS variable, and thus is separately modelled in the second equation, using an instrument for identification, “dsex”).
>
> Any help would be highly appreciated as I intend to teach this to my third year econometrics students next Thursday.
>
> Cheers,
> Stefi
>
> ######
>
> fml <- list(mu1 = dhw ~ f + ags26l + fxag26l + educ2 + educ3 + drace + age + income + dhwl, mu2 = f ~ ags26l + educ2 + educ3 + drace + age + income + dsex)
> z.out <- zelig(fml, model = "blogit", data = mydata)
> z.out
>
> Below are the errors I get
>
> Warning messages:
> 1: glm.fit: algorithm did not converge
> 2: In checkwz(wz, M = M, trace = trace, wzeps = control$wzepsilon) :
>   805 elements replaced by 1.819e-12
> 3: In tfun(mu = mu, y = y, w = w, res = FALSE, eta = eta, ... :
>   fitted values close to 0 or 1
> 4: In checkwz(wz, M = M, trace = trace, wzeps = control$wzepsilon) :
>   2064 elements replaced by 1.819e-12
> 5: In tfun(mu = mu, y = y, w = w, res = FALSE, eta = eta, ... :
>   fitted values close to 0 or 1
> 6: In tfun(mu = mu, y = y, w = w, res = FALSE, eta = eta, ... :
>   fitted values close to 0 or 1
> 7: In tfun(mu = mu, y = y, w = w, res = FALSE, eta = eta, ... :
>   fitted values close to 0 or 1
> 8: In tfun(mu = mu, y = y, w = w, res = FALSE, eta = eta, ... :
>   fitted values close to 0 or 1
> 9: In tfun(mu = mu, y = y, w = w, res = FALSE, eta = eta, ... :
>   fitted values close to 0 or 1
> 10: In tfun(mu = mu, y = y, w = w, res = FALSE, eta = eta, ... :
>   fitted values close to 0 or 1
> 11: In tfun(mu = mu, y = y, w = w, res = FALSE, eta = eta, ... :
>   fitted values close to 0 or 1
> 12: In tfun(mu = mu, y = y, w = w, res = FALSE, eta = eta, ... :
>   fitted values close to 0 or 1
> 13: In tfun(mu = mu, y = y, w = w, res = FALSE, eta = eta, ... :
>   fitted values close to 0 or 1
> 14: In eval(expr, envir, enclos) :
>   iterations terminated because half-step sizes are very small

12 years, 7 months

Problem in specifying a proxy variable in setx()

by Francois Maurice

Hi,

I'm trying to do elegant coding, but I have trouble with setx(). I defined two objects: tx.var, which contains the name of the treatment variable, and causal.model, which contains the model used in zelig() (see the code below). Everything works fine except in setx(). When I specify tx.var instead of TREAT, which is the name of the treatment variable, sim() produces zero effect. But when I specify TREAT, sim() produces a quantity. Is there something I can do to correct this?

tx.var <- c("TREAT")
causal.model <- QASBAT ~ TREAT + T + T2 + T3 + tag(T | ID.factor)

(some other code here)

z.out.1 <- zelig(formula = as.formula(causal.model), data = matched.1.mtch.long, model = "ls.mixed")
x.out.0.1 <- setx(z.out.1, fn = NULL, tx.var = 0)
x.out.1.1 <- setx(z.out.1, fn = NULL, tx.var = 1)
s.out.1 <- sim(z.out.1, x = x.out.0.1, x1 = x.out.1.1)

Thanks,

François Maurice, B. Sc., A. Stat.
Master's candidate
Department of Sociology
Université de Montréal
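A likely reason for the zero effect is how R handles argument names: writing tx.var = 0 in a call passes an argument literally named "tx.var", not one named after the value stored in tx.var, so both setx() calls may end up describing the same covariate profile. The sketch below shows the general pattern with do.call() and setNames() in base R. `arg_names` is a hypothetical stand-in for setx() so the example runs without Zelig, and the final commented line is an untested adaptation of the call in the message.

```r
# Hypothetical stand-in for setx(): reports the names of the extra
# arguments it received, so the example runs without Zelig installed.
arg_names <- function(obj, ...) names(list(...))

tx.var <- "TREAT"

# Written directly, the argument is literally called "tx.var".
direct <- arg_names("model", tx.var = 0)
direct

# Building the argument list first lets the value of tx.var become the name.
args0 <- c(list("model"), setNames(list(0), tx.var))
via_do_call <- do.call(arg_names, args0)
via_do_call

# Same pattern applied to the call in the message (untested sketch):
# x.out.0.1 <- do.call(setx, c(list(z.out.1, fn = NULL), setNames(list(0), tx.var)))
```

The same construction with setNames(list(1), tx.var) would give the treated profile, keeping the treatment variable's name in one place.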

12 years, 8 months
