Unfortunately, I don't think we have an automated procedure for everything. You would have to multiply impute the data, do matching on each imputed data set, and then combine them in zelig() using the mi() function. But this does not require any programming. You can simply run the same matching procedure on each data set via matchit() and then feed the resulting matched data sets into zelig().
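In outline, that workflow might look like the following (an illustrative sketch only, assuming the Amelia, MatchIt, and Zelig 3.x APIs; the data set and all variable names are hypothetical):

```r
## Sketch only: multiply impute, run the same matching on each imputed
## data set, then pass all matched sets to zelig() via mi().
## mydata, treat, x1, x2, y are hypothetical names.
library(Amelia)
library(MatchIt)
library(Zelig)

a.out <- amelia(mydata, m = 5)                  # multiple imputation
matched <- lapply(a.out$imputations, function(d) {
  m.out <- matchit(treat ~ x1 + x2, data = d)   # same procedure per set
  match.data(m.out)                             # extract matched data
})

z.out <- zelig(y ~ treat + x1 + x2, model = "ls",
               data = mi(matched[[1]], matched[[2]], matched[[3]],
                         matched[[4]], matched[[5]]))
```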
Good luck,
Kosuke
Department of Politics
Princeton University
http://imai.princeton.edu
On Sep 13, 2011, at 6:02 PM, Pingaul jb wrote:
> Dear Professor,
> I’m a post-doctoral student at Montreal University. I’m currently in Columbia, working on propensity scores with a colleague and using MatchIt and Zelig. First, congratulations on your packages, which are very flexible.
>
> My question is about multiple imputation and propensity scores with these packages. From what I understand, combining both approaches would involve:
>
> 1/ Doing multiple imputation and testing which variables to include.
>
> 2/ Propensity score analysis on each imputed data set, pooling the overall balance to check whether it is acceptable (or checking balance on each data set?).
>
> 3/ Calculation of the quantities of interest for each data set.
>
> 4/ Pooling the quantities across data sets.
>
> I would like to know if there is a written syntax to perform the MatchIt analysis on all of the imputed data sets, without having to do it manually, and to check the overall balance. Also, in theory, the number of individuals retained after propensity score matching and the weights can differ across imputed data sets, so do we have to perform the final analysis on each one and then pool the data with a specific procedure that takes the possibly varying Ns into account? I normally use the mice package for multiple imputation, but it seems that Zelig handles Amelia. My colleague seems to be able to do all of that in Stata, but I’m not sure how to make the three R packages work together.
>
> I would be very happy if you could point me to a reference or a place where I can find the syntax to do that (I’ve been using R for some time, so I can use packages easily, but I have no programming skills).
>
>
> Best Regards!
>
>
>
> Jean-Baptiste
>
-
Zelig Mailing List, served by Harvard-MIT Data Center
Send messages: zelig(a)lists.gking.harvard.edu
[un]subscribe Options: http://lists.gking.harvard.edu/?info=zelig
Zelig program information: http://gking.harvard.edu/zelig/
Please let me know how I can install the ZeligMultinomial package. I want to
use mlogit, which according to the manual (page 50) is found in that
package.
I tried the command:
install.packages("ZeligMultinomial", repos="http://r.iq.harvard.edu/",
type="source")
But received the following message:
Warning: dependency 'MNP' is not available
trying URL 'http://r.iq.harvard.edu/src/contrib/ZeligMultinomial_0.5-4.tar.gz'
Content type 'application/x-gzip' length 9730 bytes
opened URL
==================================================
downloaded 9730 bytes
During startup - Warning messages:
1: Setting LC_CTYPE failed, using "C"
2: Setting LC_TIME failed, using "C"
3: Setting LC_MESSAGES failed, using "C"
4: Setting LC_PAPER failed, using "C"
ERROR: dependency 'MNP' is not available for package 'ZeligMultinomial'
* removing
'/Library/Frameworks/R.framework/Versions/2.14/Resources/library/ZeligMultinomial'
The downloaded packages are in
'/private/var/folders/UL/ULLu+bi5GR8IM1G-7XZBJU+++TI/-Tmp-/Rtmp6jw6M9/downloaded_packages'
Warning message:
In install.packages("ZeligMultinomial", repos = "http://r.iq.harvard.edu/",
:
installation of package 'ZeligMultinomial' had non-zero exit status
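The key line above is the dependency error. For reference, assuming MNP is available on CRAN, one possible fix is to install the missing dependency first and then retry the source install:

```r
## Install the missing dependency (MNP) from CRAN first,
## then retry the source install of ZeligMultinomial.
install.packages("MNP")
install.packages("ZeligMultinomial",
                 repos = "http://r.iq.harvard.edu/", type = "source")
```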
***
Zelig experts,
[I apologize in advance for the long email.]
I am working with a colleague to figure out a good way to place a
confidence interval around an average treatment effect for the treated
(ATT) when using the Zelig sim(z.out) function and sample size is not
particularly large. It would be great to get some advice about whether
we are correctly using sim() and whether our alternative approach
makes sense.
My (cursory) understanding of sim() is that the ATT point estimate
and confidence interval are based on the posterior distribution of the
1,000 conditional expected values for the counterfactual. One concern
is that this can produce an appropriate confidence interval
asymptotically but may be too narrow in finite samples.
As context, we were asked to estimate the effect of attending a magnet
school versus a comparison school (for simplicity, assume strong
ignorability, even though it probably doesn't hold, and set aside the
within-school nesting) on a test score. We tried to equate the treatment
and control groups using inverse probability of treatment weighting
(IPTW). After running
x.out1 <- setx(z.out, data =data.t, cond = TRUE)
s.out <- sim(z.out, x = x.out1)
We get the following:
> summary(s.out)
Model: ls
Number of simulations: 1000
Mean Values of Observed Data (n = 156)
(Intercept) ZMath1011 ZRead1011 ZWrite1011
1.00000000 0.06322742 0.05521872 -0.28610638
Pooled Expected Values: E(Y|X)
mean sd 2.5% 97.5%
0.03321294 0.80429472 -1.56758254 1.44453357
Pooled Average Treatment Effect for the Treated: Y - EV
mean sd 2.5% 97.5%
0.05244464 0.01945088 0.01379341 0.08863216
We're worried the sd of 0.02 only reflects between-imputation variance
and not within-sample variance, so we pulled out the expected value
matrix and recalculated the ATT & standard error treating the 1,000
expected values as 1,000 multiply imputed data sets and then used
Little & Rubin combination rules to get total variance:
> ## Merge Expected Values with Main Treatment-Unit Data File ##
> id<-data.t$SASID # vector of student ids
> ev<-s.out$qi$ev # matrix of expected values & students
>
> datar <- NULL
> for (i in 1:ncol(ev)) { # loop over each treatment student
+ tmp <- cbind(c(1:nrow(ev)),rep(id[i],nrow(ev)),ev[,i])
+ datar <- rbind(datar,tmp)
+ }
>
> datar <- data.frame(datar)
> names(datar) <- c("m","SASID","EV")
>
> datat<-merge(datar,data.t, by="SASID") # merge with main data set
>
> ## Calculate ATT ##
>
> datat$ATT<-datat$ZMath1112-datat$EV # individual level effect
> att.m<-aggregate(datat$ATT,by=list(datat$m),mean) # mean ATT per imputation
> att.v<-aggregate(datat$ATT,by=list(datat$m),var) # variance of ATT per imputation
>
> W <- mean(att.v$x) # average within variance
> B <- sum((att.m$x-mean(att.m$x))^2)/(nrow(ev)-1) # between variance
> T <- sqrt(W/ncol(ev)) + (1+(1/nrow(ev)))*B # total standard error
>
> # ATT point estimate & standard error #
> mean(att.m$x); T
[1] 0.05244464
[1] 0.02913942
> # ATT confidence interval #
> mean(att.m$x)-2*T; mean(att.m$x)+2*T
[1] -0.005834205
[1] 0.1107235
So using this approach returns the same point estimate, but a somewhat
larger standard error (0.029 vs. 0.019). As a point of reference, if
you just run a regression on the full sample (weighted by IPTW) you
get ATT=0.053 (se=0.031).
We would like to estimate the ATT for different subgroups as well as
the overall ATT, and sample size will really become an issue for some
subgroups. Our main question is whether you think our approach is
appropriate or whether we should stick with the sd & confidence
interval produced by sim(z.out) ... or if there's something better we
should do.
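For reference, the standard combining rules give total variance T = W + (1 + 1/m)B and a standard error of sqrt(T), where W is the average within-imputation variance, B the between-imputation variance, and m the number of imputations. A toy base-R illustration with made-up numbers:

```r
## Toy illustration of the combining rules; est and se are made-up
## per-imputation point estimates and standard errors.
est <- c(0.05, 0.06, 0.04, 0.055, 0.045)   # per-imputation estimates
se  <- c(0.02, 0.021, 0.019, 0.02, 0.022)  # per-imputation SEs
m <- length(est)
W <- mean(se^2)                    # average within-imputation variance
B <- var(est)                      # between-imputation variance
total.se <- sqrt(W + (1 + 1/m) * B)
```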
Thank you,
Jordan Rickles
Hi,
I'm trying to run a weighted multinomial logit, but the "mlogit" family
doesn't allow weights. I tried to implement an external function from the
VGAM package [not multinomial(), which seems to be Zelig's built-in, but
vglm()] using the zelig2 mechanism, but I couldn't make it work.
Is there any other way to use weights in an mlogit model?
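For reference, one possible route outside Zelig, assuming VGAM's vglm() accepts prior weights (the data set and all variable names here are hypothetical):

```r
## Sketch only: a weighted multinomial logit fit directly with VGAM,
## bypassing Zelig; mydata, choice, x1, x2, w are hypothetical names.
library(VGAM)

fit <- vglm(choice ~ x1 + x2, family = multinomial(),
            weights = w, data = mydata)   # w = prior weights column
summary(fit)
```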
thanks,
Rogério J. Barbosa
Researcher at Centre for Metropolitan Studies/Cebrap
São Paulo - Brazil
The zelig() and setx() functions are working fine for me but I am having trouble with the sim() command.
I get the following error:
Error in mvrnorm(num, mu = coef(object), Sigma = vcov(object)) : incompatible arguments
I am running a negative binomial model with a large number of covariates (about 300) and a large data set (about 150,000 observations).
I would be interested in bootstrapping as an alternative, but I need a way to bootstrap with smaller subsets of my dataset or I run out of memory, and I am not sure how to do this either.
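One sketch of that idea, an m-out-of-n bootstrap that refits on random subsamples to keep memory manageable, assuming MASS::glm.nb (the data set and variable names are hypothetical):

```r
## Sketch only: bootstrap SEs from subsamples (m-out-of-n bootstrap);
## mydata, y, x1, x2 are hypothetical names.
library(MASS)

n.sub  <- 10000   # subsample size, well below the full ~150,000
n.boot <- 200     # number of bootstrap replicates

boot.coefs <- replicate(n.boot, {
  idx <- sample(nrow(mydata), n.sub, replace = TRUE)
  coef(glm.nb(y ~ x1 + x2, data = mydata[idx, ]))  # refit on subsample
})
## rescale the subsample SEs to the full sample size, since the
## estimator's variance shrinks roughly like 1/n
apply(boot.coefs, 1, sd) * sqrt(n.sub / nrow(mydata))
```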
Running R version 2.15.0.
Using Zelig (Version 3.5.5, built: 2010-01-20)
Thanks!
________________________________
Mr. Louis Merlin, AICP
Doctoral Student
UNC CH Department of City and Regional Planning