Zelig November 2011

zelig@lists.gking.harvard.edu

8 participants
9 discussions

Re: MatchIt Zelig and multiple imputation

by Kosuke Imai

Unfortunately, I don't think we have an automated procedure for everything. You would have to multiply impute the data, do matching on each imputed data set, and then combine it in zelig() using mi() function. But this does not require any programming. You can simply run the same matching procedure on each data set via matchit() and then feed the resulting multiple matched data sets into zelig(). Good luck, Kosuke Department of Politics Princeton University http://imai.princeton.edu On Sep 13, 2011, at 6:02 PM, Pingaul jb wrote: > Dear Professor, > I’m a post-doctoral student at Montreal University. I’m actually in Columbia, working and propensity scores with a colleague and using MatchIt and Zelig. First, congratulations for your packages that are very flexible. > > My question is about multiple imputation and propensity scores with these softwares. From what I understand, combining both approaches would include: > > 1/ Doing multiple imputation and testing which variables to include. > > 2/ Propensity score analysis on each imputed data set and pooling the overall balance to check if it is ok (or on each data set?). > > 3/ Calculation of the quantities of interest for each data set > > 4/ Pooling the quantities across data sets. > > I would like to know if there is a written syntax to perform the MatchIt analysis for all of the imputed data set without having to do it manually and check the overall balance. Also, in theory, the number of individuals retained after propensity score matching and the weights can be different for each imputed data set. So that we have to perform the final analysis on each one and then pool the data with a specific procedure to take into account the eventual varying Ns? I normally use Mice package for multiple imputation but it seems that Zelig handle Amelia. My colleague seems to do be able to do all that in stata, but I’m not sure how to make all the three R packages work together. > > I would be very happy if you could indicate to me a reference or a place where I can find the syntax to do that (I’ve been using R for some times so I can use packages easily but I have no programming skills). > > > Best Regards! > > > > Jean-Baptiste > - Zelig Mailing List, served by Harvard-MIT Data Center Send messages: zelig(a)lists.gking.harvard.edu [un]subscribe Options: http://lists.gking.harvard.edu/?info=zelig Zelig program information: http://gking.harvard.edu/zelig/

10 years, 8 months

summary with imputations bug ... any workarounds?

by Peter Flom

Good morning If I run <<< susan.lsmixed.out <- zelig(formula = unprot_vag_sex ~ married + age + TREATMENT.ARM*time + highest_grade + income + tag(1|id), data = susanMI.out$imputations, model = "ls.mixed") summary(susan.lsmixed.out) >>>> I get an error Error in x$coef : $ operator is invalid for atomic vectors Searching the archives, I see that others have had similar problems. Is there a workaround? summary(susan.lsmixed.out[[1]]) works fine; should I then average across the five imputed data sets? thanks! Peter Peter L. Flom, PhD Statistical Consultant Website: http://www DOT statisticalanalysisconsulting DOT com/ Writing; http://www.associatedcontent.com/user/582880/peter_flom.html Twitter: @peterflom - Zelig Mailing List, served by Harvard-MIT Data Center Send messages: zelig(a)lists.gking.harvard.edu [un]subscribe Options: http://lists.gking.harvard.edu/?info=zelig Zelig program information: http://gking.harvard.edu/zelig/

12 years, 1 month

mixed.probit, gamma specification

by Jason McMann

Hello, I am trying to run a multilevel probit model using Zelig, but keep receiving the following error message: " in .deparseTag(TT.vars[[vind]]) : wrong use of tag function!!" A simplified version of the model I am trying to run is: z.out <- zelig(formula= list(mu=investment.binary ~ edlevel + tag(1 + edlevel, gamma | country), gamma = ~ tag(GDPpc06.full| country)), data=data2006.mod1, model="probit.mixed") What I would like to do is allow the intercept and the edlevel variable listed within the first tag() to vary by country as a function of the GDPpc06.full variable, all of which are included in the same dataframe. I followed the syntax here - http://cran.r-project.org/web/packages/Zelig /vignettes/probit.mixed.pdf - but I think that I am incorrectly specifying the gamma part of the syntax, which may be causing the error. I *am* able to get the model to run when I allow the intercept and edlevel variable to vary using the following syntax: z.out <- zelig(investment.binary ~ edlevel + + tag(1 + edlevel | country), data=data2006.mod1, model="probit.mixed") However, this syntax does not allow me to specify that the intercept and edlevel variable should vary as a function of GDPpc06.full, as in the first model specified above. I have tried including multiple tags at the non-group level of the model specification - i.e. one for the intercept and one for the edlevel variable - but this does not seem to work either. Do you have any suggestions for how to fix the syntax? Sincerely, Jason -- Jason I. McMann PhD Student | Department of Politics Princeton University | jmcmann(a)princeton.edu

12 years, 5 months

updating an amelia object

by Yph Lelkes

Hello all, Is there a way to update() an amelia object to use in Zelig? thanks, yph

12 years, 5 months

documentation updates

by Xu Wang

I'm just curious -- is the documentation (listed here http://gking.harvard.edu/sites/scholar.iq.harvard.edu/files/zelig.pdf) usually updated for the minor releases? That is, will it be updated for 3.5.3 or not for a while? Thank you, Xu Wang

12 years, 5 months

Zelig 3.5.3... ERROR due to VGAM? I'm using Mac OS X 10.7.2

by Mike Daly

Hello, I was using Zelig yesterday (Mac OS 10.7.2, R 2.13.0 x86_64-apple-darwin9.8.0/x86_64 (64-bit), Zelig 3.5.3, VGAM 0.8-4) and everything was working just fine. However, this morning I updated all of my R packages and now I cannot even run the sample commands in the Zelig manual (and my project is due soon!). I have no idea how to fix this -- what should I do? > data(mexico) > z.out1 <- zelig(as.factor(vote88) ~ pristr + othcok + othsocok, model = "mlogit", data = mexico) Error in dyn.load(file, DLLpath = DLLpath, ...) : unable to load shared object '/Library/Frameworks/R.framework/Versions/2.13/Resources/library/VGAM/libs/ x86_64/VGAM.so': dlopen(/Library/Frameworks/R.framework/Versions/2.13/ Resources/library/VGAM/libs/x86_64/VGAM.so, 6): Library not loaded: /usr/local/lib/libgfortran.2.dylib Referenced from: /Library/Frameworks/R.framework/Versions/2.13/Resources/library/VGAM/libs/ x86_64/VGAM.so Reason: image not found Error in loadModelDeps(model) : This model depends on package "VGAM". Please install this package and try again In addition: Warning message: package 'VGAM' was built under R version 2.13.2 Error: package/namespace load failed for 'VGAM' > install.packages("VGAM") trying URL ' http://software.rc.fas.harvard.edu/mirrors/R/bin/macosx/leopard/contrib/2.1… ' Content type 'application/x-gzip' length 4525473 bytes (4.3 Mb) opened URL ================================================== downloaded 4.3 Mb The downloaded packages are in /var/folders/0z/6lnh31r9211fldsd3hxl7zdw0000gn/T//Rtmpy6JZzF/downloaded_ packages Any help would be greatly appreciated. I **really** like the Zelig framework, but if I can't get this to work soon then unfortunately I'll have to use SAS instead, which would be a real bummer. Thanks! Mike -- Michael Daly, MSc University of California, Irvine, School of Medicine Medical Student, Year IV -- Research Year Beth Israel Deaconess Medical Center Dr. Charles Day's Orthopaedic Surgery Research Group mcdaly(a)bidmc.harvard.edu | (925) 997-2786

12 years, 5 months

Zelig 3.5-2 (Bug fixes for VGAM 0.8-4)

by Matt Owen

Hi All - Zelig has been updated to version 3.5-2 in order to correspond with a change in VGAM version 0.8-4. This change will only affect Windows and Unix users, as the new revision of VGAM is only available for these operating systems at the moment. To run the latest version of Zelig 3.5, the user must: 1. Install VGAM 0.8-4 (Windows and Linux/Unix only) 2. Install Zelig 3.5-2 from source within an R-session via 'install.packages': > install.packages("Zelig", type="source") (Note that the ">" symbol should be ignored when typing commands into an R-session.) Zelig 3.5-2 will be released for Mac OS X once VGAM 0.8-4 has a Mac Binary available on CRAN. If there are any questions/comments, please do not hesitate to contact the Zelig mailing list. Cheers - Matt email: mowen(a)iq.harvard.edu - Zelig Mailing List, served by Harvard-MIT Data Center Send messages: zelig(a)lists.gking.harvard.edu [un]subscribe Options: http://lists.gking.harvard.edu/?info=zelig Zelig program information: http://gking.harvard.edu/zelig/

12 years, 5 months

[Patch] fixing Zelig-3.5.1 bootstrap in order to run with lme4 and boot-1.3-3

by Hugo Mildenberger

Dear list members, in a recent thread of the R-sig-ME mailing list[1], Christoph Scherber communicated the following script which demonstrated a problem within Zelig's bootstrap code: require(Zelig) data(voteincome) z.out1 <- zelig(vote ~ education + age + female + tag(1 | state), data = voteincome, model = "logit.mixed") x <- setx(z.out1, education = 4) s.out1 <- sim(z.out1, x = x, bootstrap=TRUE) #Error in `*tmp*`$call : $ operator not defined for this S4 class This error stemmed from the implementation of bootfn.default() in Zelig/R/bootfn.default.R, which generally uses "object$call" instead of "object@call" (and therefore never worked with lme4) After having fixed this, the next problem became manifest when zelig::sim() was in bootstrapping mode. R could not find a summary method for the object passed to param.mer(). Why this happens, is still unclear to me. Anyway, when having fixed that bug along the lines of the non-bootstrapping branch of param.mer() by using selectMethod(), the next problem surfaced: Error in t.star[r, ] <- res[[r]] : incorrect number of subscripts on matrix After a while it became clear to me that the list returned by param.mer() was incompatible with the type boot::boot() expects (in /boot/R/bootfuns.q as of boot version 1.3-3): 193 } else lapply(seq_len(RR), fn) 194 t.star <- matrix(, RR, length(t0)) 195 for(r in seq_len(RR)) t.star[r, ] <- res[[r]] "t.star" is the name of the matrix boot::boot() uses to collect results from the statistic function. "fn" is the "statistic" parameter passed to boot::boot(), which in turn translates to bootfn.default() as defined in Zelig/R/bootfn.default.R. The function bootfn.default() in turn passes the result of param.mer(), which generally was "list(betas=betas, gammas=gammas, scale=scale)". length(t0) was three, because the list returned by param.mer() had three elements. The implicit type conversion of "t.star" (from matrix to list) R does after the first assignment to "t.star" then triggered the index error with the next assignment. A change of boot::boot() code in order to become compatible with Zelig appears to be out of question. However, in order to keep the default configuration working, param.mer() needs to return a list, because part of the result is a complete matrix (e.g."betas") then, not only a single row of a matrix. I therefore made the return type of param.mer(object,num,bootstrap,) dependent on "bootstrap", returning a list in the default case, and a one-dimensional array while bootstrapping. The related function qi.mer() in Zelig/R/qi.mixed.R is of course now able to handle both types. Since the column names appearing in the array returned by param.mer(..,bootstrap=T) are determined by the user controlled model specification, the name space separation a list provides implicitly had to be coded explicitly. The rudimentary regression testing I did for sim(,bootstrap=F) using patched code against the original version resulted in a quite close numerical agreement. I do attribute the differences (generally in the order of 1E-3) to a different RNG state, but I haven't examined this very closely. Generally, the effective code path of the default case should have remained unchanged. The attached patch is against the sources found in the Cran package Zelig_3.5.1.tar.gz Unrelated to this patch, plotting of simulated results generally does not work as intended: > plot(s.out1) Error in plot(density(qi), main = x$qi.name[[i]], xlab = xlab, ...): error in evaluating the argument 'x' in selecting a method for function 'plot': Error in density.default(qi) : argument 'x' must be numeric No plot generated for model logit.mixed Also, excluding the intercept using the ~ -1 notation is currently disregarded. Best regards Hugo Mildenberger [1]h ttps://stat.ethz.ch/pipermail/r-sig-mixed-models/2011q4/006933.html

12 years, 5 months

Zelig help

by Laura Lazarus Frankel

Hi All, Fledgling graduate student here. I'm trying to run zelig for a logit model that will run in the R base package, but no matter what I seem to do to the recoded variables (attach, detach, reattach, put them back into the dataset, create a new dataset, etc.) I keep getting the following error message: Error in `[.data.frame`(d, , all.vars(as.expression(formula))) : undefined columns selected The most concise form of the code is below. Thanks in advance! Laura library(foreign) library(car) library(survival) library(lmeSplines) library(MASS) library(nnet) library(nlme) library(reshape) library(plyr) library(grid) library(proto) library(ggplot2) library(lattice) gss<-read.dta('/Users/Laura/Dropbox/projects/PanelAttrition/GSS Data/GSS.dta') names(gss) summary(gss$panstat_2) summary(gss$panstat_3) gssattrition<-c() gssattrition[gss$panstat_3=='selected, eligible, but not reinterviewed']<-1 gssattrition[gss$panstat_2=='selected, eligible, but not reinterviewed']<-1 gssattrition[gss$panstat_3=='selected, eligible, and reinterviewed']<-0 summary(gssattrition) gssattrition<-c() gssattrition[gss$panstat_3=='selected, eligible, but not reinterviewed']<-1 gssattrition[gss$panstat_3=='attrited']<-1 gssattrition[gss$panstat_3=='selected, eligible, and reinterviewed']<-0 gssattrition[gss$panstat_2=='selected, but not eligible and not reinterviewd']<-NA gssattrition[gss$panstat_2=='selected, but not eligible and not reinterviewd because r was in institution']<-NA gssattrition[gss$panstat_2=='selected, but not eligible and not reinterviewd because r was in institution']<-NA gssattrition[gss$panstat_2=='selected, but not eligible and not reinterviewd because r was deceased']<-NA summary(gssattrition) gssfulltime<-c() gssfulltime[gss$wrkstat_1=='working fulltime']<-1 gssfulltime[gss$wrkstat_1=='working parttime']<-1 gssfulltime<-recode(gssfulltime, 'NA=0') gssforeign<-c() gssforeign[gss$born_1=='yes']<-0 gssforeign[gss$born_1=='no']<-1 gsshome<-c() summary(gss$dwelown_1) gsshome[gss$dwelown_1=='own or is buying']<-1 gsshome[gss$dwelown_1=='pays rent']<-0 gsshome[gss$dwelown_1=='other']<-0 summary(gsshome) gssrace<-c() gssrace[gss$race_1=='2']<-1 gssrace[gss$race_1=='1']<-0 gssrace[gss$race_1=='3']<-1 gssincome<-c() gssincome[gss$income06_1=='under $1 000']<-1 gssincome[gss$income06_1=='$1 000 to 2 999']<-1 gssincome[gss$income06_1=='$3 000 to 3 999']<-1 gssincome[gss$income06_1=='$4 000 to 4 999']<-1 gssincome[gss$income06_1=='$5 000 to 5 999']<-2 gssincome[gss$income06_1=='$6 000 to 6 999']<-2 gssincome[gss$income06_1=='$7 000 to 7 999']<-2 gssincome[gss$income06_1=='$8 000 to 9 999']<-3 gssincome[gss$income06_1=='$10000 to 12499']<-4 gssincome[gss$income06_1=='$12500 to 14999']<-5 gssincome[gss$income06_1=='$15000 to 17499']<-6 gssincome[gss$income06_1=='$17500 to 19999']<-6 gssincome[gss$income06_1=='$20000 to 22499']<-7 gssincome[gss$income06_1=='$22500 to 24999']<-7 gssincome[gss$income06_1=='$25000 to 29999']<-8 gssincome[gss$income06_1=='$30000 to 34999']<-9 gssincome[gss$income06_1=='$35000 to 39999']<-10 gssincome[gss$income06_1=='$50000 to 59999']<-12 gssincome[gss$income06_1=='$40000 to 49999']<-11 gssincome[gss$income06_1=='$60000 to 74999']<-13 gssincome[gss$income06_1=='$75000 to $89999']<-14 gssincome[gss$income06_1=='$90000 to $109999']<-15 gssincome[gss$income06_1=='$110000 to $129999']<-16 gssincome[gss$income06_1=='$130000 to $149999']<-17 gssincome[gss$income06_1=='$150000 or over']<-18 summary(gssincome) summary(gss$degree_1) gsseducation<-c() gsseducation[gss$degree_1=='lt high school']<-1 gsseducation[gss$degree_1=='high school']<-2 gsseducation[gss$degree_1=='junior college']<-3 gsseducation[gss$degree_1=='bachelor']<-4 gsseducation[gss$degree_1=='graduate']<-5 gssage<-gss$age_1 gssalone<-c() gssalone[gss$hompop_1=='1']<-1 gssalone<-recode(gssalone, 'NA=0') gssexperience<-gss$intyrs_1 gsslength<-gss$lngthinv_1 gssfee<-gss$feelevel_1 gssfee[gss$feeused_1=='no']<-0 gssface<-c() gssface[gss$mode_1=='in-person']<-1 gssface[gss$mode_1=='over the phone']<-0 gssface[gss$mode_1=='in-person/phone']<-0 gssintwoman<-c() summary(gss$intsex_1) gssintwoman[gss$intsex_1=='female']<-1 gssintwoman<-recode(gssintwoman, 'NA=0') gssintwhite<-c() gssintwhite[gss$intrace1_1=='white']<-1 gssintwhite<-recode(gssintwhite, 'NA=0') gsshispanic<-c() gsshispanic[gss$hispanic_1=='not hispanic']<-0 gsshispanic<-recode(gsshispanic, 'NA=1') gssracesame<-c() gssracesame[gss$intrace1_1=='white' & gss$racecen1_1=='white']<-1 gssracesame[gss$intrace1_1=='black or african american' & gss$racecen1_1=='black or african american']<-1 gssracesame[gss$intrace1_1=='asian indian' & gss$racecen1_1=='asian indian' ]<-1 gssracesame[gss$intrace1_1=='chinese' & gss$racecen1_1=='chinese']<-1 gssracesame[gss$intrace1_1=='filipino' & gss$racecen1_1=='filipino']<-1 gssracesame[gss$intrace1_1=='japanese' & gss$racecen1_1=='japanese']<-1 gssracesame[gss$intrace1_1=='korean' & gss$racecen1_1=='korean']<-1 gssracesame[gss$intrace1_1=='vietnamese' & gss$racecen1_1=='vietnamese']<-1 gssracesame[gss$intrace1_1=='other asian' & gss$racecen1_1=='other asian']<- 1 gssracesame[gss$intrace1_1=='native hawaiian' & gss$racecen1_1=='native hawaiian']<-1 gssracesame[gss$intrace1_1=='other pacific islander' & gss$racecen1_1=='other pacific islander']<-1 gssracesame[gss$intrace1_1=='some other race' & gss$racecen1_1=='some other race']<-1 gssracesame[gss$intrace1_1=='hispanic' & gss$racecen1_1=='hispanic']<-1 gssracesame[gsshispanic=='1' & gss$intethn=='hispanic']<-1 gssracesame<-recode(gssracesame, 'NA=0') gsssexsame<-c() gsssexsame[gss$sex_1=='1' & gssintwoman=='0']<-1 gsssexsame[gss$sex_1=='2'& gssintwoman=='1']<-1 gsssexsame<-recode(gsssexsame, 'NA=0') gsssex<-c() gsssex[gss$sex_1=='1']<-0 gsssex[gss$sex_1=='2']<-1 gssvote04<-c() gssvote04[gss$vote04_1=='voted']<-1 gssvote04[gss$vote04_1=='did not vote']<-0 summary(gssvote04) summary(gss$babies_1) gchildren<-gss$babies_1 summary(gchildren) gss$gssattriiton1<-gssattrition gss$gssage1<-gssage gss$gsssex1<-gsssex gss$gssrace1<-gssrace gss$gssfulltime1<-gssfulltime gss$gsseducation1<-gsseducation gss$gssattriiton1<-gssincome gss$gsshome1<-gsshome gss$gssfulltime1<-gssfulltime gss$gchildren1<-gchildren gss$gssforeign1<-gssforeign gss$gssattriiton1<-gssvote04 gss$gssalone1<-gssalone gss$gssexperience1<-gssexperience gss$gssintwoman1<-gssintwoman gss$gsssexsame1<-gsssexsame library(Zelig) gsslogit<- zelig(gssattrition1~gssage1+gsssex1+gssrace1+gssfulltime1+ gsseducation1+gssincome1+gsshome1+gssfulltime1+gchildren1+gssforeign1+ gssvote041+gssalone1+gssexperience1+gssintwoman1+gsssexsame1, model="logit" , data=gss) gss.data<-gss[,c(1833, 1855:1867)] names(gss.data) gsslogit<- zelig(gssattrition1~gssage1+gsssex1+gssrace1+gssfulltime1+ gsseducation1+gssincome1+gsshome1+gssfulltime1+gchildren1+gssforeign1+ gssvote041+gssalone1+gssexperience1+gssintwoman1+gsssexsame1, model="logit" , data=gss.data)

12 years, 5 months

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

2005

2004

2003

Zelig November 2011