Unfortunately, I don't think we have an automated procedure for everything. You would have to multiply impute the data, do matching on each imputed data set, and then combine it in zelig() using mi() function. But this does not require any programming. You can simply run the same matching procedure on each data set via matchit() and then feed the resulting multiple matched data sets into zelig().
Good luck,
Kosuke
Department of Politics
Princeton University
http://imai.princeton.edu
On Sep 13, 2011, at 6:02 PM, Pingaul jb wrote:
> Dear Professor,
> I’m a post-doctoral student at Montreal University. I’m actually in Columbia, working and propensity scores with a colleague and using MatchIt and Zelig. First, congratulations for your packages that are very flexible.
>
> My question is about multiple imputation and propensity scores with these softwares. From what I understand, combining both approaches would include:
>
> 1/ Doing multiple imputation and testing which variables to include.
>
> 2/ Propensity score analysis on each imputed data set and pooling the overall balance to check if it is ok (or on each data set?).
>
> 3/ Calculation of the quantities of interest for each data set
>
> 4/ Pooling the quantities across data sets.
>
> I would like to know if there is a written syntax to perform the MatchIt analysis for all of the imputed data set without having to do it manually and check the overall balance. Also, in theory, the number of individuals retained after propensity score matching and the weights can be different for each imputed data set. So that we have to perform the final analysis on each one and then pool the data with a specific procedure to take into account the eventual varying Ns? I normally use Mice package for multiple imputation but it seems that Zelig handle Amelia. My colleague seems to do be able to do all that in stata, but I’m not sure how to make all the three R packages work together.
>
> I would be very happy if you could indicate to me a reference or a place where I can find the syntax to do that (I’ve been using R for some times so I can use packages easily but I have no programming skills).
>
>
> Best Regards!
>
>
>
> Jean-Baptiste
>
-
Zelig Mailing List, served by Harvard-MIT Data Center
Send messages: zelig(a)lists.gking.harvard.edu
[un]subscribe Options: http://lists.gking.harvard.edu/?info=zelig
Zelig program information: http://gking.harvard.edu/zelig/
Good morning
If I run
<<<
susan.lsmixed.out <- zelig(formula = unprot_vag_sex ~ married + age + TREATMENT.ARM*time + highest_grade + income + tag(1|id),
data = susanMI.out$imputations, model = "ls.mixed")
summary(susan.lsmixed.out)
>>>>
I get an error
Error in x$coef : $ operator is invalid for atomic vectors
Searching the archives, I see that others have had similar problems. Is there a workaround?
summary(susan.lsmixed.out[[1]])
works fine; should I then average across the five imputed data sets?
thanks!
Peter
Peter L. Flom, PhD
Statistical Consultant
Website: http://www DOT statisticalanalysisconsulting DOT com/
Writing; http://www.associatedcontent.com/user/582880/peter_flom.html
Twitter: @peterflom
-
Zelig Mailing List, served by Harvard-MIT Data Center
Send messages: zelig(a)lists.gking.harvard.edu
[un]subscribe Options: http://lists.gking.harvard.edu/?info=zelig
Zelig program information: http://gking.harvard.edu/zelig/
Hello,
I am trying to run a multilevel probit model using Zelig, but keep receiving
the following error message: " in .deparseTag(TT.vars[[vind]]) : wrong use
of tag function!!"
A simplified version of the model I am trying to run is:
z.out <- zelig(formula= list(mu=investment.binary ~ edlevel +
tag(1 + edlevel, gamma | country),
gamma = ~ tag(GDPpc06.full| country)), data=data2006.mod1,
model="probit.mixed")
What I would like to do is allow the intercept and the edlevel variable
listed within the first tag() to vary by country as a function of the
GDPpc06.full variable, all of which are included in the same dataframe. I
followed the syntax here - http://cran.r-project.org/web/packages/Zelig
/vignettes/probit.mixed.pdf - but I think that I am incorrectly specifying
the gamma part of the syntax, which may be causing the error.
I *am* able to get the model to run when I allow the intercept and edlevel
variable to vary using the following syntax:
z.out <- zelig(investment.binary ~ edlevel +
+ tag(1 + edlevel | country),
data=data2006.mod1, model="probit.mixed")
However, this syntax does not allow me to specify that the intercept and
edlevel variable should vary as a function of GDPpc06.full, as in the first
model specified above. I have tried including multiple tags at the
non-group level of the model specification - i.e. one for the intercept and
one for the edlevel variable - but this does not seem to work either.
Do you have any suggestions for how to fix the syntax?
Sincerely,
Jason
--
Jason I. McMann
PhD Student | Department of Politics
Princeton University | jmcmann(a)princeton.edu
Hello,
I was using Zelig yesterday (Mac OS 10.7.2, R 2.13.0
x86_64-apple-darwin9.8.0/x86_64 (64-bit), Zelig 3.5.3, VGAM 0.8-4) and
everything was working just fine. However, this morning I updated all of my
R packages and now I cannot even run the sample commands in the Zelig
manual (and my project is due soon!). I have no idea how to fix this --
what should I do?
> data(mexico)
> z.out1 <- zelig(as.factor(vote88) ~ pristr + othcok + othsocok, model =
"mlogit", data = mexico)
Error in dyn.load(file, DLLpath = DLLpath, ...) :
unable to load shared object
'/Library/Frameworks/R.framework/Versions/2.13/Resources/library/VGAM/libs/
x86_64/VGAM.so':
dlopen(/Library/Frameworks/R.framework/Versions/2.13/
Resources/library/VGAM/libs/x86_64/VGAM.so,
6): Library not loaded: /usr/local/lib/libgfortran.2.dylib
Referenced from:
/Library/Frameworks/R.framework/Versions/2.13/Resources/library/VGAM/libs/
x86_64/VGAM.so
Reason: image not found
Error in loadModelDeps(model) :
This model depends on package "VGAM". Please install this package and try
again
In addition: Warning message:
package 'VGAM' was built under R version 2.13.2
Error: package/namespace load failed for 'VGAM'
> install.packages("VGAM")
trying URL '
http://software.rc.fas.harvard.edu/mirrors/R/bin/macosx/leopard/contrib/2.1…
'
Content type 'application/x-gzip' length 4525473 bytes (4.3 Mb)
opened URL
==================================================
downloaded 4.3 Mb
The downloaded packages are in
/var/folders/0z/6lnh31r9211fldsd3hxl7zdw0000gn/T//Rtmpy6JZzF/downloaded_
packages
Any help would be greatly appreciated. I **really** like the Zelig
framework, but if I can't get this to work soon then unfortunately I'll
have to use SAS instead, which would be a real bummer.
Thanks!
Mike
--
Michael Daly, MSc
University of California, Irvine, School of Medicine
Medical Student, Year IV -- Research Year
Beth Israel Deaconess Medical Center
Dr. Charles Day's Orthopaedic Surgery Research Group
mcdaly(a)bidmc.harvard.edu | (925) 997-2786
Hi All -
Zelig has been updated to version 3.5-2 in order to correspond with a change in VGAM version 0.8-4. This change will only affect Windows and Unix users, as the new revision of VGAM is only available for these operating systems at the moment.
To run the latest version of Zelig 3.5, the user must:
1. Install VGAM 0.8-4 (Windows and Linux/Unix only)
2. Install Zelig 3.5-2 from source within an R-session via 'install.packages':
> install.packages("Zelig", type="source")
(Note that the ">" symbol should be ignored when typing commands into an R-session.)
Zelig 3.5-2 will be released for Mac OS X once VGAM 0.8-4 has a Mac Binary available on CRAN.
If there are any questions/comments, please do not hesitate to contact the Zelig mailing list.
Cheers -
Matt
email: mowen(a)iq.harvard.edu
-
Zelig Mailing List, served by Harvard-MIT Data Center
Send messages: zelig(a)lists.gking.harvard.edu
[un]subscribe Options: http://lists.gking.harvard.edu/?info=zelig
Zelig program information: http://gking.harvard.edu/zelig/
Dear list members,
in a recent thread of the R-sig-ME mailing list[1], Christoph Scherber
communicated the following script which demonstrated a problem within
Zelig's bootstrap code:
require(Zelig)
data(voteincome)
z.out1 <- zelig(vote ~ education + age + female + tag(1 | state),
data = voteincome, model = "logit.mixed")
x <- setx(z.out1, education = 4)
s.out1 <- sim(z.out1, x = x, bootstrap=TRUE)
#Error in `*tmp*`$call : $ operator not defined for this S4 class
This error stemmed from the implementation of bootfn.default() in
Zelig/R/bootfn.default.R, which generally uses "object$call" instead of
"object@call" (and therefore never worked with lme4)
After having fixed this, the next problem became manifest when
zelig::sim() was in bootstrapping mode. R could not find a summary
method for the object passed to param.mer(). Why this happens, is still
unclear to me.
Anyway, when having fixed that bug along the lines of the
non-bootstrapping branch of param.mer() by using selectMethod(), the
next problem surfaced:
Error in t.star[r, ] <- res[[r]] :
incorrect number of subscripts on matrix
After a while it became clear to me that the list returned by
param.mer() was incompatible with the type boot::boot() expects (in
/boot/R/bootfuns.q as of boot version 1.3-3):
193 } else lapply(seq_len(RR), fn)
194 t.star <- matrix(, RR, length(t0))
195 for(r in seq_len(RR)) t.star[r, ] <- res[[r]]
"t.star" is the name of the matrix boot::boot() uses to collect results
from the statistic function.
"fn" is the "statistic" parameter passed to boot::boot(), which in turn
translates to bootfn.default() as defined in Zelig/R/bootfn.default.R.
The function bootfn.default() in turn passes the result of param.mer(),
which generally was "list(betas=betas, gammas=gammas, scale=scale)".
length(t0) was three, because the list returned by param.mer() had three
elements. The implicit type conversion of "t.star" (from matrix to list)
R does after the first assignment to "t.star" then triggered the index
error with the next assignment.
A change of boot::boot() code in order to become compatible with Zelig
appears to be out of question.
However, in order to keep the default configuration working, param.mer()
needs to return a list, because part of the result is a complete matrix
(e.g."betas") then, not only a single row of a matrix.
I therefore made the return type of param.mer(object,num,bootstrap,)
dependent on "bootstrap", returning a list in the default case, and a
one-dimensional array while bootstrapping. The related function qi.mer()
in Zelig/R/qi.mixed.R is of course now able to handle both types.
Since the column names appearing in the array returned by
param.mer(..,bootstrap=T) are determined by the user controlled model
specification, the name space separation a list provides implicitly had
to be coded explicitly.
The rudimentary regression testing I did for sim(,bootstrap=F) using
patched code against the original version resulted in a quite close
numerical agreement. I do attribute the differences (generally in the
order of 1E-3) to a different RNG state, but I haven't examined this
very closely. Generally, the effective code path of the default case
should have remained unchanged.
The attached patch is against the sources found in the Cran package
Zelig_3.5.1.tar.gz
Unrelated to this patch, plotting of simulated results generally does
not work as intended:
> plot(s.out1)
Error in plot(density(qi), main = x$qi.name[[i]], xlab = xlab, ...):
error in evaluating the argument 'x' in selecting a method for
function 'plot': Error in density.default(qi) : argument 'x'
must be numeric
No plot generated for model logit.mixed
Also, excluding the intercept using the ~ -1 notation is currently
disregarded.
Best regards
Hugo Mildenberger
[1]h ttps://stat.ethz.ch/pipermail/r-sig-mixed-models/2011q4/006933.html
Hi All,
Fledgling graduate student here. I'm trying to run zelig for a logit model
that will run in the R base package, but no matter what I seem to do to the
recoded variables (attach, detach, reattach, put them back into the
dataset, create a new dataset, etc.) I keep getting the following error
message:
Error in `[.data.frame`(d, , all.vars(as.expression(formula))) :
undefined columns selected
The most concise form of the code is below.
Thanks in advance!
Laura
library(foreign)
library(car)
library(survival)
library(lmeSplines)
library(MASS)
library(nnet)
library(nlme)
library(reshape)
library(plyr)
library(grid)
library(proto)
library(ggplot2)
library(lattice)
gss<-read.dta('/Users/Laura/Dropbox/projects/PanelAttrition/GSS
Data/GSS.dta')
names(gss)
summary(gss$panstat_2)
summary(gss$panstat_3)
gssattrition<-c()
gssattrition[gss$panstat_3=='selected, eligible, but not reinterviewed']<-1
gssattrition[gss$panstat_2=='selected, eligible, but not reinterviewed']<-1
gssattrition[gss$panstat_3=='selected, eligible, and reinterviewed']<-0
summary(gssattrition)
gssattrition<-c()
gssattrition[gss$panstat_3=='selected, eligible, but not reinterviewed']<-1
gssattrition[gss$panstat_3=='attrited']<-1
gssattrition[gss$panstat_3=='selected, eligible, and reinterviewed']<-0
gssattrition[gss$panstat_2=='selected, but not eligible and not
reinterviewd']<-NA
gssattrition[gss$panstat_2=='selected, but not eligible and not
reinterviewd because r was in institution']<-NA
gssattrition[gss$panstat_2=='selected, but not eligible and not
reinterviewd because r was in institution']<-NA
gssattrition[gss$panstat_2=='selected, but not eligible and not
reinterviewd because r was deceased']<-NA
summary(gssattrition)
gssfulltime<-c()
gssfulltime[gss$wrkstat_1=='working fulltime']<-1
gssfulltime[gss$wrkstat_1=='working parttime']<-1
gssfulltime<-recode(gssfulltime, 'NA=0')
gssforeign<-c()
gssforeign[gss$born_1=='yes']<-0
gssforeign[gss$born_1=='no']<-1
gsshome<-c()
summary(gss$dwelown_1)
gsshome[gss$dwelown_1=='own or is buying']<-1
gsshome[gss$dwelown_1=='pays rent']<-0
gsshome[gss$dwelown_1=='other']<-0
summary(gsshome)
gssrace<-c()
gssrace[gss$race_1=='2']<-1
gssrace[gss$race_1=='1']<-0
gssrace[gss$race_1=='3']<-1
gssincome<-c()
gssincome[gss$income06_1=='under $1 000']<-1
gssincome[gss$income06_1=='$1 000 to 2 999']<-1
gssincome[gss$income06_1=='$3 000 to 3 999']<-1
gssincome[gss$income06_1=='$4 000 to 4 999']<-1
gssincome[gss$income06_1=='$5 000 to 5 999']<-2
gssincome[gss$income06_1=='$6 000 to 6 999']<-2
gssincome[gss$income06_1=='$7 000 to 7 999']<-2
gssincome[gss$income06_1=='$8 000 to 9 999']<-3
gssincome[gss$income06_1=='$10000 to 12499']<-4
gssincome[gss$income06_1=='$12500 to 14999']<-5
gssincome[gss$income06_1=='$15000 to 17499']<-6
gssincome[gss$income06_1=='$17500 to 19999']<-6
gssincome[gss$income06_1=='$20000 to 22499']<-7
gssincome[gss$income06_1=='$22500 to 24999']<-7
gssincome[gss$income06_1=='$25000 to 29999']<-8
gssincome[gss$income06_1=='$30000 to 34999']<-9
gssincome[gss$income06_1=='$35000 to 39999']<-10
gssincome[gss$income06_1=='$50000 to 59999']<-12
gssincome[gss$income06_1=='$40000 to 49999']<-11
gssincome[gss$income06_1=='$60000 to 74999']<-13
gssincome[gss$income06_1=='$75000 to $89999']<-14
gssincome[gss$income06_1=='$90000 to $109999']<-15
gssincome[gss$income06_1=='$110000 to $129999']<-16
gssincome[gss$income06_1=='$130000 to $149999']<-17
gssincome[gss$income06_1=='$150000 or over']<-18
summary(gssincome)
summary(gss$degree_1)
gsseducation<-c()
gsseducation[gss$degree_1=='lt high school']<-1
gsseducation[gss$degree_1=='high school']<-2
gsseducation[gss$degree_1=='junior college']<-3
gsseducation[gss$degree_1=='bachelor']<-4
gsseducation[gss$degree_1=='graduate']<-5
gssage<-gss$age_1
gssalone<-c()
gssalone[gss$hompop_1=='1']<-1
gssalone<-recode(gssalone, 'NA=0')
gssexperience<-gss$intyrs_1
gsslength<-gss$lngthinv_1
gssfee<-gss$feelevel_1
gssfee[gss$feeused_1=='no']<-0
gssface<-c()
gssface[gss$mode_1=='in-person']<-1
gssface[gss$mode_1=='over the phone']<-0
gssface[gss$mode_1=='in-person/phone']<-0
gssintwoman<-c()
summary(gss$intsex_1)
gssintwoman[gss$intsex_1=='female']<-1
gssintwoman<-recode(gssintwoman, 'NA=0')
gssintwhite<-c()
gssintwhite[gss$intrace1_1=='white']<-1
gssintwhite<-recode(gssintwhite, 'NA=0')
gsshispanic<-c()
gsshispanic[gss$hispanic_1=='not hispanic']<-0
gsshispanic<-recode(gsshispanic, 'NA=1')
gssracesame<-c()
gssracesame[gss$intrace1_1=='white' & gss$racecen1_1=='white']<-1
gssracesame[gss$intrace1_1=='black or african american' &
gss$racecen1_1=='black
or african american']<-1
gssracesame[gss$intrace1_1=='asian indian' & gss$racecen1_1=='asian indian'
]<-1
gssracesame[gss$intrace1_1=='chinese' & gss$racecen1_1=='chinese']<-1
gssracesame[gss$intrace1_1=='filipino' & gss$racecen1_1=='filipino']<-1
gssracesame[gss$intrace1_1=='japanese' & gss$racecen1_1=='japanese']<-1
gssracesame[gss$intrace1_1=='korean' & gss$racecen1_1=='korean']<-1
gssracesame[gss$intrace1_1=='vietnamese' & gss$racecen1_1=='vietnamese']<-1
gssracesame[gss$intrace1_1=='other asian' & gss$racecen1_1=='other asian']<-
1
gssracesame[gss$intrace1_1=='native hawaiian' & gss$racecen1_1=='native
hawaiian']<-1
gssracesame[gss$intrace1_1=='other pacific islander' & gss$racecen1_1=='other
pacific islander']<-1
gssracesame[gss$intrace1_1=='some other race' & gss$racecen1_1=='some other
race']<-1
gssracesame[gss$intrace1_1=='hispanic' & gss$racecen1_1=='hispanic']<-1
gssracesame[gsshispanic=='1' & gss$intethn=='hispanic']<-1
gssracesame<-recode(gssracesame, 'NA=0')
gsssexsame<-c()
gsssexsame[gss$sex_1=='1' & gssintwoman=='0']<-1
gsssexsame[gss$sex_1=='2'& gssintwoman=='1']<-1
gsssexsame<-recode(gsssexsame, 'NA=0')
gsssex<-c()
gsssex[gss$sex_1=='1']<-0
gsssex[gss$sex_1=='2']<-1
gssvote04<-c()
gssvote04[gss$vote04_1=='voted']<-1
gssvote04[gss$vote04_1=='did not vote']<-0
summary(gssvote04)
summary(gss$babies_1)
gchildren<-gss$babies_1
summary(gchildren)
gss$gssattriiton1<-gssattrition
gss$gssage1<-gssage
gss$gsssex1<-gsssex
gss$gssrace1<-gssrace
gss$gssfulltime1<-gssfulltime
gss$gsseducation1<-gsseducation
gss$gssattriiton1<-gssincome
gss$gsshome1<-gsshome
gss$gssfulltime1<-gssfulltime
gss$gchildren1<-gchildren
gss$gssforeign1<-gssforeign
gss$gssattriiton1<-gssvote04
gss$gssalone1<-gssalone
gss$gssexperience1<-gssexperience
gss$gssintwoman1<-gssintwoman
gss$gsssexsame1<-gsssexsame
library(Zelig)
gsslogit<- zelig(gssattrition1~gssage1+gsssex1+gssrace1+gssfulltime1+
gsseducation1+gssincome1+gsshome1+gssfulltime1+gchildren1+gssforeign1+
gssvote041+gssalone1+gssexperience1+gssintwoman1+gsssexsame1, model="logit"
, data=gss)
gss.data<-gss[,c(1833, 1855:1867)]
names(gss.data)
gsslogit<- zelig(gssattrition1~gssage1+gsssex1+gssrace1+gssfulltime1+
gsseducation1+gssincome1+gsshome1+gssfulltime1+gchildren1+gssforeign1+
gssvote041+gssalone1+gssexperience1+gssintwoman1+gsssexsame1, model="logit"
, data=gss.data)