Unfortunately, I don't think we have an automated procedure for all of this. You would have to multiply impute the data, run the matching on each imputed data set, and then combine the results in zelig() using the mi() function. But this does not require any programming: you can simply run the same matching procedure on each data set via matchit() and then feed the resulting matched data sets into zelig().
Good luck,
Kosuke
Department of Politics
Princeton University
http://imai.princeton.edu
On Sep 13, 2011, at 6:02 PM, Pingaul jb wrote:
> Dear Professor,
> I’m a post-doctoral student at Montreal University. I’m currently in Columbia, working on propensity scores with a colleague and using MatchIt and Zelig. First, congratulations on your packages, which are very flexible.
>
> My question is about multiple imputation and propensity scores with these packages. From what I understand, combining both approaches would involve:
>
> 1/ Doing multiple imputation and testing which variables to include.
>
> 2/ Propensity score analysis on each imputed data set and pooling the overall balance to check if it is ok (or on each data set?).
>
> 3/ Calculation of the quantities of interest for each data set
>
> 4/ Pooling the quantities across data sets.
>
> I would like to know if there is existing syntax to run the MatchIt analysis on all of the imputed data sets without doing it manually, and to check the overall balance. Also, in theory, the number of individuals retained after propensity score matching, and the weights, can differ across imputed data sets. Do we then have to perform the final analysis on each one and pool the results with a specific procedure that accounts for the varying Ns? I normally use the mice package for multiple imputation, but it seems that Zelig handles Amelia output. My colleague seems to be able to do all of this in Stata, but I’m not sure how to make the three R packages work together.
>
> I would be very happy if you could point me to a reference or a place where I can find the syntax to do that (I’ve been using R for some time, so I can use packages easily, but I have no programming skills).
>
>
> Best Regards!
>
>
>
> Jean-Baptiste
>
-
Zelig Mailing List, served by Harvard-MIT Data Center
Send messages: zelig(a)lists.gking.harvard.edu
[un]subscribe Options: http://lists.gking.harvard.edu/?info=zelig
Zelig program information: http://gking.harvard.edu/zelig/
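As a rough illustration of the reply above, running matchit() on each imputed data set can be sketched with lapply(). This is only a sketch under stated assumptions: it assumes the MatchIt package is installed, the data are simulated, and the variable names (treat, y, x1, x2) and the helper make_imputation() are hypothetical. The list "imputations" stands in for real imputations (for example the $imputations element of Amelia output). The final step of passing the matched data sets to zelig() is not shown, since the exact call depends on the Zelig version.

```r
library(MatchIt)

set.seed(1)

# Stand-in for real imputations: three complete copies of a simulated data
# set in which one covariate (x2) differs slightly, as imputed values would.
make_imputation <- function(base) {
  base$x2 <- base$x2 + rnorm(nrow(base), sd = 0.1)
  base
}
n <- 200
base <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
base$treat <- rbinom(n, 1, plogis(-1 + 0.5 * base$x1))
base$y <- 1 + 2 * base$treat + base$x1 + base$x2 + rnorm(n)
imputations <- lapply(1:3, function(i) make_imputation(base))

# Run the same matching specification on every data set.
m.list <- lapply(imputations, function(d)
  matchit(treat ~ x1 + x2, data = d, method = "nearest"))

# Inspect balance separately for each data set.
balance <- lapply(m.list, summary)

# Extract the matched data sets; their sizes and weights need not agree.
matched <- lapply(m.list, match.data)
sapply(matched, nrow)
```

Each element of "matched" carries a weights column, so the outcome model can be fitted on each element in turn and the estimates pooled afterwards.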
Please let me know how I can install the ZeligMultinomial package. I want to
use mlogit, which according to the manual (page 50) is found in that
package.
I tried the command
install.packages("ZeligMultinomial", repos="http://r.iq.harvard.edu/", type="source")
but received the following message:
Warning: dependency 'MNP' is not available
trying URL 'http://r.iq.harvard.edu/src/contrib/ZeligMultinomial_0.5-4.tar.gz'
Content type 'application/x-gzip' length 9730 bytes
opened URL
==================================================
downloaded 9730 bytes
During startup - Warning messages:
1: Setting LC_CTYPE failed, using "C"
2: Setting LC_TIME failed, using "C"
3: Setting LC_MESSAGES failed, using "C"
4: Setting LC_PAPER failed, using "C"
ERROR: dependency 'MNP' is not available for package 'ZeligMultinomial'
* removing '/Library/Frameworks/R.framework/Versions/2.14/Resources/library/ZeligMultinomial'
The downloaded packages are in
'/private/var/folders/UL/ULLu+bi5GR8IM1G-7XZBJU+++TI/-Tmp-/Rtmp6jw6M9/downloaded_packages'
Warning message:
In install.packages("ZeligMultinomial", repos = "http://r.iq.harvard.edu/", :
installation of package 'ZeligMultinomial' had non-zero exit status
Good morning
If I run
<<<
susan.lsmixed.out <- zelig(formula = unprot_vag_sex ~ married + age + TREATMENT.ARM*time + highest_grade + income + tag(1|id),
data = susanMI.out$imputations, model = "ls.mixed")
summary(susan.lsmixed.out)
>>>>
I get an error
Error in x$coef : $ operator is invalid for atomic vectors
Searching the archives, I see that others have had similar problems. Is there a workaround?
summary(susan.lsmixed.out[[1]])
works fine; should I then average across the five imputed data sets?
thanks!
Peter
Peter L. Flom, PhD
Statistical Consultant
Website: http://www DOT statisticalanalysisconsulting DOT com/
Writing; http://www.associatedcontent.com/user/582880/peter_flom.html
Twitter: @peterflom
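On the question of averaging across the five fits: the standard way to combine results from multiply imputed data sets is Rubin's rules, where the point estimate is the simple average but the standard error also includes the variation between imputations. A minimal base R sketch follows. The function name pool_rubin() and the numbers are hypothetical, and this is not Zelig's own code; in practice est and se would hold one coefficient and its standard error taken from each of the five fits.

```r
# Combine m estimates and their standard errors with Rubin's rules.
pool_rubin <- function(est, se) {
  m <- length(est)
  q_bar <- mean(est)             # pooled point estimate: simple average
  w <- mean(se^2)                # average within-imputation variance
  b <- var(est)                  # between-imputation variance
  total <- w + (1 + 1 / m) * b   # total variance
  list(estimate = q_bar, se = sqrt(total))
}

# Hypothetical values for one coefficient from five fits
est <- c(1.0, 1.2, 0.8, 1.1, 0.9)
se  <- rep(0.2, 5)
pooled <- pool_rubin(est, se)
print(pooled)
```

The pooled standard error here is larger than 0.2 because the estimates disagree across imputations; averaging the standard errors alone would understate the uncertainty.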
Hey all again,
I would like to report that the latest version of the plot.ci function
does not include "logit.survey" in its list of supported models, so it is not
possible to plot confidence intervals for predicted values calculated using
this model.
I tried adding it myself, and it worked.
Here is the modified function.
(I found the original at:
http://r.iq.harvard.edu/src/contrib/Zelig/R/plot.ci.R)
Best,
plot.ci <- function(x, CI = 95, qi = "ev", main = "",
                    ylab = NULL, xlab = NULL, xlim = NULL,
                    ylim = NULL, col = c("red", "blue"), ...) {
  "%w/o%" <- function(x, y) x[!x %in% y] #-- x without y
  if (!inherits(x, "zelig"))
    stop(" plot.ci() works only for sim() output.")
  ## "logit.survey" is the entry added to the original list of models
  if (!(x$zelig.call$model %in%
        c("ls", "logit", "logit.survey", "normal.survey", "probit", "exp",
          "gamma", "lognorm",
          "weibull", "normal", "poisson", "tobit", "relogit",
          "negbin", "logit.bayes", "probit.bayes",
          "poisson.bayes", "normal.bayes", "tobit.bayes",
          "ls.mixed", "logit.mixed", "probit.mixed",
          "gamma.mixed", "poisson.mixed",
          "logit.gam", "gamma.gee", "normal.gam", "poisson.gam",
          "probit.gam", "logit.gee", "normal.gee",
          "poisson.gee", "probit.gee")))
    stop("\n plot.ci() is valid only for non-categorical, univariate response models.")
  ## Lower and upper quantile probabilities for the requested interval
  cip <- c((100 - CI) / 200, 1 - (100 - CI) / 200)
  summarize <- function(z, cip) {
    res <- NULL
    res <- cbind(res, apply(z, 2, quantile, probs = cip[1]))
    res <- cbind(res, apply(z, 2, quantile, probs = cip[2]))
    res
  }
  vv <- apply(x$x, 2, unique)
  idx <- sapply(vv, length)
  cidx <- which(idx > 1)
  if (!is.null(x$x1)) {
    vv1 <- apply(x$x1, 2, unique)
    idx1 <- sapply(vv1, length)
    cidx1 <- which(idx1 > 1)
    if (!identical(names(idx), names(idx1)))
      stop("variables in x and x1 do not match.")
    ## Checking for one dimension of variation, including interaction terms
    if (length(cidx) > length(cidx1)) {
      tmp <- names(idx)[cidx %w/o% cidx1]
      tmp1 <- names(idx)[cidx[cidx %in% cidx1]]
    }
    else {
      tmp <- names(idx1)[cidx1 %w/o% cidx]
      tmp1 <- names(idx1)[cidx1[cidx1 %in% cidx]]
    }
    check <- grep(tmp1, tmp)
    if (length(check) != length(tmp))
      stop("x and x1 vary on more than one dimension.")
  }
  var <- vv[[cidx[1]]]
  q <- pmatch(qi, names(x$qi))
  qofi <- x$qi[[q]]
  sum.qi <- summarize(qofi, cip)
  if (!is.null(x$x1) && qi == "ev") {
    fd <- x$qi$fd
    ev1 <- fd + qofi
    sum.qi1 <- summarize(ev1, cip)
  }
  else sum.qi1 <- NULL
  if (is.null(ylab)) ylab <- x$qi.name[[q]]
  if (is.null(xlab)) xlab <- paste("Range of", colnames(x$x)[cidx[1]])
  if (is.null(ylim)) {
    if (is.null(sum.qi1)) ylim <- c(min(sum.qi), max(sum.qi))
    else ylim <- c(min(sum.qi, sum.qi1), max(sum.qi, sum.qi1))
  }
  if (is.null(xlim)) xlim <- c(min(var), max(var))
  plot.default(var, type = "n", ylab = ylab, main = main, xlab = xlab,
               xlim = xlim, ylim = ylim)
  ## One vertical segment per covariate value, from lower to upper bound
  for (i in seq_along(var)) {
    lines(c(var[i], var[i]), c(sum.qi[i, 1], sum.qi[i, 2]), col = col[1], ...)
    if (!is.null(x$x1) && qi == "ev")
      lines(c(var[i], var[i]), c(sum.qi1[i, 1], sum.qi1[i, 2]), col = col[2],
            ...)
  }
}
- - - - -
Rogério J. Barbosa
Researcher at Centre for Metropolitan Studies/Cebrap
São Paulo - Brazil
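For readers who want to see what the interval step in the function above computes, here is a small base R sketch of the same quantile logic, separate from Zelig. The matrix "draws" is hypothetical simulated data standing in for the simulated quantities of interest (rows are simulations, columns are values of the varying covariate).

```r
CI <- 95
# Lower and upper quantile probabilities, as in plot.ci()
cip <- c((100 - CI) / 200, 1 - (100 - CI) / 200)

set.seed(42)
# Hypothetical simulations: 1000 draws at three covariate values
draws <- cbind(rnorm(1000, 0.2, 0.05),
               rnorm(1000, 0.4, 0.05),
               rnorm(1000, 0.6, 0.05))

# One row per covariate value: lower and upper bound of the interval
bounds <- cbind(lower = apply(draws, 2, quantile, probs = cip[1]),
                upper = apply(draws, 2, quantile, probs = cip[2]))
print(bounds)
```

Each row of "bounds" corresponds to one vertical segment drawn by the plotting loop.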
Hi Matt,
Thanks for your help! But, unfortunately, it didn't work. The error keeps
happening even if I use just one independent variable.
For instance:
> x.out = setx(z.out, age=mean)
Error in dta[complete.cases(mf), names(dta) %in% vars, drop = FALSE] :
  incorrect number of dimensions
Just an observation: it started to occur suddenly. The day before, the model
was working fine.
- - - - -
Rogério J. Barbosa
Researcher at Centre for Metropolitan Studies/Cebrap
São Paulo - Brazil
2012/3/2 Matt Owen <mowen(a)iq.harvard.edu>
> My best guess is that the error occurs because the values for "gender" and
> "race.white" have a different length than "age" and "age2"
>
> Try something like:
>
> setx(z.out, age=age.sim, age2=age2.sim, gender=rep(0, length(age.sim)),
> race.white=rep(1, length(age.sim)))
>
>
> On Mar 1, 2012, at 11:06 PM, Rogério Barbosa wrote:
>
> Hey all,
>
> I get this error when I run setx: "Error in dta[complete.cases(mf),
> names(dta) %in% vars, drop = FALSE] : incorrect number of dimensions"
> I found some people with the same problem on the internet, but apparently no
> one has worked it out.
> Does anyone know how to solve it?
>
> My output is below the message.
>
> Thanks,
> Rogério Barbosa
>
>
> > z.out = zelig(t1 ~ age + age2 + gender + race.white, model = "logit",
> data = pad)
> How to cite this model in Zelig:
> Kosuke Imai, Gary King, and Oliva Lau. 2008. "logit: Logistic Regression
> for Dichotomous Dependent Variables" in Kosuke Imai, Gary King, and Olivia
> Lau, "Zelig: Everyone's Statistical Software,"
> http://gking.harvard.edu/zelig
>
> > age.sim=6:19
> > age2.sim=age.sim^2
>
> > x.out <- setx(z.out, age=age.sim, age2=age2.sim, gender=0, race.white=1)
> Error in dta[complete.cases(mf), names(dta) %in% vars, drop = FALSE] :
>   incorrect number of dimensions
>
>
>
> Rogério J. Barbosa
> Researcher at Centre for Metropolitan Studies/Cebrap
> São Paulo - Brazil
Hey all,
I get this error when I run setx: "Error in dta[complete.cases(mf),
names(dta) %in% vars, drop = FALSE] : incorrect number of dimensions"
I found some people with the same problem on the internet, but apparently no
one has worked it out.
Does anyone know how to solve it?
My output is below the message.
Thanks,
Rogério Barbosa
> z.out = zelig(t1 ~ age + age2 + gender + race.white, model = "logit",
data = pad)
How to cite this model in Zelig:
Kosuke Imai, Gary King, and Oliva Lau. 2008. "logit: Logistic Regression
for Dichotomous Dependent Variables" in Kosuke Imai, Gary King, and Olivia
Lau, "Zelig: Everyone's Statistical Software,"
http://gking.harvard.edu/zelig
> age.sim=6:19
> age2.sim=age.sim^2
> x.out <- setx(z.out, age=age.sim, age2=age2.sim, gender=0, race.white=1)
Error in dta[complete.cases(mf), names(dta) %in% vars, drop = FALSE] :
  incorrect number of dimensions
Rogério J. Barbosa
Researcher at Centre for Metropolitan Studies/Cebrap
São Paulo - Brazil