Unfortunately, I don't think we have an automated procedure for everything. You would have to multiply impute the data, do matching on each imputed data set, and then combine it in zelig() using mi() function. But this does not require any programming. You can simply run the same matching procedure on each data set via matchit() and then feed the resulting multiple matched data sets into zelig().
Good luck,
Kosuke
Department of Politics
Princeton University
http://imai.princeton.edu
On Sep 13, 2011, at 6:02 PM, Pingaul jb wrote:
> Dear Professor,
> I’m a post-doctoral student at Montreal University. I’m actually in Columbia, working and propensity scores with a colleague and using MatchIt and Zelig. First, congratulations for your packages that are very flexible.
>
> My question is about multiple imputation and propensity scores with these softwares. From what I understand, combining both approaches would include:
>
> 1/ Doing multiple imputation and testing which variables to include.
>
> 2/ Propensity score analysis on each imputed data set and pooling the overall balance to check if it is ok (or on each data set?).
>
> 3/ Calculation of the quantities of interest for each data set
>
> 4/ Pooling the quantities across data sets.
>
> I would like to know if there is a written syntax to perform the MatchIt analysis for all of the imputed data set without having to do it manually and check the overall balance. Also, in theory, the number of individuals retained after propensity score matching and the weights can be different for each imputed data set. So that we have to perform the final analysis on each one and then pool the data with a specific procedure to take into account the eventual varying Ns? I normally use Mice package for multiple imputation but it seems that Zelig handle Amelia. My colleague seems to do be able to do all that in stata, but I’m not sure how to make all the three R packages work together.
>
> I would be very happy if you could indicate to me a reference or a place where I can find the syntax to do that (I’ve been using R for some times so I can use packages easily but I have no programming skills).
>
>
> Best Regards!
>
>
>
> Jean-Baptiste
>
-
Zelig Mailing List, served by Harvard-MIT Data Center
Send messages: zelig(a)lists.gking.harvard.edu
[un]subscribe Options: http://lists.gking.harvard.edu/?info=zelig
Zelig program information: http://gking.harvard.edu/zelig/
Hi there,
I am encountering this error when using sim after logit.mixed or
probit.mixed:
Error in qifunction(obj, x = x, x1 = x1, y = y, param = param, num = num)
> : could not find function "setup.x.matrix"
This is with "ZeligMixed".
If I use ZeligMultilevel I get:
Error in array(x, c(length(x), 1L), if (!is.null(names(x))) list(names(x),
> :
> attempt to set an attribute on NULL
Toy example:
d <- expand.grid(y=rbinom(10,1,.33),x=rnorm(10,1),t=1:4)
z.out <- zelig(y~x+tag(1|t),data=d,model="logit.mixed")
sim(z.out)
thanks,
Yph
Hello - I was happy to find the Zelig package as I had been looking for
code for implementing a generalized linear mixed model. I'm having some
issues obtaining predicted probabilities from my rather simple logit.mixed
regression. I am using logit.mixed with avoidance behavior as the binomial
response variable, height as the predictor (ranging from 150 to 1500m), and
year as a random effect (there are 3 years). The model runs fine, but I
encounter an error starting with x.out. I've looked at the package
documentation to try to solve the error, and have been unsuccessful. I'd
like to produce one figure showing with the predicted probability on the
y-axis and height on the x-axis. I'd like to show the regression line with
the 95% CI for the predicted probabilities. I've been successful at coding
this analysis and figure for a fixed effects logistic regression, but not
so much for the mixed effects logistic regression.
Thank you
library(Zelig)
library(ZeligMultilevel)
avd$year <- factor(avd$year)
z.out <- zelig(avoid ~ height + tag(1|year), data = avd, model =
"logit.mixed")
summary(z.out)
x.out <-setx(z.out, fn = NULL)
s.out <- sim(z.out, x = x.out)
summary(s.out)
Error in array(x, c(length(x), 1L), if (!is.null(names(x))) list(names(x), :
'data' must be of a vector type> summary(s.out)Error in summary(s.out) :
error in evaluating the argument 'object' in selecting a method for
function 'summary': Error: object 's.out' not found
--
Karl Kosciuch
kosciuch(a)gmail.com
Hi,
I want to combine multiple amelia runs, and use the zelig function on the entire set of amelia outputs. However, I noticed that when I run zelig on the combined set, the coefficients don't look right. Here are the script and the R outputs:
> require(Amelia)
## Amelia II: Multiple Imputation
## (Version 1.7, built: 2013-02-10)
> library(Zelig)
Attaching package: ‘zoo’
The following object(s) are masked from ‘package:base’:
as.Date, as.Date.numeric
ZELIG (Versions 4.1-3, built: 2013-01-30)
> data(freetrade)
> a.out1 <-amelia(freetrade, m = 1, ts = "year", cs = "country")
-- Imputation 1 --
1 2 3 4 5 6 7 8 9 10 11 12 13 14
> a.out2 <-amelia(freetrade, m = 1, ts = "year", cs = "country")
-- Imputation 1 --
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
> a.out3 <-ameliabind(a.out1, a.out2) # combine the amelia runs
> summary(a.out3)
Amelia output with 2 imputed datasets.
Return code: 1
Message: Normal EM convergence
Chain Lengths:
--------------
Imputation 1: 14
Imputation 2: 15
Rows after Listwise Deletion: 96
Rows after Imputation: 171
Patterns of missingness in the data: 8
Fraction Missing for original variables:
-----------------------------------------
Fraction Missing
year 0.00000000
country 0.00000000
tariff 0.33918129
polity 0.01169591
pop 0.00000000
gdp.pc 0.00000000
intresmi 0.07602339
signed 0.01754386
fiveop 0.10526316
usheg 0.00000000
# run zelig on the first amelia run
> z.out1 <- zelig(tariff ~ polity + pop + gdp.pc, data = a.out1$imputations, model = "ls", cite = FALSE)
> summary(z.out1)
Call:
lm(formula = formula, weights = weights, model = F, data = data)
Residuals:
Min 1Q Median 3Q Max
-29.686 -11.067 -3.367 10.003 45.556
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.317e+01 1.891e+00 17.538 < 2e-16 ***
polity -1.060e-01 2.434e-01 -0.435 0.664
pop 2.925e-08 5.405e-09 5.411 2.15e-07 ***
gdp.pc -2.738e-03 5.235e-04 -5.231 4.99e-07 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 16.62 on 167 degrees of freedom
Multiple R-squared: 0.3234, Adjusted R-squared: 0.3112
F-statistic: 26.6 on 3 and 167 DF, p-value: 4.07e-14
# run zelig on the second amelia run
> z.out2 <- zelig(tariff ~ polity + pop + gdp.pc, data = a.out2$imputations, model = "ls", cite = FALSE)
> summary(z.out2)
Call:
lm(formula = formula, weights = weights, model = F, data = data)
Residuals:
Min 1Q Median 3Q Max
-40.13 -11.72 -3.63 10.16 43.77
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.400e+01 1.984e+00 17.133 < 2e-16 ***
polity -6.123e-02 2.545e-01 -0.241 0.81
pop 3.289e-08 5.689e-09 5.781 3.56e-08 ***
gdp.pc -3.016e-03 5.501e-04 -5.484 1.51e-07 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 17.47 on 167 degrees of freedom
Multiple R-squared: 0.3495, Adjusted R-squared: 0.3378
F-statistic: 29.91 on 3 and 167 DF, p-value: 1.574e-15
# run zelig on the combined amelia runs
> z.out3 <- zelig(tariff ~ polity + pop + gdp.pc, data = a.out3$imputations, model = "ls", cite = FALSE)
> summary(z.out3)
Call:
lm(formula = formula, weights = weights, model = F, data = data)
Residuals:
Min 1Q Median 3Q Max
-29.686 -11.067 -3.367 10.003 45.556
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.317e+01 1.891e+00 17.538 < 2e-16 ***
polity -1.060e-01 2.434e-01 -0.435 0.664
pop 2.925e-08 5.405e-09 5.411 2.15e-07 ***
gdp.pc -2.738e-03 5.235e-04 -5.231 4.99e-07 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 16.62 on 167 degrees of freedom
Multiple R-squared: 0.3234, Adjusted R-squared: 0.3112
F-statistic: 26.6 on 3 and 167 DF, p-value: 4.07e-14
Can someone please tell me why summary(z.out3) is returning the exact same results as summary(z.out1)?
How can I fix this problem so that the ameliabind output is treated the same as a direct output from the amelia function?
Thanks.
Evann,
I'm glad you worked that out. And of course, now that correction is on the list should it come up for others. However, those are also data issues (missingness, factors) that we should try to catch with more proactive internal checking in Zelig and softer, more informative error messages. Thanks for pointing that out,
James.
--
James Honaker, Senior Research Scientist
//// Institute for Quantitative Social Science, Harvard University
-----Original message-----
From: "Smith, Evann" <egsmith(a)fas.harvard.edu>
To: "zelig(a)lists.gking.harvard.edu" <zelig(a)lists.gking.harvard.edu>
Sent: Fri, Mar 29, 2013 23:57:39 GMT+00:00
Subject: Re: [zelig] sim() problems with ordered logit: incorrect dimensions
In case anyone is wondering, I figured out my problems.
1) I needed to set my dependent variable as a factor outside the zelig call.
2) It didn't like that I had missing data (which produced NAs in the setx values). Listwise deletion prior to running the model solved that problem.
Cheers,
Evann
On Mar 29, 2013, at 6:25 PM, wrote:
Hi,
When running an ordered logit, I get the following error:
d1t.x <- setx(d1t)
d1t.s <- sim(d1t, x=d1t.x)
Error in as.matrix(x)[, -1] : incorrect number of dimensions
As best I can discern, this is because the coef(d1t) command pulls the coefficients from the model, but not the values of the intercepts. While vcov(d1t) includes the intercepts.
Is there a fix or workaround for this? Or am I doing something else wrong?
Cheers,
Evann
====================================
Evann Smith
Ph.D. Candidate
Department of Government
Harvard University
egsmith(a)fas.harvard.edu<mailto:egsmith@fas.harvard.edu>
====================================
====================================
Evann Smith
Ph.D. Candidate
Department of Government
Harvard University
egsmith(a)fas.harvard.edu<mailto:egsmith@fas.harvard.edu>
====================================
In case anyone is wondering, I figured out my problems.
1) I needed to set my dependent variable as a factor outside the zelig call.
2) It didn't like that I had missing data (which produced NAs in the setx values). Listwise deletion prior to running the model solved that problem.
Cheers,
Evann
On Mar 29, 2013, at 6:25 PM, wrote:
Hi,
When running an ordered logit, I get the following error:
d1t.x <- setx(d1t)
d1t.s <- sim(d1t, x=d1t.x)
Error in as.matrix(x)[, -1] : incorrect number of dimensions
As best I can discern, this is because the coef(d1t) command pulls the coefficients from the model, but not the values of the intercepts. While vcov(d1t) includes the intercepts.
Is there a fix or workaround for this? Or am I doing something else wrong?
Cheers,
Evann
====================================
Evann Smith
Ph.D. Candidate
Department of Government
Harvard University
egsmith(a)fas.harvard.edu<mailto:egsmith@fas.harvard.edu>
====================================
====================================
Evann Smith
Ph.D. Candidate
Department of Government
Harvard University
egsmith(a)fas.harvard.edu<mailto:egsmith@fas.harvard.edu>
====================================
Hi,
When running an ordered logit, I get the following error:
> d1t.x <- setx(d1t)
> d1t.s <- sim(d1t, x=d1t.x)
Error in as.matrix(x)[, -1] : incorrect number of dimensions
As best I can discern, this is because the coef(d1t) command pulls the coefficients from the model, but not the values of the intercepts. While vcov(d1t) includes the intercepts.
Is there a fix or workaround for this? Or am I doing something else wrong?
Cheers,
Evann
====================================
Evann Smith
Ph.D. Candidate
Department of Government
Harvard University
egsmith(a)fas.harvard.edu
====================================
Hi,
I am a novice user of Zelig and Amelia, and installed version 4.1-3. I am puzzled
on how I can obtain the degrees of freedom, t-statistic, and f-values
of combined multiply imputed data using Amelia or Zelig.
I created imputations for a dataset. Then, to
understand how to combine imputations, I used code straight from the
Amelia manual. When I loaded Zelig and used the zelig function, I got
coefficients that appeared to be coefficients of individual imputations,
and not those of the combined set. The code I used was:
a.out <- amelia(freetrade, m = 5)
z.out.imp <- zelig(tariff ~ polity + pop + gdp.pc + year + country, data = a.out$imputations, model = "ls")
summary(z.out.imp)
Furthermore, I have also tried this code:
> b.out <-NULL
> se.out <-NULL
> for(i in 1:a.out$m){
+ ols.out <- lm(tariff ~ polity + pop + gdp.pc, data = a.out$imputations[[i]])
+ b.out <- rbind(b.out, ols.out$coef)
+ se.out <-rbind(se.out, coef(summary(ols.out))[,2])
+ }
> combined.results<-mi.meld(q=b.out, se = se.out)
> combined.results
The
returned information reflects the combined values and the combined
standard error. There seems to be no way to get the t-statistic,
degrees of freedom, or f-values using mi.meld().
Without having to do a manual calculation or use Stata, how can I obtain the degrees of freedom, t-statistic, and f-values?
Thanks.
Dear Developers,
I would be interested in the function you used for the sim() command to
create the 95% confidence intervals for the predicted probabilities. (I
have not found it in the documentation).
I've managed to get the predicted probabilities with the /qi$ev, and
then replicate the mean and sd, but I seem to be too ignorant to
replicate the confidence intervals that the sim() command gives me. I
originally thought it was a basic calculation used for any normal
distribution, but that did not yield identical results.
My code, that I tried was the following:
GAPtrend <- zelig(as.factor(jp) ~
youngA+transgen+statshift+lostgen+male+educ+empl+inc+ind+egali,
model="mlogit", data=isjp)
transEMPL <- setx(GAPtrend, youngA=1, transgen=0,
statshift=0,lostgen=0, male=1, educ=4, empl=1, inc=52, ind=0, egali=0)
pp.transEMPL <- sim(GAPtrend, x = transEMPL)
predicted<-pp.transEMPL$qi$ev
###Calculation of CI
lower <- mean(predicted$Pr.Y.0.)-1.96*(sd(predicted$Pr.Y.0.))
upper <- mean(predicted$Pr.Y.0.)+1.96*(sd(predicted$Pr.Y.0.))
Thank you in advance for your help,
Zsófia Ignácz
--
Zsófia Ignácz
PhD Student
Humboldt Universität zu Berlin
Berlin Graduate School of Social Sciences European PhD for
Social-economic
and Social Statistics
Email: zsofia.ignacz(a)sowi.hu-berlin.de
Research Associate
Jacobs University Bremen
Campus Ring 1
D-28759 Bremen
Germany
Phone: +49 421 200-3012
Email: z.ignacz(a)jacobs-university.de
URL: http://jacobs-university.de/directory/zignacz