Zelig January 2008

zelig@lists.gking.harvard.edu

9 participants
7 discussions

predicted probabilities with zelig and imputed data

by Amber Wichowsky

Hello,I have estimated a logistic regression model with Zelig using the mi command to combine multiple imputed datasets. I understand that the summary output is the combined result and that I can also print out individual results by dataset. However, in calculating predicted probabilities, I am wondering which coefficients are being used by Zelig - would these be from the combined results? In addition, when setting the independent variables to their means (setx command) - would this be some weighted mean of that variable across those five datasets? Finally, could someone clarify the difference between the expected values and the predicted values that are given in the output. Thank you, Amber Wichowsky University of Wisconsin-Madison

16 years, 1 month

Basic question on counterfactual simulation

by 宋时歌

Dear Colleagues: I am using Zelig to estimate a population model and would like to do some counterfactual simulation based on the results. Both the dependent (whether have a newly born boy) and the main independent variable of interest (whether had a prior abortion) are binary ones. Zelig can handle questions like: what is the difference in probability of having a boy between those who had prior abortion and those who had not. I want to go a step further and ask: give the estimated model, if 50% of the total women had abortion, what the percentage of boys will be in all the newly born babies? What if only 25% of the women had abortion? And what about 75% of women had abortion? etc. Is there an easy way to do this? Thanks. Best, Shige - Zelig Mailing List, served by Harvard-MIT Data Center Send messages: zelig(a)lists.gking.harvard.edu [un]subscribe Options: http://lists.gking.harvard.edu/?info=zelig Zelig program information: http://gking.harvard.edu/zelig/

16 years, 3 months

Problems with memory size and workaround

by 宋时歌

Dear Colleagues: In the process of simulating the sex ratio at birth using Zelig, I encountered a memory problem. Here is my program: ------------------- z.out <- zelig(...) x.0 <- setx(z.out, abort = 0, fn = NULL) x.1 <- setx(z.out, abort = 1, fn = NULL) s.out <- sim(z.out, x=x.0, x1=x.1) summary(s.out) ------------------- The first three statements ran fine. But I got error message with the fourth statement saying something like "cannot allocate vector size of 380 MB". If I get rid of the "fn = NULL" option, the problem goes away. is this because my data is too big? I am running Ubuntu linux on a core2duo laptop with 2 GB of memory. My data file contains about 50,000 observations. The workaround (other than buying a new computer with 64-bit processor) is draw a random sample from the data file and do the counterfactual simulation on that smaller data, something like: ------------------------ z.out <- zelig(...) sd <- d[sample(1:nrow(d), 2000, replace = F),] x.0 <- setx(z.out, abort = 0, fn = NULL, data=sd) x.1 <- setx(z.out, abort = 1, fn = NULL, data=sd) s.out <- sim(z.out, x=x.0, x1=x.1) summary(s.out) ----------------------- Is this a viable option? Are there other options? Thanks. Best, Shige - Zelig Mailing List, served by Harvard-MIT Data Center Send messages: zelig(a)lists.gking.harvard.edu [un]subscribe Options: http://lists.gking.harvard.edu/?info=zelig Zelig program information: http://gking.harvard.edu/zelig/

16 years, 3 months

Re: Predicted values for lm.mixed + patches

by Delia Bailey

Hi Gregor, You are correct that the predicted values should be draws from the normal distribution. Thanks for catching this! Additionally, the code benefits from your clean-up proposals ... pulling definitions out of loops, adding stop() errors, updating dependencies, etc. I appreciate your edits. My code can virtually always benefit from an editor! Finally, the log-normal support is a straightforward extension. Thanks for the additional utility. So, in short, thank you for your careful attention and contribution to the public good. Much appreciation, Delia ************************************************ Delia Bailey Washington University in St. Louis School of Law, CERL Campus Box 1120 One Brookings Drive St. Louis, MO 63130 dbailey(a)wustl.edu http://delia.wustl.edu ************************************************ Begin forwarded message: > From: Gregor Gorjanc <gregor.gorjanc(a)bfro.uni-lj.si> > Date: January 8, 2008 1:03:25 PM EST > To: zelig(a)lists.gking.harvard.edu > Subject: [zelig] Re: Predicted values for lm.mixed + patches > > Hi! > > I fixed the issue with predicted values for lm.mixed plus some > other minor > tweaks. I uploaded the files as noted bellow. Please let me know > what you > think about these fixes/additions. > > Regards, Gregor > > ---------------------------------------------------------------------- > ---- > > http://gregor.gorjanc.googlepages.com/qi.mixed.R > > - fixed issue with predicted values > > - added two errors i.e. stop() if family or link not provided by > qi.mixed.R is being used. Additionally, I added support for log- > normal > model to above file, so that one can use > > z.out1 <- zelig(income ~ education + age + female + tag(1 | state), > data=voteincome, model="ls.mixed", family=gaussian > (link="log")) > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > ---------------------------------------------------------------------- > ---- > > http://gregor.gorjanc.googlepages.com/ls.mixed.Rnw > > - added note about support for log-normal mixed model > > ---------------------------------------------------------------------- > ---- > > http://gregor.gorjanc.googlepages.com/param.mixed.R > > - removed duplicated code so that it is easier to understand what > is done > > ---------------------------------------------------------------------- > ---- > > http://gregor.gorjanc.googlepages.com/gamma.mixed.Rnw > http://gregor.gorjanc.googlepages.com/logit.mixed.Rnw > http://gregor.gorjanc.googlepages.com/poisson.mixed.Rnw > http://gregor.gorjanc.googlepages.com/probit.mixed.Rnw > > - changed Matrix package to lme4, since lmer is now back in lme4 > > - > Zelig Mailing List, served by Harvard-MIT Data Center > Send messages: zelig(a)lists.gking.harvard.edu > [un]subscribe Options: http://lists.gking.harvard.edu/?info=zelig > Zelig program information: http://gking.harvard.edu/zelig/ > > ************************************************ Delia Bailey Washington University in St. Louis School of Law, CERL Campus Box 1120 One Brookings Drive St. Louis, MO 63130 dbailey(a)wustl.edu http://delia.wustl.edu ************************************************

16 years, 3 months

can we use zelig with survey data?

by Lim, Nelson

I apologize if I am missing something ovious. I am analyzing the Current Population Survey (CPS) which uses the multi-stage stratified samples. __________________________________________________________________________ This email message is for the sole use of the intended recipient(s) and may contain confidential information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message.

16 years, 3 months

Add *.mixed models to plot.ci

by Gregor Gorjanc

"lm.mixed", "logit.mixed", "probit.mixed", "gamma.mixed", "poisson.mixed" - Zelig Mailing List, served by Harvard-MIT Data Center Send messages: zelig(a)lists.gking.harvard.edu [un]subscribe Options: http://lists.gking.harvard.edu/?info=zelig Zelig program information: http://gking.harvard.edu/zelig/

16 years, 3 months

Predicted values for lm.mixed

by Gregor Gorjanc

Hello! I noticed that the predicted vaues (PR) in ls.mixed are not caulculated as in other {poisson,gamma,probit,logit}.mixed models. In ls.mixed the PR are equal to Y = Xbeta + Zb while PR in other *.mixed models are equal to rdist(linkinv(eta)) where eta = Xbeta + Zb Current implementation for PR in ls.mixed is not predicting the values from the normal distribution, but just adding variability from beta and b posteriors to fitted values (Xbeta + Zb). If I got the whole thing right, the code in qi.mixed.R (from line 95) should be else if (class(object) == "lmer"){ ev <- betas %*% t(as.matrix(fTerms)) mu <- ev ## For predicted values, add in random effects draws for (i in 1:length(rTerms)){ mu <- mu + gammas[[names(rTerms[i])]] %*% t(as.matrix(rTerms[[i]])) } ## Sample from normal distribution with given mean and variance n <- length(mu[,1]) for (i in 1:ncol(mu)){ pr[,i] <- rnorm(n, mean=mu[,i], sd=attr(VarCorr(object), "sc")) } } Of course ls.mixed vignette should be updated. My proposal is: The predicted values (qi$pr) are draws from the gaussian (normal) distribution defined by mean $\mu_{ij}$ and variance $\sigma^2$, $$\mu_{ij} = X_{ij}\beta + Z_{ij} b_{i}$$ given $X_{ij}$ and $Z_{ij}$ and simulations of $\beta$ and $b_{i}$ from their posterior distributions. The estimated variance covariance matrices are taken as correct and are themselves not simulated. Additionally the code in qi.mixed.R uses length(mut[,i]) or length(mut[i,]) within each loop of prediction. I propose the following change starting on line 65 n <- length(mut[, 1]) if (family == "binomial"){ ev <- theta for (i in 1:ncol(mut)){ pr[,i] <- as.character(rbinom(n, 1, mut[,i])) } if (!is.null(y)) { if (NCOL(y) > 1) { y <- y[,1] } } } else if (family == "Gamma"){ ev <- theta * 1/alpha n <- length(mut[i,]) for (i in 1:nrow(mut)){ pr[i,] <- rgamma(n, shape = mut[i,], scale= 1/alpha) } } else if (family == "poisson"){ ev <- theta for (i in 1:ncol(mut)){ pr[,i] <- rpois(n, lambda = mut[,i]) } } Regards, Gregor - Zelig Mailing List, served by Harvard-MIT Data Center Send messages: zelig(a)lists.gking.harvard.edu [un]subscribe Options: http://lists.gking.harvard.edu/?info=zelig Zelig program information: http://gking.harvard.edu/zelig/

16 years, 3 months

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

2005

2004

2003

Zelig January 2008