Zelig May 2008

zelig@lists.gking.harvard.edu

10 participants
9 discussions

Question regarding multiple records per subject in cox proportional hazards model

by aleman＠fordham.edu

Dear Zelig users,

A colleague and I are trying to use the survival package to run Cox models in Zelig. We are using a partitioned likelihood approach to model several kinds of events using Cox models. Since we have panel data, we have multiple records per subject and the possibility that a country may experience more than one type of event. Practically speaking, this will require us to estimate 4 models, one for each risk. The basic idea here is that we treat the J-1 remaining risks as if they are right-censored cases. This isolates the risk of interest.

In Stata this is easy: you just use the id() option. In R this doesn't seem to be so straightforward, because the cluster function is the equivalent of the cluster() option in Stata, but the equivalent of the id() option in Stata does not seem to have been implemented. Another alternative is to use the strata() option to stratify by event type, but the usual and mostly valid complaint against this model is that covariate effects are constrained to be equal across risks.

Is there a way to model multiple records per subject in R while allowing for varying covariate effects across risks?

Thanks,

Jose Aleman, Ph.D.
Assistant Professor
Political Science Department
Fordham University

- Zelig Mailing List, served by Harvard-MIT Data Center
Send messages: zelig(a)lists.gking.harvard.edu
[un]subscribe Options: http://lists.gking.harvard.edu/?info=zelig
Zelig program information: http://gking.harvard.edu/zelig/
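The approach described in this message (one Cox model per risk, with the other risks treated as right-censored) might be sketched in plain R roughly as follows. This is only an illustration under stated assumptions: it calls coxph() from the survival package directly rather than going through Zelig, the data are simulated, and the names d, id, event_type, x and status_k are hypothetical. It uses counting-process records (start, stop] for multiple rows per subject and cluster(id) for robust standard errors; whether that matches Stata's id() behaviour exactly is not something this sketch establishes.

```r
library(survival)

set.seed(1)
n_subjects <- 50
n_periods  <- 4
n <- n_subjects * n_periods

# Simulated panel: one row per subject-period in (start, stop] form.
# event_type: 0 = no event, 1 or 2 = which risk occurred (hypothetical coding).
d <- data.frame(
  id         = rep(seq_len(n_subjects), each = n_periods),
  start      = rep(0:(n_periods - 1), times = n_subjects),
  stop       = rep(1:n_periods, times = n_subjects),
  x          = rnorm(n),
  event_type = sample(0:2, n, replace = TRUE, prob = c(0.7, 0.15, 0.15))
)

# One model per risk: rows where a different risk occurred are
# treated as censored (status 0) for the risk of interest.
risks <- c(1, 2)
fits <- lapply(risks, function(k) {
  d$status_k <- as.integer(d$event_type == k)
  coxph(Surv(start, stop, status_k) ~ x + cluster(id), data = d)
})
names(fits) <- paste0("risk", risks)

# Coefficients may differ across risks because each model is fit separately.
sapply(fits, coef)
```

Because each risk gets its own fit, covariate effects are not constrained to be equal across risks, which is the concern raised about the stratified alternative.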

15 years, 11 months

another amelia question

by Donald Braman

I'm attempting to use AmeliaII in the following way:

    vars_to_impute <- gundata[,c("progun", "egalitarianism", "individualism",
        "crfear", "victim", "female", "RACE", "income", "URBANKID", "URBANNOW",
        "RELIGION", "iss", "democrat", "conservative")]
    imputed <- amelia(data=vars_to_impute, p2s=2,
        noms=c("RACE","RELIGION","URBANKID","URBANNOW"),
        outname="imputation", ords=c("democrat", "conservative"))

The problem that I run into is that about half of the imputation attempts fail due to non-invertible covariance matrices. I'm curious if there is any way to deal with this aside from removing the most highly covariant variables. For example, given that it produces imputed data about 1/5 of the time, would it be acceptable to use successful imputations? E.g., can I just set m=100 and use as many imputations as I like from the resulting set of ~25 imputations?

If I do need to remove the covariant variables, do you know of a simple way to check for that among a given set of variables?

--
Donald Braman
http://www.law.gwu.edu/Faculty/profile.aspx?id=10123
http://research.yale.edu/culturalcognition
http://ssrn.com/author=286206
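On the last question, one simple way to look for collinearity among a set of numeric variables is to combine a pairwise correlation screen with a rank check. The following is a minimal base-R sketch on simulated data; the variable names and the 0.9 threshold are arbitrary assumptions, and it says nothing about how Amelia itself handles such variables.

```r
set.seed(42)
n  <- 100
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- x1 + x2                   # exact linear combination of x1 and x2
x4 <- x1 + rnorm(n, sd = 0.01)  # near-duplicate of x1
X  <- cbind(x1 = x1, x2 = x2, x3 = x3, x4 = x4)

# Work on complete rows only, since real data would contain NAs.
X <- X[complete.cases(X), , drop = FALSE]

# 1. Pairwise screen: pairs with absolute correlation above a threshold.
R    <- cor(X)
high <- which(abs(R) > 0.9 & upper.tri(R), arr.ind = TRUE)
high_pairs <- data.frame(
  var1 = rownames(R)[high[, "row"]],
  var2 = colnames(R)[high[, "col"]],
  r    = R[high]
)

# 2. Rank check: a rank below ncol(X) signals an exact linear dependency,
#    which pairwise correlations can miss (as with x3 here).
rank_X <- qr(X)$rank

high_pairs
rank_X < ncol(X)
```

The pairwise screen flags the near-duplicate pair, while the rank check catches the three-way dependency that no single correlation reveals.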

15 years, 11 months

referring to names of dataset with Zelig

by Michael A. Gilchrist

Hi,

I am trying to apply a separate rare events logistic regression (relogit) to each part of a combined dataset that consists of thousands of independent datasets, running a statistical test on each one. However, I cannot seem to figure out the proper syntax for taking the name for each dataset from the file's header and applying a model to this data.

Although I have some programming experience, I will admit I don't find R very intuitive (hence my hopes are high for using Zelig). I've been struggling with this issue all day and, as a result, I am incredibly frustrated. Any help would be greatly appreciated.

More details.....

I'm running R 2.6.1 and Zelig 3.1-0 on a Linux machine. I want to run relogit on multiple datasets (~5000 different datasets in all). I have all of the data combined together into a single file which I load using rawdata = read.csv("filename.csv"). In the csv file, each dataset is in its own column and has a unique header. The first column is the X variable. An example of the dataset would be,

    Position, YAL058W, YAL063C, ...
    1, NA, 0, ...
    2, 0, 0, ...
    3, 1, 0, ...
    . . .

If I use the command,

    > zelig(formula = YAL058W ~ Position, model="relogit",data=rawdata)

Zelig runs fine returning:

    --------
    Call: zelig(formula = YAL058W ~ Position, model = "relogit", data = rawdata)

    Coefficients:
    (Intercept)     Position
       -3.43368     -0.00280

    Degrees of Freedom: 475 Total (i.e. Null); 474 Residual
    Null Deviance: 73
    Residual Deviance: 71.7    AIC: 75.7
    Rare events bias correction performed
    --------

So I thought, "Cool! I can write a for loop and apply the model to each dataset." However, running

    > genelist=names(rawdata)
    > for(i in 2:20){ genename = genelist[[i]]; z.out[i]<-zelig(Position~genename,model="relogit",data=rawdata) }

produces the error,

    ------
    Error in `[.data.frame`(d, , all.vars(as.expression(formula))) :
      undefined columns selected
    -----

Confused by this, I ran variations of the code which also produced errors. For example,

    > z.out<-zelig("YAL058W" ~ Position, model="relogit",data=rawdata)
    Error in 1 - "YAL058W" : non-numeric argument to binary operator
    > z.out<-zelig("YAL058W" ~ "Position", model="relogit",data=rawdata)
    Error in complete.cases(d[, all.vars(as.expression(formula))]) :
      negative length vectors are not allowed

So, I thought there was an issue with the fact that I was using a string for a variable name. However, even if I run

    > genelist = lapply(names(rawdata),as.name)

so that the object genelist is a list of expressions(?) Position, YAL001W, YAL003C, ... and

    > genelist[[4]]

returns YAL058W (w/o ""), I still have a problem if I run

    > zelig(formula = genelist[[4]] ~ Position, model="relogit",data=rawdata)
    Error in `[.data.frame`(d, , all.vars(as.expression(formula))) :
      undefined columns selected

Now I'm completely confused. Could anyone provide some guidance here?

Thanks in advance.

Mike

-----------------------------------------------------
Department of Ecology & Evolutionary Biology
569 Dabney Hall
University of Tennessee
Knoxville, TN 37996-1610
phone:(865) 974-6453
fax: (865) 974-6042
web: http://eeb.bio.utk.edu/gilchrist.asp
-----------------------------------------------------
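The errors in this message are consistent with the formula containing the literal symbol genename (or genelist[[4]]) rather than the column name it holds, since R does not evaluate the terms of a formula. A common base-R idiom is to build the formula from a string with as.formula(paste(...)) and to store each fit in a list with [[ ]]. The sketch below is a hedged illustration only: the data are simulated, and glm() with a binomial family stands in for zelig(..., model = "relogit"); passing the same formula object to zelig() is an assumption not tested here.

```r
set.seed(1)

# Simulated stand-in for rawdata: first column is the X variable,
# every other column is one binary outcome.
rawdata <- data.frame(
  Position = 1:200,
  YAL058W  = rbinom(200, 1, 0.1),
  YAL063C  = rbinom(200, 1, 0.1)
)

genelist <- names(rawdata)[-1]          # outcome column names as strings
fits <- vector("list", length(genelist))
names(fits) <- genelist

for (g in genelist) {
  # Build e.g. YAL058W ~ Position from the string held in g.
  f <- as.formula(paste(g, "~ Position"))
  # glm() is a stand-in for the relogit model in this sketch.
  fits[[g]] <- glm(f, family = binomial, data = rawdata)
}

# One row of coefficients per outcome column.
t(sapply(fits, coef))
```

Storing results with fits[[g]] rather than z.out[i] matters because each fitted model is itself a list; single-bracket assignment would try to splice its components into the container.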

15 years, 11 months

simple question

by Donald Braman

I'm trying to use Zelig and Amelia. I've tried it a few different ways, but can't seem to get it right. Here's what I've tried:

## first I impute the missing data, which goes just splendidly

    vars_to_impute <- gundata[c("progun", "egalitarianism", "individualism",
        "crfear", "victim", "female", "black", "white", "income", "ruralkid",
        "ruralnow", "protestant", "jewish", "iss", "democrat", "cons")]
    imputed <- amelia(data=vars_to_impute)

## Then I tried this

    z.out <- zelig(progun ~ egalitarianism + individualism , model = "ls", data = imputed)

## but I just get this error:
## Error in data.frame(m = 5, idvars = NULL, logs = NULL, ts = NULL, cs = NULL, :
##   arguments imply differing number of rows: 1, 0, 13, 8, 16, 21

## I've also tried this

    inmi1 <- read.csv("outdata1.csv")
    inmi2 <- read.csv("outdata2.csv")
    inmi3 <- read.csv("outdata3.csv")
    inmi4 <- read.csv("outdata4.csv")
    inmi5 <- read.csv("outdata5.csv")
    z.out <- zelig(progun ~ egalitarianism + individualism , model = "ls",
        data = mi(immi1, immi2, immi3, immi4, immi5))

## which produces this error:
## Error in `[.data.frame`(d, , all.vars(as.expression(formula))) :
##   undefined columns selected

## I'm sure it's a simple error on my part, I just can't figure out what it might be.

Don

--
Donald Braman
http://www.law.gwu.edu/Faculty/profile.aspx?id=10123
http://research.yale.edu/culturalcognition
http://ssrn.com/author=286206
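For context on what fitting a model to several imputed data sets involves, the standard procedure is to fit the same model to each completed data set and pool the results with Rubin's rules. The base-R sketch below illustrates that pooling only; it is not Zelig's or Amelia's code, the five data frames are simulated stand-ins for imputation output, and the names imps, make_data and pooled are hypothetical.

```r
set.seed(7)

# Stand-in for five completed (imputed) data sets; real ones would come
# from the imputation step, e.g. the outdata*.csv files.
make_data <- function() {
  x <- rnorm(50)
  data.frame(x = x, y = 1 + 2 * x + rnorm(50))
}
imps <- replicate(5, make_data(), simplify = FALSE)

# Fit the same least-squares model to each data set.
fits <- lapply(imps, function(d) lm(y ~ x, data = d))

est  <- sapply(fits, coef)                        # coefficients, one column per fit
vars <- sapply(fits, function(f) diag(vcov(f)))   # squared standard errors

m     <- length(fits)
q_bar <- rowMeans(est)              # pooled point estimate
w_bar <- rowMeans(vars)             # within-imputation variance
b     <- apply(est, 1, var)         # between-imputation variance
t_var <- w_bar + (1 + 1 / m) * b    # total variance (Rubin's rules)

pooled <- cbind(estimate = q_bar, se = sqrt(t_var))
pooled
```

The pooled standard error is never smaller than the average within-imputation standard error, because the between-imputation term adds the uncertainty due to missing data.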

15 years, 11 months

RE: A question about Zelig

by Kosuke Imai

Hi Bilal,

> Thanks for your previous help. Sorry to bother you again. I have a few
> other questions that I need help with respect to Zelig.
>
> 1. I am trying to use e.i. R into C in order to calculate quantities of
> interest at precinct level. I tried
>
> s.out$qi$ev
>
> but I got the simulation results instead of precinct level results I
> believe. How can I get precinct level results in e.i. R into C along
> with confidence intervals?

The contributors of this model should be able to answer your question. I will forward your message to them just in case they miss this message.

> 2. I have many missing values in my data as well as in many precincts a
> particular political party candidate did not contest. I can not use list
> wise deletion as it will really make the data very short and un
> reliable. What should I do in this case?

Multiple imputation is a standard approach to the missing data problem, but in this case I assume that these data are not actually missing in the usual sense (they are missing because there are no candidates). In principle, there are ways to deal with this kind of problem, but I cannot think of any software that does this.

> 3. What about complex sample module. Is it ready for use?

Yes, they are there now! See survey.xxx models.

Kosuke

> Thanks again for your help
>
> Regards, Bilal Hassan Khan
>
>> Date: Mon, 10 Mar 2008 14:45:54 -0400
>> From: kimai(a)Princeton.EDU
>> To: sbhk597(a)hotmail.com
>> CC: zelig(a)latte.harvard.edu
>> Subject: Re: A question about Zelig
>>
>> We are working on these complex survey models right now. We hope to be
>> able to release these models very soon. Stay tuned...
>>
>> Kosuke
>>
>> ---------------------------------------------------------
>> Kosuke Imai               Office: Corwin Hall 041
>> Assistant Professor       Phone: 609-258-6601
>> Department of Politics    eFax: 973-556-1929
>> Princeton University      Email: kimai(a)Princeton.Edu
>> Princeton, NJ 08544-1012  http://imai.princeton.edu/
>> ---------------------------------------------------------
>>
>> On Mon, 10 Mar 2008, Bilal Khan wrote:
>>
>>> Hi
>>>
>>> I need some help on Zelig. I have been trying to subscribe Zelig list
>>> but have not been successful. I hope you can help me with few problems
>>> in Zelig.
>>>
>>> First of all, I would like to know how to weight by a weight variable
>>> while building models in Zelig. Lots of survey data have weights and
>>> one uses weights while calculating quantities of interest however, I
>>> don't know how to use a weight variable while building models in Zelig.
>>>
>>> I would be looking forward to your reply. Thanks again for your kind
>>> help in this regard.
>>>
>>> Regards, Bilal Hassan Khan

15 years, 11 months

Problem with setx after using as.factor in the regression

by Andrew Stokes

Hello,

When I run a regression using as.factor to create indicator variables from categorical variables, the subsequent simulation steps using setx and sim do not work. When I run the regression where the indicator variables were created beforehand, the simulation steps work.

So for example, I run

    z.out<-zelig(severe_anemic ~ net_status + anc_visit + as.factor(age) +
        as.factor(wealth) + as.factor(region), model = "logit", data = my.data)

And then I run:

    x.out0<-setx(z.out, net_status = 0)
    x.out1<-setx(z.out, net_stutus = 1)
    s.out<-sim(z.out, x=x.out0, x1=x.out1, num=1000)

After each of the preceding lines, I receive this error:

    Error in `contrasts<-`(`*tmp*`, value = "contr.treatment") :
      contrasts can be applied only to factors with 2 or more levels

Thanks for any advice!

Andrew Stokes
Institute for Health Metrics and Evaluation
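The workaround mentioned in this message (preparing the categorical variables before fitting instead of calling as.factor() inside the formula) can be sketched in base R as below. This is an illustration under assumptions: the data are simulated, only some of the variables from the message are included, and glm() plus predict() stand in for zelig(), setx() and sim(); it does not show what setx() does internally.

```r
set.seed(3)
n <- 300

# Simulated stand-in for my.data with integer-coded categories.
my.data <- data.frame(
  net_status = rbinom(n, 1, 0.5),
  age        = sample(1:3, n, replace = TRUE),
  region     = sample(1:4, n, replace = TRUE)
)
my.data$severe_anemic <- rbinom(n, 1, plogis(-1 - 0.5 * my.data$net_status))

# Convert to factors once, in the data, so the model formula needs no
# as.factor() calls and the levels are stored with each column.
my.data$age    <- factor(my.data$age)
my.data$region <- factor(my.data$region)

fit <- glm(severe_anemic ~ net_status + age + region,
           family = binomial, data = my.data)

# Two covariate profiles differing only in net_status. Each factor keeps
# the full set of levels even though a single value is used.
newd <- data.frame(
  net_status = c(0, 1),
  age        = factor("1", levels = levels(my.data$age)),
  region     = factor("1", levels = levels(my.data$region))
)

p <- predict(fit, newdata = newd, type = "response")
p
```

A factor built from a single value without its original levels has only one level, which is the situation the reported contrasts error complains about; carrying the levels along avoids that in this sketch.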

15 years, 11 months

Plotting Survival Curves with Uncertainty Estimates

by Gorjanc Gregor

Hi,

John Graves[1] wrote a function plot.surv() that can be used with output from Zelig's Cox proportional hazards model. Perhaps this function could be included in the Zelig package.

[1] http://www.iq.harvard.edu/blog/sss/archives/2008/05/plotting_surviv.shtml

--
Lep pozdrav / With regards,
Gregor Gorjanc

----------------------------------------------------------------------
University of Ljubljana     PhD student
Biotechnical Faculty        www: http://gregor.gorjanc.googlepages.com
Zootechnical Department     blog: http://ggorjan.blogspot.com
Groblje 3                   mail: gregor.gorjanc <at> bfro.uni-lj.si
SI-1230 Domzale             fax: +386 (0)1 72 17 888
Slovenia, Europe            tel: +386 (0)1 72 17 861
----------------------------------------------------------------------

15 years, 12 months

Zelig for R 2.70

by Shige Song

Dear All,

I noticed the version of Zelig on CRAN is 3.10 instead of the most recent 3.21. I tried to install from Zelig's official web site, with mixed results. On Linux (Ubuntu 8.04), I can grab the source and compile without problems; on Windows, I got an error message saying something like "failed to get index". Is there a way to ignore this error message and proceed? By the way, I am running R 2.70.

Thanks.

Shige

16 years

Random Effects with MI data

by Didi Kuo

Dear Zelig List,

I am trying to estimate a random effects logit model using multiply imputed data, but I get the following error message:

    > out<-zelig(cwarcons~rival_lag+weak10+civwar_nb+lref_totalnbl+lpopl
    + +lrgdp96l+polity2l+politysq+ethfrac+peace1+eeurop+lamerica+ssafrica+asia+
    + nafrme+lmtnest+elevdiff+Oil+instab+tag(1|ccode),data=mi(m1,m2,m3,m4,m5),
    + model="logit.mixed")
    > summary(out)
    Error in apply(coef1, 1, mean) : dim(X) must have a positive length
    In addition: Warning messages:
    1: In x$coef : $ operator is invalid for atomic vectors, returning NULL
    2: In x$coef : $ operator is invalid for atomic vectors, returning NULL
    3: In x$coef : $ operator is invalid for atomic vectors, returning NULL
    4: In x$coef : $ operator is invalid for atomic vectors, returning NULL
    5: In x$coef : $ operator is invalid for atomic vectors, returning NULL

Any ideas about how I might fix this?

Thanks,
Didi Kuo

16 years
