Hi,
First, please update Zelig to the most current version. You can do so with
install.packages("Zelig", repos = "http://gking.harvard.edu")
Next, I am not sure exactly what you are trying to do. Perhaps you can
elaborate a bit more. Note that you should not use quotation marks or
list-indexing expressions (e.g., [[1]]) inside an R formula. That may be
why you are getting those errors.
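A minimal sketch of one way to loop over columns, assuming the goal is to fit
each gene column against Position: R does not evaluate the terms of a formula,
so writing genename or genelist[[4]] inside one makes R look for a column with
literally that name. Building the formula from text with as.formula(paste(...))
avoids this. The toy data frame and the column names geneA and geneB are
invented for illustration, and glm() stands in for zelig() so that the sketch
runs without Zelig installed.

```r
# Toy stand-in for rawdata: the first column is Position, the remaining
# columns are 0/1 outcomes. Names and values are made up for illustration.
set.seed(1)
rawdata <- data.frame(
  Position = 1:200,
  geneA = rbinom(200, 1, 0.2),
  geneB = rbinom(200, 1, 0.1)
)

genelist <- names(rawdata)[-1]             # every column except Position
z.out <- vector("list", length(genelist))  # pre-allocate a list for the fits
names(z.out) <- genelist

for (genename in genelist) {
  # Paste the column name into the formula text, then convert it to a formula.
  f <- as.formula(paste(genename, "~ Position"))
  # glm() is a stand-in here; with Zelig loaded, the analogous call would
  # presumably be zelig(f, model = "relogit", data = rawdata).
  z.out[[genename]] <- glm(f, family = binomial, data = rawdata)
}
```

Each fit can then be retrieved by name, for example z.out[["geneA"]]. Storing
results with [[ ]] in a pre-allocated list keeps each model object whole.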
Kosuke
On Mon, 26 May 2008, Michael A. Gilchrist wrote:
Hi,
I am trying to apply a separate rare-events logistic regression (relogit) to
each of the thousands of independent datasets in a combined data file, running
a statistical test on each one. However, I cannot figure out the proper syntax
for taking the name of each dataset from the file's header and applying a
model to that data.
Although I have some programming experience, I will admit I don't find R very
intuitive (hence my hopes are high for using Zelig). I've been struggling
with this issue all day and, as a result, I am incredibly frustrated. Any
help would be greatly appreciated.
More details.....
I'm running R 2.6.1 and Zelig 3.1-0 on a Linux machine. I want to run
relogit on multiple datasets (~5000 different datasets in all). I have all
of the data combined together into a single file which I load using
rawdata = read.csv("filename.csv").
In the csv file, each dataset is in its own column and has a unique
header. The first column is the X variable. An example of the dataset would
be,
Position, YAL058W, YAL063C, ...
1, NA, 0, ...
2, 0, 0, ...
3, 1, 0, ...
.
.
.
If I use the command,
zelig(formula = YAL058W ~ Position, model="relogit",data=rawdata)
Zelig runs fine returning:
--------
Call: zelig(formula = YAL058W ~ Position, model = "relogit", data = rawdata)
Coefficients:
(Intercept) Position
-3.43368 -0.00280
Degrees of Freedom: 475 Total (i.e. Null); 474 Residual
Null Deviance: 73
Residual Deviance: 71.7 AIC: 75.7
Rare events bias correction performed
--------
So I thought, "Cool! I can write a for loop and apply the model to each
dataset." However, running
genelist=names(rawdata)
for(i in 2:20){
genename = genelist[[i]];
z.out[i]<-zelig(Position~genename,model="relogit",data=rawdata)
}
produces the error,
------
Error in `[.data.frame`(d, , all.vars(as.expression(formula))) :
undefined columns selected
-----
Confused by this, I ran variations of the code which also produced errors.
For example,
z.out<-zelig("YAL058W" ~ Position, model="relogit",data=rawdata)
Error in 1 - "YAL058W" :
non-numeric argument to binary operator

z.out<-zelig("YAL058W" ~ "Position", model="relogit",data=rawdata)
Error in complete.cases(d[, all.vars(as.expression(formula))]) :
negative length vectors are not allowed
So, I thought there was an issue with the fact that I was using a string for
a variable name. However, even if I run
genelist = lapply(names(rawdata),as.name)
so that the object genelist is a list of expressions(?)
Position, YAL001W, YAL003C, ...
and
genelist[[4]]
returns YAL058W (without quotes).
Unfortunately, I still have a problem if I run
zelig(formula = genelist[[4]] ~ Position, model="relogit",data=rawdata)
Error in `[.data.frame`(d, , all.vars(as.expression(formula))) :
undefined columns selected
Now I'm completely confused. Could anyone provide some guidance here?
Thanks in advance.
Mike
-----------------------------------------------------
Department of Ecology & Evolutionary Biology
569 Dabney Hall
University of Tennessee
Knoxville, TN 37996-1610
phone:(865) 974-6453
fax: (865) 974-6042
web:
http://eeb.bio.utk.edu/gilchrist.asp
-----------------------------------------------------
-
Zelig Mailing List, served by Harvard-MIT Data Center
Send messages: zelig(a)lists.gking.harvard.edu
[un]subscribe Options:
http://lists.gking.harvard.edu/?info=zelig
Zelig program information:
http://gking.harvard.edu/zelig/