Try something like
setx(z.out, fn = NULL, data = myFile[1:10,])
if you want to do predictions using a data set that is different from what
you used to create z.out.
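In Zelig, the object returned by setx() is normally passed on to sim() to obtain the quantities of interest. As a cross-check that does not need Zelig or the original file, here is a minimal sketch. It assumes that the oprobit model corresponds to an ordered probit fitted with polr() from the MASS package, which the summary output quoted below resembles. The data frame sim_data, the response grade, and the cut points are hypothetical stand-ins, not the tumor data.

```r
# Sketch only: simulated stand-in data, NOT the tumor data from the thread.
library(MASS)  # provides polr() for ordered probit/logit models

set.seed(1)
n <- 500
sim_data <- data.frame(X1 = rnorm(n), X2 = rnorm(n))

# Hypothetical latent score plus noise, cut into 4 ordered categories.
latent <- 0.8 * sim_data$X1 - 0.5 * sim_data$X2 + rnorm(n)
sim_data$grade <- cut(latent, breaks = c(-Inf, -1, 0, 1, Inf),
                      labels = 1:4, ordered_result = TRUE)

# Ordered probit fit; Hess = TRUE stores the Hessian for standard errors.
fit <- polr(grade ~ X1 + X2, data = sim_data, method = "probit", Hess = TRUE)

# Predicted probability of each category for the first 10 rows,
# analogous to the myFile[1:10,] subset in this thread.
probs <- predict(fit, newdata = sim_data[1:10, ], type = "probs")
print(round(probs, 3))
```

Each row of probs is one observation and each column one category, so every row sums to 1. For a holdout file, the same predict() call would take a data frame read from that file as newdata, provided it has the same predictor columns.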
Kosuke
-----------------------------------------------------
Kosuke Imai Office: Corwin Hall 041
Assistant Professor Phone: 609-258-6601
Department of Politics eFax: 973-556-1929
Princeton University Email: kimai(a)Princeton.Edu
Princeton, NJ 08544-1012
-----------------------------------------------------
On Tue, 25 Jul 2006, dysgraphia2325 wrote:
G'day Zeligers,
Learning R and Zelig, thank you for the help so far!
Running some code for oprobit analysis gives me:
myFile = read.csv(file = "C:/TestSetForR.csv", header = TRUE)
names(myFile)
[1] "X1" "X2" "X3" "X4" "X5" "X6"
[7] "Tumor_1_8"
#Dependent variable, tumor grading 1 to 8
#Independent variables, X1 to X6 histochemical marker values
dim(myFile)
[1] 1009 7
summary(myFile)
       X1              X2              X3             X4              X5
 Min.   : 0.00   Min.   :  3.0   Min.   : 0.0   Min.   : 0.28   Min.   : 1.00
 1st Qu.: 1.00   1st Qu.: 11.0   1st Qu.:12.0   1st Qu.: 3.34   1st Qu.: 3.00
 Median : 2.00   Median : 15.0   Median :18.0   Median : 4.70   Median : 6.00
 Mean   : 3.15   Mean   : 25.5   Mean   :17.9   Mean   : 5.17   Mean   : 6.53
 3rd Qu.: 5.00   3rd Qu.: 21.0   3rd Qu.:23.0   3rd Qu.: 6.54   3rd Qu.: 9.00
 Max.   :31.00   Max.   :159.0   Max.   :47.0   Max.   :14.90   Max.   :19.00
       X6          Tumor_1_8
 Min.   : 4.0   Min.   :1.00
 1st Qu.: 9.0   1st Qu.:3.00
 Median :11.0   Median :6.00
 Mean   :11.0   Mean   :5.29
 3rd Qu.:13.0   3rd Qu.:7.00
 Max.   :19.0   Max.   :8.00
#Check a few observations to see how we are going
obs.out <- myFile[1:10,]
obs.out
X1 X2 X3 X4 X5 X6 Tumor_1_8
1 3 16 25 6.24 10 8 3
2 0 159 20 4.40 3 8 6
3 2 6 12 3.66 7 9 7
4 1 9 8 7.10 5 12 4
5 0 6 11 2.60 17 15 2
6 1 24 15 5.12 8 8 1
7 0 159 15 4.70 5 11 8
8 16 20 35 6.14 5 10 5
9 4 25 9 2.20 12 13 3
10 0 23 6 4.68 15 16 4
z.out <- zelig(as.factor(Tumor_1_8) ~ X1 + X2 + X3 + X4 + X5 + X6,
               model = "oprobit", data = myFile)
summary(z.out)
Call:
zelig(formula = as.factor(Tumor_1_8) ~ X1 + X2 + X3 + X4 + X5 +
X6, model = "oprobit", data = myFile)
Coefficients:
Value Std. Error t value
X1 0.026997 0.011081 2.436
X2 0.005686 0.001086 5.235
X3 -0.012237 0.004625 -2.646
X4 0.066867 0.017485 3.824
X5 -0.054901 0.011927 -4.603
X6 -0.069625 0.016020 -4.346
Intercepts:
Value Std. Error t value
1|2 -2.282 0.249 -9.146
2|3 -1.805 0.245 -7.361
3|4 -1.473 0.243 -6.053
4|5 -1.189 0.243 -4.901
5|6 -0.929 0.242 -3.839
6|7 -0.496 0.241 -2.057
7|8 0.088 0.242 0.362
Residual Deviance: 3822.96
AIC: 3848.96
I would like to obtain the predicted probabilities of each category
for some of the observations and for a holdout data set.
That is, I want to examine results for part of myFile, say myFile[1:10,],
and for a holdout data set, C:/newData.csv. I have been unable to find
the correct syntax. The question I want to answer is: given certain values
of the independent variables X1 to X6, what is the predicted
probability that the dependent variable Tumor_1_8 falls in each of
the 8 categories?
Any suggestions or advice appreciated!
Cheers, Peter
-
Zelig Mailing List, served by Harvard-MIT Data Center
Send messages: zelig(a)lists.gking.harvard.edu
[un]subscribe Options: http://lists.gking.harvard.edu/?info=zelig
Zelig program information: http://gking.harvard.edu/zelig/