Obtaining probabilities in oprobit

dysgraphia2325

25 Jul 2006

12:41 p.m.

Kosuke Imai

26 Jul 2006

12:56 a.m.

Try something like

    setx(z.out, fn = NULL, data = myFile[1:10,])

if you want to do predictions using a data set that is different from what you used to create z.out.

Kosuke

-----------------------------------------------------
Kosuke Imai               Office: Corwin Hall 041
Assistant Professor       Phone: 609-258-6601
Department of Politics    eFax: 973-556-1929
Princeton University      Email: kimai(a)Princeton.Edu
Princeton, NJ 08544-1012  http://imai.princeton.edu
-----------------------------------------------------

On Tue, 25 Jul 2006, dysgraphia2325 wrote:

G'day Zeligers,

Learning R and Zelig, thank you for the help so far! Running some code for oprobit analysis gives me:

myFile = read.csv(file = "C:/TestSetForR.csv",header=TRUE)
names(myFile)

[1] "X1"        "X2"        "X3"        "X4"        "X5"        "X6"
[7] "Tumor_1_8"

#Dependent variable, tumor grading 1 to 8
#Independent variables, X1 to X6 histochemical marker values
dim(myFile)

[1] 1009    7

summary(myFile)

       X1              X2              X3             X4              X5
 Min.   : 0.00   Min.   :  3.0   Min.   : 0.0   Min.   : 0.28   Min.   : 1.00
 1st Qu.: 1.00   1st Qu.: 11.0   1st Qu.:12.0   1st Qu.: 3.34   1st Qu.: 3.00
 Median : 2.00   Median : 15.0   Median :18.0   Median : 4.70   Median : 6.00
 Mean   : 3.15   Mean   : 25.5   Mean   :17.9   Mean   : 5.17   Mean   : 6.53
 3rd Qu.: 5.00   3rd Qu.: 21.0   3rd Qu.:23.0   3rd Qu.: 6.54   3rd Qu.: 9.00
 Max.   :31.00   Max.   :159.0   Max.   :47.0   Max.   :14.90   Max.   :19.00
       X6          Tumor_1_8
 Min.   : 4.0   Min.   :1.00
 1st Qu.: 9.0   1st Qu.:3.00
 Median :11.0   Median :6.00
 Mean   :11.0   Mean   :5.29
 3rd Qu.:13.0   3rd Qu.:7.00
 Max.   :19.0   Max.   :8.00

#Check a few observations to see how we are going
obs.out <- myFile[1:10,]
obs.out

   X1  X2 X3   X4 X5 X6 Tumor_1_8
1   3  16 25 6.24 10  8         3
2   0 159 20 4.40  3  8         6
3   2   6 12 3.66  7  9         7
4   1   9  8 7.10  5 12         4
5   0   6 11 2.60 17 15         2
6   1  24 15 5.12  8  8         1
7   0 159 15 4.70  5 11         8
8  16  20 35 6.14  5 10         5
9   4  25  9 2.20 12 13         3
10  0  23  6 4.68 15 16         4

z.out <- zelig(as.factor(Tumor_1_8) ~ X1+X2+X3+X4+X5+X6,model = "oprobit", data = myFile)
summary(z.out)

Call:
zelig(formula = as.factor(Tumor_1_8) ~ X1 + X2 + X3 + X4 + X5 + X6,
    model = "oprobit", data = myFile)

Coefficients:
       Value Std. Error t value
X1  0.026997   0.011081   2.436
X2  0.005686   0.001086   5.235
X3 -0.012237   0.004625  -2.646
X4  0.066867   0.017485   3.824
X5 -0.054901   0.011927  -4.603
X6 -0.069625   0.016020  -4.346

Intercepts:
    Value  Std. Error t value
1|2 -2.282  0.249     -9.146
2|3 -1.805  0.245     -7.361
3|4 -1.473  0.243     -6.053
4|5 -1.189  0.243     -4.901
5|6 -0.929  0.242     -3.839
6|7 -0.496  0.241     -2.057
7|8  0.088  0.242      0.362

Residual Deviance: 3822.96
AIC: 3848.96

I would like to determine the predicted probabilities for each category, both for some of the observations and for a holdout dataset. That is, I want to examine results for part of the dataset myFile, say myFile[1:10,], and for a holdout data set, C:/newData.csv. I have been unable to find the correct syntax.

What I want to do is answer questions like: given certain values of the independent variables X1 to X6, what is the predicted probability that the dependent variable Tumor_1_8 falls in each of the 8 categories?

Any suggestions or advice appreciated!

Cheers,
Peter

-
Zelig Mailing List, served by Harvard-MIT Data Center
Send messages: zelig(a)lists.gking.harvard.edu
[un]subscribe Options: http://lists.gking.harvard.edu/?info=zelig
Zelig program information: http://gking.harvard.edu/zelig/

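For the holdout part of the question, one alternative to the setx() route is to fit the ordered probit directly with MASS::polr and call predict() with type = "probs". This is a swapped-in technique, not the Zelig workflow suggested in the reply. The sketch below uses synthetic data so that it runs on its own; the data frame names train and holdout, the column grade, and the two-marker model are all made up for illustration.

```r
library(MASS)  # provides polr(); MASS is a recommended package shipped with R

set.seed(1)

# Synthetic stand-in for the training data (NOT the poster's myFile):
# two markers and a 4-level ordered grade.
n <- 400
train <- data.frame(X1 = rnorm(n), X2 = rnorm(n))
latent <- 0.8 * train$X1 - 0.5 * train$X2 + rnorm(n)
train$grade <- cut(latent, breaks = c(-Inf, -1, 0, 1, Inf),
                   labels = c("1", "2", "3", "4"), ordered_result = TRUE)

# Ordered probit fit
fit <- polr(grade ~ X1 + X2, data = train, method = "probit", Hess = TRUE)

# Stand-in for a holdout file; in practice this could come from
# something like read.csv("C:/newData.csv") with the same column names.
holdout <- data.frame(X1 = c(-1, 0, 2), X2 = c(0.5, 0, -1))

# One row per holdout observation, one column per category
probs <- predict(fit, newdata = holdout, type = "probs")
print(round(probs, 3))
```

Each row of probs sums to one. Within Zelig, the usual step after setx() is sim(z.out, x = x.out) followed by summary(), which additionally reports simulation-based uncertainty around these probabilities.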