Hello all,
I tried to estimate a multinomial logit using zelig, mlogit. My problem is
that I do not know how to determine the base variable. I red in the mailing
archive from last year, that in the case of one formula the baseline is the
last for numeric endogenous variables. But the case of different formula for
each level is not mentioned so I expected that the number, which is not
included in the formulas is the baseline. So I tried the examples on
http://gking.harvard.edu/zelig/docs/Examples21.html
<http://gking.harvard.edu/zelig/docs/Examples21.html>
and changed in the case of different formula for each level the levels, but
the results didn't change.
Here is the "unchanged" example output:
data(mexico)
> z.out2 <- zelig(list(id(vote88, "1") ~ pristr + othcok,
+ id(vote88, "2") ~ othsocok),
+ model = "mlogit", data = mexico)
> summary(z.out2)
Call:
zelig(formula = list(id(vote88, "1") ~ pristr + othcok, id(vote88,
"2") ~ othsocok), model = "mlogit", data = mexico)
Pearson Residuals:
Min 1Q Median 3Q Max
log(mu[,1]/mu[,3]) -4.1 -0.72 0.29 0.738 2.0
log(mu[,2]/mu[,3]) -2.1 -0.47 -0.21 -0.086 4.6
Coefficients:
Value Std. Error t value
(Intercept):1 2.20 0.276 8.0
(Intercept):2 -0.40 0.218 -1.9
pristr 0.66 0.077 8.6
othcok -1.19 0.091 -13.2
othsocok 0.21 0.138 1.5
Number of linear predictors: 2
Names of linear predictors: log(mu[,1]/mu[,3]), log(mu[,2]/mu[,3])
Dispersion Parameter for multinomial family: 1
Residual Deviance: 2367 on 2713 degrees of freedom
Log-likelihood: -1183 on 2713 degrees of freedom
Number of Iterations: 4
And for this run I have substituted in the second formula the 2 by a 3. But
the output didn't change.
z.out2 <- zelig(list(id(vote88, "1") ~ pristr + othcok,
+ id(vote88, "3") ~ othsocok),
+ model = "mlogit", data = mexico)
> summary(z.out2)
Call:
zelig(formula = list(id(vote88, "1") ~ pristr + othcok, id(vote88,
"3") ~ othsocok), model = "mlogit", data = mexico)
Pearson Residuals:
Min 1Q Median 3Q Max
log(mu[,1]/mu[,3]) -4.1 -0.72 0.29 0.738 2.0
log(mu[,2]/mu[,3]) -2.1 -0.47 -0.21 -0.086 4.6
Coefficients:
Value Std. Error t value
(Intercept):1 2.20 0.276 8.0
(Intercept):2 -0.40 0.218 -1.9
pristr 0.66 0.077 8.6
othcok -1.19 0.091 -13.2
othsocok 0.21 0.138 1.5
Number of linear predictors: 2
Names of linear predictors: log(mu[,1]/mu[,3]), log(mu[,2]/mu[,3])
Dispersion Parameter for multinomial family: 1
Residual Deviance: 2367 on 2713 degrees of freedom
Log-likelihood: -1183 on 2713 degrees of freedom
Number of Iterations: 4
So, what have I done wrong?
Thank you in advance!
Joerg Mueller-Scheessel
Hi -
when I use setx whith a factor that has extra levels not in the
dataset it gives "non-conformable arguments".
dta <- data.frame(y=rnorm(10),x1=1:10,x2=factor(c(rep(1,3),rep(2,4),rep(3,3))))
z1 <- zelig(y~x1+x2,model="ls",data=dta)
v1 <- setx(z1)
s1 <- sim(z1,x=v1)
##this works
##now change data, coding factor 3 as 2
dta$x2[7:10] <- 2
z1 <- zelig(y~x1+x2,model="ls",data=dta)
v1 <- setx(z1)
Warning message:
There is more than one mode. The first level is selected. in: FUN(X[[1]], ...)
> s1 <- sim(z1,x=v1)
Error in coef %*% t(x) : non-conformable arguments
If this is indeed a bug, a possible solution is to set somewhere in
setx function
for (i in 1:ncol(dta)) {
vnow <- dta[,i]
dta[,i] <- vnow[drop=TRUE]
}
but may have unintended consequences that I am not aware of.
best,
-eduardo
-
Zelig Mailing List, served by Harvard-MIT Data Center
Send messages: zelig(a)lists.gking.harvard.edu
[un]subscribe Options: http://lists.gking.harvard.edu/?info=zelig
Zelig program information: http://gking.harvard.edu/zelig/
Hi Alexis,
I think most models should be able to incoporate the weights by passing
"weights" as an optional argument in zelig().
Kosuke
On Mon, 17 Jul 2006, Alexis Diamond wrote:
> hi kosuke,
>
> hope you are having a wonderful day! :)
>
> i was wondering if zelig has the capability to accept observation-specific
> weights. didn't see it in the documentation. did i miss it? jens and i are
> working on a consulting project for the government of germany, and we would
> like to propose that they use Zelig (very intensively), but the thing is,
> their dataset involves observation-specific weights. any suggestions?
>
> thank you,
>
> alexis
>
-
Zelig Mailing List, served by Harvard-MIT Data Center
Send messages: zelig(a)lists.gking.harvard.edu
[un]subscribe Options: http://lists.gking.harvard.edu/?info=zelig
Zelig program information: http://gking.harvard.edu/zelig/
Thanks for the tip Olivia!
Cheers, Peter
> ----- Original Message -----
> From: "Olivia Lau" <olau(a)fas.harvard.edu>
> To: "Gary King" <king(a)harvard.edu>
> Subject: Re: [zelig] Ordered Probit in Zelig/R and EasyReg
> Date: Sat, 15 Jul 2006 16:36:30 -0400
>
>
> Sidebar: As a general rule, one should never ever attach(data) when data is
> an argument to a function that takes a formula. You will confuse R. Best,
> Olivia
>
> On 7/15/06, Gary King <king(a)harvard.edu> wrote:
> >
> >
> > you don't need to import results from Zelig into excel to compute
> > quantities of interest, since Zelig will do it for you automatically. See
> > the setx() and sim() steps.
> > Gary King
> >
> > On Sat, 15 Jul 2006, dysgraphia2325(a)lycos.com wrote:
> >
> > > Using Win XP, current R and Zelig.
> > > I am new to R and Zelig.
(tail snipped for brevity)
--
_______________________________________________
Search for businesses by name, location, or phone number. -Lycos Yellow Pages
http://r.lycos.com/r/yp_emailfooter/http://yellowpages.lycos.com/default.as…
-
Zelig Mailing List, served by Harvard-MIT Data Center
Send messages: zelig(a)lists.gking.harvard.edu
[un]subscribe Options: http://lists.gking.harvard.edu/?info=zelig
Zelig program information: http://gking.harvard.edu/zelig/
Using Win XP, current R and Zelig.
I am new to R and Zelig.
The ability to program Zelig/R is valuable....not available in EasyReg
In order to familiarize myself with the programs I ran the ordered probit
model in Zelig then in EasyReg using the same dataset.
Here are output details for my R session:
> myFile = read.table(file = "C:/TestSetForR.txt",header=T)
> attach(myFile)
> z.out <- zelig(as.factor(Grade_1_8) ~ X1 + X2 + X3 + X4, model = "oprobit", data = myFile)
> summary(z.out)
Call:
zelig(formula = as.factor(Grade_1_8) ~ X1 + X2 + X3 + X4, model = "oprobit", data = myFile)
Coefficients:
Value Std. Error t value
X1 -0.054702 0.018865 -2.900
X2 0.005183 0.001054 4.919
X3 0.085056 0.016026 5.307
X4 -0.088101 0.014033 -6.278
Intercepts:
Value Std. Error t value
1|2 -4.857 1.006 -4.830
2|3 -4.383 1.003 -4.370
3|4 -4.055 1.002 -4.048
4|5 -3.775 1.001 -3.770
5|6 -3.519 1.001 -3.516
6|7 -3.093 1.000 -3.093
7|8 -2.516 1.000 -2.517
Residual Deviance: 3843.73
AIC: 3865.73
################################################
Here are output details for the same data run in EasyReg.
EasyReg International [April 2, 2006]
Session date: Friday July 14, 2006
Session time: 07:48:11
----------------------------------------------
Ordered Probit model:
Dependent variable:
Y = Grade_1_8
Characteristics:
Grade_1_8
First observation = 1
Last observation = 1009
Number of usable observations: 1009
Minimum value: 1.0000000E+000
Maximum value: 8.0000000E+000
Sample mean: 5.2893954E+000
This variable is integer valued.
A discrete dependent variable model is suitable.
X variables:
X(1) = X1
X(2) = X2
X(3) = X3
X(4) = X4
X(5) = 1
Model:
P(Y-1=0|x) = F(-b'x)
P(Y-1=1|x) = F(-b'x+exp(b(6)))- F(-b'x)
For j=2,..,m-1,
P(Y-1=j|x) = F(-b'x+exp(b(6))+..+exp(b(5+j)))
- F(-b'x+exp(b(6))+..+exp(b(4+j)))
and
P(Y-1=m|x) = 1 - F(-b'x+exp(b(6))+..+exp(b(4+m)))
where m =7, b'x = b(1)x(1)+..+b(5)x(5), and
F(u) = c.d.f. of N(0,1) distribution.
Newton iteration succesfully completed after 30 iterations
Last absolute parameter change = 0.0001
Last percentage change of the likelihood = -0.0459
Maximum likelihood estimation results:
Variable Par. ML estimate t-value [p-value]
x(1)=X1 b(1) -0.0547016 -2.87 [0.00407]
x(2)=X2 b(2) 0.0051827 4.75 [0.00000]
x(3)=X3 b(3) 0.0850551 5.29 [0.00000]
x(4)=X4 b(4) -0.0881011 -6.25 [0.00000]
x(5)=1 b(5) 4.8565051 4.74 [0.00000]
b(6) -0.7482279 -7.65 [0.00000]
b(7) -1.1145713 -10.92 [0.00000]
b(8) -1.2712482 -12.77 [0.00000]
b(9) -1.3635656 -13.48 [0.00000]
b(10) -0.8530407 -11.26 [0.00000]
b(11) -0.5500365 -8.14 [0.00000]
[The two-sided p-values are based on the normal approximation]
Log likelihood: -1.05015155949E+003
Sample size (n): 1009
Information criteria:
Akaike: 2.103373
Hannan-Quinn: 2.123736
Schwarz: 2.156974
In the main the independent coefficients and t-values were quite close in value except that
the intercepts are dissimilar...not unexpected as the contrasts are defined differently, as differences.
As shown above, EasyReg provides expressions for the probabilities for the various levels viz
P(Y-1=0|x) = F(-b'x)
P(Y-1=1|x) = F(-b'x+exp(b(6)))- F(-b'x)
For j=2,..,m-1,
P(Y-1=j|x) = F(-b'x+exp(b(6))+..+exp(b(5+j)))
- F(-b'x+exp(b(6))+..+exp(b(4+j)))
and
P(Y-1=m|x) = 1 - F(-b'x+exp(b(6))+..+exp(b(4+m)))
where m =7, b'x = b(1)x(1)+..+b(5)x(5), and
F(u) = c.d.f. of N(0,1) distribution.
This is useful for me as I enter these equations into an Excel spreadsheet, useful to test holdout samples and for teaching purposes for example.
My question: in the Zelig documentation does there exist equivalent equations to calculate probabilities of the different levels?
Any advice appreciated!....Cheers, Peter.
--
_______________________________________________
Search for businesses by name, location, or phone number. -Lycos Yellow Pages
http://r.lycos.com/r/yp_emailfooter/http://yellowpages.lycos.com/default.as…
-
Zelig Mailing List, served by Harvard-MIT Data Center
Send messages: zelig(a)lists.gking.harvard.edu
[un]subscribe Options: http://lists.gking.harvard.edu/?info=zelig
Zelig program information: http://gking.harvard.edu/zelig/
The Zelig package is fantastic. I am still new to R and have found the
functionality of Zelig truly impressive. Two quick questions:
1) Is there a way to invoke a sandwich estimator for clustered
observations, similar to the robcov function in Harrell's Hmisc
package? If not, could I change the standard errors manually in from a
model (e.g., logit) generated via zelig? I am guessing there is a way,
since just about anything is possible with R. (If somebody could
provide code for doing this, that would be great.)
2) Are there any plans for adding Cox regression to the available stock
of survival models?
Thanks in advance,
Brian
-
Zelig Mailing List, served by Harvard-MIT Data Center
Send messages: zelig(a)lists.gking.harvard.edu
[un]subscribe Options: http://lists.gking.harvard.edu/?info=zelig
Zelig program information: http://gking.harvard.edu/zelig/