We show empirically and theoretically that Pr(Y=1)=p is underestimated in
logistic regression (at the stage of computing the coefficients and again
at the stage of computing p given the coefficients). This also implies
that 1-p is overestimated. The two cancel each other out in your
calculation, which means that the calculation doesn't bear on the issue of
bias.
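
A small numerical sketch may make the point about the sum concrete. When a logit model includes an intercept and is fit by maximum likelihood, the score equation for the intercept is sum(y_i - p_i) = 0, so the fitted probabilities add up to the observed number of events in any sample, whether or not the coefficients are biased. The Python code below is only an illustration under assumptions that are not from the paper or from SAS: the data are simulated, the coefficient values (-3, 1) are arbitrary, and fit_logit is a hypothetical helper that runs plain Newton-Raphson.

```python
import numpy as np

def fit_logit(X, y, n_iter=50, tol=1e-10):
    """Maximum-likelihood logit via Newton-Raphson (hypothetical helper)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (y - p)                      # score vector
        hess = X.T @ (X * (p * (1.0 - p))[:, None])  # information matrix
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Simulated rare-events data (assumed values, for illustration only).
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])   # first column is the intercept
true_beta = np.array([-3.0, 1.0])
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ true_beta))).astype(float)

beta_hat = fit_logit(X, y)
p_hat = 1.0 / (1.0 + np.exp(-X @ beta_hat))

# The two sums agree up to numerical tolerance because of the
# intercept's score equation, not because p_hat is unbiased.
print("sum of fitted probabilities:", p_hat.sum())
print("observed number of events:  ", y.sum())
```

Because the equality holds by construction in every sample, comparing the two sums cannot reveal whether the individual probabilities are biased.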
Gary King
Gary King, King(a)Harvard.Edu, http://GKing.Harvard.Edu
Center for Basic Research in the Social Sciences
34 Kirkland Street, Rm. 2
Harvard U, Cambridge, MA 02138
Direct (617) 495-2027
Assistant (617) 495-9271
HU-MIT DC (617) 495-4734
eFax (928) 832-7022
On Mon, 11 Nov 2002, Harald Scheule wrote:
Dear Professor King,
I am working at the business faculty of the University of Regensburg,
Germany, and we are estimating the probabilities of corporate defaults using
logistic regression models. With this background I have read your article
"Logistic Regression in Rare Events Data" with great interest.
In Chapter 5: Rare Event, Finite Sample Corrections you argue that the
probability of an event is underestimated because of rare events and
the randomness of the estimated parameters.
What I do not understand is this: if I run a logistic regression, e.g. with
PROC LOGISTIC in SAS, and estimate the default probabilities, their sum equals
the observed number of events. If probabilities are underestimated, shouldn't
their sum be lower than the observed number of events?
I would be very thankful if you could help me to solve my problem.
Harald Scheule
_________________
Harald Scheule
Dipl.-Kfm.
Lehrstuhl für Statistik
Universität Regensburg
93040 Regensburg
Germany
Tel.: +49 (0)941/943-2287
Fax.: +49 (0)941/943-4936
________________
relogit mailing list served by Harvard-MIT Data Center
List Address: relogit(a)latte.harvard.edu
Subscribe/Unsubscribe:
http://lists.hmdc.harvard.edu/?info=relogit