relogit mailing list served by Harvard-MIT Data Center
List Address: relogit(a)latte.harvard.edu
Subscribe/Unsubscribe: http://lists.hmdc.harvard.edu/?info=relogit
I would like to calculate marginal effects for the relogit estimation.
However, when I run the estimation and then run 'mfx compute', the
marginal effects are equal to the coefficients given with the relogit
results. Any suggestions?
Anne Anderson
McClelland Hall 309
520-621-1701
-
Dear Gary
I get a "no observations" error message with relogit. I have included the
output with trace on to show where the error occurred. Any idea how to deal
with it? The error message goes away when I use the "nomcn" option, but I
do want to make finite sample corrections. The program seems to be failing
at this piece of code inside ReLogit:
matrix accum `A' = `rhsvars' [pw=`wtmcn'] if `touse', `noconst'
which gives the r(2000) "no observations" error.
I have a feeling it has something to do with some of my variables on the
RHS (same_pclass same_nclass same_subcat same_cat) being dummies to capture
fixed effects, and some of the successes being "completely determined",
though I am not sure. In any case, I don't think the program should just
crash if that is the case (e.g. a simple logit on the same data runs
just fine even though it reports that some of the successes are "completely
determined").
many thanks!
Jasjit
[I am taking the liberty of ccing this to the ReLogit list as others might
have seen this problem too]
relogit cite_wc_1 same_pclass same_nclass same_subcat same_cat n_overlap lag lag2, wc(0.000005)
- version 6.0
- syntax varlist [if] [in] [aw fw pw iw/] [, pc(numlist sort max=2 >=0 <=1) wc(numlist max=1 >=0 <=1) NOMCN FIRTH NORobust CLuster(varname) NOCONstant Level(int $S_level)]
- marksample touse
- if "`firth'" ~= "" {
di in r _n "Relogit for Stata does not yet support the FIRTH option"
exit 198
}
- if "`norobust'" == "" { local robust robust }
- if "`cluster'" ~= "" {
local cluster cl(`cluster')
if "`norobust'" ~= "" {
di in r "Error: the norobust & cluster options are incompatible"
exit 198
}
}
- if "`wc'" ~= "" {
- if "`norobust'" ~= "" {
di in r _n "Warning: Traditional variance estimates do not make sense"
di in r "with the wc() option"
}
- if `wc' == 0 { local wc = .1^8 }
- else if `wc' == 1 { local wc = 1 - .1^8 }
- }
- if "`pc'" ~= "" {
if "`wc'" ~= "" {
di in r _n "Error: Can't use the pc() & wc() options together."
exit 198
}
if "`noconst'" ~= "" {
di in r "Error: Can't use the pc() & noconstant options together."
exit 198
}
gettoken pcLO pcHI : pc
if "`pcHI'" == "" {
if `pc' == 0 { local pc = .1^8 }
else if `pc' == 1 { local pc = 1 - .1^8 }
}
else {
if `pcLO' == 0 { local pcLO = .1^8 }
if `pcHI' == 1 { local pcHI = 1 - .1^8 }
}
}
- tempname cat ybar wt0 wt1 b V bfull Vfull df_m k
- gettoken depvar rhsvars : varlist
- tab `depvar' if `touse', matrow(`cat')
 cite_wc_11 |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |     126330       96.91       96.91
          1 |       4023        3.09      100.00
------------+-----------------------------------
      Total |     130353      100.00
- if `cat'[1,1] ~= 0 | `cat'[2,1] ~= 1 {
di in r "Error: Dependent variable can only take-on values of 0 or 1"
exit 198
}
- su `depvar' if `touse', meanonly
- scalar `ybar' = r(mean)
- if "`wc'" ~= "" {
- scalar `wt0' = (1-`wc') / (1-`ybar')
- scalar `wt1' = `wc' / `ybar'
- }
- else {
scalar `wt0' = 1
scalar `wt1' = 1
}
- tempvar wt
- if "`exp'" == "" { local exp 1 }
- gen `wt' = `exp'* (`wt0'*(1-`depvar')+`wt1'*`depvar')
- display "doing logit"
doing logit
- logit `varlist' if `touse' [aweight=`wt'], `robust' `cluster' `noconst'
(sum of wgt is 1.3035e+05)
Iteration 0: log likelihood = -8.6072543
Iteration 1: log likelihood = -7.5214269
Iteration 2: log likelihood = -6.6296738
Iteration 3: log likelihood = -6.4399945
Iteration 4: log likelihood = -6.2323357
Iteration 5: log likelihood = -6.2131791
Iteration 6: log likelihood = -6.2127935
Iteration 7: log likelihood = -6.2127932
Logit estimates                                   Number of obs   =     130353
                                                  Wald chi2(7)    =    6748.41
                                                  Prob > chi2     =     0.0000
Log likelihood = -6.2127932                       Pseudo R2       =     0.2782
------------------------------------------------------------------------------
             |               Robust
  cite_wc_11 |      Coef.   Std. Err.       z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
 same_pclass |  -.4992331   .9210067    -0.54   0.588    -2.304373    1.305907
 same_nclass |   2.383971   .0905139    26.34   0.000     2.206567    2.561375
 same_subcat |   1.607911   .0696091    23.10   0.000      1.47148    1.744343
    same_cat |   1.292383    .064568    20.02   0.000     1.165832    1.418934
   n_overlap |   2.137836   .1874946    11.40   0.000     1.770353    2.505319
         lag |   .2288113   .0481226     4.75   0.000     .1344928    .3231298
        lag2 |  -.0130391   .0028906    -4.51   0.000    -.0187046   -.0073735
       _cons |  -14.44045   .1554302   -92.91   0.000    -14.74509   -14.13581
------------------------------------------------------------------------------
- display "logit done"
logit done
- matrix `b' = e(b)
- matrix `V' = e(V)
- scalar `df_m' = e(df_m)
- scalar `k' = colsof(`V')
- local N = e(N)
- if "`nomcn'" == "" & "`firth'" == "" {
- tempname A Ainv row biasMCN
- tempvar pi wtmcn ksi
- predict `pi', p
- version 7.0
- if _caller()<=5 | "`e(predict)'"=="" {
_predict `0'
}
- else {
- local v=_caller()
- version `v'
- `e(predict)' `0'
- version 6
- local myopts "DBeta DEviance DX2 DDeviance Hat Number Pr Residuals RStandard"
- _pred_se "`myopts'" `0'
- version 6.0
- sret clear
- gettoken ouser 0 : 0
- local orig `"`0'"'
- gettoken varn 0 : 0, parse(" ,[")
- gettoken nxt : 0, parse(" ,[(")
- if !(`"`nxt'"'=="" | `"`nxt'"'=="if" | `"`nxt'"'=="in" | `"`nxt'"'==",") {
local typ `varn'
gettoken varn 0 : 0, parse(" ,[")
}
- syntax [if] [in] [, `ouser' CONStant(varname numeric) noOFFset *]
- if `"`options'"' != "" {
_predict `orig'
sret local done 1
exit
}
- confirm new var `varn'
- sret local done 0
- sret local typ `"`typ'"'
- sret local varn `"`varn'"'
- sret local rest `"`0'"'
- if `s(done)' { exit }
- local vtyp `s(typ)'
- local varn `s(varn)'
- local 0 `"`s(rest)'"'
- syntax [if] [in] [, `myopts' noOFFset]
- local type "`dbeta'`devianc'`dx2'`ddevian'`hat'`number'`pr'`residua'`rstanda'"
- if "`type'"=="" | "`type'"=="pr" {
- if "`type'"=="" {
di in gr "(option p assumed; Pr(`e(depvar)'))"
}
- _predict `vtyp' `varn' `if' `in', `offset'
- label var `varn' "Pr(`e(depvar)')"
- exit
- }
- gen `wtmcn' = (`pi'-`pi'^2)*`wt'
- matrix accum `A' = `rhsvars' [pw=`wtmcn'] if `touse', `noconst'
no observations
matrix `Ainv' = inv(`A')
local vars : colnames `Ainv'
local sum 0
local i 1
while `i' <= `k' {
gettoken var vars : vars
matrix `row' = (`Ainv'[1...,`i'])'
tempvar c`i'
matrix score `c`i'' = `row'
if `i'==`k' & "`noconst'" == "" { local sum `sum' + `c`i'' }
else { local sum `sum' + `c`i''*`var' }
local i = `i' + 1
}
display "generating ksi..."
gen `ksi'= .5 * (`sum') * [(1+`wt1')*`pi'-`wt1']
display "regressing ksi ..."
regress `ksi' `rhsvars' [aw=`wtmcn'] if `touse', `noconst'
display "get estimated bias..."
matrix `biasMCN' = e(b)
display "correct coefs.."
matrix `b' = `b' - `biasMCN'
display "fix covariance matrix..."
matrix `V' = `V'*(`N'/(`N'+`k'))^2
}
r(2000); t=23.36 12:21:42
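As a hand check on the trace above, the case weights can be recomputed from the tabulated counts and the wc() value. This is a minimal Python sketch, not part of ReLogit; the variable names are illustrative, and the two formulas are copied from the wt0 and wt1 lines of the trace.

```python
# Recompute the wc() case weights shown in the trace, using the tabulated counts.
# Formulas are taken from the trace: wt0 = (1-wc)/(1-ybar), wt1 = wc/ybar.
n0, n1 = 126330, 4023        # frequencies of 0 and 1 from the tab output
wc = 0.000005                # value passed as wc() in the command
n = n0 + n1
ybar = n1 / n                # sample fraction of ones

wt0 = (1 - wc) / (1 - ybar)  # weight applied to each zero
wt1 = wc / ybar              # weight applied to each one
total_weight = wt0 * n0 + wt1 * n1

print(f"ybar={ybar:.6f} wt0={wt0:.5f} wt1={wt1:.3e} sum={total_weight:.1f}")
```

The weights sum to the number of observations, which agrees with the "(sum of wgt is 1.3035e+05)" line in the log, and each success carries a weight of roughly 0.00016. The wtmcn variable in the trace multiplies this weight by pi*(1-pi), so it is smaller still; whether that is what triggers the "no observations" error is not established here.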
______________________________________________________
Jasjit Singh
PhD Candidate, Business Economics, Harvard University
jsingh(a)hbs.edu http://www.people.hbs.edu/jsingh
617 495 6041 (office) 617 497 8436 (home)
Mail: Baker 180-F, Harvard Business School, Boston, MA 02163
Dear Professor King,
I am currently writing a paper on estimating rare events in financial data. I
applied your program for GAUSS relogit and I obtained the output that I
needed. When I ran relogitq, however, I received the following message:
LO=RECODE(LO,LO==0,_RELOGIT_N[2]/_RELOGITQ_POP)
^
C:\GAUSS\VALERI\SEMINARR(530) : error G0064 : Operand missing
I am a beginner in GAUSS and I cannot see the problem. I would appreciate
it immensely if I could get any help or suggestions on how to deal with the
problem.
Sincerely yours
Valeri Voev
Master Student in International Economic Relations
University of Konstanz
Germany
-
see below..
On Mon, 6 Jan 2003, sabri boubaker wrote:
>
> Dear Professor King,
>
>
>
> I am a PhD student in Finance at Paris XII University. I am estimating a
> Relogit regression of the determinants of firms' use of non-voting
> traded shares (11 firms use such a device out of the 510 firms included in
> the sample).
>
>
>
> I would like to know:
>
> 1/ if there is a way to compute Pseudo R2 with Relogit? If not, should I
> consider the Pseudo R2 from ordinary logit conducted on the same
> variables that are used in the Relogit?
you could do this by hand, but it would probably not be appropriate. if
you want the best fitting regression, just run logit. the problem is that
the best fitting regression is also biased. so you have to accept less fit
(which after all is not a principle of inference and so not particularly
relevant) to reduce bias. Basically, i wouldn't worry about it.
>
> 2/ How to get the marginal effects after running Relogit?
better than marginal effects are first differences, which are exact.
Relogit will compute these in the same way as clarify.
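To illustrate the distinction drawn above, here is a small Python sketch comparing a first difference with a marginal effect for a one-covariate logit. It uses point estimates only, whereas Relogit and Clarify work by simulation; the coefficient values and function names are hypothetical.

```python
import math

def logistic(z):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def first_difference(intercept, slope, x_from, x_to):
    """Exact change in Pr(Y=1) when x moves from x_from to x_to."""
    return logistic(intercept + slope * x_to) - logistic(intercept + slope * x_from)

def marginal_effect(intercept, slope, x):
    """Derivative of Pr(Y=1) with respect to x, evaluated at x."""
    p = logistic(intercept + slope * x)
    return slope * p * (1.0 - p)

# Hypothetical coefficients, chosen so that Pr(Y=1) is small (a rare event).
b0, b1 = -4.0, 1.5
fd = first_difference(b0, b1, 0.0, 1.0)
me = marginal_effect(b0, b1, 0.0)
print(f"first difference = {fd:.4f}, marginal effect at x=0 = {me:.4f}")
```

The marginal effect is only the slope of the probability curve at one point, so for a unit change in x it can differ noticeably from the first difference, which is the actual change in probability.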
>
> I would greatly appreciate your help in this regard.
Best of luck with your research,
Gary King
: Gary King, King(a)Harvard.Edu http://GKing.Harvard.Edu :
: Center for Basic Research Direct (617) 495-2027 :
: in the Social Sciences Assistant (617) 495-9271 :
: 34 Kirkland Street, Rm. 2 HU-MIT DC (617) 495-4734 :
: Harvard U, Cambridge, MA 02138 eFax (928) 832-7022 :
-
download relogit.zip from my web page, and follow the directions in the
readme.txt file.
thanks for the kind words.
Gary King
: Gary King, King(a)Harvard.Edu http://GKing.Harvard.Edu :
: Center for Basic Research Direct (617) 495-2027 :
: in the Social Sciences Assistant (617) 495-9271 :
: 34 Kirkland Street, Rm. 2 HU-MIT DC (617) 495-4734 :
: Harvard U, Cambridge, MA 02138 eFax (928) 832-7022 :
On Tue, 12 Nov 2002, Kisangani Emizet wrote:
> Dear Professor King:
>
> Your argument criticizing "fixed effects" advocated by Green, Kim, and
> Yoon was a time and "sample" saver. I tried to run "relogit" with
> Stata7, but couldn't. What should I do to download or run it?
> Thanks
> Kisangani Emizet
>
-
Thanks for your note. The best way to fix the two forms of bias in rare
events logistic regression is to use one of the techniques we discussed in
the paper you cite. Weighting so that they have a 50:50 distribution
would not help since you'd have to weight it back down (using what we call
prior correction or weighting) to get unbiased estimates.
the point that is easy to confuse is that there are 2 things going on.
first is in case-control data, you must correct for the fact that you're
sampling retrospectively. Then whether or not you have case-control data
or prospectively collected data, "rare events" is defined as the fraction
of y's in the population, not in your sample. With rare events, logistic
regression (or logistic regression with a correction for case-control
sampling) underestimates Pr(Y=1). the corrections we discuss will fix
this problem.
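The "weight it back down" step can be illustrated with prior correction, which in its usual form adjusts only the intercept: b0 - ln[((1-tau)/tau) * (ybar/(1-ybar))], where tau is the population fraction of ones and ybar the sample fraction. The sketch below is illustrative Python, not ReLogit code; the function names and the numbers are hypothetical, and it does not include the separate rare-events finite sample correction.

```python
import math

def prior_corrected_intercept(b0, ybar, tau):
    """Shift a logit intercept estimated on a sample with fraction ybar of ones
    back to a population in which the fraction of ones is tau."""
    return b0 - math.log(((1.0 - tau) / tau) * (ybar / (1.0 - ybar)))

def prob(b0, b1, x):
    """Pr(Y=1 | x) for a one-covariate logit."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

# Hypothetical fit on a 50:50 sample when the population rate is 1%.
b0_sample, b1 = 0.2, 0.8
b0_pop = prior_corrected_intercept(b0_sample, ybar=0.5, tau=0.01)

print(prob(b0_sample, b1, 0.0))   # probability implied by the balanced sample
print(prob(b0_pop, b1, 0.0))      # probability after correcting the intercept
```

When ybar equals tau the correction term is zero, so a sample that already reflects the population fraction is left unchanged.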
Gary King
David Florence Professor of Government
Director, Harvard-MIT Data Center
http://GKing.Harvard.Edu, King(a)Harvard.Edu
Direct (617) 495-2027, Assistant (617) 495-9271
Data Center (617) 495-4734, eFax (928) 832-7022
On Mon, 11 Nov 2002, Siwik, Thomas (DE - Duesseldorf) wrote:
>
> Dear Mr. King,
>
> I found this interesting paper on your home page:
> Logistic Regression in Rare Events Data
> Gary King & Langche Zeng
>
> I have to evaluate a logistic regression in which
> the rare events are weighted equally to the frequent events
> to avoid the problem of bias in the coefficients.
>
> May I ask an expert in the field two simple questions:
> - Is a 50%/50% weighting an appropriate method to avoid bias?
> - Aren't the drawbacks even more biased betas and the risk
> of overfitting the rare events?
>
> I cannot find any reliable reference supporting this method.
> However, maybe it makes sense as an easy-to-apply rule of thumb.
>
> Below I have attached the short discussion I initiated
> on the s-news list.
>
> Thank you & Best regards
> Thomas Siwik
> ________________________________________________
>
> Dr. Thomas Siwik
> Deloitte & Touche
> Financial Risk Solutions
> Bahnstrasse 16
> 40212 Duesseldorf
> Germany
>
> eMail: tsiwik(a)deloitte.de
> Fon: ++49.(0)211.8772 - 147 / - 133
> Fax: ++49.(0)211.8772 - 443
> http: www.deloitte.de/Dienstl/Bran-fr1.htm
> ________________________________________________
>
>
> -----Original Message-----
> From: Siwik, Thomas (DE - Duesseldorf) [mailto:tsiwik@deloitte.de]
> Sent: Montag, 4. November 2002 20:24
> To: 'Hongjiew(a)aol.com'
> Cc: 's-news(a)lists.biostat.wustl.edu'
> Subject: Re: [S] general statistical issue: weighting observations in
> logi
>
> > Statistician (1999). One of the nice properties of logistic
> > regression (not sure it carries over to general logit models)
> > is that if oversample is done based on the response variable,
> > the coefficients estimates of the predictors are not changed.
> > Only the intercept needs to be adjusted. There are researches
>
> I understand your point. If one gives weight lambda to the
> observations with Y=1 the odds-ratio is - heuristically speaking -
> changed by lambda as well. That results in an adjustment of the
> intercept.
>
> However, couldn't it be an asymptotic property of the
> predictors? I tried to find your assertion in the normal equation:
> lambda*Sum(P(Y=0|x_i)*x_i;i|y_i=1) = Sum(P(Y=1|x_i)*x_i;i|y_i=0)
> If this equation is re-written with the adjusted P(Y) there
> remains still a term, which vanishes only for T->infinity. Maybe
> I put it wrong, but additionally I ran an example in S+ showing
> different estimates for all coefficients.
>
> My intuition tells me that the overweighting of rare events
> causes overfitting and high sensitivity to rare events. It
> seems to me not to be an appropriate method to reduce a possible
> bias of predictors of rare events.
>
> I found a very readable paper illustrating the problem of
> rare events:
> Logistic Regression in Rare Events Data
> Gary King & Langche Zeng
> http://gking.harvard.edu
>
> Thomas
>
> > -----Original Message-----
> > From: Hongjiew(a)aol.com [mailto:Hongjiew@aol.com]
> > Sent: Donnerstag, 31. Oktober 2002 17:48
> > To: fharrell(a)virginia.edu; dcts(a)dcts.de
> > Cc: s-news(a)lists.biostat.wustl.edu
> > Subject: Re: [S] general statistical issue: weighting observations in
> > logit-regression
> >
> >
> > I agree that if any weighted sampling is used, the parameters
> > need to be adjusted to make valid inference on the original
> > population. But there might be practical reasons to use
> > response-based sampling. In the area I am in (database
> > marketing), we frequently encounter situations where the
> > occurrences of events are very "rare", for example
> > Pr(Y=1)<=0.01. There will be computational problems associated with
> > the prediction; see "Predictive performance of the binary
> > logit model in unbalanced samples" by J. S. Cramer in The
> > Statistician (1999). One of the nice properties of logistic
> > regression (not sure it carries over to general logit models)
> > is that if oversampling is done based on the response variable,
> > the coefficient estimates of the predictors are not changed.
> > Only the intercept needs to be adjusted. There is research
> > comparing the statistical efficiency of
> > multiple sampling schemes in the logistic regression setting. For
> > example, suppose N (overall population) = 100K, of which 10% have
> > Y=1. One may take a response-based sample (so #y=1 is close to
> > #y=0) or one may take a true random sample. The first sample
> > can usually be much smaller than the second one and still generate
> > comparable estimates (see "The effect of sample size and
> > proportion of buyers in the sample on the performance of list
> > segmentation equations generated by regression analysis" by
> > Berger and Magliozzi in Journal of Direct Marketing).
> >
> >
> >
> >
> >
> > In a message dated 10/30/2002 8:27:15 PM Eastern Standard
> > Time, fharrell(a)virginia.edu writes:
> >
> > >
> > >
> > > On Wed, 30 Oct 2002 23:22:05 +0100
> > > DCTS <dcts(a)dcts.de> wrote:
> > >
> > > >
> > > > I am confronted with a Logit-regression, in which y=0 is much less
> > > > frequent than y=1. It is argued that the less frequent observations
> > > > with y=0 should receive higher weights in the regression, such that
> > > > the proportion is balanced between Ys being 0 and 1.
> > >
> > > Who argues that? No, you don't want to distort the data. If your
> > > sample is a random sample from the population to which you want to
> > > infer, then rely on maximum likelihood to give good parameter
> > > estimates. You weight observations if you oversampled a segment of
> > > the population and you want to represent the original population
> > > [even then don't always weight as this reduces efficiency when
> > > compared with covariate adjustment for oversampling factors].
> > >
> > > Frank Harrell
> > >
> > > >
> > > > To my knowledge there are usually two motivations to use weights
> > > > other than unity:
> > > > - prior knowledge of the probability of y=0
> > > > - optimisation of a cost function (in the example above y=0 is much
> > > > more expensive and should be predicted with higher attention)
> > > >
> > > > In my limited econometric library and on the internet I wasn't able
> > > > to find a discussion on the issue of weighting observations. If
> > > > someone has a good hint to a source or could sketch the ideas of
> > > > consequences, pros and cons I would be very pleased.
> > > >
> > > >
> > > > Thank you,
> > > > Thomas
> > > >
> > > >
> > > > --------------------------------------------------------------------
> > > > This message was distributed by s-news(a)lists.biostat.wustl.edu. To
> > > > unsubscribe send e-mail to s-news-request(a)lists.biostat.wustl.edu with
> > > > the BODY of the message: unsubscribe s-news
> > >
> > >
> > > --
> > > Frank E Harrell Jr Prof. of Biostatistics & Statistics
> > > Div. of Biostatistics & Epidem. Dept. of Health Evaluation Sciences
> > > U. Virginia School of Medicine http://hesweb1.med.virginia.edu/biostat
> > > --------------------------------------------------------------------
>
-
We show empirically and theoretically that Pr(Y=1)=p is underestimated in
logistic regression (at the stage of computing the coefficients and again
at the stage of computing p given the coefficients). this also implies
that 1-p is overestimated. the two cancel each other out in your
calculation, which means that the calculation doesn't bear on the issue of
bias.
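The equality of the summed fitted probabilities and the observed number of events follows from the first-order condition for the intercept in maximum likelihood logit, so it holds by construction and says nothing about bias in the individual probabilities. A small Python sketch with made-up data and a hand-written Newton-Raphson fit (not SAS or ReLogit code; all names are illustrative) shows the identity:

```python
import math

def fit_logit(x, y, iterations=50):
    """Maximum likelihood logit with an intercept and one covariate,
    fitted by Newton-Raphson starting from zero coefficients."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        p = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
        # Gradient of the log-likelihood with respect to (b0, b1).
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Information matrix (negative Hessian), stored as three entries.
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 linear system explicitly.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Made-up data: ten observations, four events, not perfectly separated.
x = [float(i) for i in range(10)]
y = [0, 0, 0, 1, 0, 0, 1, 0, 1, 1]

b0, b1 = fit_logit(x, y)
fitted = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
print(sum(fitted), sum(y))
```

Because the intercept's score equation is sum(y - p) = 0, the two printed totals agree up to numerical precision for any data set on which the fit converges.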
Gary King
: Gary King, King(a)Harvard.Edu http://GKing.Harvard.Edu :
: Center for Basic Research Direct (617) 495-2027 :
: in the Social Sciences Assistant (617) 495-9271 :
: 34 Kirkland Street, Rm. 2 HU-MIT DC (617) 495-4734 :
: Harvard U, Cambridge, MA 02138 eFax (928) 832-7022 :
On Mon, 11 Nov 2002, Harald Scheule wrote:
> Dear professor King,
>
> I am working at the business faculty of the University of Regensburg,
> Germany and we are estimating the probabilities of corporate defaults using
> logistic regression models. With this background I have read your article
> "Logistic regression in Rare Events Data" with great interest.
>
> In Chapter 5: Rare Event, Finite Sample Corrections you argue that the
> probability for an event is underestimated because of rare events and
> randomness of the estimated parameter.
>
> What I do not understand is: If I run a logistic regression e.g. with PROC
> LOGISTIC in SAS and estimate the default probabilities, their sum equals the
> observed sum of events. If probabilities are underestimated, shouldn't their
> sum be lower than the observed number of events?
>
> I would be very thankful if you could help me to solve my problem
>
>
> Harald Scheule_________________
>
> Harald Scheule
> Dipl.-Kfm.
> Lehrstuhl für Statistik
> Universität Regensburg
> 93040 Regensburg
> Germany
>
> Tel.: +49 (0)941/943-2287
> Fax.: +49 (0)941/943-4936
> ________________
>
-
Thanks for your note. There are no mins or maxes. If your events are not very
rare, or you have a great many observations, you'll find that you get the same
results as regular logit, so no harm done. (Be aware, though, that the
biggest differences normally come from calculating quantities of interest
like probabilities and such.)
Best of luck with your research.
Gary King
: Gary King, King(a)Harvard.Edu http://GKing.Harvard.Edu :
: Center for Basic Research Direct (617) 495-2027 :
: in the Social Sciences Assistant (617) 495-9271 :
: 34 Kirkland Street, Rm. 2 HU-MIT DC (617) 495-4734 :
: Harvard U, Cambridge, MA 02138 eFax (928) 832-7022 :
On Fri, 20 Sep 2002, Philip Roessler wrote:
> Dear Dr. Gary King,
>
> I am a graduate student at the University of Maryland working with
> Christian Davenport. He said I should refer to you on a question I have
> about your ReLogit Program for Stata. Is there a minimum number of rare
> observations that should be included in the data, either a minimum
> absolute number or a minimum proportion, when running the ReLogit
> program? For example, I have a dataset with 1900 observations and 19
> (or 1 per cent) of those observations are "yes" observations and the
> rest are "no." Are there too few "yes" observations even for the
> ReLogit statistical program? I know in your International Organization
> article you use the ReLogit program with a dataset that has only 0.3%
> war observations, but over 1,000 absolute observations.
>
> Thank you in advance for your help.
>
> Sincerely,
>
> Philip Roessler
> Research Assistant
> Integrated Network for Societal Conflict Research (INSCR)
>
-