Re: Rare Event Logistic Regression - Relogit

27 Jul 2005

thanks for your note.  pls see below...

On Wed, 27 Jul 2005, Parry Clarke wrote:

...
  Dear Professor King,

 My name is Parry Clarke and I am currently writing up my PhD on baboon 
 intersexual conflict. I'm sorry to bother you but I was wondering if you 
 could answer some questions regarding Logistic Regression of Rare events.

 I am currently attempting to model the occurrence of male aggression 
 directed at oestrous females. The behaviour itself is fairly rare and so 
 I have binary coded its occurrence. In total I have 1745 sampling units, 
 or rows of data, with only 21 of these coded 1 and 1724 coded 0. 
 Initially I carried out standard logistic regression, but found that 
 when it came to diagnostics all my influential points were my 1s, so 
 that if deletion diagnostics were performed I was left with no variance. 
 In addition, my intercepts did not seem very convincing.

 However, having now discovered your work on the subject and the package you 
 have written for R, things are looking up! I just have a couple of 
 questions:

 1) Are there diagnostics unique to the rare event analysis?
 2) Is influential point examination redundant in rare event logistic 
 regression?
 3) How do I get deviance estimates of the final model? 

relogit estimates the same coefficients as logit from the same model, 
but the estimates do differ, in order to get better properties.  the 
analogy is weighted least squares: wls will give different answers from 
least squares -- and will fit the data less well -- yet we generally 
prefer wls to ls when there are weights available.  so in both relogit 
and wls, you have the same issue of how to deal with diagnostics.  there 
are no special diagnostics, but in both (and lots of other methods) the 
issue is that you can't really treat all the observations equally: an 
outlier in one observation doesn't matter the same as an outlier in 
another.  so in relogit for example, an extra 1 inadvertently included 
in the dataset will be much more consequential than an extra 0.
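
To put a rough number on that last point, here is a minimal sketch in Python (standard library only). It assumes an intercept-only logit, which is a simplification not made in the thread; in that special case the maximum likelihood intercept is simply log(number of 1s / number of 0s). The counts are the ones quoted in the question, and the function name is made up for illustration.

```python
import math

def intercept_only_logit(n_ones, n_zeros):
    """MLE of the intercept in a logit model with no covariates: log(n1 / n0)."""
    return math.log(n_ones / n_zeros)

# counts quoted in the question: 21 ones and 1724 zeros
base = intercept_only_logit(21, 1724)

# effect of one inadvertently included observation of each kind
shift_extra_one = intercept_only_logit(22, 1724) - base
shift_extra_zero = intercept_only_logit(21, 1725) - base

print(f"baseline intercept:       {base:.4f}")
print(f"shift from one extra 1:   {shift_extra_one:+.5f}")
print(f"shift from one extra 0:   {shift_extra_zero:+.5f}")
print(f"ratio of absolute shifts: {abs(shift_extra_one / shift_extra_zero):.0f}")
```

With covariates in the model the arithmetic is no longer this simple, but the asymmetry between the rare and the common outcome should point the same way.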

...
  4) Part of my analysis has been trying to relate other forms of male 
 aggression to oestrous female-directed aggression. However, these other forms 
 are also rare and so when I enter them into the analysis as a dichotomous 
 explanatory variable I get answers that are not really supported by the 
 observed data. For example, overlap between the 1s in the response and 
 explanatory variables may only be 2 or 3 data points, but the final model 
 suggests that a large amount of the deviance is explained and the 
 coefficients are highly significant. Is this simply an artifact of the rarity 
 of both the dependent and independent variable, and am I therefore better 
 off excluding them from the multivariate analysis and doing separate 
 contingency table analysis with them? 

rare events in explanatory variables are a different issue, involving the 
sensitivity of the estimates to the coding of X.  since almost all 
relevant models are conditional on X, you have little choice but to run 
the analysis as is unless you reconceptualize the project (such as 
treating both variables as dependent variables).
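
The sensitivity to the coding of X can be sketched with a 2x2 table in Python. In a logit with a single binary regressor the slope equals the log odds ratio of the table, and its Wald standard error is the usual square root of the summed reciprocal cell counts. The 1745 rows and 21 ones come from the question; the figure of 30 rows with x = 1 is purely hypothetical, since the thread does not give it, and the helper names are made up for illustration.

```python
import math

def log_odds_ratio(a, b, c, d):
    """Log odds ratio and its standard error for a 2x2 table.

    a: x=1,y=1   b: x=1,y=0   c: x=0,y=1   d: x=0,y=0
    """
    estimate = math.log((a * d) / (b * c))
    std_err = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return estimate, std_err

N_ROWS, N_Y1 = 1745, 21   # from the question
N_X1 = 30                 # HYPOTHETICAL: rows with x = 1 (not given in the thread)

def table(overlap):
    """Cell counts when `overlap` rows have both x = 1 and y = 1."""
    a = overlap
    b = N_X1 - overlap
    c = N_Y1 - overlap
    d = N_ROWS - a - b - c
    return a, b, c, d

# recoding a single overlapping row (3 -> 2) moves the estimate noticeably
for overlap in (3, 2):
    est, se = log_odds_ratio(*table(overlap))
    print(f"overlap={overlap}: coef={est:.2f}, se={se:.2f}, z={est / se:.2f}")
```

Under these assumed counts a coefficient can look strongly significant while resting on two or three rows, and changing the coding of one of them shifts it substantially.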

...
  5) In a number of your articles (e.g. Explaining Rare Events in
  International Relations) you talk at length about sampling strategies
  and database trimming to create a favourable ratio of 0s to 1s and to cut
  down on costs. In a situation such as mine, where the data is collected
  and the database fixed, is there any need to trim and subset? If so, how
  do you suggest I go about that, and is there a command in R for randomly
  selecting subsets of data? 

if you have the data, you should use it.  no reason to subsample at that 
point.  we do it to demonstrate what would happen if you couldn't afford 
to collect all the data, as is the case in many fields.  but more data are 
better generally and here too.
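
For readers who do want to see what subsampling the 0s would look like, here is a small Python sketch (the question asks about R, where `sample()` plays the role of `random.sample` below). It keeps all 21 ones, draws a hypothetical 210 zeros at random, and applies the prior-correction adjustment to the intercept used in the rare-events literature, subtracting ln[((1 - tau) / tau) * (ybar / (1 - ybar))], where tau is the fraction of 1s in the full data and ybar the fraction in the subsample. The intercept-only model and the subsample size are assumptions made for illustration only.

```python
import math
import random

N_ONES, N_ZEROS = 21, 1724          # counts from the question
tau = N_ONES / (N_ONES + N_ZEROS)   # fraction of 1s in the full data

rng = random.Random(0)              # fixed seed so the subset is reproducible
zero_rows = list(range(N_ZEROS))    # stand-in row indices for the 0s
kept_zeros = rng.sample(zero_rows, 210)   # HYPOTHETICAL subsample size

# intercept-only logit on the subsample: all the 1s plus the sampled 0s
ybar = N_ONES / (N_ONES + len(kept_zeros))
naive = math.log(ybar / (1 - ybar))

# prior correction: subtract ln[((1 - tau) / tau) * (ybar / (1 - ybar))]
corrected = naive - math.log(((1 - tau) / tau) * (ybar / (1 - ybar)))

# intercept the full data would have given
full_data = math.log(N_ONES / N_ZEROS)

print(f"naive={naive:.4f} corrected={corrected:.4f} full={full_data:.4f}")
```

In this simplified case the corrected intercept matches the full-data one, which is consistent with the advice above: subsampling is a way to save on data collection, not something that improves on data already in hand.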

...

 Once again I am sorry to bother you with such a lengthy email and I hope it 
 is not too much of an imposition. 

best of luck with your research,

Gary King

...

 Yours sincerely,

 Parry Clarke.
