Dear ReadMe users,
I have two questions I would appreciate help with. I am working with a
stratified sample of articles in which the probability of selection into
the sample varies for different types of articles. Does anyone know of a
good way to weight articles for analysis in ReadMe to get a better estimate
of category proportions in the population of articles from which the sample
was drawn? The only approach I have come up with so far is to duplicate
articles in proportion to the inverse of their selection probability.
Three strata had a 10%, 15%, and 100% probability of being included in the
sample, so I duplicated their articles 30, 20, and 3 times respectively
(the ratio of the inverse probabilities 10, 6.67, and 1). This seems like a
crude weighting scheme, though, and it could have undesirable consequences.
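To make the arithmetic behind those duplication counts explicit, here is a
minimal Python sketch. It assumes only the three selection probabilities
stated above; the variable names are illustrative and nothing here is part
of ReadMe itself.

```python
from fractions import Fraction
from math import lcm  # available in Python 3.9+

# Selection probabilities for the three strata, as stated in the post.
selection_prob = [Fraction(10, 100), Fraction(15, 100), Fraction(1)]

# Inverse-probability weights: each sampled article stands in for
# 1/p articles in the population it was drawn from.
weights = [1 / p for p in selection_prob]

# Scale by a common multiple of the denominators so that every weight
# becomes a whole number of copies.
scale = lcm(*(w.denominator for w in weights))
dup_counts = [int(w * scale) for w in weights]

print(weights)     # exact weights as fractions
print(dup_counts)  # whole-number duplication counts
```

Exact fractions are used instead of floats so that the 15% stratum (weight
20/3) scales to a whole number without rounding.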
Second, for this same sample of articles, I want to look at change in
category proportions by year from 1980 to 2012. The full sample
(about 1,200 articles) has been hand-coded for 1980-2010, but I want to use
ReadMe to cross-validate the hand-coded estimates of category proportions
over time. Some of the yearly samples are so small (< 50 articles) that the
ReadMe analysis won't even converge, but I can achieve convergence by
pooling years (it seems to converge once there are at least 70 articles or
so). Does a sample of that size (~70 articles), with about one third of it
used as the training set (although I could change that), pose problems for
the validity of the ReadMe results, or can I expect relatively unbiased
estimates even with a sample this small?
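For reference, the pooling of years described above can be sketched roughly
as follows. This is only an illustration in Python: the function name
pool_years, the greedy merging rule, and the per-year counts are all
hypothetical, and only the threshold of about 70 articles comes from what I
observed.

```python
def pool_years(counts_by_year, min_articles=70):
    """Greedily merge consecutive years until each bin has >= min_articles.

    Returns a list of (first_year, last_year, n_articles) tuples.
    """
    bins = []
    current_years, current_total = [], 0
    for year in sorted(counts_by_year):
        current_years.append(year)
        current_total += counts_by_year[year]
        if current_total >= min_articles:
            bins.append((current_years[0], current_years[-1], current_total))
            current_years, current_total = [], 0
    # Trailing years that never reach the threshold join the last bin.
    if current_years:
        if bins:
            start, _, total = bins.pop()
            bins.append((start, current_years[-1], total + current_total))
        else:
            bins.append((current_years[0], current_years[-1], current_total))
    return bins


# Hypothetical article counts per year, NOT the real data.
example_counts = {1980: 40, 1981: 35, 1982: 80, 1983: 20, 1984: 30}
pooled = pool_years(example_counts)
print(pooled)
```

Each resulting bin would then be analyzed as one time period, at the cost of
coarser resolution in the over-time comparison.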
Thanks for any guidance,
Derek
--
Derek Burk
PhD Candidate
Department of Sociology
Northwestern University