Professor King and other ei practitioners:
I am hoping for some clarification and/or advice. My political science
colleague and I study political movements and elections in Ecuador. We
have analyzed some of the results of the 2002 elections there, primarily
the votes for President in the first and second rounds. We are
particularly interested in the voting differences between indigenous
peoples (hereafter Indians) and others (mestizos, blanco-mestizos,
etc.). I won't go into the methodological problems of estimating
ethnicity here. We have the data at the parish level (the closest thing
to a precinct) and have looked at the relationship between %Indian and
% voting for Presidential Candidate G, who was in an alliance with an
indigenous-led political movement. The hypothesis is simple: a larger
proportion of Indians than of non-Indians should vote for candidate G.
For the 943 parishes we first ran the "Goodman regression" and found
that the estimated proportion of Indians casting their vote for
candidate G is .465 and for non-Indians it is .163. Then, using the ezi
program, we ran the regular or first-stage ei, and the "Aggregate
Quantities of Interest" are .463 for Indians and .164 for non-Indians.
These results, and others, are very close between the "no-intercept" OLS
regression and ei.
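To make the "no-intercept" setup concrete, here is a minimal sketch of a Goodman regression in Python. It runs on synthetic data, not our Ecuador parishes; the function name goodman_regression, the simulated proportions, and the noise level are all illustrative assumptions.

```python
import numpy as np

def goodman_regression(x, t, weights=None):
    """No-intercept least squares of t on x and (1 - x).

    x: proportion Indian in each parish
    t: proportion voting for candidate G in each parish
    weights: optional parish sizes for weighted least squares
    Returns (estimate for Indians, estimate for non-Indians).
    """
    x = np.asarray(x, dtype=float)
    t = np.asarray(t, dtype=float)
    design = np.column_stack([x, 1.0 - x])  # two columns, no constant
    if weights is not None:
        # Weighted least squares: scale rows by sqrt(weight)
        w = np.sqrt(np.asarray(weights, dtype=float))
        design = design * w[:, None]
        t = t * w
    coef, *_ = np.linalg.lstsq(design, t, rcond=None)
    return coef[0], coef[1]

# Synthetic illustration only (hypothetical values, not the real results)
rng = np.random.default_rng(0)
n = 943
x = rng.uniform(0, 1, size=n)
t = 0.46 * x + 0.16 * (1 - x) + rng.normal(0, 0.02, size=n)

b_ind, b_non = goodman_regression(x, t)
print(round(b_ind, 3), round(b_non, 3))
```

The two coefficients can be read as group-level vote proportions only under Goodman's assumption that those proportions do not vary systematically with %Indian across parishes.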
So, it seems we have two results: 1) a much higher proportion of
Indians voted for candidate G than did non-Indians and,
2) there is not an "ecological fallacy" or aggregation problem with that
conclusion. The thing is, we have a very important "control" variable -
region. There are very strong regional differences in Ecuador in voting
patterns. There are three regions: Coast, Sierra, and Oriente (jungle).
The third is not very important since only 3% of the population lives
there. Candidate G is a Sierra (and Oriente) candidate and he did not
receive a lot of support on the Coast (especially in the first round).
This confounds the ethnic differences because Indians live
overwhelmingly in the Sierra, not on the Coast.
I have read numerous other articles using ei, including those going
on to a "second-stage" and the exchange between Herron and Shotts and
Adolph and King. But we are not interested in analyzing the variance in
the ei estimates of the proportion of Indians voting for candidate G
across the parishes, which is the equivalent of what the other
researchers have been doing. If the ecological inference issue has been
resolved, i.e., a higher % of Indians really did vote for candidate G
than did non-Indians, would it not be appropriate to just return to
simple OLS regression if I want to explain variance in votes for
candidate G across parishes, with %Indian and two dummy variables
representing the Coast and Oriente regions of Ecuador? The aim is to
resolve whether a higher proportion of Indians than of non-Indians voted
for candidate G within the Sierra, which is the case. By
the way, we also estimated this by "regular" ei by just using the
parishes in the Sierra, which once again produced estimates very close
to the Goodman regression.
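For concreteness, the model we have in mind can be sketched as a weighted OLS with an intercept, %Indian, and Coast and Oriente dummies (Sierra as the reference region). This is only a sketch on synthetic data: the helper weighted_ols, the variable names, and every simulated number below are hypothetical and are not our actual results.

```python
import numpy as np

def weighted_ols(X, y, w):
    """Weighted least squares via row scaling by sqrt(weight)."""
    sw = np.sqrt(np.asarray(w, dtype=float))
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return coef

# Synthetic parishes (hypothetical values for illustration only)
rng = np.random.default_rng(1)
n = 943
region = rng.choice(["Sierra", "Coast", "Oriente"], size=n, p=[0.45, 0.45, 0.10])
coast = (region == "Coast").astype(float)
oriente = (region == "Oriente").astype(float)

# Indians concentrated outside the Coast, mimicking the confounding described
pct_indian = np.where(region == "Coast",
                      rng.uniform(0, 0.1, size=n),
                      rng.uniform(0, 0.9, size=n))
parish_size = rng.integers(500, 20000, size=n).astype(float)

vote_g = (0.20 + 0.30 * pct_indian - 0.12 * coast + 0.03 * oriente
          + rng.normal(0, 0.02, size=n))

# Design matrix: constant, %Indian, Coast dummy, Oriente dummy
X = np.column_stack([np.ones(n), pct_indian, coast, oriente])
b0, b_indian, b_coast, b_oriente = weighted_ols(X, vote_g, parish_size)

print("Sierra non-Indian baseline:", round(b0, 3))
print("Indian vs. non-Indian gap: ", round(b_indian, 3))
print("Coast shift:", round(b_coast, 3), " Oriente shift:", round(b_oriente, 3))
```

Note that this additive specification assumes the Indian versus non-Indian gap is the same in all three regions; letting it differ by region would require interaction terms between %Indian and the dummies.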
SO, I JUST WANT TO KNOW WHETHER USING OLS REGRESSION WITH OUR
ORIGINAL DEPENDENT VARIABLE, THE %INDIAN PREDICTOR AND A COUPLE OF
CONTROL VARIABLES IS AN ACCEPTABLE APPROACH. (By the way, in the OLS
regressions we do weight the cases by size of parish).
I appreciate any comments,
Scott H. Beck, Professor
Department of Sociology and Anthropology
East Tennessee State University
Johnson City, TN 37614
Tel.: 423-439-6648
Email: r30scott(a)etsu.edu