Dear lists
I am using both MatchIt and Zelig to perform propensity score and post hoc
analyses. It appears, particularly with Zelig, that the only way I can
specify an equation is to first attach the dataset to the environment. The
unfortunate consequence of doing this is that some variables in the dataset
are masked and inaccessible. I have tried running this program using with(
{} ) and by prepending the dataset name to the variables in my equations,
but each of these strategies causes other errors (e.g., variables not being
found, or confusion between the dataset name and the variable name). So my
question is whether there are other alternatives that could be used in lieu
of attach(), or ways of avoiding variable masking.
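For reference, the attach()-free patterns I have been trying look roughly
like this (a sketch with made-up variable names, shown with lm(), since the
same data-argument idea is what I had hoped zelig() would accept):

```r
# placeholder data; df, y, x1, x2 are made-up names
df <- data.frame(y = rnorm(20), x1 = rnorm(20), x2 = rnorm(20))

# 1. pass the data frame through the data argument -- no attach()
fit1 <- lm(y ~ x1 + x2, data = df)

# 2. evaluate the whole call inside with(), so the columns are
#    visible only for the duration of that one call
fit2 <- with(df, lm(y ~ x1 + x2))

all.equal(coef(fit1), coef(fit2))   # identical fits, no masking
```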
Thanks in advance
Barth
Dear Zelig team,
I am an HMS student trying to implement the "coxph" model in Zelig. The
most recent version appears not to have it implemented, as it gives the
error:
> ** The model "coxph" is not available with the currently loaded packages,
> ** and is not an official Zelig package.
> ** The model's name may be a typo.
After sleuthing, I found that an older version, Zelig 3.5.4, does have it
implemented, and I have been running my analyses with that version. I am at
the stage where I might submit my findings to a conference and wanted to
double-check that the results from the older version should not be grossly
wrong.
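In case it helps others, the way I pinned the older release looks roughly
like this (a sketch; it assumes the devtools package is installed and that
3.5.4 is available in the CRAN archive):

```r
# one-time setup, not part of the analysis scripts
# install.packages("devtools")
devtools::install_version("Zelig", version = "3.5.4")
library(Zelig)
```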
Thanks so much for your help!
Best,
Sameer
Hi there,
I posted to the Google group but didn't receive a response, so I thought I
would try here...
I get the following error when using setx with twosls:
Error in form[[3]] : subscript out of bounds
A reproducible example:
data <- data.frame(y = rnorm(101), x = rnorm(101), z = rnorm(101))
fml <- list("mu1" = y ~ x,
            "mu2" = x ~ z,
            "inst" = ~ z)
z.out <- zelig(formula = fml, model = "twosls", data = data)
x.out <- setx(z.out)
Any ideas?
Thanks!
yph
We are excited to announce the first version of a complete top-to-bottom
rewrite of Zelig: Everyone’s Statistical Software <http://zeligproject.org>.
Zelig is an easy-to-use, free, open source, general purpose statistics
program for estimating, interpreting, and presenting results from any
statistical method. Zelig turns the power of R, with thousands of open
source packages -- but with free ranging syntax, diverse examples, and
documentation written for different audiences -- into the same three
commands and consistent documentation for every method. Zelig uses R code
from many researchers, making it “everyone’s statistical software.” It is
the easiest way to learn new methods and use them immediately.
More information is at our new project page: http://zeligproject.org.
**For users**, our new architecture makes Zelig more capable and a much
more stable platform. We now have automated code checking, so bugs should
be infrequent or fixed automatically. With the new architecture, we will be
quickly expanding the range of included models and the available ways to
interpret, diagnose, and evaluate them. We have written functions that
allow old Zelig code using the zelig(), setx(), and sim() calls to
continue to work in the present version; however, please see the new,
simplified ways of implementing these steps in the analysis.
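As a quick illustration, the classic three-command workflow referred to
here looks roughly like this (a sketch using the built-in cars data and the
least-squares model; the exact printed output depends on the Zelig version
installed):

```r
library(Zelig)

# 1. estimate: fit least squares of stopping distance on speed
z.out <- zelig(dist ~ speed, model = "ls", data = cars)

# 2. set covariates: hold speed at its mean (setx's default)
x.out <- setx(z.out)

# 3. simulate quantities of interest from the fitted model
s.out <- sim(z.out, x = x.out)
summary(s.out)
```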
**For model developers and package writers**, the new architecture makes
it much simpler to incorporate your model or methods into the Zelig
framework, giving your methods more visibility and ease of use. Zelig also
now gives you much infrastructure you can use in your package without
having to write it yourself. You will have available all the methods for
substantive interpretation (expected values, predicted values, first
differences, etc.), test diagnostics (bootstraps, jackknives, small-sample
bias corrections), and utilities (seamless integration with multiply
imputed data, matched data, and weighting), among other features. You can
focus on writing innovative new models and leave the time-consuming
pragmatic utilities that help your users to Zelig. Writing the few bridge
functions needed to make your package usable within Zelig will also ensure
your packages, methods, and papers get the visibility they deserve.
The present version of Zelig has more than 28 statistical models
<http://docs.zeligproject.org/en/latest/>, and the list is set to grow
continuously. You can see our whole development path
<https://github.com/IQSS/Zelig/milestones>, milestones, and all the new
models we plan to add. Please feel free to make requests or add your own;
we will update this regularly.
**Mailing List** We are transitioning our long-standing mailing list to a
Google group here
<https://groups.google.com/forum/?hl=en#!forum/zelig-statistical-software>.
We hope you will share your feedback, ideas, concerns, or issues in this
forum. Also feel free to raise issues on the GitHub issue queue
<https://github.com/IQSS/Zelig/issues>, where you can see our progress on
features we are developing and more information about the milestones that
will mark each upcoming release.
Gary
--
*Gary King* - Albert J. Weatherhead III University Professor - Director,
IQSS <http://iq.harvard.edu/>- Harvard University
GaryKing.org <http://garyking.org/> - King(a)Harvard.edu - @KingGary
<https://twitter.com/kinggary> - 617-500-7570 - fax 812-8581 - Assistant
<king-assist(a)iq.harvard.edu>: 495-9271
Hello:
I was referred to this list by Dr. Merce Crosas by way of Dr. Gary King. I've been involved in a "freebie" project, an offshoot of a project funded by the Alzheimer's Association to create a scale to measure "caregiver burden." Back in 2008, my former social work chief and a colleague involved in giving support group therapy to caregivers of veterans with Alzheimer's approached me to adapt and validate a Spanish version of the Marwit Meuser Caregiver Grief Inventory (Short Form). I had experience doing so (in the late 1990s and early 2000s) with the MPSI, a 312-item battery for screening psychosocial problems by Walter Hudson, so I agreed.
The project went very slowly, as our sample came in gradually through volunteers in the VA caregiver support groups, and we also experienced a one-year research suspension at our hospital. In 2013 we finally finished collecting the data and proceeded to analyze it. I'm not fully in academia (I have been a part-time lecturer at a local university school of social work, and my research has been sporadic), so I wasn't up to date on the issues of dealing with missing data until an expert psychometrician who has published on the topic questioned my approach to handling it. I had done the traditional scale-average substitution for missing data points. He said multiple imputation was the approach to use, so I read a bit and agreed. That was in June 2014. This meant redoing all my prior analyses, and the bad news was that we did not have the software at the VA.
I found out the software was free, but the VA has a rigorous protocol for installing new software: it must be reviewed by an expert group at the national level. So the process started with R, followed by Amelia II, then Zelig, and then lavaan (we also tested one called FACTOR). The review and approval process took five months, so now we are ready to roll.
As this is a "freebie" for the social work profession, we don't have a programmer, and I've had to read the manuals and do the work myself (I knew mainframe SPSS and SPSS/PC well, but the version we had did not do multiple imputation, and the purchase price of their programs was not in our zero budget). So I've learned to run a bit of R, and I managed to run Amelia II and create the five multiply imputed data frames of our scale.
The scale has 18 items with responses in a 5-point Likert format (1 through 5). We have the Spanish-version responses of 100 subjects from an approved R&D project. The short-term goal is to get the five data frames integrated into one data frame using Zelig. After that we will run lavaan to see whether the dimensions of the Spanish version are the same as those of the English version of the instrument.
I've downloaded the Zelig manual but couldn't find the specifics on how the program integrates five data sets into one under R. I've written to the author of the manual and to Dr. Merce, and it seems they are hopeful that a technical programmer may have a simple solution to this issue. Please be aware that we cannot export the data outside of the VA and the work must be done on my PC, so some example programming statements may suffice.
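For what it may be worth, the usual Amelia-to-Zelig pattern I have pieced
together from the manuals looks roughly like this (a sketch only:
'scale_data' and the formula are placeholder names, and the exact way the
imputations are passed to zelig() depends on the Zelig version, so please
check it against the installed manual). Note that, as I understand it,
Zelig does not literally merge the five data frames; it fits the model on
each imputation and pools the estimates by Rubin's rules:

```r
library(Amelia)
library(Zelig)

# 'scale_data' stands in for the 100-subject, 18-item data frame
a.out <- amelia(scale_data, m = 5)   # creates five imputed data sets

# one commonly documented form: pass the list of imputations as data;
# the formula here is made up purely for illustration
z.out <- zelig(item1 ~ age + sex, model = "ls",
               data = a.out$imputations)
summary(z.out)   # pooled estimates across the five imputations
```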
Any help will be appreciated. I can promise you a free lunch and a tour of the Piñones beach area if you come down to PR when things get too cold up there.
Jaime Alvelo, DSW
VA Caribbean Healthcare System
R&D Service (151)
10 Casia Street
San Juan, PR 00921-3201
787-641-7582 Ext 10175
Main Building-Basement-Office A55
Jaime.Alvelo(a)va.gov
PS: I'm at the VA Mondays, Wednesdays, and Fridays.
E-mail at home: aibonito(a)caribe.net
Cellular phone on Tuesdays and Thursdays: 787-306-4934
The current version of the DESCRIPTION file for Zelig has 12 packages
listed under Depends. Writing R Extensions explains why this is a bad idea.
http://cran.r-project.org/doc/manuals/r-release/R-exts.html#Package-Depende…
Key passage:
"Field ‘Depends’ should nowadays be used rarely, only for packages which
are intended to be put on the search path to make their facilities
available to the end user (and not to the package itself): for example it
makes sense that a user of package *latticeExtra*
<http://CRAN.R-project.org/package=latticeExtra> would want the functions
of package *lattice* <http://CRAN.R-project.org/package=lattice> made
available.
Almost always packages mentioned in ‘Depends’ should also be imported from
in the NAMESPACE file: this ensures that any needed parts of those packages
are available when some other package imports the current package."
This is especially true for Zelig because of naming conflicts between some
of the packages in Depends. For me (and others?) the key issue involves
dplyr and MASS. Both have a function called "select." That function is
critically important to any user of dplyr. (I don't think it matters much
to users of MASS.) But because MASS is listed before dplyr in Depends in
Zelig, there is no way (if you have Zelig loaded) to have "select" refer to
the function in dplyr.
Also, including both plyr and dplyr in Depends seems very sloppy.
The solution (and current best practice in R) is to avoid Depends where
possible and to import from a package like MASS only the functions you
actually need.
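Concretely, the fix described above would look something like this sketch
of the two files (the package and function names are illustrative, not
Zelig's actual import list):

```
# DESCRIPTION (sketch)
Imports:
    MASS,
    dplyr

# NAMESPACE (sketch): import only the functions actually used
importFrom(MASS, mvrnorm)
importFrom(dplyr, select)
```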
Thanks.
I *believe* that there is a bug in Zelig's scoping. This exists in the CRAN
version (which is where it bit me) but also in the development version.
> packageVersion("Zelig")
[1] "5.0.2"
>
Interactively, the Zelig demo works fine:
> data(cars)
> x <- cars
> zelig(dist ~ speed, data = x, model = "ls")
How to cite this model in Zelig:
Kosuke Imai, Gary King, and Olivia Lau. 2007.
...
>
Note that assigning cars to x and then passing x as the argument to data is
needed to demonstrate the bug. The problem comes when we wrap this code
within a function.
> rm(list = ls())
> test_function <- function(){data(cars); x <- cars; zelig(dist ~ speed,
data = x, model = "ls")}
> test_function()
Error in callSuper(formula = formula, data = data, ..., weights = NULL, :
object 'x' not found
How to cite this model in Zelig:
Kosuke Imai, Gary King, and Olivia Lau. 2007.
...
>
It is the Error above that is the problem. (Strangely, the code still seems
to work in the development version after issuing that Error message. In the
CRAN version, it does not.)
Note that you need to keep your workspace clean to see this clearly. See
the rm(list = ls()) above. If you don't issue this command (after running
the interactive code), you don't get any error:
> rm(list = ls())
> data(cars)
> x <- cars
> zelig(dist ~ speed, data = x, model = "ls")
How to cite this model in Zelig:
Kosuke Imai, Gary King, and Olivia Lau. 2007.
...
> test_function <- function(){data(cars); x <- cars; zelig(dist ~ speed,
data = x, model = "ls")}
> test_function()
How to cite this model in Zelig:
Kosuke Imai, Gary King, and Olivia Lau. 2007.
...
>
No Error because Zelig finds x in the global environment. Of course, it
should be using the local x.
My colleague thinks that, in the CRAN version, this snippet of code is
wrong:
divided.data <- eval(call("multi.dataset", substitute(data)))
But I am not enough of an R guru to determine the cause of the bug.
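For what it's worth, the general pitfall -- evaluating a substitute()d
argument without the caller's environment -- can be reproduced in base R
alone (a hypothetical sketch; bad_fit and good_fit are made-up names, not
Zelig functions):

```r
# bad_fit reconstructs a call from the unevaluated argument and
# evaluates it in its own frame, so the caller's local 'x' is invisible
bad_fit <- function(data) {
  eval(call("nrow", substitute(data)))
}

# good_fit evaluates the reconstructed call in the caller's frame
good_fit <- function(data) {
  eval(call("nrow", substitute(data)), envir = parent.frame())
}

caller <- function() {
  x <- mtcars  # local data frame, analogous to 'x <- cars' above
  list(bad  = tryCatch(bad_fit(x),
                       error = function(e) conditionMessage(e)),
       good = good_fit(x))
}
caller()
# 'bad' fails with "object 'x' not found" because eval() looks up 'x'
# in the wrong environment; 'good' returns nrow(mtcars)
```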
Thanks,
Dave Kane
Hi everyone,
I am running a logit.bayes model, and I get results that closely resemble
regular logistic regression. I need to determine the importance of the
predictors.
I want to try to determine which coefficients are more "powerful." My
predictors include ordered, numeric, and binary variables. I attempted to
standardize them using the rescale function from the arm package, but the
process fails for ordered variables (the coefficients do not appear to be
standardized). In general, standardizing variables to obtain beta
coefficients may not be the most efficient way to determine the importance
of variables.
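For the numeric predictors at least, base R's scale() shows what
standardized (per standard deviation) coefficients should look like, which
gives a benchmark for the rescale output (a sketch with simulated data and
a plain glm(); it does not solve the ordered-factor case):

```r
set.seed(1)
d <- data.frame(y  = rbinom(200, 1, 0.5),
                x1 = rnorm(200, mean = 50, sd = 10),  # wide scale
                x2 = rnorm(200))                      # unit scale

# standardize the numeric predictors to mean 0, sd 1
d$x1s <- as.numeric(scale(d$x1))
d$x2s <- as.numeric(scale(d$x2))

# ordinary logistic fit on the standardized predictors; the slopes
# are now per-standard-deviation and directly comparable
fit <- glm(y ~ x1s + x2s, data = d, family = binomial)
coef(fit)
```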
Another way to do this is to use a model information criterion or similar
value, for example DIC (or R^2 in traditional regression). I know that
rjags does report DIC, but I would have to build the models manually there.
Is there a way to obtain DIC for Zelig Bayesian models?
If anyone has an alternative way to determine the importance of predictors
when predictors are on different scales, I would love to hear about it.
Thanks,
Michael
Hi Zelig Helplist,
I am looking to add a covariate to several ei.RxC models I have run, but
I've run into several problems. I think I may be issuing a command
incorrectly...
When I added a covariate to my model, my output changed slightly. But then
I realized that the output was the same no matter which covariate I used!
In another model, adding a covariate didn't change the output at all. And
I was remembering to set the covariate to its mean.
The identity columns I am using are counts of individuals, as opposed to
percentages. So I thought the problem was that I was using a covariate
expressed as a decimal proportion. I replaced it with the whole-number
equivalent, and the same result occurred.
I'm attaching a picture of the end of my dataset to this email. You'll see
that the ethnic categories I'm using, such as Mo and Sisala, are reported
as whole population counts. The covariate I am attempting to use is the
percentage of urban residents in a particular district: 'urban' is listed
as a decimal proportion, but I also tried 'urban2', which is a whole
population count (i.e., the district population total * urban).
Below is some sample code:
library(foreign)    # read.dta / write.dta
library(Zelig)
library(stargazer)
STATA <- read.dta(file.choose())
#using 2012registrationtoEstimates
View(STATA)
set.seed(7)
z.out <- zelig(cbind(registeredvoters, nonregistered) ~ agona + ahafo +
ahanta + akuapem + akwamu + akyem + aowin + asante + asen + boron + chokosi
+ denkyira + evalue + fante + kwahu + nzema + sefwi + wasa + bawle +
otherakan + dangme + ga + otherga + ewe + guan1 + guan2 + guan3 + guan4 +
guan5 + guan6 + guan7 + guan8 + otherguan + bimoba + kokomba + basare +
pilapila + salfalba + kotokoli + chamba + othergurma + builsa + dagarte +
wali + dagomba + kusasi + mamprusi + namnam + nankansi + nanumba + mosi +
othermole + kasena + mo + sisala + vagala + othergrusi1 + othergrusi2 +
busanga + wangara + othermande + otherinside + otheroutside, COVAR =
~urban2, model = "ei.RxC", data = STATA)
x.out <- setx(z.out)
s.out <- sim(z.out, num = 100)
var7 <- summary(s.out)
dataframe <- as.data.frame(var7$qi.stats)
write.dta(dataframe, "2012regisURBANnew.dta")
regis2012URB <- read.table(file.choose(), header = TRUE)
View(regis2012URB)
stargazer(regis2012URB[ , ], style = "apsr", rownames = FALSE, summary = FALSE)
Thanks very much,
Jennifer
--
PhD Candidate
Department of Political Science
University of Florida
PO Box 117325
Gainesville, FL 32611-7325