Olivia Lau writes:
> In any case, I suggest that if you have further issues with Zelig that you
> contact me personally. This isn't an appropriate discussion for a public
> users' forum.
I do not have "issues with Zelig." If I did, I would contact you
directly or, more appropriately, send a question to the Zelig mailing
list. I responded with comments on Zelig because you asked me a
question. The only reason that I cc'd the MatchIt mailing list is
because *you* cc'd the MatchIt mailing list when you wrote to me
directly. (For the record, I think that this was good because these
issues are relevant to the discussion occurring on that mailing list.)
My apologies if you did not want this conversation to include the
mailing list.
Actually, my last message does not seem to have gone through to the
MatchIt mailing list. Is someone filtering this?
> If you read Kurt's email carefully, it means that just the tests
> associated with installation are not run. The rest of the R CMD
> check tests *are* run.
>
> If you are unclear as to what those tests are, I suggest that you refer to
> the Writing Extensions for R document. Or just try running it yourself on
> the Zelig tarball.
I am afraid that you do not understand what "tests" are not run when
the --no-install flag is used. Note that this is difficult to see in
Zelig since there is no tests directory in the first place. A clearer
example is the portfolio package. I have attached the full results of
R CMD check at the end of this message. As you can see, the base run
includes all sorts of checks and tests, including:
-----------------------------------------
chosin:~/temp/junk [sac] $ R CMD check portfolio_0.2-1.tar.gz
WARNING: ignoring environment value of R_HOME
* checking for working latex ... OK
* using log directory '/home/kane/temp/junk/portfolio.Rcheck'
* using R version 2.2.1, 2005-12-20
* checking for file 'portfolio/DESCRIPTION' ... OK
* this is package 'portfolio' version '0.2-1'
* checking if this is a source package ... OK
* Installing *source* package 'portfolio' ...
** R
** data
** inst
** preparing package for lazy loading
Creating a new generic function for 'summary' in 'portfolio'
Creating a new generic function for 'plot' in 'portfolio'
Creating a new generic function for 'mean' in 'portfolio'
...
* creating portfolio-Ex.R ... OK
* checking examples ... OK
* checking tests ...
make[1]: Entering directory `/home/kane/temp/junk/portfolio.Rcheck/tests'
Running 'portfolioBasic.contribution.test.R'
Running 'portfolioBasic.Arith.test.R'
Running 'portfolio.Arith.test.R'
Running 'portfolio.mvShort.test.R'
Running 'portfolio.create.test.R'
Running 'df.category.mean.test.R'
Running 'portfolioBasic.performance.test.R'
Running 'portfolio.calcWeights.test.R'
Running 'portfolio.mvLong.test.R'
Running 'nearest.multiple.test.R'
Running 'portfolio.calcShares.test.R'
Running 'weight.test.R'
Running 'portfolioBasic.test.R'
Running 'classes.test.R'
make[1]: Leaving directory `/home/kane/temp/junk/portfolio.Rcheck/tests'
OK
* checking package vignettes in 'inst/doc' ... OK
...
chosin:~/temp/junk [sac] $
-----------------------------------------
This is what it means to have "tests" in an R package. You have code
in the tests directory which runs and either passes or fails. Now, R
CMD check also checks other things (syntax, directory structure, et
cetera). But these are separate from the tests.
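For anyone who has not seen one, here is a minimal sketch of what a file in
the tests directory might look like. The file name and the
normalize_weights() function are hypothetical and are not taken from the
portfolio package; in a real package the function would come from the
package itself rather than being defined in the test file.

```r
# Contents of a hypothetical file tests/weights.test.R.
# The function is defined inline only so that this sketch runs on its own;
# a real test file would load the package under test instead.
normalize_weights <- function(x) x / sum(abs(x))

w <- normalize_weights(c(2, -1, 1))

# stopifnot() raises an error unless every condition is TRUE.
# An error makes the script fail, and a failing script is a failed test.
stopifnot(
  isTRUE(all.equal(sum(abs(w)), 1)),
  isTRUE(all.equal(w, c(0.5, -0.25, 0.25)))
)
```

The point is only that the script either runs to completion or stops with
an error, so nobody has to read the output to know whether it passed.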
What happens when one uses the --no-install option?
-----------------------------------------
chosin:~/temp/junk [sac] $ R CMD check --no-install portfolio_0.2-1.tar.gz
WARNING: ignoring environment value of R_HOME
* checking for working latex ... OK
* using log directory '/home/kane/temp/junk/portfolio.Rcheck'
* using R version 2.2.1, 2005-12-20
* checking for file 'portfolio/DESCRIPTION' ... OK
* this is package 'portfolio' version '0.2-1'
* checking if this is a source package ... OK
* checking package directory ... OK
* checking for portable file names ... OK
* checking for sufficient/correct file permissions ... OK
* checking DESCRIPTION meta-information ... OK
* checking index information ... OK
* checking package subdirectories ... OK
* checking R files for syntax errors ... OK
* checking R files for library.dynam ... OK
* checking S3 generic/method consistency ... OK
* checking replacement functions ... OK
* checking foreign function calls ... OK
* checking Rd files ... OK
* checking for missing documentation entries ... OK
* checking for code/documentation mismatches ... OK
* checking Rd \usage sections ... OK
* checking package vignettes in 'inst/doc' ... OK
* checking DVI version of manual ... OK
chosin:~/temp/junk [sac] $
-----------------------------------------
No tests are run! That is the problem. Some of the checks that are
done in the base case are still done here (but others are not). Unless
and until Zelig (and MatchIt) are checked every night at CRAN the way
most packages are, there is no way for a user (or you!) to know if
some change in the on-going development work of R has affected them.
> We know it works under R 2.3.0 because it passes R CMD check on R 2.3.0 and
> none of the proposed changes to R 2.3.0 affect Zelig (Kurt would have told
> us if they had).
So Kurt knows about every single change in R *and* every single change
in Zelig? Not only that, but he knows about everything else in both R
and Zelig so he can figure out if there is a problem? Kurt is way
smarter than I am but he is not that smart.
The problem here is the meaning of "affect Zelig." I think you
mean that, if Zelig stopped passing R CMD check under 2.3.0, Kurt
would tell you. That makes sense. But, without test cases, you do not
know if a function in Zelig produces the same answer in 2.3 that it
did in 2.2. Packages with test cases can know this because they test
them explicitly.
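As an illustration, a test of this kind might look like the sketch below.
It uses lm() from base R on made-up data rather than any Zelig model, and
the reference values are ones that can be verified by hand (the data lie
exactly on the line y = 1 + 2x). In practice the reference values would be
recorded from a run that you trust.

```r
# Made-up data lying exactly on y = 1 + 2 * x.
x <- c(1, 2, 3, 4)
y <- c(3, 5, 7, 9)

fit <- lm(y ~ x)

# Reference answer recorded ahead of time (here, known by construction).
reference <- c("(Intercept)" = 1, x = 2)

# all.equal() compares within a numeric tolerance, so harmless
# floating-point noise does not fail the test but a real change does.
stopifnot(isTRUE(all.equal(coef(fit), reference, tolerance = 1e-8)))
```

If a new version of R changed the answer, the script would stop with an
error instead of silently printing a different number.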
Again, given your primary audience, I don't think that you care. Most other
academics do not even know what test cases are, much less why they
matter. They do not care if Zelig (or MatchIt) has a tests directory
or if it is fully checked on CRAN each night.
But the only way to know if changes in R "affect Zelig" is to have
tests which check this claim.
> We were notified prior to the release of R 2.2.0 that we
> needed to change some functions and we did. And actually, yes, I do run
> every single demo myself before we release each version of Zelig.
It depends on what you mean by "run every single demo." If you just
press a button and run it, then all you can be sure of is that the
code "runs"; you cannot be sure that it is correct. Or do you run the
demos, and, for each one, look at the numeric answer and then compare
it (visually?) to the answer that you know is correct? That would take
a lot of time and be error-prone as well.
By the way, I was going to provide an example to illustrate this
point, but I couldn't even get the first demo that I looked at in
Zelig to work!
------------------------------------
> demo(beta)
demo(beta)
---- ~~~~
Type <Return> to start :
> data(house)
> z.out <- zelig(dpct86 ~ dpct84 + dwin86 + incum86,
data = house, model = "beta")
Error in inherits(x, "data.frame") : object "house" not found
In addition: Warning message:
data set 'house' not found in: data(house)
>
------------------------------------
Anyway, this no doubt sounds way harsher than I mean it to sound. Let
me make it up to the MatchIt/Zelig team by taking everyone out to
lunch. My treat! John Harvard's?
Dave
Full R CMD check results:
-----------------------------------------
chosin:~/temp/junk [sac] $ R CMD check portfolio_0.2-1.tar.gz
WARNING: ignoring environment value of R_HOME
* checking for working latex ... OK
* using log directory '/home/kane/temp/junk/portfolio.Rcheck'
* using R version 2.2.1, 2005-12-20
* checking for file 'portfolio/DESCRIPTION' ... OK
* this is package 'portfolio' version '0.2-1'
* checking if this is a source package ... OK
* Installing *source* package 'portfolio' ...
** R
** data
** inst
** preparing package for lazy loading
Creating a new generic function for 'summary' in 'portfolio'
Creating a new generic function for 'plot' in 'portfolio'
Creating a new generic function for 'mean' in 'portfolio'
** help
>>> Building/Updating help pages for package 'portfolio'
Formats: text html latex example
assay text html latex example
contribution-class text html latex
contributionHistory-class text html latex
dow.jan.2005 text html latex example
exposure-class text html latex
exposureHistory-class text html latex
global.2004 text html latex example
global.2004.history text html latex example
objectHistory-class text html latex example
performance-class text html latex
performanceHistory-class text html latex
portfolio-class text html latex
portfolioBasic-class text html latex example
portfolioHistory-class text html latex example
weight text html latex example
** building package indices ...
* DONE (portfolio)
* checking package directory ... OK
* checking for portable file names ... OK
* checking for sufficient/correct file permissions ... OK
* checking DESCRIPTION meta-information ... OK
* checking package dependencies ... OK
* checking index information ... OK
* checking package subdirectories ... OK
* checking R files for syntax errors ... OK
* checking R files for library.dynam ... OK
* checking S3 generic/method consistency ... OK
* checking replacement functions ... OK
* checking foreign function calls ... OK
* checking Rd files ... OK
* checking for missing documentation entries ... OK
* checking for code/documentation mismatches ... OK
* checking Rd \usage sections ... OK
* creating portfolio-Ex.R ... OK
* checking examples ... OK
* checking tests ...
make[1]: Entering directory `/home/kane/temp/junk/portfolio.Rcheck/tests'
Running 'portfolioBasic.contribution.test.R'
Running 'portfolioBasic.Arith.test.R'
Running 'portfolio.Arith.test.R'
Running 'portfolio.mvShort.test.R'
Running 'portfolio.create.test.R'
Running 'df.category.mean.test.R'
Running 'portfolioBasic.performance.test.R'
Running 'portfolio.calcWeights.test.R'
Running 'portfolio.mvLong.test.R'
Running 'nearest.multiple.test.R'
Running 'portfolio.calcShares.test.R'
Running 'weight.test.R'
Running 'portfolioBasic.test.R'
Running 'classes.test.R'
make[1]: Leaving directory `/home/kane/temp/junk/portfolio.Rcheck/tests'
OK
* checking package vignettes in 'inst/doc' ... OK
* creating portfolio-manual.tex ... OK
* checking portfolio-manual.tex ... OK
chosin:~/temp/junk [sac] $
chosin:~/temp/junk [sac] $ R CMD check --no-install portfolio_0.2-1.tar.gz
WARNING: ignoring environment value of R_HOME
* checking for working latex ... OK
* using log directory '/home/kane/temp/junk/portfolio.Rcheck'
* using R version 2.2.1, 2005-12-20
* checking for file 'portfolio/DESCRIPTION' ... OK
* this is package 'portfolio' version '0.2-1'
* checking if this is a source package ... OK
* checking package directory ... OK
* checking for portable file names ... OK
* checking for sufficient/correct file permissions ... OK
* checking DESCRIPTION meta-information ... OK
* checking index information ... OK
* checking package subdirectories ... OK
* checking R files for syntax errors ... OK
* checking R files for library.dynam ... OK
* checking S3 generic/method consistency ... OK
* checking replacement functions ... OK
* checking foreign function calls ... OK
* checking Rd files ... OK
* checking for missing documentation entries ... OK
* checking for code/documentation mismatches ... OK
* checking Rd \usage sections ... OK
* checking package vignettes in 'inst/doc' ... OK
* checking DVI version of manual ... OK
chosin:~/temp/junk [sac] $
-
MatchIt mailing list served by Harvard-MIT Data Center
List Address: matchit(a)latte.harvard.edu
Subscribe/Unsubscribe: http://lists.hmdc.harvard.edu/?info=matchit
MatchIt Software and Documentation: http://gking.harvard.edu/matchit/
R-devel,
There has been some confusion on the MatchIt package mailing list on
the meaning of [--install=no] in the comment column of CRAN's
automated package check.
It's my understanding that, at the very least, a package marked like
this will not have its test cases run each night. Are there other
checks that are omitted?
How, if at all, are such install flags related to the parameters one
can pass R CMD check, such as --no-install, --no-tests, etc.?
Thanks,
Jeff
--
Jeff Enos
Kane Capital Management
jeff(a)kanecap.com
Olivia Lau writes:
> I'm sorry. Are you talking about MatchIt or Zelig with respect to (1)-(6)
> below? If you are talking about Zelig:
We are talking about MatchIt. Although we have not used Zelig nearly
as much, it seems to do a better job on several of these
issues. Perhaps the Zelig maintainer had a hard-assed GOV 1000
instructor who taught her good habits . . . ;-)
> (3) The --install=no flag does not refer to VGAM as I have numerous emails
> from the CRAN package maintainers verifying that they have had to download
> and install the proper versioning of VGAM to get Zelig to pass checks.
The --install=no flag means that, at least, tests are not run. (It may
also mean that demos and vignettes are not checked/run/whatever; the
details are unclear to me. But the key point is that any status which
prevents test cases from being run is unacceptable for professional
work.)
> (5) We have test cases in demo files. You can run them, if you wish, but
> R CMD check does not do it at present. There is some discussion on the R
> developers list to force R CMD check to run the demo files, but it hasn't
> been implemented yet, to my knowledge.
You may have a misunderstanding about what "test cases" mean in
R. Tests go in the tests directory. See Writing R Extensions. Zelig
does not have a tests directory, so it has no test cases. Now, there is
some demo code in Zelig. If the demo code does not run, you can
probably be sure that there is a problem. But, given the proliferation
of user.prompt() calls in the demos, I do not see how these could ever
be run in an automated fashion.
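One possible way around this, sketched below, is to prompt only when R is
running interactively. The pause() helper is hypothetical; it is not
Zelig's user.prompt(), whose implementation I have not looked at.
interactive() and readline() are both base R.

```r
# Hypothetical replacement for a prompt call inside a demo script.
pause <- function(msg = "Press <Return> to continue: ") {
  # Wait for the user only in an interactive session; under an
  # automated (batch) run this is skipped and the script keeps going.
  if (interactive()) readline(msg)
  invisible(NULL)
}

pause()
result <- mean(c(1, 2, 3))
pause()
print(result)
```

With something like this, the same demo could be stepped through by a
person or run unattended by a script.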
Assume for a second that you know that Zelig ran perfectly under R 2.2.0
because you checked by hand every function. How do you know that it
runs correctly under R 2.3.0? You don't. The fact that the demos may (or
may not) run tells you little of interest. Do you even re-run all the
demos yourself when a new version of R comes out? I hope not!
Every time that you check that a function in Zelig does what you want
it to do, you should add a test case to the tests directory. Test cases
are an endless bother, but they are a requirement in serious software
development.
We remain big fans of all these packages and of the efforts
that have gone into them.
Dave
> (6) I assume you are referring to MatchIt and not Zelig. 8)
>
> Best,
>
> Olivia Lau
>
> On Tue, 25 Apr 2006, David Kane wrote:
>
> > Gary King writes:
> > >
> > > yes, pls send a complete description of what you're talking about.
> > > We are planning to add versioning for packages used by Zelig, which
> > > this is related to.
> >
> > A complete description might take some time, but here is a good start.
> >
> > 1) Visible change-log in the package.
> >
> > 2) All documentation included in the package.
> >
> > 3) All required packages must be on CRAN. (Problem here *seems* to be
> > that MatchIt needs Zelig which suggests VGAM which still has not made
> > it to CRAN.)
> >
> > 4) Package must pass tests on CRAN. Right now, the CRAN maintainers
> > don't even bother to test the package because, I think, of 3).
> >
> > 5) Test cases.
> >
> > 6) A decent regard for R coding standards.
> >
> > 1) and 2) are obvious. See virtually every other R package. 3) is what
> > really forced us to fork right now, although we have complained about
> > 2) in the past. 4) is also a requirement. Why should anyone assume that
> > the package does what it claims to do if the CRAN maintainers won't
> > even install it?
> >
> > 5) will be the hardest for non-professionals to do or see the purpose
> > of. You can check out the test cases in R itself (as well as in
> > packages like ours) to get a sense of how important these are.
> >
> > How do you know that the latest version of MatchIt is producing
> > correct answers? How do you know that, for example, the upgrade to R
> > 2.3 does not introduce some bug? Do you rerun every analysis that you have run before?
> >
> > We can be fairly confident that the move to 2.3 does not break the
> > portfolio package because all our test cases produce the same
> > answer. The same applies to R itself. Can the users of MatchIt be
> > similarly confident?
> >
> > 6) is the hardest to be clear about. The change in which you just started
> > giving errors for dataframes that had NAs, even in columns not being
> > used in the analysis, was just beyond the pale. Any developer would
> > tell you the same. But how were you to know that ahead of time? Tough
> > to know. Tacit knowledge is slow to come by and hard to explain.
> >
> > Another example, see previous discussion on the list, was the
> > insistence that a package like MatchIt require/suggest a bunch of
> > other packages that it really doesn't need, mainly because you think
> > that everyone should have them installed. I should be able to require
> > MatchIt for my package and my users without entailing the installation
> > of every IQSS package in existence!
> >
> > Anytime we complain, you know that 6) is an issue! Just kidding!
> > Mostly . . .
> >
> > Anyway, those are the key issues. If they were taken care of, we would
> > be eager to have MatchIt be a required part of our package. We have
> > zero interest in becoming experts in writing matching software. We
> > just want our users to have a simple (for them) way of creating
> > matched portfolios, *and* to be sure that the results they get are
> > correct.
> >
> > Again, we love MatchIt and want it to be successful.
> >
> > Dave
> >
> > PS. It would also be nice to answer user questions (see mine on this
> > list from last week). But, strictly speaking, that isn't a
> > requirement. If the software is good/tested/professional, we will use
> > it even if no one answers our questions.
> >
> >
> > >
> > > thanks,
> > > Gary
> > >
> > > On Tue, 25 Apr 2006, David Kane wrote:
> > >
> > > > Kosuke Imai writes:
> > > > > Yes.
> > > >
> > > > Well, if you really want MatchIt to be used by other (serious) package
> > > > writers, then you will need to do much, much more. Let us know if you
> > > > want a more complete description, but test cases would be a good place
> > > > to start. Note that the CRAN maintainers refuse to even *install* MatchIt
> > > > for regular testing.
> > > >
> > > > http://cran.us.r-project.org/src/contrib/checkSummary.html
> > > >
> > > > This means that any other package, like ours, with MatchIt as a
> > > > Suggests will not be tested either. This is simply unacceptable and
> > > > means that we will need to fork out the code that we need for our
> > > > portfolio package. This is not a great option and creates more work
> > > > for us, but we have no choice.
> > > >
> > > > > We try to keep the changes minimum (as far as the syntax goes). All
> > > > > the changes (minor or major) are and will be noted in the documentation.
> > > > > See http://gking.harvard.edu/matchit/docs/What_s_New.html
> > > >
> > > > But this is not even distributed with the package! Again, any
> > > > professional software developer would regard your failure to
> > > > distribute this (along with your failure to distribute all the
> > > > documentation, as previously noted on this list) as proof enough of a
> > > > lack of concern for the needs of other developers.
> > > >
> > > > Of course, this all sounds harsher than I mean it to sound. I love
> > > > MatchIt. I think it is a great program and I believe that it is and
> > > > will be *very* successful for the purpose that you intend it for (use
> > > > by other academics in writing academic papers). Indeed, I plan on
> > > > using it myself for that very purpose!
> > > >
> > > > But, if you want MatchIt to be used by other package writers (like us)
> > > > in the development of their packages (like portfolio), then you will
> > > > need to handle things differently. You can't just make a change that
> > > > blows us (and our users) up without taking adequate care and giving
> > > > due warning.
> > > >
> > > > But, again, thanks for making MatchIt available and open-source. It is
> > > > a fine program and we have benefitted from your efforts.
> > > >
> > > > Dave
> > > >
> > > > > Kosuke
> > > > >
> > > > > On Fri, 21 Apr 2006, Jeff Enos wrote:
> > > > >
> > > > > > Thanks, Kosuke.
> > > > > >
> > > > > > Does the MatchIt team care whether its package is used in and relied
> > > > > > upon by other open source efforts? If not, no problem -- we're still
> > > > > > very happy to have the MatchIt package available.
> > > > > >
> > > > > > If so, I think that major changes in behavior should be kept to a
> > > > > > minimum in non-beta releases, and especially patch releases, and be
> > > > > > clearly documented in release notes. If a change like the one I
> > > > > > mentioned in this thread occurs, I think it's reasonable to issue a
> > > > > > fix patch to correct the change in behavior that isn't required for
> > > > > > correctness and will probably be reverted soon anyway.
> > > > > >
> > > > > > Jeff
> > > > > >
> > > > > > Kosuke Imai writes:
> > > > > > > Hi,
> > > > > > > These issues with missing data are on our to-do list. Hopefully, we can
> > > > > > > implement them sometime soon. For now, you would have to delete missing
> > > > > > > data by hand or even better incorporate missing data patterns as
> > > > > > > covariates.
> > > > > > > Best,
> > > > > > > Kosuke
> > > > > > >
> > > > > > > -----------------------------------------------------
> > > > > > > Kosuke Imai Office: Corwin Hall 041
> > > > > > > Assistant Professor Phone: 609-258-6601
> > > > > > > Department of Politics eFax: 973-556-1929
> > > > > > > Princeton University Email: kimai(a)Princeton.Edu
> > > > > > > Princeton, NJ 08544-1012 http://imai.princeton.edu
> > > > > > > -----------------------------------------------------
> > > > > > >
> > > > > > > On Wed, 19 Apr 2006, Jeff Enos wrote:
> > > > > > >
> > > > > > > > MatchIt list,
> > > > > > > >
> > > > > > > > Thanks very much for providing MatchIt as an open source package.
> > > > > > > >
> > > > > > > > I recently upgraded to MatchIt 2.2-7 from 2.2-3 and noticed that, in
> > > > > > > > the matchit function, a stop is now encountered if there are any NA
> > > > > > > > values in any column of the data frame supplied as the 'data'
> > > > > > > > parameter:
> > > > > > > >
> > > > > > > > if(sum(is.na(data))>0)
> > > > > > > > stop("Missing values exist in the data")
> > > > > > > >
> > > > > > > > Unless I'm missing something, isn't this overly restrictive? I think
> > > > > > > > it's standard R practice to only make requirements of columns used in
> > > > > > > > the function's calculation (as in the 'lm' function).
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > Jeff
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > >
> > > >
> >
> >
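For reference, the NA check quoted at the bottom of the thread above could
be restricted to the columns that the formula actually uses. The sketch
below is only an illustration: check_missing() is a hypothetical helper,
not MatchIt code, and it does not handle a formula that uses "." to mean
all columns.

```r
# Hypothetical helper: stop only if the variables named in the formula
# contain missing values, ignoring every other column of the data frame.
check_missing <- function(formula, data) {
  # all.vars() returns the variable names appearing in the formula.
  used <- intersect(all.vars(formula), names(data))
  if (any(is.na(data[, used, drop = FALSE]))) {
    stop("Missing values exist in the variables used by the formula")
  }
  invisible(TRUE)
}

d <- data.frame(
  treat = c(1, 0, 1, 0),
  age   = c(30, 41, 25, 38),
  notes = c(NA, "a", "b", NA)  # NAs in a column the model never touches
)

# Runs without error: 'notes' is not part of the formula.
check_missing(treat ~ age, d)
```

A call such as check_missing(treat ~ notes, d) would still stop with an
error, which is the behavior one wants.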
--
David Kane
Kane Capital Management
646-644-3626
-----Original Message-----
From: Olivia Lau <olau(a)fas.harvard.edu>
Date: Tuesday, Apr 25, 2006 4:55 pm
Subject: Re: [matchit] The matchit function and NA's
I'm sorry. Are you talking about MatchIt or Zelig with respect to (1)-(6) below? If you are talking about Zelig:
(1) A change log for Zelig is available as README in the .tar.gz bundle. If you need more detail (e.g., the CVS directory log), you can ask me personally for it, but most of it is not of interest to the general
public, and so I just do a short summary of what's changed in the README.
(2) All functions exported in the Zelig namespace are documented in the Zelig man directory. If it's not documented, you probably can't call it directly from another package anyway since it's not exported.
(3) The --install=no flag does not refer to VGAM as I have numerous emails from the CRAN package maintainers verifying that they have had to download and install the proper versioning of VGAM to get Zelig to pass checks.
(4) Zelig does pass all checks on CRAN or Kurt and co would not post it as passed.
(5) We have test cases in demo files. You can run them, if you wish, but R CMD check does not do it at present. There is some discussion on the R developers list to force R CMD check to run the demo files, but it hasn't been implemented yet, to my knowledge.
(6) I assume you are referring to MatchIt and not Zelig. 8)
Best,
Olivia Lau
On Tue, 25 Apr 2006, David Kane wrote:
> Gary King writes:
>
> yes, pls send a complete description of what you're talking about.
> We are planning to add versioning for packages used by Zelig, which
> this is related to.
> A complete description might take some time, but here is a good start.
> 1) Visible change-log in the package.
> 2) All documentation included in the package.
> 3) All required packages must be on CRAN. (Problem here *seems* to be
that MatchIt needs Zelig which suggests VGAM which still has not made
it to CRAN.)
> 4) Package must pass tests on CRAN. Right now, the CRAN maintainers
don't even bother to test the package because, I think, of 3).
> 5) Test cases.
> 6) A decent regard for R coding standards.
> 1) and 2) are obvious. See virtually every other R package. 3) is what
really forced us to fork right now, although we have complained about
2) in the past. 4) is also a requirment. Why should anyone assume that
the package does what it claims to do if the CRAN maintainers won't
even install it?
> 5) will be the hardest for non-professionals to do or see the purpose
of. You can check out the test cases in R itself (as well as in
packages like ours) to get a sense of how important these are.
> How do you know that the latest version of MatchIt is producing
correct answers? How do you know that, for example, the upgrade to R
2.3 does not introduce some bug? Do you rerun every analysis that you have run before?
> We can be fairly confident that the move to 2.3 does not break the
portfolio package because all our test cases produce the same
answer. The same applies to R itself. Can the users of MatchIt be
similarly confident?
> 6) is the hardest be clear about. The change in which you just started
giving errors for dataframes that had NAs, even in columns not being
used in the analysis, was just beyond the pale. Any developper would
tell you the same. But how were you to know that ahead of time? Tough
to know. Tacit knowledge is slow to come by and hard to explain.
> Another example, see previous discussion on the list, was the
insistence that a package like MatchIt require/suggest a bunch of
other packages that it really doesn't need, mainly because you think
that everyone should have them installed. I should be able to require
MatchIt for my package and my users without entailing the installation
of every IQSS package in existence!
> Anytime we complain, you know that 6) is an issue! Just kidding!
Mostly . . .
> Anyway, those are the key issues. If they were taken care of, we would
be eager to have MatchIt be a required part of our package. We have
zero interest in becoming experts in writing matching software. We
just want our users to have a simple (for them) way of creating
matched portfolios, *and* to be sure that the results they get are
correct.
> Again, we love MatchIt and want it to be successful.
> Dave
> PS. It would also be nice to answer user questions (see mine on this
list from last week). But, strictly speaking, that isn't a
requirement. If the software is good/tested/professional, we will use
it even if no one answers our questions.
>
>
> thanks,
> Gary
>
> On Tue, 25 Apr 2006, David Kane wrote:
>
> > Kosuke Imai writes:
> > > Yes.
> >
> > Well, if you really want MatchIt to be used by other (serious) package
> > writers, then you will need to do much, much more. Let us know if you
> > want a more complete description, but test cases would be a good place
> > to start. Note that the CRAN maintainers refuse to even *install* MatchIt
> > for regular testing.
> >
> > http://cran.us.r-project.org/src/contrib/checkSummary.html
> >
> > This means that any other package, like ours, with MatchIt as a
> > Suggests will not be tested either. This is simply unacceptable and
> > means that we will need to fork out the code that we need for our
> > portfolio package. This is not a great option and creates more work
> > for us, but we have no choice.
> >
> > > We try to keep the changes minimum (as far as the syntax goes). All
> > > the changes (minor or major) are and will be noted in the documentation.
> > > See http://gking.harvard.edu/matchit/docs/What_s_New.html
> >
> > But this is not even distributed with the package! Again, any
> > professional software developper would regard your failure to
> > distribute this (along with your failure to distribute all the
> > documentation, as previously noted on this list) as proof enough of a
> > lack of concern for the needs of other developpers.
> >
> > Of course, this all sounds harsher than I mean it to sound. I love
> > MatchIt. I think it is a great program and I believe that it is and
> > will be *very* successful for the purpose that you intend it for (use
> > by other academics in writing academic papers). Indeed, I plan on
> > using it myself for that every purpose!
> >
> > But, if you want MatchIt to be used by other package writers (like us)
> > in the development of their packages (like portfolio), then you will
> > need to handle things differently. You can't just make a change that
> > blows us (and our users) up without taking adequate care and giving
> > due warning.
> >
> > But, again, thanks for making MatchIt available and open-source. It is
> > a fine program and we have benefitted from your efforts.
> >
> > Dave
> >
> > > Kosuke
> > >
> > > On Fri, 21 Apr 2006, Jeff Enos wrote:
> > >
> > > > Thanks, Kosuke.
> > > >
> > > > Does the MatchIt team care whether its package is used in and relied
> > > > upon by other open source efforts? If not, no problem -- we're still
> > > > very happy to have the MatchIt package available.
> > > >
> > > > If so, I think that major changes in behavior should be kept to a
> > > > minimum in non-beta releases, and especially patch releases, and be
> > > > clearly documented in release notes. If a change like the one I
> > > > mentioned in this thread occurs, I think it's reasonable to issue a
> > > > fix patch to correct the change in behavior that isn't required for
> > > > correctness and will probably be reverted soon anyway.
> > > >
> > > > Jeff
> > > >
> > > > Kosuke Imai writes:
> > > > > Hi,
> > > > > These issues with missing data are on our to-do list. Hopefully, we can
> > > > > implement them sometime soon. For now, you would have to delete missing
> > > > > data by hand or even better incorporate missing data patterns as
> > > > > covariates.
> > > > > Best,
> > > > > Kosuke
> > > > >
> > > > > -----------------------------------------------------
> > > > > Kosuke Imai Office: Corwin Hall 041
> > > > > Assistant Professor Phone: 609-258-6601
> > > > > Department of Politics eFax: 973-556-1929
> > > > > Princeton University Email: kimai(a)Princeton.Edu
> > > > > Princeton, NJ 08544-1012 http://imai.princeton.edu
> > > > > -----------------------------------------------------
> > > > >
> > > > > On Wed, 19 Apr 2006, Jeff Enos wrote:
> > > > >
> > > > > > MatchIt list,
> > > > > >
> > > > > > Thanks very much for providing MatchIt as an open source package.
> > > > > >
> > > > > > I recently upgraded to MatchIt 2.2-7 from 2.2-3 and noticed that, in
> > > > > > the matchit function, a stop is now encountered if there are any NA
> > > > > > values in any column of the data frame supplied as the 'data'
> > > > > > parameter:
> > > > > >
> > > > > > if(sum(is.na(data))>0)
> > > > > > stop("Missing values exist in the data")
> > > > > >
> > > > > > Unless I'm missing something, isn't this overly restrictive? I think
> > > > > > it's standard R practice to only make requirements of columns used in
> > > > > > the function's calculation (as in the 'lm' function).
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jeff
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > -
> > > > > MatchIt mailing list served by Harvard-MIT Data Center
> > > > > List Address: matchit(a)latte.harvard.edu
> > > > > Subscribe/Unsubscribe: http://lists.hmdc.harvard.edu/?info=matchit
> > > > > MatchIt Software and Documentation: http://gking.harvard.edu/matchit/
> > > >
> > > >
> > >
> > >
> >
> >
>
-
MatchIt mailing list served by Harvard-MIT Data Center
List Address: matchit(a)latte.harvard.edu
Subscribe/Unsubscribe: http://lists.hmdc.harvard.edu/?info=matchit
MatchIt Software and Documentation: http://gking.harvard.edu/matchit/
MatchIt list,
Thanks very much for providing MatchIt as an open source package.
I recently upgraded to MatchIt 2.2-7 from 2.2-3 and noticed that, in
the matchit function, a stop is now encountered if there are any NA
values in any column of the data frame supplied as the 'data'
parameter:
  if(sum(is.na(data))>0)
    stop("Missing values exist in the data")
Unless I'm missing something, isn't this overly restrictive? I think
it's standard R practice to only make requirements of columns used in
the function's calculation (as in the 'lm' function).
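A minimal sketch of the narrower check described above, assuming the variables of interest can be read off the formula with all.vars(). The helper name na_in_model_vars and the toy data frame are hypothetical and are not part of MatchIt:

```r
# Count missing values only in the columns that the model actually uses.
# 'na_in_model_vars' is a hypothetical helper, not a MatchIt function.
na_in_model_vars <- function(formula, data, exact = NULL) {
  # Variables named in the formula, plus any exact-matching columns
  vars <- unique(c(all.vars(formula), exact))
  # Ignore names that are not columns of 'data' (for example ".")
  vars <- intersect(vars, names(data))
  sum(is.na(data[, vars, drop = FALSE]))
}

# Toy data: the only NAs sit in a column the model never touches
d <- data.frame(treated = c(1, 0, 1, 0),
                b       = c(0.2, -1.1, 0.7, 1.5),
                unused  = c(NA, 3, NA, 4))

whole_frame <- sum(is.na(d))                    # check over every column
model_only  <- na_in_model_vars(treated ~ b, d) # check over 'treated' and 'b'
print(c(whole_frame = whole_frame, model_only = model_only))
```

With a check like this, a stop() would fire only when a column that enters the calculation has missing values, which is closer to how 'lm' behaves.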
Thanks,
Jeff
--
Jeff Enos
Kane Capital Management
jeff(a)kanecap.com
I can now reproduce the problem that I reported yesterday on simpler
data. Consider:
> set.seed(1)
> x <- data.frame(treated = ifelse(rnorm(100) < -1, 1, 0), b = rnorm(100), c = ifelse(rnorm(100) < 0, "green", "blue"))
> summary(x)
    treated           b                c
 Min.   :0.00   Min.   :-1.9144   blue :50
 1st Qu.:0.00   1st Qu.:-0.6510   green:50
 Median :0.00   Median :-0.1772
 Mean   :0.11   Mean   :-0.0378
 3rd Qu.:0.00   3rd Qu.: 0.5009
 Max.   :1.00   Max.   : 2.3080
> mt <- matchit(treated ~ b, data = x, exact = "c")
> summary(mt)
Error in x * w : non-numeric argument to binary operator
In addition: Warning messages:
1: argument is not numeric or logical: returning NA in: mean.default(x1, na.rm = T)
2: argument is not numeric or logical: returning NA in: mean.default(x0, na.rm = T)
> traceback()
7: sum(x * w)
6: weighted.mean(X.t.m, ww1[ww1 > 0])
5: FUN(newX[, i], ...)
4: apply(XX, 2, qoi, tt = treat, ww = weights, standardize = standardize)
3: summary.matchit(mt)
2: summary(mt)
1: summary(mt)
>
Why does summary() fail?
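The traceback ends in weighted.mean() calling sum(x * w), which suggests that the non-numeric column 'c' is reaching a numeric computation. A minimal sketch of that failure outside MatchIt follows. It assumes the column arrives as character values, which is what apply() produces when a data frame with a non-numeric column is coerced to a matrix. The recoding at the end is only a possible workaround, and whether it avoids the error inside summary.matchit has not been verified:

```r
# Stand-in for the non-numeric column 'c' and a vector of weights
col <- c("green", "blue", "green")
w   <- c(1, 1, 1)

# weighted.mean() multiplies x by w, which is undefined for character data
msg <- tryCatch(
  {
    weighted.mean(col, w)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# Possible workaround (an assumption, not tested against MatchIt):
# recode the category as a 0/1 indicator before summarising it
green <- as.numeric(col == "green")
wm <- weighted.mean(green, w)
print(wm)
```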
Dave Kane
--
David Kane
Kane Capital Management
646-644-3626
Hi,
I am making much more use of MatchIt and enjoying it. I hope to have a
series of questions over the next few days. See below for my version
information. I only see these problems with my data, so I have made a
copy available for others to see.
> load(url("http://www.kanecap.com/R/portfolio/assay.RData"))
> summary(assay)
      id               symbol              name
 Length:4000        Length:4000        Length:4000
 Class :character   Class :character   Class :character
 Mode  :character   Mode  :character   Mode  :character

    country         currency         price
 USA    :2032   USD    :2034   Min.   :      0.28
 JPN    : 788   JPY    : 788   1st Qu.:     16.95
 GBR    : 282   EUR    : 459   Median :     34.21
 AUS    : 126   GBP    : 282   Mean   :   7123.29
 FRA    : 119   AUD    : 126   3rd Qu.:     81.08
 HKG    :  90   HKD    :  90   Max.   :2880000.00
 (Other): 563   (Other): 221

            sector        liquidity           assay
 Financials    :826   Min.   :-3.48e+00   Mode :logical
 Staples       :665   1st Qu.:-6.74e-01   FALSE:3967
 Cyclicals     :664   Median : 0.00e+00   TRUE :33
 Industrials   :660   Mean   :-3.97e-17
 Communications:330   3rd Qu.: 6.74e-01
 Technology    :283   Max.   : 3.48e+00
 (Other)       :572

   ret.0.3.m         ret.0.6.m
 Min.   :-0.8385   Min.   :-0.8293
 1st Qu.:-0.0652   1st Qu.:-0.1007
 Median : 0.0157   Median :-0.0164
 Mean   : 0.0144   Mean   :-0.0199
 3rd Qu.: 0.0965   3rd Qu.: 0.0663
 Max.   : 1.1318   Max.   : 0.9860
I apologize for the hackiness of giving the data frame the same name
(assay) as its key variable. My first question concerns an error
in summary().
> mt.1 <- matchit(assay ~ liquidity + sector, data = assay)
> mt.2 <- matchit(assay ~ liquidity, data = assay, exact = "sector")
> summary(mt.2)
Error in x * w : non-numeric argument to binary operator
In addition: Warning messages:
1: argument is not numeric or logical: returning NA in: mean.default(x1, na.rm = T)
2: argument is not numeric or logical: returning NA in: mean.default(x0, na.rm = T)
>
Note that a summary(mt.1) works fine and that the mt.2 object does
print.
> mt.2
Call:
matchit(formula = assay ~ liquidity, data = assay, exact = "sector")
Sample sizes:
          Control Treated
All          3967      33
Matched        33      33
Unmatched    3934       0
Discarded       0       0
>
Why does summary(mt.2) fail and what can I do about it?
Thanks,
Dave Kane
--------------------------
> packageDescription("MatchIt")
Package: MatchIt
Version: 2.2-5
Date: 2005-12-07
Title: MatchIt: Nonparametric Preprocessing for Parametric Casual Inference
Author: Daniel Ho <daniel.e.ho(a)gmail.com>, Kosuke Imai <kimai(a)Princeton.Edu>, Gary King <king(a)harvard.edu>,
Elizabeth Stuart <stuart(a)stat.harvard.edu>
Maintainer: Kosuke Imai <kimai(a)Princeton.Edu>
Depends: R (>= 2.2), MASS, Zelig
Suggests: optmatch, Matching, WhatIf
Description: MatchIt selects matched samples of the the original treated and control groups with similar
covariate distributions -- can be used to match exactly on covariates, to match on propensity
scores, or perform a variety of other matching procedures.
LazyLoad: yes
LazyData: yes
License: GPL version 2 or newer
URL: http://gking.harvard.edu/matchit
Packaged: Wed Dec 7 09:39:20 2005; kimai
Built: R 2.2.0; ; 2006-01-11 10:17:37; unix
-- File: /home/kane/rlib/MatchIt/DESCRIPTION
>
Hi, this is a test. Am I on the mailing list?
Hi,
I am having problems re-estimating the propensity score after discarding
the cases with no common support. I use the following command:
> m.out=matchit(DIV2~SEKSE + O0PSI + O0SES + O0FAD + K0HAGEN + GEN_INT,
+ data=imptra, model="logit", method="nearest", replace=TRUE,
+ discard="both", reestimate=TRUE)
which results in this error:
Error in var(distance[in.sample == 1]) : missing observations in cov/cor
There are no missing values in my data and the data format is ok, so I
wonder why the analysis fails.
Does anyone know what goes wrong here?
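The error message says that var() met a missing value among the distances of the retained units, so one place to look is the distance vector itself rather than the input data. A minimal diagnostic sketch on simulated data follows. The data frame, its column names, and the hand-written common-support rule are all assumptions made for illustration and are not taken from MatchIt's internals:

```r
set.seed(1)

# Simulated stand-in for 'imptra'; column names are hypothetical
d <- data.frame(treat = rbinom(200, 1, 0.3),
                x1    = rnorm(200),
                x2    = rnorm(200))

# Fit the propensity model by hand and take fitted values as the distance
fit <- glm(treat ~ x1 + x2, data = d, family = binomial(link = "logit"))
distance <- fitted(fit)

# Assumed common-support rule: keep units whose distance lies inside the
# range shared by the treated and control groups
lo <- max(min(distance[d$treat == 1]), min(distance[d$treat == 0]))
hi <- min(max(distance[d$treat == 1]), max(distance[d$treat == 0]))
in.sample <- as.numeric(distance >= lo & distance <= hi)

# Diagnostics: any NA here would break var() on the retained units
n_na_distance <- sum(is.na(distance))
n_na_insample <- sum(is.na(in.sample))
v <- var(distance[in.sample == 1])
print(c(na_distance = n_na_distance, na_in_sample = n_na_insample, variance = v))
```

Running the same checks on the real data, with the real formula, would show whether the missing values arise in the fitted distances or in the indicator of which units are kept.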
Regards,
Claire Aussems