Matchit April 2006

matchit@lists.gking.harvard.edu

6 participants
10 discussions

Re: [matchit] The matchit function and NA's

by David Kane

Olivia Lau writes:

> In any case, I suggest that if you have further issues with Zelig that you
> contact me personally. This isn't an appropriate discussion for a public
> users' forum.

I do not have "issues with Zelig." If I did, I would contact you directly or, more appropriately, send a question to the Zelig mailing list. I responded with comments on Zelig because you asked me a question. The only reason that I cc'd the MatchIt mailing list is because *you* cc'd the MatchIt mailing list when you wrote to me directly. (For the record, I think that this was good because these issues are relevant to the discussion occurring on that mailing list.) My apologies if you did not want this conversation to include the mailing list.

Actually, my last message does not seem to have gone through to the MatchIt mailing list. Is someone filtering this?

> If you read Kurt's email carefully, it means that just the tests
> associated with installation are not run. The rest of the R CMD
> check tests *are* run.
>
> If you are unclear as to what those tests are, I suggest that you refer to
> the Writing Extensions for R document. Or just try running it yourself on
> the Zelig -tar ball.

I am afraid that you do not understand what "tests" are not run when the --no-install flag is used. Note that this is difficult to see in Zelig since there is no tests directory in the first place. A clearer example is the portfolio package. I have attached the full results of R CMD check at the end of this message. As you can see, the base run includes all sorts of checks and tests, including:

-----------------------------------------
chosin:~/temp/junk [sac] $ R CMD check portfolio_0.2-1.tar.gz
WARNING: ignoring environment value of R_HOME
* checking for working latex ... OK
* using log directory '/home/kane/temp/junk/portfolio.Rcheck'
* using R version 2.2.1, 2005-12-20
* checking for file 'portfolio/DESCRIPTION' ... OK
* this is package 'portfolio' version '0.2-1'
* checking if this is a source package ... OK
* Installing *source* package 'portfolio' ...
** R
** data
** inst
** preparing package for lazy loading
Creating a new generic function for 'summary' in 'portfolio'
Creating a new generic function for 'plot' in 'portfolio'
Creating a new generic function for 'mean' in 'portfolio'
...
* creating portfolio-Ex.R ... OK
* checking examples ... OK
* checking tests ...
make[1]: Entering directory `/home/kane/temp/junk/portfolio.Rcheck/tests'
  Running 'portfolioBasic.contribution.test.R'
  Running 'portfolioBasic.Arith.test.R'
  Running 'portfolio.Arith.test.R'
  Running 'portfolio.mvShort.test.R'
  Running 'portfolio.create.test.R'
  Running 'df.category.mean.test.R'
  Running 'portfolioBasic.performance.test.R'
  Running 'portfolio.calcWeights.test.R'
  Running 'portfolio.mvLong.test.R'
  Running 'nearest.multiple.test.R'
  Running 'portfolio.calcShares.test.R'
  Running 'weight.test.R'
  Running 'portfolioBasic.test.R'
  Running 'classes.test.R'
make[1]: Leaving directory `/home/kane/temp/junk/portfolio.Rcheck/tests'
 OK
* checking package vignettes in 'inst/doc' ... OK
...
chosin:~/temp/junk [sac] $
-----------------------------------------

This is what it means to have "tests" in an R package. You have code in the tests directory which runs and either passes or fails. Now, R CMD check also checks other things (syntax, directory structure, et cetera). But these are separate from the tests.

What happens when one uses the --no-install option?

-----------------------------------------
chosin:~/temp/junk [sac] $ R CMD check --no-install portfolio_0.2-1.tar.gz
WARNING: ignoring environment value of R_HOME
* checking for working latex ... OK
* using log directory '/home/kane/temp/junk/portfolio.Rcheck'
* using R version 2.2.1, 2005-12-20
* checking for file 'portfolio/DESCRIPTION' ... OK
* this is package 'portfolio' version '0.2-1'
* checking if this is a source package ... OK
* checking package directory ... OK
* checking for portable file names ... OK
* checking for sufficient/correct file permissions ... OK
* checking DESCRIPTION meta-information ... OK
* checking index information ... OK
* checking package subdirectories ... OK
* checking R files for syntax errors ... OK
* checking R files for library.dynam ... OK
* checking S3 generic/method consistency ... OK
* checking replacement functions ... OK
* checking foreign function calls ... OK
* checking Rd files ... OK
* checking for missing documentation entries ... OK
* checking for code/documentation mismatches ... OK
* checking Rd \usage sections ... OK
* checking package vignettes in 'inst/doc' ... OK
* checking DVI version of manual ... OK
chosin:~/temp/junk [sac] $
-----------------------------------------

No tests are run! That is the problem. Some of the checks that are done in the base case are still done here (but others are not). Unless and until Zelig (and MatchIt) are checked every night at CRAN the way most packages are, there is no way for a user (or you!) to know if some change in the on-going development work of R has affected them.

> We know it works under R 2.3.0 because it passes R CMD check on R 2.3.0 and
> none of the proposed changes to R 2.3.0 affect Zelig (Kurt would have told
> us if they had).

So Kurt knows about every single change in R *and* every single change in Zelig? Not only that, but he knows about everything else in both R and Zelig so he can figure out if there is a problem? Kurt is way smarter than I am but he is not that smart.

The problem here is the meaning of "affect Zelig." I think you mean that, if Zelig stopped passing R CMD check under 2.3.0, Kurt would tell you. That makes sense. But, without test cases, you do not know if a function in Zelig produces the same answer in 2.3 that it did in 2.2. Packages with test cases can know this because they test them explicitly.

Again, given your primary audience, I don't think that you care.
Most other academics do not even know what test cases are, much less why they matter. They do not care if Zelig (or MatchIt) has a tests directory or if it is fully checked on CRAN each night. But the only way to know if changes in R "affect Zelig" is to have tests which check this claim.

> We were notified prior to the release of R 2.2.0 that we
> needed to change some functions and we did. And actually, yes, I do run
> every single demo myself before we release each version of Zelig.

It depends on what you mean by "run every single demo." If you just press a button and run it, then all you can be sure of is that the code "runs"; you can't be sure that it is correct. Or do you run the demos, and, for each one, look at the numeric answer and then compare it (visually?) to the answer that you know is correct? That would take a lot of time and be error-prone as well.

By the way, I was going to provide an example to illustrate this point, but I couldn't even get the first demo that I looked at in Zelig to work!

------------------------------------
> demo(beta)

        demo(beta)
        ---- ~~~~

Type <Return> to start :

> data(house)

> z.out <- zelig(dpct86 ~ dpct84 + dwin86 + incum86, data = house, model = "beta")
Error in inherits(x, "data.frame") : object "house" not found
In addition: Warning message:
data set 'house' not found in: data(house)
>
------------------------------------

Anyway, this no doubt sounds way harsher than I mean it to sound. Let me make it up to the MatchIt/Zelig team by taking everyone out to lunch. My treat! John Harvard's?

Dave

Full R CMD check results:

-----------------------------------------
chosin:~/temp/junk [sac] $ R CMD check portfolio_0.2-1.tar.gz
WARNING: ignoring environment value of R_HOME
* checking for working latex ... OK
* using log directory '/home/kane/temp/junk/portfolio.Rcheck'
* using R version 2.2.1, 2005-12-20
* checking for file 'portfolio/DESCRIPTION' ... OK
* this is package 'portfolio' version '0.2-1'
* checking if this is a source package ... OK
* Installing *source* package 'portfolio' ...
** R
** data
** inst
** preparing package for lazy loading
Creating a new generic function for 'summary' in 'portfolio'
Creating a new generic function for 'plot' in 'portfolio'
Creating a new generic function for 'mean' in 'portfolio'
** help
 >>> Building/Updating help pages for package 'portfolio'
     Formats: text html latex example
  assay                        text html latex example
  contribution-class           text html latex
  contributionHistory-class    text html latex
  dow.jan.2005                 text html latex example
  exposure-class               text html latex
  exposureHistory-class        text html latex
  global.2004                  text html latex example
  global.2004.history          text html latex example
  objectHistory-class          text html latex example
  performance-class            text html latex
  performanceHistory-class     text html latex
  portfolio-class              text html latex
  portfolioBasic-class         text html latex example
  portfolioHistory-class       text html latex example
  weight                       text html latex example
** building package indices ...
* DONE (portfolio)
* checking package directory ... OK
* checking for portable file names ... OK
* checking for sufficient/correct file permissions ... OK
* checking DESCRIPTION meta-information ... OK
* checking package dependencies ... OK
* checking index information ... OK
* checking package subdirectories ... OK
* checking R files for syntax errors ... OK
* checking R files for library.dynam ... OK
* checking S3 generic/method consistency ... OK
* checking replacement functions ... OK
* checking foreign function calls ... OK
* checking Rd files ... OK
* checking for missing documentation entries ... OK
* checking for code/documentation mismatches ... OK
* checking Rd \usage sections ... OK
* creating portfolio-Ex.R ... OK
* checking examples ... OK
* checking tests ...
make[1]: Entering directory `/home/kane/temp/junk/portfolio.Rcheck/tests'
  Running 'portfolioBasic.contribution.test.R'
  Running 'portfolioBasic.Arith.test.R'
  Running 'portfolio.Arith.test.R'
  Running 'portfolio.mvShort.test.R'
  Running 'portfolio.create.test.R'
  Running 'df.category.mean.test.R'
  Running 'portfolioBasic.performance.test.R'
  Running 'portfolio.calcWeights.test.R'
  Running 'portfolio.mvLong.test.R'
  Running 'nearest.multiple.test.R'
  Running 'portfolio.calcShares.test.R'
  Running 'weight.test.R'
  Running 'portfolioBasic.test.R'
  Running 'classes.test.R'
make[1]: Leaving directory `/home/kane/temp/junk/portfolio.Rcheck/tests'
 OK
* checking package vignettes in 'inst/doc' ... OK
* creating portfolio-manual.tex ... OK
* checking portfolio-manual.tex ... OK
chosin:~/temp/junk [sac] $
chosin:~/temp/junk [sac] $ R CMD check --no-install portfolio_0.2-1.tar.gz
WARNING: ignoring environment value of R_HOME
* checking for working latex ... OK
* using log directory '/home/kane/temp/junk/portfolio.Rcheck'
* using R version 2.2.1, 2005-12-20
* checking for file 'portfolio/DESCRIPTION' ... OK
* this is package 'portfolio' version '0.2-1'
* checking if this is a source package ... OK
* checking package directory ... OK
* checking for portable file names ... OK
* checking for sufficient/correct file permissions ... OK
* checking DESCRIPTION meta-information ... OK
* checking index information ... OK
* checking package subdirectories ... OK
* checking R files for syntax errors ... OK
* checking R files for library.dynam ... OK
* checking S3 generic/method consistency ... OK
* checking replacement functions ... OK
* checking foreign function calls ... OK
* checking Rd files ... OK
* checking for missing documentation entries ... OK
* checking for code/documentation mismatches ... OK
* checking Rd \usage sections ... OK
* checking package vignettes in 'inst/doc' ... OK
* checking DVI version of manual ... OK
chosin:~/temp/junk [sac] $

-
MatchIt mailing list served by Harvard-MIT Data Center
List Address: matchit(a)latte.harvard.edu
Subscribe/Unsubscribe: http://lists.hmdc.harvard.edu/?info=matchit
MatchIt Software and Documentation: http://gking.harvard.edu/matchit/
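
For readers who have not seen one, a file in a package's tests directory can be as small as the sketch below. The file name and the function portfolio_weights() are made up for illustration and are not taken from the portfolio package. The only mechanism relied on is the one described above: R CMD check runs each .R file in tests/, and a file that stops with an error counts as a failed test.

```r
# tests/weights.test.R  (hypothetical file name)

# Hypothetical function standing in for package code under test:
# converts share counts and prices into portfolio weights.
portfolio_weights <- function(shares, price) {
  value <- shares * price
  value / sum(value)
}

result   <- portfolio_weights(shares = c(10, 20, 30), price = c(5, 5, 10))
expected <- c(50, 100, 300) / 450   # position values divided by their total

# stopifnot() raises an error unless every condition is TRUE, so a wrong
# answer makes the script fail instead of merely "running".
stopifnot(
  isTRUE(all.equal(result, expected)),
  isTRUE(all.equal(sum(result), 1))
)
```

The point of the design is that the comparison is written down once and then repeated by the machine on every check, rather than by a person reading output.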


Daily package check and --install=no

by Jeff Enos

R-devel,

There has been some confusion on the MatchIt package mailing list on the meaning of [--install=no] in the comment column of CRAN's automated package check. It's my understanding that, at the very least, a package marked like this will not have its test cases run each night. Are there other checks that are omitted? How, if at all, are such install flags related to the parameters one can pass R CMD check, such as --no-install, --no-tests, etc.?

Thanks,

Jeff

--
Jeff Enos
Kane Capital Management
jeff(a)kanecap.com


Re: [matchit] The matchit function and NA's

by David Kane

Olivia Lau writes:

> I'm sorry. Are you talking about MatchIt or Zelig with respect to (1)-(6)
> below? If you are talking about Zelig:

We are talking about MatchIt. Although we have not used Zelig nearly as much, it seems to do a better job on several of these issues. Perhaps the Zelig maintainer had a hard-assed GOV 1000 instructor who taught her good habits . . . ;-)

> (3) The --install=no flag does not refer to VGAM as I have numerous emails
> from the CRAN package maintainers verifying that they have had to download
> and install the proper versioning of VGAM to get Zelig to pass checks.

The --install=no flag means that, at least, tests are not run. (It may also mean that demos and vignettes are not checked/run/whatever; the details are unclear to me. But the key point is that any status which prevents test cases from being run is unacceptable for professional work.)

> (5) We have test cases in demo files. You can run them, if you wish, but
> R CMD check does not do it at present. There is some discussion on the R
> developers list to force R CMD check to run the demo files, but it hasn't
> been implemented yet, to my knowledge.

You may have a misunderstanding about what "test cases" mean in R. Tests go in the tests directory. See Writing R Extensions. Zelig does not have a tests directory, so it has no test cases.

Now, there is some demo code in Zelig. If the demo code does not run, you can probably be sure that there is a problem. But, given the proliferation of user.prompt() calls in the demos, I do not see how these could ever be run in an automated fashion.

Assume for a second that you know that Zelig ran perfectly under R 2.2.0 because you checked by hand every function. How do you know that it runs correctly under R 2.3.0? You don't. The fact that the demos may (or may not) run tells you little of interest. Do you even re-run all the demos yourself when a new version of R comes out? I hope not!
Every time that you check that a function in Zelig does what you want it to do, you should add a test case to the tests directory. Test cases are an endless bother, but they are a requirement in serious software development.

We remain big fans of all these packages and of the efforts that have gone into them.

Dave

> (6) I assume you are referring to MatchIt and not Zelig. 8)
>
> Best,
>
> Olivia Lau
>
> On Tue, 25 Apr 2006, David Kane wrote:
>
> > Gary King writes:
> > >
> > > yes, pls send a complete description of what you're talking about.
> > > We are planning to add versioning for packages used by Zelig, which
> > > this is related to.
> >
> > A complete description might take some time, but here is a good start.
> >
> > 1) Visible change-log in the package.
> >
> > 2) All documentation included in the package.
> >
> > 3) All required packages must be on CRAN. (Problem here *seems* to be
> > that MatchIt needs Zelig which suggests VGAM which still has not made
> > it to CRAN.)
> >
> > 4) Package must pass tests on CRAN. Right now, the CRAN maintainers
> > don't even bother to test the package because, I think, of 3).
> >
> > 5) Test cases.
> >
> > 6) A decent regard for R coding standards.
> >
> > 1) and 2) are obvious. See virtually every other R package. 3) is what
> > really forced us to fork right now, although we have complained about
> > 2) in the past. 4) is also a requirement. Why should anyone assume that
> > the package does what it claims to do if the CRAN maintainers won't
> > even install it?
> >
> > 5) will be the hardest for non-professionals to do or see the purpose
> > of. You can check out the test cases in R itself (as well as in
> > packages like ours) to get a sense of how important these are.
> >
> > How do you know that the latest version of MatchIt is producing
> > correct answers? How do you know that, for example, the upgrade to R
> > 2.3 does not introduce some bug? Do you rerun every analysis that you have run before?
> >
> > We can be fairly confident that the move to 2.3 does not break the
> > portfolio package because all our test cases produce the same
> > answer. The same applies to R itself. Can the users of MatchIt be
> > similarly confident?
> >
> > 6) is the hardest to be clear about. The change in which you just started
> > giving errors for dataframes that had NAs, even in columns not being
> > used in the analysis, was just beyond the pale. Any developer would
> > tell you the same. But how were you to know that ahead of time? Tough
> > to know. Tacit knowledge is slow to come by and hard to explain.
> >
> > Another example, see previous discussion on the list, was the
> > insistence that a package like MatchIt require/suggest a bunch of
> > other packages that it really doesn't need, mainly because you think
> > that everyone should have them installed. I should be able to require
> > MatchIt for my package and my users without entailing the installation
> > of every IQSS package in existence!
> >
> > Anytime we complain, you know that 6) is an issue! Just kidding!
> > Mostly . . .
> >
> > Anyway, those are the key issues. If they were taken care of, we would
> > be eager to have MatchIt be a required part of our package. We have
> > zero interest in becoming experts in writing matching software. We
> > just want our users to have a simple (for them) way of creating
> > matched portfolios, *and* to be sure that the results they get are
> > correct.
> >
> > Again, we love MatchIt and want it to be successful.
> >
> > Dave
> >
> > PS. It would also be nice to answer user questions (see mine on this
> > list from last week). But, strictly speaking, that isn't a
> > requirement. If the software is good/tested/professional, we will use
> > it even if no one answers our questions.
> >
> > >
> > > thanks,
> > > Gary
> > >
> > > On Tue, 25 Apr 2006, David Kane wrote:
> > >
> > > > Kosuke Imai writes:
> > > > > Yes.
> > > >
> > > > Well, if you really want MatchIt to be used by other (serious) package
> > > > writers, then you will need to do much, much more. Let us know if you
> > > > want a more complete description, but test cases would be a good place
> > > > to start. Note that the CRAN maintainers refuse to even *install* MatchIt
> > > > for regular testing.
> > > >
> > > > http://cran.us.r-project.org/src/contrib/checkSummary.html
> > > >
> > > > This means that any other package, like ours, with MatchIt as a
> > > > Suggests will not be tested either. This is simply unacceptable and
> > > > means that we will need to fork out the code that we need for our
> > > > portfolio package. This is not a great option and creates more work
> > > > for us, but we have no choice.
> > > >
> > > > > We try to keep the changes to a minimum (as far as the syntax goes). All
> > > > > the changes (minor or major) are and will be noted in the documentation.
> > > > > See http://gking.harvard.edu/matchit/docs/What_s_New.html
> > > >
> > > > But this is not even distributed with the package! Again, any
> > > > professional software developer would regard your failure to
> > > > distribute this (along with your failure to distribute all the
> > > > documentation, as previously noted on this list) as proof enough of a
> > > > lack of concern for the needs of other developers.
> > > >
> > > > Of course, this all sounds harsher than I mean it to sound. I love
> > > > MatchIt. I think it is a great program and I believe that it is and
> > > > will be *very* successful for the purpose that you intend it for (use
> > > > by other academics in writing academic papers). Indeed, I plan on
> > > > using it myself for that very purpose!
> > > >
> > > > But, if you want MatchIt to be used by other package writers (like us)
> > > > in the development of their packages (like portfolio), then you will
> > > > need to handle things differently.
> > > > You can't just make a change that
> > > > blows us (and our users) up without taking adequate care and giving
> > > > due warning.
> > > >
> > > > But, again, thanks for making MatchIt available and open-source. It is
> > > > a fine program and we have benefitted from your efforts.
> > > >
> > > > Dave
> > > >
> > > > > Kosuke
> > > > >
> > > > > On Fri, 21 Apr 2006, Jeff Enos wrote:
> > > > >
> > > > > > Thanks, Kosuke.
> > > > > >
> > > > > > Does the MatchIt team care whether its package is used in and relied
> > > > > > upon by other open source efforts? If not, no problem -- we're still
> > > > > > very happy to have the MatchIt package available.
> > > > > >
> > > > > > If so, I think that major changes in behavior should be kept to a
> > > > > > minimum in non-beta releases, and especially patch releases, and be
> > > > > > clearly documented in release notes. If a change like the one I
> > > > > > mentioned in this thread occurs, I think it's reasonable to issue a
> > > > > > fix patch to correct the change in behavior that isn't required for
> > > > > > correctness and will probably be reverted soon anyway.
> > > > > >
> > > > > > Jeff
> > > > > >
> > > > > > Kosuke Imai writes:
> > > > > > > Hi,
> > > > > > > These issues with missing data are on our to-do list. Hopefully, we can
> > > > > > > implement them sometime soon. For now, you would have to delete missing
> > > > > > > data by hand or even better incorporate missing data patterns as
> > > > > > > covariates.
> > > > > > > Best,
> > > > > > > Kosuke
> > > > > > >
> > > > > > > -----------------------------------------------------
> > > > > > > Kosuke Imai               Office: Corwin Hall 041
> > > > > > > Assistant Professor       Phone: 609-258-6601
> > > > > > > Department of Politics    eFax: 973-556-1929
> > > > > > > Princeton University      Email: kimai(a)Princeton.Edu
> > > > > > > Princeton, NJ 08544-1012  http://imai.princeton.edu
> > > > > > > -----------------------------------------------------
> > > > > > >
> > > > > > > On Wed, 19 Apr 2006, Jeff Enos wrote:
> > > > > > >
> > > > > > > > MatchIt list,
> > > > > > > >
> > > > > > > > Thanks very much for providing MatchIt as an open source package.
> > > > > > > >
> > > > > > > > I recently upgraded to MatchIt 2.2-7 from 2.2-3 and noticed that, in
> > > > > > > > the matchit function, a stop is now encountered if there are any NA
> > > > > > > > values in any column of the data frame supplied as the 'data'
> > > > > > > > parameter:
> > > > > > > >
> > > > > > > >   if(sum(is.na(data))>0)
> > > > > > > >     stop("Missing values exist in the data")
> > > > > > > >
> > > > > > > > Unless I'm missing something, isn't this overly restrictive? I think
> > > > > > > > it's standard R practice to only make requirements of columns used in
> > > > > > > > the function's calculation (as in the 'lm' function).
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > Jeff

--
David Kane
Kane Capital Management
646-644-3626
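
The check Jeff quotes at the bottom of this thread could be narrowed to the variables that actually appear in the model formula. The following is only a sketch of that idea and is not MatchIt's code: check_missing() is a hypothetical helper, the column names are invented, and it assumes the formula names its variables explicitly (a formula that uses `.` would need extra handling).

```r
# Hypothetical helper: stop only when a variable named in the formula has NAs.
check_missing <- function(formula, data) {
  # all.vars() lists the variable names used in the formula; keep the ones
  # that are columns of the data frame.
  used <- intersect(all.vars(formula), names(data))
  if (any(is.na(data[, used, drop = FALSE])))
    stop("Missing values exist in variables used by the formula")
  invisible(TRUE)
}

# Made-up data: the only NA sits in a column the formula never mentions.
d <- data.frame(treat  = c(1, 0, 1),
                x      = c(2.1, 3.4, 1.7),
                unused = c(NA, 5, 6))

check_missing(treat ~ x, d)        # no error: 'unused' is ignored

# A formula that does use the incomplete column is still rejected.
res <- try(check_missing(treat ~ unused, d), silent = TRUE)
stopifnot(inherits(res, "try-error"))
```

This is closer to what lm() does, which builds its model frame from the formula's variables and applies its NA handling to those columns only.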


FW: Re: [matchit] The matchit function and NA's

by Gary King

-----Original Message-----
From: Olivia Lau <olau(a)fas.harvard.edu>
Date: Tuesday, Apr 25, 2006 4:55 pm
Subject: Re: [matchit] The matchit function and NA's

I'm sorry. Are you talking about MatchIt or Zelig with respect to (1)-(6) below? If you are talking about Zelig:

(1) A change log for Zelig is available as README in the .tar.gz bundle. If you need more detail (e.g., the CVS directory log), you can ask me personally for it, but most of it is not of interest to the general public, and so I just do a short summary of what's changed in the README.

(2) All functions exported in the Zelig namespace are documented in the Zelig man directory. If it's not documented, you probably can't call it directly from another package anyway since it's not exported.

(3) The --install=no flag does not refer to VGAM as I have numerous emails from the CRAN package maintainers verifying that they have had to download and install the proper versioning of VGAM to get Zelig to pass checks.

(4) Zelig does pass all checks on CRAN or Kurt and co would not post it as passed.

(5) We have test cases in demo files. You can run them, if you wish, but R CMD check does not do it at present. There is some discussion on the R developers list to force R CMD check to run the demo files, but it hasn't been implemented yet, to my knowledge.

(6) I assume you are referring to MatchIt and not Zelig. 8)

Best,

Olivia Lau

On Tue, 25 Apr 2006, David Kane wrote:

> Gary King writes:
> >
> > yes, pls send a complete description of what you're talking about.
> > We are planning to add versioning for packages used by Zelig, which
> > this is related to.
>
> A complete description might take some time, but here is a good start.
>
> 1) Visible change-log in the package.
>
> 2) All documentation included in the package.
>
> 3) All required packages must be on CRAN. (Problem here *seems* to be
> that MatchIt needs Zelig which suggests VGAM which still has not made
> it to CRAN.)
>
> 4) Package must pass tests on CRAN. Right now, the CRAN maintainers
> don't even bother to test the package because, I think, of 3).
>
> 5) Test cases.
>
> 6) A decent regard for R coding standards.
>
> 1) and 2) are obvious. See virtually every other R package. 3) is what
> really forced us to fork right now, although we have complained about
> 2) in the past. 4) is also a requirement. Why should anyone assume that
> the package does what it claims to do if the CRAN maintainers won't
> even install it?
>
> 5) will be the hardest for non-professionals to do or see the purpose
> of. You can check out the test cases in R itself (as well as in
> packages like ours) to get a sense of how important these are.
>
> How do you know that the latest version of MatchIt is producing
> correct answers? How do you know that, for example, the upgrade to R
> 2.3 does not introduce some bug? Do you rerun every analysis that you
> have run before?
>
> We can be fairly confident that the move to 2.3 does not break the
> portfolio package because all our test cases produce the same
> answer. The same applies to R itself. Can the users of MatchIt be
> similarly confident?
>
> 6) is the hardest to be clear about. The change in which you just started
> giving errors for dataframes that had NAs, even in columns not being
> used in the analysis, was just beyond the pale. Any developer would
> tell you the same. But how were you to know that ahead of time? Tough
> to know. Tacit knowledge is slow to come by and hard to explain.
>
> Another example, see previous discussion on the list, was the
> insistence that a package like MatchIt require/suggest a bunch of
> other packages that it really doesn't need, mainly because you think
> that everyone should have them installed. I should be able to require
> MatchIt for my package and my users without entailing the installation
> of every IQSS package in existence!
>
> Anytime we complain, you know that 6) is an issue! Just kidding!
> Mostly . . .
>
> Anyway, those are the key issues. If they were taken care of, we would
> be eager to have MatchIt be a required part of our package. We have
> zero interest in becoming experts in writing matching software. We
> just want our users to have a simple (for them) way of creating
> matched portfolios, *and* to be sure that the results they get are
> correct.
>
> Again, we love MatchIt and want it to be successful.
>
> Dave
>
> PS. It would also be nice to answer user questions (see mine on this
> list from last week). But, strictly speaking, that isn't a
> requirement. If the software is good/tested/professional, we will use
> it even if no one answers our questions.

18 years

The matchit function and NA's

by Jeff Enos

MatchIt list,

Thanks very much for providing MatchIt as an open source package.

I recently upgraded to MatchIt 2.2-7 from 2.2-3 and noticed that, in the matchit function, a stop is now encountered if there are any NA values in any column of the data frame supplied as the 'data' parameter:

    if(sum(is.na(data))>0)
      stop("Missing values exist in the data")

Unless I'm missing something, isn't this overly restrictive? I think it's standard R practice to only make requirements of columns used in the function's calculation (as in the 'lm' function).

Thanks,

Jeff

--
Jeff Enos
Kane Capital Management
jeff(a)kanecap.com
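For readers following the thread, here is a minimal sketch of the narrower check Jeff describes: complain only when a variable named in the formula contains NA. The helper name `check_model_na` and the toy data are hypothetical and not part of MatchIt; the sketch also assumes the formula names its variables explicitly (no `.` shorthand).

```r
# Hypothetical helper (not part of MatchIt): stop only when a variable
# named in the formula contains NA, ignoring unrelated columns.
check_model_na <- function(formula, data) {
  vars <- all.vars(formula)                      # variables the model uses
  unknown <- setdiff(vars, names(data))
  if (length(unknown) > 0)
    stop("Variables not found in data: ", paste(unknown, collapse = ", "))
  has_na <- vapply(data[vars], anyNA, logical(1))  # one flag per model variable
  if (any(has_na))
    stop("Missing values exist in: ", paste(vars[has_na], collapse = ", "))
  invisible(TRUE)
}

# Toy data: 'notes' contains an NA but is not used in the formula.
d <- data.frame(treated = c(1, 0, 1, 0),
                b       = c(0.2, -1.1, 0.5, 1.3),
                notes   = c("a", NA, "c", "d"))

check_model_na(treated ~ b, d)   # passes: the NA sits in an unused column
sum(is.na(d)) > 0                # the whole-frame check from the post would stop here
```

This mirrors what 'lm' does in spirit (it only looks at the model variables), although 'lm' drops incomplete rows by default rather than stopping.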

18 years

simpler example of problem

by David Kane

I can now reproduce the problem that I reported yesterday on simpler data. Consider:

> set.seed(1)
> x <- data.frame(treated = ifelse(rnorm(100) < -1, 1, 0), b = rnorm(100), c = ifelse(rnorm(100) < 0, "green", "blue"))
> summary(x)
    treated           b               c
 Min.   :0.00   Min.   :-1.9144   blue :50
 1st Qu.:0.00   1st Qu.:-0.6510   green:50
 Median :0.00   Median :-0.1772
 Mean   :0.11   Mean   :-0.0378
 3rd Qu.:0.00   3rd Qu.: 0.5009
 Max.   :1.00   Max.   : 2.3080
> mt <- matchit(treated ~ b, data = x, exact = "c")
> summary(mt)
Error in x * w : non-numeric argument to binary operator
In addition: Warning messages:
1: argument is not numeric or logical: returning NA in: mean.default(x1, na.rm = T)
2: argument is not numeric or logical: returning NA in: mean.default(x0, na.rm = T)
> traceback()
7: sum(x * w)
6: weighted.mean(X.t.m, ww1[ww1 > 0])
5: FUN(newX[, i], ...)
4: apply(XX, 2, qoi, tt = treat, ww = weights, standardize = standardize)
3: summary.matchit(mt)
2: summary(mt)
1: summary(mt)
>

Why does summary() fail?

Dave Kane

--
David Kane
Kane Capital Management
646-644-3626
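The traceback points at `apply(XX, 2, ...)` followed by `sum(x * w)`. A possible explanation, offered as an assumption since the thread does not confirm it, is that `apply` converts a data frame with a non-numeric column into a character matrix, so every column reaches the arithmetic as text. The sketch below uses only base R and the data from the post; it does not call MatchIt.

```r
# Same toy data as in the post above.
set.seed(1)
x <- data.frame(treated = ifelse(rnorm(100) < -1, 1, 0),
                b = rnorm(100),
                c = ifelse(rnorm(100) < 0, "green", "blue"))

# apply() calls as.matrix(); one non-numeric column makes the whole
# matrix character, including the numeric columns.
m <- as.matrix(x)
is.character(m)

# Arithmetic on a character column then fails with the error seen in
# the traceback.
err <- tryCatch(apply(x, 2, function(col) sum(col * 1)),
                error = function(e) conditionMessage(e))
err

# Restricting the computation to numeric columns avoids the coercion.
num_cols <- vapply(x, is.numeric, logical(1))
apply(x[num_cols], 2, function(col) sum(col * 1))
```

If this reading is right, the failure comes from the "exact" variable being carried into the balance summary, not from the matching itself, which would fit the observation that the matched object prints without trouble.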

18 years

errors with summary

by David Kane

Hi,

I am making much more use of MatchIt and enjoying it. I hope to have a series of questions over the next few days. See below for my version information. I only see these problems with my data, so I have made a copy available for others to see.

> load(url("http://www.kanecap.com/R/portfolio/assay.RData"))
> summary(assay)
      id               symbol              name
 Length:4000        Length:4000        Length:4000
 Class :character   Class :character   Class :character
 Mode  :character   Mode  :character   Mode  :character

    country        currency         price
 USA    :2032   USD    :2034   Min.   :      0.28
 JPN    : 788   JPY    : 788   1st Qu.:     16.95
 GBR    : 282   EUR    : 459   Median :     34.21
 AUS    : 126   GBP    : 282   Mean   :   7123.29
 FRA    : 119   AUD    : 126   3rd Qu.:     81.08
 HKG    :  90   HKD    :  90   Max.   :2880000.00
 (Other): 563   (Other): 221

            sector      liquidity           assay
 Financials    :826   Min.   :-3.48e+00   Mode :logical
 Staples       :665   1st Qu.:-6.74e-01   FALSE:3967
 Cyclicals     :664   Median : 0.00e+00   TRUE :33
 Industrials   :660   Mean   :-3.97e-17
 Communications:330   3rd Qu.: 6.74e-01
 Technology    :283   Max.   : 3.48e+00
 (Other)       :572

   ret.0.3.m          ret.0.6.m
 Min.   :-0.8385   Min.   :-0.8293
 1st Qu.:-0.0652   1st Qu.:-0.1007
 Median : 0.0157   Median :-0.0164
 Mean   : 0.0144   Mean   :-0.0199
 3rd Qu.: 0.0965   3rd Qu.: 0.0663
 Max.   : 1.1318   Max.   : 0.9860

I apologize for the hackiness of the dataframe with the same name (assay) as the key variable in it. My first question concerns an error in summary().

> mt.1 <- matchit(assay ~ liquidity + sector, data = assay)
> mt.2 <- matchit(assay ~ liquidity, data = assay, exact = "sector")
> summary(mt.2)
Error in x * w : non-numeric argument to binary operator
In addition: Warning messages:
1: argument is not numeric or logical: returning NA in: mean.default(x1, na.rm = T)
2: argument is not numeric or logical: returning NA in: mean.default(x0, na.rm = T)
>

Note that a summary(mt.1) works fine and that the mt.2 object does print.

> mt.2

Call:
matchit(formula = assay ~ liquidity, data = assay, exact = "sector")

Sample sizes:
          Control Treated
All          3967      33
Matched        33      33
Unmatched    3934       0
Discarded       0       0

>

Why does summary(mt.2) fail and what can I do about it?

Thanks,

Dave Kane

--------------------------
> packageDescription("MatchIt")
Package: MatchIt
Version: 2.2-5
Date: 2005-12-07
Title: MatchIt: Nonparametric Preprocessing for Parametric Casual Inference
Author: Daniel Ho <daniel.e.ho(a)gmail.com>, Kosuke Imai <kimai(a)Princeton.Edu>, Gary King <king(a)harvard.edu>, Elizabeth Stuart <stuart(a)stat.harvard.edu>
Maintainer: Kosuke Imai <kimai(a)Princeton.Edu>
Depends: R (>= 2.2), MASS, Zelig
Suggests: optmatch, Matching, WhatIf
Description: MatchIt selects matched samples of the the original treated and control groups with similar covariate distributions -- can be used to match exactly on covariates, to match on propensity scores, or perform a variety of other matching procedures.
LazyLoad: yes
LazyData: yes
License: GPL version 2 or newer
URL: http://gking.harvard.edu/matchit
Packaged: Wed Dec 7 09:39:20 2005; kimai
Built: R 2.2.0; ; 2006-01-11 10:17:37; unix

-- File: /home/kane/rlib/MatchIt/DESCRIPTION
>

18 years

test

by Chunling Lu

Hi this is the test. Am I on the mailing list?

18 years

reestimating the propensity score

by M.E.Aussems

Hi,

I have problems with reestimating the propensity score after discarding those cases with no common support. I use the following command:

> m.out=matchit(DIV2~SEKSE + O0PSI + O0SES + O0FAD + K0HAGEN + GEN_INT,
+ data=imptra, model="logit", method="nearest", replace=TRUE, discard="both", reestimate=TRUE)

which results in this error:

Error in var(distance[in.sample == 1]) : missing observations in cov/cor

There are no missing values in my data and the data format is ok, so I wonder why the analysis fails. Does anyone know what goes wrong here?

Regards,
Claire Aussems
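The error text comes from `var()` itself, so the missing values may be in the estimated distance (propensity score) vector even when the input data has none. The sketch below only illustrates that mechanism in base R. The `distance` and `in.sample` values are made up for illustration, and `use = "all.obs"` is used here to make current versions of R raise the same message; the thread does not say why the re-estimated scores would contain NA.

```r
# Made-up values standing in for matchit's internal vectors.
distance  <- c(0.12, 0.45, NA, 0.80)   # one propensity score is missing
in.sample <- c(1, 1, 1, 0)             # 1 = unit kept after discarding

# With use = "all.obs", var() refuses to run when any value is NA.
msg <- tryCatch(var(distance[in.sample == 1], use = "all.obs"),
                error = function(e) conditionMessage(e))
msg

# Locate the offending units, then compute the variance without them.
which(is.na(distance))
var(distance[in.sample == 1], na.rm = TRUE)
```

A first diagnostic along these lines would be to inspect the distance values of the fitted object for NA before re-estimation, rather than the covariates.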

18 years

Re: [zelig] Cbind errors

by Kosuke Imai

Dear Ani,

I just released MatchIt version 2.2-6 which includes this bug fix. You can install this latest version via

install.packages("MatchIt", repos="http://gking.harvard.edu")

Or you can wait for a few days until it appears on CRAN.

Cheers,
Kosuke

-----------------------------------------------------
Kosuke Imai               Office: Corwin Hall 041
Assistant Professor       Phone: 609-258-6601
Department of Politics    eFax: 973-556-1929
Princeton University      Email: kimai(a)Princeton.Edu
Princeton, NJ 08544-1012  http://imai.princeton.edu
-----------------------------------------------------

On Fri, 31 Mar 2006, Anirudh V. S. Ruhil wrote:

> Dear Kosuke;
>
> I reinstalled matchIt 2.2-5 and the error persists. The relevant code and
> associated output (truncated to save bandwidth but show m.out2 exists)
> follow ...
>
> ------------------------------------
>
> > library(MatchIt)
>
> > m.out2 <- matchit(scale ~ population + urbanization
> + + pctownocc + mdyearstruc
> + + mdhomevalue + pcincome
> + + pctjuvenile + pctjuvespoor,
> + data = scale, method = "nearest")
> > summary(m.out2)
>
> Call:
> matchit(formula = scale ~ population + urbanization + pctownocc +
> mdyearstruc + mdhomevalue + pcincome + pctjuvenile + pctjuvespoor,
> data = scale, method = "nearest")
>
> {....}
> {....}
>
>
> Percent Balance Improvement:
>              Mean Diff. eQQ Med eQQ Mean eQQ Max
> distance          98.45   96.71    94.96   89.44
> population        48.63   24.30    62.40   86.64
> urbanization      95.06   57.10    94.97   96.52
> pctownocc         84.82   81.91    82.74   65.41
> mdyearstruc       86.78   83.33    76.47   73.08
> mdhomevalue       92.66   87.10    87.44   93.22
> pcincome          96.39   87.71    91.88   92.71
> pctjuvenile       85.76   80.44    84.25   86.66
> pctjuvespoor      86.45   80.75    81.74   67.84
>
> Sample sizes:
>           Control Treated
> All           160      13
> Matched        13      13
> Unmatched     147       0
> Discarded       0       0
>
>
> > m.out2$match.matrix
>     1
> 1   "5"
> 31  "59"
> 35  "44"
> 53  "34"
> 70  "64"
> 81  "62"
> 87  "146"
> 100 "102"
> 137 "127"
> 148 "154"
> 158 "30"
> 171 "84"
> 173 "114"
>
> > matches <- match.data(m.out2)
> Error in cbind(data, object$distance) : cannot coerce type closure to list
> vector
>
> ----------------------------------
>
> best
>
> Ani
>
> Anirudh V. S. Ruhil, Ph.D.
> Sr. Research Associate
> Voinovich Center for Leadership and Public Affairs
> Ohio University
> Building 21, The Ridges
> Athens, OH 45701-2979
> Tel: 740.597.1949 | Fax: 740.597.3057
>
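The thread does not say what the 2.2-6 fix changed. One plausible reading of the "closure" error, offered only as an assumption, is a name collision: the data frame is called `scale`, which is also a base R function, so code that looks up the `data` argument by name in the wrong environment would find the function instead. The sketch below shows that lookup behaviour in base R; `lookup_data` is a hypothetical helper and is not MatchIt's code.

```r
# Hypothetical helper: evaluate the 'data' argument stored in a call,
# starting the search from a given environment.
lookup_data <- function(call, env) eval(call$data, envir = env)

# The user's environment holds a data frame named 'scale'.
user_env <- new.env()
assign("scale", data.frame(a = 1:3), envir = user_env)

# An unevaluated call of the same shape as the one in the post.
cl <- quote(matchit(y ~ a, data = scale))

class(lookup_data(cl, user_env))    # finds the user's data frame
class(lookup_data(cl, baseenv()))   # finds base::scale, a function (closure)
```

Whatever the actual cause was, giving the data frame a name that does not shadow a base function (and differs from the treatment variable) sidesteps this class of problem.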

18 years
