Dear all,
I fixed the subscript out of bounds problem in matchit using a solution that I discovered 10 years ago and sent to the matchit mailing list.
Here is my post from 2007:
https://lists.gking.harvard.edu/pipermail/matchit/2007-September/000029.html
I have a dataset called Small with no missing values. Running matchit on it gives a subscript out of bounds error. When I create a new dataset, Small2, that is an exact copy of the old one, matchit works; see the code below. The two data frames have the same data types and the same row names. I do not know how they differ, but for some reason only the second one works.
I can’t share this data to replicate the problem, for data confidentiality reasons, but perhaps someone else has data that creates this error.
> sum(apply(apply(Small, 1, is.na), 1, sum))
[1] 0
> model1<-matchit(formula= treatment ~ male, exact=c("black"), mahvars=c("age1"), data=Small, method="nearest", distance = "logit", caliper=0.25, replace=T, calclosest=T, ratio=3, verbose=T)
Nearest neighbor matching...
Matching Treated: Error in mahvars[pool, , drop = F] : subscript out of bounds
> dim(Small)
[1] 7756 8
>
> nrow(Small)
[1] 7756
> Small2=Small[1:nrow(Small),]
>
> model1<-matchit(formula= treatment ~ male, exact=c("black"), mahvars=c("age1"), data=Small2, method="nearest", distance = "logit", caliper=0.25, replace=T, calclosest=T, ratio=3, verbose=T)
Nearest neighbor matching...
Matching Treated: 10%...20%...30%...40%...50%...60%...70%...80%...90%...100%...Done
> is.data.frame(Small)
[1] TRUE
> is.data.frame(Small2)
[1] TRUE
> is.matrix(Small)
[1] FALSE
> is.matrix(Small2)
[1] FALSE
> rownames(Small)[1:4]
[1] "1" "2" "3" "4"
> rownames(Small2)[1:4]
[1] "1" "2" "3" "4"
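The checks above only compare the class and the first four row names. A fuller comparison might help show how two apparently identical data frames differ. The sketch below uses a small made-up data frame (toy and toy2 are hypothetical stand-ins for Small and Small2), and it assumes, without confirmation, that row names or other attributes are where the difference lies.

```r
# Toy stand-in for the confidential data (all values are made up).
toy <- data.frame(treatment = c(1, 0, 0, 1, 0, 0),
                  age1      = c(15, 16, 15, 17, 16, 15))

# Dropping rows keeps the original row names, so they are no longer 1..n.
toy <- toy[toy$age1 < 17, ]

# Copy the data and reset the copy to automatic row names 1..n.
toy2 <- toy
rownames(toy2) <- NULL

# The column values are the same, but the objects are not identical.
same_values  <- isTRUE(all.equal(as.list(toy), as.list(toy2)))
same_objects <- identical(toy, toy2)

# Look at every row name and at the attribute names, not just the first few.
rownames(toy)
rownames(toy2)
names(attributes(toy))

# .row_names_info() returns a negative n for automatic row names.
auto_toy  <- .row_names_info(toy)  < 0
auto_toy2 <- .row_names_info(toy2) < 0

# A simpler missing-value count than nested apply() calls.
n_missing <- sum(is.na(toy))
```

If identical() is FALSE while the column values match, comparing attributes(Small) with attributes(Small2) would be the next thing I would look at.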
Janet
Janet Rosenbaum, Ph.D.
Assistant Professor of Epidemiology
School of Public Health, SUNY Downstate Medical Center, Brooklyn, NY
janet(a)post.harvard.edu
Dear all,
I’m another person who gets the following error with the Mahalanobis matching option for matchit:
Error in mahvars[pool, , drop = F] : subscript out of bounds
I have done Mahalanobis matching a thousand times before without problems. Now I’ve exported a slightly different version of the data, with a slightly different sample size and a slightly different set of variables. Running a matchit command with the first version of the data is fine. Running the exact same matchit command with the second version of the data gives the error; if I take out the mahvars option, the command runs fine. That is unfortunate, because Mahalanobis matching seems to work with the earlier version of the data.
An earlier poster said that the row names may be different, but in my case, I don’t have row names for either dataset. Both datasets are csv files exported from Stata with the same command, with only slight variations. Stata version is the same. I did upgrade R today from 3.4.0 to 3.4.1.
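One way to pin down what "slight variations" means is to compare the two exports variable by variable. This is only a sketch with made-up data: old_version and new_version are hypothetical names for the two data frames read from the CSV files, and the assumption that a column changed type between exports is mine, not something I have confirmed.

```r
# Hypothetical stand-ins for the two CSV exports (values are made up).
old_version <- data.frame(treatment = c(1, 0, 1),
                          male      = c(0, 1, 1),
                          age1      = c(15, 16, 17))
new_version <- data.frame(treatment = c(1, 0, 1),
                          male      = c(0, 1, 1),
                          age1      = c("15", "16", "."),  # read as text
                          stringsAsFactors = FALSE)

# Variables present in one version but not the other.
only_old <- setdiff(names(old_version), names(new_version))
only_new <- setdiff(names(new_version), names(old_version))

# For shared variables, compare the class of each column.
shared    <- intersect(names(old_version), names(new_version))
old_class <- vapply(old_version[shared], function(col) class(col)[1], character(1))
new_class <- vapply(new_version[shared], function(col) class(col)[1], character(1))

# Names of the variables whose class changed between exports.
changed <- shared[old_class != new_class]
changed
```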
I replicated the problem by exporting a small version of my data file with only 8 variables. With this data I can do nearest-neighbor matching without the mahvars option, but it fails if I specify the option. I can also do full matching. Any ideas how to proceed, or how I can figure out what might be causing the problem? I tried debugging and looked at the variable pool named in the error message, but I didn’t know where to go from there.
I recall that one student in my class of 17 last spring was the only one who had this error, and I wasn’t able to help them either; the student just used full matching for their term paper.
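For anyone trying to interpret the message itself: base R gives this error whenever a matrix is indexed by a row that does not exist. The sketch below reproduces the same error text on a toy matrix. I do not know the matchit internals, so mahvars_toy and the contents of pool here are made up, and the idea that pool holds row names missing from mahvars is only an assumption.

```r
# Toy matrix standing in for the internal mahvars matrix (values made up).
mahvars_toy <- matrix(c(15, 16, 17), ncol = 1,
                      dimnames = list(c("1", "2", "3"), "age1"))

# Hypothetical pool of candidate rows; "5" has no matching row name.
pool <- c("2", "5")

# Indexing by a row name that does not exist raises the same error text.
msg <- tryCatch(
  {
    mahvars_toy[pool, , drop = FALSE]
    "no error"
  },
  error = function(e) conditionMessage(e)
)
msg

# Which entries of the pool have no matching row in the matrix?
missing_rows <- setdiff(pool, rownames(mahvars_toy))
missing_rows
```

In an interactive session, setting options(error = recover) before rerunning the failing call makes it possible to step into the frame that raised the error and run the same setdiff() comparison on the real objects.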
> dim(Small)
[1] 7756 8
> names(Small)
[1] "treatment" "male" "age1" "black" "gonorrhea" "chlamydia" "tricho"
[8] "any_sti3"
> model1<-matchit(formula= treatment ~ male, exact=c("black"), mahvars=c("age1"), data=Small, method="nearest", distance = "logit", caliper=0.25, replace=T, calclosest=T, ratio=3, verbose=T)
Nearest neighbor matching...
Matching Treated: Error in mahvars[pool, , drop = F] : subscript out of bounds
> model0<-matchit(formula= treatment ~ male + age1, exact=c("black"), data=Small, method="nearest", distance = "logit", caliper=0.25, replace=T, calclosest=T, ratio=3, verbose=T)
Nearest neighbor matching...
Matching Treated: 10%...20%...30%...40%...50%...60%...70%...80%...90%...100%...Done
> model0
Call:
matchit(formula = treatment ~ male + age1, data = Small, method = "nearest",
distance = "logit", exact = c("black"), caliper = 0.25, replace = T,
calclosest = T, ratio = 3, verbose = T)
Sample sizes:
Control Treated
All 7375 381
Matched 1061 381
Unmatched 6314 0
Discarded 0 0
>
> model.full<-matchit(formula= treatment ~ male + priv_ah + black + age1, data=Small, method="full", verbose=T)
Error in eval(predvars, data, env) : object 'priv_ah' not found
> model.full
Call:
matchit(formula = suspended_lastyr2 ~ male + priv_ah + black +
age1, data = Small, method = "full", verbose = T)
Sample sizes:
Control Treated
All 7375 381
Matched 7375 381
Discarded 0 0
Thanks,
Janet