Dear,
For my PhD project I have been using the MatchIt package as a preprocessing
step to reduce covariate imbalance between the treated and control groups.
My next step was to fit a linear model including the treatment and all the
covariates, using the weights from the matching. This works well, but some
of the covariates are extremely skewed. My supervisor suggested
log-transforming these skewed variables before matching. My question is:
what implications does this have for the matching, and if I do transform,
should I transform before or after matching?
I have spent a lot of time searching the internet for clues, but I cannot
find a definitive answer. If you could point me in the right direction, that
would be very helpful.
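To make the question concrete, here is a minimal sketch with made-up numbers (the values and variable names are hypothetical, not taken from the project data). It only illustrates one reason the order could matter when matching on a covariate distance: the log transform changes which control is closest to a treated unit. With propensity-score matching the effect is less direct, because the transform changes the propensity model rather than the distance itself.

```r
# Hypothetical values of a single skewed covariate (illustration only)
treated  <- 100
controls <- c(60, 150)

# Nearest control on the raw scale: absolute differences are 40 and 50
raw_dist    <- abs(controls - treated)
nearest_raw <- controls[which.min(raw_dist)]

# Nearest control on the log scale: the differences become log ratios
log_dist    <- abs(log(controls) - log(treated))
nearest_log <- controls[which.min(log_dist)]

nearest_raw  # 60
nearest_log  # 150
```

So, at least for distance-based matching, transforming before matching can produce a different matched sample than matching on the raw values and transforming afterwards.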
Best wishes,
Bowy
Hi,
Suppose my data looks something like this:
df <- data.frame(treatment = c(rep(0, 6), rep(1, 4)),
                 pretest   = c(7.1, 8.1, 4, 3, 2, 1, 20, 10, 7, 8))
In this case, I would argue that for the treated unit (treatment = 1) with
pretest = 7, a good match is the control unit (treatment = 0) with
pretest = 7.1, and similarly for pretest = 8 and pretest = 8.1.
Furthermore, I would say that the other observations do not really have
good matches.
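The intuition above can be checked with a few lines of base R. This is only a sketch: it measures, for each treated unit, the absolute `pretest` gap to its closest control, and `gap` is a name introduced here for illustration.

```r
df <- data.frame(treatment = c(rep(0, 6), rep(1, 4)),
                 pretest   = c(7.1, 8.1, 4, 3, 2, 1, 20, 10, 7, 8))

controls <- df$pretest[df$treatment == 0]
treated  <- df$pretest[df$treatment == 1]

# For each treated unit, the distance to the closest control on pretest
gap <- sapply(treated, function(x) min(abs(controls - x)))

data.frame(pretest = treated, gap = gap)
```

The treated units with pretest 7 and 8 each have a control within about 0.1, while those with pretest 10 and 20 are about 1.9 and 11.9 away from the nearest control.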
If I use `MatchIt` I end up with the following data frame:
library(MatchIt)
library(dplyr)
m.df <- matchit(formula = treatment ~ pretest,
                data = df, method = 'optimal',
                ratio = 1)
matched.df <- match.data(m.df)
matched.df %>% arrange(subclass)
  treatment pretest    distance weights subclass
1         0     7.1 0.395522749       1        1
2         1     8.0 0.620618814       1        1
3         0     8.1 0.644280156       1        2
4         1    20.0 0.999996979       1        2
5         0     3.0 0.009966059       1        3
6         1    10.0 0.926113553       1        3
7         0     4.0 0.027108952       1        4
8         1     7.0 0.371457236       1        4
Is there a way to drop the observations that have no good match and end up
with something like what I had in mind? Would this also work in a more
complex example with multiple matching variables?
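The outcome described above can be sketched by hand in base R as greedy 1:1 matching with a caliper, meaning a maximum allowed gap between matched units. This is not what `MatchIt` does internally. The tolerance of 0.5 and the names `caliper` and `pairs` are assumptions made for illustration. `matchit()` itself has a `caliper` argument intended for this kind of restriction; whether it can be combined with `method = 'optimal'` may depend on the package version, so the documentation is worth checking.

```r
df <- data.frame(treatment = c(rep(0, 6), rep(1, 4)),
                 pretest   = c(7.1, 8.1, 4, 3, 2, 1, 20, 10, 7, 8))

caliper <- 0.5  # hypothetical tolerance, in pretest units

treated_idx <- which(df$treatment == 1)
control_idx <- which(df$treatment == 0)
pairs <- data.frame(treated = numeric(0), control = numeric(0))

# Greedy pass: each treated unit takes its closest unused control,
# but only if that control lies within the caliper
for (i in treated_idx) {
  if (length(control_idx) == 0) break
  gaps <- abs(df$pretest[control_idx] - df$pretest[i])
  best <- which.min(gaps)
  if (gaps[best] <= caliper) {
    pairs <- rbind(pairs,
                   data.frame(treated = df$pretest[i],
                              control = df$pretest[control_idx[best]]))
    control_idx <- control_idx[-best]  # matching without replacement
  }
}

pairs
```

On this toy data the sketch keeps only the pairs 7 with 7.1 and 8 with 8.1, and leaves the treated units with pretest 10 and 20 unmatched. Greedy matching depends on the order in which treated units are processed. With several matching variables, the absolute difference would have to be replaced by a multivariate distance or by one caliper per variable.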
Thanks!
Ignacio