FLAME (Fast, Large-scale Almost Matching Exactly) is a fast, interpretable matching method for causal inference. It matches units via a learned, weighted Hamming distance that determines which covariates are more important to match on. For more details, see the section Description of the Algorithm below or the original FLAME paper.
We can start by loading FLAME and generating some toy data using the included gen_data function.
library(FLAME)
set.seed(45)
n <- 100
p <- 5
data <- gen_data(n, p) # Data we would like to match
holdout <- gen_data(n, p) # Data we will train on, to compute PE
Note that all our covariates are factors, because FLAME is designed to work with categorical covariates.
Covariates that are not factors will be assumed to be continuous and binned prior to matching. This use of FLAME is not recommended. To be clear: any covariate that is not continuous, and that you would like to match exactly on, must be passed to FLAME as a factor.
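As a minimal sketch of preparing covariates in this form, the base R snippet below recodes raw columns as factors. The data frame `raw` and the helper `to_zero_based_factor` are hypothetical and not part of the FLAME package; the recoding to levels 0, ..., k - 1 follows the requirement stated in the description of the data argument.

```r
# Hypothetical raw data: covariates stored as integers or strings, not factors.
raw <- data.frame(
  X1 = c(0L, 1L, 1L, 0L),
  X2 = c("low", "high", "low", "mid"),
  outcome = c(1.2, 3.4, 2.2, 0.7),
  treated = c(1, 0, 1, 0),
  stringsAsFactors = FALSE
)

covariates <- c("X1", "X2")

# Hypothetical helper: recode a covariate to the integer codes 0, ..., k - 1
# (in sorted order of its distinct values) and store the result as a factor.
to_zero_based_factor <- function(x) {
  codes <- as.integer(factor(x)) - 1L
  factor(codes, levels = seq_len(max(codes) + 1L) - 1L)
}

raw[covariates] <- lapply(raw[covariates], to_zero_based_factor)

# Inspect the result: covariates are factors; outcome and treated stay numeric.
str(raw)
```

Note that the mapping from the original labels to integer codes is alphabetical here ("high" becomes 0, "low" becomes 1, "mid" becomes 2), so it is worth keeping a lookup table if the original labels matter for interpretation.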
In addition to the covariates to match on, data contains an outcome and a treated column.
The outcome must be numeric, either binary or continuous. FLAME focuses on binary treatments and the treatment column must either be logical or binary numeric.
From here, we can run FLAME with its default parameters. This will match units on the covariates (here, X1, X2, X3, X4, X5) and output information about the matches that were made.
By default, FLAME returns a list with 6 entries:
The first, FLAME_out$data, contains the original data frame with several modifications:
There is an extra logical column, FLAME_out$data$matched, that indicates whether or not a unit was matched. This can be useful if, for example, you’d like to use only the units that were matched for subsequent analysis.
There is an extra numeric column, FLAME_out$data$weight, that denotes on how many different sets of covariates a unit was matched. By default, this will be 1 if a unit is matched and 0 otherwise. With the replace = TRUE argument, however, units are allowed to match several times on multiple sets of covariates and their values for weight can therefore be greater than 1. These weights can be used when estimating treatment effects.
Regardless of their original names, the columns denoting treatment and outcome in the data will be renamed treated and outcome and moved to be located after all the covariate data.
Units that were not matched on all covariates will have a * in place of their covariate value for every covariate on which they were not matched.
head(FLAME_out$data)
#> X1 X2 X3 X4 X5 outcome treated matched weight
#> 1 1 1 0 * * 10.6908610 1 TRUE 1
#> 2 0 0 0 0 0 -0.3718777 0 TRUE 1
#> 3 * * * * * -0.4749807 1 FALSE 0
#> 4 0 0 0 1 0 -0.8616791 0 TRUE 1
#> 5 * * * * * 2.5016291 1 FALSE 0
#> 6 0 0 0 1 0 -0.1160383 0 TRUE 1
The above, for example, implies that while unit 2 was matched to units that also had values (X1, X2, X3, X4, X5) = (0, 0, 0, 0, 0), unit 1 was matched to units that shared values of (X1, X2, X3) = (1, 1, 0), but that differed in their values of X4 and X5. Units 3 and 5 were not matched at all.
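To make the use of the matched column and the * placeholders concrete, here is a small base R sketch. The data frame `flame_data` is a hand-built stand-in for the six rows printed above (it is not produced by calling FLAME), and storing the covariates as character columns is an assumption made for illustration.

```r
# Hand-built stand-in for the six rows of FLAME_out$data printed above.
flame_data <- data.frame(
  X1 = c("1", "0", "*", "0", "*", "0"),
  X2 = c("1", "0", "*", "0", "*", "0"),
  X3 = c("0", "0", "*", "0", "*", "0"),
  X4 = c("*", "0", "*", "1", "*", "1"),
  X5 = c("*", "0", "*", "0", "*", "0"),
  outcome = c(10.6908610, -0.3718777, -0.4749807,
              -0.8616791, 2.5016291, -0.1160383),
  treated = c(1, 0, 1, 0, 1, 0),
  matched = c(TRUE, TRUE, FALSE, TRUE, FALSE, TRUE),
  weight = c(1, 1, 0, 1, 0, 1),
  stringsAsFactors = FALSE
)

# Keep only the units that were matched.
matched_only <- flame_data[flame_data$matched, ]

# Among all units, flag those matched exactly on every covariate,
# i.e. those with no "*" in any covariate column.
covs <- paste0("X", 1:5)
exact <- rowSums(flame_data[covs] == "*") == 0

rownames(matched_only)  # units 1, 2, 4 and 6
which(exact)            # units 2, 4 and 6
```

Unit 1 is matched but not flagged as exact, because it was matched only on X1, X2 and X3.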
The second, MGs, is a list, each entry of which contains the units in a single matched group.
That is, units 2, 58, 60, 75, and 95 were all matched together and there are 19 matched groups total.
The third, CATE, complements MGs by supplying the conditional average treatment effect (CATE) for each matched group. For example, the CATE of the matched group above is given by:
The fourth, matched_on, is a list, also corresponding to MGs, that gives the covariates, and their values, on which units in each matched group were matched.
The above shows that each of the units in the first matched group had covariate values (X1, X2, X3, X4, X5) = (0, 0, 0, 0, 0). For matched groups not formed on all covariates, some of these entries will be missing:
Thus, the units in this matched group, as defined by the corresponding entry of MGs, shared the same values of X1, X2, X3, and X4, but not of X5.
The fifth, matching_covs, is a list giving the covariates used for matching on each iteration of FLAME:
FLAME_out$matching_covs
#> [[1]]
#> [1] "X1" "X2" "X3" "X4" "X5"
#>
#> [[2]]
#> [1] "X1" "X2" "X3" "X4"
#>
#> [[3]]
#> [1] "X1" "X2" "X3"
Thus, first, matches were attempted on covariates X1, X2, X3, X4, X5. Then, matches were attempted on all covariates but X5, and so on. Note that entries of matching_covs do not necessarily denote covariates on which matches were successfully made; rather, they denote the covariates which were used to (try and) match on every iteration of FLAME.
The sixth, dropped, describes the order in which covariates were dropped:
Thus, covariate X5 was dropped first, then X4, and so on. This information can be inferred directly from matching_covs, but for large numbers of covariates, dropped provides an easier way of identifying this order.
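As a sketch of how the drop order follows from matching_covs, the base R snippet below compares consecutive entries of a list copied from the output shown earlier. The list is retyped by hand here, and the exact structure of the dropped entry returned by FLAME is not reproduced; only the covariates that can be recovered from the three iterations shown are computed.

```r
# matching_covs as printed above, retyped by hand.
matching_covs <- list(
  c("X1", "X2", "X3", "X4", "X5"),
  c("X1", "X2", "X3", "X4"),
  c("X1", "X2", "X3")
)

# The covariate dropped between two consecutive iterations is the one
# present in the earlier set but absent from the later one.
earlier <- head(matching_covs, -1)
later <- tail(matching_covs, -1)
drop_order <- unlist(Map(setdiff, earlier, later))

drop_order  # "X5" then "X4"
```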
After FLAME has been run, the matched data can be used for a variety of purposes. The FLAME package provides functionality for a few quick, post-matching analyses, via the functions MG, CATE, ATE, and ATT.
The function MG(units, FLAME_out, index_only = FALSE) takes in a vector of units, whose matched groups you would like returned, and the output of a call to FLAME. If we want to see the matched group of units 1 and 2, for example, we can run:
MG(c(1, 2), FLAME_out)
#> [[1]]
#> X1 X2 X3 X4 X5 outcome treated
#> 1 1 1 0 * * 10.690861 1
#> 8 1 1 0 * * 8.796627 1
#> 12 1 1 0 * * 11.365257 1
#> 28 1 1 0 * * 8.200261 1
#> 74 1 1 0 * * 5.205795 0
#> 76 1 1 0 * * 10.140320 1
#> 91 1 1 0 * * 5.551738 0
#> 93 1 1 0 * * 3.492163 0
#> 97 1 1 0 * * 3.752434 0
#>
#> [[2]]
#> X1 X2 X3 X4 X5 outcome treated
#> 2 0 0 0 0 0 -0.3718777 0
#> 58 0 0 0 0 0 0.6972013 0
#> 60 0 0 0 0 0 2.9151768 1
#> 75 0 0 0 0 0 1.0590852 0
#> 95 0 0 0 0 0 0.3293956 0
This returns a list of two data frames, the first corresponding to unit 1 and the second to unit 2. Each contains information for all units in the corresponding matched groups. The asterisks in the last two covariate columns of the first data frame indicate that these units did not match on X4 or X5. If we only want the indices of the units in each matched group, we can specify index_only = TRUE:
MG(c(1, 2), FLAME_out, index_only = TRUE)
#> [[1]]
#> [1] 1 8 12 28 74 76 91 93 97
#>
#> [[2]]
#> [1] 2 58 60 75 95
CATE(units, FLAME_out) takes in the same first two arguments and gives the estimated CATEs of the units in units. The CATE of a unit is defined to be the CATE of its matched group, and the CATE of a matched group is the difference between the average treated and average control outcomes in the matched group.
The CATEs of units 1 and 2 are thus
ATE(FLAME_out) and ATT(FLAME_out) take in the output of a call to FLAME and return the estimated average treatment effect and the estimated average treatment effect on the treated, respectively.
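The definition of a matched group's CATE given above can be checked by hand. The sketch below retypes the outcomes and treatment indicators from the MG(c(1, 2), FLAME_out) output shown earlier and applies the difference in means directly; `group_cate` is a hypothetical helper written for illustration, not the package's CATE function.

```r
# Matched group of unit 1 (units 1, 8, 12, 28, 74, 76, 91, 93, 97).
mg1 <- data.frame(
  outcome = c(10.690861, 8.796627, 11.365257, 8.200261, 5.205795,
              10.140320, 5.551738, 3.492163, 3.752434),
  treated = c(1, 1, 1, 1, 0, 1, 0, 0, 0)
)

# Matched group of unit 2 (units 2, 58, 60, 75, 95).
mg2 <- data.frame(
  outcome = c(-0.3718777, 0.6972013, 2.9151768, 1.0590852, 0.3293956),
  treated = c(0, 0, 1, 0, 0)
)

# Hypothetical helper: mean treated outcome minus mean control outcome.
group_cate <- function(mg) {
  mean(mg$outcome[mg$treated == 1]) - mean(mg$outcome[mg$treated == 0])
}

group_cate(mg1)  # roughly 5.34
group_cate(mg2)  # roughly 2.49
```

Because every unit in a matched group shares the group's CATE, these two values are also the estimated CATEs of units 1 and 2 under this definition.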
Below are brief descriptions of the main arguments that may be passed to FLAME. For their complete descriptions, and those of all acceptable arguments, please refer to the documentation.
These are arguments that govern the format in which data is passed to FLAME.
data: Either a data frame or a path to a .csv file containing the data to be matched. If a path to a .csv file, all covariates will be assumed to be categorical. Treatments are assumed to be binary (can be input as logical) and outcomes numeric or binary. Treatments and outcomes should not be coded as factors. Covariates should be factors; otherwise, they will be interpreted as continuous covariates and binned prior to matching. Using FLAME to match on binned, continuous covariates is not recommended. In addition, if a supplied factor has \(k\) levels, they must be: \(0, 1, \dots, k - 1\). This will change in a future update.
holdout: Either a data frame, a path to a .csv file, or a value between 0 and 1. In the first two cases, the argument indicates the holdout set to be used for computing predictive error. In the third case, that proportion of data will be used as a holdout set and only the remaining proportion will be matched. In this case, the rows (units) of the original data input to FLAME that are matched are those specified by rownames(FLAME_out$data). Restrictions on column types are the same as for data. Must have the same column names and order as data.
treated_column_name: A character with the name of the column to be used as treatment in data. Defaults to ‘treated’.
outcome_column_name: A character with the name of the column to be used as outcome in data. Defaults to ‘outcome’.
These are arguments that deal with features of the underlying FLAME algorithm.
C: The hyperparameter governing the relative weights of the balancing factor and predictive error in determining match quality.
replace: If TRUE, allows the same unit to be matched multiple times, on different sets of covariates. For example, if TRUE and two units match exactly on all covariates, they will also match on every subsequent iteration of FLAME.
verbose: Controls how FLAME displays progress while running. If 0, no output. If 1, only outputs the stopping condition. If 2, outputs the iteration and number of unmatched units every 5 iterations, and the stopping condition. If 3, outputs the iteration and number of unmatched units every iteration, and the stopping condition.
PE_method: One of ‘ridge’ or ‘xgb’, respectively denoting whether ridge regression or xgboost is used to compute the predictive error on the holdout set. The former relies on glmnet::cv.glmnet and cross-validates over \(\lambda\), with alpha = 0, nfolds = 5, and all other parameters at their defaults. The latter relies on xgboost::xgb.cv and cross-validates over a grid of eta, max_depth, alpha, nrounds, and subsample, leaving all other parameters at their defaults. Ignored if user_PE_fit is supplied.
user_PE_fit and user_PE_fit_params: user_PE_fit is an optional, user-supplied function that fits a model for an outcome from covariates. It must take in a matrix of covariates as its first argument and a vector outcome as its second argument. If supplied, PE_method will be ignored. user_PE_fit_params is a named list of optional parameters to be used by user_PE_fit.
user_PE_predict and user_PE_predict_params: user_PE_predict is an optional, user-supplied function to generate predictions from the output of user_PE_fit. It must take the output of user_PE_fit as its first argument and a matrix of values for which to make predictions as its second argument. If not supplied, defaults to predict. user_PE_predict_params is a named list of optional parameters to be used by user_PE_predict.
To illustrate the usage of these last four parameters, we can have FLAME compute PE via Bayesian Additive Regression Trees (BART) with 100 trees as follows:
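library(dbarts)
# dbarts::bart takes the covariate matrix and the outcome vector as its first two arguments
my_fit <- dbarts::bart
# keeptrees = TRUE stores the fitted trees so that predict() can be called later
my_fit_params <- list(ntree = 100, verbose = FALSE, keeptrees = TRUE)
# predict() returns one row per posterior draw; average over draws for each unit
my_predict <- function(bart_fit, new_data) {
  return(colMeans(predict(bart_fit, new_data)))
}
FLAME_out <-
  FLAME(data = data, holdout = holdout,
        user_PE_fit = my_fit, user_PE_fit_params = my_fit_params,
        user_PE_predict = my_predict)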
By default, FLAME terminates when all covariates have been dropped or all control / treatment units have been matched. There are various early stopping arguments that can be supplied to alter this behavior. In all cases, however, FLAME still terminates if all covariates have been dropped or all control / treatment units have been matched, even if the user-specified stopping condition has not yet been met.
early_stop_iterations: A number of iterations, corresponding to a number of covariates dropped, after which FLAME will automatically stop. A value of 0 has FLAME perform a single round of exact matching on all covariates and then stop.
early_stop_epsilon: If FLAME attempts to drop a covariate that would raise the PE above (1 + early_stop_epsilon) times the baseline PE (the PE before any covariates have been dropped), FLAME will stop.
early_stop_bf: If FLAME attempts to drop a covariate that would lead to a BF below this value, FLAME stops.
early_stop_pe: If FLAME attempts to drop a covariate that would lead to a PE above this value, FLAME stops.
early_stop_control: If FLAME attempts to drop a covariate that would lead the proportion of control units that are unmatched to fall below this value, FLAME stops.
early_stop_treated: If FLAME attempts to drop a covariate that would lead the proportion of treatment units that are unmatched to fall below this value, FLAME stops.
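As an illustration of the early_stop_epsilon rule as it is described above, the following base R sketch spells out the comparison. The function `exceeds_epsilon` and the PE values are hypothetical; this is a reading of the stated rule, not the package's internal code.

```r
# Hypothetical helper: TRUE if dropping a covariate would raise the PE
# above (1 + early_stop_epsilon) times the baseline PE.
exceeds_epsilon <- function(candidate_pe, baseline_pe, early_stop_epsilon) {
  candidate_pe > (1 + early_stop_epsilon) * baseline_pe
}

baseline_pe <- 1.00  # illustrative PE before any covariate is dropped

# With early_stop_epsilon = 0.25 the threshold is 1.25.
exceeds_epsilon(1.30, baseline_pe, 0.25)  # TRUE: FLAME would stop
exceeds_epsilon(1.20, baseline_pe, 0.25)  # FALSE: FLAME would continue
```

A smaller early_stop_epsilon makes stopping more likely, since less degradation in predictive error is tolerated relative to the baseline.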
FLAME offers several options for dealing with missing data, outlined below:
missing_data and n_data_imputations: These two arguments govern FLAME’s response to missingness in the data to be matched. If missing_data is 0, it is assumed that there is no missingness. If it is 1, units with missingness are dropped. If it is 2, n_data_imputations imputed datasets are generated using mice::mice. In this case, the FLAME algorithm will be run on each imputed dataset and all results returned. If it is 3, units will be prevented from matching on the covariates they are missing.
missing_holdout and n_holdout_imputations: These two arguments govern FLAME’s response to missingness in the holdout data. If missing_holdout is 0, it is assumed that there is no missingness. If it is 1, units with missingness are dropped. If it is 2, n_holdout_imputations imputed holdout datasets are generated using mice::mice. In this case, the predictive error computed by FLAME is the average of the predictive errors across the imputed holdout datasets.
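To give a sense of what dropping units with missingness amounts to, here is a base R sketch using complete.cases on a hypothetical toy data frame. It only illustrates the idea behind option 1; how FLAME handles this internally is not shown in this document.

```r
# Hypothetical toy data with missing covariate values.
toy <- data.frame(
  X1 = factor(c(0, 1, NA, 1)),
  X2 = factor(c(1, NA, 0, 0)),
  outcome = c(0.3, 1.8, -0.4, 2.1),
  treated = c(0, 1, 0, 1)
)

# Keep only units with no missing values in any column.
complete_units <- toy[complete.cases(toy), ]

rownames(complete_units)  # units 1 and 4 remain
```

Dropping units in this way reduces the pool of potential matches, which is one reason the imputation option (2) or the option of restricting matches to observed covariates (3) may be preferable when missingness is common.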
FLAME operates by iteratively matching all possible units on a set of covariates and then dropping one of those covariates to make more matches. Roughly, units are said to ‘match’ on a set of covariates if they have identical values of all those covariates. FLAME is thus designed to be run on categorical covariates. However, continuous covariates can be discretized via histogram binning rules and then passed to FLAME.
More specifically, we define our inputs to the algorithm as the datasets \(\mathcal{S} = (X, Y, T)\) and \(\mathcal{S}^H = (X^H, Y^H, T^H)\), where \(X \in \mathbb{R}^{n \times d}\) denotes the \(d\) covariates of the \(n\) units, \(Y \in \mathbb{R}^n\) denotes their outcomes, and \(T \in \{0, 1\}^n\) denotes their binary treatment assignments. We will refer to a unit \(i\) as ‘control’ if \(T_i = 0\) and as ‘treated’ if \(T_i = 1\). The dataset \(\mathcal{S}^H\) is identically structured, but for a separate, holdout set of units.
We denote the covariates used to match on an iteration \(l\) by a binary vector \(\boldsymbol{\theta}^{l} \in \{0, 1\}^d\). The \(j\)’th entry of \(\boldsymbol{\theta}^{l}\) denotes whether the \(j\)’th covariate is used to match units on iteration \(l\). When we go from iteration \(l\) to iteration \(l + 1\), we change a single entry of \(\boldsymbol{\theta}^{l}\) from 1 to 0 to generate \(\boldsymbol{\theta}^{l+1}\) and then match all possible units on \(\boldsymbol{\theta}^{l+1}\). There are two key points regarding these matches: (1) matches are only made for units in \(\mathcal{S}\) and not for units in \(\mathcal{S}^H\), and (2) units with identical values of the covariates indicated by \(\boldsymbol{\theta}^{l+1}\) are only matched if at least one is control and one is treated.
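The matching step for a given set of covariates can be sketched in base R as follows. The function `match_on` and the toy data are hypothetical and written only to illustrate the rule stated above: units are grouped by identical values of the selected covariates, and a group counts as matched only if it contains at least one treated and one control unit. Bookkeeping that FLAME performs across iterations, such as removing already matched units when replace is FALSE, is deliberately left out.

```r
# Hypothetical helper: form matched groups on the covariates named in `covs`.
match_on <- function(df, covs) {
  # Key each unit by its values of the selected covariates.
  key <- do.call(paste, c(df[covs], sep = "_"))
  groups <- split(seq_len(nrow(df)), key)
  # Keep only groups with at least one treated and one control unit.
  keep <- vapply(groups, function(idx) {
    any(df$treated[idx] == 1) && any(df$treated[idx] == 0)
  }, logical(1))
  unname(groups[keep])
}

# Hypothetical toy data with two binary covariates.
toy <- data.frame(
  X1 = factor(c(0, 0, 1, 1, 1, 0)),
  X2 = factor(c(0, 0, 1, 0, 1, 1)),
  treated = c(1, 0, 1, 0, 1, 0)
)

# Matching on both covariates: only units 1 and 2 form a valid group.
match_on(toy, c("X1", "X2"))

# Matching on X1 alone: every unit falls into a valid group.
match_on(toy, "X1")
```

In the first call, units 3 and 5 share covariate values but are both treated, so they are not matched; dropping X2 lets them match with control unit 4.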
Concretely, FLAME begins with \(\boldsymbol{\theta}^{0} = \mathbf{1}_d\); that is, by attempting to match units on all covariates. At any iteration \(l\), it then drops the covariate yielding the greatest increase in match quality (\(\mathtt{MQ}\)), defined as \(\mathtt{MQ} := C \cdot \mathtt{BF} - \mathtt{PE}\), where \(C\) is a hyperparameter. The balancing factor, \(\mathtt{BF}\), at an iteration \(l\), is defined as the proportion of control units, plus the proportion of treated units, that are matched by the update from \(\boldsymbol{\theta}^{l}\) to \(\boldsymbol{\theta}^{l + 1}\). The predictive error, \(\mathtt{PE}\), at an iteration \(l\), is defined as the training MSE incurred when predicting \(Y^{H}\) from the subset of \(X^H\) indicated by \(\boldsymbol{\theta}^{l + 1}\). In this way, FLAME encourages making many matches (lowering variance of treatment effect estimates) and matching on covariates important to the outcome (lowering bias of treatment effect estimates).
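The selection rule based on \(\mathtt{MQ} = C \cdot \mathtt{BF} - \mathtt{PE}\) can be illustrated with a few lines of base R. The value of C and the BF and PE figures for each candidate covariate are made up for the example; in FLAME itself, BF comes from the matches that the drop would produce and PE from a model fit on the holdout set.

```r
C <- 0.1  # illustrative value of the hyperparameter, not a stated default

# Hypothetical BF and PE that would result from dropping each candidate.
candidates <- data.frame(
  dropped = c("X3", "X4", "X5"),
  BF = c(0.40, 0.55, 0.60),
  PE = c(2.10, 1.30, 1.25),
  stringsAsFactors = FALSE
)

# Match quality for each candidate drop: MQ = C * BF - PE.
candidates$MQ <- C * candidates$BF - candidates$PE

# The covariate whose removal gives the highest MQ is dropped.
best <- candidates$dropped[which.max(candidates$MQ)]
best  # "X5"
```

Here X5 wins because dropping it yields both the most new matches and the smallest predictive error. With a larger C, a candidate with a high BF could be preferred even at some cost in PE.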
By default, the algorithm terminates when all covariates have been dropped or all treated/control units have been matched, but we provide several options for early stopping, described above.
For more details, see the FLAME paper.