Convolution-type smoothed quantile regression
The conquer library performs fast and accurate convolution-type smoothed quantile regression (Fernandes, Guerre and Horta, 2019), implemented via Barzilai-Borwein gradient descent (Barzilai and Borwein, 1988) with a Huber regression warm start. The package can also construct confidence intervals for regression coefficients using the multiplier bootstrap.
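To make the description above concrete, here is a minimal base-R sketch of smoothed quantile regression with a Gaussian kernel, minimized by Barzilai-Borwein gradient descent. It is not the package's implementation. The helper names smoothed_grad and bb_sqr, the fixed bandwidth h = 0.5, the step-size cap, and the least-squares warm start (used here in place of the Huber warm start) are all assumptions made for illustration.

```r
# Gradient of the average Gaussian-kernel smoothed check loss.
# Z: n x (p + 1) design matrix with a leading column of ones; b: coefficients.
smoothed_grad <- function(b, Z, y, tau, h) {
  res <- as.numeric(y - Z %*% b)
  # The derivative of the smoothed loss in the residual u is tau - pnorm(-u / h);
  # the chain rule through u = y - Z b flips the sign and multiplies by Z.
  as.numeric(crossprod(Z, pnorm(-res / h) - tau)) / length(y)
}

# Barzilai-Borwein gradient descent (hypothetical helper, not the package code).
bb_sqr <- function(Z, y, tau = 0.5, h = 0.5, max_iter = 200, tol = 1e-8) {
  # Assumption: least-squares warm start instead of a Huber regression warm start.
  b <- as.numeric(qr.solve(Z, y))
  g <- smoothed_grad(b, Z, y, tau, h)
  step <- 1
  for (k in seq_len(max_iter)) {
    if (sqrt(sum(g^2)) < tol) break
    b_new <- b - step * g
    g_new <- smoothed_grad(b_new, Z, y, tau, h)
    s <- b_new - b      # change in the iterate
    d <- g_new - g      # change in the gradient
    sd <- sum(s * d)
    # BB step <s, s> / <s, d>, capped; fall back to 1 without positive curvature.
    step <- if (sd > 0) min(sum(s * s) / sd, 100) else 1
    b <- b_new
    g <- g_new
  }
  b
}

# Small illustration on simulated data (true coefficients are all 1).
set.seed(1)
n <- 1000
X <- matrix(rnorm(n * 2), n, 2)
y <- 1 + X %*% c(1, 1) + rnorm(n)
Z <- cbind(1, X)
b_hat <- bb_sqr(Z, y, tau = 0.5)
print(b_hat)
```

Because the smoothed loss is differentiable and convex, plain gradient information is enough here. The Barzilai-Borwein rule picks the step size from the two most recent iterates and gradients, so no line search is needed.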
conquer is available on CRAN, and it can be installed into the R environment with:
install.packages("conquer")
The main functions of this library:
conquer: Convolution-type smoothed quantile regression.
Let us illustrate conquer with a simple example. For sample size n = 5000 and dimension p = 70, we generate data from a linear model yi = β0 + <xi, β> + εi, for i = 1, 2, …, n. Here we set β0 = 1, β is a p-dimensional vector with every entry equal to 1, xi follows a p-dimensional standard multivariate normal distribution (generated with mvrnorm from the library MASS), and εi follows a t distribution with 2 degrees of freedom.
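library(MASS)      # provides mvrnorm
library(quantreg)  # provides rq
library(conquer)   # provides conquer
n = 5000
p = 70
beta = rep(1, p + 1)  # intercept followed by p slopes, all equal to 1
set.seed(2020)
X = mvrnorm(n, rep(0, p), diag(p))  # standard multivariate normal covariates
err = rt(n, 2)                      # t-distributed noise with 2 degrees of freedom
Y = cbind(1, X) %*% beta + err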
Then we run both standard quantile regression, using the package quantreg with a Frisch-Newton approach after preprocessing (Portnoy and Koenker, 1997), and conquer (with the Gaussian kernel) on the generated data. The quantile level τ is fixed at 0.5.
tau = 0.5
# Standard quantile regression: time the fit and record the l2 estimation error
start = Sys.time()
fit.qr = rq(Y ~ X, tau = tau, method = "pfn")
end = Sys.time()
time.qr = as.numeric(difftime(end, start, units = "secs"))
est.qr = norm(as.numeric(fit.qr$coefficients) - beta, "2")
# conquer: same timing and error measurement
start = Sys.time()
fit.conquer = conquer(X, Y, tau = tau)
end = Sys.time()
time.conquer = as.numeric(difftime(end, start, units = "secs"))
est.conquer = norm(fit.conquer$coeff - beta, "2")
It takes 0.1955 seconds to run the standard quantile regression but only 0.0255 seconds to run conquer. Meanwhile, the estimation error (in ℓ2 norm) is 0.1799 for quantile regression and 0.1685 for conquer. For readers' reference, these runtimes were recorded on a MacBook Pro with a 2.3 GHz 8-core Intel Core i9 processor and 16 GB of 2667 MHz DDR4 memory.
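The overview mentions confidence intervals built with the multiplier bootstrap. The base-R sketch below illustrates the general idea only: each bootstrap draw reweights the observations' smoothed losses with random nonnegative weights of mean one and refits, and percentile intervals are read off the refitted coefficients. The helper names sqr_loss, sqr_grad and sqr_fit, the exponential weights, the fixed bandwidth, and the use of optim are assumptions of this sketch, not the package's API or its exact procedure.

```r
# Weighted Gaussian-kernel smoothed check loss (hypothetical helper).
sqr_loss <- function(b, Z, y, tau, h, w) {
  r <- as.numeric(y - Z %*% b)
  mean(w * (h * dnorm(r / h) + r * (tau - pnorm(-r / h))))
}

# Gradient of the weighted smoothed loss with respect to b.
sqr_grad <- function(b, Z, y, tau, h, w) {
  r <- as.numeric(y - Z %*% b)
  as.numeric(crossprod(Z, w * (pnorm(-r / h) - tau))) / length(y)
}

# Minimize the weighted loss with BFGS from a given starting point.
sqr_fit <- function(Z, y, tau, h, w, start) {
  optim(start,
        function(b) sqr_loss(b, Z, y, tau, h, w),
        function(b) sqr_grad(b, Z, y, tau, h, w),
        method = "BFGS")$par
}

set.seed(2)
n <- 400
X <- matrix(rnorm(n * 2), n, 2)
y <- as.numeric(1 + X %*% c(1, 1) + rt(n, 2))
Z <- cbind(1, X)
tau <- 0.5
h <- 0.5   # assumed fixed bandwidth
B <- 200   # number of bootstrap draws

# Point estimate with unit weights.
b_hat <- sqr_fit(Z, y, tau, h, rep(1, n), rep(0, ncol(Z)))

# Each draw uses fresh exponential(1) weights (mean one) and starts at b_hat.
boot <- replicate(B, sqr_fit(Z, y, tau, h, rexp(n), b_hat))

# 95% percentile intervals: one column per coefficient.
ci <- apply(boot, 1, quantile, probs = c(0.025, 0.975))
print(ci)
```

Starting each bootstrap fit at the point estimate keeps the refits cheap, since the reweighted minimizer is usually close to it.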
Help on the functions can be accessed by typing ?, followed by the function name, at the R command prompt. For example, ?conquer will present detailed documentation with the inputs, outputs and examples of the function conquer.
License: GPL-3.0
System requirements: C++11
Authors: Xuming He <xmhe@umich.edu>, Xiaoou Pan <xip024@ucsd.edu>, Kean Ming Tan <keanming@umich.edu> and Wen-Xin Zhou <wez243@ucsd.edu>
Maintainer: Xiaoou Pan <xip024@ucsd.edu>
References
Barzilai, J. and Borwein, J. M. (1988). Two-point step size gradient methods. IMA J. Numer. Anal. 8 141–148.
Fernandes, M., Guerre, E. and Horta, E. (2019). Smoothing quantile regressions. J. Bus. Econ. Statist., in press.
He, X., Pan, X., Tan, K. M. and Zhou, W.-X. (2020). Smoothed quantile regression for large-scale inference. Preprint.
Horowitz, J. L. (1998). Bootstrap methods for median regression models. Econometrica 66 1327–1351.
Koenker, R. (2005). Quantile Regression. Cambridge Univ. Press, Cambridge.
Koenker, R. (2019). Package “quantreg”, version 5.54. CRAN.
Koenker, R. and Bassett, G. (1978). Regression quantiles. Econometrica 46 33–50.
Portnoy, S. and Koenker, R. (1997). The Gaussian hare and the Laplacian tortoise: Computability of squared-error versus absolute-error estimators. Statist. Sci. 12 279–300.
Sanderson, C. and Curtin, R. (2016). Armadillo: A template-based C++ library for linear algebra. J. Open Source Softw. 1 26.