Boosted regression (boosting): An introductory tutorial and a Stata plugin
Abstract. Boosting, or boosted regression, is a recent data-mining technique that has
shown considerable success in predictive accuracy. This article gives an
overview of boosting and introduces a new Stata command, boost, that
implements the boosting algorithm described in Hastie, Tibshirani, and
Friedman (2001, 322). The plugin is illustrated with a Gaussian and a
logistic regression example. In the Gaussian regression example, the
R² value computed on a test dataset is 21.3% for linear regression and
93.8% for boosting. In the logistic regression example, stepwise logistic
regression correctly classifies 54.1% of the observations in a test dataset
versus 76.0% for boosted logistic regression. Currently, boost
accommodates Gaussian (normal), logistic, and Poisson boosted regression.
boost is implemented as a Windows C++ plugin.
Author: Matthias Schonlau
Keywords: boost, boosted regression, boosting, data mining