# SOME INFINITY THEORY FOR PREDICTOR ENSEMBLES

@inproceedings{Breiman2000SOMEIT, title={SOME INFINITY THEORY FOR PREDICTOR ENSEMBLES}, author={Leo Breiman}, year={2000} }

To dispel some of the mystery about what makes tree ensembles work, they are examined in distribution space, i.e., the limit case of "infinite" sample size. It is shown that the simplest kind of trees are complete in D-dimensional space if the number of terminal nodes T is greater than D. For such trees we show that the AdaBoost minimization algorithm gives an ensemble converging to the Bayes risk. Random forests which are grown using i.i.d. random vectors in the tree construction are shown to be…
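As an aside on the mechanism the abstract refers to: discrete AdaBoost reweights the training sample after each round and combines the fitted trees by a weighted vote. A minimal numpy sketch over one-feature decision stumps — an illustration of the general scheme, not the paper's T-terminal-node construction:

```python
import numpy as np

def fit_stump(X, y, w):
    """Search one-feature threshold stumps, minimizing weighted error.
    Labels y are in {-1, +1}; w are nonnegative sample weights."""
    n, d = X.shape
    best = (None, None, 1, np.inf)  # (feature, threshold, polarity, error)
    for j in range(d):
        for t in np.unique(X[:, j]):
            for pol in (1, -1):
                pred = pol * np.where(X[:, j] <= t, 1, -1)
                err = w[pred != y].sum()
                if err < best[3]:
                    best = (j, t, pol, err)
    return best

def adaboost(X, y, rounds=20):
    """Discrete AdaBoost: reweight the sample, combine stumps by weighted vote."""
    n = len(y)
    w = np.full(n, 1.0 / n)
    ensemble = []
    for _ in range(rounds):
        j, t, pol, err = fit_stump(X, y, w)
        err = max(err, 1e-12)                  # avoid log(0) on a perfect stump
        alpha = 0.5 * np.log((1 - err) / err)  # vote weight for this round
        pred = pol * np.where(X[:, j] <= t, 1, -1)
        w *= np.exp(-alpha * y * pred)         # up-weight the mistakes
        w /= w.sum()
        ensemble.append((alpha, j, t, pol))
    return ensemble

def predict(ensemble, X):
    score = sum(a * p * np.where(X[:, j] <= t, 1, -1)
                for a, j, t, p in ensemble)
    return np.sign(score)
```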

#### 171 Citations

Analysis of a Random Forests Model

- Mathematics, Computer Science
- J. Mach. Learn. Res.
- 2012

An in-depth analysis of a random forests model suggested by Breiman (2004), which is very close to the original algorithm. The analysis shows in particular that the procedure is consistent and adapts to sparsity, in the sense that its rate of convergence depends only on the number of strong features and not on how many noise variables are present.

Learning with Ensembles of Randomized Trees: New Insights

- Mathematics, Computer Science
- ECML/PKDD
- 2010

A connection with kernel target alignment, a measure of kernel quality, is pointed out, which suggests that randomization is a way to obtain a high alignment, leading to possibly low generalization error.

Random Forests

- 1999

Random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. The…
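The "random vector" in this definition can be made concrete: in Breiman's forests it typically consists of the bootstrap sample behind each tree plus the feature subsets drawn at each split. The sketch below draws one such realization; the name `mtry` for the subset size is the conventional one, not notation from this report:

```python
import random

def tree_randomness(n_samples, n_features, mtry, seed):
    """One i.i.d. realization of the random vector a forest tree depends on:
    a bootstrap sample of row indices plus one random feature subset
    (in the full algorithm, a fresh subset is drawn at every split)."""
    rng = random.Random(seed)
    rows = [rng.randrange(n_samples) for _ in range(n_samples)]  # with replacement
    feats = rng.sample(range(n_features), mtry)                  # without replacement
    return rows, feats
```

Because each tree draws its vector independently from the same distribution, the trees are identically distributed, which is what the limit analysis exploits.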

Learning with random forests

- Mathematics
- 2015

This work is devoted to a nonparametric estimation method called random forests, introduced by Breiman in 2001. Extensively used in a variety of areas, random forests exhibit good empirical performance…

Random Forests

- Mathematics, Computer Science
- Machine Learning
- 2001

Internal estimates monitor error, strength, and correlation; these are used to show the response to increasing the number of features used in the forest, and are also applicable to regression.

When do random forests fail?

- Computer Science, Mathematics
- NeurIPS
- 2018

This paper considers various tree constructions and examines how the choice of parameters affects the generalization error of the resulting random forests as the sample size goes to infinity. It shows that trees that perform well in nearest-neighbor search can be a poor choice for random forests.

Extremely randomized trees

- Mathematics, Computer Science
- Machine Learning
- 2006

A new tree-based ensemble method for supervised classification and regression problems that strongly randomizes both attribute and cut-point choice while splitting a tree node; in the extreme case, it builds totally randomized trees whose structures are independent of the output values of the learning sample.
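The randomized split described here can be sketched directly. The function below is an illustrative sketch of the node-splitting step, not the published algorithm in full: it draws k candidate splits, the real method then keeps the best-scoring one, and with k = 1 no scoring occurs at all, giving the totally randomized trees mentioned above:

```python
import random

def draw_candidate_splits(X, k, rng=random):
    """Extra-Trees-style split draw: pick k attributes at random and, for
    each, a cut-point drawn uniformly within that attribute's range at
    the node. Returns (feature index, threshold) pairs."""
    d = len(X[0])
    candidates = []
    for j in rng.sample(range(d), k):           # k distinct attributes
        col = [row[j] for row in X]
        lo, hi = min(col), max(col)
        candidates.append((j, rng.uniform(lo, hi)))  # random cut-point in range
    return candidates
```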

Random Forests and Kernel Methods

- Mathematics, Computer Science
- IEEE Transactions on Information Theory
- 2016

It is shown that by slightly modifying their definition, random forests can be rewritten as kernel methods (called KeRF, for kernel based on random forests), which are more interpretable and easier to analyze.
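The kernel view can be illustrated with the usual forest connection function: the similarity of two points is the fraction of trees that route them to the same leaf. A sketch of the general idea, not the paper's exact KeRF estimator:

```python
def forest_kernel(leaves_x, leaves_z):
    """Connection function of a forest: K(x, z) is the fraction of trees
    sending x and z to the same leaf. leaves_x[i] and leaves_z[i] are the
    leaf ids the two points reach in tree i."""
    hits = sum(a == b for a, b in zip(leaves_x, leaves_z))
    return hits / len(leaves_x)
```

A forest prediction at x is then (approximately) a kernel-weighted average of training responses under this K, which is what makes the rewritten estimator easier to analyze.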

On the asymptotics of random forests

- Computer Science, Mathematics
- J. Multivar. Anal.
- 2016

It is proved that q-quantile forests, which are close in spirit to Breiman's (2001) forests but easier to study, are able to combine inconsistent trees to obtain a final consistent prediction, thus highlighting the benefits of random forests compared to single trees.

Towards Convergence Rate Analysis of Random Forests for Classification

- Computer Science, Mathematics
- NeurIPS
- 2020

This work presents the first finite-sample rate O(n^{-1/(8d+2)}) on the convergence of pure random forests for classification, which can be improved to O(n^{-1/(3.87d+2)}) by considering the midpoint splitting mechanism.

#### References

Showing 1-10 of 23 references

Special Invited Paper-Additive logistic regression: A statistical view of boosting

- Mathematics
- 2000

Boosting is one of the most important recent developments in classification methodology. Boosting works by sequentially applying a classification algorithm to reweighted versions of the training data…

Boosting Decision Trees

- Computer Science
- NIPS
- 1995

A constructive, incremental learning system for regression problems that models data by means of locally linear experts that do not compete for data during learning, with asymptotic results derived for the method.

Shape Quantization and Recognition with Randomized Trees

- Mathematics, Computer Science
- Neural Computation
- 1997

A new approach to shape recognition based on a virtually infinite family of binary features (queries) of the image data, designed to accommodate prior information about shape invariance and regularity; a comparison with artificial neural network methods is presented.

Experiments with a New Boosting Algorithm

- Computer Science
- ICML
- 1996

This paper describes experiments carried out to assess how well AdaBoost, with and without pseudo-loss, performs on real learning problems, and compares boosting to Breiman's "bagging" method when used to aggregate various classifiers.

Improved Boosting Algorithms Using Confidence-rated Predictions

- Computer Science
- COLT' 98
- 1998

We describe several improvements to Freund and Schapire's AdaBoost boosting algorithm, particularly in a setting in which hypotheses may assign confidences to each of their predictions. We give a…

Arcing the edge

- 1997

Recent work has shown that adaptively reweighting the training set, growing a classifier using the new weights, and combining the classifiers constructed to date can significantly decrease… Expand

Boosting Algorithms as Gradient Descent

- Computer Science
- NIPS
- 1999

Following previous theoretical results bounding the generalization performance of convex combinations of classifiers in terms of general cost functions of the margin, a new algorithm (DOOM II) is presented for performing a gradient descent optimization of such cost functions.

An Empirical Comparison of Voting Classification Algorithms

- Mathematics
- 1999

Methods for voting classification algorithms, such as Bagging and AdaBoost, have been shown to be very successful in improving the accuracy of certain classifiers for artificial and real-world data...

Explaining bagging, available at www.stat.Berkeley.EDU/users/binyu/publications.html

- 2000

Arcing Classifiers (discussion paper)

- Annals of Statistics
- 1998