Pruning

A tree can be grown very large, almost to the point where it fits the training data perfectly, sometimes with just one observation in each leaf. Such a tree overfits the training data and predicts poorly on independent test sets. Conversely, a tree that is too small fails to capture all the useful relationships in the data. Appropriate tree size can be determined in a number of ways. One way is to set a threshold for the reduction in the impurity measure, below which no split is made. A preferred approach is to grow an overly large tree until some minimum node size is reached, and then prune it back to an optimal size. Optimal size can be determined using an independent test set or cross-validation (described below). In either case, the result is a tree of optimal size accompanied by an independent measure of its error rate.
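The grow-then-prune idea can be illustrated with a small sketch. It is not the specific procedure this text goes on to describe: it assumes a single numeric feature, Gini impurity as the impurity measure, and reduced-error pruning against a held-out set as a simple stand-in for choosing the optimal size with an independent test set. All function names and the toy data are hypothetical.

```python
import copy
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def grow(rows, min_node_size=1):
    """Grow a tree on (x, label) rows until nodes are pure or reach min_node_size."""
    labels = [y for _, y in rows]
    node = {"leaf": True, "pred": Counter(labels).most_common(1)[0][0]}
    if len(rows) <= min_node_size or len(set(labels)) == 1:
        return node
    best = None
    xs = sorted({x for x, _ in rows})
    for lo, hi in zip(xs, xs[1:]):          # candidate thresholds: midpoints
        t = (lo + hi) / 2
        left = [r for r in rows if r[0] <= t]
        right = [r for r in rows if r[0] > t]
        score = (len(left) * gini([y for _, y in left])
                 + len(right) * gini([y for _, y in right])) / len(rows)
        if best is None or score < best[0]:
            best = (score, t, left, right)
    if best is None:                        # all x values identical: cannot split
        return node
    _, t, left, right = best
    node.update(leaf=False, thr=t,
                left=grow(left, min_node_size), right=grow(right, min_node_size))
    return node

def predict(node, x):
    while not node["leaf"]:
        node = node["left"] if x <= node["thr"] else node["right"]
    return node["pred"]

def errors(node, rows):
    """Number of misclassified rows."""
    return sum(predict(node, x) != y for x, y in rows)

def prune(node, held_out):
    """Bottom-up: collapse a subtree into a leaf when that does not
    increase the error on the held-out rows that reach it."""
    if node["leaf"]:
        return node
    prune(node["left"], [r for r in held_out if r[0] <= node["thr"]])
    prune(node["right"], [r for r in held_out if r[0] > node["thr"]])
    leaf_err = sum(node["pred"] != y for _, y in held_out)
    if leaf_err <= errors(node, held_out):
        for key in ("thr", "left", "right"):
            node.pop(key)
        node["leaf"] = True
    return node

def count_leaves(node):
    return 1 if node["leaf"] else count_leaves(node["left"]) + count_leaves(node["right"])

# Toy data: the true boundary is x = 5.5; two training labels are noise.
train = [(x, "a" if x <= 5 else "b") for x in range(1, 11)]
train[2] = (3, "b")   # noisy label
train[7] = (8, "a")   # noisy label
held_out = [(x + 0.25, "a" if x <= 5 else "b") for x in range(1, 11)]

full = grow(train, min_node_size=1)           # overly large tree
pruned = prune(copy.deepcopy(full), held_out)  # pruned against independent data

print("leaves:", count_leaves(full), "->", count_leaves(pruned))
print("held-out errors:", errors(full, held_out), "->", errors(pruned, held_out))
```

In this sketch the full tree memorizes the two noisy training labels, while pruning removes the splits that do not help on the held-out rows. Note that the held-out error of the pruned tree is optimistic here, because the same rows were used to choose the tree size; an honest error estimate needs further independent data or cross-validation.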
