Decision Trees

Decision trees are hierarchical structures, where each internal node contains a test on an attribute, each branch corresponds to an outcome of the test, and each leaf node gives a prediction for the value of the class variable. Depending on whether we are dealing with a classification or a regression problem, the decision tree is called a classification or a regression tree, respectively. Two classification trees derived from the data set in Table 1 are given in Figure 2. An example regression tree, also derived from the data set in Table 1, is given in Figure 3.

Regression tree leaves contain constant values as predictions for the class value. They thus represent piecewise constant functions. Model trees, where leaf nodes can contain linear models predicting the class value, represent piecewise linear functions.

Note that decision trees represent total partitions of the data space, where each test corresponds to an axis-parallel split. This is illustrated in Figure 3. Most algorithms for decision tree induction consider such axis-parallel splits, but there are a few algorithms that consider splits along lines that need not be axis parallel or even consider splits along nonlinear curves.


Age Yes

No Yes

Figure 2 Two classification trees that predict the value of variable BigSpender from the variables Age and Income, and Age and Gender, respectively.

Was this article helpful?

0 0
Project Earth Conservation

Project Earth Conservation

Get All The Support And Guidance You Need To Be A Success At Helping Save The Earth. This Book Is One Of The Most Valuable Resources In The World When It Comes To How To Recycle to Create a Better Future for Our Children.

Get My Free Ebook

Post a comment