A single measure of error may be used to pick the optimal tree size, but no single measure can capture how adequate the model is across diverse applications. Consequently, several measures of error may need to be reported for the final model. In classification problems, these may include the misclassification rate and kappa, which measures the proportion of correctly classified units after accounting for the probability of chance agreement. In classification problems with only a zero-one response, additional measures include sensitivity, specificity, and receiver operating characteristic (ROC) curves with the associated area under the curve (AUC). In regression problems, measures of interest might include correlation coefficients, root mean squared error, average absolute error, and bias, among others. The literature on error assessment is vast. The point here is that an optimal tree size may be determined using one criterion, but it is often necessary to report several measures to assess how well the model suits different applications.
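As a rough illustration of reporting several measures for one final model, the sketch below computes the misclassification rate, kappa, sensitivity, and specificity for a zero-one response, and root mean squared error, average absolute error, and bias for a regression response. The function names and the sample data are hypothetical and do not come from any particular package. Kappa is computed here as Cohen's kappa, which is an assumption about the intended definition, and bias is taken as the mean of predicted minus observed values. The classification function assumes both classes occur in the observed data and that chance agreement is less than one; otherwise a division by zero results.

```python
import math


def classification_measures(y_true, y_pred):
    """Several error measures for a zero-one response (hypothetical helper)."""
    n = len(y_true)
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    observed = (tp + tn) / n  # proportion correctly classified
    # Agreement expected by chance, from the row and column totals
    # (assumption: Cohen's kappa is the intended form of kappa).
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n ** 2

    return {
        "misclassification_rate": 1 - observed,
        "kappa": (observed - expected) / (1 - expected),
        "sensitivity": tp / (tp + fn),  # needs at least one observed 1
        "specificity": tn / (tn + fp),  # needs at least one observed 0
    }


def regression_measures(y_true, y_pred):
    """Several error measures for a continuous response (hypothetical helper)."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]  # predicted minus observed
    return {
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
        "mean_absolute_error": sum(abs(e) for e in errors) / n,
        "bias": sum(errors) / n,
    }


# Made-up observations and predictions, for illustration only.
observed_classes = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted_classes = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
class_report = classification_measures(observed_classes, predicted_classes)

observed_values = [1.0, 2.0, 3.0]
predicted_values = [2.0, 2.0, 5.0]
reg_report = regression_measures(observed_values, predicted_values)

for name, value in {**class_report, **reg_report}.items():
    print(f"{name}: {value:.3f}")
```

ROC curves and AUC are left out of this sketch because they require predicted scores or class probabilities, not hard class labels.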