As mentioned before, one advantage of classification and regression trees is their ability to accommodate missing values. If the response variable is missing, the observation can be excluded from the analysis or, in a classification problem, assigned to a new class (e.g., "missing") to reveal any patterns in the loss of information. If explanatory variables are missing, trees can use surrogate variables in their place to determine the split. A surrogate is another variable whose split assigns observations to the child nodes in nearly the same way as the primary split. Alternatively, an observation can be passed to the next node using a variable that is not missing for that observation.
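The surrogate idea can be illustrated with a minimal Python sketch. It is a simplified illustration rather than the procedure of any particular tree package: the data, the variable names (`age`, `income`, `height`), the primary threshold of 40, and the helper functions `best_surrogate` and `route` are all hypothetical. The sketch also assumes that a surrogate split sends low values to the same side as the primary split, and it measures a surrogate's quality simply as the fraction of observations it routes the same way as the primary split.

```python
def goes_left(value, threshold):
    """An observation goes to the left child when its value is <= threshold."""
    return value <= threshold


def best_surrogate(rows, primary, threshold, candidates):
    """Find the candidate split that best mimics the primary split.

    Returns (variable, cut, agreement), where agreement is the fraction of
    usable rows sent to the same child by both splits.
    """
    # Agreement can only be measured where the primary variable is observed.
    known = [r for r in rows if r[primary] is not None]
    best = None
    for var in candidates:
        usable = [r for r in known if r[var] is not None]
        if not usable:
            continue
        # Try every observed value of the candidate as a cut point.
        for cut in sorted({r[var] for r in usable}):
            agree = sum(
                goes_left(r[var], cut) == goes_left(r[primary], threshold)
                for r in usable
            )
            rate = agree / len(usable)
            if best is None or rate > best[2]:
                best = (var, cut, rate)
    return best


def route(row, primary, threshold, surrogate):
    """Send a row left or right, falling back to the surrogate if needed."""
    if row[primary] is not None:
        return "left" if goes_left(row[primary], threshold) else "right"
    var, cut, _ = surrogate
    return "left" if goes_left(row[var], cut) else "right"


# Hypothetical data: the last observation is missing the primary variable.
rows = [
    {"age": 25, "income": 30, "height": 170},
    {"age": 30, "income": 35, "height": 160},
    {"age": 45, "income": 60, "height": 175},
    {"age": 50, "income": 65, "height": 165},
    {"age": None, "income": 70, "height": 168},
]

# Assume the tree's primary split at this node is age <= 40.
surrogate = best_surrogate(rows, "age", 40, ["income", "height"])
print(surrogate)                             # ('income', 35, 1.0)
print(route(rows[4], "age", 40, surrogate))  # right
```

In this toy data, the split `income <= 35` reproduces the primary split exactly on the four complete observations, so it is chosen as the surrogate, and the observation with missing `age` is sent to the right child on the basis of its income. Fuller implementations typically rank several surrogates and fall back to the majority direction when no surrogate is available; those refinements are omitted here.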