## Equations

Statistics is one of the major scientific disciplines that DM draws upon. A predictive model in statistics most commonly takes the form of an equation.

Linear models predict the value of a target (dependent) variable as a linear combination of the input (independent) variables. Three linear models that predict the value of the variable Total are represented by eqns [1]–[3]. These were derived using linear regression on the data from Table 1.

Total = 189.126 × Age + 0.0932 × Income − 2420.67 [3]
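As a minimal sketch, eqn [3] can be evaluated directly as a function of its two input variables. The function name `predict_total` and the example inputs are illustrative assumptions, not values from Table 1; the Income coefficient is read as 0.0932.

```python
def predict_total(age: float, income: float) -> float:
    """Evaluate eqn [3]: a linear combination of Age and Income plus an intercept."""
    # Coefficients copied from eqn [3]; the Income coefficient is read as 0.0932.
    return 189.126 * age + 0.0932 * income - 2420.67


# Hypothetical inputs chosen only for illustration (not taken from Table 1).
example_prediction = predict_total(age=40, income=50_000)
print(round(example_prediction, 2))
```

Each coefficient gives the change in the predicted Total per unit change in its variable while the other variable is held fixed.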

Linear equations involving two variables (such as eqns [1] and [2]) can be depicted as straight lines in a two-dimensional space (see Figure 1). Linear equations involving three variables (such as eqn [3]) can be depicted as planes in a three-dimensional space. Linear equations, in general, represent hyperplanes in multidimensional spaces. Nonlinear equations are represented by curves, surfaces, and hypersurfaces.
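A straight line of this kind can be obtained from training points by linear regression. The sketch below assumes ordinary least squares with a single input variable, which the text does not state explicitly; the helper `fit_line` and the toy points are hypothetical and are not the data from Table 1.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares (one input variable)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x about their means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy points lying exactly on y = 2x + 1 (synthetic, for illustration only).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

With more than one input variable the same idea yields a plane or hyperplane, and the coefficients are usually found with a linear algebra routine rather than these closed-form sums.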

Note that equations (or rather inequalities) can also be used for classification. If the value of the expression 0.093 × Income + 6119.744 is greater than 15 000, for example, we can predict the value of the variable BigSpender to be 'Yes'. Points for which 'Yes' will be predicted are those above the regression line in the lower part of Figure 1.
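The inequality above can be sketched as a simple decision rule. The function name `predict_big_spender` is hypothetical, and returning 'No' when the inequality does not hold is an assumption, since the text only specifies the 'Yes' case.

```python
def predict_big_spender(income: float, threshold: float = 15_000) -> str:
    """Classify using the inequality 0.093 * Income + 6119.744 > threshold."""
    predicted_total = 0.093 * income + 6119.744
    # 'No' for the remaining cases is an assumption; the text only defines 'Yes'.
    return "Yes" if predicted_total > threshold else "No"


# Hypothetical incomes chosen only for illustration.
for income in (50_000, 100_000):
    print(income, predict_big_spender(income))
```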

Figure 1 Two regression lines that predict the value of variable Total from each of the variables Age and Income, respectively. The points correspond to the training examples. (Horizontal axes: Age in the upper panel, Income in the lower panel.)

