# Exercise 3.16 (EM)

Show that the maximum likelihood estimates for $a$ and $b$ are the solution of the equations

$$\sum_{i=1}^{n}\left(Y_i - a - bX_i\right) = 0 \qquad \text{and} \qquad \sum_{i=1}^{n} X_i\left(Y_i - a - bX_i\right) = 0,$$

and then solve these equations for the maximum likelihood values of the parameters.
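Solving the two normal equations gives the familiar closed form $b = \sum_i (X_i - \bar X)(Y_i - \bar Y) / \sum_i (X_i - \bar X)^2$ and $a = \bar Y - b\bar X$. A minimal numerical check (the data, seed, and noise level are made up for illustration):

```python
import numpy as np

# Hypothetical data: Y = 2 + 0.5*X plus Gaussian noise.
rng = np.random.default_rng(0)
X = np.linspace(0.0, 10.0, 50)
Y = 2.0 + 0.5 * X + rng.normal(0.0, 1.0, X.size)

# Closed-form solution of the two normal equations.
Xbar, Ybar = X.mean(), Y.mean()
b = np.sum((X - Xbar) * (Y - Ybar)) / np.sum((X - Xbar) ** 2)
a = Ybar - b * Xbar

# Verify that (a, b) satisfy both normal equations up to round-off:
# sum of residuals = 0 and sum of X-weighted residuals = 0.
resid = Y - a - b * X
assert abs(resid.sum()) < 1e-8
assert abs((X * resid).sum()) < 1e-8
```

The two assertions are exactly the two equations above evaluated at the fitted parameters; they hold to machine precision for any data set.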

Once we have computed these parameters, the remaining variation, which is unexplained by the linear model, is $\sum_{i=1}^{n}(Y_i - a - bX_i)^2$. We might then ask if the linear model is an improvement over no model at all. One way of assessing this is to ask how much of the variation is explained by the linear model, relative to the constant model. The common way to do that is with the ratio

$$1 - \frac{\sum_{i=1}^{n}(Y_i - a - bX_i)^2}{\sum_{i=1}^{n}(Y_i - a)^2}$$

Figure 3.10. In ordinary least squares (OLS) we minimize the vertical distance between the regression line y = a + bx and the data points. In total least squares (TLS), we minimize the actual distance between the data points and the regression line.


where we understand that the $a$'s in the fraction are different, coming from minimizing the sum of squares in the linear model (numerator) or the constant model (denominator). This ratio is the fraction of the variance explained by the linear model relative to the constant model and is a handy tool for measuring how much understanding we have gained from the increase in complexity of the model.
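This fraction of variance explained can be computed directly from the two fits. A short sketch (the data are made up; note that the constant-model $a$ is simply $\bar Y$):

```python
import numpy as np

# Hypothetical data: Y = 2 + 0.5*X plus Gaussian noise.
rng = np.random.default_rng(1)
X = np.linspace(0.0, 10.0, 50)
Y = 2.0 + 0.5 * X + rng.normal(0.0, 1.0, X.size)

# Linear model: least-squares a and b (numerator fit).
b = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
a = Y.mean() - b * X.mean()
sse_linear = np.sum((Y - a - b * X) ** 2)

# Constant model: the minimizing a is the mean of Y (denominator fit).
sse_constant = np.sum((Y - Y.mean()) ** 2)

# Fraction of the variance explained by the linear model.
r_squared = 1.0 - sse_linear / sse_constant
```

A value near 1 says the linear model accounts for most of the variation; a value near 0 says it offers little beyond the constant model.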

The least squares procedure that we have been discussing is called ordinary least squares (OLS) and works under the presumption that the causative variables are measured without error. But often that condition cannot be guaranteed: we simply cannot measure the $X_i$ accurately. The general formulation of this problem, in terms of a statistical model, is very complicated (see Connections) and is called the errors in variables problem. However, we can discuss an extension of least squares, called total least squares, to this case. In ordinary least squares (Figure 3.10) we minimize the vertical distance between the regression line $y = a + bx$ and the data points (that is, we assume no error in the measurement of $X$). In total least squares, we minimize the actual distance between the data points and the regression line. That is, we let $\{x_{c,i}, y_{c,i}\}$ denote the point on the regression line closest to the data point $\{X_i, Y_i\}$, and find it by choosing $\sum_{i=1}^{n}\left[(X_i - x_{c,i})^2 + (Y_i - y_{c,i})^2\right]$ to be a minimum.

To operationalize this idea, we need to find the closest point. The line has slope $b$, so the line segment perpendicular to the regression line will have slope $-1/b$, and the equation of the line joining the data point and the closest point is $(Y_i - y_{c,i})/(X_i - x_{c,i}) = -1/b$. Since we know that the closest point is on the regression line $y = a + bx$, we conclude that

$$\frac{Y_i - a - bx_{c,i}}{X_i - x_{c,i}} = -\frac{1}{b} \tag{3.84}$$

and view this as an equation for $x_{c,i}$. We solve Eq. 3.84 for $x_{c,i}$ and obtain
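The perpendicularity geometry above can be checked numerically. In the sketch below (data and noise levels are made up), the TLS line is obtained by the standard closed form for this minimization, the first principal component of the centered data, and the foot of the perpendicular follows from solving Eq. 3.84 for $x_{c,i}$:

```python
import numpy as np

# Hypothetical data with noise in BOTH coordinates, the errors-in-variables setting.
rng = np.random.default_rng(2)
X = np.linspace(0.0, 10.0, 50) + rng.normal(0.0, 0.3, 50)
Y = 2.0 + 0.5 * X + rng.normal(0.0, 0.3, X.size)

# TLS line: passes through the centroid, in the direction of the first
# principal component of the centered data (minimizes total perpendicular
# squared distance).
Xc, Yc = X - X.mean(), Y - Y.mean()
_, _, Vt = np.linalg.svd(np.column_stack([Xc, Yc]), full_matrices=False)
dx, dy = Vt[0]                      # direction vector of the TLS line
b = dy / dx
a = Y.mean() - b * X.mean()

# Closest point on the line to each data point, from solving Eq. 3.84:
# x_ci = (X_i + b*(Y_i - a)) / (1 + b^2), then y_ci = a + b*x_ci.
x_c = (X + b * (Y - a)) / (1.0 + b ** 2)
y_c = a + b * x_c

# Each segment from (X_i, Y_i) to (x_ci, y_ci) should be perpendicular
# to the regression line: its slope times b equals -1.
seg_slopes = (Y - y_c) / (X - x_c)
assert np.allclose(seg_slopes * b, -1.0)
```

The final assertion is exactly the perpendicularity condition $(Y_i - y_{c,i})/(X_i - x_{c,i}) = -1/b$ that led to Eq. 3.84.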