## Point Estimation: Fitting the Model to Data

The previous section explored a number of possibilities for developing statistical models, but without much consideration for the statistical and technical challenges. This section tackles the methods, challenges, and limitations of fitting biologically motivated statistical models to data. As we will see below, some of the statistical models given by eqns [2]–[4] are easier to fit to data than others. By fitting a model to data, we seek to estimate the unknown parameters in the statistical model, including parameters for both the deterministic and stochastic components. While there are a few approaches for finding parameter estimates from data, the focus here is on the most popular approach: the method of maximum likelihood.

Likelihood functions emerge from the probability distribution(s) that make up the stochastic component of a statistical model. Probability distributions describe the probability density (or probability mass if discrete) of a particular response given a set of parameters, which, more formally, is written as

$$f(y_i \mid \theta_1, \ldots, \theta_k)$$

where $y_i$ is the $i$th response variable, and $f(y_i \mid \theta_1, \ldots, \theta_k)$ is the probability density given the parameters $\theta_1$ through $\theta_k$. As such, the probability distribution is a function of the random variable $y$ for a fixed set of parameters. However, our interest here is the opposite because we seek the parameter values that are best supported by a fixed data set. Finding the parameter values best supported by the data is done using the likelihood function, which is defined as

$$L(\theta_1, \ldots, \theta_k \mid y_i) \propto f(y_i \mid \theta_1, \ldots, \theta_k)$$

In practice, the proportionality is replaced with an equality relationship because statistical inference is based on relative likelihood values. The resulting likelihood function is a function of the parameters $\theta_1, \ldots, \theta_k$, and describes the likelihood of the parameters given a fixed observation $y_i$. If the observations are independent and identically distributed, the likelihood function for multiple observations is the product of the probability distributions:

$$L(\theta_1, \ldots, \theta_k \mid y_1, \ldots, y_n) = \prod_{i=1}^{n} f(y_i \mid \theta_1, \ldots, \theta_k)$$
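To make the product form concrete, the following is a minimal Python sketch. It assumes, purely for illustration, that the observations are counts following a Poisson distribution with a single parameter `lam`; the data values, the candidate parameter values, and the function names are hypothetical and do not come from the text. It also shows the common practice of working with the log-likelihood (a sum of logs) rather than the raw product, which avoids numerical underflow when there are many observations.

```python
import math

def poisson_pmf(y, lam):
    """Probability mass of count y under a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam**y / math.factorial(y)

def likelihood(lam, ys):
    """Likelihood of lam: product of the probability masses of the observations."""
    return math.prod(poisson_pmf(y, lam) for y in ys)

def log_likelihood(lam, ys):
    """Log-likelihood of lam: sum of the log probability masses."""
    return sum(math.log(poisson_pmf(y, lam)) for y in ys)

# Hypothetical independent counts (illustrative values only).
counts = [3, 5, 4, 6, 2]

# Compare a handful of candidate parameter values; only relative values matter.
candidates = [2.0, 3.0, 4.0, 5.0, 6.0]
best = max(candidates, key=lambda lam: log_likelihood(lam, counts))

for lam in candidates:
    print(f"lam={lam:.1f}  L={likelihood(lam, counts):.3e}  "
          f"logL={log_likelihood(lam, counts):.3f}")
print("best supported candidate:", best)
```

Because the logarithm is a monotonic transformation, the candidate that maximizes the log-likelihood also maximizes the likelihood itself.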

To simplify notation, the likelihood definition in eqn [6] can be written as $L(\theta \mid y)$. The parameter value(s) that make the observations most likely are those that maximize the likelihood function. To work through a straightforward example, consider a cohort survival experiment where a fixed number of individuals are raised in the laboratory for 1 week. The initial number of individuals is given by $n$, the number surviving after 1 week is given by $y$, and we seek to estimate the probability of survival $p$. If we assume that all individuals had the same probability of dying over the week, then the likelihood function is given by the binomial distribution:

$$L(p \mid y, n) = \binom{n}{y} p^{y} (1 - p)^{n - y}$$
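The survival example can be explored numerically. The sketch below assumes a binomial likelihood, which is the standard model when every individual shares the same survival probability, and uses hypothetical values for `n` and `y`. It locates the maximum by a simple grid search over `p` and compares the result with the well-known closed-form binomial estimate, the observed proportion of survivors `y / n`. The grid resolution and variable names are illustrative choices, not part of the original text.

```python
import math

def binomial_likelihood(p, y, n):
    """Likelihood of survival probability p given y survivors out of n individuals."""
    return math.comb(n, y) * p**y * (1 - p)**(n - y)

# Hypothetical cohort: 50 individuals at the start, 35 alive after 1 week.
n, y = 50, 35

# Evaluate the likelihood on a fine grid of candidate values strictly inside (0, 1).
grid = [i / 1000 for i in range(1, 1000)]
p_hat_grid = max(grid, key=lambda p: binomial_likelihood(p, y, n))

# Closed-form maximum likelihood estimate for the binomial: the observed proportion.
p_hat_analytic = y / n

print("grid-search estimate:", p_hat_grid)
print("analytic estimate:   ", p_hat_analytic)
```

A grid search is rarely used in practice, but it makes the idea of "the parameter value that maximizes the likelihood" easy to inspect before turning to analytical or numerical optimization.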

For this example, it is possible to estimate the most likely parameter value (referred to as 'p-hat': $\hat{p}$) analytically by setting the first derivative of the likelihood function to