Probability and some statistics

In the January 2003 issue of Trends in Ecology and Evolution, Andrew Read (Read 2003) reviewed two books on modern statistical methods (Crawley 2002, Grafen and Hails 2002). His review, titled "Simplicity and serenity in advanced statistics," begins as follows:

One of the great intellectual triumphs of the 20th century was the discovery of the generalized linear model (GLM). This provides a single elegant and very powerful framework in which 90% of data analysis can be done. Conceptual unification should make teaching much easier. But, at least in biology, the textbook writers have been slow to get rid of the historical baggage. These two books are a huge leap forward.

A generalized linear model involves a response variable (for example, the number of juvenile fish found in a survey) that is described by a specified probability distribution (for example, the gamma distribution, which we shall discuss in this chapter) in which the parameter (for example, the mean of the distribution) is a linear function of other variables (for example, temperature, time, location, and so on).
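To make this description concrete, here is a minimal sketch in Python (using NumPy) that simulates a response whose distribution is gamma and whose mean is a linear function of temperature. The function name, the coefficient values, and the shape parameter are all illustrative assumptions, not values from any data set, and the mean is taken to be the linear function itself, as in the description above.

```python
import numpy as np


def simulate_gamma_response(temperature, intercept=2.0, slope=0.5,
                            shape=4.0, seed=0):
    """Draw gamma-distributed responses whose mean is linear in temperature.

    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = np.random.default_rng(seed)
    temperature = np.asarray(temperature, dtype=float)

    # The parameter of interest (the mean) is a linear function of the
    # explanatory variable.
    mean = intercept + slope * temperature
    if np.any(mean <= 0):
        raise ValueError("the mean of a gamma distribution must be positive")

    # A gamma distribution with a given shape and scale has
    # mean = shape * scale, so we pick the scale that yields the desired mean.
    scale = mean / shape
    response = rng.gamma(shape, scale)
    return mean, response


if __name__ == "__main__":
    temps = np.array([10.0, 15.0, 20.0])
    mean, response = simulate_gamma_response(temps)
    print("means:    ", mean)
    print("responses:", response)
```

Fitting a generalized linear model runs this logic in reverse: given observed responses and temperatures, one estimates the intercept and slope that make the data most plausible under the assumed distribution.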

The books by Crawley and by Grafen and Hails are indeed good ones, and worth having in one's library. They feature in this chapter for the following reason. On p. 15 (that is, still within the introductory chapter), Grafen and Hails refer to the t-distribution (citing an appendix of their book). Three pages later, in a lovely geometric interpretation of the meaning of total variation of one's data, they remind the reader of the Pythagorean theorem, in much more detail than they spend on the t-distribution. Most of us, however, learned the Pythagorean theorem long before we learned about the t-distribution.

If you already understand the t-distribution as well as you understand the Pythagorean theorem, you will likely find this chapter a bit redundant (but I encourage you to look through it at least once). On the other hand, if you don't, then this chapter is for you. My objective is to help you gain understanding and intuition about the major distributions used for generalized linear models, and to help you understand some tricks of computation and application associated with these distributions.

With the advent of generalized linear models, everyone's power to do statistical analysis increased. But this also means that one must understand the tools of the trade at a deeper level. Indeed, there are two secrets of statistics that are rarely, if ever, explicitly stated in statistics books; I will state them here at the appropriate moments.

The material and structure of this chapter are similar to those of chapter 3 of Hilborn and Mangel (1997). However, my colleagues Gretchen LeBuhn (San Francisco State University) and Tom Miller (Florida State University) noted that chapter's denseness. Here, I have tried to lighten the burden. We begin with a review of probability theory.
