Protein allozymes

In the 1960s a method known as starch gel electrophoresis of allozymic proteins was an extremely important breakthrough that allowed biologists to obtain direct information on some of the genetic properties of individuals, populations, species and higher taxa. Note that we are not yet talking about DNA markers but about proteins that are encoded by DNA. This distinction is extremely important, and to eliminate any confusion we will take a minute to review the relationship between DNA, genes and proteins. Prokaryotes, which lack cell nuclei, have their DNA arranged in a closed double-stranded loop that lies free within the cell's cytoplasm. Most of the DNA within the cells of eukaryotes, on the other hand, is organized into chromosomes that can be found within the nucleus of each cell; these constitute the nuclear genome (also referred to as nuclear DNA or nrDNA). Each chromosome is made up of a single DNA molecule that is functionally divided into units called genes. The site that each gene occupies on a particular chromosome is referred to as its locus (plural loci). At each locus, different forms of the same gene may occur, and these are known as alleles.

Each allele is made up of a specific sequence of DNA. The DNA sequences are determined by the arrangement of four nucleotides, each of which has a different chemical constituent known as a base. The four DNA bases are adenine (A), thymine (T), guanine (G) and cytosine (C), and these are linked together by a sugar-phosphate backbone to form a strand of DNA. In its native state, DNA is arranged as two strands of complementary sequences that are held together by hydrogen bonds in a double-helix formation (Figure 1.1). No two alleles have exactly the same DNA sequence, although the similarity between two alleles from the same locus can be very high.
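The base-pairing rule behind complementary strands (A with T, G with C) can be sketched in a few lines of Python. This is only an illustration: the function names and the example sequence are our own assumptions, not part of any particular library.

```python
# Watson-Crick base pairing: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the paired base for each position, in the same left-to-right order."""
    return "".join(COMPLEMENT[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the complementary strand written in its own 5'-to-3' direction."""
    return complement(strand)[::-1]


top_strand = "TGGAACTCGATA"  # hypothetical sequence, written 5' to 3'
print(complement(top_strand))          # partner bases, read 3' to 5'
print(reverse_complement(top_strand))  # the same partner strand, read 5' to 3'
```

Because the two strands run in opposite orientations, the partner strand is usually reported as the reverse complement, so that both sequences are written 5' to 3'.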

The function of many genes is to encode a particular protein, and the process in which genetic information is transferred from DNA into protein is known as gene expression. The sequence of a protein-coding gene will determine the structure of the protein that is synthesized. The first step of protein synthesis occurs when the coding region of DNA is transcribed into ribonucleic acid (RNA) through a process known as transcription. The RNA sequences, which are single stranded, are complementary to DNA sequences and have the same bases with the exception of uracil (U), which replaces thymine (T). After transcription, the introns (non-coding segments) are excised from the RNA transcript and the remaining RNA sequences are translated into protein sequences through a process known as translation.
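As a rough sketch of transcription, the RNA can be built either by pairing bases against the DNA strand that is being copied or, equivalently, by rewriting the opposite DNA strand with U in place of T. The function names and the six-base sequence below are hypothetical, and the removal of introns is not modelled.

```python
# Pairing used when RNA is built against a DNA template: A in DNA pairs with U in RNA.
DNA_TO_RNA_PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe_template(template_3_to_5: str) -> str:
    """Build the RNA complementary to a template strand that is read 3' to 5'."""
    return "".join(DNA_TO_RNA_PAIR[base] for base in template_3_to_5.upper())


def transcribe_coding(coding_5_to_3: str) -> str:
    """The RNA matches the non-template (coding) strand, with U replacing T."""
    return coding_5_to_3.upper().replace("T", "U")


template = "TACGGT"  # hypothetical template strand, written 3' to 5'
coding = "ATGCCA"    # its complementary strand, written 5' to 3'
print(transcribe_template(template))  # AUGCCA
print(transcribe_coding(coding))      # AUGCCA
```

Both routes give the same RNA sequence, which is why an RNA molecule is complementary to one DNA strand and, apart from uracil, identical to the other.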

Figure 1.1 (A) A DNA double helix. Each sequence is linked together by a sugar-phosphate backbone, and complementary sequences are held together by hydrogen bonds; 3' and 5' refer to the orientation of the DNA: one end of a sequence has an unreacted 5' phosphate group and the other end has an unreacted 3' hydroxyl group. (B) Denatured (single-stranded) DNA showing the two complementary sequences. The DNA becomes denatured following the application of heat or certain chemicals.

Translation is possible because each RNA molecule can be divided into triplets of bases (known as codons), most of which encode one of 20 different amino acids, which are the constituents of proteins (Table 1.2). Transcription and translation involve three types of RNA: ribosomal RNA (rRNA), messenger RNA (mRNA) and transfer RNA (tRNA). Ribosomal RNA is a major component of ribosomes, which are the organelles on which mRNA codons are translated into proteins, i.e. it is here that protein synthesis takes place. Messenger RNA molecules act as templates for protein synthesis by carrying the protein-coding information that was encoded in the relevant DNA sequence, and tRNA molecules incorporate particular amino acids into a growing protein by matching amino acids to mRNA codons (Figure 1.2).
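The codon-by-codon reading of mRNA can be illustrated with a short Python sketch. It assumes the standard nuclear genetic code of Table 1.2, written here with the conventional one-letter amino acid abbreviations (for example M for Met and L for Leu) and with "*" marking a stop codon. The `translate` function and the example mRNA are hypothetical and ignore the roles of ribosomes and tRNA.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes for all 64 codons in the conventional UCAG order;
# "*" marks the three stop codons.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate from the first AUG codon up to (not including) the first stop codon."""
    mrna = mrna.upper()
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is translated
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("GGAUGCUUGCAUGGUAAGC"))  # MLAW
```

In this example the reading frame is set by the first AUG, so the codons read are AUG, CUU, GCA and UGG (Met, Leu, Ala, Trp), followed by the stop codon UAA.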

Specific combinations of amino acids give rise to polypeptides, which may form either part or all of a particular protein or, in combination with other molecules, a protein complex. If the DNA sequences from two or more alleles at the same locus are sufficiently divergent, the corresponding RNA triplets will encode different amino acids and this will lead to multiple variants of the same protein. These variants are known as allozymes. However, not all changes in DNA sequences will result in different proteins. Table 1.2 shows that there is some redundancy in the genetic code, e.g. leucine is specified by six different codons. This redundancy means that it is possible for two different DNA sequences to produce the same polypeptide product.
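The effect of this redundancy can be shown with three short, hypothetical mRNA sequences standing in for alleles at one locus. The sketch below uses only a handful of codons copied from Table 1.2; the sequences and function names are assumptions made for illustration.

```python
# Partial codon table: only the codons needed for this example (from Table 1.2).
CODONS = {
    "AUG": "Met",
    "CUU": "Leu",
    "CUC": "Leu",
    "CCU": "Pro",
    "GCU": "Ala",
    "GCC": "Ala",
}


def amino_acids(mrna: str) -> list:
    """Split an mRNA into codons and look up the amino acid for each one."""
    return [CODONS[mrna[i:i + 3]] for i in range(0, len(mrna), 3)]


def same_polypeptide(mrna_a: str, mrna_b: str) -> bool:
    """True when two sequences encode an identical chain of amino acids."""
    return amino_acids(mrna_a) == amino_acids(mrna_b)


allele_1 = "AUGCUUGCU"  # Met-Leu-Ala
allele_2 = "AUGCUCGCC"  # two base changes, still Met-Leu-Ala
allele_3 = "AUGCCUGCU"  # one base change, now Met-Pro-Ala

print(same_polypeptide(allele_1, allele_2))  # True
print(same_polypeptide(allele_1, allele_3))  # False
```

Alleles 1 and 2 differ at two bases yet would be indistinguishable as proteins, whereas the single change in allele 3 alters the amino acid sequence.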

Table 1.2 The eukaryotic nuclear genetic code (RNA sequences): a total of 61 codons specify 20 amino acids, and an additional three stop-codons (UAA, UAG, UGA) signal the end of translation. This genetic code is almost universal, although minor variations exist in some microbes and also in the mitochondrial DNA (mtDNA) of animals and fungi


| Amino acid | Codons |
| --- | --- |
| Leucine (Leu) | UUA, UUG, CUU, CUC, CUA, CUG |
| Arginine (Arg) | CGU, CGC, CGA, CGG, AGA, AGG |
| Serine (Ser) | UCU, UCC, UCA, UCG, AGU, AGC |
| Alanine (Ala) | GCU, GCC, GCA, GCG |
| Valine (Val) | GUU, GUC, GUA, GUG |
| Threonine (Thr) | ACU, ACC, ACA, ACG |
| Proline (Pro) | CCU, CCC, CCA, CCG |
| Glycine (Gly) | GGU, GGC, GGA, GGG |
| Glutamine (Gln) | CAA, CAG |
| Aspartic acid (Asp) | GAU, GAC |
| Asparagine (Asn) | AAU, AAC |
| Glutamic acid (Glu) | GAA, GAG |
| Lysine (Lys) | AAA, AAG |
| Cysteine (Cys) | UGU, UGC |
| Tyrosine (Tyr) | UAU, UAC |
| Histidine (His) | CAU, CAC |
| Isoleucine (Ile) | AUU, AUC, AUA |
| Phenylalanine (Phe) | UUU, UUC |
| Methionine (Met) | AUG (a) |
| Tryptophan (Trp) | UGG |

(a) Codes for Met when within the gene and signals the start of translation when at the beginning of the gene.


Allozymes as genetic markers

The first step in allozyme genotyping is to collect tissue samples or, in the case of smaller species, entire organisms. These samples are then ground up with appropriate buffer solutions to release the proteins into solution, and the allozymes then can be visualized following a two-step process of gel electrophoresis and staining. Electrophoresis refers here to the process in which allozymes are separated in a solid medium such as starch, using an electric field. Once an electric charge is applied, molecules will migrate through the medium at different rates depending on the size, shape and, most importantly, electrical charge of the molecules, characteristics that are determined by the amino acid composition of the allozymes in question. Allozymes then can be visualized by staining the gel with a reagent that will acquire colour in the presence of a particular, active enzyme. A coloured band will then appear on the gel wherever the enzyme is located. In this way, allozymes can be differentiated on the basis of their structures, which affect the rate at which they migrate through the gel during electrophoresis.

Figure 1.2 DNA codes for RNA via transcription, and RNA codes for proteins via translation.

Genotypes that are inferred from allozyme data provide some information about the amount of genetic variation within individuals; if an individual has only one allele at a particular locus then it is homozygous, but if it has more than one allele at the same locus then it is heterozygous (Figure 1.3). Furthermore, if enough individuals are characterized then the genetic variation of populations can be quantified and the genetic profiles of different populations can be compared. This distinction between individuals and populations will be made repeatedly throughout this book because it is fundamental to many applications of molecular ecology. Keep in mind that data are usually collected from individuals, but if the sample size from any given population is big enough then we often assume that the individuals collectively provide a good representation of the genetic properties of that population.

Figure 1.3 Diagrammatic representation of part of a chromosome, showing which alleles are present at three loci in Individual 1 and Individual 2. Individual 1 is homozygous at loci 1 and 3 (AA in both cases) and heterozygous at locus 2 (AS). Individual 2 is homozygous at locus 1 (BB) and heterozygous at locus 2 (BC) and locus 3 (AB).
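The homozygous/heterozygous distinction can be made concrete by encoding the two individuals described in the Figure 1.3 caption. The sketch below is only illustrative: the data structure and function names are our own, and the allele labels are copied from the caption as printed.

```python
# Genotypes as listed in the Figure 1.3 caption: two alleles per locus per individual.
genotypes = {
    "Individual 1": {"Locus 1": ("A", "A"), "Locus 2": ("A", "S"), "Locus 3": ("A", "A")},
    "Individual 2": {"Locus 1": ("B", "B"), "Locus 2": ("B", "C"), "Locus 3": ("A", "B")},
}


def zygosity(alleles: tuple) -> str:
    """An individual is homozygous at a locus when both alleles are the same."""
    return "homozygous" if len(set(alleles)) == 1 else "heterozygous"


def proportion_heterozygous(loci: dict) -> float:
    """Fraction of an individual's characterized loci that are heterozygous."""
    states = [zygosity(pair) for pair in loci.values()]
    return states.count("heterozygous") / len(states)


for name, loci in genotypes.items():
    for locus, pair in loci.items():
        print(name, locus, "".join(pair), zygosity(pair))
    print(name, "is heterozygous at", round(proportion_heterozygous(loci), 2), "of its loci")
```

Individual 1 is heterozygous at one of three loci and Individual 2 at two of three, which is the kind of within-individual variation that allozyme genotypes reveal.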

We will return to allozymes in subsequent chapters, but at this point it is enough to realize that the identification within populations of multiple allozymes (alleles) at individual loci was a seminal event because it provided the first snapshot of genetic variation in the wild. In 1966, one of the first studies based on allozyme data was conducted on five populations of the fruitfly Drosophila pseudoobscura. This revealed substantially higher levels of genetic variation within populations than were previously believed (Lewontin and Hubby, 1966). In this study 18 loci were characterized from multiple individuals, and in each population up to six of these loci were found to be polymorphic (having multiple alleles). There was also evidence of genetic variation within individuals, as revealed by the observed heterozygosity (Ho) values, which are calculated by averaging the heterozygosity values across all characterized loci (Table 1.3).
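The two summary statistics mentioned here, the proportion of polymorphic loci and the observed heterozygosity, can be sketched as follows. The genotypes are invented for illustration (four individuals, three loci) and are not data from Lewontin and Hubby (1966). A locus is treated as polymorphic simply when more than one allele appears in the sample, and the function names are our own.

```python
# Hypothetical sample from one population: four diploid individuals, three loci.
population = [
    {"locus_1": ("A", "A"), "locus_2": ("A", "B"), "locus_3": ("A", "A")},
    {"locus_1": ("A", "A"), "locus_2": ("B", "B"), "locus_3": ("A", "A")},
    {"locus_1": ("A", "A"), "locus_2": ("A", "B"), "locus_3": ("A", "C")},
    {"locus_1": ("A", "A"), "locus_2": ("A", "A"), "locus_3": ("A", "A")},
]
loci = sorted(population[0])


def is_polymorphic(sample: list, locus: str) -> bool:
    """A locus is polymorphic here if the sample contains more than one allele."""
    alleles = {allele for individual in sample for allele in individual[locus]}
    return len(alleles) > 1


def locus_heterozygosity(sample: list, locus: str) -> float:
    """Fraction of sampled individuals that are heterozygous at this locus."""
    heterozygotes = sum(len(set(ind[locus])) > 1 for ind in sample)
    return heterozygotes / len(sample)


def observed_heterozygosity(sample: list) -> float:
    """Ho: the per-locus heterozygosity averaged across all characterized loci."""
    return sum(locus_heterozygosity(sample, locus) for locus in loci) / len(loci)


n_polymorphic = sum(is_polymorphic(population, locus) for locus in loci)
print("polymorphic loci:", n_polymorphic, "of", len(loci))  # 2 of 3
print("Ho:", observed_heterozygosity(population))           # 0.25
```

Here two of the three loci are polymorphic, and the per-locus heterozygosities of 0, 0.5 and 0.25 average to an Ho of 0.25. Monomorphic loci contribute zeros to the average, which is one reason Ho values can be much lower than the proportion of polymorphic loci.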

Table 1.3 Levels of polymorphism and observed heterozygosity (Ho) at 18 enzyme loci calculated for five populations of Drosophila pseudoobscura (data from Lewontin and Hubby, 1966). This was one of the first studies to show that genetic variation in the wild is much higher than was previously believed.

| Population | Number of polymorphic loci | Proportion of polymorphic loci | Observed heterozygosity (Ho) |
| --- | --- | --- | --- |
| Strawberry Canyon | 6 | 6/18 = 0.33 | 0.148 |
| Wildrose | 5 | 5/18 = 0.28 | 0.106 |
| Cimarron | 5 | 5/18 = 0.28 | 0.099 |
| Mather | 6 | 6/18 = 0.33 | 0.143 |
| Flagstaff | 5 | 5/18 = 0.28 | 0.081 |

Although unarguably a major breakthrough in population genetics, and still an important source of information in molecular ecology, allozyme markers do have some drawbacks. One limitation is that, as we saw in Table 1.2, not all variation in DNA sequences will translate into variable protein products, because some DNA base changes will produce the same amino acid following translation. A wealth of information is contained within every organism's genome, and allozyme studies capture only a small portion of this. Less than 2 per cent of the human genome, for example, codes for proteins (Li, 1997). Acquiring allozyme data is also cumbersome because organisms often have to be killed before adequate tissue can be collected, and this tissue then must be stored at very cold temperatures (as low as -70°C), which is a logistical challenge in most field studies. These drawbacks can be overcome by using appropriate DNA markers, which are now the most common source of data in molecular ecology because they can potentially provide an endless source of information, and they also allow a more humane approach to sampling study organisms. In the following sections, therefore, we shall switch our focus from proteins to DNA.
