These notes are provided to help direct your study from the textbook. They are not designed to explain all aspects of the material in great detail; that is what class time and the book are for. If you were to study only these notes, you would not learn enough genetics to do well in the course.
Genetics Notes
Chapter 23 -- Population and Evolutionary Genetics
Theodosius Dobzhansky - "Nothing in biology makes sense except in
the light of evolution."
Evolution cannot be understood except in the light of genetics. Evolutionary change can be defined as a change in gene frequencies within a population over time.
A population, or deme, is a community of individuals linked by bonds of mating and parenthood.
A Mendelian population is a group of interbreeding, sexually
reproducing individuals.
A species is a group of actually or potentially interbreeding natural populations that are reproductively isolated from other such groups.
The evolutionary unit is the population or, in some cases, the
species. It is the genome of the population that changes over time and not that of the individual. Much of the mathematical description of genetic
changes in populations was developed during the 1920's and 1930's by Fisher, Haldane, and Wright.
The first step in characterizing the genome of a population is the
calculation of allelic frequencies and genotypic frequencies (pages 678-680).
The phenotypic distribution of the MN blood type among 200 people . . .
Type M (MM) = 88 f(MM) = 88/200 = .44
Type MN (MN) = 88 f(MN) = 88/200 = .44
Type N (NN) = 24 f(NN) = 24/200 = .12
f(M) = (# of M alleles) / (total # of alleles)
(It is easier to think in terms of alleles rather than genotypes.)
p = f(M) = [2(88) + 88] / [2(200)] = 264/400 = .66
q = f(N) = [2(24) + 88] / [2(200)] = 136/400 = .34
As there are only two alleles in this example, q = 1 - p, because
p + q = 1. The sum of the frequencies of all the alleles must add to equal one.
An alternative way of calculating gene frequencies is . . .
p = f(M) = [(MM) + 0.5*(MN)] / individuals = (88 + 44)/200 = 0.66
q = f(N) = [(NN) + 0.5*(MN)] / individuals = (24 + 44)/200 = 0.34
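The allele-counting method above can be sketched in a few lines of Python. This is only an illustration; the function name allele_frequencies is made up for these notes and is not from the textbook.

```python
def allele_frequencies(n_mm, n_mn, n_nn):
    """Return (p, q) = (f(M), f(N)) from the three genotype counts."""
    total_alleles = 2 * (n_mm + n_mn + n_nn)   # each individual carries 2 alleles
    p = (2 * n_mm + n_mn) / total_alleles      # MM gives 2 M alleles, MN gives 1
    q = (2 * n_nn + n_mn) / total_alleles      # NN gives 2 N alleles, MN gives 1
    return p, q


# MN blood type counts from the example above
p, q = allele_frequencies(88, 88, 24)
print(round(p, 2), round(q, 2))   # 0.66 0.34
```

Counting alleles this way gives the same answer as the genotype-frequency shortcut, since each heterozygote contributes half of its alleles to each class.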
In 1908, Hardy and Weinberg independently discovered that an equilibrium in allelic and genotypic frequencies will arise in a diploid population
if certain conditions are met. This is called the Hardy-Weinberg equilibrium and it has three important facets.
- the allelic frequencies at an autosomal locus will not change from one generation to the next
- the genotypic frequencies of a population are determined on the basis of the allelic frequencies
- if the equilibrium is disturbed, it will be reestablished after just one generation of random mating
Assumptions of Hardy-Weinberg equilibrium
(These assumptions must hold if the Hardy-Weinberg equilibrium is to
occur.)
- Random mating
This means that any individual has an equal chance of
mating with any other individual.
Because of this, you can predict the probability of any two
genotypes mating.
e.g. if the MM genotype makes up 44% of the population, then
the probability of two individuals with the MM genotype mating
is .44 * .44 = .1936
Deviations occur due to
- assortative mating - likes mate with likes
- disassortative mating - unlikes mate
- inbreeding - individuals are more likely to mate with related
individuals than other members of the population
- outbreeding - individuals systematically exclude relatives as
potential mates
Deviations from random mating will change the genotypic
frequencies (inbreeding and assortative mating increase homozygotes;
outbreeding and disassortative mating decrease them) but will not alter allelic
frequencies.
- No selection
No genotype has a better chance of survival and reproduction
than any other genotype.
- Large population size
Each new generation is a sample of the previous generation's
gametes. A small sample size (small population) is more likely
to show random fluctuation in sampling and hence greater
deviation from generation to generation.
When populations are small and allelic frequencies are affected
more by chance, the changes are referred to as drift.
- No mutation or migration
Mutation and migration result in gain or loss of alleles from a
population. This gain or loss will disturb the equilibrium.
It would seem that almost no population would ever be in equilibrium
given these assumptions. However, in practice . . .
- mutation rates are small to negligible
- even a fairly small population, say 100 individuals, will still
give a good fit to Hardy-Weinberg
- also, since the equilibrium is established after one generation,
the population will come back to equilibrium within one
generation after being disturbed
Testing for Hardy-Weinberg
This is the first test that a population biologist will do after
determining the allelic and genotypic frequencies within a population.
To determine if a population is in Hardy-Weinberg equilibrium, you want
to know if the genotypes (MM, MN, NN) occur with frequencies of p^2, 2pq,
and q^2.
p = f(M) = .66     q = f(N) = .34     p + q = 1
p^2 = the expected probability of an individual having the genotype MM
q^2 = the expected probability of an individual having the genotype NN
2pq = the expected probability of an individual having the genotype MN
All of this follows from (p + q)^2 = p^2 + 2pq + q^2
                MM            MN            NN            Total
observed        88            88            24            200
expected ratio  p^2           2pq           q^2           1
expected #      (.436)(200)   (.449)(200)   (.116)(200)   200
                = 87.1        = 89.8        = 23.1
(O - E)^2/E     .009          .036          .035
The chi-square value is the sum across all classes.
.009 + .036 + .035 = 0.08
To compare to the table value, we need the degrees of freedom.
The d.f. = number of phenotypes - number of alleles = 3 - 2 = 1.
In this case the calculated value is below the critical value of 3.841 (1 d.f., 5% level),
thus we fail to reject the null hypothesis that the population is in
Hardy-Weinberg equilibrium.
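The test above can be sketched in Python as follows. The function name and the constant CRITICAL_1DF_05 are made up for illustration; the critical value is simply the standard chi-square table entry for 1 degree of freedom at the 5% level.

```python
CRITICAL_1DF_05 = 3.841   # chi-square table value, 1 d.f., 5% level


def hardy_weinberg_chi_square(n_mm, n_mn, n_nn):
    """Return (expected counts, chi-square) for a two-allele locus."""
    n = n_mm + n_mn + n_nn
    p = (2 * n_mm + n_mn) / (2 * n)       # allelic frequency of M
    q = 1 - p                             # allelic frequency of N
    expected = [p * p * n, 2 * p * q * n, q * q * n]
    observed = [n_mm, n_mn, n_nn]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return expected, chi2


expected, chi2 = hardy_weinberg_chi_square(88, 88, 24)
print([round(e, 1) for e in expected], round(chi2, 3))
print("fail to reject H-W" if chi2 < CRITICAL_1DF_05 else "reject H-W")
```

With unrounded expected counts the statistic comes out at about 0.077. The hand calculation differs slightly because it rounds the expected counts before computing each term, but the conclusion is the same.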
The reverse of Hardy-Weinberg can be used to estimate allelic frequencies
in the case of dominant alleles in which the dominant homozygote cannot be
distinguished from the heterozygote. This works even for something as
severely debilitating as PKU. Because PKU babies are so rare (1/10,000),
selection has a negligible effect on genotypic frequencies.
PP     Pp     pp = PKU phenotype
f(pp) = q^2 = 0.0001 (1/10,000)
q = 0.01 so p = 1 - q = .99
homozygous normal = p^2 = (.99)(.99) = .98
heterozygous = 2pq = 2(.99)(.01) = .02
homozygous PKU = q^2 = .0001
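The same reverse calculation can be written as a short Python sketch. It assumes the population is in Hardy-Weinberg equilibrium at this locus, and the function name carrier_frequency is made up for illustration.

```python
import math


def carrier_frequency(affected_frequency):
    """Estimate q and the heterozygote frequency from the frequency of
    recessive homozygotes, assuming Hardy-Weinberg proportions."""
    q = math.sqrt(affected_frequency)   # f(pp) is q squared, so q is its square root
    p = 1 - q
    return q, 2 * p * q


q, carriers = carrier_frequency(1 / 10000)   # PKU incidence from the notes
print(f"q = {q:.2f}, carriers = {carriers:.4f}")
```

Note that carriers (about 2 in 100) are roughly 200 times more common than affected individuals (1 in 10,000).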
Non-random mating
Two components to non-random mating
- assortative or disassortative mating (sometimes called positive assortative and negative assortative mating, respectively) - mates are chosen on the
basis of some phenotypic characteristic such that certain matings
occur more commonly than would be predicted by chance
- inbreeding or outbreeding - mates are chosen based upon the
degree of relatedness.
Mates are more closely related or less closely related than
would be predicted by chance.
For example, marriages between first cousins are examples of
inbreeding.
The effects are similar in that inbreeding and assortative
mating increase homozygosity, while outbreeding and disassortative
mating decrease homozygosity. However, assortative and
disassortative mating will only affect the loci involved in
expression of the phenotypic trait and loci that are linked to
them. Inbreeding and outbreeding affect all loci equally,
across the entire genome.
Inbreeding
This comes about in two ways.
- the systematic choice of relatives as mates
- subdivision of the population where individuals have a narrower
choice of mates and thus are forced to mate with relatives
An inbred individual is one whose parents are related, meaning that
there is a common ancestry in the family tree. The most obvious effect
of inbreeding is the expression of hidden recessives. Estimates from a
variety of studies suggest that each human carries about four (range 2 - 7)
recessive lethal equivalents. Four lethal equivalents means four
alleles that each cause 100% lethality when homozygous, or eight alleles that
each cause 50% lethality.
Only rarely does an outbred individual receive the same recessive lethal
from each parent.
Inbreeding often results in spontaneous abortions, fetal death, and
congenital deformities (inbreeding depression).
For unrelated parents, 4-6% of the offspring will carry some sort of
genetic defect. In marriages between first cousins, 16-28% of the
offspring will carry some sort of genetic defect (table 23-3).
There are two types of homozygosity
- allozygosity - two alleles are alike but unrelated. They
represent two separate mutational events.
- autozygosity - two alleles have identity by descent, meaning
they are identical copies of the same ancestral allele
The inbreeding coefficient (F) can be defined as the probability of
autozygosity, or the probability that any two alleles at a locus are
identical by descent.
F can range from 0 to 1. F = 1 would represent the doubling of a
gamete and would be autozygous at all loci.
Listed below are the calculations of expected genotypic frequencies
given (F), the level of inbreeding for the population
              AA                  Aa             aa
F = 0         p^2                 2pq            q^2
any F         p^2(1 - F) + pF     2pq(1 - F)     q^2(1 - F) + qF
F = 1         p^2(0) + pF = p     2pq(0) = 0     q^2(0) + qF = q
You can see from the formulas that, as F increases, the proportion of
heterozygotes goes down. However, the allelic frequencies do not change. These formulas are algebraically the same as the formulas on page 684 (equation 23.10).
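A short Python sketch can confirm both points: heterozygotes decline as F rises, while the allelic frequency stays put. The function name is made up for illustration, and p = 0.66 is borrowed from the MN example purely as a sample value.

```python
def inbred_genotype_frequencies(p, F):
    """Expected frequencies of (AA, Aa, aa) for allele frequency p and
    inbreeding coefficient F."""
    q = 1 - p
    f_AA = p * p * (1 - F) + p * F
    f_Aa = 2 * p * q * (1 - F)
    f_aa = q * q * (1 - F) + q * F
    return f_AA, f_Aa, f_aa


p = 0.66
for F in (0.0, 0.25, 1.0):
    f_AA, f_Aa, f_aa = inbred_genotype_frequencies(p, F)
    f_A = f_AA + 0.5 * f_Aa          # allelic frequency recovered from genotypes
    print(F, round(f_AA, 4), round(f_Aa, 4), round(f_aa, 4), round(f_A, 2))
```

In every row the last column is still 0.66: inbreeding only redistributes alleles among genotypes.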
Pedigree analysis
This is used to determine (F), the inbreeding coefficient for an
individual and it implies the same effect for the individual as it does
for the population.
To construct a path diagram, you must eliminate all individuals that
cannot contribute to inbreeding. You must draw all the paths through
which an allele can be passed to an individual.
Path diagram rules
1) all possible paths must be counted
2) in any path, an individual can be counted only once
3) every path must have one and only one ancestor
The inbreeding coefficient for a population can be estimated from the
observed and expected genotype frequencies, because the heterozygosity
(proportion of heterozygotes) decreases as the inbreeding coefficient
increases. The formula is
F = (Ho - Hf) / Ho
where Ho is the expected frequency of heterozygotes, which is given by
2*p*q, and Hf is the observed frequency of heterozygotes. Note that as
the observed frequency of heterozygotes (Hf) decreases toward zero, the
ratio approaches 1.0.
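As a minimal sketch, the estimate can be computed like this. The function name and the sample numbers (p = 0.5, observed heterozygosity 0.40) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def inbreeding_coefficient(p, observed_het):
    """Estimate F = (Ho - Hf) / Ho, where Ho = 2pq is the expected
    heterozygosity and Hf is the observed heterozygosity."""
    expected_het = 2 * p * (1 - p)
    return (expected_het - observed_het) / expected_het


# hypothetical population: p = 0.5, so Ho = 0.5; observed Hf = 0.40
F = inbreeding_coefficient(0.5, 0.40)
print(round(F, 2))   # 0.2
```

If no heterozygotes are observed at all the estimate is 1.0, and if the observed and expected values agree it is 0.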
Mutation
Picture a mutation as A ----> a
The forward mutation rate is u, while the backward mutation rate is v (figure 23.8).
       u
  A -----> a
    <-----
       v
p_n is the frequency of A in generation n
q_n is the frequency of a in generation n
q_(n+1) is the frequency of a in generation n+1 (the next generation)
q_(n+1) = q_n + u*p_n - v*q_n where
u*p_n = forward mutation rate times the frequency of A, and
v*q_n = backward mutation rate times the frequency of a
delta q = q_(n+1) - q_n
        = u*p_n - v*q_n
Forward mutation rates are on the order of 1 x 10^-5 and backward
mutation rates are usually 10 to 100 times lower.
delta q = (1 x 10^-5)(p_n) - (1 x 10^-7)(q_n)
The change in allelic frequency from one generation to the next is very
small.
Setting delta q equal to zero and solving for q_n yields qhat, which is the value of q when the population reaches equilibrium.
u*p_n = v*q_n          substituting p_n = 1 - q_n
u - u*q_n = v*q_n
u = u*q_n + v*q_n
q_n = u/(u+v)
In the absence of other perturbing forces, p and q will eventually reach an
equilibrium that is determined by the two mutational rates.
qhat = u/(u+v) phat = v/(u+v)
As u gets larger the equilibrium shifts to higher frequencies of a. If
u = v, then forward and backward mutation rates are the same, and
qhat = phat = 0.5. However, to significantly change allelic frequencies due to mutational
pressure, or for a population to reach equilibrium, requires thousands
of generations.
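To get a feel for how slow mutation pressure is, the recurrence can be iterated in Python. This is a rough sketch: the function names are made up, and the starting point (q = 0) and target (q = 0.5) are hypothetical values chosen for illustration. The rates are the ones used above.

```python
def next_q(q, u, v):
    """One generation of mutation: new q = q + u*p - v*q."""
    p = 1.0 - q
    return q + u * p - v * q


def equilibrium_q(u, v):
    """Equilibrium frequency of a: qhat = u / (u + v)."""
    return u / (u + v)


def generations_to_reach(q_start, q_target, u, v, max_generations=1_000_000):
    """Count generations until q first reaches q_target (must be below qhat)."""
    q = q_start
    for generation in range(max_generations):
        if q >= q_target:
            return generation
        q = next_q(q, u, v)
    return None   # target not reached within max_generations


u, v = 1e-5, 1e-7                           # forward and backward rates
q_hat = equilibrium_q(u, v)
n = generations_to_reach(0.0, 0.5, u, v)    # hypothetical start and target
print(f"qhat = {q_hat:.4f}; generations from q = 0 to q = 0.5: {n}")
```

With these rates the count comes out in the tens of thousands of generations, which is why mutation alone is considered a weak force for changing allelic frequencies.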
Migration
Assume two populations, both having alleles A and a at the A locus
p1 = f(A) in population 1     q1 = f(a) in population 1
p2 = f(A) in population 2     q2 = f(a) in population 2
Assume that members of population 2 migrate to population 1, such that
migrants now make up a proportion m of the next generation,
and natives make up a proportion 1-m of the next generation (figure 23.10).
The frequency of a in the combined population, qc, will be a weighted average
qc = m*q2 + (1-m)q1
with m*q2 being the contribution of migrants and (1-m)q1 being that of natives.
qc = q1 + m(q2 - q1)
delta q = qc - q1
delta q = m(q2 - q1)
The equilibrium value of q will be reached whenever
a) migration stops
b) or the allelic frequency of both populations becomes the same.
These equations can be used to estimate gene flow from one population
to the next, like from White U.S. populations to Black U.S.
populations.
For example, at the Duffy blood group, Europeans (the source of the U.S.
White population) have either allele Fya or Fyb.
In West Africa (the source of U.S. Black populations) the frequency of the Fyo allele
is essentially one hundred percent. By measuring the frequency of Fya or Fyb
in U.S. Black populations, we can estimate the rate of gene flow from the
White population to the Black population.
Allelic frequency of Fya in various populations
Source of Black population (native population)
    Liberia              .005
    Ghana                .01     q1
U.S. Black population (conglomerate population)
    Charleston, S.C.     .02
    rural Georgia        .04
    Detroit              .13     qc
Source of White population (migrant population)
    Western Europe       .42     q2
Solve the equation for m (proportion of migrants), which will give the
total extent of migration from the White population to the Black population.
m = (qc - q1) / (q2 - q1)
m = (0.13 - 0.01) / (0.42 - 0.01) = 0.12/0.41 = .29
These data indicate that about 29% of the alleles in the U.S. Black
population sampled in Detroit are the result of marriages and subsequent gene flow with the
White population.
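The same estimate can be repeated for each U.S. sample in the table with a short Python sketch. The function and variable names are made up for illustration; the frequencies are the ones listed above, with Ghana taken as q1 and Western Europe as q2.

```python
def migration_proportion(q_native, q_conglomerate, q_migrant):
    """Solve qc = q1 + m(q2 - q1) for m."""
    return (q_conglomerate - q_native) / (q_migrant - q_native)


q1 = 0.01    # Ghana (native population)
q2 = 0.42    # Western Europe (migrant population)
samples = {"Charleston, S.C.": 0.02, "rural Georgia": 0.04, "Detroit": 0.13}

for place, qc in samples.items():
    m = migration_proportion(q1, qc, q2)
    print(f"{place:18s} m = {m:.3f}")
# Charleston, S.C.   m = 0.024
# rural Georgia      m = 0.073
# Detroit            m = 0.293
```

The estimate of m depends strongly on which conglomerate sample is used, so the 29% figure should be read as applying to the Detroit sample.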
Small population size
The zygotes of every generation are a sample of gametes from the parent
generation. Errors in sampling gametes from a small parental
population cause the allelic frequencies to fluctuate from generation
to generation. This process is called random genetic drift.
As the allele frequency fluctuates from generation to generation, the
possibility exists that it might fluctuate to either 0 or 1.0, in which
case, all individuals are homozygous and no further changes in allelic
frequency can occur unless mutation or migration reintroduces the lost
allele.
By using computer simulations based upon sampling theory and random
processes, we can track allelic frequency in 100 populations
simultaneously for different population sizes. (figure 23.11)
- start with 100 populations, with 100 individuals per population,
q = 0.5
- after a number of generations, record the q for each population
- and plot the number of populations at each q
Between N and 2N generations, the curve flattens out and populations
are lost to fixation at a constant rate of about 1/(2N) per generation.
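A simulation along these lines can be sketched in Python. This is not the program behind figure 23.11; it assumes a simple scheme in which each new generation is formed by drawing 2N gametes at random from the previous generation's allelic frequency, and all function names and the seed are made up for illustration.

```python
import random


def drift_one_generation(q, n_individuals, rng):
    """Draw 2N gametes at random; each carries allele a with probability q."""
    n_gametes = 2 * n_individuals
    count_a = sum(1 for _ in range(n_gametes) if rng.random() < q)
    return count_a / n_gametes


def simulate_drift(n_populations=100, n_individuals=100, q0=0.5,
                   generations=100, seed=1):
    """Track q in many independent populations; return the final q values."""
    rng = random.Random(seed)          # fixed seed so runs are repeatable
    qs = [q0] * n_populations
    for _ in range(generations):
        qs = [drift_one_generation(q, n_individuals, rng) for q in qs]
    return qs


final_qs = simulate_drift()
n_lost = sum(1 for q in final_qs if q == 0.0)
n_fixed = sum(1 for q in final_qs if q == 1.0)
print(f"after 100 generations: {n_lost} lost a, {n_fixed} fixed a, "
      f"{len(final_qs) - n_lost - n_fixed} still segregating")
```

Once a population reaches q = 0 or q = 1 it stays there in this sketch, since there is no mutation or migration to reintroduce the lost allele. Running it with a smaller n_individuals should show populations reaching fixation sooner.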
Founder effects
When a population is initiated by a small, and therefore genetically
unrepresentative, sample of the main population, the genetic drift that
is observed is called a founder effect.
Examples are seen in many groups of organisms, even in humans.
Remember the movie "Mutiny on the Bounty"? - the population on Pitcairn
Island was formed from a small number of mutineers and Polynesians.
This population today has a unique mixture of Caucasian and Polynesian
features, some of which are rare in either parent population.
A bottleneck occurs when a population declines to a small number
of individuals and then builds back up. This usually causes a loss of
genetic diversity as seen for example in American bison.
Natural Selection
Up to now, we have been discussing several factors that affect allelic
frequencies, but these factors (migration, mutation, inbreeding, etc.) do
not produce individuals that are better adapted to their environment.
Natural selection is a relentless process that eliminates the less suitable organisms in an environment. Natural selection, or simply
selection, is a process whereby one genotype leaves more offspring than
another genotype. Selection is determined by reproductive success, which
has two components - fertility and survival. The genotype that leaves the most offspring is given the highest value for reproductive success. This value is called the fitness. The letter w is
usually used to signify fitness and can vary from 0 to 1.
Fitness is always relative to the other genotypes in the population and
can vary from time to time. A variety of factors can decrease the fitness
value w to below 1. The sum of the forces provides a selection
coefficient, which is usually denoted by the letter S.
w = 1 - S
Components of fitness
Selection can act at any stage of an organism's life cycle.
- zygotic selection is the survival component
This can be either prenatal, juvenile, or adult.
- gametic selection is the differential success of an individual's
gametes
In male mice, an individual that is heterozygous at the t-locus,
Tt, will have 95% of their gametes containing the t allele.
- sexual selection means that some genotypes may mate more often than
others
e.g. large male deer mate more often than small male deer
- fecundity selection means that some genotypes may be more fertile
than other genotypes
Types of selection
Directional selection works by continuously removing individuals
from one end of the phenotypic distribution.
e.g. during the Eocene, the oldest member of the horse family appeared
in the fossil record - Hyracotherium, about one foot high at the shoulder.
Today's horses are much taller, and represent a continuous directional
selection for taller horses.
Stabilizing selection works by constantly removing individuals from
both ends of a phenotypic distribution, so that the mean is not shifted.
This is the more common situation, and occurs as a population becomes
optimally adapted to an unchanging environment. For example, directional
selection favored an increase in the length of the giraffe's neck.
However, the length today appears to be unchanging, thus stabilizing
selection is acting to maintain the length of the neck.
Disruptive selection works by removing individuals from the center
of the phenotypic distribution while favoring individuals on either end.
Disruptive selection is seen in the appearance of different discrete forms
or morphs in the same species. An example is the polymorphic butterfly
Papilio dardanus. This butterfly mimics several distasteful species of
butterflies by its color pattern. The dominance relationships among the
genotypes are such that an individual will mimic any number of models, but
intermediates do not occur. Intermediates would not resemble any of the
models and would be rapidly eaten.
Selection against the homozygous recessive
define initial condition
allow selection to act
calculate allelic frequency after selection qn+1
calculate delta q
set delta q equal to 0 and find q hat
                         AA        Aa        aa
initial frequencies      p^2       2pq       q^2
fitness (W)              1         1         1-S
ratio after selection    p^2(W)    2pq(W)    q^2(W)
                         p^2(1)    2pq(1)    q^2(1-S)
The sum of the ratios after selection is p^2 + 2pq + q^2(1-S)
p^2 + 2pq + q^2 - Sq^2 = 1 - Sq^2 = mean W
mean W = the mean fitness of the population
It represents the sum of the fitnesses of the genotypes, each multiplied by its
proportion.
Finally, we can calculate the genotypic frequencies after selection.
genotypic frequency = (original frequency)(fitness) / (mean fitness of population)
For AA, this is p^2(W)/mean W = p^2(1)/(1 - Sq^2)
                         AA             Aa             aa
genotypic frequencies    p^2/mean W     2pq/mean W     q^2(1-S)/mean W
after selection
q_(n+1) = freq(aa) + 1/2 freq(Aa)
        = q^2(1-S)/mean W + 1/2(2pq/mean W)
        = q^2(1-S)/(1-Sq^2) + pq/(1-Sq^2)
        = (q^2 - Sq^2 + pq)/(1 - Sq^2)
        = q(q - Sq + p)/(1 - Sq^2)          substituting p = 1 - q
        = q(q - Sq + 1 - q)/(1 - Sq^2)
        = q(1 - Sq)/(1 - Sq^2)
We can simplify things a little by assuming that S is equal
to 1, meaning that the aa genotype is lethal.
q_(n+1) = q(1 - q)/(1 - q^2)
        = q/(1 + q)
delta q = q_(n+1) - q
        = q/(1 + q) - q
        = -q^2/(1 + q)
Setting delta q = 0 shows that there is no change, or an equilibrium exists, only when q = 0.
Thus selection will act to drive a to 0, or f(A) to fixation.
q is changing in proportion to q^2, the relative frequency
of the recessive homozygote. However, as q becomes small, most individuals
carrying a will be heterozygotes and not selected against. Thus, selection
against a recessive homozygote may take many, many generations to completely
remove a recessive lethal mutation from the population.
For example, in phenylketonuria q^2 = .0001 but 2pq = .02
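Iterating the recurrence in Python shows just how slowly a rare recessive lethal is removed. This is a sketch only; the function names are made up, and the starting frequency q = 0.01 is the PKU value used above.

```python
def next_q_recessive_selection(q, s):
    """New q after one generation of selection against aa only:
    q(1 - S*q) / (1 - S*q*q)."""
    return q * (1.0 - s * q) / (1.0 - s * q * q)


def q_after(q_start, s, generations):
    """Apply the recurrence for the given number of generations."""
    q = q_start
    for _ in range(generations):
        q = next_q_recessive_selection(q, s)
    return q


q0 = 0.01                       # starting frequency, as for PKU
for n in (1, 10, 100, 1000):
    print(n, round(q_after(q0, 1.0, n), 5))
```

Even with S = 1, it takes 100 generations just to halve q from .01 to .005, because almost all copies of a are carried by heterozygotes.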
Types of selection models
There are only 4 basic selection models based upon how fitness values
are assigned. (table 23.7)
                                   AA      Aa      aa
1) against homozygous recessive    1       1       1-S
2) against heterozygotes           1       1-S     1
3) against one allele              1       1-S1    1-S2
4) against homozygotes             1-S1    1       1-S2
In models 1 and 3, selection is against the a allele. In model 1, we saw
that selection may take an infinite number of generations to remove a
because a can 'hide' in the heterozygote. In model 3, however, the
heterozygote is also selected against and a will be removed much faster
because the a allele can no longer hide. An example is thalassemia.
Under model 3, a completely dominant lethal would be removed in one
generation. Severely deleterious dominants are also removed very
quickly, for example retinoblastoma.
Model 2, where selection is against the heterozygote, is interesting
because it drives the rarer allele to extinction. An equilibrium does exist, but only if q = p = 0.5. If the equilibrium is disturbed, it moves toward extinction of one allele or the other. This is called an unstable equilibrium. An example is the condition
erythroblastosis, or maternal-fetal incompatibility at the Rh locus.
Rh+Rh- babies born to Rh-Rh- mothers can
develop a condition where the mother's antibodies attack the child's blood.
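The unstable equilibrium in model 2 can be illustrated with a small Python sketch. The function names and the value S = 0.5 are hypothetical, chosen only to make the behavior easy to see; the fitness values are those of model 2 (AA = 1, Aa = 1 - S, aa = 1).

```python
def next_q_underdominance(q, s):
    """New frequency of a after one generation of selection against
    heterozygotes (fitnesses 1, 1 - S, 1)."""
    p = 1.0 - q
    mean_w = p * p + 2 * p * q * (1.0 - s) + q * q
    return (q * q + p * q * (1.0 - s)) / mean_w


def iterate(q, s, generations):
    """Apply the recurrence for the given number of generations."""
    for _ in range(generations):
        q = next_q_underdominance(q, s)
    return q


s = 0.5   # hypothetical selection coefficient
for q_start in (0.49, 0.50, 0.51):
    print(q_start, round(iterate(q_start, s, 200), 4))
```

A population starting just below 0.5 loses the a allele, while one starting just above 0.5 fixes it. Only a start at exactly 0.5 stays put, which is what makes the equilibrium unstable.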
Model 4, where selection is against homozygotes, demonstrates the
heterozygote advantage. This is an important model because it is often invoked to explain the maintenance of allelic polymorphism in a population. We will see that at equilibrium, both alleles are maintained in the
population. An example is sickle cell anemia. In West Africa, where malaria is common, an individual is better off as a heterozygote. You can derive these equations exactly as we did in the last series for selection against the homozygous recessive.
delta q = pq(S1p - S2q)/mean W
        = 0 when p = 0, q = 0, or S1p - S2q = 0
p = 0 and q = 0 are trivial and the case of interest is when
S1p - S2q = 0
S1p = S2q          substituting p = 1 - q
S1(1 - q) = S2q
S1 - S1q = S2q
S1 = (S2 + S1)q
qhat = S1/(S1 + S2)     phat = S2/(S1 + S2)
This equilibrium is stable. After any perturbation the population returns to
equilibrium and, regardless of the starting condition (as long as both alleles are present), the allelic frequencies
always converge to qhat and phat.
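The stability of this equilibrium can be checked numerically with a short Python sketch. The function names and the selection coefficients S1 = 0.1 and S2 = 0.3 are hypothetical values chosen for illustration; they give qhat = 0.1/(0.1 + 0.3) = 0.25.

```python
def next_q_overdominance(q, s1, s2):
    """New frequency of a after one generation with fitnesses
    AA = 1 - S1, Aa = 1, aa = 1 - S2."""
    p = 1.0 - q
    mean_w = 1.0 - s1 * p * p - s2 * q * q
    return (q * q * (1.0 - s2) + p * q) / mean_w


def iterate(q, s1, s2, generations):
    """Apply the recurrence for the given number of generations."""
    for _ in range(generations):
        q = next_q_overdominance(q, s1, s2)
    return q


s1, s2 = 0.1, 0.3               # hypothetical selection coefficients
q_hat = s1 / (s1 + s2)          # predicted equilibrium
for q_start in (0.01, 0.25, 0.90):
    print(q_start, round(iterate(q_start, s1, s2, 500), 4), round(q_hat, 4))
```

Whether a starts rare or common, q ends up at the predicted qhat, so both alleles are maintained in the population.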
The application of population genetics to natural populations in an attempt to
study evolution has been called neo-Darwinism or the new synthesis. This dates back to the 1920's and 1930's.
As you can see from the models we have been working with, a natural
population would be exceedingly complex, especially when you consider the
effects of having a number of processes working simultaneously, having
more than two alleles at a locus, having hundreds to thousands of loci
affecting the fitness of the individual, and the unpredictable environmental
variations.
The process of evolution as outlined by Darwin has three major
steps.
- Variation is characteristic of virtually every group of animals and
plants. This arises from mutations. This variation is important
because it is the raw material that selection is going to work on.
- Every group of organisms overproduces offspring. In stable
populations, every adult replaces itself, but most adults produce
more than one offspring. Thus, most offspring die before they
reproduce. There is an overabundance of offspring.
- The most fit will survive. Among all the organisms competing for a
limited number of resources, only the organisms best suited to
obtain and utilize these resources will survive. To whatever
degree the characteristics of the most fit are inherited, the
"favored" traits will be passed on to the next generation.
Last update on 13 November 2005
Provide comments to Dwight Moore at mooredwi@emporia.edu
Return to the General Genetics Home Page at Emporia State University.