You may recall the half-sib mating design I described in the previous post, where I hope to measure the mean performance of progeny from distinct maternal parents. ANOVA works like this:
ANOVA for R. viscosum wild half-sib families, measured for mean rhizosphere acidification
| Source of Variation | Degrees of Freedom | Mean squares |
|---|---|---|
| Environment | e-1 | |
| Repetitions per Environment | (r-1)e | |
| R. viscosum HS families | (n-1) | MSHS families |
| R. viscosum HS families x Environment | (n-1)(e-1) | MSHS families x environment |
| Error | (n-1)(r-1)e | MSerror |
The sources of variation in an ANOVA, including error, are inherent to the experiment you set up. In my case there are 4 unique sources of variation: the different media pH environments where the half-sib seedlings are grown, repetitions within the environments, the half-sib families themselves, and the interaction between half-sib families and the environments they are grown in. Here we identify significant differences based on ratios of mean squares. A mean square is the sum of squares for a source of variation divided by its degrees of freedom; for the R. viscosum half-sib families, that is the sum of squared deviations of the family means from the overall mean, divided by (n-1). We then calculate an F-statistic to test the significance of each source of variation. To determine if there is a significant genotypic effect for your trait of interest in this mating design, you take the ratio:
MSHS families
---------------------
MSHS families x environment
The results of an AMOVA table are more commonly reported as proportions of the total variation, known as ϕ-statistics. A proportion is a more appropriate way to interpret marker data, as we are most interested in where the variation lies within a species. If our ϕ-statistic for among regions is 0.04 (low) while our ϕ-statistic for within populations is 0.60 (high), most of the genetic variation in our marker data is present within populations, with very little distinguishing the regions from one another. This scenario is common among outbreeding, wind-pollinated species, such as forest trees, that have high levels of heterozygosity.
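To make the arithmetic concrete, here is a minimal Python sketch of how each level's share of the total variation could be worked out from variance components. The component values and the name `variance_proportions` are made up for illustration, chosen only so the shares match the 0.04 and 0.60 in the example above. This follows the post's usage of reporting each level as a share of the total; AMOVA software may define its ϕ-statistics from the same components in a different way.

```python
def variance_proportions(components):
    """Return each level's share of the total variance."""
    total = sum(components.values())
    return {level: value / total for level, value in components.items()}


# Hypothetical variance components (not real marker data).
components = {
    "among regions": 0.2,
    "among populations within regions": 1.8,
    "within populations": 3.0,
}

shares = variance_proportions(components)
for level, share in shares.items():
    print(f"{level}: {share:.2f}")
```

With these made-up numbers, most of the variation sits within populations and almost none separates the regions, which is the pattern described for outbreeding species.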
The ratio of MSHS families to MSHS families x environment gives you an F-statistic for the effect of the half-sib families. The larger this value, the stronger the evidence for a genotypic effect.
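As a rough illustration of that ratio, the sketch below computes the two mean squares and their F ratio for a small balanced data set in plain Python. The acidification scores, the `data[env][rep][family]` layout, and the function name `family_f_ratio` are all assumptions made for this example, not my real measurements.

```python
# Hypothetical acidification scores laid out as data[env][rep][family]:
# 2 environments (e), 2 repetitions each (r), 3 half-sib families (n).
data = [
    [[4.0, 5.0, 6.0], [5.0, 5.0, 7.0]],  # environment 1
    [[3.0, 5.0, 5.0], [4.0, 6.0, 6.0]],  # environment 2
]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def family_f_ratio(data):
    """Return (MS families, MS families x environment, F) for balanced data."""
    e, r, n = len(data), len(data[0]), len(data[0][0])
    grand = mean(x for env in data for rep in env for x in rep)
    fam_means = [mean(data[i][j][k] for i in range(e) for j in range(r))
                 for k in range(n)]
    env_means = [mean(x for rep in data[i] for x in rep) for i in range(e)]
    cell_means = [[mean(data[i][j][k] for j in range(r)) for k in range(n)]
                  for i in range(e)]

    # Sum of squares for families: deviations of family means from the grand mean.
    ss_fam = e * r * sum((m - grand) ** 2 for m in fam_means)
    # Sum of squares for the family x environment interaction.
    ss_fxe = r * sum(
        (cell_means[i][k] - env_means[i] - fam_means[k] + grand) ** 2
        for i in range(e) for k in range(n)
    )

    ms_fam = ss_fam / (n - 1)              # df = (n-1)
    ms_fxe = ss_fxe / ((n - 1) * (e - 1))  # df = (n-1)(e-1)
    return ms_fam, ms_fxe, ms_fam / ms_fxe


ms_fam, ms_fxe, f_ratio = family_f_ratio(data)
print(f"MS families = {ms_fam:.3f}, MS families x env = {ms_fxe:.3f}, F = {f_ratio:.3f}")
```

The F ratio would then be compared against an F distribution with (n-1) and (n-1)(e-1) degrees of freedom to judge significance.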
Now, ANOVAs can be constructed to analyze group means for any experimental design; the derivations just become lengthier, and the same principle still applies. AMOVA is harder to grasp because we aren't looking at means, as we usually do when we perform an ANOVA. Rather, we are analyzing the differences among alleles at marker loci from plants sampled across a geographic area. Any marker technology can be analyzed through an AMOVA, as long as different alleles can be detected reliably enough to estimate heterozygosity or homozygosity (the presence of multiple alleles or a single allele at a locus, respectively).
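To show what "multiple alleles or a single allele at a locus" means in practice, here is a small Python sketch that counts heterozygotes at one marker locus. It assumes diploid genotypes written as pairs of allele labels; the genotypes and the function name `observed_heterozygosity` are hypothetical.

```python
def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying two different alleles at one locus."""
    heterozygotes = sum(1 for a, b in genotypes if a != b)
    return heterozygotes / len(genotypes)


# Hypothetical genotypes at a single locus, one pair of alleles per plant.
locus = [("A", "A"), ("A", "B"), ("B", "C"), ("C", "C")]

h_obs = observed_heterozygosity(locus)
print(f"Observed heterozygosity: {h_obs:.2f}")
```

Two of the four made-up plants carry two different alleles, so half of this sample is heterozygous at the locus.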
AMOVA Table
AMOVA is the simplest way to understand genetic variation across geographic areas, but more complex methods exist. The algorithm STRUCTURE is notable for its ability to determine optimal population groupings. The result of a STRUCTURE run is presented below, where the program has grouped and sorted various human ethnic groups by their genetic similarity. When more colors are observed for a population in the plot, there is greater allelic diversity within that population. The picture also changes with the number of subgroups assumed (K), which is varied iteratively as part of the STRUCTURE analysis.