My Research, as it stands, from a Dodge Grand Caravan!: June 2014

Monday, June 23, 2014

Botanizing

For centuries, people who studied plants were classically trained botanists. They characterized plant species and classified them by minute details in morphology, down to the lengths of leaves, flowers, petioles and bud scales. From these observations, they built a taxonomic system for plants to categorize botanical diversity. Such descriptions are still used today, especially in the ornamental plant industry, where they are valuable for drafting patents to legally protect an organization's product. Classical botanists have also left their mark on the English language, giving us a wealth of adjectives such as tomentose, hirsute, and pubescent: all referring to various textures of hair. Botanists of bygone eras would probably agree that it is a treat to immerse yourself in natural places and, while a lot of work, rewarding to critically observe them. Luckily Rhododendron viscosum occurs in a variety of habitats. Among populations in close proximity, habitats are similar, with common vegetation and animal life. The plants within populations share common features like leaf shape, plant height, and the environments in which they grow. Traits that are not shared are things like flower color, which can vary greatly within a population. The majority of R. viscosum populations have white flowers, although some populations contain varying hues of pink. The picture on top is from a population in southwest Arkansas, while the one below is from central Louisiana. Note the different shades of white/pink, as well as the subtle changes in stigma arc between specimens.

One more observation about flowering: within populations, plants do not all flower at the same time. Generally when I begin work in an area the plants are in all flowering stages, from not yet blooming to past blooming. This is an important phenomenon because it limits the amount of gene flow within populations, as plants that are not blooming cannot cross with those that are. A property like this would manifest itself as higher genetic variation at the within-population level, due to the mating of subgroups and variability for the trait in geographically close groups (See previous post).

Another physical difference is in plant height and habit. Again, these are quite variable across different environments but, unlike flower color and blooming time, they are conserved within populations. I have not seen any large differences within populations, but comparing regions tells a different story. Below are representative plant habits from three populations: southwestern AR (upper), Florida panhandle (middle), and eastern Texas (bottom).

The plants in southwestern Arkansas along the Oklahoma border were the largest in size, with mature plants easily reaching 10 feet in height. The Texas plants were unique as mature plants were short, only reaching about 4 feet at maturity and tended to sucker a lot. The Florida panhandle populations were the most morphologically distinct, having notably smaller and stiffer leaves on compact plants. A first suspicion is that some of these differences are products of the environments the plants have been growing in:

Rich Mountain summit, Southeastern Oklahoma looking into Arkansas. A large R. viscosum is located at the base of the ridge's north slope near a stream at ~1,700 feet in elevation. The forest is open, moist, and dominated by northern and southern hardwoods, notably Northern Red Oak (Quercus rubra), Sweetgum (Liquidambar styraciflua), and Red Maple (Acer rubrum).

In contrast, the environment near Sopchoppy, FL sits at ~20 feet in elevation and 1 mile from tidal estuaries. This is a slash pine (Pinus elliottii) ecosystem, with the dominant understory shrubs being Saw palmetto (Serenoa repens), Titi (Cyrilla racemiflora), and Rhododendron viscosum. In comparison to the photo above, R. viscosum occurs here in full sun under extremely hot conditions. Plant leaves were smaller and much stiffer, likely an evolutionary response to minimize water loss due to heat or saline breezes off the Gulf. The horseflies here came in swarms of 2-inch awfulness.

R. viscosum in habitat at Boykin Springs, TX. The populations occurred along seeps in Longleaf Pine (Pinus palustris) ecosystems. These have historically been fire-dependent ecosystems, with burns repeatedly scorching the populations. The burnt azaleas are the brown foliage in the picture. R. viscosum is a suckering plant here, quickly sending up new shoots after a fire. It has colonized large parts of the understory near these seeps along with Sweetbay (Magnolia virginiana). Locals in Texas and Louisiana have even given a name to this unique assemblage: "Baygall".

Now to link this all together, you might be wondering how these plants could possibly look so different across these environments. Part of it could be environmental, where the weather in a given year or set of years might influence plant flowering or leaf shape. The best example would be fire: burned plants will frequently be shorter and more likely to sucker. But what if we removed the environmental differences? By growing the cuttings I've been harvesting from these plants in a common environment (i.e., a greenhouse), we can accurately document these morphological features without the confounding factors of weather, day length, or fire present in the wild. If these differences are still present, genetics likely plays a role and we can refine our study with the tools previously described. We then blend the worlds of classical botany with modern science. Without the new technology our understanding will never be as thorough but, without the old knowledge, our understanding will never be guided into the right places.

I sat for an hour in a Texas forest to witness the dirty deed: a butterfly pollinating R. viscosum. With incredibly sticky pollen carried by insects, it is likely that part of the phenotypic variation observed is due to reproductive isolation between populations.

An update on the current collections. Subtropical forest in central Florida dominated by old growth Cabbage Palm (Sabal palmetto). Seedlings of this and other Sabal species are making collection difficult as they shroud out any other undergrowth.

Walking the streams is the quickest way through the forest.

Some people like sunsets or sunrises, but I've always been a zenith guy. The sun and blue sky at these latitudes are intense! Ocala NF.

Saturday, June 21, 2014

Genetic Differences from a Quantitative perspective

As I've been traveling and doing field collections, I have seen notable physical differences within a species from one area to another. I suspect there are genetic differences, too. With approximately 700 miles between my current location and where I first started collecting, it is unlikely that geographically separate R. viscosum plants share recent ancestors; more likely, they have been reproductively isolated for a period of time. While genetic markers can't pinpoint exact shared ancestors, they do let us estimate the amount of variation at the DNA-sequence level. This DNA sequence variation, composed of different allele sizes or the presence/absence of marker loci, can be divided into three parts: variation among regions (large geographic areas), variation among populations within regions (small geographic areas), and variation among individuals within populations. The technical term for this is AMOVA, or analysis of molecular variance. It follows the same principle as a traditional analysis of variance (ANOVA): a statistical model commonly used to test for differences among group means, such as the means of groups given different procedures (treatments).

You may recall the half-sib mating design I described in the previous post, where I hope to measure the mean performance of progeny from distinct maternal parents. ANOVA works like this:

ANOVA for R. viscosum wild half-sib families, measured for mean rhizosphere acidification

Source of Variation	Degrees of Freedom	Mean squares
Environment	e-1
Repetitions per Environment	(r-1)e
R. viscosum HS families	(n-1)	MS_{HS families}
R. viscosum HS families x Environment	(n-1)(e-1)	MS_{HS families x environment}
Error	(n-1)(r-1)e	MS_error

The sources of variation in an ANOVA, including error, are inherent to the experiment you set up. In my case there are 4 sources of variation besides error: the different media pH Environments where the half-sib seedlings are grown, repetitions within the Environments, the half-sib families themselves, and the interaction between half-sib families and the environments they are grown in. Here we identify significant differences based on ratios of mean squares. A mean square is a sum of squares divided by its degrees of freedom: for the R. viscosum half-sib families, for example, we sum the squared deviations of the family means from the overall mean (weighted by the number of observations behind each family mean) and divide by the relevant degrees of freedom (n-1) for that source of variation. We then calculate an F-statistic to test the significance of each source of variation. To determine if there is a significant genotypic effect for your trait of interest in this mating design, you take the ratio:

F = MS_{HS families} / MS_{HS families x environment}

This will give you an F-statistic for the effect of the half-sib families. The larger this value, the stronger the evidence that a genotypic effect is present.
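To make the ratio above concrete, here is a minimal Python sketch. It assumes a balanced design (every family measured in every repetition of every environment), and the function name, data layout, and trait values are all made up for illustration; they are not measurements from this project.

```python
from statistics import mean


def half_sib_f_ratio(y):
    """Return the family F-ratio for a balanced design.

    y[i][j][k] is the trait value in environment i, repetition j, family k.
    """
    e, r, n = len(y), len(y[0]), len(y[0][0])

    # Means needed for the two sums of squares.
    grand = mean(v for env in y for rep in env for v in rep)
    env_mean = [mean(v for rep in env for v in rep) for env in y]
    fam_mean = [
        mean(y[i][j][k] for i in range(e) for j in range(r)) for k in range(n)
    ]
    cell_mean = [
        [mean(y[i][j][k] for j in range(r)) for k in range(n)] for i in range(e)
    ]

    # Sum of squares for families: deviations of family means from the grand mean.
    ss_fam = e * r * sum((m - grand) ** 2 for m in fam_mean)
    # Sum of squares for the family x environment interaction.
    ss_gxe = r * sum(
        (cell_mean[i][k] - env_mean[i] - fam_mean[k] + grand) ** 2
        for i in range(e)
        for k in range(n)
    )

    df_fam = n - 1
    df_gxe = (n - 1) * (e - 1)
    ms_fam = ss_fam / df_fam
    ms_gxe = ss_gxe / df_gxe
    return {
        "df_fam": df_fam,
        "df_gxe": df_gxe,
        "ms_fam": ms_fam,
        "ms_gxe": ms_gxe,
        "F": ms_fam / ms_gxe,
    }


# Hypothetical data: 2 environments x 2 repetitions x 3 half-sib families.
y = [
    [[5.0, 6.0, 7.0], [5.2, 6.1, 7.3]],
    [[4.0, 5.5, 6.0], [4.2, 5.3, 6.4]],
]
result = half_sib_f_ratio(y)
print(result)
```

In this toy data the families keep nearly the same ranking in both environments, so the interaction mean square is small and the F-ratio is large. Judging significance would still mean comparing F against an F distribution with (n-1) and (n-1)(e-1) degrees of freedom, which the sketch leaves out.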

Now ANOVAs can be constructed to analyze group means for any experimental design; the derivations just become lengthier, but the same principle still applies. AMOVA is harder to grasp because we aren't looking at a mean, as we usually are when we perform an ANOVA. Rather, we are analyzing the differences among alleles at marker loci from plants across a geographic area. Any marker technology can be analyzed through an AMOVA, as long as different alleles can be detected to give a reliable estimate of heterozygosity or homozygosity (the presence of multiple alleles or 1 allele at a locus, respectively).
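As a small illustration of what "detecting alleles" buys you, the sketch below counts heterozygotes at a single marker locus. The genotypes are invented allele sizes for a hypothetical diploid marker, and the function name is mine, not part of any marker software.

```python
def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying two different alleles at one locus."""
    if not genotypes:
        raise ValueError("need at least one genotype")
    heterozygotes = sum(1 for allele_a, allele_b in genotypes if allele_a != allele_b)
    return heterozygotes / len(genotypes)


# Hypothetical allele sizes (in base pairs) for four plants at one locus.
locus = [(152, 156), (152, 152), (156, 160), (152, 152)]
h_obs = observed_heterozygosity(locus)
print(h_obs)
```

Two of the four made-up plants carry two different alleles, so half of this sample is heterozygous at the locus.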

AMOVA Table

Excoffier, L., Smouse, P. E., & Quattro, J. M. (1992). Analysis of molecular variance inferred from metric distances among DNA haplotypes: application to human mitochondrial DNA restriction data. Genetics, 131(2), 479–491.

The results of an AMOVA table are more commonly reported as proportions of the total variation, summarized by ϕ-statistics. A proportion is a more appropriate way to interpret marker data, as we are most interested in where the variation lies within a species. If the proportion among regions is 0.04 (low) while the proportion within populations is 0.60 (high), this means that most of the genetic variation in our marker data is present within populations, with very little distinguishing regions from one another. This scenario is common among outbreeding, wind-pollinated species such as forest trees that have high levels of heterozygosity.
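As a rough sketch of how the three levels relate, the snippet below turns three variance components into percentages and into the ϕ-statistics as defined by Excoffier et al. (1992). The 0.04 and 0.60 values echo the example in the text; the 0.36 for populations within regions is my own assumption, chosen only so the three parts sum to 1. The function name is hypothetical.

```python
def phi_statistics(var_regions, var_pops, var_within):
    """Summarize AMOVA variance components as percentages and phi-statistics."""
    total = var_regions + var_pops + var_within
    if total <= 0:
        raise ValueError("total variance must be positive")
    return {
        # Share of the total variation found at each level.
        "pct_among_regions": 100 * var_regions / total,
        "pct_among_pops_within_regions": 100 * var_pops / total,
        "pct_within_pops": 100 * var_within / total,
        # Regions relative to the total.
        "phi_CT": var_regions / total,
        # Populations relative to their own region.
        "phi_SC": var_pops / (var_pops + var_within),
        # Populations relative to the total.
        "phi_ST": (var_regions + var_pops) / total,
    }


# 0.04 and 0.60 follow the text; 0.36 is an assumed filler value.
phi = phi_statistics(0.04, 0.36, 0.60)
print(phi)
```

With these made-up components, regions explain very little, while differences among populations inside a region are more noticeable, a pattern that a single "among regions" number would hide.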

AMOVA is the simplest way to understand genetic variation across geographic areas, but there are more complex methods. The program STRUCTURE is notable for its ability to infer population groupings from marker data. The output of a STRUCTURE analysis is presented below, where the program has grouped and sorted various human ethnic groups by their genetic similarity. Each color represents one inferred genetic cluster, so the more colors observed within a population, the more mixed its ancestry. The picture also changes with the number of clusters assumed (K), which is varied over repeated runs as part of a STRUCTURE analysis.

https://anthrogenetics.files.wordpress.com/2010/04/rosenberg-2002-structure.jpg

A nice thing about STRUCTURE, although it is notably more complex, is the graphics it can generate. They are prettier than tables of numbers, such as those included above.

Tuesday, June 10, 2014

Experiments in the Woods

Let’s say you wanted to have more genetic insight about a group of people in a region. You would expect that individuals within families would be more similar than individuals who are unrelated. But what might not be apparent without analysis is how much genetic diversity occurs over the entire region. Genetic diversity is one component that can have a large impact on traits within a population. Take height in humans for example, a highly quantitative trait with a large range of phenotypes, or observed values. Tall parents generally have tall children, and short parents generally have short children. But environmental effects such as malnourishment can also impact how short or tall a person is. Say individuals within a family are generally short and, looking at the family as a whole, exhibit low genetic diversity at the regions of DNA, or loci, that are partly responsible for height. In comparison, a family of mixed-height people possesses high genetic diversity in these areas, again on a family basis. We could postulate that greater genetic diversity at these loci leads to a greater range of observed height within families. We would say that the family of short people had loci that were fixed, or not diverse at the locus level. It is then a major function of population studies to estimate genetic diversity in order to understand how traits are inherited, how individuals are related, and how inheritance and relatedness affect the phenotypes we observe.
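One common way to put a number on "diverse" versus "fixed" at a locus is gene diversity (expected heterozygosity), 1 minus the sum of squared allele frequencies. The sketch below uses that measure as an assumption on my part; the two families and their allele labels are invented, and no small-sample correction is applied.

```python
from collections import Counter


def gene_diversity(alleles):
    """Return 1 - sum(p_i^2) over the allele frequencies at one locus."""
    counts = Counter(alleles)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one allele")
    return 1.0 - sum((count / total) ** 2 for count in counts.values())


# Hypothetical allele copies sampled at one height-related locus.
fixed_family = ["A"] * 8                                   # every copy identical
mixed_family = ["A", "A", "B", "B", "C", "C", "D", "D"]    # four alleles, equally common

diversity_fixed = gene_diversity(fixed_family)
diversity_mixed = gene_diversity(mixed_family)
print(diversity_fixed, diversity_mixed)
```

A fixed locus scores zero because there is only one allele to draw, while the mixed family scores well above zero.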

In my case, I know of groups of plants but have no idea of how they are related at the local, regional, or national level. Knowing this information can help refine the analysis of the iron acquisition traits we’re interested in (see previous post). As mentioned in the example above, tall parents generally have tall children. Plants adapted to a stressful soil type might have offspring that are equally adapted; however, I can’t determine that yet. But if the plants in a population are all closely related and intermating, this relatedness would be useful to know in case these traits are conserved (common) in family structures.

So how can you analyze a family of plants in the wild? First, you need to create one! Within each population I observe, I identify individual plants that have flower buds and are capable of producing seed. These plants are given an identification tag and leaves are sampled to determine genetic features (such as relatedness and diversity) among individuals in a population. Each plant represents a family of half-siblings housed on a maternal plant: pollination was at random with the paternal parents being unknown. We can then estimate how each maternal parent performs individually or as a mean of all maternal parents in a population based on the performance of the offspring plants. I’ll explain more later, but this is known as a half-sib mating design.
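The bookkeeping behind this can be sketched in a few lines of Python: group the seedlings by their tagged maternal parent, average each family, then average the family means for the population. The tags and trait values below are hypothetical placeholders, not field data.

```python
from statistics import mean

# Hypothetical progeny records: (maternal parent tag, trait value for one seedling).
progeny = [
    ("P01", 4.8), ("P01", 5.0), ("P01", 5.2),
    ("P02", 6.1), ("P02", 5.9),
    ("P03", 4.0), ("P03", 4.4),
]


def family_means(records):
    """Average the offspring values for each maternal parent."""
    by_mother = {}
    for tag, value in records:
        by_mother.setdefault(tag, []).append(value)
    return {tag: mean(values) for tag, values in by_mother.items()}


means = family_means(progeny)
# Mean of the family means, so each maternal parent counts once.
population_mean = mean(list(means.values()))
print(means, population_mean)
```

Averaging the family means, rather than all seedlings at once, keeps a parent with many seeds from dominating the population estimate; that is a design choice for the sketch, not something the mating design requires.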

Mature R. viscosum individuals to be used as maternal parents. Each flower will be pollinated randomly through natural pollinators. Orange tags carrying a labeling system and unique number are placed on the plants. Leaves from the parents are sampled and their location saved as GPS coordinates so that seed can be recovered when it is ripe in the fall.

I let pollination occur naturally, which for this species mostly occurs via butterflies and other insects. By using this scheme, I can not only determine genetic diversity within each population, but compare it to other populations sampled. This will help us get a better picture of how diverse this species is across its range. In designating certain plants in a population as parents and by genotyping them (estimating the amount of genetic diversity present at the DNA level), we can also estimate the effect that relatedness has on traits of interest in the progeny. The progeny will be grown from the seeds collected off these parental plants this coming fall and evaluated for the traits of interest (see previous post). Relatedness will likely vary depending on location as some populations were smaller and more isolated than others.

R. viscosum populations sampled throughout eastern Texas and western Louisiana. Populations contained between 3 and 50+ mature individuals.

A main reason for using the same plants as parents in both the half-sibling family and population analyses is that it is logistically simple. A single plant can serve two functions, both as a parent in a mating design and as an individual for population analysis. Because this species is never common and populations are isolated, other sampling strategies, such as sampling a plant every 5 miles, are not realistic. Sampling at fixed intervals like that is known as a transect, and is more appropriate for estimating genetic diversity in species which are common and continuous across a large area. In my case populations are clearly defined and can be tested, through the genetic diversity we identify, to determine how unique the adaptations within and among populations are.

I’ve been blogging from a McDonald’s as it is the only place in town with wifi, and will keep updating whenever I’m hungry.
