Amphioxus Genomics

Cephalochordates, commonly known as lancelets or amphioxus, represent an ancient chordate lineage at the boundary between invertebrates and vertebrates. They are considered the best living proxy for the common ancestor of all chordates and hold the key to understanding chordate evolution. As a result, they have become increasingly popular as emerging model organisms in developmental biology over the past decade (Schubert et al. 2006; Yu and Holland 2009; Bertrand and Escriva 2011). There are three genera of cephalochordates: Branchiostoma with ~28 species, Epigonichthys with one species, and Asymmetron with two recognized species and probably additional cryptic ones. To date, studies on cephalochordates have been almost exclusively limited to the genus Branchiostoma, leaving the other two genera largely unexplored. As the cephalochordate genus most distantly related to Branchiostoma, Asymmetron occupies a basal position in the cephalochordate phylogeny (Kon et al. 2007) and diverged from Branchiostoma 120-160 mya (Nohara et al. 2005; Yue et al. 2014). Morphologically, the most striking difference between the two genera is that Asymmetron has gonads only on the right side, whereas Branchiostoma has gonads on both sides (Holland and Holland 2010).

Starting from my PhD, we set out to develop transcriptomic and genomic resources for a representative species of the genus Asymmetron, Asymmetron lucayanum, using both RNA-seq and whole-genome shotgun (WGS) sequencing. Comparing its transcriptomic and genomic sequences with those of distantly related amphioxus species from the genus Branchiostoma, as well as with several representative vertebrate species, illuminated many aspects of amphioxus genome biology. Among the findings are a seemingly slow lineage-specific rate of molecular evolution, sets of fast-evolving genes, a new calibration of molecular and fossil data describing the evolution of this lineage, a first-pass description of conserved non-coding elements, a collection of genes potentially specific to germline development, and the evolution of genes encoding green fluorescent proteins (GFPs) in amphioxus. These findings lay a good foundation for functional studies of these important organisms. Since August 2016, we have been applying long-read sequencing technology (PacBio) to generate a high-quality genome assembly for Asymmetron lucayanum, towards a better understanding of its biology and evolution, as well as of the evolutionary transition from invertebrates to vertebrates.

Yeast Genomics

Structural rearrangements (inversions, translocations, transpositions, and large insertions, deletions and duplications) play a key role in promoting and maintaining genetic variation, with broad implications for genome instability, speciation, functional innovation and disease susceptibility. However, our understanding of their evolutionary dynamics remains limited because short-read sequencing lacks the resolution and accuracy needed to characterize complex events. The incomplete reference genome assemblies of many organisms further exacerbate this problem, especially in repetitive and highly variable regions such as subtelomeres and chromosome-ends.

The application of long-read sequencing technology in several recent studies has proved quite successful in detecting and resolving structural rearrangements, even in complex genomic regions (Chaisson et al. Nature 2014; Zapata et al. PNAS 2016; Dong et al. PNAS 2016). Here we take this approach to the population level by long-read sequencing 12 representative strains of the partially domesticated yeast Saccharomyces cerevisiae and its closest wild relative Saccharomyces paradoxus. This allowed us to systematically discover structural rearrangements based on complete genome assemblies. Furthermore, for the first time, we explicitly partitioned nuclear chromosomes into cores, subtelomeres and chromosome-ends, which allowed us to assess their respective structural dynamics. In particular, instead of relying on the current subtelomere definition, which treats all chromosomes indiscriminately (e.g. 20-30 kb from chromosome-ends in yeasts), we proposed a chromosome-specific subtelomere definition based on synteny conservation. Crucially, this definition captures the dynamic nature of subtelomeres. Our high-resolution analysis of structural rearrangements across different chromosome partitions within a well-defined phylogenetic framework uncovered striking contrasts in genome dynamics between domesticated and wild yeasts, revealing the influence of human activities on structural genome evolution.
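
To make the idea of a synteny-based, chromosome-specific subtelomere boundary more concrete, here is a minimal Python sketch. It is not the procedure used in the project: the function name `subtelomere_boundary`, the `min_run` threshold, and the rule itself (the core begins at the first run of consecutive genes whose synteny is conserved across all strains) are assumptions made purely for illustration.

```python
from typing import Sequence


def subtelomere_boundary(conserved: Sequence[bool], min_run: int = 3) -> int:
    """Return the index of the first core gene on one chromosome arm.

    `conserved` lists genes in order from the chromosome end inward; each
    flag says whether that gene's synteny is conserved across all strains.
    Hypothetical rule: the core starts at the first run of `min_run`
    consecutive conserved genes. Genes before the returned index are
    treated as subtelomeric. If no such run exists, len(conserved) is
    returned, i.e. the whole arm is classified as subtelomeric.
    """
    run = 0
    for i, flag in enumerate(conserved):
        # Extend the current run of conserved genes, or reset it.
        run = run + 1 if flag else 0
        if run == min_run:
            return i - min_run + 1
    return len(conserved)


# Toy example: flags for eight genes, ordered from the chromosome end inward.
left_arm = [False, True, False, False, True, True, True, True]
boundary = subtelomere_boundary(left_arm)
print(f"genes 0..{boundary - 1} are subtelomeric; the core starts at gene {boundary}")
```

Because the boundary is derived from each arm's own synteny profile, two chromosomes can end up with subtelomeres of very different lengths, which a fixed-distance cutoff cannot express.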

The dedicated project website can be accessed via this link

Last updated: Oct 27, 2016