Decoding Genomes: From Sequences to Phylodynamics
This course aims to equip students with the conceptual foundations and practical skills required to decode genomic data and link it to evolutionary and population-dynamic processes across biology, while building core bioinformatics skills that fill a current gap in the OIST curriculum. It is especially suited to students interested in evolutionary genomics, computational biology, epidemiology, or systems biology.
This course introduces the statistical, computational, and biological foundations of genome analysis, molecular evolution, phylogenetics, and phylodynamics, based on "Decoding Genomes: From Sequences to Phylodynamics" (Stadler et al., 2024). Topics include sequencing and alignment, substitution models, tree inference, phylodynamics, and Bayesian methods. Applications span macroevolution, infectious disease epidemiology, immunology, cancer, and linguistics. A key feature is integrated training in bioinformatics methods, so that students gain hands-on skills in sequence analysis, phylogenetics, and data visualization. This dual focus on conceptual foundations and technical skills provides a structured introduction to bioinformatics methods in genomics and is accessible to students from diverse backgrounds.
Week 1 – Introduction & Basics
Lecture: Evolutionary principles, genotype–phenotype mapping, probability, and stochasticity in biology. Practical: Introduction to genomic data formats (FASTA/FASTQ) and basic handling using Python or R; reading and visualizing sequences.
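As a rough illustration of the kind of exercise the Week 1 practical describes, the sketch below reads FASTA records with the Python standard library and computes GC content. The function names (`read_fasta`, `gc_content`) and the toy records are hypothetical and not taken from the course materials.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def gc_content(seq):
    """Return the fraction of G and C bases in a sequence (0.0 if empty)."""
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


# Toy input (made up for illustration): two records, the first wrapped
# over two lines as FASTA permits.
example = StringIO(">seq1 toy record\nACGT\nGGCC\n>seq2\nATAT\n")
records = list(read_fasta(example))
for name, seq in records:
    print(name, len(seq), gc_content(seq))
```

In practice an established parser such as Biopython's `SeqIO` would likely be preferred; the hand-written version only shows what the format contains.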
Week 2 – Sequencing Technologies
Lecture: Overview of Sanger, next-generation (Illumina, Ion Torrent), and third-generation (PacBio, Nanopore) sequencing platforms. Practical: Perform sequence quality assessment using FastQC and trimming with Cutadapt; interpret read metrics.
Week 3 – Sequence Alignment I
Lecture: Pairwise alignment algorithms, dynamic programming, and scoring schemes. Practical: Run BLAST searches and pairwise alignments with Clustal Omega; interpret alignment statistics and E-values.
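The dynamic-programming idea behind pairwise alignment can be sketched with a minimal global-alignment (Needleman-Wunsch) score calculation. The scoring scheme used here (match +1, mismatch -1, linear gap penalty -1) is an assumption chosen for illustration, not a scheme prescribed by the course.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return the optimal global alignment score of strings a and b.

    Uses a linear gap penalty; the default scores are illustrative only.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap  # a[:i] aligned entirely against gaps
    for j in range(1, cols):
        score[0][j] = j * gap  # b[:j] aligned entirely against gaps
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (
                match if a[i - 1] == b[j - 1] else mismatch
            )
            up = score[i - 1][j] + gap    # gap inserted in b
            left = score[i][j - 1] + gap  # gap inserted in a
            score[i][j] = max(diagonal, up, left)
    return score[-1][-1]


print(needleman_wunsch_score("ACGT", "AGT"))
```

Recovering the alignment itself would require a traceback through the matrix, which is omitted here to keep the sketch short.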
Week 4 – Sequence Alignment II & Assembly
Lecture: Multiple sequence alignment and genome assembly concepts (reference-based vs. de novo). Practical: Assemble a bacterial genome using SPAdes; assess assembly quality (N50, number of contigs, and coverage).
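For readers unfamiliar with N50, one common definition is the length of the contig at which the running total, taken over contigs sorted from longest to shortest, first reaches half of the total assembly length. A minimal sketch under that definition follows; the contig lengths are invented for illustration.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is taken here as the length L such that contigs of length >= L
    together cover at least half of the total assembly length.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(contig_lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical assembly with six contigs (total length 290).
print(n50([80, 70, 50, 40, 30, 20]))
```

Assembly evaluation tools report this statistic alongside others, so N50 alone should not be read as a measure of correctness.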
Week 5 – Genetic Associations
Lecture: Principles of genome-wide association studies (GWAS): design, confounding, and multiple testing correction. Practical: Conduct a toy GWAS using PLINK; visualize results with Manhattan plots and QQ plots.
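Multiple testing correction can be illustrated with the Bonferroni procedure, which divides the significance level by the number of tests. The sketch below applies it to a handful of made-up p-values and computes the -log10(p) values that a Manhattan plot would show; the function name and the data are hypothetical.

```python
import math


def bonferroni_significant(p_values, alpha=0.05):
    """Flag p-values that pass a Bonferroni-corrected threshold."""
    if not p_values:
        return []
    threshold = alpha / len(p_values)  # per-test significance level
    return [p <= threshold for p in p_values]


# Invented p-values for four variants; a real GWAS tests far more.
p_values = [0.001, 0.02, 0.04, 0.5]
flags = bonferroni_significant(p_values)
neg_log10 = [-math.log10(p) for p in p_values]  # y-axis of a Manhattan plot
print(flags)
```

Bonferroni is conservative when tests are correlated, as neighboring variants often are, which is one reason other corrections are also used in practice.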
Week 6 – Molecular Evolution
Lecture: Substitution models (nucleotide, codon, amino acid), dN/dS ratios, and rate heterogeneity. Practical: Estimate substitution models and detect selection using IQ-TREE and MEGA.
Week 7 – Phylogenetic Trees I
Lecture: Tree definitions, distance-based and parsimony methods, and tree representation formats (Newick). Practical: Construct and visualize phylogenetic trees using PHYLIP and IQ-TREE; explore bootstrap support.
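Distance-based methods start from pairwise distances between aligned sequences. As a sketch, the code below computes the proportion of differing sites (p-distance) and applies the Jukes-Cantor (JC69) correction, d = -(3/4) ln(1 - 4p/3), which estimates substitutions per site under equal base frequencies and equal rates. The sequences are toy examples and the function names are hypothetical.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of sites that differ between two aligned sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned, equal-length, non-empty")
    differences = sum(1 for x, y in zip(seq_a, seq_b) if x != y)
    return differences / len(seq_a)


def jc69_distance(p):
    """Jukes-Cantor corrected distance for an observed proportion p."""
    if p >= 0.75:
        raise ValueError("JC69 distance is undefined for p >= 0.75")
    return -0.75 * math.log(1 - 4 * p / 3)


p = p_distance("ACGT", "ACGA")  # one difference in four sites
d = jc69_distance(p)
print(p, round(d, 4))
```

The corrected distance exceeds the observed proportion because it accounts for multiple substitutions at the same site; a matrix of such distances is the input to methods like neighbor joining.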
Week 8 – Phylogenetic Trees II + Mid-term Test
Lecture: Maximum likelihood tree inference, molecular clocks, and rooting. Practical: Perform model testing and time-tree estimation in IQ-TREE. Mid-term Test: Covers sequencing, alignment, molecular evolution, and phylogenetic inference.
Week 9 – Statistical Testing
Lecture: Likelihood ratio tests, AIC, and bootstrap methods for model comparison. Practical: Apply bootstrapping and model adequacy tests in IQ-TREE to assess phylogenetic support.
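To make the model-comparison quantities concrete, the sketch below computes AIC = 2k - 2 ln L and a likelihood ratio test for two nested models that differ by a single free parameter. The log-likelihoods and parameter counts are invented for illustration and do not come from any real analysis; the p-value formula covers only the one-degree-of-freedom case, where the chi-square survival function equals erfc(sqrt(x / 2)).

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: lower values indicate a preferred model."""
    return 2 * n_params - 2 * log_likelihood


def lrt_statistic(log_lik_null, log_lik_alt):
    """Likelihood ratio statistic for nested models (alt contains null)."""
    return 2 * (log_lik_alt - log_lik_null)


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2))


# Hypothetical values: a simpler model with 9 free parameters and a
# richer nested model with 10.
lnl_null, k_null = -1005.0, 9
lnl_alt, k_alt = -1000.0, 10

stat = lrt_statistic(lnl_null, lnl_alt)
print(aic(lnl_null, k_null), aic(lnl_alt, k_alt), stat, chi2_sf_df1(stat))
```

For models differing by more than one parameter, a general chi-square survival function (for example from SciPy) would be needed, and the chi-square approximation itself assumes the models are nested and the null is not on a parameter boundary.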
Week 10 – Trait Evolution
Lecture: Modeling discrete and continuous trait evolution, Brownian motion, and correlated evolution. Practical: Use R packages (ape, phytools) to reconstruct ancestral states and analyze trait correlations.
Week 11 – Phylodynamics I
Lecture: Birth–death and coalescent models for population dynamics, effective population size, and demographic inference. Practical: Perform BEAST2 analyses to infer population size changes and interpret skyline plots.
Week 12 – Phylodynamics II
Lecture: Structured coalescent and epidemiological models, migration, and population subdivision. Practical: Conduct structured coalescent analyses using BEAST2; compare demographic models.
Week 13 – Bayesian Inference
Lecture: Bayesian framework, priors, MCMC sampling, and convergence diagnostics. Practical: Build Bayesian time trees in BEAST2, assess chain convergence, and summarize credible intervals.
Final Fortnight – Project Presentations
Students present their final projects (data analysis or literature-based mini-reviews) and receive feedback.
Homework & practical computational exercises: 40% (~4 hrs/week)
Mid-term test: 20% (emphasis on theory)
Final project (analysis or mini-review, with presentation): 40%
Assessment emphasizes continuous engagement and application of methods.
Basic probability and statistics; introductory molecular biology; some experience with the command line and with Python or R. The course is not suitable for students without any quantitative or biological background.
Stadler T. et al. (2024). Decoding Genomes: From Sequences to Phylodynamics. (Open access at https://decodinggenomes.org/).
Felsenstein J. (2004). Inferring Phylogenies.
Yang Z. (2014). Molecular Evolution: A Statistical Approach.
Lesk A. M. (2014). Introduction to Bioinformatics (4th ed.).