
Statistical Genetics





Probability and statistics play an important role in genetics. The mechanism of “meiosis”, the forming of sperm or egg cells, is thought to be probabilistic in nature, as is the process of mating in large populations. The relationship between “genotypes” (DNA sequence) and “phenotypes” (observable traits or diseases) can be modelled by probability distributions. The analysis of genetic determinants is based on random samples from a population, often biased, and various statistical methods are necessary to analyse such data. This course provides an introduction to stochastic models and methods used in genetics, directed at students in mathematics. We do assume a good working knowledge of probability and statistics (e.g. likelihood and Bayesian inference, asymptotics, testing), but do not assume prior knowledge of genetics. In particular, the jargon in this description will be explained.

Statistical genetics is a classical branch of applied probability and statistics, which has recently gained much new interest due to significant breakthroughs in genetics, both experimental and theoretical. With modern techniques and greatly increased amounts of data, it is hoped to link diseases and other traits to genes (pieces of DNA) that can be located on the genome with unprecedented precision. One could safely say that this area is among the hottest in applied stochastics, and in science in general. There are plenty of opportunities for mathematicians interested in life sciences. This course incorporates parts of many different areas of statistics.

Of course we start with Mendel’s laws of “segregation”, which stipulate that each parent passes a randomly chosen gene on to his/her offspring from each pair of genes, independently across genes. The latter independence was later found to be untrue, and was replaced by “linkage models”, which stipulate positive dependence between genes sitting close together on the genome. The most popular model is based on a Poisson process model for “crossovers” during meiosis. The resulting models, combined with “penetrance models” (conditional distributions for phenotypes given genotypes), make it possible to write likelihoods for the observed phenotypes in families (or “pedigrees”), and thus to estimate the dependence of phenotypic traits on genetic factors. Because a full likelihood analysis requires the specification of many probability densities and is computationally intensive, other methods with the same aim are based on reduced data, in particular “IBD” (identity by descent) status, and/or on clever sampling plans. “Association” studies are based on the idea that, under a random mating assumption, a population should tend to equilibrium, so that deviations from equilibrium in pairs of genes in a random sample of individuals (possibly) indicate that these genes are close together on the genome. Finally, “biometric analysis” is directed at decomposing phenotypic variation into genetic and environmental parts.
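
As a rough illustration of the Poisson crossover model mentioned above (a sketch, not taken from the lecture notes): if crossovers between two loci occur as a Poisson process with rate 1 per Morgan, and a recombination is observed exactly when the number of crossovers between the loci is odd, then the recombination fraction at map distance d is (1 - exp(-2d))/2, usually called Haldane's map function. The function names below are hypothetical, and the simulation assumes no crossover interference.

```python
import math
import random


def haldane_recombination_fraction(d):
    """Recombination fraction for two loci at map distance d (in Morgans),
    assuming crossovers form a Poisson process with rate 1 per Morgan."""
    return 0.5 * (1.0 - math.exp(-2.0 * d))


def simulate_recombination_fraction(d, n_meioses=100_000, seed=0):
    """Monte Carlo estimate of the recombination fraction under the same
    Poisson assumption (hypothetical helper, for illustration only)."""
    rng = random.Random(seed)
    recombinants = 0
    for _ in range(n_meioses):
        # Count Poisson-process points in [0, d) via exponential gaps.
        crossovers = 0
        position = rng.expovariate(1.0)
        while position < d:
            crossovers += 1
            position += rng.expovariate(1.0)
        # An odd number of crossovers between the loci gives a recombinant.
        if crossovers % 2 == 1:
            recombinants += 1
    return recombinants / n_meioses


if __name__ == "__main__":
    for d in (0.01, 0.1, 0.5, 2.0):
        print(d, haldane_recombination_fraction(d),
              simulate_recombination_fraction(d))
```

For small d the recombination fraction is close to d itself, while for loci far apart it approaches 1/2, which corresponds to the independent segregation of Mendel's laws.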

Lecturer:
Marianne Jonker  (www.few.vu.nl/~majonker, m.a.jonker@vu.nl)

Required Knowledge:
Courses in probability and statistics from a Bachelor mathematics program, or equivalent. No knowledge of genetics required.

Meetings:
Friday afternoon: 13.30-16.15
Room: WN-P624  (week 36-42)
Room: WN-S607 (week 44-50)

Remark:
If you cannot make it to the first meeting, send an e-mail to the lecturer (m.a.jonker@vu.nl), as the course will become a reading course if enrollment is insufficient.

Credits: 6.

VU Vakcode: 400296.

Literature:
Lecture Notes (our main text).
Peter Almgren, Par-Ola Bendahl, Henrik Bengtsson, Ola Hossjer, Roland Perfekt: Statistics in Genetics. Downloadable from: Lund University

Exercises:
There is no formal problem class. However, some exercises will be discussed during the meetings on Friday.


Lectures:

September 9: 
         chapter 1 biology,  Mather              
         homework: exercises 1, 2, 3 (week 1)

September 16:
         Remainder of Chapter 1, chapter 2 up to page 25
         homework: exercises 1, 2, 3, 4 (week 2)

September 23:
        Chapter 2:  HWE and LE (no sections with an asterisk)
        Section 14.7: EM-algorithm
        homework:
               Try to understand section 2.2.2
               Compute the maximum likelihood estimators at slide 24.
               Do the calculations at slide 29.

September 30:
        Chapter 3: Pedigree Likelihoods
        homework: exercises 1, 2 (chapter 3, week 4)

October 7:
        Chapter 4: Identity by Descent
        Inheritance vectors (sections 1.4, 3.6)
        homework: exercise 2 of last week
                          exercise 4.1 (at page 83)
                          check the values in Tables 4.1, 4.2 and 4.3

October 14: No lecture

October 21:
       Chapter 5
       Breast Cancer research (not in lecture notes)

October 28:
       Autumn break, no lecture

November 4: (Note: from now on, lectures are in room WN-S607)
      Chapter 6 (lecturer: Aad van der Vaart):
            Section 14.6
            Section 6.1 up to page 111

November 11: (Note: lecture in room WN-S607)
      Chapter 6 (lecturer: Aad van der Vaart)

November 18:
      Chapter 7
      Homework: exercise 2 (not 2f) of the December 2006 exam
                        exercise week 9
 
November 25:
      Chapter 8
      Homework: see lecture notes

December 2:
      Chapter 9, section 9.1
      Homework: see lecture notes

December 9:
      Chapter 9, section 9.2
      Homework: June 2005, exercises 3, 4
                        December 2006, exercise 1
                        Lund 1: 6, 8, 9, 10
                        Lund 2: 6

December 16:
       Homework             

    

Exam:
Written.
Date: December 21,  time: 15.15-18.00 (always check the schedule for changes)

Old Exams:
June 2005
December 2006
July 2007
To test yourself you might try the exams of Lund University: Lund 1 and Lund 2 (with picture).