| Credits |
8 credit points |
| Instructors |
Jacqueline Meulman (Universiteit Leiden), Aad van der Vaart (Vrije Universiteit), Mark van de Wiel (Vrije Universiteit) |
| E-mail |
jmeulman at math.leidenuniv.nl, aad at few.vu.nl, mark.vdwiel at vumc.nl |
| Description |
This course gives an overview of techniques for analysing high-dimensional data, e.g. arising from microarray experiments, mass spectrometry, or high-throughput genotyping, including some statistical theory about the quality of such procedures. Topics may include regression, classification, multiple testing, and clustering, from the point of view of statistics and statistical learning. The methods discussed will include support vector machines, regression trees, boosting, and model selection methods. |
| Organization |
Lectures, reading, and possibly presentations by the participants. |
| Examination |
Project or oral exam. Project assignment with supporting material. |
| Literature |
- Book: Hastie, Tibshirani, Friedman: The Elements of Statistical Learning, Springer-Verlag, 2001.
- Handouts: PCA + LDA (updated 25/2), SVM (updated 29/2), microarrays, Slides Meulman, Data Mining 1, Data Mining 2, boosting, additive models and regression splines, additive models and optimal scaling, clustering, splines
- Research papers: Blanchard, Bousquet, Massart (2006), Efron dr discussion, Dudoit et al (review multiple testing), Friedman and Meulman (clustering)
- Manuals: R manual |
| Prerequisites |
No specific requirements. |