DESIGN OF EXPERIMENTS AND ANALYSIS OF VARIANCE

Credits 3 ECTS
Code 405060
Audience Master AI, CS, Information Science
Lecturer Prof.dr. Aad van der Vaart (Faculty of Sciences, VU).
E-mail: aad at few.vu.nl
Period 7, 14, 21, 28 September, 5, 12 October 2010.
Hours Tuesdays 11-12.45, WN - P647
Aim An introduction to the design and statistical analysis of experiments.
Form Lectures, computer assignments, final project.
Description An outcome may depend on one or more factors that can be manipulated. For instance, the time spent on a website depends on several aspects of web design, the proximity to a target value is a function of the parameters of an evolutionary algorithm, and (a classic in this area!) agricultural yield depends on the type of fertilizer and the crop variety. In an experiment one may measure the outcome for several settings of the factors. If the measured outcomes are subject to chance variation, statistical techniques must be used to analyse the results. This course discusses commonly used techniques by way of examples, using a minimum of mathematical formulas. Data sets are analysed using the statistical package R, and the emphasis is on correctly implementing statistical tests and interpreting the computer output. The topics addressed include:
Recap of basic statistical concepts. Population distribution, histogram, QQ-plot, sample, statistical test, p-value.
Introduction to R. Basics of the open source computer package R, and its application to ANOVA.
Analysis of variance. One-way and two-way completely randomized designs, randomized block design, regression, ANCOVA.
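As an illustration of the kind of analysis these topics lead up to, a minimal sketch of a one-way ANOVA in R might look as follows. It uses the PlantGrowth data set that ships with R (plant weights under a control and two treatments), which is an assumption made here for illustration and not necessarily course material. The variable names fit, tab and p_value are likewise hypothetical.

```r
# Illustrative sketch only: PlantGrowth is a built-in R data set chosen
# for this example, not a data set prescribed by the course.
data(PlantGrowth)

# One-way completely randomized design: weight explained by treatment group.
fit <- aov(weight ~ group, data = PlantGrowth)

# QQ-plot of the residuals to check the normality assumption visually.
qqnorm(residuals(fit))
qqline(residuals(fit))

# ANOVA table: degrees of freedom, sums of squares, F value and p-value.
tab <- summary(fit)[[1]]
print(tab)

# p-value of the F-test for the group effect (first row of the table).
p_value <- tab[["Pr(>F)"]][1]
print(p_value)
```

A small p-value suggests that the mean weight differs between at least two of the groups. It does not say which groups differ, and it is only trustworthy if the QQ-plot shows no serious departure from normality.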
Literature
Assessment The average mark for the 5 assignments accounts for 60% of the final mark, and the final project for the remaining 40%. Both parts must receive a passing mark. Assignments and the final project are typically carried out by groups of two students. A final individual oral discussion may be part of the examination.
Requirements An introductory course in statistics at the bachelor's level for computer scientists.
Computer language The statistical package R can be downloaded from the R-project site www.r-project.org. It is free! It is also installed on the FEW computers.
Assignments Assignments 1,2,3,4,5 are due before the beginning of lectures 2,3,4,5,6, which include a discussion of their solution. The assignments concern material presented in lectures 1,2,3,4,5.
Final project The course must be concluded by designing a small experiment, analysing its results, and writing a short report (e.g. 5 pages). Students must formulate their own research question and gather their own data, either by carrying out a (small) experiment or by using data from another course or from a third source. Think carefully about whether the data can answer the question, and which analysis method applies. You may wish to send a short proposal to the lecturer for comments before starting the experiment. The deadline is at the beginning of the next lecture block.