732A51 Bioinformatics
Course information
Autumn 2024
This is the course page for the Bioinformatics course.
The hyperlinks for 2024's material are being updated at the moment.
The first session is a lecture in R36 on Monday 2024-11-11, 10:15-12.
Course Content
The course is a general, statistically oriented introduction to various topics in Bioinformatics. More specifically, the course includes:
- Basics of molecular biology and genetics
- Hidden Markov models, genetic sequence analysis
- Sequence similarity, sequence alignment
- Phylogeny reconstruction
- Quantitative trait modelling (phylogenetic comparative methods)
- Microarray analysis
- Network biology
Course literature
Statistical Methods in Bioinformatics by W.J. Ewens, G.R. Grant (EG).
Concepts in Bioinformatics and Genomics by J. Momand, A. McCurdy (MM).
Recommended reading
Little Book of R for Bioinformatics by Avril Coghlan (C). The book can be found online here.
The lecture slides list the textbook sections that are recommended reading for each part of the course.
Additional reading
Bioinformatics Algorithms: Sequence Analysis, Genome Rearrangements, and Phylogenetic Reconstruction by E. Ohlebusch. The book can be found online here.
Course structure
The course contains 7 lectures, 5 computer labs and 2 exercise sessions.
All classes are planned to take place on campus.
Students are encouraged to work on their own computers. Most of the computer exercises will be done using R or online bioinformatics services.
The course contains three teaching activities:
- Lecture (Fö) Introduction of new concepts.
- Computer lab (DATALAB) Individual computer lab with individual help.
- Exercise session (Lektion) Presentation of solutions to mathematically oriented exercises.
Lectures
The following content will be presented in each lecture:
Lecture | Time and place | Slides | Code | Additional Reading | |
---|---|---|---|---|---|
1 | 2024-11-11 R36 10:15 | Basics of molecular biology and genetics (MM Ch. 1-3), slides and additional reading | ZIP | ||
2 | 2024-11-12 R35 13:15 | Hidden Markov models, genetic sequence analysis (EG Ch. 5,12; MM Ch. 12), slides and additional reading | ZIP | ZIP | |
3 | 2024-11-14 R35 08:15 | Sequence similarity, sequence alignment (EG Ch. 6,10; MM Ch. 4-6), slides and additional reading | ZIP | ||
4 | 2024-11-19 R35 13:15 | Phylogeny reconstruction (EG Ch. 14,15; MM Ch. 8), slides and additional reading | ZIP | ||
5 | 2024-11-26 R35 13:15 | Quantitative trait modelling (phylogenetic comparative methods), slides and additional reading | ZIP | ZIP | |
6 | 2024-12-03 R35 13:15 | Microarray analysis (EG Ch. 13; MM Ch. 10), slides and additional reading | PDF (low resolution), PDF (high resolution) | R | ZIP |
7 | 2024-12-10 R35 13:15 | Network biology (EG Ch. 13; MM Ch. 10), slides and additional reading | R | ZIP |
Assignments
The course contains 5 assignments, all of which are mandatory.
Attendance at the lab sessions is not obligatory, but it may be difficult to complete the labs without supervision, so attending these sessions is recommended.
Labs are to be done in groups. Groups will be set up during the first classes. Once groups are established, submissions will open in LISAM. Until then, the lab materials will be available under course documents in LISAM, and some labs can be downloaded from this webpage. Students must discuss their lab solutions within the group and compile a collaborative report showing the results and the code. Attention: there is a deadline for this report! The report should clearly state the names of the students who participated in its compilation. It should be submitted via LISAM as a PDF with accompanying R scripts (or, in case of problems, emailed to one of the responsible staff) before the report deadline.
The solution should contain all used code.
The file should be named Group X.pdf where X is the group number. Please also include your names in the report.
The collaborative reports are corrected and graded by the teacher. A student is PASSED on the lab if the group report is PASSED.
All group members have to contribute to, understand and be able to explain all aspects of the work. If some member(s) of a group do not contribute equally, this has to be reported; a formal group work contract will then be signed, stipulating the consequences of further unequal contributions.
If you miss the deadline for a lab solution, you must submit the solution anyway, and in this case some penalty assignments may also be given.
There is a second deadline of 23:59 2 February 2025 for submitting corrections for all the hand-ins.
There is a final deadline of 23:59 2 March 2025 for all the hand-ins. After this date NO submissions or corrections will be accepted.
Late submissions will result in penalty assignments!
Submission is done through LISAM or via email to lab assistants.
ALL submissions will be CHECKED through URKUND for plagiarism (also with respect to past labs)!
Assignment no. | Instructions | Material presented | Lab date | Deadline |
---|---|---|---|---|
1 | ZIP | 2024-11-11, 12 | 2024-11-15 | 2024-11-18 |
2 | ZIP | 2024-11-14, 19 | 2024-11-22 | 2024-11-25 |
3 | | 2024-11-26 | 2024-11-29 | 2024-12-02 |
4 | | 2024-12-03 | 2024-12-05 | 2024-12-09 |
5 | | 2024-12-10 | 2024-12-13 | 2024-12-16 |
Exercise sessions
The exercises are taken from:
[BE] M. Borodovsky, S. Ekisheva, 2006, Problems and Solutions in Biological Sequence Analysis, Cambridge University Press.
[EG] W. J. Ewens, G. R. Grant, 2005, Statistical Methods in Bioinformatics, 2nd ed., Springer.
[K] F. C. Klebaner, 2005, Introduction to Stochastic Calculus with Applications, Imperial College Press.
[L] A. M. Lesk, 2014, Introduction to Bioinformatics, Oxford University Press.
[MM] J. Momand, A. McCurdy, 2017, Concepts in Bioinformatics and Genomics, Oxford University Press.
Tuesday November 28 08:15-10 in R35
2 possible bonus points.
Exercises for session 1 can be found under this link.
BE: Problems 1.6, 1.7, 1.8, 1.9, 1.10, 1.11, 1.12, 1.14, 1.19, 3.10;
EG: Problems 5.5, 12.1;
MM: Problems 6, 7 (p. 330);
Tuesday December 12 08:15-10 in R35
2 possible bonus points.
Exercises for session 2 can be found under this link.
BE: Problems 2.9, 2.10, 2.18, 8.1;
EG: Problem 6.1;
K: Problems 3.4, 3.5, 3.8;
L: Exercises 5.1, 5.3, 5.14, 5.20;
MM: Problems 5.5, 8.7;
Exercise 12
Assume s < t. Calculate Cov[X(s), X(t)] where X is
(a) a Brownian motion,
(b) an Ornstein-Uhlenbeck process.
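A hand-derived covariance can be sanity-checked numerically before the session. The sketch below is only an illustration for the Brownian motion case, assuming a standard Brownian motion started at 0; the function names, the time points and the sample size are made up for this example. It draws many pairs of values at two time points and reports their empirical covariance, which can then be compared with the formula you derived.

```python
import random


def sample_brownian_pair(s, t, rng):
    """Draw (W(s), W(t)) for a standard Brownian motion with W(0) = 0 and s < t."""
    w_s = rng.gauss(0.0, s ** 0.5)              # W(s) ~ N(0, s)
    w_t = w_s + rng.gauss(0.0, (t - s) ** 0.5)  # independent increment ~ N(0, t - s)
    return w_s, w_t


def empirical_cov(pairs):
    """Unbiased sample covariance of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in pairs) / (n - 1)


# Hypothetical time points, chosen only for illustration.
s, t = 0.5, 2.0
rng = random.Random(1)  # fixed seed so the run is reproducible
pairs = [sample_brownian_pair(s, t, rng) for _ in range(50000)]
print("empirical Cov[W(s), W(t)]:", empirical_cov(pairs))
```

The same check could be adapted to the Ornstein-Uhlenbeck case by replacing the sampling step with the transition distribution of that process.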
Exercise 16
Let X(t) be an Ornstein-Uhlenbeck process. Using Itô's formula find the SDE representation of (X(t))^2.
Exercise 17
Find the most parsimonious internal node labels for the first tree of Exercise 5.20 (in L). Assume that at the tips the labels are
(a) A : 2, B : 2, C : 1, D : 1, E : 1, F : 2
(b) A : 1, B : 1, C : 1, D : 1, E : 2, F : 2
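One standard way to approach this kind of problem is Fitch's small-parsimony algorithm for rooted binary trees. The sketch below is a minimal illustration, not the solution to the exercise: the tree and tip states are hypothetical (they are not the tree from L), and the function name `fitch_labels` is invented here. It performs the bottom-up pass that builds candidate state sets and a top-down pass that picks one most parsimonious labelling; other equally parsimonious labellings may exist.

```python
def fitch_labels(tree, tip_states):
    """Return (labels, changes) for a rooted binary tree given as nested 2-tuples.

    Tips are strings; internal nodes are tuples (left, right).
    labels maps each internal node to one most parsimonious state,
    changes is the minimum number of state changes on the tree.
    """
    sets = {}
    changes = 0

    def up(node):
        # Bottom-up pass: intersect the child sets if possible, otherwise take the union.
        nonlocal changes
        if isinstance(node, str):
            return {tip_states[node]}
        left, right = up(node[0]), up(node[1])
        if left & right:
            sets[node] = left & right
        else:
            sets[node] = left | right
            changes += 1  # an empty intersection costs one change
        return sets[node]

    def down(node, parent_state, labels):
        # Top-down pass: keep the parent's state when it is allowed at this node.
        if isinstance(node, str):
            return
        state = parent_state if parent_state in sets[node] else min(sets[node])
        labels[node] = state
        down(node[0], state, labels)
        down(node[1], state, labels)

    up(tree)
    labels = {}
    down(tree, None, labels)
    return labels, changes


# Hypothetical four-tip example, unrelated to the tree in the exercise.
tree = (("t1", "t2"), ("t3", "t4"))
tip_states = {"t1": 1, "t2": 2, "t3": 2, "t4": 2}
labels, changes = fitch_labels(tree, tip_states)
print(labels, changes)
```

At the root, where there is no parent state, the sketch breaks ties by taking the smallest state; this is an arbitrary choice.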
Exercise 18
Prove the formula for the covariance between traits measured at two tips, Cov[X_1, X_2], under the Ornstein-Uhlenbeck model of evolution.
Active participation in the exercise sessions gives a maximum of 1 bonus point per session towards the exam. Active participation means that a student comes to the session prepared with all of that day's exercises (please hand in your solutions to me at the beginning of the session; if you did not solve them on paper, e-mail them to me before the session), correctly solves an exercise on the board, is able to answer questions about the presented solution, and is able to give help and comments on classmates' presented solutions. In the sessions, a student will be selected to present a solution for each exercise (how depends on the number of students).
Physical presence at the exercise sessions is a necessary condition to obtain the bonus points.
This is the same system as for the Computational Complexity exercise session in the Advanced R Programming course.
Computer exam
The exam will be a computer exam on 2025-01-15.
The examination has a maximum score of 20 points, with grade limits A: 18p, B: 16p, C: 14p, D: 12p, E: 10p. Some questions may require handwritten derivations; paper will be provided for these.
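The grade limits above can be read as a simple threshold lookup. The sketch below is only an illustration: the function name `grade` is made up, the label "F" for scores below the lowest limit is an assumption (this page does not name a failing grade), and the function takes a single total because the page does not spell out how bonus points are combined with the exam score.

```python
# Minimum number of points for each grade, as listed on this page.
GRADE_LIMITS = [("A", 18), ("B", 16), ("C", 14), ("D", 12), ("E", 10)]


def grade(points):
    """Return the letter grade for a total number of points."""
    if points < 0:
        raise ValueError("points must be non-negative")
    for letter, limit in GRADE_LIMITS:
        if points >= limit:
            return letter
    return "F"  # assumption: label for a score below the lowest limit


print(grade(15))
```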
The material that will be included with the exam can be downloaded from here.
Previous exams can be found here.
Four bonus points can be obtained in total from the exercise sessions (2 points per session).
Staff
- Krzysztof Bartoszek, lecturer
- Krzysztof Bartoszek, examiner
- Ying Luo, teaching assistant
Page responsible: Krzysztof Bartoszek
Last updated: 2024-08-22