732A51 Bioinformatics

Course information


Autumn 2023

This is the course page for the course in Bioinformatics.

The first course occasion, a lecture, will be held in KY24 on Thursday 2023-11-09, 08:15-10.

Course Content


The course is a general, statistically oriented introduction to various topics in Bioinformatics. More specifically, the course includes:

  • Basics of molecular biology and genetics
  • Hidden Markov models, genetic sequence analysis
  • Sequence similarity, sequence alignment
  • Phylogeny reconstruction
  • Quantitative trait modelling (phylogenetic comparative methods)
  • Microarray analysis
  • Network biology

Course literature


  • Statistical Methods in Bioinformatics by W.J. Ewens, G.R. Grant (EG).

  • Concepts in Bioinformatics and Genomics by J. Momand, A. McCurdy (MM).

Recommended reading

Little Book of R for Bioinformatics by Avril Coghlan (C). The book can be found online here.

The lecture slides indicate which sections of the textbooks are recommended reading for each part of the course.

Course structure


The course contains 7 lectures, 5 computer labs and 2 exercise sessions.

All classes are planned (hopefully) to be held on campus.

Students are encouraged to work on their own computers. Most of the computer exercises will be done using R or online bioinformatics services.

The course contains three teaching activities:

  • Lecture (Fö): Introduction of new concepts.
  • Computer lab (DATALAB): Individual computer lab with individual help.
  • Exercise session (Lektion): Presentation of solutions to mathematically oriented exercises.

Lectures


The following content will be presented in each lecture:

Lecture Time and place Slides Code Additional Reading
1 2023-11-09 KY24 08:15 Basics of molecular biology and genetics (MM Ch. 1-3), slides and additional reading PDF ZIP
2 2023-11-13 U7 10:15 Hidden Markov models, genetic sequence analysis (EG Ch. 5,12; MM Ch. 12), slides and additional reading PDF ZIP ZIP
3 2023-11-20 U3 10:15 Sequence similarity, sequence alignment (EG Ch. 6,10; MM Ch. 4-6), slides and additional reading PDF ZIP
4 2023-11-21 U6 13:15 Phylogeny reconstruction (EG Ch. 14,15; MM Ch. 8), slides and additional reading PDF ZIP
5 2023-11-27 U7 10:15 Quantitative trait modelling (phylogenetic comparative methods), slides and additional reading PDF ZIP ZIP
6 2023-12-04 U6 10:15 Microarray analysis (EG Ch. 13; MM Ch. 10), slides and additional reading PDF (low resolution), PDF (high resolution) R ZIP
7 2023-12-11 U6 10:15 Network biology (EG Ch. 13; MM Ch. 10), slides and additional reading PDF ZIP

Assignments


The course contains 5 assignments, all of which are mandatory.

Attendance at the lab sessions is not obligatory, but it may be difficult to complete the labs without supervision, so attending these sessions is recommended.

Labs are to be done in groups. Groups will be set up during the first classes. Once the groups are established, submissions will open in LISAM. Until then, the lab materials will be available under course documents in LISAM, and some labs can be downloaded from this webpage. Students must discuss their lab solutions within the group and compile a collaborative report showing the results and the code. Attention: there is a deadline for this report! The report should clearly state the names of the students who participated in compiling it. It should be submitted via LISAM as a PDF with accompanying R scripts (or, in case of problems, emailed to one of the responsible staff) before the report deadline.

The solution should contain all used code.

The file should be named Group X.pdf where X is the group number. Please also include your names in the report.

The collaborative reports are corrected and graded by the teacher. A student is PASSED on the lab if the group report is PASSED.

All group members have to contribute to, understand and be able to explain all aspects of the work. If some member(s) of a group do not contribute equally, this has to be reported, and a formal group work contract will then be signed, stipulating the consequences of further unequal contributions.


If you miss the deadline for a lab solution, you must submit the solution anyway, and in this case some penalty assignments may also be given.
There is a second deadline of 23:59 31 January 2024 for submitting corrections for all the hand-ins.
There is a final deadline of 23:59 29 February 2024 for all the hand-ins. After this date NO submissions or corrections will be accepted.

Late submissions will result in penalty assignments!

Submission is done through LISAM or via email to lab assistants.

ALL submissions will be CHECKED through URKUND for plagiarism (also with respect to past labs)!

Assignment no.    Instructions    Material presented    Lab date    Deadline
1    ZIP    2023-11-09,13    2023-11-14    2023-11-19
2    ZIP    2023-11-20,21    2023-11-23    2023-11-26
3    PDF    2023-11-27    2023-12-01    2023-12-03
4    PDF    2023-12-04    2023-12-08    2023-12-10
5    PDF    2023-12-11    2023-12-15    2023-12-17

Exercise sessions


The exercises are taken from:
[BE] M. Borodovsky, S. Ekisheva, 2006, Problems and Solutions in Biological Sequence Analysis, Cambridge University Press.
[EG] W. J. Ewens, G. R. Grant, 2005, Statistical Methods in Bioinformatics, 2nd ed., Springer.
[K] F. C. Klebaner, 2005, Introduction to Stochastic Calculus with Applications, Imperial College Press.
[L] A. M. Lesk, 2014, Introduction to Bioinformatics, Oxford University Press.
[MM] J. Momand, A. McCurdy, 2017, Concepts in Bioinformatics and Genomics, Oxford University Press.


Tuesday November 28 13:15-15 in U3

Exercises for session 1 (also in LISAM):
BE: Problems 1.6, 1.7, 1.8, 1.9, 1.10, 1.11, 1.12, 1.14, 1.19, 3.10
EG: Problems 5.5, 12.1
MM: Problems 6, 7 (p. 330)

Tuesday December 12 13:15-15 in U3

Exercises for session 2 (also in LISAM):
BE: Problems 2.9, 2.10, 2.18, 8.1
EG: Problem 6.1
K: Problems 3.4, 3.5, 3.8
L: Exercises 5.1, 5.3, 5.14, 5.20
MM: Problems 5.5, 8.7

Exercise 12
Assume s < t. Calculate Cov[X(s), X(t)] when X is
(a) a Brownian motion,
(b) an Ornstein-Uhlenbeck process.
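
A derived formula for part (a) can be sanity-checked numerically. The following is only a minimal sketch, not part of the course material: it assumes a standard Brownian motion started at X(0) = 0 and uses the independent Gaussian increments to draw the pair (X(s), X(t)). The function names, the number of simulated paths and the seed are illustrative choices.

```python
import random


def simulate_pair(s, t, rng):
    """Draw (X(s), X(t)) for a standard Brownian motion with X(0) = 0 and s < t."""
    # X(s) ~ N(0, s); the increment X(t) - X(s) ~ N(0, t - s), independent of X(s).
    x_s = rng.gauss(0.0, s ** 0.5)
    x_t = x_s + rng.gauss(0.0, (t - s) ** 0.5)
    return x_s, x_t


def sample_covariance(pairs):
    """Unbiased sample covariance of a list of (a, b) pairs."""
    n = len(pairs)
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    return sum((a - mean_a) * (b - mean_b) for a, b in pairs) / (n - 1)


def estimate_cov(s, t, n_paths=20000, seed=1):
    """Monte Carlo estimate of Cov[X(s), X(t)] (n_paths and seed are arbitrary)."""
    rng = random.Random(seed)
    return sample_covariance([simulate_pair(s, t, rng) for _ in range(n_paths)])
```

Calling, for example, estimate_cov(1.0, 2.0) and comparing the result with the value given by your own formula gives a quick check; the estimate carries Monte Carlo error, so only approximate agreement should be expected.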

Exercise 16
Let X(t) be an Ornstein-Uhlenbeck process. Using Itô's formula find the SDE representation of (X(t))^2.

Exercise 17
Find the most parsimonious internal node labels for the first tree of Exercise 5.20 (in L). Assume that at the tips the labels are
(a) A : 2, B : 2, C : 1, D : 1, E : 1, F : 2
(b) A : 1, B : 1, C : 1, D : 1, E : 2, F : 2
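
One common way to find a most parsimonious labelling is Fitch's algorithm; whether that is the method intended in the session is an assumption. The sketch below runs it on a hypothetical rooted binary tree with hypothetical tip labels, not the tree of Exercise 5.20, and returns only one of possibly several optimal labellings.

```python
def fitch_up(node, tips):
    """Bottom-up pass: return (candidate_states, changes, children) for a node.

    A tip is a string naming a key of `tips`; an internal node is a pair of subtrees.
    """
    if isinstance(node, str):
        return ({tips[node]}, 0, ())
    left = fitch_up(node[0], tips)
    right = fitch_up(node[1], tips)
    changes = left[1] + right[1]
    common = left[0] & right[0]
    if common:
        # The children agree on at least one state: no extra change is needed.
        return (common, changes, (left, right))
    # The children disagree: keep all candidates and count one change.
    return (left[0] | right[0], changes + 1, (left, right))


def fitch_down(annotated, parent_state=None):
    """Top-down pass: pick one state per node, reusing the parent's state if allowed."""
    states, _, children = annotated
    state = parent_state if parent_state in states else min(states)
    if not children:
        return state
    return (state, fitch_down(children[0], state), fitch_down(children[1], state))


# Hypothetical example tree and tip labels (NOT those of the exercise).
tree = ((("A", "B"), ("C", "D")), ("E", "F"))
tips = {"A": 1, "B": 1, "C": 2, "D": 2, "E": 1, "F": 2}

annotated = fitch_up(tree, tips)
parsimony_score = annotated[1]   # minimum number of state changes on the tree
labels = fitch_down(annotated)   # nested (node_state, left_subtree, right_subtree)
```

Ties are broken here by taking the smallest state, which is an arbitrary choice; other tie-breaks can give different, equally parsimonious labellings.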

Exercise 18
Prove the formula for the covariance between traits measured at two tips, Cov[X_1, X_2], under the Ornstein-Uhlenbeck model of evolution.

Active participation in the exercise sessions gives a maximum of 1 bonus point per session towards the exam. Active participation means that a student comes to the session prepared with all of that day's exercises, correctly solves an exercise on the board, can answer questions about the presented solution, and can help with and comment on classmates' presented solutions. In each session, a student will be selected to present a solution to each exercise (how they are selected depends on the number of students).

Physical presence at the exercise sessions is a necessary condition for obtaining the bonus points.

This is the same system as for the Computational Complexity exercise session of the Advanced R Programming course.

Computer exam


The exam will be a computer exam on 2024-01-09.

The examination has max score 20 points and grade limits: A : 18p, B: 16p, C: 14p, D: 12p, E: 10p.
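
As an illustration of how the limits above translate into a grade, here is a minimal sketch. It assumes that each limit is an inclusive lower bound and that a score below 10 points is a fail, labelled "F" here; neither assumption is stated on this page, and the function name is illustrative.

```python
# Grade limits as listed above, from highest to lowest.
GRADE_LIMITS = [("A", 18), ("B", 16), ("C", 14), ("D", 12), ("E", 10)]


def grade(points):
    """Map an exam score (0-20) to a letter grade.

    Assumption: limits are inclusive lower bounds; below 10 points is a fail ("F").
    """
    for letter, limit in GRADE_LIMITS:
        if points >= limit:
            return letter
    return "F"
```

For example, grade(15) falls in the band starting at 14 points and is therefore a C under these assumptions.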

Some questions may require handwritten derivations; paper will be provided for these.

The material that will be included with the exam can be downloaded from here.

Previous exams can be found here.

Four bonus points can be obtained in total from the exercise sessions (2 points per session).

Staff


  • Krzysztof Bartoszek, lecturer and examiner
  • Hao Chi Kiang, teaching assistant

Page responsible: Krzysztof Bartoszek
Last updated: 2021-10-19