Biomedical Information - Knowledge Engineering and Text Mining

This course is organized within the framework of the Ph.D. Programme in Medical Bioinformatics and takes place at Karolinska Institutet, Stockholm, on November 17-18, 2008.

Open to students and researchers from academia and industry.

In recent years an enormous amount of biological data, such as DNA and protein sequences, gene regulatory networks and protein interaction networks, has been generated. This data is spread across a large number of autonomous data sources that are often publicly available on the Web, and numerous tools are available there as well. Researchers in various areas, e.g. medicine, agriculture and environmental sciences, use these data sources and tools for tasks such as developing drugs for the treatment of diseases, studying how mutations affect the functioning of different components in organisms, and investigating the influence of environmental factors on human health.

Because of the explosion in the amount of data and tools accessible online, it is becoming increasingly difficult for researchers to find the relevant sources and tools and to retrieve the relevant information. Further, information from different sources often needs to be integrated. The vision of a Semantic Web aims to alleviate these difficulties. The Semantic Web is an extension of the current Web in which information is given a well-defined meaning by annotating Web content with ontology terms.

In this course we discuss the vision of a Semantic Web for biomedical informatics, with a focus on the modeling, organization and management of biomedical data for improved access and search. We discuss two important technologies that are needed to realize this vision: knowledge engineering and text mining. Further, we illustrate these approaches with real cases from a pharmaceutical company and demonstrate different systems.


  • Prof. Patrick Lambrix, Linköpings universitet
  • Dr Lena Strömbäck, Linköpings universitet
  • Dr Jose M Pena, Linköpings universitet
  • Dr He Tan, Linköpings universitet
  • Dr Marcus Bjäreland, AstraZeneca

Practical Information

The course takes place at Karolinska Institutet, Solna, on November 17-18, 2008.

Course content (Preliminary titles)

  • A Semantic Web for Bioinformatics, Professor Patrick Lambrix
Knowledge Engineering
  • Biomedical Ontologies and Alignment of Biomedical Ontologies, Professor Patrick Lambrix
  • Standards for Molecular Interaction Data, Dr Lena Strömbäck
Data Mining
  • Probabilistic Graph Models for Gene Regulatory Networks, Dr Jose M Pena
Text Mining
  • Text Mining for Biomedicine, Dr He Tan
  • Applied Biomedical Text Mining at AstraZeneca, Dr Marcus Bjäreland