
Text Mining


Status Archive
School Computer and Information Science (CIS)
Division STIMA
Owner Mattias Villani
Homepage http://www.ida.liu.se/~732A47/


Course plan

Number of lectures

2-3 preparatory lectures (Python, Linguistics, Statistics) + 5-6 lectures on text mining.

Recommended for

PhD students in Statistics, Computer Science and Cognitive Science.

The course was last given

Spring 2013


The course aims to show how textual data can be retrieved, linguistically pre-processed and subsequently analyzed quantitatively using formal statistical methods and models. The course brings together expertise from the areas of database methodology, computational linguistics and statistics.


Students entering the course should have been admitted to a master’s programme in Computer Science, Cognitive Science or Statistics, or a similar master’s programme. Advanced students in bachelor’s programmes in engineering may also be admitted to the course. In addition, the equivalent of 18 ECTS credits in Statistics and Computer Science is required, with at least 6 ECTS credits in each of the two subjects.


The course consists of lectures, lab exercises and a text mining project. The lectures are devoted to presentations of concepts and methods. The lab exercises are devoted to the practical application of text mining tools. In the project work, the student will get hands-on experience in solving a text mining problem.
Language of instruction: English.


The course proceeds in four stages:
* Introductory modules
- Introduction to Python programming
- Introduction to statistical modeling
- Introduction to computational linguistics
* Data models and information retrieval for textual data
* Statistical models for textual data
* Text mining project




Mattias Villani
Oleg Sysoev
Lars Ahrenberg
Fang Wei-Kleiner


Mattias Villani


Text mining project report. Written reports on lab assignments.




Page responsible: Anne Moe