TDDE16 Text Mining

Course Information

This course website is no longer being maintained. Please refer to Lisam for HT2023.

Text Mining develops methods for accessing information in, and extracting knowledge from, large volumes of text. The overall aim of this course is to provide you with practical experience of the main steps of text mining: information retrieval, processing of text data, modelling, and analysis of experimental results. The course ends with an individual project where you work on a self-defined problem.

Intended learning outcomes

On completion of the course, you should be able to:

implement text mining methods and apply them to practical problems
analyse and summarise results from text mining experiments
identify, formulate and solve problems within the area of text mining
clearly present and discuss the conclusions of a project

For each intended learning outcome, there is a set of more specific knowledge requirements that express what you need to demonstrate in order to attain a particular grade. These knowledge requirements are listed on the Examination page.

Course content

The course covers the following content:

information retrieval
basic methods in language technology
predictive modelling, in particular text classification
text clustering and topic modelling
information extraction
validation methods

Teaching and working methods

The course is taught in the form of lectures, lab sessions, and supervision in connection with an individual project. You are also expected to study independently, both individually and in groups. When you plan your time for the course, you should budget approximately:

40 hours to prepare for, watch and follow up on the video lectures
40 hours to prepare for, carry out and follow up on the labs
80 hours to plan, carry out and document the project

You are entitled to individual project supervision during the study period you are registered for.

The course is co-taught with 732A81 Text Mining on the Master’s programme in Statistics and Machine Learning.

Course literature

The reading for this course consists of excerpts from the following books, as well as research articles.

Daniel Jurafsky and James H. Martin. Speech and Language Processing. An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Draft chapters of 3rd edition, December 2021.
Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze. Introduction to Information Retrieval. Cambridge University Press, 2008. The complete book is available online.
ChengXiang Zhai and Sean Massung. Text Data Management and Analysis. A Practical Introduction to Information Retrieval and Text Mining. Morgan & Claypool, 2016. We will only use Chapter 1, which is available online.

Feedback policy

What you can expect from us. We try our best to give you prompt, constructive, and meaningful feedback on how well you meet the knowledge requirements set out for the course. We offer feedback in various forms; you can find the details on the Examination page. Our focus is on formative feedback, which you can use to improve your learning (and we can use to improve our teaching!) while the course is ongoing.

What we expect from you. We expect you to familiarise yourself with the knowledge requirements set out for the course and to actively seek our feedback on how well you meet these requirements. We also expect you to reflect on the feedback we provide and grasp opportunities to put it to good use.

Communication policy

What we expect from you. This website is the primary source of information about the course, and we expect you to keep up to date with what we publish here. We also send out information via the University’s email list for the course and the class team on Microsoft Teams, and we expect you to read these channels regularly while the course is ongoing.

What you can expect from us. When you contact us via email or chat, you can expect an answer during standard working hours, 8–17. (We do not respond to email or chat in the evenings or on weekends.) For more personal contact, you can talk to the examiner in class or book an appointment.

Special needs

Accessibility. If there is any portion of the course that is not accessible to you due to challenges with technology or the course format, please let the examiner know so we can make appropriate accommodations.

Students with disabilities. If you have a documented disability, you should contact the examiner as soon as possible regarding accommodations. To do so, book an appointment with the examiner.

Page responsible: Marco Kuhlmann
Last updated: 2022-10-26

IDA - Department of Computer and Information Science