732A92 Text Mining
Intended learning outcomes
On completion of the course, you should be able to:
- use basic methods for information extraction and retrieval of textual data
- apply text processing techniques to prepare documents for statistical modelling
- apply relevant machine learning models for analyzing textual data and correctly interpret the results
- use machine learning models for text prediction
- evaluate the performance of machine learning models for textual data
For each learning outcome, there is a set of more specific knowledge requirements that express what you need to demonstrate in order to attain a particular grade. These knowledge requirements are listed on the Examination page.
The course covers the following content:
- information retrieval
- document classification
- document clustering
- natural language processing
- information extraction
Teaching and working methods
The course is taught in the form of lectures, lab sessions, and supervision in connection with an individual project. You are also expected to study independently, both individually and in groups. When you plan your time for the course, you should budget approximately:
- 42 hrs to prepare for, attend, and follow up on the lectures
- 30 hrs to prepare for, carry out, and follow up on the labs
- 88 hrs to plan, carry out, and document the project
The course is co-taught with TDDE16 Text Mining at the Faculty of Science and Engineering.
The reading for this course consists of excerpts from the following books, as well as research articles.
Daniel Jurafsky and James H. Martin. Speech and Language Processing. An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Draft chapters of 3rd edition, October 2019.
Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze. Introduction to Information Retrieval. Cambridge University Press, 2008. The complete book is available online.
ChengXiang Zhai and Sean Massung. Text Data Management and Analysis. A Practical Introduction to Information Retrieval and Text Mining. Morgan & Claypool, 2016.
What you can expect from us. We try our best to give you prompt, constructive, and meaningful feedback on how well you meet the knowledge requirements set out for the course. We offer feedback in various forms; you can find detailed information about this on the Examination page. Our focus is on non-examinatory, formative feedback, which you can use to improve your learning (and we can use to improve our teaching!) while the course is ongoing.
What we expect from you. We expect you to familiarize yourself with the knowledge requirements set out for the course, and to actively seek our feedback on how well you meet these requirements. We also expect you to reflect on the feedback that we provide, and to grasp opportunities to put it to good use.
What we expect from you. This website is the primary source of information about the course, and we expect you to keep yourself up to date with what we publish here. We also send out information via the University’s email list for the course, and we expect you to read email from this list regularly while the course is ongoing.
What you can expect from us. When you contact us via email, you can expect an answer during standard working hours, 8–17. (We do not respond to email in the evening or on weekends.) For more personal contact, you can book an appointment with the examiner (via Doodle). During the 2020 session, the course staff uses Microsoft Teams instead of physical meetings.
Page responsible: Marco Kuhlmann
Last updated: 2020-11-02