732A47 Text Mining
Course information
Course sections
- Introductory modules
  - Introduction to Python Programming
  - Introduction to Statistical Modeling
  - Introduction to Computational Linguistics
- Data models and Information Retrieval for Textual Data
- Statistical Models for Textual Data
- Text Mining Project
You need to pass two out of the three introductory modules, and you are free to choose which module (if any) to skip.
Course literature
The following books will be used, in parts, during the course:
- Natural Language Processing with Python.
This book contains a lot of practical hands-on material using the NLTK toolkit for Python.
The book's website is here, where the book can be read for free in HTML format. The publisher O'Reilly also sells the book in PDF format.
- Foundations of Statistical Natural Language Processing.
This book describes the background theory for computational linguistics and statistical analysis of text data.
It is available electronically for free here (for LiU students, and probably also for students at most other Swedish universities).
The book's website is here.
- Extra material
Course Introduction
Slides
Introduction to Python Programming
Recommended literature
- Chapter 4 in Natural Language Processing with Python
- Chapters 1-13 in Learning to Program Using Python by Cody Jackson.
- Cheat sheet that translates between Matlab, R and Python commands.
- Interactive Python web tutorial
- Python code visualization lets you see what happens at each step of your code.
- Python tutorial from the official Python.org site.
Introduction to Statistical Modeling
Slides
Computer lab
Introduction to Computational Linguistics
Slides
Computer lab
Data models and Information Retrieval for Textual Data
Slides
Computer lab
Statistical Models for Textual Data
Slides
- Slides - 1 per page | Slides - 4 per page
Text Mining Project
Slides
Computer lab
Page responsible: Mattias Villani
Last updated: 2013-04-04
