732A32 Data mining project

Course information

The aim of this course is that, after its completion, the student is able to

  • apply previously obtained knowledge in the field of data mining in a real setting,
  • plan, perform and report on an individual task, and
  • demonstrate insight into research and development work.
The course is worth 6 ECTS credits, so the workload corresponds to 8 weeks at half speed. The course consists of project work. The project should be chosen in cooperation with a supervisor and will generally be related to the supervisor's research. The work is performed individually, with the support and guidance of the supervisor.

Projects available (do not forget to submit your project proposal before you get started; read more about this in the Examination section):
  • Your own project.
  • Analysis of predictive power of data mining algorithms with embedded monotonicity constraints, supervised by Oleg Sysoev.
  • Various topics on probabilistic graphical models such as Bayesian networks and chain graphs, supervised by Jose M. Peña.
  • Interaction between databases and data mining, supervised by Jose M. Peña.
  • Detection of anomalies in data from Automatic Identification System (AIS) for shipping, supervised by Anders Grimvall.
  • Visualizing text with wordclouds using ggplot2, supervised by Måns Magnusson.

    One of the most common ways of visualizing text today is the wordcloud, i.e. a graph where the size of each word is proportional to the number of times it occurs in the corpus. A package for drawing wordclouds with R base graphics already exists. On the other hand, ggplot2 and the grammar of graphics are gaining interest in the area of visualization. This project is about creating an R package that draws wordclouds using a ggplot-style grammar of graphics, as well as studying whether the way text is visualized with wordclouds can be improved.

  • Analyzing the political topics of the Swedish Riksdag, supervised by Måns Magnusson.

    This project is about analyzing the textual data of the Swedish Riksdag using what are commonly called topic models. The purpose is to do some preliminary analysis of what types of topics are debated in the Swedish Riksdag and how these have changed over time. To get the data, you need to connect to the API of the Swedish Riksdag and download the textual data.

  • Literature survey on data mining and machine learning methods for big data, supervised by Niklas Carlsson.

Page responsible: Jose M. Peña
Last updated: 2015-09-14