
TDDD41 Data Mining - Clustering and Association Analysis
and
732A75 Advanced Data Mining

Examples of exam question types


Example exams

No solutions available, but you are welcome to solve the questions and send the solutions to your teachers for checking.

Collection of example exam question types (not necessarily complete).

Data mining

  • What is the purpose of data mining?
  • When are patterns interesting?
  • Data in the real world can be dirty. Give reasons and examples.
  • Describe a typical knowledge discovery process.
  • Describe a typical architecture for a data mining system.

Clustering

  • Give examples of attributes of a specific type (interval-based, binary symmetric, binary asymmetric, categorical, ordinal, ...).
  • Define distance measures for the different types of attributes.
  • Compute the distance between two given data objects. The objects have the same attributes which may be of different types.
  • Describe the principles and ideas regarding clustering algorithm X. Explain the different steps of the algorithm/Give the algorithm.
  • Run (an iteration of) clustering algorithm X on a given data set and give partial results for each step.
  • What are the main strengths and weaknesses of clustering algorithm X?
  • Describe the graph representation of the clustering problem when using partitioning approaches and medoids. In general or given a specific data set. Define/exemplify swapping cost.
  • PAM/CLARA/CLARANS: Show how PAM, CLARA, CLARANS work on the graph representation of the clustering problem. Discuss the differences between PAM, CLARA, CLARANS using the graph representation.
  • BIRCH: Define/give examples of CF, CF tree.
  • ROCK: Define/give examples of neighbor, common neighbor, link, Link, goodness measure.
  • Chameleon: Define/give examples of k-nearest neighbor graph, edge cut, interconnectivity, closeness.
  • DBSCAN/OPTICS: Define/give examples of directly density reachable, density reachable, density connected, core point, core distance, reachability distance.
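
As an illustration of the distance questions above, the following is a minimal sketch of one common way to combine per-attribute dissimilarities for objects with mixed attribute types. The function name `mixed_distance`, the `schema` layout, and the example attributes are hypothetical and chosen only for this sketch; the definitions used in the course may differ in detail (for example in how ordinal values are normalised).

```python
def mixed_distance(a, b, schema):
    """Average per-attribute dissimilarity between two objects given as dicts.

    Assumed conventions (not taken from the course material):
      - interval: |x - y| scaled by the attribute range (max > min)
      - ordinal: ranks mapped to [0, 1], then treated as interval
                 (needs at least two levels)
      - asymmetric_binary: a 0/0 match is ignored entirely
      - anything else (symmetric binary, categorical): 0 if equal, else 1
    """
    total, used = 0.0, 0
    for name, spec in schema.items():
        x, y = a[name], b[name]
        kind = spec["kind"]
        if kind == "interval":
            d = abs(x - y) / (spec["max"] - spec["min"])
        elif kind == "ordinal":
            levels = spec["levels"]
            m = len(levels)
            d = abs(levels.index(x) - levels.index(y)) / (m - 1)
        elif kind == "asymmetric_binary":
            if x == 0 and y == 0:
                continue  # negative match carries no information
            d = 0.0 if x == y else 1.0
        else:
            d = 0.0 if x == y else 1.0
        total += d
        used += 1
    # If no attribute contributed, report zero dissimilarity.
    return total / used if used else 0.0


# Hypothetical attributes and objects, for illustration only.
schema = {
    "age": {"kind": "interval", "min": 20, "max": 60},
    "colour": {"kind": "categorical"},
    "grade": {"kind": "ordinal", "levels": ["low", "medium", "high"]},
    "test_positive": {"kind": "asymmetric_binary"},
}
obj1 = {"age": 30, "colour": "red", "grade": "low", "test_positive": 0}
obj2 = {"age": 50, "colour": "red", "grade": "high", "test_positive": 0}
print(mixed_distance(obj1, obj2, schema))
```

Here the asymmetric binary attribute is skipped because both objects have value 0, so the result is the average over the remaining three attributes.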

Association analysis

  • Given a transaction database, run the Apriori algorithm. Explain the execution step by step.
  • Prove the correctness of the Apriori algorithm.
  • Show what the Apriori property is and how you use it.
  • Given a transaction database, run the Apriori algorithm with given constraints. Explain the execution step by step.
  • Given a transaction database, run the FP Growth algorithm.
  • Given a transaction database, run the FP Growth algorithm with given constraints.
  • Give examples of different kinds of constraints. Give an example of a convertible monotone constraint that is not monotone. Give an example of a convertible antimonotone constraint that is not antimonotone.
  • Discuss how to incorporate different kinds of constraints into the Apriori algorithm.
  • Discuss how to incorporate different kinds of constraints into the FP Growth algorithm.
  • Discuss advantages and disadvantages of the FP Growth algorithm w.r.t. the Apriori algorithm.
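
To practise the step-by-step Apriori questions above, a small reference sketch can be used to check hand-computed results. The version below is a simplified illustration, not the course's formulation: the function name `apriori`, the use of an absolute support count `min_support`, and the example transactions are assumptions made here. It generates candidates by joining frequent (k-1)-itemsets and prunes them with the Apriori property (every subset of a frequent itemset must itself be frequent).

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support count} for all frequent itemsets.

    min_support is an absolute count (an assumption of this sketch).
    """
    transactions = [frozenset(t) for t in transactions]

    # Frequent 1-itemsets.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        prev = list(current)
        candidates = set()
        # Join step: unions of two frequent (k-1)-itemsets that have size k.
        for i in range(len(prev)):
            for j in range(i + 1, len(prev)):
                union = prev[i] | prev[j]
                if len(union) != k:
                    continue
                # Prune step: all (k-1)-subsets must be frequent.
                if all(frozenset(sub) in current
                       for sub in combinations(union, k - 1)):
                    candidates.add(union)
        # Count the support of each surviving candidate with one pass
        # over the transactions per candidate.
        counts = {c: sum(1 for t in transactions if c <= t)
                  for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical transaction database, for illustration only.
db = [{"A", "C", "D"}, {"B", "C", "E"}, {"A", "B", "C", "E"}, {"B", "E"}]
result = apriori(db, min_support=2)
for itemset, support in sorted(result.items(),
                               key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), support)
```

The sketch favours readability over efficiency; it does not use a hash tree or any other optimisation for support counting.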

Page responsible: Patrick Lambrix
Last updated: 2020-01-13