# TDDD41 Data Mining - Clustering and Association Analysis

and

732A75 Advanced Data Mining

### Examples of exam question types

## Example exams

No solutions available, but you are welcome to solve the questions and send the solutions to your teachers for checking.

## Collection of example exam question types (not necessarily complete)

### Data mining

- What is the purpose of data mining?
- When are patterns interesting?
- Data in the real world can be dirty. Give reasons and examples.
- Describe a typical knowledge discovery process.
- Describe a typical architecture for a data mining system.

### Clustering

- Give examples of attributes of a specific type (interval-based, binary symmetric, binary asymmetric, categorical, ordinal, ...).
- Define distance measures for the different types of attributes.
- Compute the distance between two given data objects. The objects have the same attributes which may be of different types.
- Describe the principles and ideas regarding clustering algorithm X. Explain the different steps of the algorithm/Give the algorithm.
- Run (an iteration of) clustering algorithm X on a given data set and give partial results for each step.
- What are the main strengths and weaknesses of clustering algorithm X?
- Describe the graph representation of the clustering problem when using partitioning approaches and medoids. In general or given a specific data set. Define/exemplify swapping cost.
- PAM/CLARA/CLARANS: Show how PAM, CLARA, CLARANS work on the graph representation of the clustering problem. Discuss the differences between PAM, CLARA, CLARANS using the graph representation.
- BIRCH: Define/give examples of CF, CF tree.
- ROCK: Define/give examples of neighbor, common neighbor, link, Link, goodness measure.
- Chameleon: Define/give examples of k-nearest neighbor graph, edge cut, interconnectivity, closeness.
- DBSCAN/OPTICS: Define/give examples of directly density reachable, density reachable, density connected, core point, core distance, reachability distance.
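
As an illustration of the question type on computing the distance between two objects with attributes of different types, here is a minimal sketch. It assumes the common textbook scheme in which every attribute contributes a dissimilarity in [0, 1] and the result is the average over the attributes that count. The function name `mixed_distance`, the `schema` layout and the example attributes are hypothetical and not taken from the course material; the definitions used in the course may differ in detail.

```python
def mixed_distance(a, b, schema):
    """Average of per-attribute dissimilarities, each in [0, 1]."""
    total, count = 0.0, 0
    for name, spec in schema.items():
        x, y = a[name], b[name]
        kind = spec["kind"]
        if kind == "interval":
            # Normalise the absolute difference by the attribute's range.
            d = abs(x - y) / (spec["max"] - spec["min"])
        elif kind == "ordinal":
            # Map ranks to [0, 1], then treat like an interval attribute.
            levels = spec["levels"]
            d = abs(levels.index(x) - levels.index(y)) / (len(levels) - 1)
        elif kind == "binary_asymmetric":
            if x == 0 and y == 0:
                continue  # a 0/0 match is ignored for asymmetric attributes
            d = 0.0 if x == y else 1.0
        else:
            # Categorical and binary symmetric: simple mismatch.
            d = 0.0 if x == y else 1.0
        total += d
        count += 1
    return total / count if count else 0.0


# Hypothetical schema and objects, for illustration only.
schema = {
    "age": {"kind": "interval", "min": 20, "max": 60},
    "color": {"kind": "categorical"},
    "test": {"kind": "binary_asymmetric"},
    "grade": {"kind": "ordinal", "levels": ["low", "medium", "high"]},
}
p = {"age": 30, "color": "red", "test": 1, "grade": "low"}
q = {"age": 50, "color": "blue", "test": 0, "grade": "high"}

print(mixed_distance(p, q, schema))  # (0.5 + 1 + 1 + 1) / 4 = 0.875
```

Skipping the 0/0 case for the asymmetric binary attribute reflects the idea that a shared absence carries no information, so it affects neither the numerator nor the denominator.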

### Association analysis

- Given a transaction database, run the Apriori algorithm. Explain the execution step by step.
- Prove the correctness of the Apriori algorithm.
- Show what the Apriori property is and how you use it.
- Given a transaction database, run the Apriori algorithm with given constraints. Explain the execution step by step.
- Given a transaction database, run the FP Growth algorithm.
- Given a transaction database, run the FP Growth algorithm with given constraints.
- Give examples of different kinds of constraints. Give an example of a convertible monotone constraint that is not monotone. Give an example of a convertible antimonotone constraint that is not antimonotone.
- Discuss how to incorporate different kinds of constraints into the Apriori algorithm.
- Discuss how to incorporate different kinds of constraints into the FP Growth algorithm.
- Discuss advantages and disadvantages of the FP Growth algorithm w.r.t. the Apriori algorithm.
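
For the question types that ask you to run Apriori step by step, the following minimal sketch may help when checking a hand-computed answer. It assumes that minimum support is given as an absolute count and that no further constraints apply. The function name `apriori` and the small transaction database are hypothetical illustrations, not exam material.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items and keep the frequent ones.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        prev = list(current)
        candidates = set()
        # Join step: combine frequent (k-1)-itemsets into k-itemsets.
        for i in range(len(prev)):
            for j in range(i + 1, len(prev)):
                union = prev[i] | prev[j]
                if len(union) != k:
                    continue
                # Prune step (Apriori property): every (k-1)-subset
                # of a frequent k-itemset must itself be frequent.
                if all(frozenset(sub) in current
                       for sub in combinations(union, k - 1)):
                    candidates.add(union)
        # Count step: one scan over the database per level.
        counts = {c: sum(1 for t in transactions if c <= t)
                  for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical transaction database, minimum support count 2.
transactions = [
    {"A", "C", "D"},
    {"B", "C", "E"},
    {"A", "B", "C", "E"},
    {"B", "E"},
]
result = apriori(transactions, 2)
for itemset, support in sorted(result.items(),
                               key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), support)
```

On this database the candidate {A, B, C} is never counted, because its subset {A, B} is infrequent; this is the pruning that the Apriori property permits.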

Page responsible: Patrick Lambrix

Last updated: 2020-01-13