TDDD41 Data Mining - Clustering and Association Analysis
and
732A75 Advanced Data Mining
Course information
Course literature
- Jiawei Han, Micheline Kamber, Data Mining - Concepts and Techniques, 2nd edition, Morgan-Kaufmann, 2006. ISBN: 978-1-55860-901-3 (chapters 1, 2, 5, 7)
or
Jiawei Han, Micheline Kamber, Jian Pei, Data Mining - Concepts and Techniques, 3rd edition, Morgan-Kaufmann, 2011. ISBN: 978-0123814791 (chapters 1, 2, 3, 6, 10)
- Lab assignment descriptions
- Articles
- Clustering - Partitioning Methods
- Raymond T Ng, Jiawei Han. Efficient and Effective Clustering Methods for Spatial Data Mining, VLDB 94, 144-155, 1994. (CLARANS, also introduction to PAM and CLARA) NOTE: For the partitioning methods, use the algorithms from the slides and this paper, NOT the ones from the course book.
- Clustering - Hierarchical Methods
- Tian Zhang, Raghu Ramakrishnan, and Miron Livny. BIRCH: An efficient data clustering method for very large databases. SIGMOD 96, 103-114, 1996.
- Sudipto Guha, Rajeev Rastogi, and Kyuseok Shim. ROCK: A robust clustering algorithm for categorical attributes, Information Systems 25(5):345-366, 2000.
- George Karypis, Eui-Hong Han, and Vipin Kumar. CHAMELEON: A Hierarchical Clustering Algorithm Using Dynamic Modeling, COMPUTER 32(8): 68-75, 1999.
- Clustering - Density-Based Methods
- Mihael Ankerst, Markus M Breunig, Hans-Peter Kriegel, Jörg Sander. OPTICS: Ordering points to identify the clustering structure, SIGMOD 99, 49-60, 1999.
- Alexander Hinneburg, Daniel A. Keim. An Efficient Approach to Clustering in Large Multimedia Databases with Noise, KDD 98, 58-65, 1998. (DENCLUE)
- Association analysis - Apriori algorithm
- R. Agrawal and R. Srikant. Fast Algorithms for Mining Association Rules. In Proc. of the 20th Int. Conf. on Very Large Databases, 1994. Expanded version available as IBM Research Report RJ9839, 1994.
- Association analysis - FP-growth algorithm
- J. Han, J. Pei, and Y. Yin. Mining Frequent Patterns without Candidate Generation. In Proc. 2000 ACM-SIGMOD Int. Conf. on Management of Data, 2000.
- Association analysis - Constraints
- J. Pei and J. Han. Can We Push More Constraints into Frequent Pattern Mining? In Proc. 2000 Int. Conf. on Knowledge Discovery and Data Mining, 2000.
- Association analysis - Causal discovery
- C. Silverstein, S. Brin, R. Motwani, and J. Ullman. Scalable Techniques for Mining Causal Structures. Data Mining and Knowledge Discovery 4, 163-192 (2000). Shorter version available in Proc. of the 24th Int. Conf. on Very Large Databases, 1998.
Page responsible: Patrick Lambrix
Last updated: 2025-01-12