Machine Learning (3p)
Introduction and Application for Automated Performance Tuning
EXERCISES FOR SELF-ASSESSMENT - HT/2012
- The complexity of most learning algorithms depends on the size of the training set.
Suggest a filtering algorithm that eliminates redundant instances.
- The sum of squared differences is a popular error function, but it is
not robust against outliers; other error functions are possible.
Suggest an error function that is robust against outliers.
- Show that the VC dimension of a line (used as a linear classifier in the plane) is 3.
To be extended.
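As background for the error-function exercise above, the following minimal sketch contrasts the sum of squared differences with the Huber loss, one commonly used candidate among several. The function names and the threshold parameter `delta` are illustrative assumptions, not taken from the lecture, and the Huber loss carries its conventional factor 1/2.

```python
def squared_error(residuals):
    """Sum of squared differences; each residual contributes r^2."""
    return sum(r * r for r in residuals)


def huber_error(residuals, delta=1.0):
    """Huber loss: quadratic for |r| <= delta, linear beyond it.

    delta is an assumed tuning parameter (hypothetical default 1.0).
    """
    total = 0.0
    for r in residuals:
        a = abs(r)
        if a <= delta:
            total += 0.5 * a * a                # quadratic region
        else:
            total += delta * (a - 0.5 * delta)  # linear region limits outlier influence
    return total


# Two small residuals and one outlier.
residuals = [0.5, -0.5, 10.0]
print(squared_error(residuals))  # dominated by the outlier term 10^2
print(huber_error(residuals))    # the outlier contributes only linearly
```

With the squared error, the single outlier accounts for almost the entire sum; with the Huber loss its contribution grows only linearly in the residual.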
- In a univariate decision tree, numeric properties are often tested
by a binary split (comparison of a single variable to a threshold value).
Discuss the advantages and disadvantages of using ternary, quaternary,
etc. splits instead.
- Discuss advantages and disadvantages of using nonlinear split functions
in multivariate decision trees, compared to linear ones (as presented).
- Extend the greedy decision tree construction algorithm from the lecture
with backtracking so that it computes optimal decision trees for the given training set.
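As a starting point for the decision tree exercises above, this minimal sketch shows what a binary split on one numeric property can look like: candidate thresholds are the midpoints between adjacent distinct values, and the threshold with the lowest weighted impurity is chosen. The Gini impurity and the function names are assumptions made for illustration; the lecture may use a different impurity measure.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (assumed impurity measure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_binary_split(values, labels):
    """Return (threshold, weighted_impurity) for the test 'value <= threshold'.

    Returns (None, inf) if all values are equal and no split is possible.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, float("inf"))
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no threshold separates equal values
        threshold = (lo + hi) / 2.0
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best


print(best_binary_split([1.0, 2.0, 3.0, 4.0], ["a", "a", "b", "b"]))
```

A greedy tree builder would apply such a search at each node and never revisit the choice, which is the behavior the backtracking exercise asks to extend.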
This page is maintained by Christoph Kessler.