Class weka.classifiers.MetaCost
java.lang.Object
   |
   +----weka.classifiers.Classifier
           |
           +----weka.classifiers.MetaCost
- public class MetaCost extends Classifier implements OptionHandler
This metaclassifier makes its base classifier cost-sensitive using the
method described in:
Pedro Domingos (1999). MetaCost: A general method for making classifiers
cost-sensitive, Proceedings of the Fifth International Conference on
Knowledge Discovery and Data Mining, pp. 155-164. Also available online at
http://www.cs.washington.edu/homes/pedrod/kdd99.ps.gz.
This classifier should produce similar results to one created by
passing the base learner to Bagging, which is in turn passed to a
CostSensitiveClassifier operating on minimum expected cost. The difference
is that MetaCost produces a single cost-sensitive classifier of the
base learner, giving the benefits of fast classification and interpretable
output (if the base learner itself is interpretable). This implementation
uses all bagging iterations when reclassifying training data (the MetaCost
paper reports a marginal improvement when only those iterations containing
each training instance are used in reclassifying that instance).
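The relabeling idea behind this description can be sketched in plain Java. This is a minimal illustration, not WEKA's actual code. It assumes the class probability estimates from each bagging iteration are already available as arrays, and that cost[actual][predicted] holds the cost of predicting "predicted" when the true class is "actual". The class and method names (MetaCostRelabelSketch, averageDistributions, minExpectedCostClass) are hypothetical.

```java
/** Hypothetical sketch of MetaCost-style relabeling; not part of WEKA. */
final class MetaCostRelabelSketch {

    /** Averages the class distributions predicted by all bagged models for one instance. */
    static double[] averageDistributions(double[][] perModelDistributions) {
        int numClasses = perModelDistributions[0].length;
        double[] mean = new double[numClasses];
        for (double[] dist : perModelDistributions) {
            for (int c = 0; c < numClasses; c++) {
                mean[c] += dist[c] / perModelDistributions.length;
            }
        }
        return mean;
    }

    /**
     * Returns the index of the class with the lowest expected cost.
     * Assumption: cost[actual][predicted] is the cost of predicting
     * "predicted" when the true class is "actual".
     */
    static int minExpectedCostClass(double[] classProbs, double[][] cost) {
        int best = 0;
        double bestCost = Double.POSITIVE_INFINITY;
        for (int predicted = 0; predicted < classProbs.length; predicted++) {
            double expected = 0.0;
            for (int actual = 0; actual < classProbs.length; actual++) {
                expected += classProbs[actual] * cost[actual][predicted];
            }
            if (expected < bestCost) {
                bestCost = expected;
                best = predicted;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Three bagged models give class distributions for one training instance.
        double[][] votes = { {0.8, 0.2}, {0.7, 0.3}, {0.6, 0.4} };
        // Missing a class-1 instance costs 5; a false alarm costs 1.
        double[][] cost = { {0.0, 1.0}, {5.0, 0.0} };
        double[] mean = averageDistributions(votes);
        System.out.println("New training label: " + minExpectedCostClass(mean, cost));
    }
}
```

With these numbers the averaged distribution is roughly (0.7, 0.3), yet the instance is relabeled as class 1, because predicting class 0 has the higher expected cost (1.5 versus 0.7). The single final classifier is then trained on the relabeled data.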
Valid options are:
-W classname
Specify the full class name of a classifier (required).
-C cost file
File name of a cost matrix to use. If this is not supplied, a cost
matrix will be loaded on demand. The name of the on-demand file
is the relation name of the training data plus ".cost", and the
path to the on-demand file is specified with the -D option.
-D directory
Name of a directory to search for cost files when loading costs on demand
(default current directory).
-I num
Set the number of bagging iterations (default 10).
-S seed
Random number seed used when reweighting by resampling (default 1).
-P num
Size of each bag, as a percentage of the training size (default 100).
Options after -- are passed to the designated classifier.
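As a rough illustration of how these flags fit together, the following sketch assembles an option array of the kind that setOptions accepts. The helper name buildOptions, the base learner name my.pkg.BaseLearner, and the base learner option -X are placeholders rather than real WEKA names. Only the flag letters come from the list above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper that assembles a MetaCost-style option array. */
final class MetaCostOptionsSketch {

    static String[] buildOptions(String baseClassifier, String costFile,
                                 int iterations, int seed, int bagPercent,
                                 String... baseOptions) {
        List<String> opts = new ArrayList<>();
        opts.add("-W");
        opts.add(baseClassifier);              // required: full class name of the base learner
        if (costFile != null) {                // leave out -C to load the cost matrix on demand
            opts.add("-C");
            opts.add(costFile);
        }
        opts.add("-I");
        opts.add(Integer.toString(iterations));
        opts.add("-S");
        opts.add(Integer.toString(seed));
        opts.add("-P");
        opts.add(Integer.toString(bagPercent));
        if (baseOptions.length > 0) {          // everything after -- goes to the base learner
            opts.add("--");
            opts.addAll(Arrays.asList(baseOptions));
        }
        return opts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // "my.pkg.BaseLearner" and "-X 3" are placeholders for illustration only.
        String[] opts = buildOptions("my.pkg.BaseLearner", "credit.cost", 10, 1, 100, "-X", "3");
        System.out.println(String.join(" ", opts));
    }
}
```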
- Author:
- Len Trigg (len@intelligenesis.net)
-
MATRIX_ON_DEMAND
-
-
MATRIX_SUPPLIED
-
-
TAGS_MATRIX_SOURCE
-
-
MetaCost()
-
-
buildClassifier(Instances)
- Builds the model of the base learner.
-
classifyInstance(Instance)
- Classifies a given test instance.
-
getBagSizePercent()
- Gets the size of each bag, as a percentage of the training set size.
-
getClassifier()
- Gets the distribution classifier used.
-
getCostMatrix()
- Gets the misclassification cost matrix.
-
getCostMatrixSource()
- Gets the source location method of the cost matrix.
-
getNumIterations()
- Gets the number of bagging iterations.
-
getOnDemandDirectory()
- Returns the directory that will be searched for cost files when
loading on demand.
-
getOptions()
- Gets the current settings of the Classifier.
-
getSeed()
- Gets the seed for resampling.
-
listOptions()
- Returns an enumeration describing the available options.
-
main(String[])
- Main method for testing this class.
-
setBagSizePercent(int)
- Sets the size of each bag, as a percentage of the training set size.
-
setClassifier(Classifier)
- Sets the distribution classifier.
-
setCostMatrix(CostMatrix)
- Sets the misclassification cost matrix.
-
setCostMatrixSource(SelectedTag)
- Sets the source location of the cost matrix.
-
setNumIterations(int)
- Sets the number of bagging iterations.
-
setOnDemandDirectory(File)
- Sets the directory that will be searched for cost files when
loading on demand.
-
setOptions(String[])
- Parses a given list of options.
-
setSeed(int)
- Sets the seed for resampling.
-
toString()
- Outputs a representation of this classifier.
MATRIX_ON_DEMAND
public static final int MATRIX_ON_DEMAND
MATRIX_SUPPLIED
public static final int MATRIX_SUPPLIED
TAGS_MATRIX_SOURCE
public static final Tag TAGS_MATRIX_SOURCE[]
MetaCost
public MetaCost()
listOptions
public Enumeration listOptions()
- Returns an enumeration describing the available options.
- Returns:
- an enumeration of all the available options
setOptions
public void setOptions(String options[]) throws Exception
- Parses a given list of options. Valid options are:
-W classname
Specify the full class name of a classifier (required).
-C cost file
File name of a cost matrix to use. If this is not supplied, a cost
matrix will be loaded on demand. The name of the on-demand file
is the relation name of the training data plus ".cost", and the
path to the on-demand file is specified with the -D option.
-D directory
Name of a directory to search for cost files when loading costs on demand
(default current directory).
-I num
Set the number of bagging iterations (default 10).
-S seed
Random number seed used when reweighting by resampling (default 1).
-P num
Size of each bag, as a percentage of the training size (default 100).
Options after -- are passed to the designated classifier.
- Parameters:
- options - the list of options as an array of strings
- Throws: Exception
- if an option is not supported
getOptions
public String[] getOptions()
- Gets the current settings of the Classifier.
- Returns:
- an array of strings suitable for passing to setOptions
getCostMatrixSource
public SelectedTag getCostMatrixSource()
- Gets the source location method of the cost matrix. Will be one of
MATRIX_ON_DEMAND or MATRIX_SUPPLIED.
- Returns:
- the cost matrix source.
setCostMatrixSource
public void setCostMatrixSource(SelectedTag newMethod)
- Sets the source location of the cost matrix. Values other than
MATRIX_ON_DEMAND or MATRIX_SUPPLIED will be ignored.
- Parameters:
- newMethod - the cost matrix location method.
getOnDemandDirectory
public File getOnDemandDirectory()
- Returns the directory that will be searched for cost files when
loading on demand.
- Returns:
- The cost file search directory.
setOnDemandDirectory
public void setOnDemandDirectory(File newDir)
- Sets the directory that will be searched for cost files when
loading on demand.
- Parameters:
- newDir - The cost file search directory.
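Going by the description of the -C and -D options, the on-demand cost file is named after the relation name of the training data with a ".cost" suffix and is looked up in this directory. A minimal sketch of that path construction follows. The class and method names are hypothetical, and the actual lookup inside MetaCost may differ in detail.

```java
import java.io.File;

/** Hypothetical sketch of how an on-demand cost file path could be derived. */
final class OnDemandCostFileSketch {

    /** Joins the search directory with "<relation name>.cost". */
    static File costFileFor(File onDemandDirectory, String relationName) {
        return new File(onDemandDirectory, relationName + ".cost");
    }

    public static void main(String[] args) {
        // The documented default search location is the current directory.
        File currentDir = new File(System.getProperty("user.dir"));
        System.out.println(costFileFor(currentDir, "credit"));
    }
}
```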
setClassifier
public void setClassifier(Classifier classifier)
- Sets the distribution classifier.
- Parameters:
- classifier - the distribution classifier with all options set.
getClassifier
public Classifier getClassifier()
- Gets the distribution classifier used.
- Returns:
- the classifier
getBagSizePercent
public int getBagSizePercent()
- Gets the size of each bag, as a percentage of the training set size.
- Returns:
- the bag size, as a percentage.
setBagSizePercent
public void setBagSizePercent(int newBagSizePercent)
- Sets the size of each bag, as a percentage of the training set size.
- Parameters:
- newBagSizePercent - the bag size, as a percentage.
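For illustration, a percentage setting could translate into a bag size as shown below. This sketch assumes integer arithmetic that rounds down. The exact rounding used by MetaCost is not stated in this documentation, and the names are hypothetical.

```java
/** Hypothetical sketch of turning a bag size percentage into an instance count. */
final class BagSizeSketch {

    /** Assumption: the result is truncated toward zero, as in integer division. */
    static int bagSize(int numTrainingInstances, int bagSizePercent) {
        if (numTrainingInstances < 0 || bagSizePercent < 0) {
            throw new IllegalArgumentException("sizes must be non-negative");
        }
        // Use long arithmetic so large training sets do not overflow.
        return (int) ((long) numTrainingInstances * bagSizePercent / 100);
    }

    public static void main(String[] args) {
        System.out.println(bagSize(150, 100)); // default setting: bag as large as the training set
        System.out.println(bagSize(150, 50));  // half-sized bags
    }
}
```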
setNumIterations
public void setNumIterations(int numIterations)
- Sets the number of bagging iterations.
- Parameters:
- numIterations - the number of bagging iterations
getNumIterations
public int getNumIterations()
- Gets the number of bagging iterations.
- Returns:
- the number of bagging iterations
getCostMatrix
public CostMatrix getCostMatrix()
- Gets the misclassification cost matrix.
- Returns:
- the cost matrix
setCostMatrix
public void setCostMatrix(CostMatrix newCostMatrix)
- Sets the misclassification cost matrix.
- Parameters:
- newCostMatrix - the cost matrix
setSeed
public void setSeed(int seed)
- Sets the seed for resampling.
- Parameters:
- seed - the seed for resampling
getSeed
public int getSeed()
- Gets the seed for resampling.
- Returns:
- the seed for resampling
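The role of the seed can be illustrated with a bootstrap sample drawn using java.util.Random: the same seed yields the same bag, which makes runs repeatable. This is a generic sketch of sampling with replacement, not WEKA's own resampling routine, and the names are hypothetical.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical sketch of seeded sampling with replacement. */
final class SeededBootstrapSketch {

    /** Draws bagSize instance indices in [0, numInstances) with replacement. */
    static int[] sampleIndices(int numInstances, int bagSize, int seed) {
        Random random = new Random(seed);
        int[] indices = new int[bagSize];
        for (int i = 0; i < bagSize; i++) {
            indices[i] = random.nextInt(numInstances);
        }
        return indices;
    }

    public static void main(String[] args) {
        int[] first = sampleIndices(10, 10, 1);
        int[] second = sampleIndices(10, 10, 1);
        // Identical seeds reproduce the same bag.
        System.out.println(Arrays.equals(first, second));
    }
}
```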
buildClassifier
public void buildClassifier(Instances data) throws Exception
- Builds the model of the base learner.
- Parameters:
- data - the training data
- Throws: Exception
- if the classifier could not be built successfully
- Overrides:
- buildClassifier in class Classifier
classifyInstance
public double classifyInstance(Instance instance) throws Exception
- Classifies a given test instance.
- Parameters:
- instance - the instance to be classified
- Throws: Exception
- if the instance could not be classified successfully
- Overrides:
- classifyInstance in class Classifier
toString
public String toString()
- Outputs a representation of this classifier.
- Overrides:
- toString in class Object
main
public static void main(String argv[])
- Main method for testing this class.
- Parameters:
- argv - should contain the following arguments:
-t training file [-T test file] [-c class index]