
Class weka.classifiers.ThresholdSelector

java.lang.Object
   |
   +----weka.classifiers.Classifier
           |
           +----weka.classifiers.DistributionClassifier
                   |
                   +----weka.classifiers.ThresholdSelector

public class ThresholdSelector
extends DistributionClassifier
implements OptionHandler
Class for selecting a threshold on a probability output by a distribution classifier. The threshold is set so that a given performance measure is optimized. Currently this is the F-measure. Performance is measured either on the training data, a hold-out set or using cross-validation. In addition, the probabilities returned by the base learner can have their range expanded so that the output probabilities will reside between 0 and 1 (this is useful if the scheme normally produces probabilities in a very narrow range).
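The selection idea described above can be sketched in isolation. The following is a minimal illustration, not WEKA's implementation: the class and method names are hypothetical, and it assumes the designated class is predicted whenever its probability is at or above the threshold, with every observed probability tried as a candidate.

```java
import java.util.Arrays;

/** Hypothetical helper (not part of WEKA): picks the threshold with the best F-measure. */
final class FMeasureThresholdSketch {

    /** F-measure = 2*TP / (2*TP + FP + FN); defined here as 0 when there are no true positives. */
    static double fMeasure(int tp, int fp, int fn) {
        if (tp == 0) {
            return 0.0;
        }
        return 2.0 * tp / (2.0 * tp + fp + fn);
    }

    /**
     * Tries every observed probability as a threshold (assumption: predict the
     * designated class when p >= threshold) and returns the best-scoring one.
     * Ties keep the lowest threshold.
     */
    static double bestThreshold(double[] probs, boolean[] isDesignated) {
        double[] candidates = probs.clone();
        Arrays.sort(candidates);
        double best = 0.5;   // fallback when there are no instances
        double bestF = -1.0;
        for (double t : candidates) {
            int tp = 0, fp = 0, fn = 0;
            for (int i = 0; i < probs.length; i++) {
                boolean predicted = probs[i] >= t;
                if (predicted && isDesignated[i]) {
                    tp++;
                } else if (predicted) {
                    fp++;
                } else if (isDesignated[i]) {
                    fn++;
                }
            }
            double f = fMeasure(tp, fp, fn);
            if (f > bestF) {
                bestF = f;
                best = t;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[] probs = {0.1, 0.4, 0.35, 0.8};
        boolean[] designated = {false, false, true, true};
        System.out.println("Selected threshold: " + bestThreshold(probs, designated));
    }
}
```

The probabilities passed in would come from whichever evaluation mode is chosen (training data, hold-out set, or cross-validation folds).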

Valid options are:

-C num
The class for which threshold is determined. Valid values are: 1, 2 (for first and second classes, respectively), 3 (for whichever class is least frequent), 4 (for whichever class value is most frequent), and 5 (for the first class named any of "yes","pos(itive)", "1", or method 3 if no matches). (default 5).

-W classname
Specify the full class name of the base classifier.

-X num
Number of folds used for cross validation. If just a hold-out set is used, this determines the size of the hold-out set (default 3).

-R integer
Sets whether confidence range correction is applied. This can be used to ensure the confidences range from 0 to 1. Use 0 for no range correction, 1 for correction based on the min/max values seen during threshold selection (default 0).

-S seed
Random number seed (default 1).

-E integer
Sets the evaluation mode. Use 0 for evaluation using cross-validation, 1 for evaluation using hold-out set, and 2 for evaluation on the training data (default 1).

Options after -- are passed to the designated sub-classifier.
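As an illustration of the -R 1 range correction listed above, the sketch below rescales a probability linearly using the smallest and largest values seen during threshold selection. The exact formula WEKA applies is not stated on this page, so the linear min/max mapping, the clamping, and all names here are assumptions.

```java
/** Hypothetical helper (not part of WEKA): min/max rescaling of confidences. */
final class RangeCorrectionSketch {

    /**
     * Maps p from the observed range [low, high] onto [0, 1].
     * Values outside the observed range are clamped (an assumption).
     * A degenerate range (high <= low) leaves p unchanged.
     */
    static double rescale(double p, double low, double high) {
        if (high <= low) {
            return p;
        }
        double scaled = (p - low) / (high - low);
        return Math.max(0.0, Math.min(1.0, scaled));
    }

    public static void main(String[] args) {
        // A scheme whose outputs all fall between 0.4 and 0.5.
        System.out.println(rescale(0.45, 0.4, 0.5));
    }
}
```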

Author:
Eibe Frank (eibe@cs.waikato.ac.nz)

Variable Index

 o EVAL_CROSS_VALIDATION
 o EVAL_TRAINING_SET
 o EVAL_TUNED_SPLIT
 o OPTIMIZE_0
 o OPTIMIZE_1
 o OPTIMIZE_LFREQ
 o OPTIMIZE_MFREQ
 o OPTIMIZE_POS_NAME
 o RANGE_BOUNDS
 o RANGE_NONE
 o TAGS_EVAL
 o TAGS_OPTIMIZE
 o TAGS_RANGE

Constructor Index

 o ThresholdSelector()

Method Index

 o buildClassifier(Instances)
Generates the classifier.
 o designatedClassTipText()
 o distributionClassifierTipText()
 o distributionForInstance(Instance)
Calculates the class membership probabilities for the given test instance.
 o evaluationModeTipText()
 o getDesignatedClass()
Gets the method to determine which class value to optimize.
 o getDistributionClassifier()
Get the DistributionClassifier used as the base classifier.
 o getEvaluationMode()
Gets the evaluation mode used.
 o getNumXValFolds()
Get the number of folds used for cross-validation.
 o getOptions()
Gets the current settings of the Classifier.
 o getRangeCorrection()
Gets the confidence range correction mode used.
 o getSeed()
Gets the random number seed.
 o globalInfo()
 o listOptions()
Returns an enumeration describing the available options.
 o main(String[])
Main method for testing this class.
 o numXValFoldsTipText()
 o rangeCorrectionTipText()
 o seedTipText()
 o setDesignatedClass(SelectedTag)
Sets the method to determine which class value to optimize.
 o setDistributionClassifier(DistributionClassifier)
Set the DistributionClassifier for which the threshold is set.
 o setEvaluationMode(SelectedTag)
Sets the evaluation mode used.
 o setNumXValFolds(int)
Set the number of folds used for cross-validation.
 o setOptions(String[])
Parses a given list of options.
 o setRangeCorrection(SelectedTag)
Sets the confidence range correction mode used.
 o setSeed(int)
Sets the seed for random number generation.
 o toString()
Returns a description of the cross-validated classifier.

Variables

 o RANGE_NONE
 public static final int RANGE_NONE
 o RANGE_BOUNDS
 public static final int RANGE_BOUNDS
 o TAGS_RANGE
 public static final Tag TAGS_RANGE[]
 o EVAL_TRAINING_SET
 public static final int EVAL_TRAINING_SET
 o EVAL_TUNED_SPLIT
 public static final int EVAL_TUNED_SPLIT
 o EVAL_CROSS_VALIDATION
 public static final int EVAL_CROSS_VALIDATION
 o TAGS_EVAL
 public static final Tag TAGS_EVAL[]
 o OPTIMIZE_0
 public static final int OPTIMIZE_0
 o OPTIMIZE_1
 public static final int OPTIMIZE_1
 o OPTIMIZE_LFREQ
 public static final int OPTIMIZE_LFREQ
 o OPTIMIZE_MFREQ
 public static final int OPTIMIZE_MFREQ
 o OPTIMIZE_POS_NAME
 public static final int OPTIMIZE_POS_NAME
 o TAGS_OPTIMIZE
 public static final Tag TAGS_OPTIMIZE[]

Constructors

 o ThresholdSelector
 public ThresholdSelector()

Methods

 o listOptions
 public Enumeration listOptions()
Returns an enumeration describing the available options.

Returns:
an enumeration of all the available options
 o setOptions
 public void setOptions(String options[]) throws Exception
Parses a given list of options. Valid options are:

-C num
The class for which threshold is determined. Valid values are: 1, 2 (for first and second classes, respectively), 3 (for whichever class is least frequent), 4 (for whichever class value is most frequent), and 5 (for the first class named any of "yes","pos(itive)", "1", or method 3 if no matches). (default 3).

-W classname
Specify the full class name of the base classifier on which threshold selection is performed.

-X num
Number of folds used for cross validation. If just a hold-out set is used, this determines the size of the hold-out set (default 3).

-R integer
Sets whether confidence range correction is applied. This can be used to ensure the confidences range from 0 to 1. Use 0 for no range correction, 1 for correction based on the min/max values seen during threshold selection (default 0).

-S seed
Random number seed (default 1).

-E integer
Sets the evaluation mode. Use 0 for evaluation using cross-validation, 1 for evaluation using hold-out set, and 2 for evaluation on the training data (default 1).

Options after -- are passed to the designated sub-classifier.

Parameters:
options - the list of options as an array of strings
Throws: Exception
if an option is not supported
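The option layout described above (flag/value pairs, with everything after -- reserved for the sub-classifier) can be sketched without WEKA on the classpath. This is an illustrative stand-in, not the parsing code used by setOptions; the class and method names are hypothetical, and only integer-valued flags are handled.

```java
import java.util.Arrays;

/** Hypothetical helper (not part of WEKA): reads integer flags and splits off sub-classifier options. */
final class OptionParsingSketch {

    /** Returns the integer following the flag, or fallback if the flag does not occur before "--". */
    static int intOption(String flag, String[] args, int fallback) {
        for (int i = 0; i < args.length - 1; i++) {
            if (args[i].equals("--")) {
                break; // everything after "--" belongs to the sub-classifier
            }
            if (args[i].equals(flag)) {
                return Integer.parseInt(args[i + 1]);
            }
        }
        return fallback;
    }

    /** Returns the options that follow "--", or an empty array if there are none. */
    static String[] subClassifierOptions(String[] args) {
        for (int i = 0; i < args.length; i++) {
            if (args[i].equals("--")) {
                return Arrays.copyOfRange(args, i + 1, args.length);
            }
        }
        return new String[0];
    }

    public static void main(String[] args) {
        String[] example = {"-X", "5", "-E", "0", "--", "-S", "7"};
        // Fallback values follow the defaults listed above for -X, -S and -E.
        System.out.println("folds = " + intOption("-X", example, 3));
        System.out.println("seed  = " + intOption("-S", example, 1));
        System.out.println("mode  = " + intOption("-E", example, 1));
        System.out.println("sub   = " + Arrays.toString(subClassifierOptions(example)));
    }
}
```

In this example the -S 7 after -- is left for the sub-classifier, so the selector's own seed keeps its default.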
 o getOptions
 public String[] getOptions()
Gets the current settings of the Classifier.

Returns:
an array of strings suitable for passing to setOptions
 o buildClassifier
 public void buildClassifier(Instances instances) throws Exception
Generates the classifier.

Parameters:
instances - set of instances serving as training data
Throws: Exception
if the classifier has not been generated successfully
Overrides:
buildClassifier in class Classifier
 o distributionForInstance
 public double[] distributionForInstance(Instance instance) throws Exception
Calculates the class membership probabilities for the given test instance.

Parameters:
instance - the instance to be classified
Returns:
predicted class probability distribution
Throws: Exception
if instance could not be classified successfully
Overrides:
distributionForInstance in class DistributionClassifier
 o globalInfo
 public String globalInfo()
Returns:
a description of the classifier suitable for displaying in the explorer/experimenter gui
 o designatedClassTipText
 public String designatedClassTipText()
Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui
 o getDesignatedClass
 public SelectedTag getDesignatedClass()
Gets the method to determine which class value to optimize. Will be one of OPTIMIZE_0, OPTIMIZE_1, OPTIMIZE_LFREQ, OPTIMIZE_MFREQ, OPTIMIZE_POS_NAME.

Returns:
the class selection mode.
 o setDesignatedClass
 public void setDesignatedClass(SelectedTag newMethod)
Sets the method to determine which class value to optimize. Will be one of OPTIMIZE_0, OPTIMIZE_1, OPTIMIZE_LFREQ, OPTIMIZE_MFREQ, OPTIMIZE_POS_NAME.

Parameters:
newMethod - the new class selection mode.
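To make the name-based selection mode concrete, the sketch below picks the first class whose name looks "positive" and otherwise falls back to the least frequent class, as described for option -C. It is not WEKA's code: the names are hypothetical, and the case-insensitive exact match against "yes", "pos", "positive" and "1" is an assumption about how "pos(itive)" is interpreted.

```java
import java.util.Locale;

/** Hypothetical helper (not part of WEKA): chooses the class value to optimize. */
final class DesignatedClassSketch {

    /** Index of the class with the smallest count; ties keep the earliest class. */
    static int leastFrequent(int[] counts) {
        int best = 0;
        for (int i = 1; i < counts.length; i++) {
            if (counts[i] < counts[best]) {
                best = i;
            }
        }
        return best;
    }

    /** First class named like a positive label (assumed matching rule), else the least frequent class. */
    static int byPositiveName(String[] names, int[] counts) {
        for (int i = 0; i < names.length; i++) {
            String n = names[i].toLowerCase(Locale.ROOT);
            if (n.equals("yes") || n.equals("pos") || n.equals("positive") || n.equals("1")) {
                return i;
            }
        }
        return leastFrequent(counts);
    }

    public static void main(String[] args) {
        System.out.println(byPositiveName(new String[] {"no", "yes"}, new int[] {10, 90}));
        System.out.println(byPositiveName(new String[] {"a", "b", "c"}, new int[] {5, 2, 9}));
    }
}
```

The returned values are zero-based class indices, whereas the -C option values start at 1.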
 o evaluationModeTipText
 public String evaluationModeTipText()
Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui
 o setEvaluationMode
 public void setEvaluationMode(SelectedTag newMethod)
Sets the evaluation mode used. Will be one of EVAL_TRAINING_SET, EVAL_TUNED_SPLIT, or EVAL_CROSS_VALIDATION.

Parameters:
newMethod - the new evaluation mode.
 o getEvaluationMode
 public SelectedTag getEvaluationMode()
Gets the evaluation mode used. Will be one of EVAL_TRAINING_SET, EVAL_TUNED_SPLIT, or EVAL_CROSS_VALIDATION.

Returns:
the evaluation mode.
 o rangeCorrectionTipText
 public String rangeCorrectionTipText()
Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui
 o setRangeCorrection
 public void setRangeCorrection(SelectedTag newMethod)
Sets the confidence range correction mode used. Will be one of RANGE_NONE or RANGE_BOUNDS.

Parameters:
newMethod - the new correction mode.
 o getRangeCorrection
 public SelectedTag getRangeCorrection()
Gets the confidence range correction mode used. Will be one of RANGE_NONE or RANGE_BOUNDS.

Returns:
the confidence correction mode.
 o seedTipText
 public String seedTipText()
Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui
 o setSeed
 public void setSeed(int seed)
Sets the seed for random number generation.

Parameters:
seed - the random number seed
 o getSeed
 public int getSeed()
Gets the random number seed.

Returns:
the random number seed
 o numXValFoldsTipText
 public String numXValFoldsTipText()
Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui
 o getNumXValFolds
 public int getNumXValFolds()
Get the number of folds used for cross-validation.

Returns:
the number of folds used for cross-validation.
 o setNumXValFolds
 public void setNumXValFolds(int newNumFolds)
Set the number of folds used for cross-validation.

Parameters:
newNumFolds - the number of folds used for cross-validation.
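How the fold count and the random seed might interact can be sketched as follows. This is only an illustration under stated assumptions, not WEKA's splitting code: it shuffles instance indices with the seed and cuts them into contiguous folds, and it assumes that a hold-out set corresponds to a single such fold (roughly 1/numFolds of the data). All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical helper (not part of WEKA): seeded shuffling and fold extraction. */
final class FoldSplitSketch {

    /** Returns the indices 0..n-1 in an order determined by the seed. */
    static List<Integer> shuffledIndices(int n, long seed) {
        List<Integer> idx = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            idx.add(i);
        }
        Collections.shuffle(idx, new Random(seed));
        return idx;
    }

    /** Returns the evaluation indices for the given fold; the remaining indices would be used for training. */
    static List<Integer> testFold(List<Integer> idx, int numFolds, int fold) {
        int n = idx.size();
        int start = fold * n / numFolds;
        int end = (fold + 1) * n / numFolds;
        return new ArrayList<>(idx.subList(start, end));
    }

    public static void main(String[] args) {
        List<Integer> idx = shuffledIndices(10, 1L);
        for (int fold = 0; fold < 3; fold++) {
            System.out.println("fold " + fold + ": " + testFold(idx, 3, fold));
        }
    }
}
```

Under this assumption, cross-validation evaluates on every fold in turn, while hold-out evaluation would use only one of them.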
 o distributionClassifierTipText
 public String distributionClassifierTipText()
Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui
 o setDistributionClassifier
 public void setDistributionClassifier(DistributionClassifier newClassifier)
Set the DistributionClassifier for which the threshold is set.

Parameters:
newClassifier - the Classifier to use.
 o getDistributionClassifier
 public DistributionClassifier getDistributionClassifier()
Get the DistributionClassifier used as the base classifier.

Returns:
the DistributionClassifier used as the base classifier
 o toString
 public String toString()
Returns a description of the cross-validated classifier.

Returns:
a description of the cross-validated classifier as a string
Overrides:
toString in class Object
 o main
 public static void main(String argv[])
Main method for testing this class.

Parameters:
argv - the options
