weka.classifiers
Class Logistic

java.lang.Object
  |
  +--weka.classifiers.Classifier
        |
        +--weka.classifiers.DistributionClassifier
              |
              +--weka.classifiers.Logistic
All Implemented Interfaces:
Cloneable, OptionHandler, Serializable

public class Logistic
extends DistributionClassifier
implements OptionHandler

Class for building and using a two-class logistic regression model with a ridge estimator.

This class uses a globally convergent Newton's method adapted from Numerical Recipes in C. Reference: le Cessie, S. and van Houwelingen, J.C. (1992). Ridge Estimators in Logistic Regression. Applied Statistics, Vol. 41, No. 1, pp. 191-201.

Missing values are replaced using a ReplaceMissingValuesFilter, and nominal attributes are transformed into numeric attributes using a NominalToBinaryFilter.
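
The two preprocessing steps can be pictured with a small stand-alone sketch. It does not use the Weka filters themselves. It assumes missing values are marked with Double.NaN, that a missing numeric value is replaced by the column mean, and that a nominal value becomes one 0/1 indicator per category. The class and method names are hypothetical, and the actual filters may differ in detail (for example in how two-valued attributes are encoded).

```java
import java.util.Arrays;

// Illustrative sketch only; not the Weka filter implementations.
final class PreprocessingSketch {

    /** Returns a copy of the column with NaN entries replaced by the mean of the rest. */
    static double[] replaceMissingWithMean(double[] column) {
        double sum = 0.0;
        int count = 0;
        for (double v : column) {
            if (!Double.isNaN(v)) {
                sum += v;
                count++;
            }
        }
        double mean = count > 0 ? sum / count : 0.0;
        double[] out = column.clone();
        for (int i = 0; i < out.length; i++) {
            if (Double.isNaN(out[i])) {
                out[i] = mean;
            }
        }
        return out;
    }

    /** Encodes category valueIndex (out of numValues) as 0/1 indicator attributes. */
    static double[] toBinaryIndicators(int valueIndex, int numValues) {
        double[] out = new double[numValues];
        out[valueIndex] = 1.0;
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(
            replaceMissingWithMean(new double[] {1.0, Double.NaN, 3.0}))); // [1.0, 2.0, 3.0]
        System.out.println(Arrays.toString(toBinaryIndicators(1, 3)));     // [0.0, 1.0, 0.0]
    }
}
```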

Valid options are:

-D
Turn on debugging output.

Version:
$Revision: 1.12 $
Author:
Len Trigg (trigg@cs.waikato.ac.nz), Eibe Frank (eibe@cs.waikato.ac.nz), Tony Voyle (tv6@cs.waikato.ac.nz)
See Also:
Serialized Form

Field Summary
protected  int m_ClassIndex
          The index of the class attribute
protected  boolean m_Debug
          Debugging output
protected  double m_LL
          The log-likelihood of the built model
protected  double m_LLn
          The log-likelihood of the null model
protected  int m_NumPredictors
          The number of attributes in the model
protected  double[] m_Par
          The coefficients of the model
protected  double m_Ridge
          The ridge parameter.
 
Constructor Summary
Logistic()
           
 
Method Summary
 void buildClassifier(Instances train)
          Builds the classifier
protected  double calculateLogLikelihood(double[][] X, double[] Y, Matrix jacobian, double[] deltas)
          Calculates the log likelihood of the current set of coefficients (stored in m_Par), given the data.
 double[] distributionForInstance(Instance instance)
          Computes the distribution for a given instance
protected  double evaluateProbability(double[] instDat)
          Evaluate the probability for this point using the current coefficients
 boolean getDebug()
          Gets whether debugging output will be printed.
 String[] getOptions()
          Gets the current settings of the classifier.
 Enumeration listOptions()
          Returns an enumeration describing the available options
 void lnsrch(int n, double[] xold, double fold, double[] g, double[] p, double[] x, double stpmax, double[][] X, double[] Y)
          Finds a new point x in the direction p from a point xold at which the value of the function has decreased sufficiently.
static void main(String[] argv)
          Main method for testing this class.
protected static double Norm(double z)
          Returns probability.
 void setDebug(boolean debug)
          Sets whether debugging output will be printed.
 void setOptions(String[] options)
          Parses a given list of options.
 String toString()
          Gets a string describing the classifier.
 
Methods inherited from class weka.classifiers.DistributionClassifier
classifyInstance
 
Methods inherited from class weka.classifiers.Classifier
forName, makeCopies
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

m_LL

protected double m_LL
The log-likelihood of the built model


m_LLn

protected double m_LLn
The log-likelihood of the null model
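
As a point of comparison for m_LL, the following sketch computes a null-model log-likelihood from class counts alone. It assumes the null model is the intercept-only model, which predicts the observed class proportion for every instance. The page does not say whether the stored value is the plain log-likelihood or a scaled variant, so the names and the scaling here are illustrative only.

```java
// Illustrative sketch only; the class and method names are hypothetical.
final class NullModelLogLikelihoodSketch {

    /** Log-likelihood of an intercept-only two-class model for the given class counts. */
    static double nullLogLikelihood(int numClass0, int numClass1) {
        int n = numClass0 + numClass1;
        return term(numClass0, n) + term(numClass1, n);
    }

    private static double term(int count, int total) {
        // By convention, an empty class contributes nothing (0 * ln 0 is taken as 0).
        return count == 0 ? 0.0 : count * Math.log((double) count / total);
    }

    public static void main(String[] args) {
        // 30 instances of the first class, 70 of the second.
        System.out.println(nullLogLikelihood(30, 70));
    }
}
```

A fitted model whose log-likelihood is not clearly above this baseline explains little beyond the class proportions.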


m_Par

protected double[] m_Par
The coefficients of the model


m_NumPredictors

protected int m_NumPredictors
The number of attributes in the model


m_ClassIndex

protected int m_ClassIndex
The index of the class attribute


m_Ridge

protected double m_Ridge
The ridge parameter.


m_Debug

protected boolean m_Debug
Debugging output

Constructor Detail

Logistic

public Logistic()
Method Detail

lnsrch

public void lnsrch(int n,
                   double[] xold,
                   double fold,
                   double[] g,
                   double[] p,
                   double[] x,
                   double stpmax,
                   double[][] X,
                   double[] Y)
            throws Exception
Finds a new point x in the direction p from a point xold at which the value of the function has decreased sufficiently.

Parameters:
n - number of variables
xold - old point
fold - value at that point
g - gradient at that point
p - direction
x - new value along direction p from xold
stpmax - maximum step length
X - instance data
Y - class values
Throws:
Exception - if an error occurs
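
The idea of "decreased sufficiently" can be sketched with a simplified backtracking search. This is not the lnsrch routine itself: the Numerical Recipes version chooses step lengths by quadratic and cubic interpolation, whereas the sketch below just halves the step until an Armijo-style sufficient-decrease test passes. The objective is passed in as a function instead of the X and Y arrays, and the class name, the ALPHA constant, and the return value are assumptions made for illustration.

```java
import java.util.function.ToDoubleFunction;

// Simplified stand-in for a line search; step halving replaces interpolation.
final class BacktrackingLineSearchSketch {

    static final double ALPHA = 1e-4; // required fraction of the predicted decrease

    /** Writes the accepted point into x and returns the accepted step fraction. */
    static double search(double[] xold, double fold, double[] g, double[] p,
                         double[] x, double stpmax, ToDoubleFunction<double[]> f) {
        int n = xold.length;
        double norm = 0.0;
        for (double v : p) {
            norm += v * v;
        }
        norm = Math.sqrt(norm);
        // Shrink the direction if the full step would exceed stpmax.
        double scale = norm > stpmax ? stpmax / norm : 1.0;
        double slope = 0.0;
        for (int i = 0; i < n; i++) {
            slope += g[i] * p[i] * scale;
        }
        if (slope >= 0.0) {
            throw new IllegalArgumentException("p is not a descent direction");
        }
        double lambda = 1.0;
        while (lambda > 1e-10) {
            for (int i = 0; i < n; i++) {
                x[i] = xold[i] + lambda * scale * p[i];
            }
            if (f.applyAsDouble(x) <= fold + ALPHA * lambda * slope) {
                return lambda; // sufficient decrease reached
            }
            lambda *= 0.5;
        }
        System.arraycopy(xold, 0, x, 0, n); // no acceptable step found
        return 0.0;
    }

    public static void main(String[] args) {
        double[] xold = {1.0};
        double[] g = {2.0};   // gradient of f(x) = x^2 at x = 1
        double[] p = {-4.0};  // descent direction that overshoots the minimum
        double[] x = new double[1];
        double lambda = search(xold, 1.0, g, p, x, 100.0, v -> v[0] * v[0]);
        System.out.println(lambda + " " + x[0]); // prints "0.25 0.0"
    }
}
```

In the usage above, the full step and the half step both fail the test, so the search settles on a quarter step, which lands on the minimum of the toy objective.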

Norm

protected static double Norm(double z)
Returns probability.


evaluateProbability

protected double evaluateProbability(double[] instDat)
Evaluate the probability for this point using the current coefficients

Parameters:
instDat - the instance data
Returns:
the probability for this instance
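
For orientation, a logistic model turns a weighted sum of attribute values into a probability with the logistic (sigmoid) function. The sketch below shows that calculation in isolation. It assumes the coefficient array starts with an intercept and that the data row carries a matching leading 1.0; the layout actually used by m_Par is not stated on this page, so treat the names and the layout as hypothetical.

```java
// Illustrative sketch only; coefficient layout is an assumption.
final class LogisticProbabilitySketch {

    /** Logistic (sigmoid) function: maps any real value into (0, 1). */
    static double sigmoid(double v) {
        return 1.0 / (1.0 + Math.exp(-v));
    }

    /** Probability for one data row given a coefficient vector of the same length. */
    static double probability(double[] coefficients, double[] instDat) {
        double v = 0.0;
        for (int i = 0; i < coefficients.length; i++) {
            v += coefficients[i] * instDat[i];
        }
        return sigmoid(v);
    }

    public static void main(String[] args) {
        double[] coefficients = {-1.0, 2.0}; // hypothetical intercept and slope
        double[] instDat = {1.0, 0.5};       // leading 1.0 pairs with the intercept
        // The linear score is -1.0 + 2.0 * 0.5 = 0.0, so this prints 0.5.
        System.out.println(probability(coefficients, instDat));
    }
}
```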

calculateLogLikelihood

protected double calculateLogLikelihood(double[][] X,
                                        double[] Y,
                                        Matrix jacobian,
                                        double[] deltas)
Calculates the log likelihood of the current set of coefficients (stored in m_Par), given the data.

Parameters:
X - the instance data
Y - the class values for each instance
jacobian - the matrix which will contain the jacobian matrix after the method returns
deltas - an array which will contain the parameter adjustments after the method returns
Returns:
the log likelihood of the data.
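
To make the quantity concrete, the following sketch evaluates a ridge-penalized log-likelihood for a two-class logistic model. It is a plain illustration, not this method: it takes the coefficients as an argument, leaves out the jacobian and deltas outputs, and assumes the penalty is the ridge value times the sum of squared non-intercept coefficients. The sign and scaling conventions and whether the intercept is penalized are assumptions, as are all names.

```java
// Illustrative sketch only; penalty form and conventions are assumptions.
final class RidgeLogLikelihoodSketch {

    /** Probability of class 1 for one row; beta[0] is treated as the intercept. */
    static double probability(double[] beta, double[] row) {
        double v = 0.0;
        for (int j = 0; j < beta.length; j++) {
            v += beta[j] * row[j];
        }
        return 1.0 / (1.0 + Math.exp(-v));
    }

    /** Sum of per-instance log-probabilities minus ridge times squared coefficients. */
    static double penalizedLogLikelihood(double[][] X, double[] Y, double[] beta, double ridge) {
        double ll = 0.0;
        for (int i = 0; i < X.length; i++) {
            double p = probability(beta, X[i]);
            ll += (Y[i] == 1.0) ? Math.log(p) : Math.log(1.0 - p);
        }
        double penalty = 0.0;
        for (int j = 1; j < beta.length; j++) { // skip the intercept (assumption)
            penalty += beta[j] * beta[j];
        }
        return ll - ridge * penalty;
    }

    public static void main(String[] args) {
        double[][] X = {{1.0, 0.0}, {1.0, 1.0}}; // leading 1.0 is the intercept column
        double[] Y = {0.0, 1.0};
        double[] beta = {0.0, 1.0};
        System.out.println(penalizedLogLikelihood(X, Y, beta, 0.0)); // unpenalized
        System.out.println(penalizedLogLikelihood(X, Y, beta, 0.5)); // lower by 0.5
    }
}
```

A larger ridge value pulls the penalized value down for large coefficients, which is what discourages the fit from relying on extreme weights.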

listOptions

public Enumeration listOptions()
Returns an enumeration describing the available options

Specified by:
listOptions in interface OptionHandler
Returns:
an enumeration of all the available options

setOptions

public void setOptions(String[] options)
                throws Exception
Parses a given list of options. Valid options are:

-D
Turn on debugging output.

Specified by:
setOptions in interface OptionHandler
Parameters:
options - the list of options as an array of strings
Throws:
Exception - if an option is not supported

getOptions

public String[] getOptions()
Gets the current settings of the classifier.

Specified by:
getOptions in interface OptionHandler
Returns:
an array of strings suitable for passing to setOptions

setDebug

public void setDebug(boolean debug)
Sets whether debugging output will be printed.

Parameters:
debug - true if debugging output should be printed

getDebug

public boolean getDebug()
Gets whether debugging output will be printed.

Returns:
true if debugging output will be printed

buildClassifier

public void buildClassifier(Instances train)
                     throws Exception
Builds the classifier

Specified by:
buildClassifier in class Classifier
Parameters:
train - set of instances serving as training data
Throws:
Exception - if the classifier could not be built successfully

distributionForInstance

public double[] distributionForInstance(Instance instance)
                                 throws Exception
Computes the distribution for a given instance

Specified by:
distributionForInstance in class DistributionClassifier
Parameters:
instance - the instance for which distribution is computed
Returns:
the distribution
Throws:
Exception - if the distribution can't be computed successfully
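
For a two-class model, the returned distribution has one entry per class and the entries sum to 1. The sketch below shows how such an array could be formed from a single probability and how a class could then be picked from it. It assumes the probability refers to the class at index 1 and that the predicted class is the one with the largest entry; both are assumptions for illustration, and the names are hypothetical.

```java
// Illustrative sketch only; index convention is an assumption.
final class TwoClassDistributionSketch {

    /** Builds {P(class 0), P(class 1)} from the probability of class 1. */
    static double[] distribution(double probabilityOfClass1) {
        return new double[] {1.0 - probabilityOfClass1, probabilityOfClass1};
    }

    /** Index of the largest entry; ties go to the lower index. */
    static int mostLikelyClass(double[] dist) {
        int best = 0;
        for (int i = 1; i < dist.length; i++) {
            if (dist[i] > dist[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[] dist = distribution(0.8);
        System.out.println(mostLikelyClass(dist)); // prints 1
    }
}
```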

toString

public String toString()
Gets a string describing the classifier.

Overrides:
toString in class Object
Returns:
a string describing the classifier built.

main

public static void main(String[] argv)
Main method for testing this class.

Parameters:
argv - should contain the command line arguments to the scheme (see Evaluation)