Coding With Supervised Learners
The GSupervisedLearner class is declared in Learner.h. All classes that inherit from GSupervisedLearner must implement a method named
    void train(const GMatrix& features, const GMatrix& labels);
and one named
    void predict(const GVec& in, GVec& out);
As you might expect, the train method trains the model, and the predict method uses a trained model to make a prediction.
The train method expects two matrices to be passed in as parameters. The first parameter contains the features (or input patterns), and the second parameter contains the corresponding labels (or target outputs). These two matrices are expected to have the same number of rows.
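To make the train/predict contract concrete, here is a minimal sketch that does not use the library at all. The names Vec, Matrix, SupervisedLearnerSketch, and MeanBaseline are hypothetical stand-ins invented for this illustration; they are not GVec, GMatrix, or any class in Learner.h. The sketch only assumes what the text above states: train receives two matrices with the same number of rows, and predict maps one input vector to one output vector.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-ins for GVec and GMatrix (NOT the library's types).
using Vec = std::vector<double>;
using Matrix = std::vector<Vec>;

// Hypothetical interface that mirrors the two required methods.
class SupervisedLearnerSketch {
public:
    virtual ~SupervisedLearnerSketch() = default;
    virtual void train(const Matrix& features, const Matrix& labels) = 0;
    virtual void predict(const Vec& in, Vec& out) = 0;
};

// A deliberately trivial learner: it ignores the input pattern and
// always predicts the mean of the label vectors seen during training.
class MeanBaseline : public SupervisedLearnerSketch {
public:
    void train(const Matrix& features, const Matrix& labels) override {
        // The two matrices must have the same number of rows.
        if (features.size() != labels.size())
            throw std::invalid_argument("features and labels must have the same number of rows");
        if (labels.empty())
            throw std::invalid_argument("training data must not be empty");
        // Assumes every label row has the same width as the first one.
        m_mean.assign(labels[0].size(), 0.0);
        for (const Vec& row : labels)
            for (std::size_t j = 0; j < m_mean.size(); j++)
                m_mean[j] += row[j];
        for (double& v : m_mean)
            v /= static_cast<double>(labels.size());
    }

    void predict(const Vec& /*in*/, Vec& out) override {
        out = m_mean; // one value per label dimension
    }

private:
    Vec m_mean;
};
```

Any class written against this hypothetical interface can be trained and queried the same way, which is the point of requiring exactly these two methods.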
If your data is stored in one table that contains both features and labels, then you will need to divide it into two separate matrices before you call the train method. Here is an example that will load an ARFF file, swap the first column with the last one, then split it into a feature matrix and a label matrix. In this case, the last 2 columns will be used for the label matrix:
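    GMatrix data;
    data.loadArff("mydata.arff");
    // Column indexes are zero-based, so 0 is the first column.
    data.swapColumns(0, data.cols() - 1);
    // Use the last 2 columns for the label matrix.
    GDataColSplitter splitter(data, 2);
    GMatrix& features = splitter.features();
    GMatrix& labels = splitter.labels();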
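If it helps to see what the swap and the split do to the data, the following sketch reproduces both steps on a plain table of doubles. Matrix, SplitSketch, swapColumnsSketch, and splitLastColumns are hypothetical names made up for this illustration; the sketch says nothing about how GMatrix or GDataColSplitter are implemented.

```cpp
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-in for a table of values (NOT GMatrix).
using Matrix = std::vector<std::vector<double>>;

// Hypothetical result type holding the two halves of the table.
struct SplitSketch {
    Matrix features;
    Matrix labels;
};

// Swap columns a and b (zero-based) in every row.
void swapColumnsSketch(Matrix& data, std::size_t a, std::size_t b) {
    for (auto& row : data)
        std::swap(row.at(a), row.at(b)); // at() throws on a bad index
}

// Copy the last labelCols columns into labels and the rest into features.
SplitSketch splitLastColumns(const Matrix& data, std::size_t labelCols) {
    SplitSketch s;
    for (const auto& row : data) {
        if (labelCols > row.size())
            throw std::invalid_argument("more label columns than columns in the row");
        const auto cut = row.begin() + static_cast<std::ptrdiff_t>(row.size() - labelCols);
        s.features.emplace_back(row.begin(), cut);
        s.labels.emplace_back(cut, row.end());
    }
    return s;
}
```

For a row {1, 2, 3, 4}, swapping columns 0 and 3 gives {4, 2, 3, 1}, and splitting off the last 2 columns leaves features {4, 2} and labels {3, 1}. Both matrices keep the same number of rows, as the train method expects.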
Notice that you are not restricted to having one-dimensional labels. Our supervised learning algorithms can implicitly handle labels of arbitrary dimensionality. This is particularly convenient when you need to predict things like pixel colors (which are generally composed of 3 channel values), points in n-dimensional space, or control vectors for systems with several knobs and levers.
So, training a model is as simple as calling the train method.
    GDecisionTree model;
    model.train(features, labels);
or
    GKNN model;
    model.setNeighborCount(3);
    model.train(features, labels);
etc. For a full list of all of our supervised learning algorithms, take a look in the API docs at the class hierarchy. Expand GTransducer to show all the classes that inherit from it. Then, expand GSupervisedLearner (which inherits from GTransducer) to show all the classes that inherit from it. Also, expand GIncrementalLearner.
To make a prediction using a trained model, just pass one row of features in to the predict method, and the predicted label vector will come out. Example:
    GVec out(2);
    // Pass a single row (here, the first one), not the whole matrix.
    model.predict(features[0], out);
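The one-row-in, one-label-vector-out pattern can be sketched with a small nearest-neighbor learner. This is an illustration only: KnnSketch, Vec, and Matrix are hypothetical names, and the sketch (squared Euclidean distance, plain averaging of the neighbors' labels) is an assumption made for the example, not a description of how GKNN works.

```cpp
#include <algorithm>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-ins for GVec and GMatrix (NOT the library's types).
using Vec = std::vector<double>;
using Matrix = std::vector<Vec>;

// Hypothetical k-nearest-neighbor sketch: predicts the mean label
// vector of the k training rows closest to the input pattern.
class KnnSketch {
public:
    explicit KnnSketch(std::size_t k) : m_k(k) {
        if (k == 0)
            throw std::invalid_argument("k must be at least 1");
    }

    void train(const Matrix& features, const Matrix& labels) {
        if (features.empty() || features.size() != labels.size())
            throw std::invalid_argument("need equal, nonzero row counts");
        m_features = features; // this learner just memorizes the data
        m_labels = labels;
    }

    // Assumes `in` has the same width as the training feature rows.
    void predict(const Vec& in, Vec& out) const {
        if (m_features.empty())
            throw std::logic_error("predict called before train");
        // Pair each training row's squared distance with its index.
        std::vector<std::pair<double, std::size_t>> dist;
        dist.reserve(m_features.size());
        for (std::size_t i = 0; i < m_features.size(); i++) {
            double d = 0.0;
            for (std::size_t j = 0; j < in.size(); j++) {
                const double diff = m_features[i][j] - in[j];
                d += diff * diff;
            }
            dist.emplace_back(d, i);
        }
        // Move the k smallest distances to the front.
        const std::size_t k = std::min(m_k, dist.size());
        std::partial_sort(dist.begin(),
                          dist.begin() + static_cast<std::ptrdiff_t>(k),
                          dist.end());
        // Average the label vectors of those k neighbors.
        out.assign(m_labels[0].size(), 0.0);
        for (std::size_t n = 0; n < k; n++)
            for (std::size_t j = 0; j < out.size(); j++)
                out[j] += m_labels[dist[n].second][j];
        for (double& v : out)
            v /= static_cast<double>(k);
    }

private:
    std::size_t m_k;
    Matrix m_features;
    Matrix m_labels;
};
```

To predict for a whole matrix with a learner like this, you would loop over its rows and call predict once per row, since predict takes a single pattern vector.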
Note that some learning algorithms may not implicitly support all types of data. This problem can be solved by wrapping the learning algorithm in a filter. A filter is a class that converts the data to a suitable type before passing it to the learning algorithm. Perhaps the easiest filter to use is GAutoFilter. Example:
    GNaiveBayes model;
    GAutoFilter af(&model, false);
    af.train(features, labels);
    // It is okay if features and/or labels contains continuous values,
    // even though naive Bayes only supports categorical values. The
    // GAutoFilter class will take care of type conversions as needed.
    ...
    af.predict(pattern, prediction);
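The wrapping idea itself can be sketched without the library. In the example below, a filter converts continuous feature values into bucket indexes and then hands the converted data to an inner learner that only copes with a small set of distinct values. Everything here is hypothetical and invented for illustration (LearnerSketch, LookupLearner, DiscretizeFilterSketch, equal-width bucketing, features converted but labels passed through unchanged); it does not describe what GAutoFilter actually does internally.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <vector>

// Hypothetical stand-ins for GVec and GMatrix (NOT the library's types).
using Vec = std::vector<double>;
using Matrix = std::vector<Vec>;

// Hypothetical learner interface with the usual two methods.
struct LearnerSketch {
    virtual ~LearnerSketch() = default;
    virtual void train(const Matrix& features, const Matrix& labels) = 0;
    virtual void predict(const Vec& in, Vec& out) = 0;
};

// Hypothetical "categorical only" learner: memorizes one label vector
// per distinct feature row and fails on combinations it has not seen.
class LookupLearner : public LearnerSketch {
public:
    void train(const Matrix& features, const Matrix& labels) override {
        m_table.clear();
        for (std::size_t i = 0; i < features.size(); i++)
            m_table[features[i]] = labels[i];
    }
    void predict(const Vec& in, Vec& out) override {
        auto it = m_table.find(in);
        if (it == m_table.end())
            throw std::out_of_range("unseen category combination");
        out = it->second;
    }
private:
    std::map<Vec, Vec> m_table;
};

// Hypothetical filter: buckets each continuous feature into equal-width
// bins learned from the training data, then delegates to the inner learner.
class DiscretizeFilterSketch : public LearnerSketch {
public:
    DiscretizeFilterSketch(LearnerSketch* inner, std::size_t buckets)
        : m_inner(inner), m_buckets(buckets) {
        if (inner == nullptr || buckets == 0)
            throw std::invalid_argument("need a learner and at least 1 bucket");
    }
    void train(const Matrix& features, const Matrix& labels) override {
        if (features.empty() || features.size() != labels.size())
            throw std::invalid_argument("need equal, nonzero row counts");
        m_min = m_max = features[0]; // per-column range of the training data
        for (const Vec& row : features)
            for (std::size_t j = 0; j < row.size(); j++) {
                m_min[j] = std::min(m_min[j], row[j]);
                m_max[j] = std::max(m_max[j], row[j]);
            }
        Matrix converted;
        for (const Vec& row : features)
            converted.push_back(convert(row));
        m_inner->train(converted, labels);
    }
    void predict(const Vec& in, Vec& out) override {
        m_inner->predict(convert(in), out); // same conversion as in training
    }
private:
    Vec convert(const Vec& in) const {
        Vec out(in.size());
        for (std::size_t j = 0; j < in.size(); j++) {
            const double range = m_max[j] - m_min[j];
            double t = range > 0.0 ? (in[j] - m_min[j]) / range : 0.0;
            t = std::min(1.0, std::max(0.0, t)); // clamp out-of-range inputs
            const double top = static_cast<double>(m_buckets - 1);
            out[j] = std::min(top, std::floor(t * static_cast<double>(m_buckets)));
        }
        return out;
    }
    LearnerSketch* m_inner; // not owned by the filter in this sketch
    std::size_t m_buckets;
    Vec m_min, m_max;
};
```

The caller trains and predicts through the filter object rather than the inner learner, which is the same calling pattern as the af.train and af.predict calls above.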