Looking around

This document highlights some of the more important tools that you will find in this library.

One of the most common operations in machine learning is optimization. The GOptimizer class provides a base class for optimization techniques. Classes that inherit from GOptimizer include GBruteForceSearch, GAnnealing, GEvolutionaryOptimizer, GHillClimber, GParticleSwarm, and GRandomSearch, among others. To use these classes for optimization, you must create a class that inherits from GTargetFunction. The job of your target function is to evaluate the error associated with a candidate vector. You can then plug in any of these optimization techniques to search for a vector that has low error for your particular task. Since some optimizers are better suited to some tasks than others, you might as well try several and go with the one that works best. (In my experience, evolutionary optimization is used in more applications than it deserves, and hill climbing often produces better results than many people expect. Of course, your mileage may vary, depending on the application.)
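To make the division of labor between a target function and an optimizer concrete, here is a minimal standalone sketch in plain C++. It does not use the library's classes: `TargetFunction`, `hillClimb`, and `exampleError` are hypothetical names invented for this illustration, and the climbing strategy is a simple coordinate-wise one that may differ from what GHillClimber actually does.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <initializer_list>
#include <vector>

// Hypothetical stand-in for a target function (not the library's GTargetFunction):
// it maps a candidate vector to an error value, where lower is better.
using TargetFunction = std::function<double(const std::vector<double>&)>;

// Illustrative coordinate-wise hill climber. For each dimension it tries a step
// up and a step down, keeps the first move that lowers the error, and halves
// the step size whenever a full pass over all dimensions finds no improvement.
std::vector<double> hillClimb(const TargetFunction& target,
                              std::vector<double> candidate,
                              double step = 1.0,
                              double minStep = 1e-9)
{
    assert(!candidate.empty());
    double bestError = target(candidate);
    while (step > minStep) {
        bool improved = false;
        for (std::size_t i = 0; i < candidate.size(); ++i) {
            const double original = candidate[i];
            for (double delta : {step, -step}) {
                candidate[i] = original + delta;
                const double error = target(candidate);
                if (error < bestError) {
                    bestError = error;   // keep the improving move
                    improved = true;
                    break;
                }
                candidate[i] = original; // undo the move
            }
        }
        if (!improved)
            step *= 0.5;
    }
    return candidate;
}

// Example target: squared distance from the point (3, -2).
double exampleError(const std::vector<double>& v)
{
    const double dx = v[0] - 3.0;
    const double dy = v[1] + 2.0;
    return dx * dx + dy * dy;
}
```

Calling `hillClimb(exampleError, {0.0, 0.0})` should return a vector close to (3, -2). Because the optimizer only ever sees the target through its error value, swapping in a different search strategy requires no change to the target function.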

Clustering algorithms inherit from the GClusterer class, which inherits from GTransform. These include GAgglomerativeClusterer, GKMeans, and GKMedoids. These classes all take a matrix as input, and return a cluster id for each row in the matrix.
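The "matrix in, one id per row out" contract can be sketched with a bare-bones k-means (Lloyd's algorithm) in plain C++. This is not the library's GKMeans: the names `Row`, `Matrix`, `squaredDistance`, and `kMeans` are hypothetical, and seeding the centroids with the first k rows is a simplification I assume here to keep the example deterministic.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

using Row = std::vector<double>;
using Matrix = std::vector<Row>;

// Squared Euclidean distance between two rows of equal length.
double squaredDistance(const Row& a, const Row& b)
{
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double diff = a[i] - b[i];
        sum += diff * diff;
    }
    return sum;
}

// Returns one cluster id (0 .. k-1) for each row of `data`.
// Assumption for this sketch: the first k rows serve as the initial centroids.
std::vector<std::size_t> kMeans(const Matrix& data, std::size_t k,
                                std::size_t maxIterations = 100)
{
    assert(k > 0 && data.size() >= k);
    const std::size_t dims = data[0].size();
    Matrix centroids(data.begin(), data.begin() + static_cast<std::ptrdiff_t>(k));
    std::vector<std::size_t> labels(data.size(), 0);

    for (std::size_t iter = 0; iter < maxIterations; ++iter) {
        // Assignment step: attach each row to its nearest centroid.
        bool changed = false;
        for (std::size_t r = 0; r < data.size(); ++r) {
            std::size_t best = 0;
            double bestDist = std::numeric_limits<double>::max();
            for (std::size_t c = 0; c < k; ++c) {
                const double d = squaredDistance(data[r], centroids[c]);
                if (d < bestDist) {
                    bestDist = d;
                    best = c;
                }
            }
            if (labels[r] != best) {
                labels[r] = best;
                changed = true;
            }
        }
        if (!changed && iter > 0)
            break; // assignments are stable, so the centroids are too

        // Update step: move each centroid to the mean of its assigned rows.
        Matrix sums(k, Row(dims, 0.0));
        std::vector<std::size_t> counts(k, 0);
        for (std::size_t r = 0; r < data.size(); ++r) {
            ++counts[labels[r]];
            for (std::size_t d = 0; d < dims; ++d)
                sums[labels[r]][d] += data[r][d];
        }
        for (std::size_t c = 0; c < k; ++c) {
            if (counts[c] == 0)
                continue; // leave an empty cluster's centroid where it is
            for (std::size_t d = 0; d < dims; ++d)
                centroids[c][d] = sums[c][d] / static_cast<double>(counts[c]);
        }
    }
    return labels;
}
```

With two well-separated groups of points and `k = 2`, rows from the same group should receive the same id.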

Nonlinear dimensionality reduction algorithms inherit from GManifoldLearner, which also inherits from GTransform. These include GIsomap, GLLE, GManifoldSculpting, and a few others. These methods take a matrix as input, and return a corresponding matrix with fewer columns.

Some other particularly useful classes include GPCA, which implements principal component analysis, and GAttributeSelector, which removes the least salient attributes one at a time until only the attributes that are most salient for predicting the labels remain.
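For readers who have not met principal component analysis before, the following standalone sketch finds only the first principal component, using power iteration on the scatter matrix. It is an assumption-laden illustration, not how GPCA is implemented: `firstPrincipalComponent` is a hypothetical name, and a full PCA would go on to extract further components.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Returns a unit vector pointing along the direction of greatest variance
// in `data` (one row per sample). If the data has no variance at all, the
// unnormalized all-ones start vector is returned unchanged.
std::vector<double> firstPrincipalComponent(
    const std::vector<std::vector<double>>& data,
    std::size_t iterations = 100)
{
    assert(!data.empty() && !data[0].empty());
    const std::size_t dims = data[0].size();
    const double rows = static_cast<double>(data.size());

    // Column means, used to center the data.
    std::vector<double> mean(dims, 0.0);
    for (const auto& row : data)
        for (std::size_t d = 0; d < dims; ++d)
            mean[d] += row[d] / rows;

    // Scatter matrix of the centered data. It is proportional to the
    // covariance matrix, so it has the same principal directions.
    std::vector<std::vector<double>> scatter(dims, std::vector<double>(dims, 0.0));
    for (const auto& row : data)
        for (std::size_t i = 0; i < dims; ++i)
            for (std::size_t j = 0; j < dims; ++j)
                scatter[i][j] += (row[i] - mean[i]) * (row[j] - mean[j]);

    // Power iteration: repeatedly multiply by the scatter matrix and
    // renormalize; the vector converges toward the dominant eigenvector.
    std::vector<double> v(dims, 1.0);
    for (std::size_t it = 0; it < iterations; ++it) {
        std::vector<double> next(dims, 0.0);
        for (std::size_t i = 0; i < dims; ++i)
            for (std::size_t j = 0; j < dims; ++j)
                next[i] += scatter[i][j] * v[j];
        double norm = 0.0;
        for (double x : next)
            norm += x * x;
        norm = std::sqrt(norm);
        if (norm == 0.0)
            break;
        for (std::size_t i = 0; i < dims; ++i)
            v[i] = next[i] / norm;
    }
    return v;
}
```

For points lying on the line y = 2x, the returned vector should be proportional to (1, 2). Projecting each centered row onto this vector would reduce the data to a single column.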

Collaborative filtering algorithms inherit from the GCollaborativeFilter class. These classes are used for building recommendation systems, or for intelligently filling in missing values in data.

Neighbor-finding algorithms inherit from the GNeighborFinder class. These include GBruteForceNeighborFinder, GKdTree (which finds the nearest neighbors more efficiently), and a few algorithms for intelligently selecting neighbors according to various criteria.
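The brute-force approach is simple enough to sketch in a few lines of plain C++: measure the distance from the query point to every other point, then keep the k closest. The function name `nearestNeighbors` and its signature are hypothetical and are not taken from GBruteForceNeighborFinder.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Returns the indices of the k points nearest to points[queryIndex]
// (excluding the query itself), ordered from nearest to farthest.
// Cost is O(n) distance computations per query, which is what tree-based
// structures such as a kd-tree aim to avoid.
std::vector<std::size_t> nearestNeighbors(
    const std::vector<std::vector<double>>& points,
    std::size_t queryIndex,
    std::size_t k)
{
    assert(queryIndex < points.size());
    const std::vector<double>& query = points[queryIndex];

    // Pair each candidate's squared distance with its index.
    std::vector<std::pair<double, std::size_t>> candidates;
    for (std::size_t i = 0; i < points.size(); ++i) {
        if (i == queryIndex)
            continue;
        double dist = 0.0;
        for (std::size_t d = 0; d < query.size(); ++d) {
            const double diff = points[i][d] - query[d];
            dist += diff * diff;
        }
        candidates.emplace_back(dist, i);
    }

    // Sort only as much as needed to find the k smallest distances.
    k = std::min(k, candidates.size());
    std::partial_sort(candidates.begin(),
                      candidates.begin() + static_cast<std::ptrdiff_t>(k),
                      candidates.end());

    std::vector<std::size_t> result;
    for (std::size_t i = 0; i < k; ++i)
        result.push_back(candidates[i].second);
    return result;
}
```

Squared distances are compared instead of true distances because the square root does not change the ordering.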

Several graph-based algorithms are provided in the GGraphCut, GFloydWarshall, GDijkstra, GBrandesBetweennessCentrality, and GCycleCut classes.
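As an example of what one of these algorithms computes, here is a standalone sketch of the Floyd-Warshall all-pairs shortest-path algorithm on an adjacency matrix. The names `kNoEdge` and `allPairsShortestPaths` are hypothetical, and the interface of the library's GFloydWarshall class may look quite different.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Marker for "no direct edge between these two nodes".
const double kNoEdge = std::numeric_limits<double>::infinity();

// Takes a square matrix where dist[i][j] is the weight of the edge from
// node i to node j (kNoEdge if absent, 0 on the diagonal) and returns a
// matrix of shortest-path costs between every pair of nodes.
// Assumes the graph has no negative-weight cycles.
std::vector<std::vector<double>> allPairsShortestPaths(
    std::vector<std::vector<double>> dist)
{
    const std::size_t n = dist.size();
    for (const auto& row : dist)
        assert(row.size() == n);

    // Allow each node k in turn to serve as an intermediate stop.
    for (std::size_t k = 0; k < n; ++k)
        for (std::size_t i = 0; i < n; ++i)
            for (std::size_t j = 0; j < n; ++j)
                if (dist[i][k] + dist[k][j] < dist[i][j])
                    dist[i][j] = dist[i][k] + dist[k][j];
    return dist;
}
```

Pairs of nodes with no connecting path keep the value `kNoEdge` in the result. The triple loop costs O(n^3) time, so for sparse graphs with a single source, a Dijkstra-style search is usually the cheaper choice.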

The GRand class is a particularly useful pseudo-random number generator. It provides methods to draw random values from a plethora of distributions. Various common statistical distributions are also implemented. These inherit from GDistribution.

The GPlotWindow and GImage classes are particularly useful for creating visualizations of data.

Many other useful classes are provided for a variety of specific machine learning operations. For a complete list of implemented algorithms, see the API documentation.
