
Clustering

Clustering, especially hierarchical clustering, has proven to be one of the most useful tools for analyzing microarray data. The toolbox offers efficient implementations of the most common algorithms.

Hierarchical clustering

The most common framework for analyzing microarray data is hierarchical clustering. The toolbox offers a fast implementation of these algorithms as a mex-file, hierarc.
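To make the idea concrete, here is a minimal Python sketch of agglomerative clustering with average linkage. It is only an illustration: this page does not describe the linkage rule, algorithm, or calling convention of hierarc, so the function name average_linkage and its output format are assumptions.

```python
from itertools import combinations
from math import dist

def average_linkage(points):
    """Agglomerative clustering with average linkage (illustrative only).

    Returns a list of merges (id_a, id_b, distance). Leaves are numbered
    0..n-1; the cluster created by the k-th merge gets id n + k.
    """
    n = len(points)
    members = {i: [i] for i in range(n)}   # cluster id -> leaf indices
    merges = []
    next_id = n
    while len(members) > 1:
        best = None
        for a, b in combinations(sorted(members), 2):
            # average pairwise distance between the two clusters
            d = sum(dist(points[i], points[j])
                    for i in members[a] for j in members[b])
            d /= len(members[a]) * len(members[b])
            if best is None or d < best[2]:
                best = (a, b, d)
        a, b, d = best
        # merge the closest pair into a new cluster
        members[next_id] = members.pop(a) + members.pop(b)
        merges.append((a, b, d))
        next_id += 1
    return merges

# Hypothetical toy data: two tight pairs of points far from each other
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
merges = average_linkage(points)
```

On this toy data the two tight pairs are merged first, and the two resulting clusters are joined last. The sketch recomputes every cluster distance at each step, which is far too slow for real microarray data; it is meant to show the principle, not to stand in for the mex-file.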

The position of the "left" and "right" arms at each node in a hierarchical clustering is arbitrary, so the same tree can be drawn with many different leaf orderings. One way to choose among them is to search for the ordering for which the sum of the distances between adjacent leaves is minimal. This can be calculated using the mex-file orderleaves.
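The criterion can be illustrated with a small brute-force Python sketch. It assumes a tree stored as nested pairs with integer leaves and a full distance matrix D; both representations, and the function names, are hypothetical and are not taken from orderleaves, whose algorithm and interface are not described here.

```python
def orderings(tree):
    """Yield every leaf order reachable by swapping arms at the nodes."""
    if isinstance(tree, int):          # a leaf
        yield [tree]
        return
    left, right = tree
    for l in orderings(left):
        for r in orderings(right):
            yield l + r                # arms as given
            yield r + l                # arms swapped

def adjacent_cost(order, D):
    """Sum of distances between neighbouring leaves in the order."""
    return sum(D[a][b] for a, b in zip(order, order[1:]))

def best_order(tree, D):
    """Leaf order with the smallest adjacent-distance sum (brute force)."""
    return min(orderings(tree), key=lambda o: adjacent_cost(o, D))

# Hypothetical example: four leaves on a line at positions 0, 1, 5, 6
pos = [0, 1, 5, 6]
D = [[abs(p - q) for q in pos] for p in pos]
tree = ((1, 0), (2, 3))                # left arm deliberately flipped
order = best_order(tree, D)
```

A tree with n leaves has 2^(n-1) such orderings, so this exhaustive search is only practical for very small trees. It shows what is being minimized, not how to do so efficiently.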

If you have the Statistics Toolbox installed and want to use a clustering generated by one toolbox in the other, two functions are available for the translation: clustM2S and clustS2M.

Visualization

A hierarchical clustering can be visualized in two different ways: it can be displayed in the command window with disptree for a quick-and-dirty first look, or it can be exported with clustTV for visualization in TreeView.

TreeView is the program developed by Michael Eisen that was used to generate the typical microarray pictures we have all seen. It can be found here.

K-means clustering

K-means is a classical clustering algorithm. The toolbox provides an efficient implementation of it, with a small quirk added for better results. See the reference on kmeans for details.
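For readers unfamiliar with the algorithm, the following Python sketch shows the textbook K-means iteration (Lloyd's algorithm). It does not include the toolbox's quirk, which is documented only in the kmeans reference, and its signature is an assumption chosen for the example.

```python
from math import dist

def kmeans(points, centers, max_iter=100):
    """Plain Lloyd iteration starting from the given centers."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        # assignment step: index of the nearest center for each point
        labels = [min(range(len(centers)), key=lambda k: dist(p, centers[k]))
                  for p in points]
        # update step: move each center to the mean of its points
        new_centers = []
        for k, old in enumerate(centers):
            group = [p for p, l in zip(points, labels) if l == k]
            if group:
                new_centers.append(
                    tuple(sum(x) / len(group) for x in zip(*group)))
            else:
                new_centers.append(old)    # keep an empty cluster's center
        if new_centers == centers:         # converged: nothing moved
            break
        centers = new_centers
    return labels, centers

# Hypothetical toy data, with one starting center taken from each group
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
labels, centers = kmeans(points, centers=[points[0], points[2]])
```

The result depends on the starting centers, which the caller supplies here. Plain K-means can converge to a poor local optimum from a bad start.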
