MatArray Toolbox
kmeans

K-means clustering.

Syntax

Description

The kmeans function performs a K-means clustering of the columns of the matrix M into nc clusters. The result consists of two parts: the nc cluster prototypes proto and the cluster memberships mship. The K-means error, that is, the sum of the squared distances between each column of M and its cluster prototype, is stored in er. This is the criterion that the K-means algorithm tries to minimize.
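The error criterion described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the toolbox code: the data are made up, and the names M, nc, proto, mship and er simply follow the text.

```matlab
% Illustrative sketch only: K-means error for a given clustering.
% The data below are invented for the example.
M     = [0 0 1 10 10 11;      % 2 x 6 data matrix, one item per column
         0 1 0 10 11 10];
mship = [1 1 1 2 2 2];        % cluster index of each column
nc    = 2;

% Prototype of each cluster = mean of its member columns.
proto = zeros(size(M, 1), nc);
for k = 1:nc
    proto(:, k) = mean(M(:, mship == k), 2);
end

% Sum of squared distances between each column and its own prototype.
d  = M - proto(:, mship);     % proto(:, mship) picks one prototype per column
er = sum(d(:) .^ 2);          % 8/3 for this data set
```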

beg is the starting point for the algorithm. If it is a row vector, it is taken as the starting mship; otherwise it is taken as the starting proto. If beg is omitted or empty, a random mship is drawn and the algorithm starts from there.
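The following sketch shows one way the three kinds of starting point could feed a classical K-means iteration. It is an assumption-laden illustration rather than the toolbox implementation: the data, the iteration cap of 100 and the handling of empty clusters are all choices made for the example.

```matlab
% Illustrative sketch only: classical K-means on the columns of M,
% started from beg as described in the text.
M   = [0 0 1 10 10 11;
       0 1 0 10 11 10];
nc  = 2;
beg = [1 2 1 2 1 2];          % row vector: taken as the starting mship
% beg = [0 10; 0 10];         % matrix: would be taken as the starting proto
% beg = [];                   % empty: a random mship would be drawn

n = size(M, 2);
if isempty(beg)
    mship = ceil(nc * rand(1, n));        % random memberships in 1..nc
    proto = [];
elseif size(beg, 1) == 1
    mship = beg;
    proto = [];
else
    proto = beg;
    mship = zeros(1, n);                  % no membership yet
end

% When starting from memberships, derive the first prototypes from them.
if isempty(proto)
    proto = zeros(size(M, 1), nc);
    for k = 1:nc
        proto(:, k) = mean(M(:, mship == k), 2);
    end
end

for iter = 1:100                          % arbitrary cap for the example
    % Assignment step: squared distance of every column to every prototype.
    D = zeros(nc, n);
    for k = 1:nc
        dk = M - repmat(proto(:, k), 1, n);
        D(k, :) = sum(dk .^ 2, 1);
    end
    [dmin, newmship] = min(D, [], 1);
    if isequal(newmship, mship)
        break;                            % memberships stable: converged
    end
    mship = newmship;
    % Update step: recompute prototypes, leaving empty clusters unchanged.
    for k = 1:nc
        if any(mship == k)
            proto(:, k) = mean(M(:, mship == k), 2);
        end
    end
end
er = sum(dmin);                           % K-means error at convergence
```

Note that when M has a single row, a starting proto is itself a row vector, so a shape test such as the one above cannot tell the two cases apart.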

If a fourth argument is given, a slightly modified version of the algorithm is used to try to improve the solution (see References). The modification is cheap and cannot increase the error, but because it is not part of the classical algorithm it is offered as an option; we recommend using it.

In its original version, the function uses dl2c to calculate the distances between the items and the prototypes. If the data contain missing values, replace dl2c with dl2cNaN and mean with nanmean in the function code.
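To illustrate what such a substitution changes, the sketch below computes a distance and a mean that skip missing values. It assumes that missing values are coded as NaN and that they are simply left out of the sums; the actual convention used by dl2cNaN is not stated here and may differ.

```matlab
% Illustrative sketch only. Assumption: missing values are coded as NaN
% and are skipped; this is not the code of dl2cNaN or nanmean.
x = [1; NaN; 3];                   % one item with a missing coordinate
p = [2; 5; 1];                     % one prototype

ok = ~isnan(x) & ~isnan(p);        % coordinates observed in both vectors
d2 = sum((x(ok) - p(ok)) .^ 2);    % squared distance over those coordinates

% NaN-ignoring mean of a cluster's member columns, written by hand.
C        = [1 3; NaN 4; 5 NaN];    % 3 x 2 block of member columns
obs      = ~isnan(C);              % mask of observed entries
C0       = C;
C0(~obs) = 0;                      % zero out missing entries before summing
m = sum(C0, 2) ./ sum(obs, 2);     % row-wise mean over observed entries
```

A row with no observed entry at all would give 0/0, that is NaN, in this sketch.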

Examples

Re-clustering starting from this result gives the same error, as expected. With the modified algorithm, however, the result is significantly improved. This does not happen systematically, but the error cannot increase and the time penalty is small.

See Also

dl2c, dl2cNaN, hierarc

References


