| MatArray Toolbox | Search  Help Desk |
| orderleaves | Examples See Also |
hierarc so that the sum of the
distances between adjacent leaves is minimal.
Syntax
res = orderleaves(tree, dist)
Description
The hierarchical clusterings generated byhierarc
are an arbitrary choice amongst a family of similar clusterings,
obtained by permutation of the left and right leaves at any nodes.
orderleaves searches for the clustering similar to
tree for which the sum of the distances between
adjacent leaves,
as determined by the distance matrix dist, is minimal.
The C code is based on the program of Ziv Bar-Joseph, see the
reference for details.
The functions prints the initial sum of the distances, the optimal sum of distances and the improvement as a percentage in the command window.
Note: the function is implemented as a mex-file, and as such cannot be stopped once it has started. You should make sure you are ready to wait before you launch it. As a first approximation, with 1000 leaves the routine takes about 10secs and uses about 100Megs of RAM. The time complexity is O(n^3) and the space complexity is O(n^2). Make especially sure enough RAM is available, or be prepared to either wait for ages or kill MatLab.
Examples
d = sqrt(dl2c(randn(5,5)));
tree = hierarc(d,1);
disptree(tree);
|
---------
| |
------- 5
| |
1 ------
| |
---- 4
| |
2 3
Initial distance:
d(1,2) + d(2,3) + d(3,4) + d(4,5)
ans =
7.5382
Rearrange the tree in order to minimize this distance:
tree = orderleaves(tree,d); Initial distance : 7.53822 Optimal distance : 7.28541 Improvement : 3.4%The initial distance corresponds to the distance calculated before. The optimal distance can be verified on the new tree:
disptree(tree);
|
---------
| |
------- 5
| |
----- 1
| |
4 ----
| |
3 2
d(4,3) + d(3,2) + d(2,1) + d(1,5)
ans =
7.2854
See Also
clustTV,
disptree,
dl2c,
hierarc
References
[1] Bar-Joseph, Z., Gifford, D.K. and Jaakkola, T.S., "Fast optimal leaf ordering for hierarchical clustering," Proc. 9th ISMB, Bioinformatics, 17, S22-S29 (2001)