Clustering

The Subnetwork Clustering program groups a list of sub-networks using hierarchical clustering. It first collects the unique genes in a gene set file (*.gmt). Each sub-network is then encoded as a binary string in which each bit indicates whether the corresponding unique gene is present. Finally, similar sub-networks are grouped into clusters at the given depth.
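The encoding step can be sketched as follows. This is a minimal illustration, not the program's actual code: the function name encode_subnetworks is hypothetical, and sorting the gene IDs to fix the bit order is an assumption (the program may order bits differently).

```python
def encode_subnetworks(subnets):
    """Map each sub-network to a 0/1 vector over the unique genes.

    `subnets` maps a sub-network name to its list of gene IDs
    (duplicates are allowed, as in the sample gmt file).
    """
    # Collect every distinct gene ID; sorting fixes the bit order (assumption).
    genes = sorted({g for members in subnets.values() for g in members})
    vectors = {}
    for name, members in subnets.items():
        present = set(members)  # repeated gene IDs collapse to one entry
        vectors[name] = [1 if g in present else 0 for g in genes]
    return genes, vectors


subnets = {
    "Subnet2": ["1019", "2494", "2494", "10419"],
    "Subnet6": ["1030", "2494", "2494", "8850", "10419"],
}
genes, vectors = encode_subnetworks(subnets)
print(genes)               # bit order (string-sorted gene IDs)
print(vectors["Subnet2"])  # one bit per unique gene
```

Because each bit only records presence, a gene listed twice in a sub-network contributes a single 1.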
Usage : cluster [OPTION]
Options:
    -g           gmt file
    -m           distance measure, one of [BrayCurtis | Chord | Dot | Euclidean | Hellinger | Intersection | Manhattan | MeanManhattan | PatternDifference | ShapedDifference | SizedDifference | Vari]
    -l           linkage method, one of [Average | Centroid | Complete]
    -d           depth
    -o           output prefix
    -v           verbose
Example: cluster -g subnet01.gmt -m Euclidean -l Average -d 4 -v 1 -o subnet01_cluster
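To illustrate what the -m and -l options select, here is a small sketch of two of the listed measures (Euclidean, Manhattan) and the three linkage methods, using their standard textbook definitions. The function names are hypothetical, and the program's own definitions of each measure may differ.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Manhattan distance; on 0/1 vectors this counts differing bits."""
    return sum(abs(x - y) for x, y in zip(a, b))


def centroid(group):
    """Component-wise mean of a group of vectors."""
    return [sum(col) / len(group) for col in zip(*group)]


def linkage_distance(group_a, group_b, dist, linkage="Average"):
    """Distance between two groups of vectors under a linkage method."""
    if linkage == "Centroid":
        return dist(centroid(group_a), centroid(group_b))
    pairwise = [dist(a, b) for a in group_a for b in group_b]
    if linkage == "Average":
        return sum(pairwise) / len(pairwise)
    if linkage == "Complete":
        return max(pairwise)
    raise ValueError("unknown linkage: " + linkage)


a = [1, 0, 1, 1, 0]
b = [0, 1, 1, 1, 1]
print(manhattan(a, b))   # number of differing bits
print(euclidean(a, b))   # square root of that count for 0/1 vectors
```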

Input

The gene set file lists one sub-network per line, with tab-separated fields in the following format:
Name	Description	Gene1	Gene2	Gene3...
For example,
Subnet1	19.7322948871072	1030	1030	2494	2494	6387	6387	8850	10419
Subnet2	19.6456517089056	1019	2494	2494	10419
Subnet3	19.4627028700683	1019	1030	1030	2494	2494	6387	6387	8850	10419
Subnet4	19.2645660741764	1030	1030	2494	6387	6387	8850	10419
Subnet5	19.2327236850464	1030	1030	2494	2494	6387	6387	8850	10018	10419
Subnet6	19.1755896772945	1030	2494	2494	8850	10419
...
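A file in this layout could be read with a few lines of Python. This is a sketch under the assumption that fields are tab-separated, as in the sample above; parse_gmt is a hypothetical helper, not part of the program.

```python
def parse_gmt(lines):
    """Parse gmt-style lines into {name: (description, [gene, ...])}."""
    subnets = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # a valid line needs at least a name and a description
        name, description = fields[0], fields[1]
        genes = [g for g in fields[2:] if g]  # drop empty trailing fields
        subnets[name] = (description, genes)
    return subnets


sample = [
    "Subnet2\t19.6456517089056\t1019\t2494\t2494\t10419\n",
    "Subnet6\t19.1755896772945\t1030\t2494\t2494\t8850\t10419\n",
]
parsed = parse_gmt(sample)
print(parsed["Subnet2"])
```

Gene IDs are kept as strings and duplicates are preserved, so the result mirrors the file contents exactly.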

Output

With a depth of 1 and the verbose option set to 1 (-v 1), the program prints two clusters:
96 (6)  Subnet23, Subnet32, Subnet36, Subnet29, Subnet35, Subnet38
97 (44) Subnet2, Subnet49, Subnet45, Subnet26, Subnet7, Subnet13, ...

The first cluster, named "96", contains 6 sub-networks; the second, named "97", contains 44. The number in parentheses is the cluster size. Each cluster is saved in a separate gene set file.
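The relationship between depth and the number of clusters can be sketched as below. This is an illustration only: it assumes that "depth" means descending that many levels from the root of the clustering tree (inferred from depth 1 giving two clusters), it uses Euclidean distance with average linkage, and all function names are hypothetical. It does not reproduce the program's cluster names such as "96" or "97".

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(names_a, names_b, vectors):
    """Mean pairwise distance between two groups of sub-networks."""
    total = sum(euclidean(vectors[x], vectors[y])
                for x in names_a for y in names_b)
    return total / (len(names_a) * len(names_b))


def build_tree(vectors):
    """Agglomerative clustering; returns nested (left, right) tuples."""
    # Each entry is (subtree, member names); start with one leaf per item.
    clusters = [(name, [name]) for name in vectors]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = average_linkage(clusters[i][1], clusters[j][1], vectors)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = ((clusters[i][0], clusters[j][0]),
                  clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


def leaves(tree):
    if isinstance(tree, tuple):
        return leaves(tree[0]) + leaves(tree[1])
    return [tree]


def cut_at_depth(tree, depth):
    """Descend `depth` levels from the root; each subtree reached is a cluster."""
    if depth == 0 or not isinstance(tree, tuple):
        return [leaves(tree)]
    return cut_at_depth(tree[0], depth - 1) + cut_at_depth(tree[1], depth - 1)


vectors = {"A": [1, 1, 0, 0], "B": [1, 1, 1, 0],
           "C": [0, 0, 1, 1], "D": [0, 0, 0, 1]}
tree = build_tree(vectors)
for members in cut_at_depth(tree, 1):
    print("(%d) %s" % (len(members), ", ".join(members)))
```

Under this reading, a depth of d yields at most 2^d clusters, since each level of the tree splits a cluster in two.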

Last edited Feb 8, 2013 at 4:53 AM by yongkeecho, version 4
