Cluster ranking with an application to mining mailbox networks

Ziv Bar-Yossef, Ido Guy, Ronny Lempel, Yoëlle S. Maarek, Vladimir Soroka

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

13 Scopus citations


We initiate the study of a new clustering framework, called cluster ranking. Rather than simply partitioning a network into clusters, a cluster ranking algorithm also orders the clusters by their strength. To this end, we introduce a novel strength measure for clusters, the integrated cohesion, which is applicable to arbitrary weighted networks. We then present C-Rank, a new cluster ranking algorithm. Given a network with arbitrary pairwise similarity weights, C-Rank creates a list of overlapping clusters and ranks them by their integrated cohesion. We provide extensive theoretical and empirical analysis of C-Rank and show that it is likely to have high precision and recall. Our experiments focus on mining mailbox networks. A mailbox network is an egocentric social network consisting of the contacts with whom an individual exchanges email. Ties among contacts are represented by the frequency of their co-occurrence in message headers. C-Rank is well suited to mining such networks, since they abound with overlapping communities of highly variable strengths. We demonstrate the effectiveness of C-Rank on the Enron data set, consisting of 130 mailbox networks.
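
The mailbox network described in the abstract can be illustrated with a minimal sketch. It assumes each message header is given as a plain list of addresses and uses raw co-occurrence counts as tie weights; the function name `build_mailbox_network`, the sample addresses, and the choice to exclude the mailbox owner are hypothetical, and the paper's actual weighting scheme may differ.

```python
from collections import Counter
from itertools import combinations


def build_mailbox_network(messages, owner):
    """Count how often each pair of contacts appears in the same message header.

    `messages` is an iterable of address lists (one list per header).
    `owner` is the mailbox owner (the "ego"), excluded here by assumption.
    """
    weights = Counter()
    for header in messages:
        # Deduplicate and sort so each pair has one canonical key.
        contacts = sorted(set(header) - {owner})
        for a, b in combinations(contacts, 2):
            weights[(a, b)] += 1
    return weights


# Hypothetical sample mailbox with three messages.
messages = [
    ["me@example.com", "ann@example.com", "bob@example.com"],
    ["me@example.com", "ann@example.com", "bob@example.com", "cat@example.com"],
    ["me@example.com", "cat@example.com"],
]
network = build_mailbox_network(messages, owner="me@example.com")
```

The resulting weighted edge list is the kind of input a cluster ranking algorithm would take; the sketch does not attempt to compute integrated cohesion or reproduce C-Rank.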

Original language: English
Title of host publication: Proceedings - Sixth International Conference on Data Mining, ICDM 2006
Number of pages: 12
State: Published - 1 Dec 2006
Externally published: Yes
Event: 6th International Conference on Data Mining, ICDM 2006 - Hong Kong, China
Duration: 18 Dec 2006 to 22 Dec 2006

Publication series

Name: Proceedings - IEEE International Conference on Data Mining, ICDM
ISSN (Print): 1550-4786


Conference: 6th International Conference on Data Mining, ICDM 2006
City: Hong Kong

ASJC Scopus subject areas

  • Engineering (all)

