
Efficient detection of communities with significant overlaps in networks: Partial community merger algorithm

Published online by Cambridge University Press:  20 November 2017

ELVIS H. W. XU
Affiliation:
Department of Physics, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong, China (e-mail: hwxu@phy.cuhk.edu.hk, pmhui@phy.cuhk.edu.hk)
PAK MING HUI
Affiliation:
Department of Physics, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong, China (e-mail: hwxu@phy.cuhk.edu.hk, pmhui@phy.cuhk.edu.hk)

Abstract

Detecting communities in large-scale social networks is challenging because each vertex may belong to multiple communities. Such multiple memberships, and the strong overlaps among communities that they imply, render many detection algorithms invalid. We develop a Partial Community Merger Algorithm (PCMA) for detecting communities with significant overlaps as well as slightly overlapping and disjoint ones. It is a bottom-up approach that reconstructs complete communities by properly reassembling the partial information about communities revealed in the ego networks of vertices. We propose a novel similarity measure of communities and an efficient merger process to address the two key issues in implementing this approach: noise control and merger order. PCMA is tested against two benchmarks, and overall it outperforms all compared algorithms in both accuracy and efficiency. It is applied to two huge online social networks, Friendster and Sina Weibo. Millions of communities are detected, and they are of higher quality than the corresponding metadata groups. We find that the latter should not be regarded as the ground truth of structural communities. The significant overlapping pattern found in the detected communities confirms the need for new algorithms, such as PCMA, to handle multiple memberships of vertices in social networks.
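
The bottom-up idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' PCMA. It assumes that the partial communities of a vertex are the connected components among its neighbours, and it substitutes plain Jaccard similarity with a fixed threshold for the paper's similarity measure and merger order, neither of which is specified here. All function names and parameters (`ego_partial_communities`, `merge_partial_communities`, `detect`, `min_size`, `threshold`) are hypothetical.

```python
from itertools import combinations


def ego_partial_communities(adj, ego, min_size=3):
    """Return partial communities seen from one vertex's ego network.

    Assumption: a partial community is a connected component among the
    ego's neighbours, with the ego itself added back in.
    """
    neighbours = adj[ego]
    seen, parts = set(), []
    for start in neighbours:
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:  # depth-first search restricted to the ego's neighbours
            v = stack.pop()
            if v in comp:
                continue
            comp.add(v)
            stack.extend((adj[v] & neighbours) - comp)
        seen |= comp
        if len(comp) + 1 >= min_size:  # drop tiny fragments as noise
            parts.append(frozenset(comp | {ego}))
    return parts


def jaccard(a, b):
    """Jaccard similarity, a stand-in for the paper's similarity measure."""
    return len(a & b) / len(a | b)


def merge_partial_communities(parts, threshold=0.5):
    """Greedily merge any two communities whose similarity reaches the threshold."""
    communities = [set(p) for p in set(parts)]  # set() removes exact duplicates
    merged = True
    while merged:
        merged = False
        for i, j in combinations(range(len(communities)), 2):
            if jaccard(communities[i], communities[j]) >= threshold:
                communities[i] |= communities[j]
                del communities[j]  # j > i, so index i stays valid
                merged = True
                break  # restart the scan after every merger
    return communities


def detect(adj, threshold=0.5):
    """Collect partial communities from every ego network, then merge them."""
    parts = [p for v in adj for p in ego_partial_communities(adj, v)]
    return merge_partial_communities(parts, threshold)


if __name__ == "__main__":
    # Two triangles sharing vertex 2, which therefore has two memberships.
    graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3, 4}, 3: {2, 4}, 4: {2, 3}}
    print(sorted(sorted(c) for c in detect(graph)))
```

On the toy graph, the shared vertex is kept in both detected communities rather than being forced into one, which is the kind of overlap the abstract refers to. The repeated pairwise scan is quadratic in the number of partial communities and is meant only for illustration, not for networks of the size discussed above.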

Type
Research Article
Copyright
Copyright © Cambridge University Press 2017 

