A network community detection method with integration of data from multiple layers and node attributes

Hannu Reittu; Lasse Leskelä; Tomi Räty

doi:10.1017/nws.2023.2

Published online by Cambridge University Press: 07 March 2023

Affiliations:
Hannu Reittu*: VTT Technical Research Centre of Finland, Espoo, Finland
Lasse Leskelä: School of Science, Department of Mathematics and System Analysis, Aalto University, Espoo, Finland
Tomi Räty: Microsoft, One Microsoft Way, Redmond, WA, USA
*Corresponding author. Email: hannu.reittu@vtt.fi

Abstract

Multilayer networks are a current focus of complex network research. In such networks, multiple types of links may exist, as well as many attributes for nodes. To make full use of multilayer and other types of complex networks in applications, merging various data with topological information yields a powerful analysis. First, we suggest a simple way of representing network data in a data matrix where rows correspond to the nodes and columns correspond to the data items. The number of columns is arbitrary, so the data matrix can be easily expanded by adding columns. The data matrix can be chosen according to the targets of the analysis and may vary considerably from case to case. Next, we partition the rows of the data matrix into communities using a method that allows maximal compression of the data matrix. For compressing a data matrix, we suggest extending the so-called regular decomposition method to non-square matrices. We illustrate our method for several types of data matrices, in particular distance matrices and matrices obtained by augmenting a distance matrix with a column of node degrees or by concatenating several distance matrices corresponding to the layers of a multilayer network. We demonstrate the method on synthetic power-law graphs and two real networks: an Internet autonomous systems graph and a world airline graph. We compare the outputs of different community recovery methods on these graphs and discuss how incorporating node degrees as a separate column of the data matrix leads our method to identify community structures that are well aligned with the tiered hierarchical structures commonly encountered in complex scale-free networks.
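As an illustration of the data matrix described in the abstract, the following is a minimal sketch rather than the authors' implementation. It assumes unweighted, undirected layers given as edge lists, uses hop-count distances computed by breadth-first search, and marks unreachable pairs with -1. The function names (`distance_matrix`, `degree_column`, `build_data_matrix`) and the choice to append one degree column per layer are hypothetical. The community detection step (regular decomposition) is not shown.

```python
from collections import deque


def distance_matrix(n, edges):
    """All-pairs hop distances of an undirected graph, via BFS from each node."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    dist = [[-1] * n for _ in range(n)]  # -1 marks unreachable pairs (assumption)
    for s in range(n):
        dist[s][s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if dist[s][v] < 0:
                    dist[s][v] = dist[s][u] + 1
                    queue.append(v)
    return dist


def degree_column(n, edges):
    """Node degrees of an undirected graph given as an edge list."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def build_data_matrix(n, layers, with_degrees=True):
    """Rows are nodes; columns are the per-layer distance matrices placed
    side by side, optionally followed by one degree column per layer."""
    rows = [[] for _ in range(n)]
    for edges in layers:
        dist = distance_matrix(n, edges)
        for i in range(n):
            rows[i].extend(dist[i])
    if with_degrees:
        for edges in layers:
            deg = degree_column(n, edges)
            for i in range(n):
                rows[i].append(deg[i])
    return rows


# Two toy layers on the same four nodes (illustrative data only).
layer_a = [(0, 1), (1, 2), (2, 3)]  # a path 0-1-2-3
layer_b = [(0, 3), (1, 3), (2, 3)]  # a star centred at node 3
X = build_data_matrix(4, [layer_a, layer_b])
```

Here `X` has four rows and ten columns: four distance columns per layer plus one degree column per layer. Further node attributes could be appended as extra columns in the same way, which is the expandability the abstract refers to.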

Keywords

multiplex networks; community detection; information criteria; power-law graphs; graph distance matrix

Type: Research Article
Information: Network Science, Volume 11, Issue 3, September 2023, pp. 374–396

DOI: https://doi.org/10.1017/nws.2023.2
Copyright: © The Author(s), 2023. Published by Cambridge University Press

Footnotes

Action Editor: Matteo Magnani
