Advanced search
1 file | 7.37 MB Add to list

Mining topological structure in graphs through forest representations

Robin Vandaele, Yvan Saeys (UGent) and Tijl De Bie (UGent)
Author
Organization
Project
Abstract
We consider the problem of inferring simplified topological substructures—which we term backbones—in metric and non-metric graphs. Intuitively, these are subgraphs with ‘few’ nodes, multifurcations, and cycles, that model the topology of the original graph well. We present a multistep procedure for inferring these backbones. First, we encode local (geometric) information of each vertex in the original graph by means of the boundary coefficient (BC) to identify ‘core’ nodes in the graph. Next, we construct a forest representation of the graph, termed an f-pine, that connects every node of the graph to a local ‘core’ node. The final backbone is then inferred from the f-pine through CLOF (Constrained Leaves Optimal subForest), a novel graph optimization problem we introduce in this paper. On a theoretical level, we show that CLOF is NP-hard for general graphs. However, we prove that CLOF can be efficiently solved for forest graphs, a surprising fact given that CLOF induces a nontrivial monotone submodular set function maximization problem on tree graphs. This result is the basis of our method for mining backbones in graphs through forest representation. We qualitatively and quantitatively confirm the applicability, effectiveness, and scalability of our method for discovering backbones in a variety of graph-structured data, such as social networks, earthquake locations scattered across the Earth, and high-dimensional cell trajectory data
Keywords
topological data analysis, graph mining, metric spaces, visualization, topological skeletonization, cluster coefficient, cell trajectory inference, SKELETONIZATION ALGORITHM, TREE, CENTRALITY, LOCATION, FACILITY, PATH

Downloads

  • 19-1032.pdf
    • full text (Published version)
    • |
    • open access
    • |
    • PDF
    • |
    • 7.37 MB

Citation

Please use this url to cite or link to this publication:

MLA
Vandaele, Robin, et al. “Mining Topological Structure in Graphs through Forest Representations.” JOURNAL OF MACHINE LEARNING RESEARCH, vol. 21, 2020, pp. 1–68.
APA
Vandaele, R., Saeys, Y., & De Bie, T. (2020). Mining topological structure in graphs through forest representations. JOURNAL OF MACHINE LEARNING RESEARCH, 21, 1–68.
Chicago author-date
Vandaele, Robin, Yvan Saeys, and Tijl De Bie. 2020. “Mining Topological Structure in Graphs through Forest Representations.” JOURNAL OF MACHINE LEARNING RESEARCH 21: 1–68.
Chicago author-date (all authors)
Vandaele, Robin, Yvan Saeys, and Tijl De Bie. 2020. “Mining Topological Structure in Graphs through Forest Representations.” JOURNAL OF MACHINE LEARNING RESEARCH 21: 1–68.
Vancouver
1.
Vandaele R, Saeys Y, De Bie T. Mining topological structure in graphs through forest representations. JOURNAL OF MACHINE LEARNING RESEARCH. 2020;21:1–68.
IEEE
[1]
R. Vandaele, Y. Saeys, and T. De Bie, “Mining topological structure in graphs through forest representations,” JOURNAL OF MACHINE LEARNING RESEARCH, vol. 21, pp. 1–68, 2020.
@article{8678984,
  abstract     = {{We consider the problem of inferring simplified topological substructures—which we term backbones—in metric and non-metric graphs. Intuitively, these are subgraphs with ‘few’ nodes, multifurcations, and cycles, that model the topology of the original graph well. We present a multistep procedure for inferring these backbones. First, we encode local (geometric) information of each vertex in the original graph by means of the boundary coefficient (BC) to identify ‘core’ nodes in the graph. Next, we construct a forest representation of the graph, termed an f-pine, that connects every node of the graph to a local ‘core’ node. The final backbone is then inferred from the f-pine through CLOF (Constrained Leaves Optimal subForest), a novel graph optimization problem we introduce in this paper. On a theoretical level, we show that CLOF is NP-hard for general graphs. However, we prove that CLOF can be efficiently solved for forest graphs, a surprising fact given that CLOF induces a nontrivial monotone submodular set function maximization problem on tree graphs. This result is the basis of our method for mining backbones in graphs through forest representation. We qualitatively and quantitatively confirm the applicability, effectiveness, and scalability of our method for discovering backbones in a variety of graph-structured data, such as social networks, earthquake locations scattered across the Earth, and high-dimensional cell trajectory data}},
  articleno    = {{215}},
  author       = {{Vandaele, Robin and Saeys, Yvan and De Bie, Tijl}},
  issn         = {{1532-4435}},
  journal      = {{JOURNAL OF MACHINE LEARNING RESEARCH}},
  keywords     = {{topological data analysis,graph mining,metric spaces,visualization,topological skeletonization,cluster coefficient,cell trajectory inference,SKELETONIZATION ALGORITHM,TREE,CENTRALITY,LOCATION,FACILITY,PATH}},
  language     = {{eng}},
  pages        = {{215:1--215:68}},
  title        = {{Mining topological structure in graphs through forest representations}},
  url          = {{https://jmlr.org/papers/v21/19-1032.html}},
  volume       = {{21}},
  year         = {{2020}},
}

Web of Science
Times cited: