## Abstract

The availability of large-scale data about interactions of social media users allows the study of complex human behavior. Graphs are typically employed to represent user interactions, but several algorithms become impractical for analyzing large graphs. Hence, it can be useful to analyze a small subgraph instead, a practice known as graph sampling. However, if the graph is unobtainable, for example due to privacy limitations, graph sampling is impossible. We introduce an innovative algorithm for representing a large unobtainable graph of user relationships, such as Facebook friendships, using a streaming graph of user activity, such as wall posts on Facebook. We apply different methods of the proposed algorithm to two large datasets. The results show that averages and distribution statistics of nodes in a large, unobtainable relationship graph are well represented by a graph about 20% of the size of the unobtainable graph. Finally, we apply the proposed algorithm to identify influencers in an unobtainable graph by analyzing a representative graph. We find that 63% to 76% of the influencers identified in the representative graph also act as influencers in the unobtainable graph, suggesting that the developed algorithm can effectively capture properties of the unobtainable graph.
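As an illustration only, the general idea of building a bounded representative graph from a stream of activity events, and then checking how many of its top-ranked nodes are also top-ranked in a reference graph, might be sketched as follows. This is not the algorithm from the paper: the function names (`build_activity_graph`, `top_k_by_degree`, `influencer_overlap`), the node-budget stopping rule, and the use of degree as a stand-in for influence are all assumptions made for this sketch.

```python
from collections import defaultdict


def build_activity_graph(stream, max_nodes):
    """Build an undirected graph from (user, user) activity events.

    Assumption for this sketch: events that would push the graph past
    `max_nodes` distinct users are skipped, so the result stays bounded.
    """
    adj = defaultdict(set)
    for u, v in stream:
        if u == v:
            continue  # ignore self-interactions
        new_nodes = {n for n in (u, v) if n not in adj}
        if len(adj) + len(new_nodes) > max_nodes:
            continue  # node budget exhausted for this event
        adj[u].add(v)
        adj[v].add(u)
    return adj


def top_k_by_degree(adj, k):
    """Return the k highest-degree nodes (degree is a hypothetical influence proxy)."""
    ranked = sorted(adj, key=lambda n: (-len(adj[n]), n))  # ties broken by node id
    return set(ranked[:k])


def influencer_overlap(representative, reference, k):
    """Fraction of the representative graph's top-k nodes that are also top-k in the reference."""
    rep = top_k_by_degree(representative, k)
    ref = top_k_by_degree(reference, k)
    return len(rep & ref) / k


if __name__ == "__main__":
    # Toy activity stream, e.g. (poster, wall owner) pairs; the data is invented.
    events = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("e", "f")]
    graph = build_activity_graph(events, max_nodes=4)
    print(sorted(graph), top_k_by_degree(graph, 1))
```

In practice, the reference graph is exactly what is unavailable, so an overlap measure of this kind can only be computed on datasets where the relationship graph is known and held out for evaluation.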

| Field | Value |
|---|---|
| Original language | English |
| Pages (from-to) | 1097-1112 |
| Journal | Information Sciences |
| Volume | 546 |
| State | Published - 6 Feb 2021 |

## Keywords

- Activity graph
- Large relationship graph
- Streaming graph
- Unobtainable graph representation

## ASJC Scopus subject areas

- Software
- Control and Systems Engineering
- Theoretical Computer Science
- Computer Science Applications
- Information Systems and Management
- Artificial Intelligence