Online social networking platforms can collect an incredibly rich set of information about their users: the people they talk to, the people they follow and trust, the people they can influence, as well as their hobbies, interests, and the topics on which they are authoritative. Analyzing these data creates fascinating opportunities for expanding our understanding of social structures and phenomena such as social influence, trust, and their dynamics. At the same time, mining this rich information enables novel online services and represents a great resource for advertisers and for building viral marketing campaigns. Sharing social-network graphs, however, raises important privacy concerns. To alleviate this problem, several anonymization methods have been proposed that aim to reduce the risk of a privacy breach on the published data while still allowing analysts to study them and draw relevant conclusions. The bulk of those proposals consider publishing only the network structure, that is, a simple (often undirected) graph. In this paper we study the problem of preserving users’ individual privacy when publishing information-rich social networks. In particular, we consider the obfuscation of users’ identities in a topic-dependent social influence network, i.e., a directed graph where each edge is enriched by a topic model that represents the strength of the social influence along the edge per topic. Such an information-rich graph is much harder to anonymize than a standard graph. We propose to obfuscate the identity of nodes in the network by randomly perturbing the network structure and the topic model. We then formalize our privacy notion, k-obfuscation, and show how to evaluate the level of obfuscation under a strong adversarial assumption.
Experiments on two social networks confirm that randomization can successfully protect the privacy of the users while maintaining high-quality data for applications, such as influence maximization for viral marketing.
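To make the idea of randomly perturbing both the network structure and the topic model concrete, the following is a minimal illustrative sketch in Python. It is not the scheme used in the paper, whose details are not given in this abstract. The function name `perturb`, the parameters `p_remove`, `p_add`, and `noise`, and the representation of the graph as a dictionary from directed edges to per-topic influence strengths are all assumptions made for illustration.

```python
import random


def perturb(edges, nodes, p_remove=0.1, p_add=0.1, noise=0.05, seed=0):
    """Return a randomly perturbed copy of a topic-dependent influence graph.

    edges: dict mapping a directed edge (u, v) to a list of per-topic
           influence strengths in [0, 1] (hypothetical representation).
    nodes: list of distinct node identifiers.
    """
    if not edges:
        return {}
    rng = random.Random(seed)
    num_topics = len(next(iter(edges.values())))
    out = {}

    # Structure + topic model: drop each edge with probability p_remove,
    # otherwise keep it and jitter its per-topic strengths, clamped to [0, 1].
    for (u, v), weights in edges.items():
        if rng.random() < p_remove:
            continue
        out[(u, v)] = [
            min(1.0, max(0.0, w + rng.uniform(-noise, noise))) for w in weights
        ]

    # Structure: add roughly p_add * |E| spurious edges with weak strengths.
    target = int(round(p_add * len(edges)))
    added, attempts = 0, 0
    while added < target and attempts < 100 * (target + 1):
        attempts += 1
        u, v = rng.sample(nodes, 2)  # two distinct nodes, so no self-loops
        if (u, v) in edges or (u, v) in out:
            continue
        out[(u, v)] = [rng.uniform(0.0, noise) for _ in range(num_topics)]
        added += 1
    return out


# Usage on a toy graph with two topics.
toy_edges = {("a", "b"): [0.5, 0.1], ("b", "c"): [0.2, 0.9], ("c", "a"): [1.0, 0.0]}
toy_nodes = ["a", "b", "c", "d"]
published = perturb(toy_edges, toy_nodes, p_remove=0.3, p_add=0.5, noise=0.1, seed=42)
```

Larger values of the perturbation parameters would be expected to hide identities better at the cost of data quality, which is the trade-off the experiments evaluate.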
ASJC Scopus subject areas
- Information Systems
- Media Technology
- Human-Computer Interaction
- Computer Science Applications