Target oriented network intelligence collection: effective exploration of social networks

    Research output: Contribution to journal › Article › peer-review

    Abstract

    Target Oriented Network Intelligence Collection (TONIC) is a crawling process whose goal is to find social network profiles that contain information about a given target. Such profiles are called leads, and the TONIC problem is how to minimize the crawling costs incurred while finding them. We model this problem as a search problem in an unknown graph and present a best-first search approach for solving it. Three key challenges are (1) which profiles to consider for crawling, (2) how to prioritize the crawling order, and (3) when additional crawling is no longer worthwhile. For the first challenge, we propose two frameworks: the Restricted TONIC Framework (RTF), which restricts the search to immediate neighbors of previously found leads, and the Extended TONIC Framework (ETF), which extends the scope of the search to a wider neighborhood. Guidelines for choosing between the two frameworks are provided. For the second challenge, we propose a set of effective topology-based heuristics that guide the search towards profiles that are more likely to be leads. For the third challenge, we propose to use data collected in previously executed crawls to learn when additional crawling is expected to be useful.
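
The best-first search described in the abstract can be illustrated with a minimal sketch, under stated assumptions. It follows the RTF idea of only considering immediate neighbors of leads found so far. The function name `rtf_best_first_crawl`, the callbacks `get_neighbors` and `is_lead`, the fixed `budget` stopping rule, and the scoring rule (number of known leads adjacent to a candidate) are all hypothetical choices made for illustration; they are not taken from the article, which proposes its own heuristics and a learned stopping criterion.

```python
import heapq
from typing import Callable, Dict, Hashable, Iterable, List, Set, Tuple


def rtf_best_first_crawl(
    initial_leads: Iterable[Hashable],
    get_neighbors: Callable[[Hashable], Iterable[Hashable]],
    is_lead: Callable[[Hashable], bool],
    budget: int,
) -> Tuple[List[Hashable], int]:
    """Best-first crawl that only considers neighbors of known leads.

    Illustrative assumptions (not from the article):
    - each crawled profile costs 1 unit, and crawling stops at `budget`;
    - a candidate's priority is the number of known leads adjacent to it.
    """
    leads: List[Hashable] = list(initial_leads)
    crawled: Set[Hashable] = set(leads)
    lead_count: Dict[Hashable, int] = {}  # candidate -> adjacent known leads
    heap: List[Tuple[int, int, Hashable]] = []  # (-score, tie-breaker, profile)
    pushes = 0

    def expand(lead: Hashable) -> None:
        """Add the uncrawled neighbors of a lead to the open list."""
        nonlocal pushes
        for nb in get_neighbors(lead):
            if nb in crawled:
                continue
            lead_count[nb] = lead_count.get(nb, 0) + 1
            pushes += 1
            # Re-push with the updated score; older entries become stale.
            heapq.heappush(heap, (-lead_count[nb], pushes, nb))

    for lead in leads:
        expand(lead)

    cost = 0
    while heap and cost < budget:
        _, _, profile = heapq.heappop(heap)
        if profile in crawled:
            continue  # stale entry for a profile that was already crawled
        crawled.add(profile)
        cost += 1
        if is_lead(profile):
            leads.append(profile)
            expand(profile)  # RTF: only leads contribute new candidates
    return leads, cost


if __name__ == "__main__":
    # Toy friendship graph; "d" is the only non-lead profile.
    graph = {
        "t0": ["a", "b"],
        "a": ["t0", "c"],
        "b": ["t0", "c"],
        "c": ["a", "b", "d"],
        "d": ["c"],
    }
    true_leads = {"t0", "a", "b", "c"}
    found, spent = rtf_best_first_crawl(
        ["t0"], lambda p: graph.get(p, []), lambda p: p in true_leads, budget=10
    )
    print(found, spent)
```

In this sketch the open list never contains a profile that is not adjacent to a known lead, which is what keeps the search "restricted". An ETF-style variant would also enqueue neighbors of crawled non-leads, at the cost of a larger open list.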

    Original language: English
    Pages (from-to): 1447-1480
    Number of pages: 34
    Journal: World Wide Web
    Volume: 22
    Issue number: 4
    State: Published - 15 Jul 2019

    Keywords

    • Artificial intelligence
    • Heuristic search
    • Online social networks

    ASJC Scopus subject areas

    • Software
    • Hardware and Architecture
    • Computer Networks and Communications
