VulChecker: Graph-based Vulnerability Localization in Source Code

Yisroel Mirsky, George Macon, Michael Brown, Carter Yagemann, Matthew Pruett, Evan Downing, Sukarno Mertoguno, Wenke Lee

Research output: Contribution to conferencePaperpeer-review

Abstract

In software development, it is critical to detect vulnerabilities in a project as early as possible. Although, deep learning has shown promise in this task, current state-of-the-art methods cannot classify and identify the line on which the vulnerability occurs. Instead, the developer is tasked with searching for an arbitrary bug in an entire function or even larger region of code. In this paper, we propose VulChecker: a tool that can precisely locate vulnerabilities in source code (down to the exact instruction) as well as classify their type (CWE). To accomplish this, we propose a new program representation, program slicing strategy, and the use of a message-passing graph neural network to utilize all of code’s semantics and improve the reach between a vulnerability’s root cause and manifestation points. We also propose a novel data augmentation strategy for
cheaply creating strong datasets for vulnerability detection in the wild, using free synthetic samples available online. With this training strategy, VulChecker was able to identify 24 CVEs (10 from 2019 & 2020) in 19 projects taken from the wild, with nearly zero false positives compared to a commercial tool that could only detect 4. VulChecker also discovered an exploitable zero-day vulnerability, which has been reported to developers for responsible disclosure.
Original languageEnglish
StatePublished - 2023
Event31st USENIX Security Symposium, Security 2022 - Boston, United States
Duration: 10 Aug 202212 Aug 2022

Conference

Conference31st USENIX Security Symposium, Security 2022
Country/TerritoryUnited States
CityBoston
Period10/08/2212/08/22

Fingerprint

Dive into the research topics of 'VulChecker: Graph-based Vulnerability Localization in Source Code'. Together they form a unique fingerprint.

Cite this