Tue 18 Jul 2023 11:15 - 11:30 at Smith Classroom (Gates G10) - ISSTA 2: Fuzzing 1 Chair(s): Jonathan Bell

Deep Learning (DL) systems have seen exponential growth in popularity and have become ubiquitous in everyday life. Such systems are built on top of popular DL libraries, e.g., TensorFlow and PyTorch, which provide APIs as building blocks for DL systems. Detecting bugs in these DL libraries is critical for almost all downstream DL systems in ensuring effectiveness/safety for end users. Meanwhile, traditional fuzzing techniques can hardly be effective in such a challenging domain, since the input DL programs need to satisfy both the input language (e.g., Python) syntax/semantics and the DL API input/shape constraints for tensor computations.
To address these limitations, we propose TitanFuzz – the first approach to directly leveraging Large Language Models (LLMs) to generate input programs for fuzzing DL libraries. LLMs are titanic models trained on billions of code snippets and can autoregressively generate human-like code. Our key insight is that the training corpora of modern LLMs also include numerous code snippets invoking DL library APIs, so the models can implicitly learn both the language syntax/semantics and the intricate DL API constraints needed to generate valid DL programs. More specifically, we use both generative and infilling LLMs (e.g., Codex/InCoder) to generate and mutate valid/diverse input DL programs for fuzzing. Our experimental results demonstrate that TitanFuzz can achieve 30.38%/50.84% higher code coverage than state-of-the-art fuzzers on TensorFlow/PyTorch. Furthermore, TitanFuzz is able to detect 65 bugs, with 44 already confirmed as previously unknown bugs.
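To make the approach concrete, the following is a minimal Python sketch of the generation side of such a pipeline. Everything here is a hypothetical illustration (the prompt wording, the llm_complete placeholder, the run_candidate harness, the 30-second hang budget), not TitanFuzz's actual implementation; the point is simply that a code LLM produces candidate DL programs, and simple crash/hang oracles flag suspicious ones.

import multiprocessing
import traceback

# Hypothetical prompt: ask a code LLM for a program exercising one target API.
SEED_PROMPT = (
    '"""Write a short Python program that calls torch.nn.Conv2d '
    'with diverse, edge-case arguments."""\n'
)

def llm_complete(prompt: str) -> str:
    """Placeholder for a call to a generative code LLM (e.g., Codex).
    Returns a candidate Python program as a string."""
    raise NotImplementedError("plug in an LLM client here")

def run_candidate(program: str, queue: multiprocessing.Queue) -> None:
    """Run a generated program in a child process so that hard crashes
    (e.g., segfaults in native library code) show up as abnormal process
    exits instead of killing the fuzzer itself."""
    try:
        exec(program, {"__name__": "__main__"})
        queue.put("ok")
    except Exception:
        queue.put("python-exception:\n" + traceback.format_exc())

def fuzz_once() -> None:
    program = llm_complete(SEED_PROMPT)
    queue: multiprocessing.Queue = multiprocessing.Queue()
    proc = multiprocessing.Process(target=run_candidate, args=(program, queue))
    proc.start()
    proc.join(timeout=30)            # hypothetical hang budget
    if proc.exitcode is None:        # still running: potential hang bug
        proc.terminate()
        print("TIMEOUT:\n" + program)
    elif proc.exitcode != 0:         # abnormal exit: likely native crash
        print(f"CRASH (exit code {proc.exitcode}):\n" + program)
    elif not queue.empty() and (msg := queue.get()) != "ok":
        print(msg)                   # uncaught Python exception: triage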
This paper demonstrates that modern titanic LLMs can be leveraged to directly perform both generation-based and mutation-based fuzzing, which have been studied for decades, while being fully automated, generalizable, and applicable to domains that are challenging for traditional approaches (such as DL systems). We hope TitanFuzz can stimulate more work in this promising direction of LLMs for fuzzing.
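The mutation side can be sketched similarly with an infilling model: mask a span of a valid seed program and let the model regenerate it, using the code before and after the mask as bidirectional context. The mask sentinel, seed program, and function names below are hypothetical placeholders in the style of InCoder, not necessarily TitanFuzz's actual mutation operators.

import random

MASK = "<|mask:0|>"  # InCoder-style sentinel; the exact token is model-specific

def mask_random_line(seed_program: str) -> str:
    """Replace one randomly chosen line of a valid seed program with a
    mask token, keeping the prefix and suffix as infilling context."""
    lines = seed_program.splitlines()
    lines[random.randrange(len(lines))] = MASK
    return "\n".join(lines)

def infill(masked_program: str) -> str:
    """Placeholder for a call to an infilling LLM (e.g., InCoder) that
    fills the masked span conditioned on the surrounding code."""
    raise NotImplementedError("plug in an infilling model here")

# Hypothetical seed: a known-valid program exercising torch.nn.Conv2d.
seed = (
    "import torch\n"
    "x = torch.randn(2, 3, 8, 8)\n"
    "conv = torch.nn.Conv2d(3, 4, kernel_size=3)\n"
    "y = conv(x)\n"
)
mutant = infill(mask_random_line(seed))  # feed back into the harness above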

Tue 18 Jul

Displayed time zone: Pacific Time (US & Canada)

10:30 - 12:00
ISSTA 2: Fuzzing 1 (Technical Papers) at Smith Classroom (Gates G10)
Chair(s): Jonathan Bell (Northeastern University)
10:30 (15m) Talk
Icicle: A Re-designed Emulator for Grey-Box Firmware Fuzzing
Michael Chesser (University of Adelaide), Surya Nepal (CSIRO’s Data61), Damith C. Ranasinghe (University of Adelaide)
10:45 (15m) Talk
Green Fuzzing: A Saturation-Based Stopping Criterion using Vulnerability Prediction
Stephan Lipp (TU Munich), Daniel Elsner (TU Munich), Severin Kacianka (TU Munich), Alexander Pretschner (TU Munich), Marcel Böhme (MPI-SP; Monash University), Sebastian Banescu (TU Munich)
11:00 (15m) Talk
ItyFuzz: Snapshot-Based Fuzzer for Smart Contract
Chaofan Shou (University of California at Santa Barbara), Shangyin Tan (University of California at Berkeley), Koushik Sen (University of California at Berkeley)
11:15 (15m) Talk
Large Language Models Are Zero-Shot Fuzzers: Fuzzing Deep-Learning Libraries via Large Language Models
Yinlin Deng (University of Illinois at Urbana-Champaign), Chunqiu Steven Xia (University of Illinois at Urbana-Champaign), Haoran Peng (University of Science and Technology of China), Chenyuan Yang (University of Illinois at Urbana-Champaign), Lingming Zhang (University of Illinois at Urbana-Champaign)
11:30 (15m) Talk
Detecting State Inconsistency Bugs in DApps via On-Chain Transaction Replay and Fuzzing
Mingxi Ye (Sun Yat-sen University), Yuhong Nan (Sun Yat-sen University), Zibin Zheng (Sun Yat-sen University), Dongpeng Wu (Sun Yat-sen University), Huizhong Li (WeBank)
11:45 (15m) Talk
Green Fuzzer Benchmarking