Automatically Reproducing Android Bug Reports using Natural Language Processing and Reinforcement Learning
As part of the process of resolving issues submitted by users via bug reports, Android developers attempt to reproduce and observe the crashes described by the bug reports. Due to the low-quality of bug reports and the complexity of modern apps, the reproduction process is non-trivial and time-consuming. Therefore, automatic approaches that can help reproduce Android bug reports are in great need. However, current approaches to help developers automatically reproduce bug reports are only able to handle limited forms of natural language text and struggle to successfully reproduce crashes for which the initial bug report had missing or imprecise steps. In this paper, we introduce a new fully automated approach to reproduce crashes from Android bug reports that addresses these limitations. Our approach accomplishes this by leveraging natural language processing techniques to more holistically and accurately analyze the natural language in Android bug reports and designing new techniques, based on reinforcement learning, to guide the search for successful reproducing steps. We conducted an empirical evaluation of our approach on 77 real world bug reports. Our approach achieved 67% precision and 77% recall in accurately extracting reproduction steps from bug reports, reproduced 74% of the total bug reports, and reproduced 64% of the bug reports that contained missing steps, significantly outperforming state of the art techniques.
Tue 18 JulDisplayed time zone: Pacific Time (US & Canada) change
13:30 - 15:00 | ISSTA 3: Deep-Learning for Software AnalysisTechnical Papers at Amazon Auditorium (Gates G20) Chair(s): Shiyi Wei University of Texas at Dallas | ||
13:30 15mTalk | API2Vec: Learning Representations of API Sequences for Malware Detection Technical Papers Lei Cui Zhongguancun Laboratory, Jiancong Cui University of Chinese Academy of Sciences; Institute of Information Engineering at Chinese Academy of Sciences, Yuede Ji University of North Texas, Zhiyu Hao Zhongguancun Laboratory, Lun Li Institute of Information Engineering at Chinese Academy of Sciences, Zhenquan Ding Institute of Information Engineering at Chinese Academy of Sciences DOI | ||
13:45 15mTalk | CONCORD: Clone-Aware Contrastive Learning for Source CodeACM SIGSOFT Distinguished Paper Technical Papers Yangruibo Ding Columbia University, Saikat Chakraborty Microsoft Research, Luca Buratti IBM Research, Saurabh Pujar IBM, Alessandro Morari IBM Research, Gail Kaiser Columbia University, Baishakhi Ray Columbia University DOI | ||
14:00 15mTalk | Type Batched Program Reduction Technical Papers DOI | ||
14:15 15mTalk | CodeGrid: A Grid Representation of Code Technical Papers Abdoul Kader Kaboré University of Luxembourg, Earl T. Barr University College London; Google DeepMind, Jacques Klein University of Luxembourg, Tegawendé F. Bissyandé University of Luxembourg DOI | ||
14:30 15mTalk | Guided Retraining to Enhance the Detection of Difficult Android Malware Technical Papers Nadia Daoudi University of Luxembourg, Kevin Allix CentraleSupélec, Tegawendé F. Bissyandé University of Luxembourg, Jacques Klein University of Luxembourg DOI | ||
14:45 15mTalk | Automatically Reproducing Android Bug Reports using Natural Language Processing and Reinforcement Learning Technical Papers Zhaoxu Zhang University of Southern California, Robert Winn University of Southern California, Yu Zhao University of Central Missouri, Tingting Yu University of Cincinnati, William G.J. Halfond University of Southern California DOI |