Data poisoning aims to compromise a machine-learning-based software component by contaminating its training set so as to change its prediction results on test inputs. Existing methods for deciding data-poisoning robustness suffer from either poor accuracy or long running time and, more importantly, they can only certify some of the truly robust cases while remaining inconclusive when certification fails; in other words, they cannot falsify the truly non-robust cases. To overcome this limitation, we propose a systematic-testing-based method that can both falsify and certify data-poisoning robustness for a widely used supervised-learning technique, k-nearest neighbors (KNN). Our method is faster and more accurate than the baseline enumeration method, owing to a novel over-approximate analysis in the abstract domain, which quickly narrows down the search space, followed by systematic testing in the concrete domain, which finds the actual violations. We have evaluated our method on a set of supervised-learning datasets. Our results show that it significantly outperforms state-of-the-art techniques and can decide the data-poisoning robustness of KNN prediction results for most test inputs.
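The baseline enumeration method that the abstract contrasts against can be sketched as follows. This is a minimal illustration, not the paper's artifact: it assumes one common formalization of poisoning (an attacker may have inserted up to n tainted points, so robustness requires the prediction to survive the removal of any subset of at most n training points), and the function names are ours.

```python
from itertools import combinations
from collections import Counter

def knn_predict(train, k, x):
    """Plain k-nearest-neighbors: majority label among the k closest points.
    `train` is a list of (feature_tuple, label) pairs."""
    neighbors = sorted(
        train,
        key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], x))
    )[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

def is_poisoning_robust(train, k, x, n):
    """Baseline enumeration check: the prediction on x is n-poisoning-robust
    iff removing any subset of at most n training points leaves it unchanged."""
    base = knn_predict(train, k, x)
    for m in range(1, n + 1):
        for removed in combinations(range(len(train)), m):
            removed = set(removed)
            rest = [p for i, p in enumerate(train) if i not in removed]
            if knn_predict(rest, k, x) != base:
                return False  # concrete violation found: non-robust
    return True
```

The enumeration is exponential in n (it tries every removal subset), which is exactly the cost the paper's over-approximate abstract analysis is designed to avoid; the concrete testing phase then only has to search the cases the abstract analysis could not certify.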

Wed 19 Jul

Displayed time zone: Pacific Time (US & Canada)

10:30 - 12:00
ISSTA 5: Improving Deep Learning Systems (Technical Papers) at Smith Classroom (Gates G10)
Chair(s): Michael Pradel University of Stuttgart
10:30
15m
Talk
Understanding and Tackling Label Errors in Deep Learning-Based Vulnerability Detection (Experience Paper)
Technical Papers
Xu Nie Huazhong University of Science and Technology; Beijing University of Posts and Telecommunications, Ningke Li Huazhong University of Science and Technology, Kailong Wang Huazhong University of Science and Technology, Shangguang Wang Beijing University of Posts and Telecommunications, Xiapu Luo Hong Kong Polytechnic University, Haoyu Wang Huazhong University of Science and Technology
10:45
15m
Talk
Improving Binary Code Similarity Transformer Models by Semantics-Driven Instruction Deemphasis
Technical Papers
Xiangzhe Xu Purdue University, Shiwei Feng Purdue University, Yapeng Ye Purdue University, Guangyu Shen Purdue University, Zian Su Purdue University, Siyuan Cheng Purdue University, Guanhong Tao Purdue University, Qingkai Shi Purdue University, Zhuo Zhang Purdue University, Xiangyu Zhang Purdue University
11:00
15m
Talk
CILIATE: Towards Fairer Class-Based Incremental Learning by Dataset and Training Refinement
Technical Papers
Xuanqi Gao Xi’an Jiaotong University, Juan Zhai University of Massachusetts Amherst, Shiqing Ma UMass Amherst, Chao Shen Xi’an Jiaotong University, Yufei Chen Xi’an Jiaotong University; City University of Hong Kong, Shiwei Wang Xi’an Jiaotong University
11:15
15m
Talk
DeepAtash: Focused Test Generation for Deep Learning Systems
Technical Papers
Tahereh Zohdinasab USI Lugano, Vincenzo Riccio University of Udine, Paolo Tonella USI Lugano
11:30
15m
Talk
Systematic Testing of the Data-Poisoning Robustness of KNN
Technical Papers
Yannan Li University of Southern California, Jingbo Wang University of Southern California, Chao Wang University of Southern California
11:45
15m
Talk
Semantic-Based Neural Network Repair
Technical Papers
Richard Schumi Singapore Management University, Jun Sun Singapore Management University