ROME: Testing Image Captioning Systems via Recursive Object Melting (ISSTA 2023 - Technical Papers)

Who

BoXi Yu, Zhiqing Zhong, Jiaqi Li, Yixing Yang, Shilin He, Pinjia He

Track

ISSTA 2023 Technical Papers

Time Zone

The program is currently displayed in (GMT-07:00) Pacific Time (US & Canada).

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Wed 19 Jul 2023 15:50 - 16:00 at Smith Classroom (Gates G10) - ISSTA Online 4: Testing and Analysis of DL Systems Chair(s): Elena Sherman

Abstract

Image captioning (IC) systems aim to generate a text description of the salient objects in an image. In recent years, IC systems have been increasingly integrated into our daily lives, for example in assistance for visually impaired people and description generation in Microsoft PowerPoint. However, even cutting-edge IC systems (e.g., Microsoft Azure Cognitive Services) and algorithms (e.g., OFA) can produce erroneous captions, leading to incorrect captioning of important objects, misunderstanding, and threats to personal safety. Existing testing approaches either fail to handle the complex form of IC system output (i.e., sentences in natural language) or generate unnatural images as test cases. To address these problems, we introduce Recursive Object MElting (Rome), a novel metamorphic testing approach for validating IC systems. Unlike existing approaches that generate test cases by inserting objects, which easily makes the generated images unnatural, Rome melts (i.e., removes and inpaints) objects. Rome assumes that the object set in the caption of an image includes the object set in the caption of the image generated from it by object melting. Given an image, Rome can recursively remove its objects to generate different pairs of images. We use Rome to test one widely adopted image captioning API and four state-of-the-art (SOTA) algorithms. The results show that the test cases generated by Rome look much more natural than those of the SOTA IC testing approach and achieve naturalness comparable to the original images. Meanwhile, by generating test pairs from 226 seed images, Rome reports a total of 9,121 erroneous issues with high precision (86.47%-92.17%). In addition, we use the test cases generated by Rome to retrain Oscar, which improves its performance across multiple evaluation metrics.
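The metamorphic relation described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: images are represented only by their sets of object labels, melting is modeled as removing a label (no real inpainting), object extraction from captions is naive keyword matching against a hypothetical vocabulary `VOCAB`, and `toy_captioner` is a made-up stand-in for an IC system with one planted error. The sketch also assumes melting stops once a single object remains.

```python
# Hypothetical vocabulary of object nouns; a real setup would need a proper parser.
VOCAB = {"dog", "ball", "bench", "person"}

def caption_objects(caption: str) -> frozenset[str]:
    """Naively extract object nouns from a caption by keyword matching."""
    words = caption.lower().replace(",", " ").replace(".", " ").split()
    return frozenset(w for w in words if w in VOCAB)

def melt_recursively(objects: frozenset[str]):
    """Yield (parent, child) object-set pairs, removing one object per step."""
    seen = set()

    def visit(current):
        if len(current) <= 1:  # assumption: keep at least one object in the image
            return
        for obj in sorted(current):
            child = current - {obj}
            if (current, child) not in seen:
                seen.add((current, child))
                yield current, child
                yield from visit(child)

    yield from visit(objects)

def unexpected_objects(parent_caption: str, child_caption: str) -> frozenset[str]:
    """Objects named in the melted image's caption but not in the parent's caption."""
    return caption_objects(child_caption) - caption_objects(parent_caption)

def toy_captioner(objects: frozenset[str]) -> str:
    """Made-up stand-in for an IC system, with one planted captioning error."""
    if objects == frozenset({"dog", "bench"}):
        return "a person sitting on a bench"  # planted error: no person in the image
    return "a " + " and a ".join(sorted(objects))

seed = frozenset({"dog", "ball", "bench"})
pairs = list(melt_recursively(seed))

# A pair is reported when the child's caption mentions an object the parent's does not.
violations = []
for parent, child in pairs:
    extra = unexpected_objects(toy_captioner(parent), toy_captioner(child))
    if extra:
        violations.append((parent, child, extra))

for parent, child, extra in violations:
    print(sorted(parent), "->", sorted(child), "unexpected:", sorted(extra))
```

In this toy run the single planted error surfaces in two pairs: once where the faulty caption belongs to the melted image and once where it belongs to the parent image. The relation flags a suspicious pair without saying which of the two captions is wrong.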

DOI

https://doi.org/10.1145/3597926.3598094

BoXi Yu

Chinese University of Hong Kong

China

Zhiqing Zhong

Chinese University of Hong Kong

China

Jiaqi Li

Chinese University of Hong Kong

China

Yixing Yang

Chinese University of Hong Kong

China

Shilin He

Microsoft Research

n.n.

Pinjia He

Chinese University of Hong Kong

China

Session Program

Wed 19 Jul
Displayed time zone: Pacific Time (US & Canada)

15:30 - 17:00	ISSTA Online 4: Testing and Analysis of DL Systems (Technical Papers) at Smith Classroom (Gates G10). Chair(s): Elena Sherman (Boise State University)

15:30 10m Talk	A Tale of Two Approximations: Tightening Over-Approximation for DNN Robustness Verification via Under-Approximation (Technical Papers). Zhiyi Xue (East China Normal University), Si Liu (ETH Zurich), Zhaodi Zhang (East China Normal University), Yiting Wu (East China Normal University), Min Zhang (East China Normal University)
15:40 10m Talk	In Defense of Simple Techniques for Neural Network Test Case Selection (Technical Papers). Shenglin Bao (Fudan University), Chaofeng Sha (Fudan University), Bihuan Chen (Fudan University), Xin Peng (Fudan University), Wenyun Zhao (Fudan University)
15:50 10m Talk	ROME: Testing Image Captioning Systems via Recursive Object Melting (Technical Papers). BoXi Yu (Chinese University of Hong Kong), Zhiqing Zhong (Chinese University of Hong Kong), Jiaqi Li (Chinese University of Hong Kong), Yixing Yang (Chinese University of Hong Kong), Shilin He (Microsoft Research), Pinjia He (Chinese University of Hong Kong)
16:00 10m Talk	ACETest: Automated Constraint Extraction for Testing Deep Learning Operators (Technical Papers). Jingyi Shi (Institute of Information Engineering at Chinese Academy of Sciences; University of Chinese Academy of Sciences), Yang Xiao (Institute of Information Engineering at Chinese Academy of Sciences; University of Chinese Academy of Sciences), Yuekang Li (University of New South Wales), Yeting Li (Institute of Information Engineering at Chinese Academy of Sciences; University of Chinese Academy of Sciences), DongSong Yu (Zhongguancun Laboratory), Chendong Yu (Institute of Information Engineering at Chinese Academy of Sciences; University of Chinese Academy of Sciences), Hui Su (Institute of Information Engineering at Chinese Academy of Sciences; University of Chinese Academy of Sciences), Yufeng Chen (Institute of Information Engineering at Chinese Academy of Sciences; University of Chinese Academy of Sciences), Wei Huo (Institute of Information Engineering at Chinese Academy of Sciences)
16:10 10m Talk	Latent Imitator: Generating Natural Individual Discriminatory Instances for Black-Box Fairness Testing (Technical Papers). Yisong Xiao (Beihang University), Aishan Liu (Beihang University; Institute of Dataspace), Li Tianlin (Nanyang Technological University), Xianglong Liu (Beihang University; Institute of Dataspace; Zhongguancun Laboratory)
16:20 10m Talk	CoopHance: Cooperative Enhancement for Robustness of Deep Learning Systems (Technical Papers). Quan Zhang (Tsinghua University), Yongqiang Tian (University of Waterloo), Yifeng Ding (University of Illinois at Urbana-Champaign), Shanshan Li (National University of Defense Technology), Chengnian Sun (University of Waterloo), Yu Jiang (Tsinghua University), Jiaguang Sun (Tsinghua University)
16:30 10m Talk	Back Deduction Based Testing for Word Sense Disambiguation Ability of Machine Translation Systems (Technical Papers). Jun Wang (Nanjing University), Yanhui Li (Nanjing University), Xiang Huang (Nanjing University), Lin Chen (Nanjing University), Xiaofang Zhang (Soochow University), Yuming Zhou (Nanjing University)
16:40 10m Talk	CydiOS: A Model-Based Testing Framework for iOS Apps (Technical Papers). Shuohan Wu (Hong Kong Polytechnic University), Jianfeng Li (Xi’an Jiaotong University), Hao Zhou (Hong Kong Polytechnic University), Yongsheng Fang (Beijing University of Posts and Telecommunications), Kaifa ZHAO (Hong Kong Polytechnic University), Haoyu Wang (Huazhong University of Science and Technology), Chenxiong Qian (University of Hong Kong), Xiapu Luo (Hong Kong Polytechnic University)