Systematically Producing Test Orders to Detect Order-Dependent Flaky Tests
Software testing suffers from the presence of flaky tests, which can pass or fail when run on the same version of code. Order-dependent tests (OD tests) are flaky tests whose outcome depends on the order in which they are run. An OD test can be detected when specific tests are run, or not run, before it, resulting in a difference in its outcome. While prior work has proposed rerunning tests in different random test orders, this approach provides no guarantee of detecting all OD tests. Later work proposed a more systematic approach to ordering tests, but it still fails to account for the relationships between all tests in the test suite.
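To make this concrete, consider a minimal, hypothetical JUnit example of an OD test pair (the class and test names are invented for illustration): one test mutates shared static state (often called a "polluter" in the flaky-test literature), and another test (a "victim") passes only when run before it.

```java
import static org.junit.Assert.assertEquals;

import org.junit.Test;

// Hypothetical example of an order-dependent test pair.
public class CounterTest {
    static int counter = 0;  // shared mutable state that couples the tests

    @Test
    public void testIncrement() {  // the "polluter": leaves state behind
        counter++;
        assertEquals(1, counter);
    }

    @Test
    public void testFresh() {      // the "victim": assumes unpolluted state
        assertEquals(0, counter);
    }
}
```

In the order testFresh, testIncrement both tests pass; in the order testIncrement, testFresh the second test fails. Thus testFresh is an OD test, and it is detected only by a test order that happens to run testIncrement before it.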
We propose three new techniques to detect OD tests through a more systematic means of producing test orders. Our techniques build upon prior work on Tuscan squares to cover test pairs in a minimal set of test orders while also obeying the constraints on how tests can be positioned in a test order with respect to their test classes. Further, as there are many test pairs that need to be covered, we develop a technique that can take a specified set of test pairs to cover and produce test orders that aim to cover just those pairs. Our evaluation with 289 known OD tests across 47 test suites from open-source projects shows that our most cost-effective technique can detect 97.2% of the known OD tests with, on average, 104.7 test orders per subject. While all techniques produce a relatively large number of test orders, our analysis of the minimal set of test orders needed to detect the OD tests shows that far fewer orders would suffice, representing an opportunity for future work to prioritize test orders.
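As a rough sketch of the pair-covering idea, and not the paper's exact algorithm (which must also respect the test-class placement constraints mentioned above), the following uses the classic Williams construction of a row-complete Latin square, one standard way to realize a Tuscan square when the number of tests n is even: the n generated orders place every ordered pair of distinct tests in adjacent positions exactly once. The class and method names here are invented for this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch: generate n test orders over tests 0..n-1 (n even) such that
// every ordered pair of distinct tests appears in adjacent positions
// exactly once across the orders (Williams' row-complete Latin square).
public class TuscanOrders {
    static List<int[]> generateOrders(int n) {
        if (n % 2 != 0) {
            // A common workaround for odd n is to pad with a dummy test.
            throw new IllegalArgumentException("n must be even");
        }
        // First order: 0, 1, n-1, 2, n-2, 3, n-3, ...
        int[] first = new int[n];
        int lo = 1, hi = n - 1;
        for (int idx = 1; idx < n; ) {
            first[idx++] = lo++;
            if (idx < n) {
                first[idx++] = hi--;
            }
        }
        // Every later order shifts the previous one by 1 modulo n.
        List<int[]> orders = new ArrayList<>();
        for (int r = 0; r < n; r++) {
            int[] order = new int[n];
            for (int c = 0; c < n; c++) {
                order[c] = (first[c] + r) % n;
            }
            orders.add(order);
        }
        return orders;
    }

    public static void main(String[] args) {
        // For 4 tests: 4 orders covering all 12 ordered pairs adjacently.
        for (int[] order : generateOrders(4)) {
            System.out.println(Arrays.toString(order));
        }
    }
}
```

Since each order of n tests contributes n - 1 adjacent pairs and there are n(n - 1) ordered pairs in total, n orders is the fewest possible, which is what makes Tuscan-square-style constructions attractive as a systematic alternative to random test orders.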
Thu 20 Jul (displayed time zone: Pacific Time, US & Canada)

10:30 - 12:00 | ISSTA 9: Testing 2 (Technical Papers) | Amazon Auditorium (Gates G20) | Chair(s): Cristian Cadar (Imperial College London)

10:30 (15m) Talk | A Comprehensive Study on Quality Assurance Tools for Java | Han Liu (East China Normal University), Sen Chen (Tianjin University), Ruitao Feng (UNSW), Chengwei Liu (Nanyang Technological University), Kaixuan Li (East China Normal University), Zhengzi Xu (Nanyang Technological University), Liming Nie (Nanyang Technological University), Yang Liu (Nanyang Technological University), Yixiang Chen (East China Normal University)

10:45 (15m) Talk | Transforming Test Suites into Croissants | Yang Chen (University of Illinois at Urbana-Champaign), Alperen Yildiz (Sabanci University), Darko Marinov (University of Illinois at Urbana-Champaign), Reyhaneh Jabbarvand (University of Illinois at Urbana-Champaign)

11:00 (15m) Talk | SlipCover: Near Zero-Overhead Code Coverage for Python | Juan Altmayer Pizzorno (University of Massachusetts Amherst), Emery D. Berger (University of Massachusetts Amherst)

11:15 (15m) Talk | To Kill a Mutant: An Empirical Study of Mutation Testing Kills | Hang Du (University of California at Irvine), Vijay Krishna Palepu (Microsoft), James Jones (University of California at Irvine)

11:30 (15m) Talk | Systematically Producing Test Orders to Detect Order-Dependent Flaky Tests | Chengpeng Li (University of Texas at Austin), M. Mahdi Khosravi (Middle East Technical University), Wing Lam (George Mason University), August Shi (University of Texas at Austin)

11:45 (15m) Talk | Extracting Inline Tests from Unit Tests | Yu Liu (University of Texas at Austin), Pengyu Nie (University of Texas at Austin), Anna Guo (University of Texas at Austin), Milos Gligoric (University of Texas at Austin), Owolabi Legunsen (Cornell University)