28 May 2024

AIST 2024

4th International Workshop on Artificial Intelligence in Software Testing

Toronto, Canada
Co-located with ICST 2024

Important Dates

  • Submission deadline: 5 February 2024 AoE
  • Notification of Acceptance: 26 February 2024
  • Camera-ready: 8 March 2024
  • Workshop: 28 May 2024

Paper Submission

Papers must be submitted via EasyChair at the following link:
https://easychair.org/conferences/?conf=ieeeicst24workshops
See Call for Papers for more details.

Theme and Goals

The integration of AI techniques into software testing is a promising frontier that is still in its early stages. Over the past few years, software developers have witnessed a surge of innovative approaches aimed at streamlining the development lifecycle, with a particular focus on the testing phase. These approaches harness the capabilities of AI, including Convolutional Neural Networks (CNNs), Deep Neural Networks (DNNs), and Large Language Models (LLMs), to transform the way we verify and validate software applications.

The adoption of AI in software testing yields numerous advantages. It significantly reduces the time and effort invested in repetitive and mundane testing tasks, allowing human testers to focus on more complex and creative aspects of testing, such as exploratory testing and user experience evaluation. Additionally, AI-driven testing improves software quality by enhancing test coverage and mutation score. The outcome is not just cost savings but also increased customer satisfaction, as the likelihood of critical software defects making it into production is greatly diminished.

The AIST workshop aspires to bring together a diverse community of researchers and practitioners. It aims to create a platform for the presentation and discussion of cutting-edge research and development initiatives in the area of AI-driven software testing. The workshop encourages collaboration, facilitates the exchange of knowledge and ideas, and fosters a holistic understanding of the potential applications of AI in the context of software testing. By acknowledging the broad spectrum of perspectives and topics under the AI umbrella, AIST seeks to be a catalyst for innovation, ultimately paving the way for more efficient and effective software testing.

Call for Papers

We invite novel papers from both academia and industry on AI applied to software testing that cover, but are not limited to, the following aspects:

  • AI for test case design, test generation, test prioritization, and test reduction.
  • AI for load testing and performance testing.
  • AI for monitoring and optimizing running systems.
  • Explainable AI for software testing.
  • Case studies, experience reports, benchmarking, and best practices.
  • New ideas, emerging results, and position papers.
  • Industrial case studies with lessons learned or practical guidelines.

Papers can be of one of the following types:

  • Full Papers (max. 8 pages): Papers presenting mature research results or industrial practices.
  • Short Papers (max. 4 pages): Papers presenting new ideas or preliminary results.
  • Tool Papers (max. 4 pages): Papers presenting an AI-enabled testing tool. Tool papers should communicate the purpose and use cases for the tool. The tool should be made available (either free to download or for purchase).
  • Position Papers (max. 2 pages): Position statements and open challenges, intended to spark discussion or debate.

The reviewing process is single-blind; papers therefore do not need to be anonymized. Papers must conform to the two-column IEEE conference publication format and should be submitted via EasyChair at the following link: https://easychair.org/conferences/?conf=ieeeicst24workshops

All submissions must be original, unpublished, and not under submission for publication elsewhere. Submissions will be evaluated according to the relevance and originality of the work and their potential to generate discussion among workshop participants. Each submission will be reviewed by three reviewers, and all accepted papers will be published as part of the ICST proceedings. For each accepted paper, at least one author must register for the workshop and present the paper.

Program

  • 9:00 - 10:30: Opening/Keynote/Papers 1
    • 9:00 - 9:15: Workshop Opening
    • 9:15 - 10:15: Lingming Zhang. Keynote: Towards Better Software Quality in the Era of Large Language Models
    • 10:15 - 10:30: Cristopher McIntyre-Garcia, Adrien Heymans, Beril Borali, Wonsook Lee and Shiva Nejati. Generating Minimalist Adversarial Perturbations to Test Object-Detection Models: An Adaptive Multi-Metric Evolutionary Search Approach
  • 11:00 - 12:30: Papers 2
    • 11:00 - 11:22: Sol Zilberman and Betty H. C. Cheng. “No Free Lunch” when using Large Language Models to Verify Self-Generated Programs
    • 11:22 - 11:45: Md Asif Khan, Akramul Azim, Ramiro Liscano, Kevin Smith, Yee-Kang Chang, Qasim Tauseef and Gkerta Seferi. An End-to-End Test Case Prioritization Framework using Optimized Machine Learning Models
    • 11:45 - 12:07: Gaadha Sudheerbabu, Tanwir Ahmad, Dragos Truscan, Juri Vain and Ivan Porres. Iterative Optimization of Hyperparameter-based Metamorphic Transformations
    • 12:07 - 12:30: Hajra Naeem and Manar Alalfi. Machine Learning for Cross-Vulnerability Prediction in Smart Contracts
  • 14:00 - 15:30: Tutorials
    • 14:00 - 14:45: Mitchell Olsthoorn and Annibale Panichella. Tutorial: A Hands-on Tutorial for Automatic Test Case Generation and Fuzzing for JavaScript
    • 14:45 - 15:30: Addison Crump and Thorsten Holz. Tutorial: SoKotHban - Competitive Adversarial Testing of Sokoban Solvers
  • 16:00 - 17:30: Panel Discussion/Closing
    • 16:00 - 17:15: Panel Discussion. TBA
    • 17:15 - 17:30: Workshop Closing

Keynote and Tutorials

Keynote: Lingming Zhang - Towards Better Software Quality in the Era of Large Language Models

Abstract: Large Language Models (LLMs), such as ChatGPT, have shown impressive performance in various downstream tasks spanning diverse fields. In this talk, I will present our recent work on leveraging LLMs for quality assurance of real-world software systems, encompassing software testing, program repair, and program synthesis. More specifically, I will first talk about how LLMs can be directly applied to both generation-based and mutation-based fuzz testing, techniques studied for decades, while being fully automated, generalizable, and applicable to challenging domains (including quantum computing systems). Next, I will talk about AlphaRepair, which reformulates the automated program repair (APR) problem as an infilling (or cloze) task and demonstrates that LLMs can directly outperform traditional APR techniques studied for over a decade. Lastly, I will briefly talk about our recent work on LLM-based program synthesis, including Magicoder and EvalPlus.
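
To make the cloze formulation above concrete, the short sketch below masks one suspicious token in a buggy line and asks a masked language model for ranked replacements. The model checkpoint (microsoft/codebert-base-mlm) and the buggy snippet are illustrative assumptions for this sketch, not the actual AlphaRepair setup.

    # Minimal sketch of cloze-style ("infilling") repair: mask a suspicious token
    # and let a masked language model rank candidate replacements.
    # The model and snippet are illustrative assumptions, not the AlphaRepair setup.
    from transformers import pipeline

    # Any public fill-mask code model works for this illustration.
    fill = pipeline("fill-mask", model="microsoft/codebert-base-mlm")

    # Hypothetical buggy guard: the comparison operator is suspected to be faulty,
    # so it is replaced with the model's mask token.
    template = f"if (index {fill.tokenizer.mask_token} list.size()) return null;"

    # Each candidate "patch" comes back with a likelihood score for ranking.
    for candidate in fill(template, top_k=5):
        print(candidate["token_str"], candidate["score"])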

Biography: Lingming Zhang is an Associate Professor in the Department of Computer Science at the University of Illinois Urbana-Champaign (UIUC). His main research interests lie in Software Engineering and Programming Languages, as well as their synergy with Machine Learning, including Large Language Models (LLMs) for Code and ML System Reliability. To date, his work has found 1000+ new bugs and vulnerabilities in real-world software systems, including deep learning compilers/libraries, C/C++ compilers, Java virtual machines, operating systems, and even quantum computing systems. He is the recipient of ACM SIGSOFT Early Career Researcher Award, NSF CAREER Award, UIUC Dean’s Award for Excellence in Research, UIUC List of Teachers Ranked as Outstanding, multiple ACM SIGSOFT Distinguished Paper Awards, and various awards/grants from Alibaba, Amazon, Google, Kwai Inc., Meta, NVIDIA, and Samsung. He currently serves as program co-chair for ASE 2025 and LLM4Code 2024, and associate chair for OOPSLA 2024. For more details, please visit: http://lingming.cs.illinois.edu/

Tutorial: Addison Crump - SoKotHban: Competitive Adversarial Testing of Sokoban Solvers

Abstract: Sokoban is a puzzle game where the player is tasked with moving crates in a warehouse to designated places. While the task is simple on its surface, the reality is quite different: intricate floor layouts force particular sequences of moves, the limited space in which to move creates difficulties when crates block each other, and as the warehouse grows in size, things only become more complex. Solving these puzzles automatically is a problem that maps nicely onto real-world applications, but the evaluation of these strategies often relies on fixed test suites which may be prone to overfitting. We propose a King-of-the-Hill (KotH) competition designed to develop and evaluate automated puzzle solvers and generators. With each contestant submitting both a puzzle solver and a puzzle generator, contestants must demonstrate both effective solving strategies to overcome puzzles generated by opponents and generation strategies which exploit limitations of the other competitors' solvers. We expect that this competition will act as an exciting way to push the envelope in automated solving and adversarial testing of solvers, with contestants discovering new ways to target the weaknesses in each other’s strategies and optimisations.

Biography: Addison Crump is a second-year PhD student at CISPA Helmholtz Center for Information Security under the supervision of Prof. Dr. Thorsten Holz. Though specialising primarily in fuzzing for security testing, Addison’s focus prioritises the integration of strategies from other testing domains to make security testing more automated and approachable. Outside of academia, Addison is a maintainer of LibAFL and a member of secret.club.

Tutorial: Annibale Panichella and Mitchell Olsthoorn - A Hands-on Tutorial for Automatic Test Case Generation and Fuzzing for JavaScript

Abstract: The SynTest-Framework is designed as a user-friendly, flexible, and highly customizable platform that supports fuzzing and automated test case generation. It serves as a base for developing testing tools tailored to various programming languages. Additionally, the framework contains a collection of language-independent search algorithms that are optimized for automatic test case generation and fuzzing. Our primary objective with this framework is to streamline the process for researchers to devise and implement novel methods for automatic test case generation. Additionally, we hope that the framework will make it easier for practitioners to adopt automatic test case generation in their projects.

In this tutorial session, we will show how to use the framework to implement a new automatic test case generation approach. The tutorial will be hands-on and will consist of a series of practical scenarios. These scenarios will be based on the TypeScript programming language. Join us in this fun and interactive tutorial session and equip yourself with the skills to demonstrate your approach to software testing and validation.

Biography (Annibale Panichella): Annibale is an associate professor in the Software Engineering Research Group (SERG) at Delft University of Technology (TU Delft) in the Netherlands. He is the head of the Computational Intelligence for Software Engineering Lab (CISELab) within SERG. His research interests include security testing, software testing, search-based software engineering, testing for AI, and empirical software engineering. He serves and has served as a program committee member of various international conferences (e.g., ICSE, ESEC/FSE, ISSTA, GECCO, ICST) and as a reviewer for various international journals (e.g., TSE, TOSEM, TEVC, EMSE, STVR) in the fields of software engineering and evolutionary computation.

Biography (Mitchell Olsthoorn): Mitchell is a postdoctoral researcher in the Software Engineering Research Group (SERG) at the Delft University of Technology. He is also a member of the Computational Intelligence for Software Engineering lab (CISELab) and the Delft Blockchain Lab (DBL). His interests include network security, computational intelligence, and pen-testing. Currently, he is working on combining search-based approaches with Large Language Models (LLMs).

Organization

Organizing Committee

Program Committee

Steering Committee

Previous Editions

Contact