Towards Automatic Test Suite Generation for Functional Programming Assignments using Budgeted Compositional Symbolic Execution
Date: Wed, October 26, 2022
Room: Lecture Hall D@ta (Building 36 HB.01.630)
Note: This is a MSc thesis defense
In this thesis, we have defined a symbolic execution technique to automatically generate test suites for programs written in functional programming languages that can find the behavioural differences between a reference implementation and a set of potentially different implementations. Our symbolic execution technique uses a constraint solver in order to find a model that satisfies all constraints that together represent an execution path through the program. Furthermore, our technique utilises manually defined budget constraints to guide the symbolic execution to more interesting areas of the programs. These budget constraints define an initial budget that dictates when the symbolic execution terminates, and a set of costs associated to certain operations such as calling (specific) functions or performing (specific) pattern matches. This allows the symbolic execution to deplete budgets faster when it explores functions or branches of a program that are deemed “unlikely to be interesting”. This results in less system resources being used on exploring these uninteresting execution paths, and instead allows for exploration of deeper paths in more interesting areas of the programs. Our budget constraint optimisation strategy works alongside other well-known optimisation strategies for symbolic execution such as early branch pruning and reusing intermediate results, more commonly known in the field of symbolic execution as compositioning.
In order to perform our symbolic execution on functional programs, we have defined a variation of Control Flow Graphs (CFG) for functional programming languages which allow modelling the execution flow between the different expressions that form the body of functions in functional programming. Additionally, our technique uses a constraint solver in order to find what paths through the program are feasible and to generate inputs that will lead to the execution of said paths. To allow us to specify these constraints on a higher-level, we have defined an intermediate Domain Specific Language (DSL), which can be transformed into a lower-level language understood by the constraint solver. We refer to this DSL as our Intermediate Constraint Language (ICL). Our ICL supports working seamlessly with type systems that allow inheritance, rather than only supporting Algebraic Data Types (ADT).
Furthermore, we have developed a symbolic execution engine for programs written in a pure and functional subset of the Scala programming language. This symbolic execution engine implements our proposed symbolic execution technique alongside the other described optimisation techniques. We have used this implementation in order to measure the effectiveness of our symbolic execution technique. In order to do this, we have used our technique in a scenario to automatically generate test suites for real student submissions of introductory assignments to functional programming. We have performed a comparison between the obtained branch coverages and mutation scores with our technique and the manually written test suites that have been used for multiple years. Finally, we have reflected on the feasibility of using this approach in practice by looking at the execution time and the number of inquiries to the constraint solver, required to generate test suites that are able to find an adequate number of errors.