When test cases take too long to execute, machine learning on data from past test executions and code changes can be used to predict their results.
Some test cases take a very long time to execute. Waiting hours (or sometimes even days) for a test result causes friction: a developer who has moved on to another topic will probably have to reread their own code before being able to fix anything. When executing whole test suites, we would also like to know beforehand which tests are likely to fail given the code changes, so that these tests can be executed first. This is non-trivial because the actual impact of a code change can be hard to assess.
However, the data inherently present in modern software development can help: code changes tracked in a source control system, together with logs of past test results, provide the input for a machine learning system that predicts the results of test cases ahead of time and within seconds. Developers thus get immediate feedback on the changes they are about to check in. In addition, actual test execution can be scheduled so that the tests predicted to fail run first.
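To make the idea concrete, here is a minimal sketch in Python. It is not the system described in this talk: it swaps in a simple smoothed failure-rate estimate per (changed file, test) pair in place of a real machine learning model, and the history records, file names, test names, and function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical history: (files changed in a commit, test name, did the test fail?).
HISTORY = [
    ({"parser.py"}, "test_parse", True),
    ({"parser.py"}, "test_parse", True),
    ({"parser.py"}, "test_render", False),
    ({"render.py"}, "test_render", True),
    ({"render.py"}, "test_parse", False),
    ({"docs.md"}, "test_parse", False),
    ({"docs.md"}, "test_render", False),
]


def train(history):
    """Count runs and failures for every (changed file, test) pair."""
    runs = defaultdict(int)
    fails = defaultdict(int)
    for changed_files, test, failed in history:
        for path in changed_files:
            runs[(path, test)] += 1
            fails[(path, test)] += int(failed)
    return dict(runs), dict(fails)


def predict_failure(model, changed_files, test):
    """Estimate the failure probability of `test` for a set of changed files.

    Uses the highest Laplace-smoothed failure rate over the changed files;
    a pair never seen in the history yields 0.5 (assumed prior).
    """
    runs, fails = model
    rates = [
        (fails.get((path, test), 0) + 1) / (runs.get((path, test), 0) + 2)
        for path in changed_files
    ]
    return max(rates, default=0.5)


def prioritize(model, changed_files, tests):
    """Order tests so that those most likely to fail come first."""
    return sorted(tests, key=lambda t: -predict_failure(model, changed_files, t))


model = train(HISTORY)
order = prioritize(model, {"parser.py"}, ["test_render", "test_parse"])
```

With this toy history, a change to `parser.py` puts `test_parse` ahead of `test_render`. A real system would presumably use richer features of the change and a trained classifier, but the input and output have the same shape: change data and past results in, a ranked list of tests out.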
We trained a machine learning system on five years' worth of source control and test execution data from a software project with more than 200 programmers. In our talk, you will learn about the required data, how the data are processed, the machine learning approach we used, and how well the system is able to predict test results.