The data mining approach to automated software testing

Mark Last, Menahem Friedman, Abraham Kandel

Research output: Contribution to conference > Paper > peer-review

46 Scopus citations


In today's industry, the design of software tests is mostly based on the testers' expertise, while test automation tools are limited to execution of pre-planned tests only. Evaluation of test outputs also demands considerable effort from human testers, who often have imperfect knowledge of the requirements specification. Not surprisingly, this manual approach to software testing results in heavy losses to the world's economy. The costs of the so-called "catastrophic" software failures (such as the Mars Polar Lander shutdown in 1999) are hard even to measure. In this paper, we demonstrate the potential use of data mining algorithms for automated induction of functional requirements from execution data. The induced data mining models of tested software can be utilized for recovering missing and incomplete specifications, designing a minimal set of regression tests, and evaluating the correctness of software outputs when testing new, potentially flawed releases of the system. To study the feasibility of the proposed approach, we have applied a novel data mining algorithm called Info-Fuzzy Network (IFN) to execution data of a general-purpose code for solving partial differential equations. After being trained on a relatively small number of randomly generated input-output examples, the model constructed by the IFN algorithm has shown a clear capability to discriminate between correct and faulty versions of the program.
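
The workflow described in the abstract (induce a model from input-output examples of a trusted version, then use it as an oracle for a new release) can be illustrated with a minimal sketch. This is not the Info-Fuzzy Network algorithm: as a stand-in it uses a simple majority-vote lookup table over discretized inputs. The two toy programs, the function names, the bin count, and the sample sizes are all hypothetical choices made for illustration only.

```python
import random
from collections import Counter, defaultdict


def legacy_program(x, y):
    """Hypothetical trusted release: labels a point by the sign of x - y."""
    return "high" if x - y > 0 else "low"


def faulty_program(x, y):
    """Hypothetical new release with an injected bug (wrong operator)."""
    return "high" if x + y > 0 else "low"


def bin_of(value, n_bins=10, lo=-1.0, hi=1.0):
    """Discretize a continuous input into one of n_bins equal-width intervals."""
    idx = int((value - lo) / (hi - lo) * n_bins)
    return min(max(idx, 0), n_bins - 1)


def induce_model(program, n_examples, rng):
    """Learn the majority output for each cell of the discretized input space.

    Stand-in for the induced data mining model; NOT the IFN algorithm.
    """
    votes = defaultdict(Counter)
    for _ in range(n_examples):
        x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)
        votes[(bin_of(x), bin_of(y))][program(x, y)] += 1
    return {cell: counts.most_common(1)[0][0] for cell, counts in votes.items()}


def disagreement_rate(model, program, n_tests, rng):
    """Fraction of random tests where the program's output differs from the model's."""
    mismatches = checked = 0
    for _ in range(n_tests):
        x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)
        expected = model.get((bin_of(x), bin_of(y)))
        if expected is None:  # cell never seen in training: no oracle available
            continue
        checked += 1
        mismatches += program(x, y) != expected
    return mismatches / checked if checked else 0.0


rng = random.Random(0)  # fixed seed so the run is repeatable
model = induce_model(legacy_program, 2000, rng)
rate_ok = disagreement_rate(model, legacy_program, 2000, rng)
rate_bad = disagreement_rate(model, faulty_program, 2000, rng)
print(f"disagreement with trusted version: {rate_ok:.2f}")
print(f"disagreement with faulty version:  {rate_bad:.2f}")
```

In this toy setting the trusted version disagrees with the model only in cells that straddle the decision boundary, while the faulty version disagrees over a large part of the input space, so a threshold on the disagreement rate separates the two. A real study would need a far more expressive model and a realistic program under test.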

Original language: English
Number of pages: 9
State: Published - 1 Dec 2003
Event: 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '03 - Washington, DC, United States
Duration: 24 Aug 2003 → 27 Aug 2003


Conference: 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '03
Country/Territory: United States
City: Washington, DC


Keywords

  • Automated software testing
  • Finite element solver
  • Info-fuzzy networks
  • Input-output analysis
  • Regression testing

ASJC Scopus subject areas

  • Software
  • Information Systems

