zhang01

Advanced Code Coverage Analysis Using Substring Holes

1. Problem: Drill-down capabilities that examine coverage data at different granularities, from directories through files to functions and lines of source code, are insufficient. They assume that coverage issues follow the code hierarchy. We argue that this is not the case for much of the uncovered code.

2. Contribution: We developed a hole analysis algorithm and tool based on common substrings in the names of functions.

Substring hole analysis is based on the observation that developers typically give semantically meaningful names to software source code elements. Therefore, source code elements with similar names are often associated with a common topic, context, or functionality. In a nutshell and very informally, substring hole analysis considers substrings as holes if they are common to multiple element names with poor coverage.

The main goal of code coverage analysis is to increase the probability of finding bugs by improving the coverage of the system’s test suite. Using the raw code coverage data to achieve this goal can be a difficult task, especially when the size of the data is very large. In many cases, the newly added tests improve the coverage but do not increase the probability of detecting new bugs. Substring hole analysis provides a list of uncovered holes that relate to a certain functionality of the system (e.g., error handling). Therefore, it helps in adding new tests or changing existing tests in a way that increases the coverage of the most significant code areas.

3. Methodology: To get the most out of substring hole analysis, we found it useful to run multiple rounds of analysis.

1). In the first round, we recommend finding holes on the program hierarchy. Suppose we find a substring hole with 50 files, all of which belong to a single uncovered directory. Such a hole is less informative than other holes because the hierarchical coverage analysis already reports that directory. By contrast, a substring hole whose tasks belong to many different hierarchical nodes, most of which are partially covered, is an interesting cross-concern hole.

2). In the next round, substring hole analysis is run on the remaining data. This typically reveals some holes that are well known to the user, for example holes that are due to running the tests on only a subset of the platforms supported by the software.

3). A final round of hole analysis is then done over the remaining data. This often brings less obvious holes to the surface.
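
The rounds above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the tool's implementation: the task names, the "directory/function" naming convention, and the helpers directories_of, is_cross_concern, and remove_explained are all hypothetical.

```python
def directories_of(hole_tasks):
    """Return the set of directories that the tasks of a hole belong to.

    Assumption: a task name looks like "directory/function".
    """
    return {task.rsplit("/", 1)[0] if "/" in task else "" for task in hole_tasks}


def is_cross_concern(hole_tasks, min_dirs=2):
    """A hole confined to one directory adds little to the hierarchical report;
    a hole spread over several directories is the more interesting case."""
    return len(directories_of(hole_tasks)) >= min_dirs


def remove_explained(coverage, known_substrings):
    """Drop tasks whose names contain a substring the user has already explained,
    so that the next round runs only on the remaining data."""
    return {
        name: covered
        for name, covered in coverage.items()
        if not any(sub in name for sub in known_substrings)
    }


# Hypothetical coverage data: task name -> was it covered?
coverage = {
    "net/win32_socket_open": False,
    "net/win32_socket_close": False,
    "fs/win32_path_join": False,
    "fs/read_error_handler": False,
    "net/send_error_handler": False,
    "fs/read_file": True,
}

# Round 2 finding: "win32" is uncovered because tests ran on one platform only.
remaining = remove_explained(coverage, ["win32"])

# Round 3 then looks at what is left, e.g. the "error_handler" tasks.
error_hole = [name for name in remaining if "error_handler" in name]
print(sorted(remaining))
print(is_cross_concern(error_hole))
```

Filtering out the already explained tasks before the last round keeps well-known holes from crowding out the less obvious ones.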

The coverage tasks are usually names of functions. For initial reports, it is sufficient to have a Boolean value for each task indicating whether it was covered. An example of a single coverage task is “Exception.firm.io, 0”. A hole is a set of coverage tasks, of which at least one is not covered, that have a common substring. This substring identifies the hole.
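
A minimal sketch of this definition in Python is shown below. It is a brute-force illustration, not the algorithm from the paper: it enumerates every substring of every task name, which would be too slow for large data. The function names, the min_len and min_tasks thresholds, and all task names other than “Exception.firm.io” are assumptions, as is reading the value 0 as "not covered".

```python
from collections import defaultdict


def substrings(name, min_len=4):
    """All substrings of name with at least min_len characters."""
    return {
        name[i:j]
        for i in range(len(name))
        for j in range(i + min_len, len(name) + 1)
    }


def find_holes(coverage, min_len=4, min_tasks=2):
    """Map each common substring to (number of tasks, number of uncovered tasks).

    A substring is reported as a hole when it is shared by at least
    min_tasks task names and at least one of those tasks is not covered.
    """
    tasks_by_substring = defaultdict(set)
    for name in coverage:
        for sub in substrings(name, min_len):
            tasks_by_substring[sub].add(name)

    holes = {}
    for sub, names in tasks_by_substring.items():
        uncovered = sum(1 for n in names if not coverage[n])
        if len(names) >= min_tasks and uncovered >= 1:
            holes[sub] = (len(names), uncovered)
    return holes


# Coverage tasks: name -> covered? (False corresponds to the value 0 above.)
coverage = {
    "Exception.firm.io": False,
    "Exception.firm.net": False,  # hypothetical task name
    "Parser.run": True,           # hypothetical task name
}

holes = find_holes(coverage)
print(holes["Exception"])
```

Note that every shorter substring of a reported hole (for example “Exce”) is reported as well, so a practical tool would need some way to rank or merge overlapping holes.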

4. Example:

If a line of output in the coverage report for functions is “Exception, 912, 907”, it means that there are 912 functions whose names contain the substring “Exception”, 907 of which were not covered.
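
To make the arithmetic behind such a line concrete, here is a small Python sketch that parses it and computes the uncovered share. The field order (substring, total, uncovered) follows the example above; the function names parse_report_line and describe are hypothetical and not part of the tool.

```python
def parse_report_line(line):
    """Split a report line of the form "substring, total, uncovered"."""
    # rsplit from the right so a comma inside the substring itself is preserved.
    substring, total, uncovered = line.rsplit(",", 2)
    return substring.strip(), int(total), int(uncovered)


def describe(line):
    """Render a report line as a sentence with the uncovered percentage."""
    substring, total, uncovered = parse_report_line(line)
    share = uncovered / total
    return f"{substring}: {uncovered} of {total} functions uncovered ({share:.1%})"


print(describe("Exception, 912, 907"))
# prints: Exception: 907 of 912 functions uncovered (99.5%)
```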

posted on 2012-02-28 15:21 zhanghs