declaration: array.c:5: int i;
The range of possible values for a scalar variable can be
deduced from assignment operations (e.g., i= 0 ;) and com-
parisons in conditional branch instructions. For instance,
from the conditional if( i< 5 ) we can conclude that along
the true branch scalar variable i can only have values less
than 5, and along the false branch it can only have values
larger than or equal to 5. The information is combined
when multiple conditionals are passed along a path, and
fixed at each assignment. A precise indication of a value
range is easily lost though. For instance, when the current
value range for i is ( i ≥0 ∧ i < 5 ) and the next operation
seen is i+ + then the known range for i reduces to ( i ≥0 ).
Value ranges are computed conservatively in U
NO.
The range could be extended by relying on specialized
tools for resolving constraint systems, e.g. (Kelly et al.
1996). Limits clearly will remain also in that case, so it is
uncertain if such an extension could indeed significantly
improve UNO’s performance in practice.
Eliminating Infeasible Paths: A path through the control
flow graph consists of a simple sequence of transitions.
Only two basic types of transitions are of interest to us in
determining if a specific path is feasible or not: some tran-
sitions correspond to steps through a conditional branch
point in the program source (e.g., ( i > 5 ) ≡ true or
( i> 5 ) ≡ false); the remaining transitions correspond to
unconditional steps (e.g., i+ +). If a path is infeasible it
must contain at least one conditional that cannot hold in
the given context. It may, for instance, contradict earlier
conditionals that appear in the same path. The sequence
( i> 5 ) ≡ true ; ( i<5 ) ≡ true ; clearly is not consistent.
The conditional can also conflict with earlier assignments.
For instance, the sequence i= 6 ; ( i> 5 ) ≡ false ; is not
consistent. In many cases UNO can determine if a path is
feasible or not. It can use this information to suppress
error reports on execution paths that turn out to be infeasi-
ble, and it can also, more profitably, use this information
to shortcut the search procedure that is used to compute
the language intersection of property and system. A
resolver tool, such as the Omega calculator from (Kelley
et al. 1996), can be used to improve accuracy, but as
before, not all infeasible paths can always be detected, and
therefore a reasonable engineering compromise must be
sought.
GLOBAL ANALYSIS
To check if a global variable can be evaluated or derefer-
enced before it is defined is harder than checking if the
same is true for a local variable. Globals in C are by
default initialized to zero, so in principle it is impossible
to violate the rule that a value must be assigned before the
variable is evaluated. An initial value of zero is still a
problem, though, when the variable is a pointer. Derefer-
encing a nil-pointer is always a fatal error. The second
phase of the analysis is therefore concentrated on the anal-
ysis of dereferencing operations on global pointer vari-
ables, all initialized to a nil value by default.
For global variables we need to be able to take into
account the possible call chain through the function call
graph, not just the information derived from possible exe-
cution paths within local control-flow graphs of functions.
This can quickly become overly complex, especially for
large code bases. U
NO therefore uses an approximation
method based on the information gathered from the inter-
mediate files from the first pass of the analysis. The func-
tion call graph plays a central role in this analysis.
The analysis starts at the main() routine and recur-
sively descends into all functions that can be called from
that routine, via a depth-first search. Because the search
touches all functions reachable from main, as a by product
of the analysis, it can also readily identify all functions
that are not called, which can capture a remarkable
amount of discarded code in evolving programs.
To be able to do the global analysis in a meaningful
way, the analyzer must have access to some minimal
information from the first pass. It needs to know, for
instance, the list of functions that can be called from a
given function, and it needs to know on which execution
paths pointer variables may be evaluated, dereferenced, or
set. The first pass of U
NO captures this information by
generating, among other information, a highly condensed
version of the control-flow graph for each function, that
contains only this information. If no globals are set or
used, the abstract graph contains only the points where
other functions are called. To make the analyses more pre-
cise, also information about points in the abstract graph
where global variables have known (zero or nonzero)
value, are recorded. The check now reduces to the same
problem as encountered in the local analyses: a standard
model checking problem.
UNO uses a predefined property to capture global
nil-pointer dereferencing problems, but also accepts user-
defined global properties, again defined as ANSI-C func-
tions in a style we discuss in more detail shortly.
UNO’s abstract function graphs do not attempt to
compute possible return values. Since the tool is focused
on def-use analysis, only assignments are important, not
the values being assigned. The tool PREfix (Bush et al.
2000) attempts to capture more information by generating
functional models of each function, based on a restricted
symbolic execution of the function source. The user of
the tool can control the maximum number of execution