Testing and Fault Localization Part 2

The Underlying Problem of Fault Localization

The underlying problem for fault localization is a phenomenon called confounding (also called confounding bias). Therefore, we must first understand what confounding bias means.

In simplest terms, confounding is

an unknown causal quantity that is not directly measurable from observed data[1]

In other words, an effect is confounded when its cause (either actual or contributory) is not clear from measurements, experiments and data gathered.

The difficulties that naive fault localization methods encounter in the face of confounding bias can be explained using Pearl’s notation for confounding and probabilistic association, as follows.


Associational No-Confounding

Let X and Y be variables of interest that we suspect have some causal relationship (particularly that X might cause Y). Further, let T be the set of variables that are unaffected by X.

We say that X and Y are not confounded in the presence of T if each member Z ∈ T satisfies at least one of the following conditions:

  1. P(X|Z) = P(X)
  2. P(Y|Z, X) = P(Y|X)

In words, condition 1 states that when Z ∈ T is not associated with X, then the probability of X given that we observe Z is simply the probability of observing X alone (i.e. X and Z are independent because observing Z does not affect the occurrence of X, and vice versa).

Further, condition 2 states that when we observe both X and Z, and Z is not associated with Y (given X), then the probability of observing Y equals the probability of Y given that we observe only X.

If each variable in T satisfies at least one of the conditions above, then X and Y are not confounded, and the association we observe between them can be attributed to the effect of X on Y. Otherwise, the causal relationship between X and Y is unclear.
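To make the two conditions concrete, here is a minimal sketch that checks them for a single candidate variable Z, assuming X, Y and Z are binary and that we somehow know their full joint distribution. The helper names (`build_joint`, `satisfies_condition_1`, `satisfies_condition_2`, `not_confounded_by`) and all probabilities are hypothetical and chosen only for illustration.

```python
from itertools import product


def bern(p, value):
    """Probability that a binary variable with P(1) = p takes `value`."""
    return p if value == 1 else 1.0 - p


def build_joint(p_z, p_x_given_z, p_y_given_xz):
    """Joint distribution {(x, y, z): probability} from P(Z), P(X|Z), P(Y|X,Z)."""
    return {
        (x, y, z): bern(p_z, z) * bern(p_x_given_z[z], x) * bern(p_y_given_xz[(x, z)], y)
        for x, y, z in product((0, 1), repeat=3)
    }


def marginal(joint, keep):
    """Sum the joint distribution down to the positions listed in `keep`."""
    out = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out


def satisfies_condition_1(joint, tol=1e-9):
    """Condition 1: P(X|Z) = P(X) for every x and every z with P(z) > 0."""
    p_x, p_z, p_xz = marginal(joint, (0,)), marginal(joint, (2,)), marginal(joint, (0, 2))
    return all(
        abs(p_xz[(x, z)] / p_z[(z,)] - p_x[(x,)]) <= tol
        for (x, z) in p_xz
        if p_z[(z,)] > 0
    )


def satisfies_condition_2(joint, tol=1e-9):
    """Condition 2: P(Y|Z, X) = P(Y|X) wherever P(x, z) > 0."""
    p_x, p_xy, p_xz = marginal(joint, (0,)), marginal(joint, (0, 1)), marginal(joint, (0, 2))
    return all(
        abs(p / p_xz[(x, z)] - p_xy[(x, y)] / p_x[(x,)]) <= tol
        for (x, y, z), p in joint.items()
        if p_xz[(x, z)] > 0
    )


def not_confounded_by(joint):
    """Z passes the associational test if it meets at least one condition."""
    return satisfies_condition_1(joint) or satisfies_condition_2(joint)


# Z is unrelated to both X and Y: both conditions hold.
unrelated = build_joint(
    p_z=0.3,
    p_x_given_z={0: 0.4, 1: 0.4},
    p_y_given_xz={(0, 0): 0.2, (0, 1): 0.2, (1, 0): 0.9, (1, 1): 0.9},
)

# Z influences both X and Y: neither condition holds.
confounded = build_joint(
    p_z=0.3,
    p_x_given_z={0: 0.2, 1: 0.8},
    p_y_given_xz={(0, 0): 0.1, (0, 1): 0.6, (1, 0): 0.5, (1, 1): 0.9},
)

print(not_confounded_by(unrelated), not_confounded_by(confounded))
```

Note that the check needs the joint distribution of X, Y and one specific Z. In practice we would have to repeat it for every member of T, which presupposes that we can enumerate T in the first place.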

Even though the associational definition above sets forth clear conditions for when two variables of interest are not confounded, there is a particular implied assumption that presents difficulty for fault localization for software in general and software testing in particular.

Note that T is not simply any set of variables that are unaffected by X. Rather, it is the set of all such variables – and therein lies the difficulty with naive fault localization methods.

In other words, unless we can guarantee that each and every variable (say, a component, module, class, function, or any other logical piece of a particular software system in question) that is not affected by X satisfies condition 1 or condition 2, we cannot be completely sure that X indeed has a causal relationship with Y.

Moreover, even the relationships between X and every possible Z might be unclear to us, therefore limiting the accuracy of our model for the SUT and, hence, our testing.


An Example of the Inadequacy of the Associational Non-Confounding Criterion

Consider an example to illustrate the limitation of utilizing naive fault localization methods in the face of confounding.

Assume that Y represents “Result of authenticating to an application’s GUI” and X stands for “Account status”.

If we get an authentication error (Y = ȳ) when entering our credentials in the application’s GUI, we may suspect that our password is wrong or possibly expired (X = x̄).

Yet there might be a variable Z that we do not know about, affecting X and Y, thus confounding the causal relationship. For instance, Z might be a newly introduced authentication module that returns incorrect results due to a bug even though the given password might be valid.

A different scenario might be that we are unable to authenticate due to another variable Z’ (representing “System hacked status”), whereby the logic of the application has been compromised by a malicious hacker through code injection techniques.

Yet another scenario contains both Z and Z’, whereby the software system has been hacked and the authentication module is also faulty.
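The scenarios above can be given rough numbers. The sketch below is a deliberate simplification: it treats Z and Z’ as hidden variables that act only on Y (it does not model their effect on X), assumes authentication fails whenever any one of the three problems is present, and uses made-up probabilities. It computes how likely a bad password really is once we have seen a failure, first under the tester's closed-world model and then with the hidden variables admitted.

```python
from itertools import product

# Hypothetical probabilities, chosen only for illustration.
P_BAD_PASSWORD = 0.05  # X = x̄: password wrong or expired
P_BUGGY_MODULE = 0.20  # Z: new authentication module returns incorrect results
P_HACKED = 0.01        # Z': application logic compromised


def auth_fails(bad_password, buggy_module, hacked):
    """Assumed failure model: any single problem causes an authentication error."""
    return bad_password or buggy_module or hacked


def posterior_bad_password(p_buggy, p_hacked):
    """P(X = x̄ | Y = ȳ), computed by enumerating every combination of states."""
    p_fail = 0.0
    p_fail_and_bad = 0.0
    for bad, buggy, hacked in product((False, True), repeat=3):
        p = (
            (P_BAD_PASSWORD if bad else 1.0 - P_BAD_PASSWORD)
            * (p_buggy if buggy else 1.0 - p_buggy)
            * (p_hacked if hacked else 1.0 - p_hacked)
        )
        if auth_fails(bad, buggy, hacked):
            p_fail += p
            if bad:
                p_fail_and_bad += p
    return p_fail_and_bad / p_fail


# The tester's model: Z and Z' do not exist, so only X can explain a failure.
closed_world = posterior_bad_password(p_buggy=0.0, p_hacked=0.0)

# The system as it actually behaves under the assumptions above.
actual = posterior_bad_password(p_buggy=P_BUGGY_MODULE, p_hacked=P_HACKED)

print(f"closed-world model: {closed_world:.2f}")
print(f"with Z and Z':      {actual:.2f}")
```

Under the closed-world model a failure implicates the password with certainty. With the hidden variables included, the same observation points to the password only about one time in five, so a method that blames X on the strength of the observed association would usually be wrong.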


So what does this mean?

As these examples illustrate, the effectiveness of naive fault localization methods in the presence of confounding bias is directly related to how we define our models for a SUT.

However, the way we do so is inherently limited by Closed-World assumptions, marginality (association between X and multiple variables in T), barren proxies (irrelevant variables included in the model because they are associated with relevant ones), and incidental confoundedness, all of which lead to false negatives as well as false positives.

Our inability to know for certain all the states and conditions under which a software system executes restricts our knowledge of the precise causal relationships and, hence, of the root cause of a given behavior of a SUT. For instance, even though our initial suspicion might have been that X causes Y, our not knowing about variables Z and Z’ confounds the suspected causal relationship and prevents us from determining the root cause.

Given this disappointing realization that we cannot fully comprehend the complete mechanisms by which a given SUT works, and the apparently severe limitations of testing implied by the difficulties of fault localization, why is software testing still practiced around the world as a major and highly accurate way to ascertain the quality of software?

To answer this question, we must delve deeper into how effective fault localization works, but you will have to stay tuned until the next post.



[1] Judea Pearl. 2013. Causality: Models, Reasoning, and Inference. Cambridge University Press, New York, NY, USA.