Unit tests #

Testing software #

Testing is an integral part of software development.

Several terminologies coexist to categorize software tests (some more precise than others, and with overlapping meanings). Common categories are:

functional vs non-functional test
unit vs integration test
security test
performance test
regression test
etc.

A test can help developers clarify what a program (resp. component, method) is expected to do. So in a sense, tests are part of the documentation or specification of a system, because they provide precise examples of the expected behavior of the program (resp. component, method).

This is why tests are often part of the development process itself. Notably, test-driven development consists of writing test cases before the software is fully developed.

Unit tests #

A unit test is usually understood as:

testing the behavior of a small piece of code (typically a method),

automated (typically integrated into the build process),

fast.

Besides, most unit tests are concerned with correctness (rather than performance for instance).

Warning. The tests used to evaluate assignments in this course are implemented via libraries for unit testing. However, many of them do not qualify as unit tests.

In its simplest form, a unit test can be viewed as a pair

$\qquad$ (input, expected output)

for a given computational problem.

Example. Consider the following problem:

Input: a (finite) array of non-negative integers, representing the successive values of a stock, one value per day.

Output: the maximal gain that can be made by buying on a certain day, and selling the same day or later on.

Possible unit tests for this problem are:

( [0, 3], 3 )

( [4, 3, 6, 8, 6], 5 )

( [2, 4, 9, 1, 3], 7 )

( [3, 2], 0 )

etc.
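The pairs above can be turned into executable checks. The sketch below is one possible way to do so in plain Java. The class name `StockGain`, the method name `maxGain` and its single-pass implementation are assumptions made for illustration: the problem statement does not prescribe any of them.

```java
class StockGain {

    // Hypothetical implementation under test: a single pass that remembers
    // the lowest value seen so far and the best gain achievable so far.
    static int maxGain(int[] prices) {
        int best = 0; // buying and selling on the same day yields 0
        int lowest = Integer.MAX_VALUE;
        for (int price : prices) {
            lowest = Math.min(lowest, price);
            best = Math.max(best, price - lowest);
        }
        return best;
    }

    public static void main(String[] args) {
        // Each input is paired with the expected output at the same index.
        int[][] inputs = { {0, 3}, {4, 3, 6, 8, 6}, {2, 4, 9, 1, 3}, {3, 2} };
        int[] expected = { 3, 5, 7, 0 };
        for (int i = 0; i < inputs.length; i++) {
            int actual = maxGain(inputs[i]);
            String verdict = (actual == expected[i]) ? "pass" : "FAIL";
            System.out.println(verdict + ": expected " + expected[i] + ", got " + actual);
        }
    }
}
```

In a real project, a unit testing library would normally replace the hand-written loop in `main`.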

Exercise

Consider the following problem:

Input: a (finite) array of characters, with possibly duplicated characters (and no restriction on the size of the array).

Output: the size of the longest (left-to-right) sequence in this array that respects alphabetical order.

For instance, for the input

$\qquad$ [m, q, b, e, e, z, m, e],

the expected output is 4.

Question. Does there exist a (finite) set of unit tests for this problem that guarantees that an implementation is correct?

Solution

No.

For any (finite) set of unit tests (for this problem), there exists an algorithm that passes the tests but is incorrect.

To demonstrate this, observe that a unit test for this problem is a pair $(a, i)$, where

$a$ is an array of characters, and
$i$ is a natural number.

Take any finite set

$\qquad (a_1, i_1), (a_2, i_2), \dots, (a_n, i_n)$

of such unit tests.

Because this set is finite, whereas there are infinitely many arrays of characters (their size is not restricted), there exists an array of characters $a_0$ that does not appear in this set of tests.

Let $i_0$ be the expected output for $a_0$, and let $j$ be any positive integer different from $i_0$.

Now consider the method (in pseudocode):

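int sizeOfLongestNonDecreasingSequence(char[] characters) {
    // here, == stands for element-wise equality of arrays
    if (characters == a_1) {
        return i_1;
    }
    if (characters == a_2) {
        return i_2;
    }
    ...

    if (characters == a_n) {
        return i_n;
    }
    // every other input gets the answer j, which is wrong for a_0
    return j;
}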

This algorithm will pass the unit tests, but is incorrect for the input $a_0$.
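As a concrete (and equally artificial) instance of this construction, the Java sketch below assumes a test suite that contains a single test, namely the example input given earlier with expected output 4. The class name `HardCodedAnswers`, the choice $j = 1$ and the choice $a_0 = $ [a, b] (whose expected output is 2) are assumptions made for illustration.

```java
import java.util.Arrays;

class HardCodedAnswers {

    // The only test of this hypothetical test suite: the pair (a_1, i_1).
    static final char[] A1 = {'m', 'q', 'b', 'e', 'e', 'z', 'm', 'e'};
    static final int I1 = 4;

    // Plays the role of j: any positive integer different from i_0 = 2.
    static final int J = 1;

    // "Cheating" implementation: it recognizes the tested input
    // and returns the hard-coded expected output, without computing anything.
    static int sizeOfLongestNonDecreasingSequence(char[] characters) {
        if (Arrays.equals(characters, A1)) {
            return I1;
        }
        return J;
    }

    public static void main(String[] args) {
        // The test (a_1, i_1) passes.
        System.out.println(sizeOfLongestNonDecreasingSequence(A1) == I1);

        // The input a_0 = [a, b] is not in the test suite: the expected
        // output is 2, but the method returns J.
        char[] a0 = {'a', 'b'};
        System.out.println(sizeOfLongestNonDecreasingSequence(a0));
    }
}
```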

This (artificial) exercise was only meant to illustrate the following:

Warning. In general, no (reasonably small) set of unit tests can ensure that a (non-trivial) method is correct.

Unit test design #

Here are a few simple rules of thumb to design unit tests.

The input for a unit test is usually small (this makes debugging easier when the test fails).
Two tests for the same method should illustrate different types of inputs/scenarios (writing similar tests is a waste of time).
Trivial methods do not need unit tests.
Priority is often given to so-called “happy path” tests (a.k.a. “normal” scenarios). These are representative of what the tested method is likely to receive as input.
In addition to “happy path” tests, one may implement tests that deal with corner cases (e.g. empty array, value 0, etc.). However, exhaustive coverage of corner cases is often unnecessary, for instance when the methods that call the tested method cannot produce such inputs.
A unit test should be reproducible. In particular, it should not depend on:
- (pseudo)-random values,
- external services (web API, etc.) whose behavior cannot be controlled.
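The simplest way to satisfy the reproducibility rule is to write test inputs as literal values. If generated data is nevertheless needed, one possible approach is to fix the seed of the generator, so that every execution sees the same values. The sketch below illustrates this with `java.util.Random`; the class `ReproducibleInput` and the helper `sample` are hypothetical names introduced for this example.

```java
import java.util.Arrays;
import java.util.Random;

class ReproducibleInput {

    // Hypothetical helper: builds an array of pseudo-random values
    // from an explicit seed, instead of an implicit (time-based) one.
    static int[] sample(long seed, int size) {
        Random random = new Random(seed);
        int[] values = new int[size];
        for (int i = 0; i < size; i++) {
            values[i] = random.nextInt(100); // value in [0, 100)
        }
        return values;
    }

    public static void main(String[] args) {
        // Two generators created with the same seed produce the same sequence,
        // so a test built on this input behaves identically at every run.
        int[] first = sample(42L, 5);
        int[] second = sample(42L, 5);
        System.out.println(Arrays.equals(first, second)); // prints "true"
    }
}
```

By contrast, a generator created with `new Random()` (no seed) may produce different inputs at each run, so a failure observed once may not be reproducible.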

Implementation #

Requirement. A unit test should itself be free of bugs.

For this reason, unit tests usually:

rely on widely used libraries for test execution,
mostly consist of simple, declarative code otherwise.

In particular, unit tests are a (rare) case where code factorization is not a priority (i.e. unit tests may contain redundant code).
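The sketch below illustrates this style: each test spells out its own input and expected value, even though the three tests have the same shape. The method under test (`isSorted`), the class name and the helper `expect` are hypothetical. Plain Java is used only to keep the example self-contained, whereas in a real project a library such as JUnit would provide the assertions and run the test methods.

```java
class IsSortedTests {

    // Hypothetical method under test: true if the values are in non-decreasing order.
    static boolean isSorted(int[] values) {
        for (int i = 1; i < values.length; i++) {
            if (values[i - 1] > values[i]) {
                return false;
            }
        }
        return true;
    }

    // Minimal stand-in for the assertion that a test library would provide.
    static void expect(boolean expected, boolean actual) {
        if (expected != actual) {
            throw new AssertionError("expected " + expected + " but got " + actual);
        }
    }

    // The tests below are deliberately repetitive: each one can be read
    // (and debugged) on its own, without following any shared logic.
    static void sortedArrayIsAccepted() {
        int[] input = {1, 2, 2, 5};
        boolean expected = true;
        expect(expected, isSorted(input));
    }

    static void unsortedArrayIsRejected() {
        int[] input = {3, 1, 2};
        boolean expected = false;
        expect(expected, isSorted(input));
    }

    static void emptyArrayIsAccepted() {
        int[] input = {};
        boolean expected = true;
        expect(expected, isSorted(input));
    }

    public static void main(String[] args) {
        sortedArrayIsAccepted();
        unsortedArrayIsRejected();
        emptyArrayIsAccepted();
        System.out.println("all tests passed");
    }
}
```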

Unit tests and build automation #

Unit tests are usually integrated into the standard build automation process of a project.

For instance, the test-compile and test phases of Maven’s default lifecycle are in charge of compiling and executing unit tests respectively. By default, a failing test fails the build, which prevents later phases (e.g. package) from being executed.

Regression tests #

A regression test is meant to verify that modifications made to the code base (e.g. a new feature, code optimization, reorganization, etc.) do not compromise the correctness of functionalities that were already implemented.

Some unit tests may act in practice as regression tests.

In particular, before sharing code with co-developers (e.g. via the main branch of a git repository), it is good practice to verify that all unit tests are successful.

Exercise (reminder)

Consider a basic collaboration scheme via git, where:

Alice, Bob and Carol collaborate on the same project,
they share code via the main branch,
each of them has a personal branch (named alice, bob and carol respectively) where they write code that is not yet ready to be shared with the others.

Alice just finished implementing a method, on the branch alice. Her code compiles and passes all unit tests. Now she wants to share her code with the other two.

Which sequence of git commands should Alice execute?

Solution

Commit her changes (locally, to the branch alice):

git add .
git commit -m "commit message"

Update the remote copy of the branch alice (for backup only):

git push

Synchronize the local copy of the branch main (because Bob and Carol may have added content to it):

git checkout main
git pull

Merge (locally) the content of main into alice (and fix the merge conflicts, if any):

git checkout alice
git merge main

Merge (locally) the content of alice into main (there should be no conflict left). After this step, the local copies of alice and main will be identical.

git checkout main
git merge alice

Upload the changes:

git push

Go back to work:

git checkout alice