
libfiu - Fault injection in userspace
=====================================

Introduction
------------

You, as a programmer, know many things can fail, and your software is often
expected to be able to handle those failures. But how do you test your failure
handling code, when it's not easy to make a failure appear in the first place?
One way to do it is to perform *fault injection*.

According to Wikipedia, "fault injection is a technique for improving the
coverage of a test by introducing faults in order to test code paths, in
particular error handling code paths, that might otherwise rarely be followed.
It is often used with stress testing and is widely considered to be an
important part of developing robust software".

libfiu is a library that you can use to add fault injection to your code. It
aims to be easy to use by means of a simple API, with minimal code impact and
little runtime overhead when enabled.

By minimal code impact I mean that the modifications you have to do to your
code (and build system) in order to support libfiu should be as little
intrusive as possible.


Code overview
-------------

Let's take a look to a small (fictitious) code sample to see what's the
general idea behind libfiu.

Let's say you have this code that check if there's enough free space to fit a
given file::

        size_t free_space() {
                [code to find out how much free space there is]
                return space;
        }

        bool file_fits(FILE *fd) {
                if (free_space() < file_size(fd)) {
                        return false;
                }
                return true;
        }

As you can see, with current disk sizes, it's very, very strange that a disk
runs out of free space. So the scenario where *free_space()* returns 0 is hard
to test. With libfiu, you can modify that code into::

        size_t free_space() {
                fiu_return_on("no_free_space", 0);

                [code to find out how much free space there is]
                return space;
        }

        bool file_fits(FILE *fd) {
                if (free_space() < file_size(fd)) {
                        return false;
                }
                return true;
        }

There, you've just created a *point of failure* identified by the name
"no_free_space", and said that when that point of failure is enabled, the
function will return 0.

In your testing code, you can now do this::

        fiu_init();
        fiu_enable("no_free_space", 1, NULL, 0);
        assert(file_fits("tmpfile") == false);

The first line initializes the library, and the second *enables* the point of
failure. As the point of failure is enabled, *free_space()* will return 0, so
you can test how your code behaves under that condition, which was otherwise
hard to trigger.

As you can see, libfiu's API has to "sides": a core API and a control API. The
core API is used inside the code wanting to perform fault injection on. The
control API is used inside the testing code, in order to control the injection
of failures.

In the example above, *fiu_return_on()* is a part of the core API, and
*fiu_enable()* is a part of the control API.


Using libfiu in your project
----------------------------

To use libfiu in your project, there are three things to consider: the build
system, the fault injection code, and the testing code.


The build system
~~~~~~~~~~~~~~~~

The first thing to do is to enable your build system to use libfiu. Usually,
you do not want to make libfiu a runtime or build-time dependency, because
it's a testing library.

To that end, you should copy *fiu-local.h* into your source tree, and then
create an option to do a *fault injection build* that #defines the constant
*FIU_ENABLE* (usually done by adding ``-DFIU_ENABLE=1`` to your compiler
flags) and links against libfiu (usually done by adding ``-lfiu`` to your
linker flags).

That way, normal builds will not have a single trace of fault injection code,
but it will be easy to create a binary that does, for testing purposes.


The fault injection code
~~~~~~~~~~~~~~~~~~~~~~~~

Adding fault injection to your code means inserting points of failure in it,
using the core API.

First, you should ``#include "fiu-local.h"`` in the files you want to add
points of failure to. That header allows you to avoid libfiu as a build-time
dependency, as mentioned in the last section.

Then, you to insert points of failure, you sprinkle your code with calls like
``fiu_return_on("name", -1)``, ``fiu_exit_on("name")``, or more complex code
using ``fiu_fail("name")``. Consult the libfiu's manpage for the details on
the API.

It is recommended that you use meaningful names for your points of failure, to
be able to easily identify their purpose. You can also name them
hierarchically (for example, using names like *"io/write"*, *"io/read"*, and
so on), to be able to enable entire groups of points of failure (like
*"io/\*"*,). To this end, any separator will do, the *'/'* is not special at
all.


The testing code
~~~~~~~~~~~~~~~~

Testing can be done in too many ways, so I won't get into specific details
here. As a general approach, usually the idea with fault injection is to write
tests similar in spirit to the one shown above: initialize the library, enable
one or more failures using the control API, and then check if the code behaves
as expected.

Initially, all points of failure are disabled, which means your code should run
as usual, with a very small performance impact.

The points of failure can be enabled using different strategies:

Unconditional (*fiu_enable()*)
  Enables the point of failure in an unconditional way, so it always fails.

Random (*fiu_enable_random()*)
  Enables the point of failure in a non-deterministic way, which will fail with
  the given probability.

External (*fiu_enable_external()*)
  Enables the point of failure using an external function, which will be called
  to determine whether the point of failure should fail or not.

You can also use an asterisk *at the end* of a name to enable all the points
of failure that begin with the given name (excluding the asterisk, of course).

Check libfiu's manpage for more details about the API.

Using the Python bindings, you can also write and/or control your tests using
Python.


