Settings

Hypothesis tries to have good defaults for its behaviour, but sometimes that’s not enough and you need to tweak it.

The mechanism for doing this is the settings object. You can apply it to a @given based test with the @settings decorator:

from hypothesis import given, settings
from hypothesis.strategies import integers


@given(integers())
@settings(max_examples=500)
def test_this_thoroughly(x):
    pass

This uses a settings object which causes the test to receive a much larger set of examples than normal.

The settings decorator may be applied either before or after @given, with identical results. The following is exactly equivalent:

from hypothesis import given, settings
from hypothesis.strategies import integers


@settings(max_examples=500)
@given(integers())
def test_this_thoroughly(x):
    pass

Available settings

class hypothesis.settings(
parent=None,
*,
max_examples=not_set,
derandomize=not_set,
database=not_set,
verbosity=not_set,
phases=not_set,
stateful_step_count=not_set,
report_multiple_bugs=not_set,
suppress_health_check=not_set,
deadline=not_set,
print_blob=not_set,
)

A settings object configures options including verbosity, runtime controls, persistence, determinism, and more.

Default values are taken from the settings.default object, and changes made there are reflected in newly created settings objects.

database

An instance of ExampleDatabase that will be used to save examples to and load previous examples from. May be None in which case no storage will be used.

See the example database documentation for a list of built-in example database implementations, and how to define custom implementations.

default value: (dynamically calculated)
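As a sketch of the database setting, the example below points a single test at a project-local DirectoryBasedExampleDatabase (a built-in implementation; the directory path here is purely illustrative):

```python
from hypothesis import given, settings
from hypothesis.database import DirectoryBasedExampleDatabase
from hypothesis.strategies import integers

# Store failing examples under a project-local directory so they are
# replayed on the next run.  The path is an illustrative choice.
local_db = DirectoryBasedExampleDatabase(".hypothesis/examples")


@settings(database=local_db)
@given(integers())
def test_with_local_db(x):
    assert isinstance(x, int)
```

Passing database=None instead would disable example storage for that test entirely.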

deadline

If set, a duration (as timedelta, or integer or float number of milliseconds) that each individual example (i.e. each time your test function is called, not the whole decorated test) within a test is not allowed to exceed. Tests which take longer than that may be converted into errors (but will not necessarily be if close to the deadline, to allow some variability in test run time).

Set this to None to disable this behaviour entirely.

default value: timedelta(milliseconds=200)
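For instance, a deadline can be raised for a test whose examples are legitimately slow, or disabled outright when run time is too variable to bound (a minimal sketch; the test bodies are placeholders):

```python
from datetime import timedelta

from hypothesis import given, settings
from hypothesis.strategies import integers

# Allow each example up to 500ms instead of the default 200ms.
@settings(deadline=timedelta(milliseconds=500))
@given(integers())
def test_with_relaxed_deadline(x):
    pass


# Disable the per-example deadline entirely.
@settings(deadline=None)
@given(integers())
def test_without_deadline(x):
    pass
```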

derandomize

If True, seed Hypothesis’ random number generator using a hash of the test function, so that every run will test the same set of examples until you update Hypothesis, Python, or the test function.

This allows you to check for regressions and look for bugs using separate settings profiles - for example running quick deterministic tests on every commit, and a longer non-deterministic nightly testing run.

default value: False
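A minimal sketch of a deterministic test, suitable for a quick per-commit run:

```python
from hypothesis import given, settings
from hypothesis.strategies import integers

# With derandomize=True, every run tests the same set of examples
# until Hypothesis, Python, or the test function itself changes.
@settings(derandomize=True)
@given(integers())
def test_deterministic(x):
    assert x == x
```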

max_examples

Once this many satisfying examples have been considered without finding any counter-example, Hypothesis will stop looking.

Note that we might call your test function fewer times if we find a bug early or can tell that we’ve exhausted the search space; or more if we discard some examples due to use of .filter(), assume(), or a few other things that can prevent the test case from completing successfully.

The default value is chosen to suit a workflow where the test will be part of a suite that is regularly executed locally or on a CI server, balancing total running time against the chance of missing a bug.

If you are writing one-off tests, running tens of thousands of examples is quite reasonable as Hypothesis may miss uncommon bugs with default settings. For very complex code, we have observed Hypothesis finding novel bugs after several million examples while testing SymPy. If you are running more than 100k examples for a test, consider using our integration for coverage-guided fuzzing - it really shines when given minutes or hours to run.

default value: 100

phases

Control which phases should be run. See the full documentation for more details.

default value: (Phase.explicit, Phase.reuse, Phase.generate, Phase.target, Phase.shrink, Phase.explain)

print_blob

If set to True, Hypothesis will print code for failing examples that can be used with @reproduce_failure to reproduce the failing example. The default is True if the CI or TF_BUILD env vars are set, False otherwise.

default value: (dynamically calculated)

report_multiple_bugs

Because Hypothesis runs the test many times, it can sometimes find multiple bugs in a single run. Reporting all of them at once is usually very useful, but replacing the exceptions can occasionally clash with debuggers. If disabled, only the exception with the smallest minimal example is raised.

default value: True

stateful_step_count

Number of steps to run a stateful program for before giving up on it breaking.

default value: 50
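As a sketch, the step limit can be lowered for a stateful test via the settings attribute of the generated TestCase (CounterMachine here is a trivial illustrative state machine):

```python
from hypothesis import settings
from hypothesis.stateful import RuleBasedStateMachine, rule


class CounterMachine(RuleBasedStateMachine):
    """A trivial state machine: a counter that only increments."""

    def __init__(self):
        super().__init__()
        self.count = 0

    @rule()
    def increment(self):
        self.count += 1
        assert self.count > 0


# Limit each run to 10 steps instead of the default 50.
TestCounter = CounterMachine.TestCase
TestCounter.settings = settings(stateful_step_count=10)
```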

suppress_health_check

A list of HealthCheck items to disable.

default value: ()

verbosity

Control the verbosity level of Hypothesis messages

default value: Verbosity.normal

Controlling what runs

Hypothesis divides tests into logically distinct phases:

  1. Running explicit examples provided with the @example decorator.

  2. Rerunning a selection of previously failing examples to reproduce a previously seen error.

  3. Generating new examples.

  4. Mutating examples for targeted property-based testing (requires generate phase).

  5. Attempting to shrink an example found in previous phases (other than phase 1 - explicit examples cannot be shrunk). This turns potentially large and complicated examples which may be hard to read into smaller and simpler ones.

  6. Attempting to explain why your test failed (requires shrink phase).

Note

The explain phase has two parts, each of which is best-effort - if Hypothesis can’t find a useful explanation, we’ll just print the minimal failing example.

Following the first failure, Hypothesis will (usually) track which lines of code are always run on failing but never on passing inputs. This relies on sys.settrace(), and is therefore automatically disabled on PyPy or if you are using coverage or a debugger. If there are no clearly suspicious lines of code, we refuse the temptation to guess.

After shrinking to a minimal failing example, Hypothesis will try to find parts of the example – e.g. separate args to @given() – which can vary freely without changing the result of that minimal failing example. If the automated experiments run without finding a passing variation, we leave a comment in the final report:

test_x_divided_by_y(
    x=0,  # or any other generated value
    y=0,
)

Just remember that the lack of an explanation sometimes just means that Hypothesis couldn’t efficiently find one, not that no explanation (or simpler failing example) exists.

The phases setting provides you with fine-grained control over which of these run, with each phase corresponding to a value on the Phase enum:

class hypothesis.Phase(value)[source]

An enumeration.

explicit = 0

controls whether explicit examples are run.

reuse = 1

controls whether previous examples will be reused.

generate = 2

controls whether new examples will be generated.

target = 3

controls whether examples will be mutated for targeting.

shrink = 4

controls whether examples will be shrunk.

explain = 5

controls whether Hypothesis attempts to explain test failures.

The phases argument accepts a collection with any subset of these. e.g. settings(phases=[Phase.generate, Phase.shrink]) will generate new examples and shrink them, but will not run explicit examples or reuse previous failures, while settings(phases=[Phase.explicit]) will only run the explicit examples.
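The first of those configurations looks like this in a test (a minimal sketch; the test body is a placeholder):

```python
from hypothesis import Phase, given, settings
from hypothesis.strategies import integers

# Generate fresh examples and shrink any failures, but skip explicit
# examples and the replay of previously failing ones.
@settings(phases=[Phase.generate, Phase.shrink])
@given(integers())
def test_generated_only(x):
    assert isinstance(x, int)
```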

Seeing intermediate results

To see what’s going on while Hypothesis runs your tests, you can turn up the verbosity setting.

>>> from hypothesis import given, settings, Verbosity
>>> from hypothesis.strategies import lists, integers
>>> @given(lists(integers()))
... @settings(verbosity=Verbosity.verbose)
... def f(x):
...     assert not any(x)
>>> f()
Trying example: []
Falsifying example: [-1198601713, -67, 116, -29578]
Shrunk example to [-1198601713]
Shrunk example to [-1198601600]
Shrunk example to [-1191228800]
Shrunk example to [-8421504]
Shrunk example to [-32896]
Shrunk example to [-128]
Shrunk example to [64]
Shrunk example to [32]
Shrunk example to [16]
Shrunk example to [8]
Shrunk example to [4]
Shrunk example to [3]
Shrunk example to [2]
Shrunk example to [1]
Falsifying example: f(x=[1])
Traceback (most recent call last):
    ...
AssertionError

The four levels are quiet, normal, verbose and debug. normal is the default, while in quiet mode Hypothesis will not print anything out, not even the final falsifying example. debug is basically verbose but a bit more so. You probably don’t want it.

If you are using pytest, you may also need to disable output capturing for passing tests.

Building settings objects

Settings can be created by calling settings with any of the available settings values. Any absent ones will be set to defaults:

>>> from hypothesis import settings
>>> settings().max_examples
100
>>> settings(max_examples=10).max_examples
10

You can also pass a ‘parent’ settings object as the first argument, and any settings you do not specify as keyword arguments will be copied from the parent settings:

>>> parent = settings(max_examples=10)
>>> child = settings(parent, deadline=None)
>>> parent.max_examples == child.max_examples == 10
True
>>> parent.deadline
timedelta(milliseconds=200)
>>> child.deadline is None
True

Default settings

At any given point in your program there is a current default settings object, available as settings.default. As well as being a settings object in its own right, all newly created settings objects which are not explicitly based on another settings object are based on the default, and so inherit any values that are not explicitly set.

You can change the defaults by using profiles.

Settings profiles

Depending on your environment you may want different default settings. For example: during development you may want to lower the number of examples to speed up the tests. However, in a CI environment you may want more examples so you are more likely to find bugs.

Hypothesis allows you to define different settings profiles. These profiles can be loaded at any time.

static settings.register_profile(name, parent=None, **kwargs)[source]

Registers a collection of values to be used as a settings profile.

Settings profiles can be loaded by name - for example, you might create a ‘fast’ profile which runs fewer examples, keep the ‘default’ profile, and create a ‘ci’ profile that increases the number of examples and uses a different database to store failures.

The arguments to this method are exactly as for settings: optional parent settings, and keyword arguments for each setting that will be set differently to parent (or settings.default, if parent is None).

static settings.get_profile(name)[source]

Return the profile with the given name.

static settings.load_profile(name)[source]

Loads in the settings defined in the profile provided.

If the profile does not exist, InvalidArgument will be raised. Any setting not defined in the profile will use the library-defined default for that setting.

Loading a profile changes the default settings but will not change the behaviour of tests that explicitly change the settings.

>>> from hypothesis import settings
>>> settings.register_profile("ci", max_examples=1000)
>>> settings().max_examples
100
>>> settings.load_profile("ci")
>>> settings().max_examples
1000

Instead of loading a profile and overriding the defaults, you can retrieve a profile with settings.get_profile() and apply it to specific tests.
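For example, a registered profile can be applied to a single test while leaving the global defaults untouched (the "ci" profile name here is illustrative):

```python
from hypothesis import given, settings
from hypothesis.strategies import integers

settings.register_profile("ci", max_examples=1000)


# Apply the registered profile to this one test only; settings.default
# is unchanged for every other test.
@settings(settings.get_profile("ci"))
@given(integers())
def test_thorough(x):
    pass
```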

Optionally, you can use an environment variable to select a profile, as HYPOTHESIS_PROFILE does in the example below. This is the suggested pattern for running your tests on CI. The code below should run in a conftest.py or any setup/initialization section of your test suite. If the variable is not defined, the Hypothesis-defined defaults will be loaded.

>>> import os
>>> from hypothesis import settings, Verbosity
>>> settings.register_profile("ci", max_examples=1000)
>>> settings.register_profile("dev", max_examples=10)
>>> settings.register_profile("debug", max_examples=10, verbosity=Verbosity.verbose)
>>> settings.load_profile(os.getenv("HYPOTHESIS_PROFILE", "default"))

If you are using the hypothesis pytest plugin and your profiles are registered by your conftest you can load one with the command line option --hypothesis-profile.

$ pytest tests --hypothesis-profile <profile-name>

Health checks

Hypothesis’ health checks are designed to detect and warn you about performance problems where your tests are slow, inefficient, or generating very large examples.

If this is expected, e.g. when generating large arrays or dataframes, you can selectively disable them with the suppress_health_check setting. The argument for this parameter is a list with elements drawn from any of the class-level attributes of the HealthCheck class. Using a value of list(HealthCheck) will disable all health checks.
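A minimal sketch of suppressing a single health check for one test (here too_slow, for a test that legitimately generates largish inputs):

```python
from hypothesis import HealthCheck, given, settings
from hypothesis.strategies import integers, lists

# Generating large lists can be slow enough to trip the too_slow
# health check; suppress it for this test only.
@settings(suppress_health_check=[HealthCheck.too_slow])
@given(lists(integers(), min_size=50))
def test_large_inputs(xs):
    assert len(xs) >= 50
```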

class hypothesis.HealthCheck(value)[source]

Arguments for suppress_health_check.

Each member of this enum is a type of health check to suppress.

data_too_large = 1

Checks if too many examples are aborted for being too large.

This is measured by the number of random choices that Hypothesis makes in order to generate something, not the size of the generated object. For example, choosing a 100MB object from a predefined list would take only a few bits, while generating 10KB of JSON from scratch might trigger this health check.

filter_too_much = 2

Check for when the test is filtering out too many examples, either through use of assume() or filter(), or occasionally for Hypothesis internal reasons.

too_slow = 3

Check for when your data generation is extremely slow and likely to hurt testing.

return_value = 5

Deprecated; we always error if a test returns a non-None value.

large_base_example = 7

Checks if the natural example to shrink towards is very large.

not_a_test_method = 8

Deprecated; we always error if @given is applied to a method defined by unittest.TestCase (i.e. not a test).

function_scoped_fixture = 9

Checks if @given has been applied to a test with a pytest function-scoped fixture. Function-scoped fixtures run once for the whole function, not once per example, and this is usually not what you want.

Because of this limitation, tests that need to set up or reset state for every example need to do so manually within the test itself, typically using an appropriate context manager.

Suppress this health check only in the rare case that you are using a function-scoped fixture that does not need to be reset between individual examples, but for some reason you cannot use a wider fixture scope (e.g. session scope, module scope, class scope).

This check requires the Hypothesis pytest plugin, which is enabled by default when running Hypothesis inside pytest.
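The manual per-example setup mentioned above can be sketched with a context manager inside the test itself (fresh_state is a hypothetical stand-in for whatever setup and teardown your test needs):

```python
from contextlib import contextmanager

from hypothesis import given
from hypothesis.strategies import integers


# Hypothetical helper: replace with your own per-example setup/teardown.
@contextmanager
def fresh_state():
    state = []          # set up fresh state
    try:
        yield state
    finally:
        state.clear()   # tear down after each example


@given(integers())
def test_with_per_example_state(x):
    # The context manager runs once per generated example, unlike a
    # function-scoped pytest fixture, which runs once per test function.
    with fresh_state() as state:
        state.append(x)
        assert len(state) == 1
```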

differing_executors = 10

Checks if @given has been applied to a test which is executed by different executors. If your test function is defined as a method on a class, that class will be your executor, and subclasses executing an inherited test is a common way for things to go wrong.

The correct fix is often to bring the executor instance under the control of hypothesis by explicit parametrization over, or sampling from, subclasses, or to refactor so that @given is specified on leaf subclasses.