Hypothesis internals

Warning

This page documents internal Hypothesis interfaces. Some are fairly stable, while others are still experimental. In either case, they are not subject to our standard deprecation policy, and we might make breaking changes in minor or patch releases.

This page is intended for people building tools, libraries, or research on top of Hypothesis. If that includes you, please get in touch! We’d love to hear what you’re doing, or explore more stable ways to support your use-case.

Alternative backends

See also

See also the user-facing Alternative backends for Hypothesis documentation.

class hypothesis.internal.conjecture.providers.PrimitiveProvider(conjecturedata, /)[source]

PrimitiveProvider is the implementation interface of a Hypothesis backend.

A PrimitiveProvider is required to implement the following five draw_* methods: draw_boolean(), draw_integer(), draw_float(), draw_string(), and draw_bytes().

Each strategy in Hypothesis generates values by drawing a series of choices from these five methods. By overriding them, a PrimitiveProvider can control the distribution of inputs generated by Hypothesis.

For example, hypothesis-crosshair implements a PrimitiveProvider which uses an SMT solver to generate inputs that uncover new branches.

Once you implement a PrimitiveProvider, you can make it available for use through AVAILABLE_PROVIDERS.

lifetime = 'test_function'

The lifetime of a PrimitiveProvider instance. Either test_function or test_case.

If test_function (the default), a single provider instance will be instantiated and used for the entirety of each test function (i.e., roughly one provider per @given annotation). This can be useful for tracking state over the entirety of a test function.

If test_case, a new provider instance will be instantiated and used for each input Hypothesis generates.

The conjecturedata argument to PrimitiveProvider.__init__ will be None for a lifetime of test_function, and an instance of ConjectureData for a lifetime of test_case.

Third-party providers likely want to set a lifetime of test_function.

avoid_realization = False

Solver-based backends such as hypothesis-crosshair use symbolic values, which record the operations performed on them in order to discover new paths. If avoid_realization is set to True, Hypothesis will avoid interacting with symbolic choices returned by the provider in any way that would force the solver to narrow the range of possible values for that symbolic.

Setting this to True disables some hypothesis features and optimizations. Only set this to True if it is necessary for your backend.

add_observability_callback = False

If True, on_observation() will be added as a callback to TESTCASE_CALLBACKS, enabling observability during the lifetime of this provider. If False, on_observation() will never be called by Hypothesis.

Observability is opt-in because enabling it might increase runtime or memory usage.

abstract draw_boolean(p=0.5)[source]

Draw a boolean choice.

Parameters:

p (float) –

The probability of returning True. Between 0 and 1 inclusive.

Except for 0 and 1, the value of p is a hint provided by Hypothesis, and may be ignored by the backend.

If 0, the provider must return False. If 1, the provider must return True.
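
To make the contract concrete, here is a minimal sketch of a draw_boolean implementation. It is a hypothetical, random-based stand-in, not the real HypothesisProvider implementation; a real backend would define this as a method on a PrimitiveProvider subclass.

```python
import random


def draw_boolean(p: float = 0.5) -> bool:
    """Hypothetical sketch of a draw_boolean backend implementation.

    p == 0 and p == 1 are hard requirements; any other value of p is
    only a distribution hint, which a backend may ignore.
    """
    if p <= 0:
        return False
    if p >= 1:
        return True
    # Honor the hint here, though a backend could return either value.
    return random.random() < p
```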

abstract draw_integer(
min_value=None,
max_value=None,
*,
weights=None,
shrink_towards=0,
)[source]

Draw an integer choice.

Parameters:
  • min_value (int | None) – (Inclusive) lower bound on the integer value. If None, there is no lower bound.

  • max_value (int | None) – (Inclusive) upper bound on the integer value. If None, there is no upper bound.

  • weights (dict[int, float] | None) – Maps keys in the range [min_value, max_value] to the probability of returning that key.

  • shrink_towards (int) – The integer to shrink towards. This is not used during generation and can be ignored by backends.
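
The weights parameter can be handled as follows; this is a hypothetical random-based sketch (not Hypothesis's actual implementation) showing one plausible treatment of weighted keys, bounds, and the ignorable shrink_towards parameter.

```python
import random


def draw_integer(min_value=None, max_value=None, *, weights=None, shrink_towards=0):
    """Hypothetical sketch of a draw_integer backend implementation.

    shrink_towards is ignored, as the docs permit: it matters only for
    shrinking, not for generation.
    """
    if weights is not None:
        # Return a weighted key with probability sum(weights.values()),
        # otherwise fall through to a uniform draw over the full range.
        r = random.random()
        acc = 0.0
        for key, prob in weights.items():
            acc += prob
            if r < acc:
                return key
    # Uniform fallback; pick arbitrary wide bounds where none are given.
    lo = min_value if min_value is not None else -(2**64)
    hi = max_value if max_value is not None else 2**64
    return random.randint(lo, hi)
```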

abstract draw_float(
*,
min_value=-inf,
max_value=inf,
allow_nan=True,
smallest_nonzero_magnitude,
)[source]

Draw a float choice.

Parameters:
  • min_value (float) – (Inclusive) lower bound on the float value.

  • max_value (float) – (Inclusive) upper bound on the float value.

  • allow_nan (bool) – If False, it is invalid to return math.nan.

  • smallest_nonzero_magnitude (float) – The smallest allowed nonzero magnitude. draw_float should not return a float f if abs(f) < smallest_nonzero_magnitude.
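
A simple way to satisfy these constraints is rejection sampling. The sketch below is hypothetical (a real backend may generate valid floats directly rather than retrying), but it illustrates each constraint: bounds, allow_nan, and smallest_nonzero_magnitude.

```python
import math
import random


def draw_float(*, min_value=-math.inf, max_value=math.inf,
               allow_nan=True, smallest_nonzero_magnitude):
    """Hypothetical rejection-sampling sketch of draw_float."""
    while True:
        if allow_nan and random.random() < 0.01:
            # NaN is permitted only when allow_nan is True.
            return math.nan
        # Sample from a modest finite range, for illustration only.
        f = random.uniform(max(min_value, -1e6), min(max_value, 1e6))
        if f != 0 and abs(f) < smallest_nonzero_magnitude:
            continue  # invalid: nonzero but too small in magnitude
        if min_value <= f <= max_value:
            return f
```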

abstract draw_string(
intervals,
*,
min_size=0,
max_size=10000000000,
)[source]

Draw a string choice.

Parameters:
  • intervals (IntervalSet) – The set of codepoints to sample from.

  • min_size (int) – (Inclusive) lower bound on the string length.

  • max_size (int) – (Inclusive) upper bound on the string length.
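
A draw_string implementation samples a length within the size bounds and then picks codepoints from the allowed set. In this hypothetical sketch, a plain sequence of integer codepoints stands in for the IntervalSet argument.

```python
import random


def draw_string(codepoints, *, min_size=0, max_size=10_000_000_000):
    """Hypothetical sketch of a draw_string backend implementation.

    `codepoints` stands in for the IntervalSet argument: any ordered
    collection of integer codepoints supporting len() and indexing.
    """
    # Cap the sampled length so unbounded max_size stays practical.
    size = random.randint(min_size, min(max_size, min_size + 20))
    return "".join(chr(codepoints[random.randrange(len(codepoints))])
                   for _ in range(size))
```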

abstract draw_bytes(
min_size=0,
max_size=10000000000,
)[source]

Draw a bytes choice.

Parameters:
  • min_size (int) – (Inclusive) lower bound on the bytes length.

  • max_size (int) – (Inclusive) upper bound on the bytes length.

per_test_case_context_manager()[source]

Returns a context manager which will be entered each time Hypothesis starts generating and executing one test case, and exited when that test case finishes generating and executing, including if any exception is thrown.

In the lifecycle of a Hypothesis test, this is called before generating strategy values for each test case. This is just before any custom executor is called.

Even if not returning a custom context manager, PrimitiveProvider subclasses are welcome to override this method to know when Hypothesis starts and ends the execution of a single test case.
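
A common way to implement this override is with contextlib.contextmanager. The sketch below uses a hypothetical standalone class (a real backend would subclass PrimitiveProvider) to record when each test case starts and ends, including on exceptions.

```python
from contextlib import contextmanager


class RecordingProvider:
    """Hypothetical sketch of overriding per_test_case_context_manager
    to observe test case boundaries."""

    def __init__(self):
        self.events = []

    @contextmanager
    def per_test_case_context_manager(self):
        self.events.append("start")
        try:
            yield
        finally:
            # Runs even if the test case raises.
            self.events.append("end")
```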

realize(value, *, for_failure=False)[source]

Called whenever Hypothesis requires a concrete (non-symbolic) value from a potentially symbolic value. Hypothesis will not check that value is symbolic before calling realize, so you should handle the case where value is non-symbolic.

The returned value should be non-symbolic. If you cannot provide a value, raise BackendCannotProceed with a value of "discard_test_case".

If for_failure is True, the value is associated with a failing example. In this case, the backend should spend substantially more effort when attempting to realize the value, since it is important to avoid discarding failing examples. Backends may still raise BackendCannotProceed when for_failure is True, if realization is truly impossible or if realization takes significantly longer than expected (say, 5 minutes).
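
The pass-through requirement for non-symbolic values can be sketched as follows. Both the Symbolic class and the DiscardTestCase exception here are hypothetical stand-ins (the real exception is BackendCannotProceed with scope "discard_test_case").

```python
class Symbolic:
    """Hypothetical stand-in for a backend's symbolic value type."""
    def __init__(self, concrete):
        self._concrete = concrete


class DiscardTestCase(Exception):
    """Stand-in for BackendCannotProceed("discard_test_case")."""


def realize(value, *, for_failure=False):
    """Hypothetical sketch of a realize implementation.

    Non-symbolic values pass through untouched, since Hypothesis may call
    realize with values that were never symbolic in the first place.
    """
    if not isinstance(value, Symbolic):
        return value
    if value._concrete is None:
        # A real solver would try to find a model here, spending more
        # effort when for_failure is True, before giving up.
        raise DiscardTestCase()
    return value._concrete
```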

replay_choices(choices)[source]

Called when Hypothesis has discovered a choice sequence which the provider may wish to enqueue to replay under its own instrumentation when we next ask to generate a test case, rather than generating one from scratch.

This is used to e.g. warm-start hypothesis-crosshair with a corpus of high-code-coverage inputs discovered by HypoFuzz.

observe_test_case()[source]

Called at the end of the test case when observability is enabled.

The return value should be a non-symbolic, JSON-encodable dictionary, and will be included in observations as observation["metadata"]["backend"].

observe_information_messages(*, lifetime)[source]

Called at the end of each test case and again at end of the test function.

Return an iterable of {type: info/alert/error, title: str, content: str | dict} dictionaries to be delivered as individual information messages. Hypothesis adds the run_start timestamp and property name for you.

on_observation(observation)[source]

Called at the end of each test case which uses this provider, with the same observation["type"] == "test_case" observation that is passed to other callbacks in TESTCASE_CALLBACKS. This method is not called for observations with observation["type"] in {"info", "alert", "error"}.

Important

For on_observation() to be called by Hypothesis, add_observability_callback must be set to True.

on_observation() is explicitly opt-in, as enabling observability might increase runtime or memory usage.

Calls to this method are guaranteed to alternate with calls to per_test_case_context_manager(). For example:

# test function starts
per_test_case_context_manager()
on_observation()
per_test_case_context_manager()
on_observation()
...
# test function ends

Note that on_observation() will not be called for test cases which did not use this provider during generation, for example during Phase.reuse or Phase.shrink, or because Hypothesis switched to the standard Hypothesis backend after this backend raised too many BackendCannotProceed exceptions.

span_start(label, /)[source]

Marks the beginning of a semantically meaningful span of choices.

Spans are a depth-first tree structure. A span is opened by a call to span_start(), and a call to span_end() closes the most recently opened span. So the following sequence of calls:

span_start(label=1)
n1 = draw_integer()
span_start(label=2)
b1 = draw_boolean()
n2 = draw_integer()
span_end()
f1 = draw_float()
span_end()

produces the following two spans of choices:

1: [n1, b1, n2, f1]
2: [b1, n2]

Hypothesis uses spans to denote “semantically meaningful” sequences of choices. For instance, Hypothesis opens a span for the sequence of choices made while drawing from each strategy. Not every span corresponds to a strategy; the generation of e.g. each element in lists() is also marked with a span, among others.

label is an opaque integer, which has no defined semantics. The only guarantee made by Hypothesis is that all spans with the same “meaning” will share the same label. So all spans from the same strategy will share the same label, as will e.g. the spans for lists() elements.

Providers can track calls to span_start() and span_end() to learn something about the semantics of the test’s choice sequence. For instance, a provider could track the depth of the span tree, or the number of unique labels, which says something about the complexity of the choices being generated. Or a provider could track the span tree across test cases in order to determine what strategies are being used in what contexts.

It is possible for Hypothesis to start and immediately stop a span, without calling a draw_* method in between. These spans contain zero choices.

Hypothesis will always balance the number of calls to span_start() and span_end(). A call to span_start() will always be followed by a call to span_end() before the end of the test case.

span_start() is called from ConjectureData.start_span() internally.
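
Since calls are always balanced, a provider can track the span tree with a simple counter. This hypothetical sketch records the maximum depth and the set of labels seen, two of the signals mentioned above.

```python
class SpanTracker:
    """Hypothetical sketch of tracking span_start/span_end calls to
    measure span-tree depth and label diversity."""

    def __init__(self):
        self.depth = 0
        self.max_depth = 0
        self.labels = set()

    def span_start(self, label, /):
        self.labels.add(label)
        self.depth += 1
        self.max_depth = max(self.max_depth, self.depth)

    def span_end(self, discard, /):
        # Hypothesis guarantees balanced calls, so depth never goes negative.
        assert self.depth > 0, "span_end without matching span_start"
        self.depth -= 1
```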

span_end(discard, /)[source]

Marks the end of a semantically meaningful span of choices.

discard is True when the draw was filtered out or otherwise marked as unlikely to contribute to the input data as seen by the user’s test. Note however that side effects can make this determination unsound.

span_end() is called from ConjectureData.stop_span() internally.

hypothesis.internal.conjecture.providers.AVAILABLE_PROVIDERS

Registered Hypothesis backends. This is a dictionary whose keys are the names usable in settings.backend, and whose values are the absolute importable path (as a string) to a subclass of PrimitiveProvider, which Hypothesis will instantiate when your backend is requested by a test's settings.backend value.

For example, the default Hypothesis backend is registered as:

from hypothesis.internal.conjecture.providers import AVAILABLE_PROVIDERS

AVAILABLE_PROVIDERS["hypothesis"] = "hypothesis.internal.conjecture.providers.HypothesisProvider"

And can be used with:

from hypothesis import given, settings, strategies as st

@given(st.integers())
@settings(backend="hypothesis")
def f(n):
    pass

Though, as backend="hypothesis" is the default setting, the above would typically not have any effect.

The purpose of mapping to an absolute importable path, rather than to the PrimitiveProvider class itself, is to avoid slowing down Hypothesis startup: alternative backends are imported only when required.

hypothesis.internal.conjecture.provider_conformance.run_conformance_test(
Provider,
*,
context_manager_exceptions=(),
settings=None,
_realize_objects=st.from_type(object) | st.from_type(type).flatmap(st.from_type),
)[source]

Test that the given Provider class conforms to the PrimitiveProvider interface.

For instance, this tests that Provider does not return out of bounds choices from any of the draw_* methods, or violate other invariants depended on by Hypothesis.

This function is intended to be called at test-time, not at runtime. It is provided by Hypothesis to make it easy for third-party backend authors to test their provider. Backend authors wishing to test their provider should include a test similar to the following in their test suite:

from hypothesis.internal.conjecture.provider_conformance import run_conformance_test

def test_conformance():
    run_conformance_test(MyProvider)

If your provider can raise control flow exceptions inside one of the five draw_* methods that are handled by your provider’s per_test_case_context_manager, pass a list of these exception types to context_manager_exceptions. Otherwise, run_conformance_test will treat those exceptions as fatal errors.

class hypothesis.errors.BackendCannotProceed(scope='other', /)[source]

Raised by alternative backends when a PrimitiveProvider cannot proceed. This is expected to occur inside one of the .draw_*() methods, or for symbolic execution perhaps in realize().

The optional scope argument can enable smarter integration:

verified:

Do not request further test cases from this backend. We may generate more test cases with other backends; if one fails then Hypothesis will report unsound verification in the backend too.

exhausted:

Do not request further test cases from this backend; finish testing with test cases generated with the default backend. Common if e.g. native code blocks symbolic reasoning very early.

discard_test_case:

This particular test case could not be converted to concrete values; skip any further processing and continue with another test case from this backend.

final class hypothesis.internal.intervalsets.IntervalSet(intervals=())[source]

A compact and efficient representation of a set of (a, b) intervals. Can be treated like a set of integers, in that n in intervals will return True if n is contained in any of the (a, b) intervals, and False otherwise.
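
The membership semantics can be illustrated with a stdlib-only sketch; this is a hypothetical reimplementation of the containment check, not IntervalSet's actual code.

```python
from bisect import bisect_right


def interval_contains(intervals, n):
    """Hypothetical sketch of the membership check an IntervalSet provides.

    `intervals` is a sorted, non-overlapping sequence of inclusive (a, b)
    pairs; returns True if n falls inside any of them.
    """
    starts = [a for a, _ in intervals]
    i = bisect_right(starts, n) - 1  # last interval starting at or before n
    return i >= 0 and intervals[i][0] <= n <= intervals[i][1]
```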

Observability

hypothesis.internal.observability.TESTCASE_CALLBACKS = []

A list of callback functions for observability. Whenever a new observation is created, each function in this list will be called with a single value, which is a dictionary representing that observation.

You can append a function to this list to receive observability reports, and remove that function from the list to stop receiving observability reports. Observability is considered enabled if this list is nonempty.

hypothesis.internal.observability.OBSERVABILITY_COLLECT_COVERAGE = True

If False, do not collect coverage information when observability is enabled.

This is exposed both for performance (coverage collection can be slow on Python 3.11 and earlier) and for size (if you do not use coverage information, you may not want to store it in memory).

hypothesis.internal.observability.OBSERVABILITY_CHOICES = False

If True, include the metadata.choice_nodes and metadata.spans keys in test case observations.

False by default. metadata.choice_nodes and metadata.spans can be a substantial amount of data, and so must be opted-in to, even when observability is enabled.

Warning

EXPERIMENTAL AND UNSTABLE. We are actively working towards a better interface for this as of June 2025, and this attribute may disappear or be renamed without notice.

Engine constants

We pick reasonable values for these constants, but if you must, you can monkeypatch them. (Hypothesis is not responsible for any performance degradation that may result).

hypothesis.internal.conjecture.engine.MAX_SHRINKS = 500

The maximum number of times the shrinker will reduce the complexity of a failing input before giving up. This avoids falling into a trap of exponential (or worse) complexity, where the shrinker appears to be making progress but would take an extremely long time to finish.

hypothesis.internal.conjecture.engine.MAX_SHRINKING_SECONDS = 300

The maximum total time in seconds that the shrinker will try to shrink a failure for before giving up. This is across all shrinks for the same failure, so even if the shrinker successfully reduces the complexity of a single failure several times, it will stop when it hits MAX_SHRINKING_SECONDS of total time taken.

hypothesis.internal.conjecture.engine.BUFFER_SIZE = 8192

The maximum amount of entropy a single test case can spend making random choices during input generation before Hypothesis gives up on it.

The “unit” of one BUFFER_SIZE does not have any defined semantics, and you should not rely on it, except that a linear increase in BUFFER_SIZE will linearly increase the amount of entropy a test case can use during generation.