What you can generate and how

Most things should be easy to generate and everything should be possible.

To support this principle Hypothesis provides strategies for most built-in types with arguments to constrain or adjust the output, as well as higher-order strategies that can be composed to generate more complex types.

This document is a guide to what strategies are available for generating data and how to build them. Strategies have a variety of other important internal features, such as how they simplify, but the data they can generate is the only public part of their API.

Core strategies

Functions for building strategies are all available in the hypothesis.strategies module. The salient functions from it are as follows:

hypothesis.strategies.binary(*, min_size=0, max_size=None)

Generates bytes.

The generated bytes will have a length of at least min_size and at most max_size. If max_size is None there is no upper limit.

Examples from this strategy shrink towards smaller strings and lower byte values.
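As a minimal sketch of the size bounds described above (the test name and the chosen bounds are illustrative, not part of Hypothesis):

```python
from hypothesis import given, settings, strategies as st


@settings(max_examples=50, deadline=None)
@given(st.binary(min_size=2, max_size=8))
def test_binary_respects_size_bounds(payload):
    # Every generated value is a bytes object within the requested bounds.
    assert isinstance(payload, bytes)
    assert 2 <= len(payload) <= 8
```

Calling test_binary_respects_size_bounds() runs the property against many generated values.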


hypothesis.strategies.booleans()

Returns a strategy which generates instances of bool.

Examples from this strategy will shrink towards False (i.e. shrinking will replace True with False where possible).

hypothesis.strategies.builds(target, /, *args, **kwargs)

Generates values by drawing from args and kwargs and passing them to the callable (provided as the first positional argument) in the appropriate argument position.

e.g. builds(target, integers(), flag=booleans()) would draw an integer i and a boolean b and call target(i, flag=b).

If the callable has type annotations, they will be used to infer a strategy for required arguments that were not passed to builds. You can also tell builds to infer a strategy for an optional argument by passing ... (Ellipsis) as a keyword argument to builds, instead of a strategy for that argument to the callable.

If the callable is a class defined with attrs, missing required arguments will be inferred from the attribute on a best-effort basis, e.g. by checking attrs standard validators. Dataclasses are handled natively by the inference from type hints.

Examples from this strategy shrink by shrinking the argument values to the callable.
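The following sketch shows explicit, inferred, and Ellipsis-inferred arguments side by side. Point is a hypothetical dataclass invented for this illustration; only builds(), integers() and the ... convention come from the text above.

```python
from dataclasses import dataclass

from hypothesis import given, settings, strategies as st


@dataclass
class Point:  # hypothetical target class, used only for illustration
    x: int
    y: int
    label: str = "origin"


# x is passed explicitly; y is required and missing, so it is inferred from
# its annotation; label is optional and is inferred only because we pass ...
points = st.builds(Point, x=st.integers(min_value=0), label=...)


@settings(max_examples=50, deadline=None)
@given(points)
def test_builds_fills_in_arguments(point):
    assert point.x >= 0
    assert isinstance(point.y, int)
    assert isinstance(point.label, str)
```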


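hypothesis.strategies.characters(*, codec=None, min_codepoint=None, max_codepoint=None, categories=None, exclude_categories=None, exclude_characters=None, include_characters=None)

Generates characters, length-one strings, following specified filtering rules.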

  • When no filtering rules are specified, any character can be produced.

  • If min_codepoint or max_codepoint is specified, then only characters having a codepoint in that range will be produced.

  • If categories is specified, then only characters from those Unicode categories will be produced. This is a further restriction: characters must also satisfy min_codepoint and max_codepoint.

  • If exclude_categories is specified, then any character from those categories will not be produced. You must not pass both categories and exclude_categories; these arguments are alternative ways to specify exactly the same thing.

  • If include_characters is specified, then any additional characters in that list will also be produced.

  • If exclude_characters is specified, then any characters in that list will not be produced. Any overlap between include_characters and exclude_characters will raise an exception.

  • If codec is specified, only characters in the specified codec encodings will be produced.

The _codepoint arguments must be integers between zero and sys.maxunicode. The _characters arguments must be collections of length-one unicode strings, such as a unicode string.

The _categories arguments must be used to specify either the one-letter Unicode major category or the two-letter Unicode general category. For example, ('Nd', 'Lu') signifies “Number, decimal digit” and “Letter, uppercase”. A single letter (‘major category’) can be given to match all corresponding categories, for example 'P' for characters in any punctuation category.

We allow codecs from the codecs module and their aliases, platform specific and user-registered codecs if they are available, and python-specific text encodings (but not text or binary transforms). include_characters which cannot be encoded using this codec will raise an exception. If non-encodable codepoints or categories are explicitly allowed, the codec argument will exclude them without raising an exception.

Examples from this strategy shrink towards the codepoint for '0', or the first allowable codepoint after it if '0' is excluded.


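hypothesis.strategies.complex_numbers(*, min_magnitude=0, max_magnitude=None, allow_infinity=None, allow_nan=None, allow_subnormal=True, width=128)

Returns a strategy that generates complex numbers.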

This strategy draws complex numbers with constrained magnitudes. The min_magnitude and max_magnitude parameters should be non-negative Real numbers; a value of None corresponds to an infinite upper bound.

If min_magnitude is nonzero or max_magnitude is finite, it is an error to enable allow_nan. If max_magnitude is finite, it is an error to enable allow_infinity.

allow_infinity, allow_nan, and allow_subnormal are applied to each part of the complex number separately, as for floats().

The magnitude constraints are respected up to a relative error of (around) floating-point epsilon, due to implementation via the system sqrt function.

The width argument specifies the maximum number of bits of precision required to represent the entire generated complex number. Valid values are 32, 64 or 128, which correspond to the real and imaginary components each having width 16, 32 or 64, respectively. Passing width=64 will still use the builtin 128-bit complex class, but always for values which can be exactly represented as two 32-bit floats.

Examples from this strategy shrink by shrinking their real and imaginary parts, as floats().

If you need to generate complex numbers with particular real and imaginary parts or relationships between parts, consider using builds(complex, ...) or @composite respectively.


hypothesis.strategies.composite(f)

Defines a strategy that is built out of potentially arbitrarily many other strategies.

This is intended to be used as a decorator. See the full documentation for more details about how to use this function.

Examples from this strategy shrink by shrinking the output of each draw call.
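As a sketch of the decorator in use, the strategy below draws a list and then an index that depends on it. The names list_and_index and test_index_is_always_valid are chosen for illustration.

```python
from hypothesis import given, settings, strategies as st


@st.composite
def list_and_index(draw, elements=st.integers()):
    # Draw a non-empty list first, then an index that is valid for it.
    xs = draw(st.lists(elements, min_size=1))
    i = draw(st.integers(min_value=0, max_value=len(xs) - 1))
    return xs, i


@settings(max_examples=50, deadline=None)
@given(list_and_index())
def test_index_is_always_valid(pair):
    xs, i = pair
    assert 0 <= i < len(xs)
```

The decorated function is called without the draw argument; Hypothesis supplies it when the strategy is used.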


hypothesis.strategies.data()

This isn’t really a normal strategy, but instead gives you an object which can be used to draw data interactively from other strategies.

See the rest of the documentation for more complete information.

Examples from this strategy do not shrink (because there is only one), but the result of each data.draw() call shrinks as it normally would.
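A minimal sketch of interactive drawing, where the second draw depends on the first. The test name and bounds are illustrative.

```python
from hypothesis import given, settings, strategies as st


@settings(max_examples=50, deadline=None)
@given(st.data())
def test_second_draw_depends_on_first(data):
    # The length is drawn inside the test body, then used to build
    # the strategy for the next draw.
    n = data.draw(st.integers(min_value=1, max_value=10), label="n")
    xs = data.draw(st.lists(st.booleans(), min_size=n, max_size=n))
    assert len(xs) == n
```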

class hypothesis.strategies.DataObject

This type only exists so that you can write type hints for tests using the data() strategy. Do not use it directly!


hypothesis.strategies.dates(min_value=datetime.date.min, max_value=datetime.date.max)

A strategy for dates between min_value and max_value.

Examples from this strategy shrink towards January 1st 2000.


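hypothesis.strategies.datetimes(min_value=datetime.datetime.min, max_value=datetime.datetime.max, *, timezones=none(), allow_imaginary=True)

A strategy for generating datetimes, which may be timezone-aware.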

This strategy works by drawing a naive datetime between min_value and max_value, which must both be naive (have no timezone).

timezones must be a strategy that generates either None, for naive datetimes, or tzinfo objects for ‘aware’ datetimes. You can construct your own, though we recommend using a built-in strategy such as timezones().

You may pass allow_imaginary=False to filter out “imaginary” datetimes which did not (or will not) occur due to daylight savings, leap seconds, timezone and calendar adjustments, etc. Imaginary datetimes are allowed by default, because malformed timestamps are a common source of bugs.

Examples from this strategy shrink towards midnight on January 1st 2000, local time.


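hypothesis.strategies.decimals(min_value=None, max_value=None, *, allow_nan=None, allow_infinity=None, places=None)

Generates instances of decimal.Decimal, which may be: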

  • A finite rational number, between min_value and max_value.

  • Not a Number, if allow_nan is True. None means “allow NaN, unless min_value and max_value are not None”.

  • Positive or negative infinity, if max_value and min_value respectively are None, and allow_infinity is not False. None means “allow infinity, unless excluded by the min and max values”.

Note that where floats have one NaN value, Decimals have four: signed, and either quiet or signalling. See the decimal module docs for more information on special values.

If places is not None, all finite values drawn from the strategy will have that number of digits after the decimal place.

Examples from this strategy do not have a well defined shrink order but try to maximize human readability when shrinking.


hypothesis.strategies.deferred(definition)

A deferred strategy allows you to write a strategy that references other strategies that have not yet been defined. This allows for the easy definition of recursive and mutually recursive strategies.

The definition argument should be a zero-argument function that returns a strategy. It will be evaluated the first time the strategy is used to produce an example.

Example usage:

>>> import hypothesis.strategies as st
>>> x = st.deferred(lambda: st.booleans() | st.tuples(x, x))
>>> x.example()
(((False, (True, True)), (False, True)), (True, True))
>>> x.example()

Mutual recursion also works fine:

>>> a = st.deferred(lambda: st.booleans() | b)
>>> b = st.deferred(lambda: st.tuples(a, a))
>>> a.example()
>>> b.example()
(False, (False, ((False, True), False)))

Examples from this strategy shrink as they normally would from the strategy returned by the definition.

hypothesis.strategies.dictionaries(keys, values, *, dict_class=<class 'dict'>, min_size=0, max_size=None)

Generates dictionaries of type dict_class with keys drawn from the keys argument and values drawn from the values argument.

The size parameters have the same interpretation as for lists().

Examples from this strategy shrink by trying to remove keys from the generated dictionary, and by shrinking each generated key and value.
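A small sketch of the keys, values and size arguments working together. The particular key and value strategies are arbitrary choices for illustration.

```python
from hypothesis import given, settings, strategies as st

small_maps = st.dictionaries(
    keys=st.text(min_size=1, max_size=5),
    values=st.integers(),
    min_size=1,
    max_size=3,
)


@settings(max_examples=50, deadline=None)
@given(small_maps)
def test_dictionaries_respect_bounds(d):
    assert isinstance(d, dict)
    assert 1 <= len(d) <= 3
    assert all(isinstance(k, str) and isinstance(v, int) for k, v in d.items())
```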

class hypothesis.strategies.DrawFn

This type only exists so that you can write type hints for functions decorated with @composite.

from typing import Tuple
from hypothesis.strategies import DrawFn, composite, integers, text

@composite
def list_and_index(draw: DrawFn) -> Tuple[int, str]:
    i = draw(integers())  # type inferred as 'int'
    s = draw(text())  # type inferred as 'str'
    return i, s

hypothesis.strategies.emails(*, domains=domains())

A strategy for generating email addresses as unicode strings. The address format is specified in RFC 5322#section-3.4.1. Values shrink towards shorter local-parts and host domains.

If domains is given then it must be a strategy that generates domain names for the emails, defaulting to domains().

This strategy is useful for generating “user data” for tests, as mishandling of email addresses is a common source of bugs.

hypothesis.strategies.fixed_dictionaries(mapping, *, optional=None)

Generates a dictionary of the same type as mapping with a fixed set of keys mapping to strategies. mapping must be a dict subclass.

Generated values have all keys present in mapping, in iteration order, with the corresponding values drawn from mapping[key].

If optional is passed, the generated value may or may not contain each key from optional and a value drawn from the corresponding strategy. Generated values may contain optional keys in an arbitrary order.

Examples from this strategy shrink by shrinking each individual value in the generated dictionary, and omitting optional key-value pairs.
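To illustrate required versus optional keys, here is a sketch using a hypothetical record shape (the key names id, name and nickname are invented for the example):

```python
from hypothesis import given, settings, strategies as st

records = st.fixed_dictionaries(
    {"id": st.integers(min_value=1), "name": st.text()},
    optional={"nickname": st.text()},
)


@settings(max_examples=50, deadline=None)
@given(records)
def test_required_keys_always_present(record):
    # Required keys are always there; the optional key may or may not be.
    assert {"id", "name"} <= set(record)
    assert set(record) <= {"id", "name", "nickname"}
    assert record["id"] >= 1
```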


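hypothesis.strategies.floats(min_value=None, max_value=None, *, allow_nan=None, allow_infinity=None, allow_subnormal=None, width=64, exclude_min=False, exclude_max=False)

Returns a strategy which generates floats.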

  • If min_value is not None, all values will be >= min_value (or > min_value if exclude_min).

  • If max_value is not None, all values will be <= max_value (or < max_value if exclude_max).

  • If min_value or max_value is not None, it is an error to enable allow_nan.

  • If both min_value and max_value are not None, it is an error to enable allow_infinity.

  • If the inferred range of values does not include subnormal values, it is an error to enable allow_subnormal.

Where not explicitly ruled out by the bounds, subnormals, infinities, and NaNs are possible values generated by this strategy.

The width argument specifies the maximum number of bits of precision required to represent the generated float. Valid values are 16, 32, or 64. Passing width=32 will still use the builtin 64-bit float class, but always for values which can be exactly represented as a 32-bit float.

The exclude_min and exclude_max arguments can be used to generate numbers from open or half-open intervals, by excluding the respective endpoints. Excluding either signed zero will also exclude the other. Attempting to exclude an endpoint which is None will raise an error; use allow_infinity=False to generate finite floats. You can however use e.g. min_value=-math.inf, exclude_min=True to exclude only one infinite endpoint.

Examples from this strategy have a complicated and hard to explain shrinking behaviour, but it tries to improve “human readability”. Finite numbers will be preferred to infinity and infinity will be preferred to NaN.
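The sketch below exercises a half-open interval and the width argument. The test names are illustrative, and the struct round-trip is just one way to check that a value fits in 32 bits.

```python
import math
import struct

from hypothesis import given, settings, strategies as st


@settings(max_examples=50, deadline=None)
@given(st.floats(min_value=0.0, max_value=1.0, exclude_min=True))
def test_half_open_interval(x):
    # The interval is (0, 1]: zero is excluded, one is allowed.
    assert 0.0 < x <= 1.0


@settings(max_examples=50, deadline=None)
@given(st.floats(allow_nan=False, allow_infinity=False, width=32))
def test_width_32_round_trips(x):
    assert math.isfinite(x)
    # Packing as a 32-bit float and unpacking again loses no precision.
    assert struct.unpack("f", struct.pack("f", x))[0] == x
```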


hypothesis.strategies.fractions(min_value=None, max_value=None, *, max_denominator=None)

Returns a strategy which generates Fractions.

If min_value is not None then all generated values are no less than min_value. If max_value is not None then all generated values are no greater than max_value. min_value and max_value may be anything accepted by the Fraction constructor.

If max_denominator is not None then the denominator of any generated values is no greater than max_denominator. Note that max_denominator must be None or a positive integer.

Examples from this strategy shrink towards smaller denominators, then closer to zero.
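A brief sketch of the three constraints together, with arbitrary bounds chosen for illustration:

```python
from fractions import Fraction

from hypothesis import given, settings, strategies as st


@settings(max_examples=50, deadline=None)
@given(st.fractions(min_value=0, max_value=1, max_denominator=10))
def test_fractions_respect_constraints(q):
    assert isinstance(q, Fraction)
    assert 0 <= q <= 1
    assert q.denominator <= 10
```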

hypothesis.strategies.from_regex(regex, *, fullmatch=False, alphabet=None)

Generates strings that contain a match for the given regex (i.e. ones for which re.search() will return a non-None result).

regex may be a pattern or compiled regex. Both byte-strings and unicode strings are supported, and will generate examples of the same type.

You can use regex flags such as re.IGNORECASE or re.DOTALL to control generation. Flags can be passed either in compiled regex or inside the pattern with a (?iLmsux) group.

Some regular expressions are only partly supported - the underlying strategy checks local matching and relies on filtering to resolve context-dependent expressions. Using too many of these constructs may cause health-check errors as too many examples are filtered out. This mainly includes (positive or negative) lookahead and lookbehind groups.

If you want the generated string to match the whole regex you should use boundary markers. So e.g. r"\A.\Z" will return a single character string, while "." will return any string, and r"\A.$" will return a single character optionally followed by a "\n". Alternatively, passing fullmatch=True will ensure that the whole string is a match, as if you had used the \A and \Z markers.

The alphabet= argument constrains the characters in the generated string, as for text(), and is only supported for unicode strings.

Examples from this strategy shrink towards shorter strings and lower character values, with exact behaviour that may depend on the pattern.
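To make the difference between the default search semantics and fullmatch=True concrete, here is a sketch with an arbitrary pattern picked for the example:

```python
import re

from hypothesis import given, settings, strategies as st

PATTERN = re.compile(r"[a-z]{3}-[0-9]{2}")


@settings(max_examples=50, deadline=None)
@given(st.from_regex(PATTERN))
def test_default_contains_a_match(s):
    # Without fullmatch, the string only has to contain a match somewhere.
    assert PATTERN.search(s) is not None


@settings(max_examples=50, deadline=None)
@given(st.from_regex(PATTERN, fullmatch=True))
def test_fullmatch_matches_whole_string(s):
    assert PATTERN.fullmatch(s) is not None
```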


hypothesis.strategies.from_type(thing)

Looks up the appropriate search strategy for the given type.

from_type is used internally to fill in missing arguments to builds() and can be used interactively to explore what strategies are available or to debug type resolution.

You can use register_type_strategy() to handle your custom types, or to globally redefine certain strategies - for example excluding NaN from floats, or use timezone-aware instead of naive time and datetime strategies.

The resolution logic may be changed in a future version, but currently tries these five options:

  1. If thing is in the default lookup mapping or user-registered lookup, return the corresponding strategy. The default lookup covers all types with Hypothesis strategies, including extras where possible.

  2. If thing is from the typing module, return the corresponding strategy (special logic).

  3. If thing has one or more subtypes in the merged lookup, return the union of the strategies for those types that are not subtypes of other elements in the lookup.

  4. Finally, if thing has type annotations for all required arguments, and is not an abstract class, it is resolved via builds().

  5. Because abstract types cannot be instantiated, we treat abstract types as the union of their concrete subclasses. Note that this lookup works via inheritance but not via register, so you may still need to use register_type_strategy().

There is a valuable recipe for leveraging from_type() to generate “everything except” values from a specified type. I.e.

from hypothesis.strategies import from_type

def everything_except(excluded_types):
    return from_type(type).flatmap(from_type).filter(
        lambda x: not isinstance(x, excluded_types)
    )

For example, everything_except(int) returns a strategy that can generate anything that from_type() can ever generate, except for instances of int, and excluding instances of types added via register_type_strategy().

This is useful when writing tests which check that invalid input is rejected in a certain way.

hypothesis.strategies.frozensets(elements, *, min_size=0, max_size=None)

This is identical to the sets function but instead returns frozensets.

hypothesis.strategies.functions(*, like=lambda : ..., returns=..., pure=False)

A strategy for functions, which can be used in callbacks.

The generated functions will mimic the interface of like, which must be a callable (including a class, method, or function). The return value for the function is drawn from the returns argument, which must be a strategy. If returns is not passed, we attempt to infer a strategy from the return-type annotation if present, falling back to none().

If pure=True, all arguments passed to the generated function must be hashable, and if passed identical arguments the original return value will be returned again - not regenerated, so beware mutable values.

If pure=False, generated functions do not validate their arguments, and may return a different value if called again with the same arguments.

Generated functions can only be called within the scope of the @given which created them. This strategy does not support .example().

hypothesis.strategies.integers(min_value=None, max_value=None)

Returns a strategy which generates integers.

If min_value is not None then all values will be >= min_value. If max_value is not None then all values will be <= max_value.

Examples from this strategy will shrink towards zero, and negative values will also shrink towards positive (i.e. -n may be replaced by +n).

hypothesis.strategies.ip_addresses(*, v=None, network=None)

Generate IP addresses - v=4 for IPv4Addresses, v=6 for IPv6Addresses, or leave unspecified to allow both versions.

network may be an IPv4Network or IPv6Network, or a string representing a network such as "127.0.0.0/8" or "2001:db8::/32". As well as generating addresses within a particular routable network, this can be used to generate addresses from a reserved range listed in the IANA registries.

If you pass both v and network, they must be for the same version.
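A short sketch of both arguments, using the documentation prefix 2001:db8::/32 mentioned above. The constant and test names are illustrative.

```python
import ipaddress

from hypothesis import given, settings, strategies as st

DOC_NETWORK = ipaddress.ip_network("2001:db8::/32")


@settings(max_examples=50, deadline=None)
@given(st.ip_addresses(v=4), st.ip_addresses(network="2001:db8::/32"))
def test_ip_addresses(v4_address, doc_address):
    # v=4 restricts the version; network restricts the address range.
    assert isinstance(v4_address, ipaddress.IPv4Address)
    assert doc_address in DOC_NETWORK
```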


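hypothesis.strategies.iterables(elements, *, min_size=0, max_size=None, unique_by=None, unique=False)

This has the same behaviour as lists, but returns iterables instead.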

Some iterables cannot be indexed (e.g. sets) and some do not have a fixed length (e.g. generators). This strategy produces iterators, which cannot be indexed and do not have a fixed length. This ensures that you do not accidentally depend on sequence behaviour.


hypothesis.strategies.just(value)

Return a strategy which only generates value.

Note: value is not copied. Be wary of using mutable values.

If value is the result of a callable, you can use builds(callable) instead of just(callable()) to get a fresh value each time.

Examples from this strategy do not shrink (because there is only one).


hypothesis.strategies.lists(elements, *, min_size=0, max_size=None, unique_by=None, unique=False)

Returns a list containing values drawn from elements with length in the interval [min_size, max_size] (no bounds in that direction if these are None). If max_size is 0, only the empty list will be drawn.

If unique is True (or something that evaluates to True), we compare direct object equality, as if unique_by was lambda x: x. This comparison only works for hashable types.

If unique_by is not None it must be a callable or tuple of callables returning a hashable type when given a value drawn from elements. The resulting list will satisfy the condition that for i != j, unique_by(result[i]) != unique_by(result[j]).

If unique_by is a tuple of callables the uniqueness will be respective to each callable.

For example, the following will produce two columns of integers with both columns being unique respectively.

>>> twoints = st.tuples(st.integers(), st.integers())
>>> st.lists(twoints, unique_by=(lambda x: x[0], lambda x: x[1]))

Examples from this strategy shrink by trying to remove elements from the list, and by shrinking each individual element of the list.
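As a check on what per-callable uniqueness means, this sketch turns the two-column example above into a property. The test name is illustrative.

```python
from hypothesis import given, settings, strategies as st

twoints = st.tuples(st.integers(), st.integers())
rows = st.lists(twoints, unique_by=(lambda x: x[0], lambda x: x[1]))


@settings(max_examples=50, deadline=None)
@given(rows)
def test_each_column_is_unique(table):
    firsts = [row[0] for row in table]
    seconds = [row[1] for row in table]
    # Uniqueness holds for each column separately, not just for whole rows.
    assert len(set(firsts)) == len(firsts)
    assert len(set(seconds)) == len(seconds)
```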


hypothesis.strategies.none()

Return a strategy which only generates None.

Examples from this strategy do not shrink (because there is only one).


hypothesis.strategies.nothing()

This strategy never successfully draws a value and will always reject on an attempt to draw.

Examples from this strategy do not shrink (because there are none).


hypothesis.strategies.one_of(*args)

Return a strategy which generates values from any of the argument strategies.

This may be called with one iterable argument instead of multiple strategy arguments, in which case one_of(x) and one_of(*x) are equivalent.

Examples from this strategy will generally shrink to ones that come from strategies earlier in the list, then shrink according to behaviour of the strategy that produced them. In order to get good shrinking behaviour, try to put simpler strategies first. e.g. one_of(none(), text()) is better than one_of(text(), none()).

This is especially important when using recursive strategies. e.g. x = st.deferred(lambda: st.none() | st.tuples(x, x)) will shrink well, but x = st.deferred(lambda: st.tuples(x, x) | st.none()) will shrink very badly indeed.


hypothesis.strategies.permutations(values)

Return a strategy which returns permutations of the ordered collection values.

Examples from this strategy shrink by trying to become closer to the original order of values.


hypothesis.strategies.random_module()

Hypothesis always seeds global PRNGs before running a test, and restores the previous state afterwards.

If having a fixed seed would unacceptably weaken your tests, and you cannot use a random.Random instance provided by randoms(), this strategy calls random.seed() with an arbitrary integer and passes you an opaque object whose repr displays the seed value for debugging. If numpy.random is available, that state is also managed, as is anything managed by hypothesis.register_random().

Examples from this strategy shrink to seeds closer to zero.

hypothesis.strategies.randoms(*, note_method_calls=False, use_true_random=False)

Generates instances of random.Random. The generated Random instances are of a special HypothesisRandom subclass.

  • If note_method_calls is set to True, Hypothesis will print the randomly drawn values in any falsifying test case. This can be helpful for debugging the behaviour of randomized algorithms.

  • If use_true_random is set to True then values will be drawn from their usual distribution, otherwise they will actually be Hypothesis generated values (and will be shrunk accordingly for any failing test case). Setting use_true_random=False will tend to expose bugs that would occur with very low probability when it is set to True, and this flag should only be set to True when your code relies on the distribution of values for correctness.

For managing global state, see the random_module() strategy and register_random() function.

hypothesis.strategies.recursive(base, extend, *, max_leaves=100)

base: A strategy to start from.

extend: A function which takes a strategy and returns a new strategy.

max_leaves: The maximum number of elements to be drawn from base on a given run.

This returns a strategy S such that S = extend(base | S). That is, values may be drawn from base, or from any strategy reachable by mixing applications of | and extend.

An example may clarify: recursive(booleans(), lists) would return a strategy that may return arbitrarily nested and mixed lists of booleans. So e.g. False, [True], [False, []], and [[[[True]]]] are all valid values to be drawn from that strategy.

Examples from this strategy shrink by trying to reduce the amount of recursion and by shrinking according to the shrinking behaviour of base and the result of extend.
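The recursive(booleans(), lists) example can be sketched as a property that checks the shape of every generated value. The helper is_nested_bools and the max_leaves value are assumptions made for this illustration.

```python
from hypothesis import given, settings, strategies as st

nested_bools = st.recursive(st.booleans(), st.lists, max_leaves=10)


def is_nested_bools(value):
    # A valid value is either a bool or a list of valid values.
    if isinstance(value, bool):
        return True
    return isinstance(value, list) and all(is_nested_bools(v) for v in value)


@settings(max_examples=50, deadline=None)
@given(nested_bools)
def test_recursive_shape(value):
    assert is_nested_bools(value)
```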

hypothesis.strategies.register_type_strategy(custom_type, strategy)

Add an entry to the global type-to-strategy lookup.

This lookup is used in builds() and @given.

builds() will be used automatically for classes with type annotations on __init__ , so you only need to register a strategy if one or more arguments need to be more tightly defined than their type-based default, or if you want to supply a strategy for an argument with a default value.

strategy may be a search strategy, or a function that takes a type and returns a strategy (useful for generic types). The function may return NotImplemented to conditionally not provide a strategy for the type (the type will still be resolved by other methods, if possible, as if the function was not registered).

Note that you may not register a parametrised generic type (such as MyCollection[int]) directly, because the resolution logic does not handle this case correctly. Instead, you may register a function for MyCollection and inspect the type parameters within that function.
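A minimal sketch of registering a tighter strategy than the type-based default. UserId is a hypothetical class invented for the example; note that registration changes the global lookup for the rest of the process.

```python
from hypothesis import given, settings, strategies as st


class UserId:  # hypothetical custom type, used only for illustration
    def __init__(self, value: int):
        self.value = value


# The annotation alone would allow any integer; we register a strategy
# that only produces positive ids.
st.register_type_strategy(UserId, st.builds(UserId, st.integers(min_value=1)))


@settings(max_examples=50, deadline=None)
@given(st.from_type(UserId))
def test_registered_strategy_is_used(user_id):
    assert isinstance(user_id, UserId)
    assert user_id.value >= 1
```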

hypothesis.strategies.runner(*, default=not_set)

A strategy for getting “the current test runner”, whatever that may be. The exact meaning depends on the entry point, but it will usually be the associated ‘self’ value for it.

If you are using this in a rule for stateful testing, this strategy will return the instance of the RuleBasedStateMachine that the rule is running for.

If there is no current test runner and a default is provided, return that default. If no default is provided, raises InvalidArgument.

Examples from this strategy do not shrink (because there is only one).


hypothesis.strategies.sampled_from(elements)

Returns a strategy which generates any value present in elements.

Note that as with just(), values will not be copied and thus you should be careful of using mutable data.

sampled_from supports ordered collections, as well as Enum objects. Flag objects may also generate any combination of their members.

Examples from this strategy shrink by replacing them with values earlier in the list. So e.g. sampled_from([10, 1]) will shrink by trying to replace 1 values with 10, and sampled_from([1, 10]) will shrink by trying to replace 10 values with 1.

It is an error to sample from an empty sequence, because returning nothing() makes it too easy to silently drop parts of compound strategies. If you need that behaviour, use sampled_from(seq) if seq else nothing().
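A small sketch of sampling from an Enum and from an ordered collection. Suit and the size labels are made up for the example.

```python
import enum

from hypothesis import given, settings, strategies as st


class Suit(enum.Enum):  # hypothetical enum, used only for illustration
    HEARTS = 1
    SPADES = 2


SIZES = ["small", "medium", "large"]


@settings(max_examples=50, deadline=None)
@given(st.sampled_from(Suit), st.sampled_from(SIZES))
def test_sampled_values_come_from_source(suit, size):
    assert isinstance(suit, Suit)
    assert size in SIZES
```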

hypothesis.strategies.sets(elements, *, min_size=0, max_size=None)

This has the same behaviour as lists, but returns sets instead.

Note that Hypothesis cannot tell whether values drawn from elements are hashable until running the test, so you can define a strategy for sets of an unhashable type, but it will fail at test time.

Examples from this strategy shrink by trying to remove elements from the set, and by shrinking each individual element of the set.

hypothesis.strategies.shared(base, *, key=None)

Returns a strategy that draws a single shared value per run, drawn from base. Any two shared instances with the same key will share the same value, otherwise the identity of this strategy will be used. That is:

>>> s = integers()  # or any other strategy
>>> x = shared(s)
>>> y = shared(s)

In the above x and y may draw different (or potentially the same) values. In the following they will always draw the same:

>>> x = shared(s, key="hi")
>>> y = shared(s, key="hi")

Examples from this strategy shrink as per their base strategy.
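The keyed case can be sketched as a property: two shared strategies with the same key always agree within a single test case. The test name is illustrative.

```python
from hypothesis import given, settings, strategies as st

base = st.integers()


@settings(max_examples=50, deadline=None)
@given(st.shared(base, key="hi"), st.shared(base, key="hi"))
def test_same_key_shares_one_value(x, y):
    # Both arguments see the single value drawn for the key "hi".
    assert x == y
```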


hypothesis.strategies.slices(size)

Generates slices that will select indices up to the supplied size.

Generated slices will have start and stop indices that range from -size to size - 1 and will step in the appropriate direction. Slices should only produce an empty selection if the start and end are the same.

Examples from this strategy shrink toward 0 and smaller values.


hypothesis.strategies.text(alphabet=characters(codec='utf-8'), *, min_size=0, max_size=None)

Generates strings with characters drawn from alphabet, which should be a collection of length one strings or a strategy generating such strings.

The default alphabet strategy can generate the full unicode range but excludes surrogate characters because they are invalid in the UTF-8 encoding. You can use characters() without arguments to find surrogate-related bugs such as bpo-34454.

min_size and max_size have the usual interpretations. Note that Python measures string length by counting codepoints: U+00C5 Å is a single character, while U+0041 U+030A is two - the A, and a combining ring above.

Examples from this strategy shrink towards shorter strings, and with the characters in the text shrinking as per the alphabet strategy. This strategy does not normalize() examples, so generated strings may be in any or none of the ‘normal forms’.


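hypothesis.strategies.timedeltas(min_value=datetime.timedelta.min, max_value=datetime.timedelta.max)

A strategy for timedeltas between min_value and max_value.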

Examples from this strategy shrink towards zero.


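hypothesis.strategies.times(min_value=datetime.time.min, max_value=datetime.time.max, *, timezones=none())

A strategy for times between min_value and max_value.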

The timezones argument is handled as for datetimes().

Examples from this strategy shrink towards midnight, with the timezone component shrinking as for the strategy that provided it.

hypothesis.strategies.timezone_keys(*, allow_prefix=True)

A strategy for IANA timezone names.

As well as timezone names like "UTC", "Australia/Sydney", or "America/New_York", this strategy can generate:

  • Aliases such as "Antarctica/McMurdo", which links to "Pacific/Auckland".

  • Deprecated names such as "Antarctica/South_Pole", which also links to "Pacific/Auckland". Note that most but not all deprecated timezone names are also aliases.

  • Timezone names with the "posix/" or "right/" prefixes, unless allow_prefix=False.

These strings are provided separately from Tzinfo objects - such as ZoneInfo instances from the timezones() strategy - to facilitate testing of timezone logic without needing workarounds to access non-canonical names.


The zoneinfo module is new in Python 3.9, so you will need to install the backports.zoneinfo module on earlier versions.

On Windows, you will also need to install the tzdata package.

pip install hypothesis[zoneinfo] will install these conditional dependencies if and only if they are needed.

On Windows, you may need to access IANA timezone data via the tzdata package. For non-IANA timezones, such as Windows-native names or GNU TZ strings, we recommend using sampled_from() with the dateutil package, e.g. dateutil.tz.tzwin.list().

hypothesis.strategies.timezones(*, no_cache=False)

A strategy for zoneinfo.ZoneInfo objects.

If no_cache=True, the generated instances are constructed using ZoneInfo.no_cache instead of the usual constructor. This may change the semantics of your datetimes in surprising ways, so only use it if you know that you need to!


The zoneinfo module is new in Python 3.9, so you will need to install the backports.zoneinfo module on earlier versions.

On Windows, you will also need to install the tzdata package.

pip install hypothesis[zoneinfo] will install these conditional dependencies if and only if they are needed.


hypothesis.strategies.tuples(*args)

Return a strategy which generates a tuple of the same length as args by generating the value at index i from args[i].

e.g. tuples(integers(), integers()) would generate a tuple of length two with both values an integer.

Examples from this strategy shrink by shrinking their component parts.

hypothesis.strategies.uuids(*, version=None, allow_nil=False)

Returns a strategy that generates UUIDs.

If the optional version argument is given, its value is passed through to UUID and only UUIDs of that version will be generated.

If allow_nil is True, generate the nil UUID much more often. Otherwise, all returned values from this will be unique, so e.g. if you do lists(uuids()) the resulting list will never contain duplicates.

Examples from this strategy don’t have any meaningful shrink order.
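A brief sketch of the version argument, with version 4 chosen arbitrarily for the example:

```python
import uuid

from hypothesis import given, settings, strategies as st


@settings(max_examples=50, deadline=None)
@given(st.uuids(version=4))
def test_uuids_have_requested_version(value):
    assert isinstance(value, uuid.UUID)
    assert value.version == 4
```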

Provisional strategies

This module contains various provisional APIs and strategies.

It is intended for internal use, to ease code reuse, and is not stable. Point releases may move or break the contents at any time!

Internet strategies should conform to RFC 3986 or the authoritative definitions it links to. If not, report the bug!

hypothesis.provisional.domains(*, max_length=255, max_element_length=63)

Generate RFC 1035 compliant fully qualified domain names.


hypothesis.provisional.urls()

A strategy for RFC 3986, generating http/https URLs.
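A short sketch of how the provisional internet strategies might be used, assuming the defaults shown in the signature above. The property name check_internet_strategies and the reduced max_examples setting are illustrative choices, and since this module is not stable the details may change.

```python
from hypothesis import given, settings
from hypothesis.provisional import domains, urls


@settings(max_examples=25, deadline=None)
@given(domains(), urls())
def check_internet_strategies(domain, url):
    # domains() defaults to max_length=255 for the whole name.
    assert len(domain) <= 255
    # urls() only generates the http and https schemes.
    assert url.startswith(("http://", "https://"))


check_internet_strategies()
```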


Shrinking

When using strategies it is worth thinking about how the data shrinks. Shrinking is the process by which Hypothesis tries to produce human-readable examples when it finds a failure - it takes a complex example and turns it into a simpler one.

Each strategy defines an order in which it shrinks - you won’t usually need to care about this much, but it can be worth being aware of as it can affect what the best way to write your own strategies is.

The exact shrinking behaviour is not a guaranteed part of the API, but it doesn’t change that often and when it does it’s usually because we think the new way produces nicer examples.

Possibly the most important one to be aware of is one_of(), which has a preference for values produced by strategies earlier in its argument list. Most of the others should largely “do the right thing” without you having to think about it.
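To see the one_of() preference in practice, one option is hypothesis.find(), which returns the simplest example satisfying a condition. The sketch below uses a condition that accepts everything, so the result should come from whichever strategy is listed first. Since shrinking behaviour is not a guaranteed part of the API, treat this as an illustration rather than a contract.

```python
from hypothesis import find, strategies as st

# The condition accepts every value, so find() returns the simplest one.
first = find(st.one_of(st.integers(), st.text()), lambda x: True)
second = find(st.one_of(st.text(), st.integers()), lambda x: True)

# The type of the result follows the order of the arguments to one_of().
print(type(first).__name__, type(second).__name__)
```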

Adapting strategies

Often a strategy doesn’t produce exactly what you want and you need to adapt it. Sometimes you can do this in the test, but this hurts reuse because you then have to repeat the adaptation in every test.

Hypothesis gives you ways to build strategies from other strategies given functions for transforming the data.


Mapping

map is probably the easiest and most useful of these to use. If you have a strategy s and a function f, then s.map(f).example() is f(s.example()), i.e. we draw an example from s and then apply f to it.


>>> lists(integers()).map(sorted).example()
[-25527, -24245, -23118, -93, -70, -7, 0, 39, 40, 65, 88, 112, 6189, 9480, 19469, 27256, 32526, 1566924430]

Note that many things that you might use mapping for can also be done with builds(), and if you find yourself indexing into a tuple within .map() it’s probably time to use that instead.
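The sketch below contrasts the two approaches. Point is a hypothetical dataclass introduced only for this illustration; both strategies generate the same kind of object, but the builds() version avoids indexing into a tuple.

```python
from dataclasses import dataclass

from hypothesis import given, strategies as st


@dataclass
class Point:  # hypothetical example type
    x: int
    y: int


# Indexing into a tuple inside .map() works, but is easy to get wrong.
via_map = st.tuples(st.integers(), st.integers()).map(lambda t: Point(t[0], t[1]))

# builds() passes each drawn value to the named argument directly.
via_builds = st.builds(Point, x=st.integers(), y=st.integers())


@given(via_map, via_builds)
def check_points(a, b):
    assert isinstance(a, Point) and isinstance(b, Point)


check_points()
```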


Filtering

filter lets you reject some examples. s.filter(f).example() is some example of s such that f(example) is truthy.

>>> integers().filter(lambda x: x > 11).example()

It’s important to note that filter isn’t magic and if your condition is too hard to satisfy then this can fail:

>>> integers().filter(lambda x: False).example()
Traceback (most recent call last):
hypothesis.errors.Unsatisfiable: Could not find any valid examples in 20 tries

In general you should try to use filter only to avoid corner cases that you don’t want rather than attempting to cut out a large chunk of the search space.

A technique that often works well here is to use map to first transform the data and then use filter to remove things that didn’t work out. So for example if you wanted pairs of integers (x,y) such that x < y you could do the following:

>>> tuples(integers(), integers()).map(sorted).filter(lambda x: x[0] < x[1]).example()
[-8543729478746591815, 3760495307320535691]
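Where a constraint can be built into the data directly, no filter is needed at all. As a sketch of that alternative for the same x < y requirement, the following draws x freely plus a strictly positive gap, so every generated pair is valid by construction. The names ordered_pairs and check_ordered are illustrative.

```python
from hypothesis import given, strategies as st

# Draw x and a gap of at least 1, then compute y = x + gap.
ordered_pairs = st.tuples(st.integers(), st.integers(min_value=1)).map(
    lambda t: (t[0], t[0] + t[1])
)


@given(ordered_pairs)
def check_ordered(pair):
    x, y = pair
    assert x < y


check_ordered()
```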

Chaining strategies together

Finally there is flatmap. flatmap draws an example, then turns that example into a strategy, then draws an example from that strategy.

It may not be obvious why you want this at first, but it turns out to be quite useful because it lets you generate different types of data with relationships to each other.

For example suppose we wanted to generate a list of lists of the same length:

>>> rectangle_lists = integers(min_value=0, max_value=10).flatmap(
...     lambda n: lists(lists(integers(), min_size=n, max_size=n))
... )
>>> rectangle_lists.example()
>>> rectangle_lists.filter(lambda x: len(x) >= 10).example()
[[], [], [], [], [], [], [], [], [], []]
>>> rectangle_lists.filter(lambda t: len(t) >= 3 and len(t[0]) >= 3).example()
[[0, 0, 0], [0, 0, 0], [0, 0, 0]]
>>> rectangle_lists.filter(lambda t: sum(len(s) for s in t) >= 10).example()
[[0], [0], [0], [0], [0], [0], [0], [0], [0], [0]]

In this example we first choose a length for the inner lists, then we build a strategy which generates lists containing lists of precisely that length. The filtered calls show what simple examples for this look like.

Most of the time you probably don’t want flatmap, but unlike filter and map which are just conveniences for things you could just do in your tests, flatmap allows genuinely new data generation that you wouldn’t otherwise be able to easily do.

(If you know Haskell: Yes, this is more or less a monadic bind. If you don’t know Haskell, ignore everything in these parentheses. You do not need to understand anything about monads to use this, or anything else in Hypothesis).
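Another common use of flatmap is generating a value together with something derived from it. The sketch below draws a non-empty list and then one of its own elements; list_and_member and check_member are names chosen for this illustration.

```python
from hypothesis import given, strategies as st

# First draw a list, then build a strategy that pairs that exact list
# with an element sampled from it.
list_and_member = st.lists(st.integers(), min_size=1).flatmap(
    lambda xs: st.tuples(st.just(xs), st.sampled_from(xs))
)


@given(list_and_member)
def check_member(pair):
    xs, x = pair
    assert x in xs


check_member()
```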

Recursive data

Sometimes the data you want to generate has a recursive definition. For example, if you wanted to generate JSON data, valid JSON is:

  1. null, any float, any boolean, or any unicode string.

  2. Any list of valid JSON data

  3. Any dictionary mapping unicode strings to valid JSON data.

The problem is that you cannot call a strategy recursively and expect it to not just blow up and eat all your memory. The other problem here is that not all unicode strings display consistently on different machines, so we’ll restrict them in our doctest.

The way Hypothesis handles this is with the recursive() strategy which you pass in a base case and a function that, given a strategy for your data type, returns a new strategy for it. So for example:

>>> from string import printable
>>> from pprint import pprint
>>> json = recursive(
...     none() | booleans() | floats() | text(printable),
...     lambda children: lists(children) | dictionaries(text(printable), children),
... )
>>> pprint(json.example())
[[1.175494351e-38, ']', 1.9, True, False, '.M}Xl', ''], True]
>>> pprint(json.example())
{'de(l': None,
 'nK': {'(Rt)': None,
        '+hoZh1YU]gy8': True,
        '8z]EIFA06^li^': 'LFE{Q',
        '9,': 'l{cA=/'}}

That is, we start with our leaf data and then we augment it by allowing lists and dictionaries of anything we can generate as JSON data.

The size control of this works by limiting the maximum number of values that can be drawn from the base strategy. So for example if we wanted to only generate really small JSON we could do this as:

>>> small_lists = recursive(booleans(), lists, max_leaves=5)
>>> small_lists.example()
>>> small_lists.example()
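One way to check the effect of max_leaves is to count the base values in each generated structure. This is a sketch that assumes the cap applies per generated example, as described above; count_leaves is a helper written for this illustration.

```python
from hypothesis import given, strategies as st


def count_leaves(value):
    # A list contributes the leaves of its elements; anything else is one leaf.
    if isinstance(value, list):
        return sum(count_leaves(v) for v in value)
    return 1


@given(st.recursive(st.booleans(), st.lists, max_leaves=5))
def check_leaf_cap(value):
    assert count_leaves(value) <= 5


check_leaf_cap()
```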

Composite strategies

The @composite decorator lets you combine other strategies in more or less arbitrary ways. It’s probably the main thing you’ll want to use for complicated custom strategies.

The composite decorator works by converting a function that returns one example into a function that returns a strategy that produces such examples - which you can pass to @given, modify with .map or .filter, and generally use like any other strategy.

It does this by giving you a special function draw as the first argument, which can be used just like the corresponding method of the data() strategy within a test. In fact, the implementation is almost the same - but defining a strategy with @composite makes code reuse easier, and usually improves the display of failing examples.

For example, the following gives you a list and an index into it:

>>> @composite
... def list_and_index(draw, elements=integers()):
...     xs = draw(lists(elements, min_size=1))
...     i = draw(integers(min_value=0, max_value=len(xs) - 1))
...     return (xs, i)

draw(s) is a function that should be thought of as returning s.example(), except that the result is reproducible and will minimize correctly. The decorated function has the initial argument removed from the list, but will accept all the others in the expected order. Defaults are preserved.

>>> list_and_index()
>>> list_and_index().example()
([15949, -35, 21764, 8167, 1607867656, -41, 104, 19, -90, 520116744169390387, 7107438879249457973], 0)

>>> list_and_index(booleans())
>>> list_and_index(booleans()).example()
([True, False], 0)

Note that the repr will work exactly like it does for all the built-in strategies: it will be a function that you can call to get the strategy in question, with values provided only if they do not match the defaults.
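The strategy returned by a composite function can be passed to @given like any other. As a small self-contained sketch, the following repeats the list_and_index definition and uses it in a property; check_index_is_valid is an illustrative name.

```python
from hypothesis import given, strategies as st


@st.composite
def list_and_index(draw, elements=st.integers()):
    xs = draw(st.lists(elements, min_size=1))
    i = draw(st.integers(min_value=0, max_value=len(xs) - 1))
    return (xs, i)


@given(list_and_index(st.booleans()))
def check_index_is_valid(pair):
    xs, i = pair
    # The index always points inside the list it was drawn with.
    assert 0 <= i < len(xs)
    assert isinstance(xs[i], bool)


check_index_is_valid()
```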

You can use assume inside composite functions:

from hypothesis import assume
from hypothesis.strategies import composite, text

@composite
def distinct_strings_with_common_characters(draw):
    x = draw(text(min_size=1))
    y = draw(text(alphabet=x))
    assume(x != y)
    return (x, y)

This works as assume normally would, filtering out any examples for which the passed in argument is falsey.

Take care that your function can cope with adversarial draws, or explicitly rejects them using the .filter() method or assume() - our mutation and shrinking logic can do some strange things, and a naive implementation might lead to serious performance problems. For example:

import hypothesis.strategies as st


@st.composite
def reimplementing_sets_strategy(draw, elements=st.integers(), size=5):
    result = set()
    # The bad way: if Hypothesis keeps generating e.g. zero,
    # we'll keep looping for a very long time.
    #     while len(result) < size:
    #         result.add(draw(elements))
    # The good way: use a filter, so Hypothesis can tell what's valid!
    for _ in range(size):
        result.add(draw(elements.filter(lambda x: x not in result)))
    return result

If @composite is used to decorate a method or classmethod, the draw argument must come before self or cls. While we therefore recommend writing strategies as standalone functions and using the register_type_strategy() function to associate them with a class, methods are supported and the @composite decorator may be applied either before or after @classmethod or @staticmethod. See issue #2578 and pull request #2634 for more details.
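A minimal sketch of the recommended standalone-function approach follows. Interval is a hypothetical class, and intervals is a composite function written for this illustration; register_type_strategy() then lets from_type() resolve the class to that strategy.

```python
from hypothesis import given, strategies as st


class Interval:  # hypothetical example class
    def __init__(self, low, high):
        self.low = low
        self.high = high


@st.composite
def intervals(draw):
    low = draw(st.integers())
    high = draw(st.integers(min_value=low))
    return Interval(low, high)


# Associate the standalone strategy with the class.
st.register_type_strategy(Interval, intervals())


@given(st.from_type(Interval))
def check_interval(iv):
    assert iv.low <= iv.high


check_interval()
```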

Drawing interactively in tests

There is also the data() strategy, which gives you a means of using strategies interactively. Rather than having to specify everything up front in @given you can draw from strategies in the body of your test.

This is similar to @composite, but even more powerful as it allows you to mix test code with example generation. The downside of this power is that data() is incompatible with explicit @example(...)s - and the mixed code is often harder to debug when something goes wrong.

If you need values that are affected by previous draws but which don’t depend on the execution of your test, stick to the simpler @composite.

from hypothesis import given
from hypothesis.strategies import data, integers

@given(data())
def test_draw_sequentially(data):
    x = data.draw(integers())
    y = data.draw(integers(min_value=x))
    assert x < y

If the test fails, each draw will be printed with the falsifying example. e.g. the above is wrong (it has a boundary condition error), so will print:

Falsifying example: test_draw_sequentially(data=data(...))
Draw 1: 0
Draw 2: 0

As you can see, data drawn this way is simplified as usual.

Optionally, you can provide a label to identify values generated by each call to data.draw(). These labels can be used to identify values in the output of a falsifying example.

For instance:

@given(data())
def test_draw_sequentially(data):
    x = data.draw(integers(), label="First number")
    y = data.draw(integers(min_value=x), label="Second number")
    assert x < y

will produce the output:

Falsifying example: test_draw_sequentially(data=data(...))
Draw 1 (First number): 0
Draw 2 (Second number): 0
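Because the second draw here depends only on the first draw and not on anything the test does, the same data could also be produced with @composite, as suggested earlier in this section. The sketch below does that and also fixes the boundary error by requiring y to be at least x + 1; increasing_pair is a name chosen for illustration.

```python
from hypothesis import given, strategies as st


@st.composite
def increasing_pair(draw):
    x = draw(st.integers())
    # min_value=x + 1 rules out the x == y case that broke the test above.
    y = draw(st.integers(min_value=x + 1))
    return (x, y)


@given(increasing_pair())
def test_pair_is_increasing(pair):
    x, y = pair
    assert x < y


test_pair_is_increasing()
```

Unlike the data() version, a test written this way can also be combined with explicit @example(...) cases.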