Channel: Planet Python
Luke Plant: Test factory functions in Django

When writing tests for Django projects, you typically need to create quite a lot of instances of database model objects. This page documents the patterns I recommend, and the ones I don’t.

Before I get going, I should mention that a lot of this can be avoided altogether if you can separate out database independent logic from your models. But you can only go so far without serious contortions, and you’ll probably still need to write a fair number of tests that hit the database.

The aim

We want the following:

  • Every test should specify each detail about database state it depends on

  • The test should not specify any detail it doesn’t depend on

  • We should be able to conveniently and succinctly write “what we mean”, without having to worry about lower level details, especially database schema details that are not intrinsic to the test.

These things are important so that you can understand tests in isolation, and so that changes not relevant to a test do not break that test. Otherwise you will spend a lot of your time fixing broken tests rather than making the changes you actually need to make.

Custom factory functions

The answer to this is simply to create your own “factory” functions, with optional keyword arguments (preferably keyword only) for almost everything. You can add parameters by hand as and when you need them.

Here are some simple but real examples from the Christian Camps in Wales booking system, which has a BookingAccount model and supports payment by cheque, recorded as a ManualPayment object:

    def create_booking_account(
        *,
        name: str = "A Booker",
        address_line1: str = "",
        address_post_code: str = "XYZ",
        email: str = Auto,
    ) -> BookingAccount:
        return BookingAccount.objects.create(
            name=name,
            email=email or next(BOOKING_ACCOUNT_EMAIL_SEQUENCE),
            address_line1=address_line1,
            address_post_code=address_post_code,
        )


    def create_manual_payment(
        *,
        account: BookingAccount = Auto,
        amount: int = 1,
    ) -> ManualPayment:
        return ManualPayment.objects.create(
            account=account or create_booking_account(),
            amount=amount,
            payment_type=ManualPaymentType.CHEQUE,
        )

You can find the rest of this project’s test factory functions with this search on GitHub.

A few patterns to note:

The Auto sentinel

In a number of places above we used a default value of Auto, which is a custom object defined as follows:

    class _Auto:
        """
        Sentinel value indicating an automatic default will be used.
        """

        def __bool__(self):
            # Allow `Auto` to be used like `None` or `False` in boolean expressions
            return False


    Auto: Any = _Auto()

We use Auto instead of None or something else, because:

  • Sometimes you need to specify None as an actual value (for nullable DB fields), but don’t want it as the default.

  • Often the correct default needs to be defined dynamically:

    • you need to create another object at runtime, as in the account: BookingAccount = Auto line above

    • a sensible and correct default depends on some other argument, so requires some logic in the body of the function.

We create a singleton value Auto so we can do if foo is Auto checks.

We also give it a type Any so that type checkers don’t complain about using it as a default value. It doesn’t break type checking for the functions calling our factory functions.
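To make the None-versus-Auto distinction concrete, here is a minimal sketch. The Profile class, the create_profile factory and the date_of_birth field are all invented for illustration, and a plain dataclass stands in for a Django model so that the snippet runs on its own.

```python
from dataclasses import dataclass
from datetime import date
from typing import Any, Optional


class _Auto:
    """Sentinel value indicating an automatic default will be used."""

    def __bool__(self):
        return False


Auto: Any = _Auto()


@dataclass
class Profile:
    # Hypothetical stand-in for a Django model with a nullable field
    name: str
    date_of_birth: Optional[date]


def create_profile(
    *,
    name: str = "A Person",
    date_of_birth: Optional[date] = Auto,
) -> Profile:
    # `is Auto` distinguishes "not specified" from an explicit None,
    # which `date_of_birth or ...` could not do
    if date_of_birth is Auto:
        date_of_birth = date(1990, 1, 1)
    return Profile(name=name, date_of_birth=date_of_birth)
```

With this shape, create_profile() gets a filled-in date, while create_profile(date_of_birth=None) really does store None.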

Constraints and sequences

Often you have the problem that a unique constraint on a field makes it difficult to provide a static default. As in the example above, I’m using a really simple technique to deal with this — generate a sequence of values that are unlikely to be specified manually in a test. In the above code, you can see BOOKING_ACCOUNT_EMAIL_SEQUENCE which is defined like this at the module level:

    BOOKING_ACCOUNT_EMAIL_SEQUENCE = sequence(lambda n: f"booker_{n}@example.com")

Every time we call next() on this object, we get a distinct value, so we avoid issues with constraints.

The sequence utility is actually super simple, but presented here in all its type-hinted glory:

    import itertools
    from typing import Any, Callable, Generator, TypeVar

    T = TypeVar("T")


    def sequence(func: Callable[[int], T]) -> Generator[T, None, None]:
        """
        Generates a sequence of values from a sequence of integers starting at zero,
        passed through the callable, which must take an integer argument.
        """
        return (func(n) for n in itertools.count())

You could do something even simpler though — just use a generator expression at the top level:

    BOOKING_ACCOUNT_EMAIL_SEQUENCE = (f"booker_{n}@example.com" for n in itertools.count())

There can be some cases where you need something more complicated than this (for example to be able to reset sequences) but they are rare in my experience and fairly easy to write.
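As one possible shape for the resettable case, here is a small sketch. ResettableSequence is a hypothetical helper written for this illustration, not something from the project, and EMAILS is just an example instance.

```python
import itertools
from typing import Callable, Generic, TypeVar

T = TypeVar("T")


class ResettableSequence(Generic[T]):
    """Like `sequence`, but can be restarted from zero (hypothetical helper)."""

    def __init__(self, func: Callable[[int], T]) -> None:
        self._func = func
        self._counter = itertools.count()

    def __iter__(self) -> "ResettableSequence[T]":
        return self

    def __next__(self) -> T:
        # Pass the next integer through the callable, as `sequence` does
        return self._func(next(self._counter))

    def reset(self) -> None:
        # Start counting from zero again
        self._counter = itertools.count()


EMAILS = ResettableSequence(lambda n: f"booker_{n}@example.com")
```

It still works with next(), so factory functions using it would not need to change.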

Delegation and sub-objects

Factory functions often delegate to other factory functions, as in the examples above.

It’s also quite common to want to specify something about a sub-object. Rather than build up a tree of objects as the caller, I often add a parameter to the top-level factory itself. This gives you some independence from the actual schema.
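A rough sketch of what that can look like follows. The account_name parameter is hypothetical, added here only to illustrate the idea, and plain dataclasses stand in for the Django models so the snippet is self-contained.

```python
from dataclasses import dataclass
from typing import Any


class _Auto:
    def __bool__(self):
        return False


Auto: Any = _Auto()


@dataclass
class BookingAccount:
    # Stand-in for the real Django model
    name: str


@dataclass
class ManualPayment:
    # Stand-in for the real Django model
    account: BookingAccount
    amount: int


def create_booking_account(*, name: str = "A Booker") -> BookingAccount:
    return BookingAccount(name=name)


def create_manual_payment(
    *,
    account: BookingAccount = Auto,
    account_name: str = Auto,  # hypothetical convenience parameter
    amount: int = 1,
) -> ManualPayment:
    if account is Auto:
        # Delegate to the other factory, passing on only what was specified
        if account_name is Auto:
            account = create_booking_account()
        else:
            account = create_booking_account(name=account_name)
    elif account_name is not Auto:
        # The caller gave both an account and a name for a new account
        raise ValueError("Pass either `account` or `account_name`, not both")
    return ManualPayment(account=account, amount=amount)
```

A test can then write create_manual_payment(account_name="Joe") without knowing how payments and accounts are linked.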

Special purpose factories

You aren’t limited to one factory function per model, you can have as many as you like. For example you might have create_staff_user and create_end_user which take different parameters, but both happen to return the same User model.
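As an illustration only, two such factories might look like the sketch below. The User fields (is_staff, department, newsletter_opt_in) and the default values are assumptions made up for this example, with a dataclass standing in for the real model.

```python
import itertools
from dataclasses import dataclass


@dataclass
class User:
    # Stand-in for a Django user model; the fields are illustrative
    username: str
    is_staff: bool
    department: str = ""
    newsletter_opt_in: bool = False


USERNAME_SEQUENCE = (f"user_{n}" for n in itertools.count())


def create_staff_user(*, username: str = "", department: str = "Support") -> User:
    return User(
        username=username or next(USERNAME_SEQUENCE),
        is_staff=True,
        department=department,
    )


def create_end_user(*, username: str = "", newsletter_opt_in: bool = False) -> User:
    return User(
        username=username or next(USERNAME_SEQUENCE),
        is_staff=False,
        newsletter_opt_in=newsletter_opt_in,
    )
```

Each function exposes only the parameters that make sense for that kind of user.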

Sensible and minimal defaults

As far as possible, the factory function should pick sensible defaults, based on what parameters were passed in if any. If it can’t because the caller contradicted themselves, it should raise an exception.

I normally take the approach that the defaults should produce minimal and pristine objects, while being complete and usable.

For example, if your model supports soft-delete via deactivation, active=False would be a bad default. On the other hand, creating lots of related objects in order to be “realistic” would not be a good idea.

You should be pragmatic. For example, for a User object, if a brand new, “pristine” user is always forced to go through an on-boarding flow on your website, meaning that every single page but the onboarding page is blocked until they complete it, then has_onboarded=True is probably a more sensible default — only a few of your tests will want has_onboarded=False.

Simplified interface

A good factory function will often simplify things for the caller.

For example, in the CCiW project mentioned, the Camp model has a leaders relationship, which is a many-to-many. For several good reasons, the leaders are not User objects, but Person objects, where Person has some metadata and another many-to-many (!) with User objects. However, when I’m writing a test, I might want to be able to say something like:

    user = create_user()
    camp = create_camp(leader=user)
    login(user)

Here, I just care that the user is conceptually the leader of the camp. I don’t care:

  • that a camp can have more than one leader

  • that the Camp is actually related to the User object via a Person object.

Sometimes I don’t care about specifying who the leader actually is, just that there is one, so I might want to pass leader=True.

My factory function ends up looking like this:

    def create_camp(
        *,
        leader: Person | User | bool = Auto,
        leaders: list[Person | User] = Auto,
    ) -> Camp:
        ...

It’s redundant, but it’s easy to use, and this approach means you isolate many of your tests from needing changing. Sometimes my factory functions end up having a lot of parameters, and they’re unlikely to win any beauty contests — but who really cares? They are easy to understand and modify.
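The body of such a function is elided above, so the following is only a guess at one way it could work, not the project's actual code. Dataclasses stand in for the models, the _as_person helper is hypothetical, and "no leaders" is assumed to be the minimal default.

```python
from dataclasses import dataclass, field
from typing import Any, Union


class _Auto:
    def __bool__(self):
        return False


Auto: Any = _Auto()


@dataclass
class User:
    username: str


@dataclass
class Person:
    name: str
    users: list = field(default_factory=list)


@dataclass
class Camp:
    leaders: list = field(default_factory=list)


def _as_person(leader: Union[Person, User]) -> Person:
    # Hypothetical helper: wrap a User in a Person so callers can pass either
    if isinstance(leader, Person):
        return leader
    return Person(name=leader.username, users=[leader])


def create_camp(
    *,
    leader: Union[Person, User, bool] = Auto,
    leaders: list = Auto,
) -> Camp:
    if leader is not Auto and leaders is not Auto:
        # The caller contradicted themselves, so refuse to guess
        raise ValueError("Pass `leader` or `leaders`, not both")
    if leaders is Auto:
        if leader is Auto or leader is False:
            leaders = []  # assumed minimal default: no leaders
        elif leader is True:
            leaders = [Person(name="A Leader")]  # "some leader, I don't care who"
        else:
            leaders = [leader]
    return Camp(leaders=[_as_person(item) for item in leaders])
```

All the schema knowledge (Person wrapping User, the many-to-many) lives in one place, so a schema change means updating this function rather than every test.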

Type hints

Type hints are great for getting good help in your editor when writing tests. Use them!

Don’t depend on defaults

If a test requires a certain value, and it happens to be the default that the factory will use, the test should still specify it. This makes the test more robust, and allows the factory to change the defaults. If a test doesn’t specify a value, that means it doesn’t depend on it, and it should work with any value the factory happens to choose.
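A small sketch of the difference, using a stand-in ManualPayment dataclass and a hypothetical total_paid function as the code under test:

```python
from dataclasses import dataclass


@dataclass
class ManualPayment:
    # Stand-in for the real Django model
    amount: int


def create_manual_payment(*, amount: int = 1) -> ManualPayment:
    return ManualPayment(amount=amount)


def total_paid(payments: list) -> int:
    # Hypothetical code under test
    return sum(payment.amount for payment in payments)


def test_total_paid():
    # amount=1 happens to be the factory default, but the assertion below
    # depends on it, so the test says so explicitly
    payments = [
        create_manual_payment(amount=1),
        create_manual_payment(amount=2),
    ]
    assert total_paid(payments) == 3
```

If the factory default later changed to amount=5, this test would carry on passing.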

What not to do

Now for the anti-patterns. If you’re happy with the answer above, you don’t need to read this bit.

Fixtures

Django docs used to encourage you to define models in fixtures for use in tests. Don’t do that! I’ll let Carl Meyer tell you why.

**kwargs

When writing factory functions, rather than adding loads of parameters, it may be tempting to just let them accept **kwargs and pass those on to the underlying model. I usually prefer not to do that, because:

  • you get much less help when writing tests

  • you tend to end up overly tied to the actual schema

django-dynamic-fixture

I used to use django-dynamic-fixture to avoid the tedium of manual factory functions, but have since thought better of it. You are just introducing a layer between yourself and the code that you actually need to write, and you then have to stop it from doing things you don’t want. It also doesn’t understand the “business logic” needed to come up with sensible defaults.

Factory Boy

OK, Factory Boy, this is like my comments for django-dynamic-fixture, only more so.

Let me put it this way:

You’ve been tasked with providing a procedure for creating model instances, where that procedure will have sensible defaults, but will allow the caller to override them. You have to decide what are the appropriate language features of Python to use. Do you:

  A. Create a function or a method, with parameters for overriding defaults, or,

  B. Define a new class that inherits from Factory, and use the body of the class statement to define a procedure?

If you chose A), congratulations, you got the right answer! You will be rewarded for using the language as it was meant to be used, by things like:

  • Automatic help inside your editor, both for the parameters and the returned value.

  • Static type checking if you want it.

  • Everyone being able to modify your code without looking up some documentation.

If you chose B), you get points for novelty. But you will be punished as follows:

  • You will have to invent things like:

    • nested class Meta for essential configuration of FactoryOptions

    • nested class Params

    • Trait

    • PostGeneration

    • @post_generation

    • LazyAttribute

    • @lazy_attribute

    • @lazy_attribute_sequence

    • LazyFunction

    • SubFactory

    • RelatedFactory

    • SelfAttribute

    • a debug mode (of course)

    • and much more!

  • You will have to write thousands of lines of code, and page after page of documentation to support all this.

  • You will have to get people to read that documentation. Instead of which, they will spend their time writing snarky blog posts complaining about all your hard work!

  • You will get less than zero help from your editor when using these factories — not only will it just display **kwargs for inputs, it will think the output is a Factory instance, which it is not.

  • For people to find what parameters you can pass to a Factory, they will have to know/look up the schema, and inspect the Factory definition and decipher its “traits” etc.

I don’t want to add any further to the burden of the authors — they have suffered enough already! But I do want to deal with a few objections:

But Factory Boy can also create instances without saving them!

This is useful if you want to avoid hitting the DB while being able to test a model method that doesn’t need the DB. In Django, it’s extremely easy to do that without help, because if you aren’t going to save a model instance, you don’t need to worry about any attributes other than the ones you specify — models don’t run validation in the constructor — and so you don’t need factories at all:

    def test_address_formatted():
        address = Address(line1="123 Main St", line2="London")
        assert address.formatted() == "123 Main St\nLondon\n"

If you really need it, you could always add a save parameter to your factory functions.
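Such a save parameter could be sketched as follows. The create_address factory and its defaults are hypothetical, and the Address class here is a dataclass whose save() only records that it was called. With a real Django model you would construct the instance and call its save() method instead of using objects.create().

```python
from dataclasses import dataclass, field


@dataclass
class Address:
    # Stand-in for a Django model; `save()` just records that it was called
    line1: str = ""
    line2: str = ""
    saved: bool = field(default=False, compare=False)

    def save(self) -> None:
        self.saved = True

    def formatted(self) -> str:
        return f"{self.line1}\n{self.line2}\n"


def create_address(
    *,
    line1: str = "1 Test Road",
    line2: str = "Testville",
    save: bool = True,
) -> Address:
    address = Address(line1=line1, line2=line2)
    if save:
        # Only touch the "database" when the caller wants a saved instance
        address.save()
    return address
```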

But Factory Boy can specify related data!

From the example in the README:

    order = OrderFactory(
        amount=200,
        status='PAID',
        customer__is_vip=True,
        address__country='AU',
    )

This is clever, but an anti-pattern in my opinion. As well as specifying that the order country is Australia, you are also implicitly specifying:

  • the Order model stores its address via a foreign key to a separate address model,

  • that model has a country field

  • and you store country information using ISO-3166 country codes.

In other words, you are tying the test more tightly to the schema than you need to. None of these things are relevant to the test, you just want to specify that the order is for Australia.

If instead you do create_order(address_country="AU") then you can leave the factory function to handle the details. That can include normalising a country code to whatever is the right thing, if it wants to, which is very easy to do with simple functions that you are in complete control of.
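Here is a minimal sketch of such a create_order function. The Order and Address dataclasses, the COUNTRY_CODES table and the _normalise_country helper are all assumptions invented for the example, and the real schema could be completely different without any caller noticing.

```python
from dataclasses import dataclass


@dataclass
class Address:
    # Stand-in model: this schema happens to store ISO country codes
    country: str


@dataclass
class Order:
    amount: int
    address: Address


# Hypothetical lookup from names to codes, just enough for the example
COUNTRY_CODES = {"australia": "AU", "united kingdom": "GB"}


def _normalise_country(value: str) -> str:
    cleaned = value.strip()
    # Accept a known country name, otherwise treat the value as a code
    return COUNTRY_CODES.get(cleaned.lower(), cleaned.upper())


def create_order(*, amount: int = 100, address_country: str = "GB") -> Order:
    # How the country is stored is decided here, not in each test
    address = Address(country=_normalise_country(address_country))
    return Order(amount=amount, address=address)
```

If the address later moved onto the Order model itself, only the body of create_order would need to change.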

But Factory Boy has a create_batch method!

If you need to create a bunch of things, you can just do this:

    payments = [create_manual_payment() for i in range(0, 100)]

which really isn’t very hard, and also means you can have arguments that vary depending on the loop variable.

But, because I’m very generous, I will write you a create_batch function for free. Not only that, I’ll add type hints for free, and I’ll leave it right here where you can find it, in the public domain:

    from typing import Callable, TypeVar

    T = TypeVar("T")


    def create_batch(factory: Callable[..., T], count, /, **kwargs) -> list[T]:
        """
        Use `factory` callable to create `count` objects, passing along kwargs
        """
        return [factory(**kwargs) for i in range(0, count)]

Now you can do the following, and your editor and static type checker will know exactly what type of objects payment_1 and payment_2 are:

    payment_1, payment_2 = create_batch(create_manual_payment, 2, amount=10)

Conclusion

You don’t need to install anything to create factory functions. Just use built-in language features, and maybe a few tiny helpers like I’ve shown, and you’re good!

The only real issue with my approach is that sometimes it can feel a bit tedious adding another parameter. But slightly tedious code that is extremely easy to understand and modify, and helps you in all the ways I’ve described, is still a big win in my book. There will be many days when you long for slightly tedious code that just works.

Happy testing!

