Toshio Kuratomi: Voluptuous and Python-3.4 Enums

Last year I played around with using jsonschema for validating some data that I was reading into my Python programs.  The API for the library was straightforward but the schema turned out to be a pretty sprawling affair: stellar-schema.json

This past weekend I started looking at adding some persistence to this program in the form of SQLAlchemy-backed data structures.  To make those work right, I decided that I had to change the schema for these data files as well.  Looking at the jsonschema, I realized that I was having a hard time remembering what all the schema keywords meant and how to modify the file; never a good sign!  So I decided to reimplement the schema in something simpler.

At work, we use a library called voluptuous.  One of the nice things about it is that the containers are Python data structures and the types are Python type constructors.  So simple things can be very simple:

from voluptuous import Schema

schema = Schema([
  {'hosts': str,
   'gather_facts': bool,
   'tasks': [],
   }
])

The unfortunate thing about voluptuous is that the reference documentation is very bad, and there is no recipe-style documentation to teach you how best to accomplish common tasks.  So when you have to reach deeper to do something a bit more complex, you can look for an example in the README.md (a pretty good resource, but it has no table of contents, so you’re often left wondering whether what you want is documented there or not), or you can hunt around with Google for a blog post or Stack Overflow question that explains it.  If no one else has discovered an answer already... well, voluptuous may have just the feature you need, but you might never find it.

The feature that I needed today was to integrate Python-3.4’s Enum type with voluptuous.  The closest I found was this feature request which asked for enum support to be added.  The issue was closed with a few examples of “code that works” which didn’t quite explain all the underlying concepts to a new voluptuous user like myself.  I took away three ideas:

  • Schema(Coerce(EnumType)) was supposed to work
  • Schema(EnumType) was supposed to work but…
  • Schema(EnumType) doesn’t work in a way that the person who opened the issue (or I) would find useful.

I was still confused but at least now I had some pieces of code to try out.  So here’s what my first series of tests looked like:

from enum import Enum 
import voluptuous as v

# My eventual goal: data_to_validate = {"type": "one"}

MyTypes = Enum("MyTypes", ['one', 'two', 'three'])
s1 = v.Schema(MyTypes)
s2 = v.Schema(v.Coerce(MyTypes))

s1('one')  # validation error (figured that would be the case from the ticket)
s1(MyTypes.one)  #  Works but not helpful for me
s2('one')  # validation error (Thought this one would work...)

Hmm… so far, this isn’t looking too hopeful. The only thing that I got working isn’t going to help me validate my actual data. Well, let’s google for what Coerce actually does….

A short while later, I think I see what’s going on. Coerce attempts to convert the data given to it into a new type by calling the function it is given. Calling an Enum looks a member up by the actual value backing it, not by its symbolic label. To look a member up by its symbolic label, you need square bracket notation. Square brackets are syntax for the __getitem__ magic method, so maybe we can pass that in to get what we want:

MyTypes = Enum("MyTypes", ['one', 'two', 'three'])

s2 = v.Schema(v.Coerce(MyTypes))
s3 = v.Schema(v.Coerce(MyTypes.__getitem__))

s2(1)  # This validates
s3('one')  # And so does this!  Yay!
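The difference between the two lookup styles can be checked with the standard library alone, without voluptuous. This is a minimal sketch; error_name is a hypothetical helper that exists only for this illustration.

```python
from enum import Enum

MyTypes = Enum("MyTypes", ["one", "two", "three"])

# Calling the Enum looks a member up by its backing value.
assert MyTypes(1) is MyTypes.one

# Square brackets (__getitem__) look a member up by its symbolic name.
assert MyTypes["one"] is MyTypes.one


def error_name(func, arg):
    """Hypothetical helper: name of the exception func(arg) raises, or None."""
    try:
        func(arg)
    except Exception as exc:
        return type(exc).__name__
    return None


# A symbolic name is not a backing value, and a backing value is not a name.
by_value_error = error_name(MyTypes, "one")
by_name_error = error_name(MyTypes.__getitem__, 1)
print(by_value_error, by_name_error)
```

Note that the two failed lookups raise different exception types.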

Okay, so now we think that this all makes sense... but there’s actually one more wrinkle that we have to work out. It turns out that Coerce only marks a value as Invalid if the function it’s given throws a TypeError or ValueError. __getitem__ throws a KeyError, so guess what happens when a dict schema built on Coerce(MyTypes.__getitem__) (named symbolic_only here) is given a name that isn’t in the enum:

>>> symbolic_only({"type": "five"})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/badger/.local/lib/python3.6/site-packages/voluptuous/schema_builder.py", line 267, in __call__
    return self._compiled([], data)
  File "/home/badger/.local/lib/python3.6/site-packages/voluptuous/schema_builder.py", line 587, in validate_dict
    return base_validate(path, iteritems(data), out)
  File "/home/badger/.local/lib/python3.6/site-packages/voluptuous/schema_builder.py", line 379, in validate_mapping
    cval = cvalue(key_path, value)
  File "/home/badger/.local/lib/python3.6/site-packages/voluptuous/schema_builder.py", line 769, in validate_callable
    return schema(data)
  File "/home/badger/.local/lib/python3.6/site-packages/voluptuous/validators.py", line 95, in __call__
    return self.type(v)
  File "/usr/lib64/python3.6/enum.py", line 327, in __getitem__
    return cls._member_map_[name]
KeyError: 'five'
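To see why the KeyError escapes, here is a rough stdlib-only sketch of the behaviour described above. It is not voluptuous’s actual implementation: coerce_sketch, InvalidSketch and outcome are hypothetical names, and the only assumption carried over is that TypeError and ValueError are the two exception types that get translated.

```python
from enum import Enum

MyTypes = Enum("MyTypes", ["one", "two", "three"])


class InvalidSketch(Exception):
    """Hypothetical stand-in for a validation-failure exception."""


def coerce_sketch(func):
    """Hypothetical simplification of a Coerce-style wrapper around func."""
    def validate(value):
        try:
            return func(value)
        except (TypeError, ValueError):
            # Only these two exception types are turned into a validation error.
            raise InvalidSketch(f"could not coerce {value!r}")
    return validate


def outcome(validator, value):
    """Return the validated value, or the name of the exception raised."""
    try:
        return validator(value)
    except Exception as exc:
        return type(exc).__name__


by_value = coerce_sketch(MyTypes)
by_name = coerce_sketch(MyTypes.__getitem__)

print(outcome(by_value, 5))       # MyTypes(5) raises ValueError, which is translated
print(outcome(by_name, "five"))   # MyTypes["five"] raises KeyError, which passes through
```

KeyError derives from LookupError rather than ValueError, so an except clause listing only TypeError and ValueError never sees it.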

The call fails with a bare KeyError instead of a voluptuous Invalid exception. Okay, no problem, we just have to remember to wrap the __getitem__ lookup in a function which raises ValueError if the name isn’t present. Anything else you should be aware of? Well, for my purposes, I only want the enum’s symbolic names to match, but what if you wanted either the symbolic names or the actual values to work (s2 or s3)? For that, you can combine this with voluptuous’s Any function. Here’s what those validators will look like:

from enum import Enum 
import voluptuous as v

data_to_validate = {"type": "one"}
MyTypes = Enum("MyTypes", ['one', 'two', 'three'])

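def mytypes_validator(value):
    # Look the value up by symbolic name; we only care whether it exists.
    try:
        MyTypes[value]
    except KeyError:
        # Coerce only treats TypeError and ValueError as validation failures.
        raise ValueError(f"{value} is not a valid member of MyTypes")
    return value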

symbolic_only = v.Schema({"type": v.Coerce(mytypes_validator)})
symbols_and_values = v.Schema({
    "type": v.Any(v.Coerce(MyTypes), v.Coerce(mytypes_validator)),
})

symbolic_only(data_to_validate)   # Hip Hip!
symbols_and_values(data_to_validate)  # Hooray!

symbols_and_values({"type": 1})  # If this is what you *really* want
symbolic_only({"type": 1})  # If you want implementation to remain hidden

symbols_and_values({"type": 5})  # Prove that invalids are getting caught
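If you have more than one Enum to validate, the wrapper can be generated instead of written by hand. The following is only a sketch: enum_name_validator is a hypothetical helper, not part of voluptuous, and it follows the same approach as mytypes_validator by turning the KeyError into a ValueError and returning the original value unchanged.

```python
from enum import Enum


def enum_name_validator(enum_cls):
    """Hypothetical helper: build a name-checking function for any Enum class."""
    def validate(value):
        try:
            enum_cls[value]
        except KeyError:
            # Re-raise as ValueError so that callers which only handle
            # TypeError and ValueError see a validation failure.
            raise ValueError(
                f"{value!r} is not a valid member name of {enum_cls.__name__}"
            ) from None
        return value
    return validate


MyTypes = Enum("MyTypes", ["one", "two", "three"])
validate_mytype = enum_name_validator(MyTypes)

print(validate_mytype("one"))  # the name is returned as-is when it exists
```

The returned function should be usable anywhere mytypes_validator is used above, for example as the argument to Coerce.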
