This is the second part of a 3-part post about my experiences using mypy to add type information to a real world open source project (I chose pycodestyle). I recommend reading at least the first section of Part 1 if you haven't yet.
Last time I explained what mypy is about, what questions I was trying to answer through my experiment, and the process I followed. This part will focus on the most significant insights that I found along the way. Part 3 will have some conclusions and my answers to the questions that motivated the series.
The collection below is large, and I start with negative things and move on to positive things. If you get bored in the middle, at least skim the later sections or the conclusions, or you might end up with a view biased in a negative direction. I tried to be exhaustive about every interesting thing I found, so this is a long post. The next part of this series will be quite a bit shorter and give a summarized conclusion; what you have here is the raw "data" behind that conclusion.
This section has a lot of examples from real code. To keep this from being even longer than it already is, I removed irrelevant code or expressions and wrote those as ...
The missing parts
Some things are obviously missing from mypy but I’m quite sure they will be there in future versions.
The first significant example is the set of typesheds for the standard library. These cover the most popular parts, but not everything. In my case, the most blatant examples were the keyword and optparse modules. So imports for those modules threw warnings, and anything I did on objects from those modules (like the OptionParser) was unchecked. Not having some components can lead you to a false sense of security. At some point I had declared a function as taking an argument of type optparse.OptionParser. After some review I noted that it actually took an optparse.Values, but mypy wasn't able to tell me I was wrong because it thinks that anything inside optparse is an Any, which means "everything is ok with this". So I ended up with a misleading type label, which is arguably worse than no type declarations at all.
One function that was missing in a typeshed supposed to be complete was tokenize.generate_tokens. This may not be exactly a bug, because that function is not in the documented API, even though the python code is there and perfectly usable. So if you're using undocumented APIs you might find some false positives like this, which are hard to correct:
tokengen = tokenize.generate_tokens(self.readline)
pycodestyle.py:1695: error: "module" has no attribute "generate_tokens"
Something else that looks incomplete is the generation of error messages. Most of them are clear, but not all. At some point I got to (a more complicated version of):
z = x.strip() + y.strip()
Which produced the following error message:
error: Some element of union has no attribute "strip"
On one hand, it's not easy to know which of the calls to strip() is producing the problem. Also, "some element of union" implies (but it's not obvious the first time) that the expression involved (let's say x) may have one of many possible types. This probably means that we thought x was always a string, but there's some code path that could assign it something else, let's say a list. It would also be useful to know what this other type actually is in order to find the problem, but mypy does not provide this information (information that is actually available inside mypy). This may be a design trade-off: showing all the information may be overwhelming, so the hard question is deciding what's relevant.
Another part that looks a bit incomplete right now is the documentation. What's available now is actually quite nice as a tutorial, but there are gaps around some frequently used parts of the language. There is no indication of how to specify the types of variable-length argument lists or keyword arguments. For example, if I have a function product(*args) that multiplies all its integer arguments, my first intuition was to type it as product(*args: List[int]), but the correct way to do it is to indicate the type of each element, as the sketch below shows. A "reference" documentation would be nice to have in the future. Some important things that are missing from the documentation are present in PEP-484, so the PEP itself is worth a read.
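For reference, this is the shape PEP-484 specifies for star arguments; the product function here is my own reconstruction of the example above, not code from pycodestyle:

def product(*args: int) -> int:
    """Multiply all the integer arguments together."""
    result = 1
    for value in args:
        result *= value
    return result

product(2, 3, 4)    # fine: every positional argument is checked as an int
product(2, "3")     # rejected by mypy: "3" is not an int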
Another undocumented part of mypy is the included types. It's nice to know that a generic Generator type is available (for functions that return or take generator objects), but it has 3 generic arguments and, even looking at the code of the typing module, there's no clue about what each one is. My guess was that two of them are the type returned by next() and the type accepted by send(), but I wasn't sure about the third one, or about the actual order of the three.
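For what it's worth, the order is Generator[YieldType, SendType, ReturnType]; a toy example of mine (not from pycodestyle) to make it concrete:

from typing import Generator

def countdown(n: int) -> Generator[int, None, str]:
    # yields ints, accepts nothing through send(), and returns a str
    while n > 0:
        yield n
        n -= 1
    return "done"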
Some buggy behaviour
Other problems I found during my experiment were small bugs. I reported some of these to the team (when they weren't already reported) and the community seems quite nice and willing to improve, so this part of the post will hopefully get old very quickly. However, to make my description complete, I'll include my findings in this section:
Some of the provided typesheds don't seem to have the appropriate level of strictness. For example, the frozenset typeshed says that most operations should accept an AbstractSet, and this makes mypy complain about perfectly valid python like

ARITHMETIC_OP = frozenset(['**', '*', '/', '//', '+', '-'])
WS_OPTIONAL_OPERATORS = ARITHMETIC_OP.union(['^', '&', '|', '<<', '>>', '%'])
pycodestyle.py:99: error: Argument 1 to "union" of "frozenset" has incompatible type List[str]; expected AbstractSet[str]
The actual argument for frozenset.union should accept any iterable, not just a set. I think problems like this will be frequent, given that the python library is not very strictly specified and a lot of python code tends to be fuzzy about these questions (can I pass this function any kind of iterable, or must it actually be a list?). On the other hand, I think this is one of the places where widespread usage of mypy will be good, because it will add clarity and explicitness about these decisions.

If a function can return, say, an int, but may also return None, the return type should be written as Optional[int]. But I forgot about that several times, and even if I noticed my mistake later, mypy wasn't detecting it (even when it was obvious statically, i.e. the function had some return None statements). I understand that's a work in progress, and mypy 0.4.2 already has more work on Optional that I should try.

I had some issues where type variable aliases are not equivalent. For example, mypy offers a type variable called AnyStr that can be byte or unicode strings. As most python file handling functions accept both as path names, AnyStr may be useful for those situations. And actually, being able to say "this argument is a file path" is pretty cool for documenting purposes, so I added some code at the top saying

from typing import ..., AnyStr, ...
# ...
FilePath = AnyStr
# ...
def readlines(filename: FilePath) -> List[str]:
But mypy actually reported, on the function definition:

error: Invalid type "pycodestyle.FilePath"
So that kind of aliasing didn’t seem to work. If I change that line to the apparently equivalent:
def readlines(filename: AnyStr) -> List[str]:
mypy stops complaining. But actually I ended up going for a normal type alias FilePath = str, which loses some flexibility (it states that all paths must be unicode) and also fixes the issue.
I found an occasion where I had a variable that could have two different kinds of tuple (depending on the code path, which makes it a Union type in mypy terminology), and mypy was refusing to iterate on it. I couldn't replicate it at a small scale and ended up refactoring that code, so I'm not 100% sure whether I was doing something wrong myself, but the error message wasn't helping. It may have been another case of mypy reporting that it can not do something on a union type, where the union actually included some type that I didn't know about.

Another issue that looked like a bug (even if it isn't) was around the Callable generic type (a type used to describe callable objects like functions, methods and classes). To describe the type of a callable you need to include the types of the arguments and the result. For example, a function that accepts a string and an integer and returns a bool would have type Callable[[str, int], bool]. The problem is that if you add a default value for the int argument, the function can also have type Callable[[str], bool]. This looked like it was confusing mypy on this code:

if sys.version_info < (3,):
    stdin_get_value = sys.stdin.read
else:
    def stdin_get_value() -> str:
        return ...
mypy then complains with:
error: Incompatible redefinition (redefinition with type Callable[[], str], original type Callable[[int], str])
The original type seems to indicate that mypy can’t handle the default argument for read. Actually, it is assuming one of the possibilities. If you turn this condition around
if not sys.version_info < (3,):
    def stdin_get_value() -> str:
        return ...
else:
    stdin_get_value = sys.stdin.read
when it gets to the sys.stdin.read it is already trying to fit it into a known type, so it chooses the correct version. When there are many definitions of something, mypy takes the topmost as the "true" type, and any inconsistencies below are flagged as mistakes. I actually fixed that problem by adding an explicit annotation which tells mypy which version of the function to use:

if sys.version_info < (3,):
    stdin_get_value = sys.stdin.read  # type: Callable[[], str]
else:
    def stdin_get_value() -> str:
        return ...
The # type: X comments are the way to provide extra typing information to mypy besides the function annotations.
Some paper cuts
Many problems I found were tiny, and actually about mypy being pedantically correct, but in a way that didn’t feel like the tool was being helpful. Much of this section may also be taken as personal preferences. What I’d like a type checker to do is help me provide useful information while not getting in my way, but I found several small usability problems. Nothing that I list here is a show-stopper, but stuff that feels more like a “usability” issue:
As a first example, pycodestyle imports the OptionParser class from the optparse module, creates an instance, configures it, and parses the arguments, which optparse returns as a Values object. In the original code, pycodestyle doesn't care that this is a Values object, it just uses it. However, my version needs to explicitly say that some functions accept/return optparse.Values, and given that I need to "talk" about this type, I need to import it. It's a minor pollution of the namespace, but I seem to end up importing more stuff than the original, which I'd rather avoid (I'm concerned it may introduce possible circular imports, or force me to expose library internals).
There’s also some “visual” pollution in the code. Function definitions end up longer, and if following PEP8 they tend to break into multiple lines which look uglier. Also, there’s no defined style guide for breaking type annotations.
Something that adds to the "syntactic" issues is that there are two symbols in the syntax to say "has type": -> for results and : for parameters. On many occasions I used : instead of ->, which was a syntax error that confused the hell out of mypy (dozens of "inconsistent indentation" error messages for a single mistake). I know this is partly Python's fault, but it affects the experience.
Going more into the "preference" or personal "opinion" terrain, the union syntax feels bulky. Union[int, str] means "something that can be either an int or a str". The notation (especially with more complicated types inside) can be hard to read; I probably would have preferred something like int || str, or int + str. Same for tuples: you can say Tuple[int, str] for 2-element tuples, but I would have liked to write (int, str) instead.
The syntax for declaring types of attributes and locals has its quirks too. When python 3 was introduced, it only added annotations to function declarations, so the annotations for object fields and locals were specified as a special comment starting with type:. This is less practical in more complicated situations: you can specify variable types in initializations like foo, bar = 1, True with a comment # type: int, bool. But if you have some initialization foo = bar = None, and you want to give different types to foo and bar, you have to split the assignment; I found some examples of this in pycodestyle:
path = nrows = None
Which I had to turn into
path = None  # type: Optional[str]
nrows = 0
Functions that sometimes return None and sometimes return a value can not have a plain return; you have to explicitly say return None. But I may be "wrong" to complain, given that newer style standards for python recommend this (I found several instances in pycodestyle, which ironically doesn't follow its own style guideline here). Still, it didn't feel like I was improving anything while fixing those. As a side note, when referring to the type of None you use None instead of NoneType. I know the latter is usually less visible to python users, but for me the inconsistency looked weird each time I used it.
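To illustrate the first point, this is the shape mypy wants (a made-up function of mine, not from pycodestyle):

from typing import List, Optional

def find_marker(lines: List[str]) -> Optional[int]:
    for number, line in enumerate(lines):
        if line.startswith('#'):
            return number
    return None   # spelled out explicitly instead of a bare "return"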
There were also some usual python patterns (which I don't believe are against the style guide) that I was forced to break. One of them is chaining return super()... in functions with no return value, which is not allowed:
class P():
    def f(self) -> None:
        print("f")

class C(P):
    def f(self) -> None:
        print("G")
        return super().f()
foo.py: note: In member "f" of class "C":
foo.py:9: error: No return value expected
I understand why this happens (and I know that removing the return and just leaving the call to super() would fix it), but for me the return+super pattern is so ingrained in my finger memory, and makes the code more robust under refactors, that I don't feel comfortable having to "fix" this.
Having a type system with type variables (which allows you to say, for example, that List[T] has a method count that takes a variable of type T) is fantastic and adds a lot of flexibility, but the way in which those type variables are implemented and created as objects is not completely intuitive. For example, mypy provides an AnyStr type variable, which is a bit confusing because it's not a type. I understand why it's needed (for functions that operate both with strings and bytes but need some kind of argument and/or result type matching), but I'm not sure that type variables should be exported and shared; from a user standpoint they are hard to tell apart from types, and that may lead to surprises.
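To make the distinction concrete: AnyStr is a type variable constrained to str and bytes, so it links arguments and results together. A toy function of mine (not from pycodestyle):

from typing import AnyStr

def double(value: AnyStr) -> AnyStr:
    # the result has the same concrete type (str or bytes) as the argument
    return value + value

double("abc")     # mypy infers str
double(b"abc")    # mypy infers bytes
double(3)         # rejected by mypy: int is neither str nor bytes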
Another nice feature of mypy that is usually great but occasionally complicated is type inference. mypy is quite good at inferring the types of local variables. If your function has an x = 3, it assumes x is an int unless you tell it otherwise. However, it doesn't do this at all for function arguments. If you write:
def f(z: str, x=3) -> None: ...
Then it knows that z is a string, but thinks that x is Any (so you will be allowed to write x[0] in the function body, or call f("", "4")). For those functions where the default has the actual type (which happens quite often, though not always, because having a default of None is also quite common), having to explicitly state it again in the declaration ends up feeling verbose and redundant.
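In practice the defaulted argument ends up spelled twice, something like this sketch (continuing the toy f above):

def f(z: str, x: int = 3) -> None:
    # with the explicit int, mypy now rejects both x[0] and f("", "4")
    print(z, x + 1)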
For variables of container types, mypy has a harder time figuring out the type automatically when the container is initialized empty (which is a very frequent case). For example, when mypy finds code like

items = []

it won't be able to infer the type of the items variable, so it will show a warning asking you to explicitly type it. However, it is smart enough to figure out some common situations; for example, if you have something like
def f() -> None:
    items = []
    for x in range(-10, 10):
        items.append(abs(x))
    print(items)
mypy infers that the list is a List[int]
, because you’re appending int
values to it. That’s really
nice and it saves some work, but it doesn't seem to be consistently smart. Code which is quite similar, like
def method(self) -> None:
    self.items = []
    for x in range(-10, 10):
        self.items.append(abs(x))
    print(self.items)
makes it lose the ability to infer, forcing the user to enter a lot of details.
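The workaround I know of is to state the type explicitly at the assignment; a sketch assuming the same int items as above:

from typing import List

class Collector:
    def method(self) -> None:
        self.items = []  # type: List[int]
        for x in range(-10, 10):
            self.items.append(abs(x))
        print(self.items)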
Something else that needs some work on usability is how the tool responds (or doesn't) when the user makes a mistake (which is something that will happen). I already mentioned some examples of unclear error messages or reactions to syntax errors. On some occasions, I also forgot to add an annotation to a function that returned a value. When no annotation is added, the implicit return type is Any, which means that any return value is ok. Assuming that the function has no return value (None) would probably have made more sense when the function is partially annotated; otherwise some obvious mistakes might slip by.
Another problem of that kind (where I spent some time thinking I had found a mypy bug) happened around this code:
CheckResult = Tuple[Union[int, SourcePosition], str]
...
Check = Callable[..., Iterable[CheckResult]]
...
def register_check(check: Check, codes: CheckCodes = None) -> None:
    ...
...
def init_checks_registry() -> None:
    mod = inspect.getmodule(register_check)
    functions = cast(Iterable[Tuple[str, 'Checker']],
                     inspect.getmembers(mod, inspect.isfunction))
    for (name, function) in functions:
        register_check(function)
I added the cast because mypy has no idea what the inspect.getmembers
function is going to
return, so I provided the extra information. The last line was reported to have the following error:
error: Argument 1 to "register_check" has incompatible type "Checker"; expected Callable[..., Iterable[tuple(length 2)]]
Which sounded really silly to me because they seem to be the same... I ended up adding a # type: ignore flag to silence this, and didn't find my error until I reviewed this to write this post: I wrote Checker instead of Check (and Checker happens to be an existing, unrelated class). The mistake here is obviously mine, but a conclusion I draw is that even if type systems are supposed to prevent several mistakes and hard bugs that come from mistyped variable names, in this case it gave me the tool to create new errors, this time at the level of the type system instead of the code. One issue here is that type aliases are unnamed in error messages; if I had got the message has incompatible type "Checker"; expected "Check", the error would have been obvious.
Dealing with Python dynamic idioms
I encountered several occasions where, as I expected, a piece of code is a bit dynamic and, although obviously right to me, there's no way within the type system to describe what I'm doing without refactoring the code into something a bit more complex/ugly. This is probably the part of mypy that worries most Python fans.
The most idiomatic pattern in python that uses types flexibly is duck typing. If you mix types that are roughly compatible but mypy doesn’t know what is the common interface that you care about, it may reject code, even if you try to add explicit information. I found the following case:
valid_hangs = (4,) if indent_char != '\t' else (4, 8)
...
hanging_indent = hang in valid_hangs
It's very easy to see that valid_hangs is just used as a collection of ints, so you can easily check later whether hang (which is an int) is in that collection or not. However mypy sees that (4,) has type Tuple[int] and (4, 8) has type Tuple[int, int], so it infers that the "common type" of both for valid_hangs is object. Then it proceeds to say that hang in valid_hangs is mistyped because an object isn't a container. So, I tried to give mypy a hint:
valid_hangs = (4,) if indent_char != '\t' else (4, 8)  # type: Tuple[int, ...]
However this produces an error because mypy now knows the type that I explicitly gave to
valid_hangs
, but it’s still convinced that the expression produces an object
, which is
a type error in the assignment. In the end, I had to force it like:
valid_hangs = cast(Tuple[int, ...], (4,) if indent_char != '\t' else (4, 8))
The cast function provided in the typing module doesn't do anything at runtime, but it's
the code a bit just to placate the type checker.
Having mixed types happens more often than is apparent, particularly around conditional statements. Python has a bool type, but any type of value can be used as a condition in if or while statements. As a result, many functions in my sample code base that return something meant to be used as a condition do not actually return a bool but something else. So I ended up declaring:
noqa = re.compile(...).search  # type: Callable[[str], object]
Using object is a reasonable way in python to say "something that can be used as a bool", but it doesn't add clarity about how this function is going to be used. It might be useful to be able to say "something that can/will be used as a bool"; there's nothing like that yet.
As a generalization of the above, it's not possible to talk about a protocol instead of a specific type. There are some standard protocols defined (like Sequence or Iterator), but there's no way to describe something more specific (like "something with a read function returning str") or something related to your own classes.
The short circuit evaluation in boolean operators produces union types that tend to confuse the checker in the long run. Take a look at this:
match = noqa and COMPARE_SINGLETON_REGEX.search(logical_line)
if match:
    ...
    match.group(...)
mypy infers the type of match to be Union[bool, re.Match]. So when, two lines below, the code tries to read the group attribute, mypy complains that the attribute might not be present if the value is a boolean. Actually the if should guarantee that it isn't, but realizing that is beyond the power of mypy (at least the version I used). So I had to refactor the code into an uglier form to avoid the error:
if noqa:
    return
match = COMPARE_SINGLETON_REGEX.search(logical_line)
if match:
    ...
    match.group(...)
Another python pattern that mypy doesn't understand is dynamically checking for attributes. When faced with the following code, where line is a str:
if hasattr(line, 'decode'):   # Python 2
    ...
    length = len(line.decode('utf-8'))
You get a message saying error: "str" has no attribute "decode"; maybe "encode"?. For a human reader, it's quite obvious that there is a decode attribute in line (less obvious that it's being correctly used). So in the end you have to silence this line with # type: ignore. You may argue that this is a quite non-standard way of checking python versions (PEP-484 recommends that checkers recognize the common ways to check for version differences), but I can imagine similar scenarios for other situations unrelated to Python 3 vs 2.
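The version-check shape that PEP-484 does bless looks roughly like this; the text_length function is a made-up illustration of the convention, not pycodestyle's code:

import sys

if sys.version_info[0] < 3:     # the comparison style checkers are asked to recognize
    def text_length(line):
        # type: (str) -> int
        return len(line.decode('utf-8'))   # only analyzed when checking as Python 2
else:
    def text_length(line):
        # type: (str) -> int
        return len(line)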
Another problem with conditionals is when multiple definitions of something (like a variable or even a function) are done in different paths. mypy actually handles quite well the usual case which is when multiple definitions have the same type. But in pycodestyle I found the following function which is conditionally redefined depending on a setting; and there is something special about it:
def _is_eol_token(token: Token) -> bool:
    return ...

if COMMENT_WITH_NL:
    def _is_eol_token(token: Token, _eol_token=_is_eol_token) -> bool:
        return _eol_token(token) or ...
error: All conditional function variants must have identical signatures
The actual problem here is that the author used the "capture a value in a default argument" trick just to be able to call the "old" version of the function from the new one, but that means that there is no possible type signature for _is_eol_token that is valid. I've seen this kind of trick with default arguments before (i.e. using them to store stuff that isn't actually an argument), and it will probably be an issue. For this situation, what I ended up doing was to add the extra argument to the original definition as well, and type it as Any:
def _is_eol_token(token: Token, _eol_token: Any = None) -> bool:
    ...

if COMMENT_WITH_NL:
    def _is_eol_token(token: Token, _eol_token: Any = _is_eol_token) -> bool:
        ...
Actually, any kind of function that behaves differently (in terms of accepted/returned types) depending on argument count or values is hard to specify. Type variables may help here, but there's a limit to the expressiveness of the type language. However, I didn't come across any of these problems in pycodestyle. mypy apparently has some support for this and uses it in the standard library, but it's not clear how usable the mechanism is for other users (the documentation says that "this feature is not supported for user code, at least not yet").
Not all the problems here happen on highly dynamic code. Tricks like throwaway variables in tuple unpacking like this:
prev_type, prev_text, __, prev_end, __ = tokens[0]
produce an error, because mypy tries to type __ according to the elements of the tuple, but the 3rd and 5th elements have different types in this case. It shouldn't matter, because the placeholder variable __ is never used, but mypy complains anyway.
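One workaround I can think of (a sketch, not what I did in my pycodestyle branch) is to give each throwaway slot its own name, so mypy never has to unify two different types under one variable:

# same tokens list as above; each placeholder gets its own name and its own inferred type
prev_type, prev_text, _start, prev_end, _line = tokens[0]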
Another not-so-dynamic behaviour that seems to be prohibited is accessing some methods directly. I got an error: Cannot access "__init__" directly on:
elif inspect.isclass(check):
    if _get_parameters(check.__init__)[:2] == ['self', 'tree']:
which seems a bit restrictive, and all I can do with this message is ignore the line.
Something that is dynamic but that I didn't think would be a problem was having attributes defined in methods other than __init__. Pycodestyle has many classes with attributes like that. It was possible to add some # type: xxx declarations at the points where they were defined, but that wasn't very easy to follow. I ended up solving it by forcing a declaration in __init__. It seems to be hard to specify the type of attributes without changing the semantics in that case. I was a bit surprised that the type wasn't inferred, given that the method that actually assigned an initial value had an obvious type for them. For example, Checker.line_number is defined in the check_all method as:
self.line_number = 0
but that isn’t enough, so I had to add that assignment at __init__
to make the inference work.
Another situation with a similar problem was the Checker.indent_char attribute; even though all assignments to that variable give it a str or None, mypy requests an explicit type declaration in the middle of a method that uses it. I think it picks the first line in the code (top to bottom) that looks like an assignment, even if that's not the best place to define the type.
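What I ended up writing was, roughly, an explicit initialization plus a type comment in __init__; a simplified sketch (the real __init__ takes more arguments):

from typing import Optional

class Checker(object):
    def __init__(self) -> None:
        self.line_number = 0       # mypy infers int from the initial value
        self.indent_char = None    # type: Optional[str]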
Another usual python pattern I found problems with is accepting arbitrary **kwargs keyword arguments and then passing them through to another API. This allows exposing parts of a third party API instead of having to reimplement it (and follow it when it changes). However, that flexible typing is hard to do with mypy, because there is no way to say "this function accepts the same keyword arguments as this other 3rd party API". Even if you can end up typing keyword arguments as Any, the problem is also compounded by a buggy behaviour in cases like the following one (this is an artificial example based on a quite more complicated case I found in pycodestyle, which was too long to include):
def f(*, a: int = 1, b: str = ""):
    print(a + len(b))

def g(**kwargs: Any) -> None:
    f(**kwargs)
Which makes mypy complain about Too many positional arguments for "f". This problem appears whenever there are keyword-only arguments (the most usual case is when having both variable length arguments and keyword arguments together).
Things that mypy helped me fix or improve
Until now we've seen many cases of mypy complaining about stuff I didn't think was a real problem. Now we'll start seeing the opposite side of that coin: things that mypy asked me to change where the end result was better than the original. These are the kind of promises that static typing proponents always make, and I was happy to see some of them being delivered.
I wrote the missing typeshed entry for tokenize.generate_tokens, and I declared the return type to be an iterator over tokens. However, the function is implemented as a generator, and pycodestyle tries to use it as such in this line:

COMMENT_WITH_NL = tokenize.generate_tokens(['#\n'].pop).send(None)[1] == '#\n'
The send(None) is a strange idiom, and the next(g) builtin is available even in python 2.6, so I replaced this with:

COMMENT_WITH_NL = next(tokenize.generate_tokens(['#\n'].pop))[1] == '#\n'
And I think the end result is more readable, and depends less on how generate_tokens is implemented. Here the type checker found that the code was relying on something that could be argued to be an implementation detail, which is a sin I have often seen in python, even when there are more elegant solutions.

pycodestyle operates a lot on tokens. Tokens are stored as a heterogeneous 5-element tuple that includes information on the token text, kind and position. As a tuple it's immutable, so there's a piece of code that tries to create a new artificial token based on a transformation of an existing one (in a variable token), and put it into some list called self.tokens. The original code did this:

token = list(token)
token[1] = text.rstrip('\r\n')
token[3] = (token[2][0], token[2][1] + len(token[1]))
self.tokens = [tuple(token)]
mypy was quite confused about this. On one hand, the token variable changed types (it enters as a token, but ends up being a heterogeneous list which is hard to describe). I tried to fix this by adding a variable called new_token and using that, to avoid overloading the semantics of the token variable, but there still were problems with the mixed types inside the list. Finally, I rewrote it like this:

toktype, string, start, end, line = token
string = text.rstrip('\r\n')
end = (start[0], start[1] + len(string))
self.tokens = [(toktype, string, start, end, line)]
I was about to put this example into the "mypy doesn't get dynamic code" bin, but then I saw the 2 pieces of code side by side and the new one is much more readable. There are names instead of indices, and it's much clearer what's happening, how the new end line/column is being calculated, and how the new token is rebuilt. So again, forcing me to not use dynamism actually ended up producing better and more readable code.
pycodestyle has a different checking function for each kind of style error. They return an iterator over errors, and each error is a pair with a message and a source position. The positions are sometimes just an int (the line number), other times a tuple of 2 int values (line and column). On my first pass I saw a few functions that all returned the line number only, and assumed that was true for all of them. This is what I wrote:

CheckResult = Tuple[int, str]

def some_checker(...) -> Iterable[CheckResult]:
When I typed it that way, the checker immediately told me that there was something wrong. I can imagine myself working on this project without the type information and making a refactor based on my incorrect assumption, and I'm quite sure it would have meant some painful debugging. Having my assumptions checked makes that much easier. This is how the declarations ended up looking:
SourcePosition = Tuple[int, int]
CheckResult = Tuple[Union[int, SourcePosition], str]

def some_checker(...) -> Iterable[CheckResult]:
I also noted that some checks directly return a CheckResult instead of an iterable, which needs some ad-hoc handling elsewhere. I didn't refactor this, but the typing information makes the inconsistency more obvious, so it's good to have some help detecting potential improvements to the codebase.

Having to add types to arguments helped me notice several unused arguments to functions that could just be removed, simplifying code for free. Probably they were left around in some refactor. This was more a benefit of "going through the code and adding type info" than of the type checking itself.
While adding type info I found a function called continued_indentation that had a variable called indent_chances initialized as an empty dict. Mypy asked me to specify the type for that dict, and this led me to realize that doing so was quite hard, because the code was very complicated. The keys of the dict seemed to be numbers (representing source code columns), but the values sometimes were True, other times a string, and other times the str type itself. This in turn makes the code that uses the contents of this dictionary quite complicated, but given that the values are so heterogeneous, the only thing it can do are checks with is or in (i.e. value is not True, value in (some_text, str), etc.). In the end just typing the dictionary as Dict[int, object] worked perfectly. This is a case where the dynamic nature brought no problem, but at the same time the type system makes more visible that something strange is going on (my guess is that this code could be made simpler).
The missing_whitespace_around_operator function also had an unusually typed local called need_space. It could be True, False, None, or a pair of ints, depending on which branch of a deep set of nested ifs the code went through. I ended up refactoring this function a bit and actually was happier with the end result. In the end I turned need_space into a bool, added a needed_at which was a source position (the tuple of int values that was used only in some branches), and a space_is_optional variable that was used deep inside one of the ifs (the original code used None in this case). The function was a bit too big to include here, but you can see the changes in my pycodestyle repo. Making this change also allowed me to find further simplifications of the code.
The normalize_paths function ended up declared as:

def normalize_paths(value: Union[List[FilePath], str],
                    parent: FilePath = os.curdir) -> List[FilePath]:
    """Parse a comma-separated list of paths. Return a list of absolute paths."""
    if not value:
        return []
    if isinstance(value, list):
        return value
    paths = []
    for path in value.split(','):
The type of the value argument looks unusual, but it only highlights that the implementation is a bit "un-pythonic" (using isinstance checks to take different code paths tends to be a smell). Again, here the typing only highlights that there is something that could be cleaner.

I also detected, thanks to mypy, several functions that appeared to return booleans but in fact could return a mix of types (with the correct meaning of truthiness). For example:
def is_string_literal(line: str) -> bool:
    if line[0] in 'uUbB':
        line = line[1:]
    if line and line[0] in 'rR':
        line = line[1:]
    return line and line[0] == '"' or line[0] == "'"
which mypy detects as a problem, given that this can also return "". That may not be an actual problem, but I like having it explicitly detected, because in many situations having "" is not the same as having False (for example, if you're passing this to a JSON serializer, or comparing booleans with ==). I fixed this by adding an explicit call to bool(), and I actually like being sure that this can only return True or False.

The same happened with some object fields, where I think the problem may be larger (because the not-exactly-boolean can propagate to unexpected places). Lines like

self.noqa = comments and noqa(''.join(comments))

where noqa had been declared as bool were detected as an error, and helped me add the explicit call to bool() around the expression, which is a good idea.
The BaseReport.elapsed field, measuring time, is initialized to 0, and mypy inferred it was an int. However, it then ends up being used for floats (because it is based on time.time(), which returns floats). I found it nice that mypy told me about this; I changed the initialization to 0.0, and now when reading the code I think it's more obvious what to expect from this variable.
The StyleGuide class had the following initialization method:

def __init__(self, *args, **kwargs):
    # build options from the command line
    self.checker_class = kwargs.pop('checker_class', Checker)
    parse_argv = kwargs.pop('parse_argv', False)
    config_file = kwargs.pop('config_file', False)
    parser = kwargs.pop('parser', None)
    ...
When I started adding explicit types for each of the extracted kwargs, this actually helped me notice that this way of extracting the arguments is redundant. I ended up rewriting it as:
def __init__(self, *args: object,
             checker_class: Callable[..., 'Checker'] = Checker,
             parse_argv: bool = False,
             config_file: FilePath = None,
             parser: OptionParser = None,
             **kwargs: object) -> None:
    self.checker_class = checker_class
Which actually provides a clearer interface, both in the names and in the types of the accepted arguments.
There is a StyleGuide.input_dir method that traverses a tree checking files. When checking its return type I noted that it has an early return of 0 when the directory being checked has been listed as "excluded" in the command line options, and otherwise does all the checking with no further return statements (i.e., implicitly returning None at the end). So, does it return an Optional[int]? Given that this looked unusual, I decided to take a closer look and eventually found that nobody uses that return value, so I ended up removing it.

An auxiliary method in the code reads:
def _parse_multi_options(options, split_token=','):
    r"""Split and strip and discard empties.

    Turns the following: A, B, into ["A", "B"]
    """
    if options:
        return [o.strip() for o in options.split(split_token) if o.strip()]
    else:
        return options
This function may return a list of strings, or an empty string. That actually looks curious, and it forced me to check the calling site and see that the reasonable thing to return is an empty list. For the use it has, it's the same (iterating on an empty string or an empty list does nothing in either case), but it's easier to understand the method that way, and then the whole if can actually be removed, making it simpler.
Using mypy to understand the codebase
A large part of the experience for me was understanding what each function was doing in order to figure out the correct types. The interesting thing about adding type annotations is that a fairly large part of understanding a new piece of code is getting an idea of what its inputs and outputs are, and a large part of that answer is knowing the types. Sometimes figuring out the type is quite obvious from local information or from conventions, but other times you need to trace a long chain of calls before being able to do it. The feeling after adding each type annotation (especially the difficult ones) was that not only had I understood more about the code, but the code had become more understandable for the next person coming in and reading it with the annotations.
In the case of pycodestyle, the types of the arguments of the checker functions were one of the hardest parts. Each of those functions takes a different set of arguments (they don't have a uniform call API), but the code does not have explicit calls to each. Instead, they are collected through introspection: a piece of code inspects the argument names and pulls the values for each check argument from equally named attributes of a Checker instance. So the flow of information is much less explicit than a direct function call. In that case, learning how each name mapped to a type was a great aid to understanding how the different checkers fit together.
Another question I had to ask myself in many places was whether a given argument/return value was actually a list, or a sequence, or an iterable. In many cases any of those looked valid locally, but when I tried one, mypy complained that I was making too few or too many assumptions. A bit of trial and error aided by the error messages actually allowed me to get it right easily, and to have clearer expectations about what a function will provide me and what it accepts. I believe a lot of clarity was added here, and this was a positive surprise. For example, I know that most checks return an Iterable[CheckResult]. Even if a few return a list, I now know for sure that trying to index the result of a check function is a bad idea.
One large piece of the puzzle was the type of this global:
_checks = {'physical_line': {}, 'logical_line': {}, 'tree': {}}
Many functions work with it, with many levels of indexing, so after typing it I was quite happy with the result:
Check = Callable[..., Iterable[CheckResult]]
CheckCodes = List[str]
CheckArgs = Sequence[str]
CheckKind = str
CheckSet = Dict[Check, Tuple[CheckCodes, CheckArgs]]

_checks = {'physical_line': {}, 'logical_line': {},
           'tree': {}}  # type: Dict[CheckKind, CheckSet]
The auxiliary definitions made it much easier to understand and describe not only this structure, but also the many functions that operate on it or on parts of it.
Of course not everything is made completely clear; functions like parse_udiff
(source)
manipulate complex structures and even with annotations it’s a bit hard to follow, but having the
added information makes it somewhat better.
The type checker itself can be used as a tool to figure out information about types in an unknown code base: you can declare something as object, and if mypy finds an error (for example, trying to apply len to it) you know some more information (it's probably some kind of sequence), and you can iterate and refine. I used this idea a lot, although sometimes it's not enough.

There is, for example, in pycodestyle a StyleGuide.input_file() method which is supposed to check a single file and takes an optional expected argument. It took me a while to trace when this is called, because it is never called directly but aliased into a self.runner = self.input_file attribute, which is also never called directly but aliased as a local variable runner in the check_files method. I was looking at the calls there to figure out the type of the expected argument; however, runner is never called with an expected argument, so I had no information. The argument seems to be available only for users using pycodestyle as an API. So what I had to do was to dig down: the expected argument is passed down to Checker.check_all, which in turn passes it down to BaseReport.init_file, which stores it in the BaseReport.expected attribute, and the BaseReport.error() method reads it in a line which says if code in self.expected, so finally we know that this is some kind of container of error codes. In this case I had to do all the analysis myself, and as you can see, this can mean diving through a lot of code.
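A minimal sketch of that declare-as-object-and-refine loop, using a hypothetical function (not pycodestyle's actual error method):

from typing import Container

def uses_expected(expected: object) -> None:
    print("E501" in expected)    # mypy rejects this: "object" does not support "in"

# ...so the value must be some kind of container; refine and run mypy again:
def uses_expected_refined(expected: Container[str]) -> None:
    print("E501" in expected)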
Other random thoughts
Adding some custom type aliases made several declarations more readable. It would probably make more sense if the libraries provided them; otherwise you're stuck with what you get from the stdlib. But saying "this argument is a FilePath" is much more informative than "this is a str", even if one is just an alias of the other. It would be interesting if the core team decided to provide conventions for those.
In general, as this is something new, there are a lot of missing naming and styling conventions that I think will be developed. What are appropriate names for type variables? How should lines be split in long type declarations to follow the style conventions? How should # type: xxx comments be inserted in long lines? Should I put auxiliary type definitions near their use or at the top of the module?
It would have been nice to have some kind of automated assistance during the "transformation" process. I feel that a lot of the work that I did could have been "assisted" by mypy or some related tool, given that it can already infer a lot.
I already knew when I started that the Python language has some aspects that would be hard for a type checker, but I didn't foresee that booleans would be one of them. The semantics of the and and or operators, and the ability to use anything as a truth value, tend to create mixed-type scenarios quite easily.
I made some mistakes during the process that I expected to be caught by mypy, but they weren't. I initially mixed up many type declarations related to StyleGuide.reporter (which is a class) and StyleGuide.report (which is an instance of that class), and never got an error related to it (although seeing the type annotations led me to realize they were two different things).
Summarizing
After all the changes here (and a lot more routine updates) I had a fully typed version of pycodestyle. Even though typing is optional in mypy, I covered all arguments and return values (with only one use of Any, which I described above in _is_eol_token). The program still runs and passes all tests, and mypy accepts it with no errors (I'm using a modified version of the stdlib typesheds to remove the bugs in the typesheds for frozenset). It took me roughly a full day of work (~10 hs split across 3 different days). I found several problems, but I also got several insights about the code by using mypy.
In the next post I'll give a shorter summary of my findings and conclusions, including the answers to:
- Does it help me discover actual bugs?
- Does the process of adding types help make the code base more understandable?
- How mature is mypy itself? Is it usable right now, does it have a lot of bugs?
- Is the type system flexible enough to express the kind of dynamic tricks that real developers like us use in actual, production python code?
- Does it feel practical/usable?
- What other things that I didn’t expect can be learned from the experience?
See you next time!
References
- My version of pycodestyle with type annotations can be found on github. That is the exact version described in this post.