David MacIver: Looking into doing a PhD

As regular readers of this blog have probably figured out, I’m a researchy sort of person.

A lot of my hobbies – maths, voting theory, weird corners of programming, etc – are research oriented, and most of my work has had some sort of research slant to it.

For the last two years I’ve basically been engaged in a research project working on Hypothesis. It’s come quite far in that time, and I feel reasonably comfortable saying that it’s the best open source property-based testing library on most metrics you’d care to choose. It has a number of novel features and implementation details that advance the state of the art.

It’s been pretty great working on Hypothesis like this, but it’s also been incredibly frustrating.

The big problem is that I do not have an academic background. I have a masters in mathematics (more technically I have a BA, an MA, and a CASM. Cambridge is weird. It’s entirely equivalent to a masters in mathematics though), but that’s where I stopped. Although it says “DR” in my online handle and the domain of this blog, those are just my initials and not my qualification.

As a result, I have little to no formal training or experience in doing academic research, and a similarly low understanding of who’s who and what’s what within the relevant fields. So I’ve been reading papers and trying to figure out the right people to talk to all on my own, and while it’s gone OK it’s still felt like fumbling around in the dark.

Which leads to the obvious solution that I spoilered in the title: If the problem is that I’m trying to do research outside of an academic context, the solution is to do research in an academic context.

So I’d like to do a PhD that is either about Hypothesis, or about something close enough to Hypothesis that each can benefit from the other.

There’s probably enough novel work in Hypothesis already that I could “just” clean it up, factor it out, and turn it into a PhD thesis as it is, but I’m not really expecting to do that (though I’d like that to be part of it). There are a number of additional directions that I think it would be worth exploring, and I expect most PhD funding will come with a focus subject attached which I would be happy to adapt to (a lot of the most interesting innovations in Hypothesis came because some external factor forced me to think about things in ways I wouldn’t otherwise have!).

In the absence of further factors, here are some of the directions for Hypothesis that I think it would be interesting to research further:

I have some current prototype work that really pares down Hypothesis to a single core testing primitive on which everything is built. That’s already the case to some degree, but the current primitive is rather messy and the new one is really much more elegant (it takes the core Hypothesis engine back to its eXplode origins and then rebuilds a lot of nice abstractions on top of that). I think this will work really well and it opens up a lot of possibilities for other novel abstractions built on top of it.
I’d like to pursue better grammar based generation on top of Hypothesis – e.g. I’d like to make it easy to define and use some sort of Boltzmann Sampler. This would significantly enlarge the set of things you can test with it, and would make it easy to build fuzzers for a wide variety of protocols while getting a lot of the benefits of Hypothesis (mostly example shrinking, but also any of the other improvements on this list) for free.
Conversely, I’d like to use some sort of grammar inference in its backend. At its core Hypothesis is a tool for transforming byte streams into structured data. Being able to infer a grammar for the underlying byte stream would help in a number of ways; most notably, it would significantly improve the assumption functionality. I’ve experimented with this in the past and not found e.g. L* search to work very well for this, but I’ve since read some more research that suggests that it might be practical.
I’d like to figure out how to integrate coverage based information into Hypothesis. I’ve done experiments in the past and I’ve produced some prototypes that work pretty well if you run them for long enough, but were not very useful because of the time constraints on Hypothesis running as part of a normal test suite. I’d like to see if I can improve on that.
I’m interested in using spies as a way of adding lightweight, concolic-testing-like features to Hypothesis. Possibly related is my Schroedinteger prototype, where values are kept in a suspension of one of a number of possibilities for as long as possible.
I’m interested in exploring parallel testing using Hypothesis. Historically this hasn’t been a priority because Python, but in principle this sort of testing is embarrassingly parallel, so it’s a shame not to take advantage of that.
I’m also interested in proving the claim that Hypothesis is an extremely portable set of testing primitives that are easy to implement in other languages. This probably isn’t research-worthy in and of itself, but it opens up a lot of potential other applications.
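
To make the "byte streams into structured data" idea from the list above a bit more concrete, here is a minimal sketch in plain Python. It is not Hypothesis's actual implementation or API: the names ByteSource, draw_byte, draw_int and draw_list are all hypothetical, and the encoding (two bytes per integer, an odd byte meaning "keep going") is an assumption chosen purely for illustration.

```python
class ByteSource:
    """Hands out bytes from a fixed buffer (hypothetical stand-in for an engine's byte stream)."""

    def __init__(self, data: bytes):
        self.data = data
        self.index = 0

    def draw_byte(self) -> int:
        # Running off the end yields zeros, so every buffer decodes to some value.
        if self.index >= len(self.data):
            return 0
        value = self.data[self.index]
        self.index += 1
        return value


def draw_int(source: ByteSource) -> int:
    """Decode the next two bytes as an integer in the range 0..65535."""
    return (source.draw_byte() << 8) | source.draw_byte()


def draw_list(source: ByteSource) -> list:
    """Keep drawing integers for as long as the 'continue' byte is odd."""
    result = []
    while source.draw_byte() & 1:
        result.append(draw_int(source))
    return result


if __name__ == "__main__":
    print(draw_list(ByteSource(bytes([1, 0, 5, 1, 1, 0, 0]))))
    print(draw_list(ByteSource(bytes([1, 0, 5, 0]))))
    print(draw_list(ByteSource(b"")))
```

In this toy encoding, the same bytes always decode to the same value, and a shorter or smaller buffer decodes to a simpler one (the empty buffer gives the empty list). That is one plausible reason a byte-level core is attractive: anything built on top of it could get simplification of examples by simplifying the bytes, without each generator needing its own logic for that.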

I’m not particularly wed to any of these. They’re all things that I think would be both interestingly novel and would improve users’ lives, but one of the nice things about having so many ideas is that I can’t do all of them anyway, so I have to prioritise. Given that, adding more to the priority queue is no hardship at all!

Which, finally, brings me to the main point of the post: What I want from you.

I’m already looking into and approaching potential universities and interesting researchers there who might be good supervisors or able to recommend people who are. I’ve been in touch with a couple (some of whom might be reading this post. Hi), but I would also massively appreciate suggestions and introductions.

So, if you work in relevant areas, or know of people who do and whom you think it would be useful for me to talk to, please drop me an email at david@drmaciver.com. Or just leave a comment on this blog post, tweet at me, etc.
