The next big feature for coverage.py is what I informally call "Who Tests What." People want a way to know more than just what lines were covered by the tests, but also, which tests covered which lines.

This idea/request is not new: it was first suggested over four years ago as issue 170, and two other issues (#185 and #311) have been closed as duplicates. It's a big job, but people keep asking for it, so maybe it's time.

There are a number of challenges. I'll explain them here, and lay out some options and questions. If you have opinions, answers, or energy to help, get in touch.

First, it's important to understand that coverage.py works in two main phases, with an optional phase in the middle:

The first phase is measurement, where your test suite runs. Coverage.py notes which code was executed, and collects that information in memory. At the end of the run, that data is written to a file.
If you are combining data from a number of test runs, perhaps for multiple versions of Python, then there's an optional combination phase. Multiple coverage data files are combined into one data file.
The reporting phase is where your project is analyzed to understand what code could have run, and the data files are read to understand what code was run. The difference between the two is the code that did not run. That information is reported in some useful way: HTML, XML, or textually.

OK, let's talk about what has to be done...

Measurement

The measurement phase has to collect and record the data about what ran.

What is Who?

At the heart of "Who Tests What" is the Who. Usually people want to know what tests run each line of code, so during measurement we need to figure out what test is being run.

I can see two ways to identify the test being run: either coverage.py figures it out by examining function names being run for "test_*" patterns, or the test runner tells coverage.py when each test starts.

But I think the fully general way to approach Who Tests What is to not assume that Who means "which test." There are other uses for this feature, so instead of hard-coding it to "test", I'm thinking in terms of the more general concept of "context." Often, the context would be "the current test," but maybe you're only interested in "Python version", or "subsystem," or "unit/integration/load."

So the question is, how to know when contexts begin and end? Clearly with this general an idea, coverage.py can't know. Coverage.py already has a plugin mechanism, so it seems like we should allow a plugin to determine the boundaries of contexts. Coverage.py can provide a plugin implementation that suffices for most people.

A context will be a string, and each different context will have its own collected coverage data. In the discussion on issue 170, you can see people suggesting that we collect an entire stack trace for each line executed. This seems to me to be enormously more bulky to collect, more difficult to make use of, and ultimately not as flexible as simply noting a string context.

There might be interesting things you can glean from that compendium of stack traces. I'd like to hear from you if you have ideas of things to do with stack traces that you can't do with contexts.

Another minor point: what should be done with code executed before any context is established? I guess an None context would be good enough.

Storing data

Having multiple contexts will multiply the amount of data to be stored. It's hard to guess how much more, since that will depend on how overlapping your contexts are. My crude first guess is that large projects would have roughly C/4 times more data, where C is the number of contexts. If you have 500 tests in your test suite, you might need to manage 100 to 200 times more data, which could be a real problem.

Recording the data on disk isn't a show-stopper, but keeping the data in memory might be. Today coverage.py keeps everything in memory until the end of the process, then writes it all to disk. Q: Will we need something more sophisticated? Can we punt on that problem until later?

The data in memory is something like a dictionary of ints. There are much more compact ways to record line numbers. Is it worth it? Recording pairs of line numbers (for branch coverage) is more complicated to compact (see Speeding up coverage data storage for one experiment on this). Eventually, we might get to counting the number of times a line is executed, rather than just a yes/no, which again would complicate things. Q: Is it important to try to conserve memory?

Today, the .coverage data files are basically JSON. This much data might need a different format. Q: Is it time for a SQLite data file?

Combining

The combine command won't change much, other than properly dealing with the context information that will now be in the data files.

But thinking about combining adds another need for the measurement phase: when running tests, you should be able to specify a context that applies to the entire run. For example, you run your test suite twice, once on Python 2, and again on Python 3. The first run should record that it was a "python2" context, and the second, "python3". Then when the files are combined, they will have the correct context recorded.

This also points up the need for context labels that can indicate nesting, so that we can record that lines were run under Python 2 and also note the test names that ran them. Contexts might look like "python2.test_function_one", for example.

Reporting

Reporting is where things get really murky. If I have a test suite with 500 tests, how do I display all the information about those 500 tests? I can't create an HTML report where each line of code is annotated with the names of all the tests that ran it. It's too bulky to write, and far too cluttered to read.

Partly the problem here is that I don't know how people will want to use the data. When someone says, "I want to know which tests covered which lines," are they going to start from a line of code, and want to see which tests ran it? Or will they start from a test, and want to see what lines it ran? Q: How would you use the data?

One possibility is a new command, the opposite of "coverage combine": it would take a large data file, and subset it to write a smaller data file. You could specify a pattern of contexts to include in the output. This would let you slice and dice arbitrarily, and then you can report as you want from the resulting segmented data file. Q: Would this be too clumsy?

Perhaps the thing to do is to provide a SQLite interface. A new "report" would produce a SQLite database with a specified schema. You can then write queries against that database to your heart's content. Q: Is that too low-level? Will it be possible to write a useful report from it?

What's already been done

I started hacking on this context idea a year ago. Coverage.py currently has some support for it. The measurement support is there, and data is collected in memory. I did it to test whether the plugin idea would be fast enough, and it seems to be. If you are interested to see it, search for "wtw" in the code.

The data is not written out to a .coverage data file, and there is zero support for combining, segmenting, or reporting on context data.

How you can help

I'm interested to hear about how you would use this feature. I'm interested to hear ideas for implementation. If you want to help, let me know.

Ned Batchelder: Who Tests What

Measurement

What is Who?

Storing data

Combining

Reporting

What's already been done

How you can help

Trending Articles

Practice Sheet of Right form of verbs for HSC Students

Download: FK ft Shenky – Nakuyewa ”Prod by: Shenky”

How to win at Markstrat (Markstrat Tips and Tricks) – Vodites

Ominde Commission Report and Recommendations – Ominde Report of 1964

Bureau of Internal Revenue: Regional Offices (Directory)

GO 53 on Enhancement of Ex-gratia upto 5 Lakhs Toddy Tappers in Telangana

Cakewalk CA-2A Leveling Amplifier v2.0.1.97 WiN, v2.0.1.96 OSX Incl Keygen

Mp3 Download: Mdu - Kunjenjenjena

How the kill the job , when DTP request running for long hours.

Microsoft Intune から展開しているアプリのアップデートについて

18-year-old girl was beaten for half an hour by two Northampton men in 'an...

Car crash in Dunton Bassett leaves driver in critical condition

Macky 2, Two Others In Road Accident

Application log 00000000000000089514: Could not convert queue DLVST90CLNT

Detroit mafia: D’Anna Brothers agree to plea deal

Delivery block field greyed out using VA02

Muloraki Au

【個人撮影】スマホのプライベート映像♪「中に出さないで///」カラオケ屋での生ハメ撮りが流出ｗ【リベンジポルノ】＠PornHub

BREAKING NEWS: Diamond Platnumz Is Reported Dead After Ghastly Car Accident

FIAT 500 B0111 B0112