Saturday, July 9, 2011

Behind the scenes of py.test's new assertion rewriting

py.test 2.1 was just released. py.test, which uses the Python assert statement to check test conditions, has long had support for displaying intermediate values in subexpressions of a failing assert statement. This feature is called assertion introspection. Historically, py.test performed assertion introspection by reinterpreting failed assertions in order to glean information about subexpressions. In assertion reinterpreting, py.test actually reruns the assertion noting intermediate values during interpretation. This works pretty well but is subject to several problems, most importantly that assert statements with side-effects can produce strange results because they are evaluated twice on failure. py.test 2.1's main new feature, which I wrote (with generous sponsorship from Merlinux GmbH), is a new assertion introspection technique called assertion rewriting. Assertion rewriting modifies the AST of test modules to produce subexpression information when assertions fail. This blog post will give a peek into how this is done and what the rewritten tests look like.

py.test tries to rewrite every module that it collects as a test module. Assertion rewriting uses a PEP 302 import hook to capture test modules for rewriting. I'm happy to report doing this was easier than I expected. Most of the code in the import hook I had to write was dealing with detecting test modules rather than supporting import's extremely complicated API. Rewriting has a non-zero cost during test collection, so py.test compiles rewritten modules to bytecode and caches them in the PEP 3147 PYC repository, __pycache__. One major thing I did have to account for was the possibility that multiple py.test processes would be writing PYC files. (This is a very real possibility when the xdist plugin is being used. Therefore, py.test uses only atomic operations on the rewritten PYC file. Windows, lacking atomic rename, was a pain here.

I'm now going to demonstrate what py.test's rewriting phase does to a test module. Let's dive in with a failing test for a (broken) function that is supposed to create empty files:

import os

def make_empty_file(name):
with open(name, "w") as fp:
fp.write("hello")

def test_make_empty_file():
name = "/tmp/empty_test"
make_empty_file(name)
with open(name, "r") as fp:
assert not fp.read()


This test nicely demonstrates the problem with py.test's old assertion method mentioned in the first paragraph. If we force the old assertion interpretation mode with --assert=reinterp, we see:


def test_make_empty_file():
name = "/tmp/empty_test"
make_empty_file(name)
with open(name, "r") as fp:
> assert not fp.read()
E AssertionError: (assertion failed, but when it was re-run for printing intermediate values, it did not fail. Suggestions: compute assert expression before the assert or use --no-assert)

test_empty_file.py:11: AssertionError


The problem is that assert statement has the side-effect of reading the file. When py.test reinterprets the assert statement, it uses the same file object, now at EOF, and read() returns an empty string. py.test's new rewriting mode fixes this by scanning the assert for introspection information before executing the test. Running py.test with assertion rewriting enabled gives the desired result:


def test_make_empty_file():
name = "/tmp/empty_test"
make_empty_file(name)
with open(name, "r") as fp:
> assert not fp.read()
E assert not 'hello'
E + where 'hello' = ()
E + where = .read

test_empty_file.py:11: AssertionError


So what magic has py.test worked to display such nice debugging information? This is what Python is actually executing:

def test_make_empty_file():
name = '/tmp/empty_test'
make_empty_file(name)
with open(name, 'r') as fp:
@py_assert1 = fp.read
@py_assert3 = @py_assert1()
@py_assert5 = (not @py_assert3)
if (not @py_assert5):
@py_format6 = ('assert not %(py4)s\n{%(py4)s = %(py2)s\n{%(py2)s = %(py0)s.read\n}()\n}' %
{'py0': (@pytest_ar._saferepr(fp) if ('fp' in @py_builtins.locals() is not @py_builtins.globals()) else 'fp'),
'py2': @pytest_ar._saferepr(@py_assert1),
'py4': @pytest_ar._saferepr(@py_assert3)})
raise AssertionError(@pytest_ar._format_explanation(@py_format6))
del @py_assert5, @py_assert1, @py_assert3


As you can see, it's not going to be winning any awards for beautiful Python! (Ideally, though, you'll never have to see or think about it.) Examining the rewritten code, we see a lot of internal variables starting with "@" have been created. The "@", invalid in Python identifiers, is to make sure internal names don't conflict with any user-defined names which might be in the scope. In the first four written lines under the with statement, the test of the assert statement has been expanded into its component subexpressions. This allows py.test to display the values of subexpressions should the assertion fail. If the assertion fails, the if statement in the fifth line of rewriting evaluates to True and a AssertionError will be raised. Under the if statement is the real mess. This is where the helpful error message is generated. The line starting with @py_format6 is simply does string formatting (with %) on a template generated from the structure of the assert statement. This template is filled in with the intermediate values of the expressions collected above. @py_builtins is the builtins module, used in case the test is shadowing builtins the rewriting code uses. The @pytest_ar variable is a special module of assertion formatting helpers. For example, @pytest_ar._saferepr is like builtin repr but gracefully handles long reprs and __repr__ methods that raise exceptions. A non-obvious trick in the format dict is the expression @pytest_ar._saferepr(fp) if ('fp' in @py_builtins.locals() is not @py_builtins.globals()) else 'fp'. This checks whether fp is a local variable or not and customizes the display accordingly. After the initial formatting, the helper function _format_explanation is called. This function produces the indentation and "+" you see in the error message. Finally, we note that if the assertion doesn't fail, py.test cleans up after itself by deleting temporary variables.

The example above is a fairly tame (and luckily also typical) assertion. Rewriting gets more "exciting" when boolean operations and comparisons enter because they require short circuit evaluation, which complicates both the expansion of expressions and formatting (think lots of nested ifs).

In conclusion, py.test's new assertion rewriting fixes some long standing issues with assertion introspection and continues py.test's long tradition of excellent debugging support. (There are now three(!) assertion introspection methods in py.test: two reinterpretation implementations as well as rewriting) I just hope I haven't scared you completely off py.test! :)

16 comments:

Michael Foord said...

How easy do you think it would be to abstract out the assertion rewriting and reuse it with other test frameworks?

Could it even be done as a standalone library?

Benjamin said...

@Michael The assertion stuff is not very dependent on py.test internals, so it should be easy to separate them out if you wanted to. In fact, until recently, assertion introspection code was in the pylib, a separate library than py.test.

gps said...

Nice.

Though I strongly discourage people from using the assert statement in most non-trivial cases in Python due to its syntax oddity and the common error of using ()s that changes the assert into an assert of an always true tuple.

Benjamin said...

@gps In recent versions of Python, the compiler will give you a warning if you make a vacuous assert statement.

$ python3
Python 3.1.3 (r313:86834, Feb 27 2011, 22:48:24)
[GCC 4.4.5] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> assert (3, "msg")
:1: SyntaxWarning: assertion is always true, perhaps remove parentheses?

Zooko said...

People interested in this should also check out "bindann":

http://tahoe-lafs.org/trac/bindann

It inspects the stack and the source code to extract the values that are relevant to the failing exception. Very cool! I feel like it should have gotten a lot more attention than it did when it was first created.

Disclosure: my brother invented it while staying rent free in my house, and he is super awesome and I love him, so I might be a tad bit biased.

Peterino said...

@Zooko sounds promising, but where is it gone? The link is broken.

Interesting also your own quote from September 14, 2007 in commit 8fd98624:

don't try to use bindann

It causes a mysterious misbehavior in Python import which causes the previous patch to fail (the patch to not run trial tests if dependencies can't be imported)

Xyore said...

Amazing tutorial

latesttechnologyblogs said...

I like your blog, I read this blog please update more content on python, further check it once at python online training

mrkdvsn said...

Sometime few educational blogs become very helpful while getting relevant and new information related to your targeted area. As I found this blog and appreciate the information delivered to my database.medium cursus

Repairtech Solutions said...

The next time I read a blog, I hope that it does not disappoint me as much as this one. I mean, Yes, it was my choice to read, but I really thought you would probably have something interesting to talk about. All I hear is a bunch of whining about something that you can fix if you weren't too busy searching for attention. onsite mobile repair bangalore Spot on with this write-up, I absolutely believe this site needs a lot more attention. I’ll probably be back again to read more, thanks for the information! asus display replacement You are so cool! I don't suppose I've truly read anything like this before. So wonderful to discover somebody with original thoughts on this subject. Really.. thank you for starting this up. This web site is something that is required on the web, someone with a little originality! huawei display repair bangalore

Repairtech Solutions said...

After looking at a few of the blog posts on your web page, I really appreciate your technique of blogging. I added it to my bookmark site list and will be checking back soon. Please visit my website as well and tell me what you think. vivo charging port replacement This site really has all of the info I needed about this subject and didn’t know who to ask. lg service center Bangalore There's definately a great deal to learn about this issue. I really like all the points you've made. motorola display repair bangalore

Raj Sharma said...

Nice Post. I like your blog. Thanks for Sharing.
Python Training Institute in Noida

meritstep1 said...

I just loved your article on the beginners guide to starting a blog.If somebody take this blog article seriously in their life, he/she can earn his living by doing blogging.thank you for thizs article. python online training

Angular expert said...

I really like it whenever people come together and share views. Great blog, keep it up!

Best Advanced Java Training In Bangalore Marathahalli

Advanced Java Courses In Bangalore Marathahalli

Advanced Java Training in Bangalore Marathahalli

Advanced Java Training Center In Bangalore

Advanced Java Institute In Marathahalli

Angular expert said...

Very good information. Lucky me I ran across your site by accident (stumbleupon). I have saved it for later!


Selenium Courses in Marathahalli

selenium institutes in Marathahalli

selenium training in Bangalore

Selenium Courses in Bangalore

best selenium training institute in Bangalore

selenium training institute in Bangalore

svr technologies said...


Hello, I have gone through your post Its really awesome.Thats a great article. I am also want to share about python online course and advanced python training. thank you