Saturday, November 20, 2010

New version of six, lean and mean

I just released a new version of six, my Python 2/3 compatibility library. The main feature in this release is that six has been flattened into one source file, on the philosophy of "flat is better than nested" and for ease of distribution within projects. I've also switched from Bazaar to Mercurial, since the latter seems more popular and it's all the same to me. The issue tracker and source code are now on BitBucket.

I'm calling this version 1.0.0 beta 1. Assuming no one complains, I'd like to release a final version in the next month or so.

Your feedback is appreciated.

Tuesday, June 29, 2010

Six: Python 2/3 compatibility helpers

Increasingly, I've seen a movement towards supporting Python 2 and Python 3 in the same code base. Having ported a few projects myself, I decided to collect the code I've duplicated between them into a library. The result is six. It includes fake byte and unicode literals, b() and u(), and has wrappers for syntax changes such as print and exec. You can check out the documentation on PyPI. The license is MIT, so I hope it can see wide use in projects planning to support Python 2 and 3 simultaneously.
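
To give a feel for what a "fake literal" is, here is a rough sketch of how helpers like b() and u() could be written. This is an illustration only, not six's actual source; the PY3 flag and the choice of codecs are assumptions made for the example.

```python
import sys

# Assumption for this sketch: pick the behaviour once, at import time.
PY3 = sys.version_info[0] >= 3

if PY3:
    def b(s):
        """Fake bytes literal: encode a native str into bytes."""
        return s.encode("latin-1")

    def u(s):
        """Fake unicode literal: a native str is already text on Python 3."""
        return s
else:
    def b(s):
        """Fake bytes literal: a native str is already bytes on Python 2."""
        return s

    def u(s):
        """Fake unicode literal: decode a native str into unicode."""
        return unicode(s, "unicode_escape")  # only reached on Python 2
```

With something like this, b("abc") gives a byte string and u("abc") gives a text string on either major version, so the calling code doesn't need version checks of its own.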

Friday, March 26, 2010

On commit messages

I would like to address the issue of commit messages. Good commit messages can make finding bugs and understanding the timeline of a project easy, and bad ones can result in an infuriating waste of time reading diffs and trying to locate information.

First of all, all commits should be atomic; that is, they shouldn't include unrelated changes. Fixing a typo or spacing while fixing a bug in related code is acceptable, but fixing 6 bugs and adding 2 features in the same commit makes it hard for people to work out later what each change was for. A good rule of thumb is that if a summary of your changes can't fit in one line, the commit is probably too big.

The first line of the commit message is the most important part. This is especially true today, when many DVCSes only show the first line of the commit by default in their log command. The summary line should succinctly summarize what your change is and what it accomplishes. It need not be a full sentence, but just a bug number or a general statement ("fix this") is not appropriate. The best summary lines quickly inform any log browser of the purpose and changes in the commit. Summary lines should also never be wrapped. Nothing is more annoying than reading a summary line which is cut off in the middle by a line break. Simple typo fixes do not require complicated messages. Good examples:
fix #2345 by preventing add() from accepting strings

fix a segfault in foo_my_bars() #4563

fix spelling

add a Python interface to the tokenizer #3222

and bad ones:
test and a fix

ugg

bah

a huge change to Foo class

why does this not work?

bug #4543


After the summary line can optionally come a body. A blank line should always separate the summary line from the body and the different sections of the body from one another. Bodies should also always be line wrapped. The body can include any of the following:

  • Bullet points describing various aspects of the change in more detail.

  • A paragraph explaining how something was implemented or why it's written a certain way.

  • A reference to mailing list discussions or decisions that led to the commit.

  • Authors and attributions.

  • Any other significant information about the commit. For example, explain how it affects external components or might result in unexpected behavior.


Some projects follow the convention of listing affected files in bullet points and describing the individual changes to each. I personally find a prose summary of the changes in the body along with a diff or the verbose version of the log which shows changed files more helpful than this technique.
Good examples of complete commit messages:

"""
normalize encoding before opening file #3242

This change requires that tokenizer.c be linked with the Unicode
library.
"""

"""
silence foo warnings by default

Approved by BDFL in
http://mail.python.org/pipermail/mailinglist/bladh.html
"""

"""
support unicode in shlex module #4523

This is implemented by providing a separate class for Unicode and
requiring a locale to be set before parsing commences.

Patch by J. Hacker and J. Programmer
"""

"""
boost the speed of keyword argument comparisons

This improves some function calls by over 30% by comparing for
identity before falling back to the regular comparison. stringobject.c
was modified to provide faster access to a string's value.
"""

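Some of these rules are mechanical enough to check automatically. Below is a minimal sketch of such a checker. It is only an illustration: the function name check_commit_message, the 72-column wrap width, and the exact pattern for "uninformative" summaries are all assumptions of mine, not an established tool or standard.

```python
import re

MAX_WIDTH = 72  # assumed wrap width for body lines; pick whatever your project uses

# Summaries that are only a bug number ("bug #4543", "#4543") or "fix this".
_UNINFORMATIVE = re.compile(r"^(?:(?:bug|issue)?\s*#\d+|fix this)$", re.IGNORECASE)


def check_commit_message(message):
    """Return a list of problems found in a commit message (empty if none)."""
    problems = []
    lines = message.strip("\n").split("\n")
    summary = lines[0].strip()
    if not summary:
        problems.append("summary line is empty")
    elif _UNINFORMATIVE.match(summary):
        problems.append("summary line is just a bug number or a generic statement")
    if len(lines) > 1:
        # The line right after the summary must be blank.
        if lines[1].strip():
            problems.append("no blank line between summary and body")
        # Every later line should fit within the wrap width.
        for number, line in enumerate(lines[1:], start=2):
            if len(line) > MAX_WIDTH:
                problems.append("body line %d is not wrapped" % number)
    return problems
```

A checker like this can't judge whether a commit is atomic or whether the summary is actually informative; it only catches the formatting slips, which is why it returns a list of complaints rather than a verdict.
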
Saturday, October 3, 2009

% formatting to str.format converter

Recent discussions on Python-dev have revolved around transitioning the standard library to the new str.format method. One suggestion was to write an automatic converter for old format strings to new ones. I've taken on the task and written mod2format at https://code.launchpad.net/~gutworth/+junk/mod2format. You can try it out by running "python3 -m mod2format [your format strings here]".
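
To show the kind of translation involved, here is a toy converter for a very small subset of format strings. It is not how mod2format works; the name convert_simple is made up for this sketch, and it only understands %s, %r, %% and the named forms such as %(name)s. Anything else (widths, precisions, other conversion types) is passed through unchanged, so treat it as an illustration rather than a usable tool.

```python
import re

# Matches "%%", "%s", "%r", and named forms such as "%(name)s".
_SPEC = re.compile(r"%(?:\((?P<name>[^)]*)\))?(?P<conv>[sr%])")


def _escape(text):
    """Double literal braces, which are special in str.format templates."""
    return text.replace("{", "{{").replace("}", "}}")


def convert_simple(fmt):
    """Translate a tiny subset of % format strings to str.format syntax."""
    out = []
    pos = 0
    for match in _SPEC.finditer(fmt):
        out.append(_escape(fmt[pos:match.start()]))
        name = match.group("name") or ""
        conv = match.group("conv")
        if conv == "%":
            out.append("%")  # "%%" is a literal percent sign
        elif conv == "r":
            out.append("{" + name + "!r}")
        else:
            out.append("{" + name + "}")
        pos = match.end()
    out.append(_escape(fmt[pos:]))
    return "".join(out)
```

For example, convert_simple("%s is %r") gives "{} is {!r}", and literal braces in the input are doubled so they survive a later call to format().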

Friday, September 4, 2009

Review: IronPython in Action

Disclaimer: Manning Press and Michael Foord very generously sent me a free copy of the book.

One thing that always slightly annoys me when I'm reading a book about Python programming is having the first few chapters devoted to introducing the Python language. However, I'm sure experienced .NET people felt the same while scanning through the introductory chapters on .NET, which was totally new to me. (I'm also glad there was an appendix about C# syntax; I learned that C# seems to have invented a new syntax or keyword for every possible programming paradigm.) IronPython in Action seems to do a very good job, overall, of catering to both Python programmers tiptoeing into IronPython and .NET, and C# developers finding the light of dynamic programming.

I found the web programming part of the book, especially the part on Silverlight, most interesting, since embedding Python in the browser seems like a lot more fun than writing cross-browser JavaScript. Michael Foord's Try Python (source) is a good demonstration of what can be accomplished. (Though, I wonder if PyPy's sandboxing could someday be used in the browser to do the same thing.)

I would have appreciated a chapter or section on parallel processing, since IronPython offers much better threading and concurrency primitives than CPython. Perhaps an example where IronPython can perform a task that would be impossible on other implementations of Python is in order. I want to see how .NET can make concurrency easy and pythonic.

Before reading this book, I had dismissed .NET as a non-cross-platform hunk of Javaish APIs. I see now, though, that IronPython is able to combine the beauty of Python with some of .NET's better APIs to make a powerful development platform. (I would still rather use PyQt for GUI programming; Windows Forms has not improved.)

Tuesday, August 25, 2009

parser-compiler branch merged

The PyPy project I have been working on over the summer, rewriting the parser and compiler, has finally been merged back to trunk (during the now-ending JIT sprint in Gothenburg, Sweden). I wrote up a little summary of it on the PyPy blog.

Saturday, June 27, 2009

Python 3.1 released!

I'm happy to announce that today Python 3.1 was released. I won't dwell on the new features, since those are more completely listed elsewhere. I'm quite happy with this release. A lot of work has been put into making 3.x as stable as its older 2.x siblings. I would like to see a lot of libraries and applications start seriously looking at the port to 3.x now. As always, there's a bunch of core developers waiting to help on the python-porting mailing list.

Anyway, 3.1 is available for download in source and several binary formats on python.org.