On Software Testing

There was quite a discussion about software testing on Hacker News this week and I was shocked by how many of the commenters leaned anti-testing. On the one hand, I breathed a sigh of relief, firm in the knowledge my job is safe since I build large well-tested systems; but on the other hand, I weep for anyone who has to work with someone who doesn’t believe in testing because there is a high likelihood their code sucks. I’ve been building software for a while now and I've never seen a large untested codebase that people were excited to work on¹. Why? Because the fear of not knowing if a change will break everything stifles even the best of developers.

I’ve often said that every developer writes tests, whether they admit it or not; some developers just don't save them for later. Here’s an example: say you’re adding a new function named boom. You add your boom function and then run it against input foo to make sure you get the expected output:

boom('foo')

After verifying the output, you smile to yourself because your boom function works exactly as expected with input foo, so you decide to also run it against input bar:

boom('bar')

Oh no! Your boom function threw an error with input bar. So you go back into the code and change some stuff and run it again:

boom('bar')

Whew! Once again, you’re all smiles because now your boom function works with input bar, but because you're a great non-test writing developer, you also run input foo again:

boom('foo')

Dang it! It doesn’t work with input foo anymore. Back to the code, and after another set of changes, you run it again:

boom('foo')

Yay! It now works with input foo again, but in order to make sure it still works with bar you need to go back and run it with input bar again. This back and forth grows tedious after a while. This is exactly how most non-test writing developers write code². The problem is, as the codebase grows and gets more complex, you forget some of the earlier inputs or just stop running them altogether because it's a pain, or you make a change in some other area of the codebase without realizing that the new change completely breaks the boom function for input bar.

Now, let’s add the same boom function, but with testing:

def test_boom():
    boom('foo')

Let's make sure it also works for input bar:

def test_boom():
    boom('foo')
    boom('bar')

That's what testing gives you: a place to keep your previous input history so you can re-check it every time you make a change. Each change to the code is run against all the previous inputs to make sure it is working as expected. This allows you to make changes throughout the whole system and still check past assumptions to make sure your change didn’t break anything. I love testing because I like being as sure as possible I’m not the one who broke stuff, with the added benefit that other developers can understand and work on my code too.
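To make the "saved input history" idea concrete, here is a minimal sketch using the standard library's unittest module. The boom below is only a stand-in (its real body is never shown here) and the expected outputs are made up for illustration. The point is that every input you once checked by hand becomes a saved row that gets re-checked on every change.

```python
import unittest


def boom(value):
    # Hypothetical stand-in for the function under test; this version
    # just upper-cases its input and appends "!".
    return value.upper() + "!"


# Every input that was once checked by hand, saved with the output we expect.
CASES = [
    ("foo", "FOO!"),
    ("bar", "BAR!"),
]


class BoomTest(unittest.TestCase):
    def test_boom(self):
        for value, expected in CASES:
            # subTest reports each failing input separately instead of
            # stopping at the first failure.
            with self.subTest(value=value):
                self.assertEqual(expected, boom(value))
```

When a new input breaks boom, you add one row to CASES and it stays checked from then on.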

My testing style has developed over the course of building a few companies from scratch and also working on rather large existing codebases. The idea is to make writing and running tests as fast and efficient as possible, so that changes can be made with confidence.

The first thing you need when writing tests is to get everything into the needed state. Usually, when people think state, they think of fixtures: data files you load when starting the tests that set up the database and so on. I’m not a fan of fixtures. I've always found them too rigid and too difficult to keep in sync on a fast-moving codebase, because they are so separate from the actual code that runs the tests. When I tried to use fixtures, I almost always abandoned them and stopped running the tests that depended on them altogether.

Since I need testing to be as easy as possible to make sure I stick with it, I don’t use fixtures. Instead, I generate my data on the fly. In Python, I use a module called Testdata for this. It makes it easy to quickly generate a wide variety of data. I usually start with my generic library but wrap it in a codebase-specific module. For example, at First Opinion we have users, and we often need a valid user in lots of our tests, so this is how we've set that up:

from testdata import *

def create_user(**kwargs):
    first_name, last_name = get_name(as_str=False)
    kwargs.setdefault('first_name', first_name)
    kwargs.setdefault('last_name', last_name)
    kwargs.setdefault('email', get_email())

    u = User.create(**kwargs)
    return u

Then, at the start of a new test function that needs a user, we just call:

u = testdata.create_user()

Now, in our test, we have our new user and even if the user object gets modified, we just have to update the testdata.create_user() function and we are good to go, instead of having to change large monolithic fixture files³. Anytime we need to add a new object or whatnot, we just add a new testdata function that creates it and then any of our tests that need it will be able to get it. Testdata has been a lifesaver for me, allowing database populating to be relatively low friction and easy.
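If you don't have Testdata handy, the same pattern can be roughed out with only the standard library. This is a sketch, not Testdata's API: the User dataclass, random_word, and random_email are all hypothetical names invented for the example. It keeps the two ideas that matter: tests override only the fields they care about, and the random values include non-ASCII characters.

```python
import random
import string
from dataclasses import dataclass


@dataclass
class User:
    # Hypothetical stand-in for the application's real user model.
    first_name: str
    last_name: str
    email: str


def random_word(length=8):
    # Mix ASCII letters with a few non-ASCII ones so Unicode handling
    # gets exercised on every run.
    alphabet = string.ascii_lowercase + "éñüøß"
    return "".join(random.choice(alphabet) for _ in range(length))


def random_email():
    # Keep the address itself plain ASCII on a reserved example domain.
    local = "".join(random.choice(string.ascii_lowercase) for _ in range(10))
    return local + "@example.com"


def create_user(**kwargs):
    # Callers pass only the fields a test cares about; the rest is random.
    kwargs.setdefault("first_name", random_word())
    kwargs.setdefault("last_name", random_word())
    kwargs.setdefault("email", random_email())
    return User(**kwargs)


# A test that only cares about the first name pins just that field.
u = create_user(first_name="Zoë")
```

If the user model gains a required field, only create_user needs a new default, and every test that calls it keeps working.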

The next problem I had was actually running the tests. I need my tests to be easy to run, otherwise I won’t run them nearly as often as I should. For Python, I solved this with another module named Pyt. One of the most annoying things about Python’s unittest module is that it was harder than it needed to be to run a single test. A normal test would be named something like:

footest.FooTestCase.test_bar

And to run that using the command line, I would need to type:

python -m unittest footest.FooTestCase.test_bar

Which was hard for me to remember. Sometimes I would use the Test suffix, other times I would use TestCase, and I almost always switched them when typing out the command. Pyt lets me shorten that to:

pyt Foo.bar

And it handles the rest. This means I spend less time trying to remember how to run the test and more time actually writing the test.
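For reference, here is roughly what that dotted name points at. It's a sketch that assumes a file named footest.py, and the test bodies are placeholders. The last two statements run the one test by hand with the standard library, which is approximately what the long unittest command does for you.

```python
import unittest


# Imagine this class lives in footest.py, so the full dotted path to the
# first test method is footest.FooTestCase.test_bar.
class FooTestCase(unittest.TestCase):
    def test_bar(self):
        # Placeholder assertion; a real test would exercise real code.
        self.assertEqual("BAR", "bar".upper())

    def test_other(self):
        # A second test that we do not want to run this time.
        self.assertTrue(True)


# Build a suite holding only test_bar and run it, leaving test_other alone.
suite = unittest.TestSuite([FooTestCase("test_bar")])
result = unittest.TextTestRunner().run(suite)
```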

When I worked on small codebases where I was the only programmer, I didn't care about writing tests, but as the codebases got bigger, and the developers more numerous, I became a convert to the testing lifestyle. I don’t really distinguish between unit, system, regression, and functional tests; I just want the code in my applications to be tested in some way because I need it to work, and I need other developers to be able to work on it. The most important thing when testing is finding something that works for you and sticking with it. The next most important thing is being able to answer yes to the question: am I confident I can push this code to production?

¹ Usually, in large untested codebases, when new developers join they push to rewrite everything rather than trying to dive in and understand the existing untested code.
² And every student in programming courses in college.
³ Another benefit of this technique: since Testdata generates a lot of Unicode output randomly, you wouldn’t believe how many times it’s caught Unicode problems in our code, something that normally wouldn’t happen with fixture files, since most English-speaking developers don't think of adding Unicode to their fixture files.

Jay Marcyes