(home | about | stats)

pwyky: TestingWithPython

I was lucky enough to start working at a company that believed in testing and QA. I was a tester, one of those young idealistic (perhaps to an annoying degree) folks determined to ship a quality product. The company I was working for sold a complete distributed system that did information retrieval. It ran on five different flavors of Unix (Solaris, HP/UX, AIX, Iris and Digital Unix - this was before Linux was widely adopted) as well as Windows 95 and NT. There were probably ten to fifteen full-time engineers and four full-time QA folks. In addition to both distributed and non-distributed versions of the product, there was a relatively rich C API.

The reason we were able to keep shipping the product and maintain some baseline quality was the test harnesses we had inherited. The product had originated in an academic environment, and included both white-box and black-box automated processes. A bash script would run the tests and simply do a diff against expected output. Simple, but functional. When something changed in the code, you would run the script, make sure nothing had broken, and use that as the baseline for the next tests.

The white-box tests were also simple but very useful. They sent all kinds of normal and outlier inputs to the API calls, just to make sure no null pointers or bad data could crash the entire system.

At the time, I thought this was at least as good as anything else I'd see in my software career. Little was I to know just how wrong I was. At future jobs, testing mostly involved pointing and clicking - the web and Windows applications had taken over from our simple world of Unix command-line tools. Regression testing a web-based product can take literally days or weeks. The battle for automation can be hard; common excuses include the expense of tools and the time that will need to be invested to develop the harness.

However, new (and not so new) technologies make it much easier to at least unit test your software. Unit testing improves the quality of the software you give to QA drastically, thus having an impact on the length of the development cycle.

Python has been around for a few years now. In its current mature state, it is one of the better scripting environments I've seen. The rich libraries it comes with are empowering without being encumbering.

For example, a recent task required quickly writing a servlet to help provision a system without human interaction. When I started writing my servlet, it quickly became clear that simple testing would be crucial to getting the job done in time. I had used Python to do socket connections before, and figured it was probably easy to do HTTP connections, too.

# Initialize our destination variables.
hostname = sys.argv[1]
request = "/webapp" + url

# Create a new HTTP connection, and send
# a GET request.
connection = httplib.HTTPConnection(hostname)
connection.request("GET", request)

# Display status code and response.
response = connection.getresponse()
print response.status, response.reason
print response.read()

The above code snippet sends a HTTP request to a server and gets the response status code, reason and content. This was enough to get me going - I could easily test any changes I made to my servlet by running a simple Python script. Importing Data

Everybody who has done any sort of testing of their code knows checking boundary or other conditions greatly improves the code sent to QA, thus reducing the length of the development cycle and making everybody happy. Shortly after my servlet was working well enough for the "happy path" to always work, I wanted to try some other conditions. I figured the easiest way to do that would be to create some XML data and parse it with my script.

I wanted to use the simplest parser possible. I didn't want to add complexity to my test harness.

# Send a HTTP request to the server with the
# data in the element. If it's the end of the element
# or we have data, we don't care.
def start_element(name, attrs):
    send_request("?" + urllib.urlencode(attrs))
def end_element(name):
    return
def char_data(data):
    return

# Create the parser.
parser = xml.parsers.expat.ParserCreate()

# Set the callbacks.
parser.StartElementHandler = start_element
parser.EndElementHandler = end_element
parser.CharacterDataHandler = char_data

databuffer = ""

# Read the file.
for line in fileinput.input(sys.argv[1]):
    databuffer += line

# Parse the XML and send the requests.
parser.Parse(databuffer)

The above code creates a XML parser. I used expat because I've used it before in C++ and know it's tiny and fast. After creating the parser, my script points parser to my one-line callback handlers. Only one of them really does something: at the start of an element, I send off the attributes, formatted as a query string.

One note here. Python has a few really neat "dictionary" types. In this example, I'm encoding the attributes (see below for an example XML data file) of an element. This just iterates through the keys and values of an element and encodes them into the escaped form you'd find in an URL. This sort of power is extremely useful when quickly writing test drivers.

Here's an example of the test data I used:

<?xml version="1.0"?>

<user firstname="Craig" middlename="" lastname="Thrall" id="cthrall"
 password="pyth0n" email="cthrall@mobilitytech.com">

As you can clearly see, it's not valid XML at all. But for my purposes, it works.

www.mobilitytech.com owner. This is a pwyky site.