
Thursday, May 9, 2013

HabitMaster: Alpha release

HabitMaster is now live in the wild. You can see the new one-page HabitMaster project site that links to everything else you might want to see, including a running Heroku instance.

Overall, this deadline has been met with limited success. I'm calling this an alpha release because the HabitMaster achievements (in other words, the actual gamification parts!) are still unimplemented. It means HabitMaster is just a basic habit tracker right now, and still a little rough around the edges at that.

This has mostly been due to slow progress. According to my time tracker, I averaged around 8 hours a week on ICS691 projects for most of the semester. I stepped that up to around 12 hours a week for the past month, but I still don't appear to have much to show for it. Yet all of my implementation projects seem to go this way: I continue to underestimate just how much code is required to do things.

I spent most of my time implementing back-end details, such as streak detection given a list of activities and a particular schedule type. I was excited to use a Python generator (basically a fancy iterator) to good effect there.
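
For flavor, here's a minimal sketch of that generator idea, assuming a simple daily schedule and a pre-sorted list of activity dates (the names and signature here are mine for illustration, not HabitMaster's actual code):

  from datetime import date, timedelta

  def streaks(dates, gap=timedelta(days=1)):
      """Yield (start_date, length) for each run of consecutive days."""
      start = prev = None
      length = 0
      for d in dates:
          if prev is not None and d - prev <= gap:
              length += 1
          else:
              if start is not None:
                  yield (start, length)
              start, length = d, 1
          prev = d
      if start is not None:
          yield (start, length)

  # e.g. list(streaks([date(2013, 5, 1), date(2013, 5, 2), date(2013, 5, 4)]))
  # -> [(date(2013, 5, 1), 2), (date(2013, 5, 4), 1)]

Because it's a generator, it walks the activity list lazily and never needs to build the whole set of streaks in memory at once.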

I also came up with a workable solution to handling polymorphism in database objects. Specifically, my habits link to a Schedule object, but there are two subtypes of Schedule: DaysOfWeekSchedule and IntervalSchedule. Grabbing habit.schedule would give me a row from the Schedule table, which was fairly useless. I ended up adding a cast() method to Schedule that would resolve that instance to the proper subtype for me.
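
The pattern looks roughly like this (a sketch, not HabitMaster's actual code; with Django's multi-table inheritance, each subtype row is reachable from the parent through an auto-created attribute named after the lowercased subclass, and the field definitions here are hypothetical):

  from django.core.exceptions import ObjectDoesNotExist
  from django.db import models

  class Schedule(models.Model):
      def cast(self):
          # Try each known subtype link; fall back to the plain Schedule.
          for subtype in ('daysofweekschedule', 'intervalschedule'):
              try:
                  return getattr(self, subtype)
              except ObjectDoesNotExist:
                  pass
          return self

  class DaysOfWeekSchedule(Schedule):
      days = models.CharField(max_length=20)  # hypothetical field

  class IntervalSchedule(Schedule):
      interval_days = models.PositiveIntegerField()  # hypothetical field

With that in place, habit.schedule.cast() hands back a DaysOfWeekSchedule or IntervalSchedule instance with its full subtype behavior.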

Django's unittest framework was very handy. I didn't get into the UI unit tests that actually let you test requests and responses, though. I would like to explore that more. I'm still impressed with Django and its documentation.
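
For reference, those request/response tests look something like this with Django's built-in test client (the URL here is hypothetical, just to show the shape):

  from django.test import TestCase

  class HabitPagesTest(TestCase):
      def test_habit_list_renders(self):
          # self.client is provided by Django's TestCase.
          response = self.client.get('/habits/')  # hypothetical URL
          self.assertEqual(response.status_code, 200)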

I picked up more HTML5 tags, which has been enjoyable. I dig all the semantic tags they have now. Twitter Bootstrap and CSS design in general were a time drain, though. There was a lot of fidgeting required to get everything laid out right. I'm still a lousy graphical designer. CSS still trips me up on things that seem like they should be easy, such as defining a DIV of a given relative width and then aligning a couple of items to the left and one or two to the right within that DIV--and then having them all play nicely as the window resizes. I will say that Bootstrap helped compared to trying to do this from scratch in CSS, though.

GitHub (and, to a lesser extent, Heroku) has been an increasing pleasure to use. I put GitHub Pages and GitHub's wiki, issues, and milestone features to work on this project. I code alone so often that I've developed some lazy git practices, though. The main one is that my commits correspond to work sessions rather than issues. It doesn't affect anyone else, but it means I often commit things half-done or in a broken state. That is going to come back to haunt me if I ever have to roll back to a previous version. I also have a lot yet to learn about merging and pulling with git. I ran into a bit of that while I was editing on Windows, Linux, and through the GitHub web interface today.

As you can probably tell, I enjoyed all the technical learning this project has afforded me.

Regarding the gamification aspects, I have this lesson to share: it takes a lot of time. Any gamification mechanics will be in addition to the underlying service you want to provide, and the gamification needs to be well thought out, too. If possible, leveraging an existing framework could help cut down on that time... though it means you'll need to spend time learning the framework and then be bound by its limitations.

The other thing that HabitMaster has clarified for me is the difference between gamification and a serious game. I see HabitMaster as an example of the former but not the latter. To me, gamification means being inspired in your design by the fun and rewarding engagement we find in games--even when you're designing something that is clearly not a game. Because it's not a game, you can't just paste game mechanics onto your service. You need to think about the experience you're going for... which brings us to deeper concepts like mastery, autonomy, meaning, flow, engagement, reward, and the like. A serious game, on the other hand, is when you take something that is clearly a game and try to graft non-game goals or outcomes onto it.

HabitMaster will likely be dormant for a little while as I finish up some other pressing projects, but I do hope to resume development when I get some time. I'll also be starting a new teaching job in July. I wonder if I might bring some gamification to bear on course design there.


Friday, June 29, 2012

Unit-testing a Python CGI script

This summer I'm overhauling Tamarin, my automated grading system. Under the hood, Tamarin is little more than a bunch of Python CGI scripts. However, as I overhaul it and convert it from Python 2 to 3, I also wanted to build a proper unit test framework for it.

It's been a dozen years or so since I last used Perl and CGI.pm, but I recall running my scripts on the command line and manually specifying key=value pairs. So, I was somewhat surprised to find no comparable way to test my CGI scripts in Python. The official Python cgi module documentation suggests the only way to test a CGI script is in a web server environment. That's an unnecessarily complex setup for quick tests during development, and it precludes any simple standalone unit tests.

In general, I'm not very impressed with the cgi module docs. In fact, browsing around revealed a number of parameter options that the official docs leave undocumented.

Using this undocumented information, I was able to build my own cgifactory module. Depending on the function called, it lets you build a cgi object from either a GET or a POST query. For example:

  form = cgifactory.get(key1='value1', key2='v2')

If you then write your CGI script's main function to take an optional CGI object, you can easily build a CGI query, pass it to your script, and then run string matching on the (redirected) output it produces. Of course, most of your unit tests will probably be of component functions used by your script, but sometimes you want to test or run your script as a whole unit. cgifactory will help you there.
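
The pattern is simple (a sketch with hypothetical names, not Tamarin's actual code):

  import cgi

  def main(form=None):
      # Accept an optional pre-built form so tests can inject one;
      # under a real web server, parse the actual request instead.
      if form is None:
          form = cgi.FieldStorage()
      print('Content-Type: text/html\n')
      print('<p>Hello, {}!</p>'.format(form.getfirst('name', 'world')))

  if __name__ == '__main__':
      main()

A test can then call main(cgifactory.get(name='Alice')) with sys.stdout redirected to a StringIO and check the captured output.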

The cgifactory code is available here, where you'll always find the most recent version. The code itself is actually quite short; most of the file is documentation and doctests showing how to use it. I don't guarantee it's right, but it's worked for me so far. Hopefully it might be of use to someone else too! Feel free to copy, modify, and/or redistribute.

(Oh, and if you really need a command-line version, it shouldn't be too hard to write a main that parses key=value pairs into a dictionary and then calls cgifactory.get(**pairs) to build the CGI object.)
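
Something along these lines (an untested sketch):

  import sys

  import cgifactory

  # Turn command-line arguments like name=Alice course=ics
  # into keyword arguments for cgifactory.get().
  pairs = dict(arg.split('=', 1) for arg in sys.argv[1:])
  form = cgifactory.get(**pairs)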


Tuesday, December 13, 2011

Group Programming: Some Adjustment Required

After last week's technical review, Jeff and I swapped projects with the grads group. That is, we took over the grads' code, and they started working on ours. Both groups then added three more commands to the existing codebase they had just inherited.

Taking over an existing software project was an interesting experience. Of course, there was some initial overhead in getting familiar with the layout of the new project's code. This wasn't too onerous, though, since the overall structure was fairly logical.

Once Jeff and I started working, the work went smoothly. We had already completed one project together. I think Jeff and I communicated well, and each of us made some significant contributions. We agreed on all of the major design issues.

Still, for all of the harmony, I'd say that my most interesting learning experience this week was the old "pick your battles" experience. If you've ever shared a living space with someone as an adult--whether a college roommate or a romantic partner--you probably know what I mean. There's a period of adjustment to the person's habits and quirks. For example, maybe your new roommate already put the silverware in the left drawer when you feel that it is obviously more natural to have it in the right drawer. Or maybe your new lover seems to somehow constantly drop loose change; eventually stray coins lie scattered throughout the house.

For each new quirk you discover, you have to decide whether it bothers you or not. If it does, then you need to figure out whether you're willing to just overlook it, constantly clean up after them, or make an issue of it. If you make an issue of every little thing, you can easily become an unpleasant nag. Depending on how it's done, cleaning up after someone can come across as a passive-aggressive show of disapproval. But if something really does bother you and you don't speak up, you can find that your living space is not really your own. That's what I mean by picking your battles.

First, Jeff and I had to come to terms with the existing code. As we settled in, we found we had to "move the silverware" in a few places to meet our tastes. In particular, we overhauled the look and behavior of the user interface. We left the reflection-based loading of classes in place, though.

Then there were the minor differences in working style that I noticed between Jeff and me. For the most part, these were so minor as to border on petty: "If I'd done it, I wouldn't have spaced the output that way" or "I usually write one class at a time rather than mock up all the classes and then fill in the details later."

A couple weeks ago, I mentioned the motivating aspect of working in a group. My point this week is that, whenever you work with someone else, they are not going to do things exactly the same way you do. That is to be expected. But, even when you realize this consciously, each specific difference you discover can still cause a short pause: "Oh... that's not how I would have done that... but that doesn't mean it's wrong... Am I going to accept how this has been done and move on, or do I want to say something and change it?" Overall, I discovered that I could just let these things go. It was simply a new experience to adjust to someone else's presence in my coding project space, both in terms of the old code and the new.

Jeff and I got everything finished up before the deadline. I think our code is pretty solid, though some of our line-spacing might be a little off between different commands.

Testing code that connects to a server continued to be a pain. I considered making some sort of mock client object, but there proved to be too many methods that would need to be overridden to make it worthwhile.

I also found that, when verifying someone else's code, it's easier to run through a manual test than to plod through all their JUnit tests and make sure they cover all the necessary cases. While I think it's good to get some manual testing in there occasionally--especially since that lets you spot things like typos and weird formatting that a JUnit test is not going to catch--I think this is still not quite ideal. Manual testing can find bugs that the person's JUnit tests missed. But, if the code is correct, manual testing won't reveal that an appropriate JUnit test is missing altogether, which means a later change to the code could cause an undetected failure. So I should work on using the person's JUnit tests as a loose guide to my manual testing, in order to spot missing or poor tests as well as any existing defects.

I'm still a fan of Issue-based Project Management. I'm also glad we kept our habit of having the other person mark completed tasks as Verified. It's an extra step, but it's a good feeling knowing that everything has been double-checked by someone else.

This is the last project for my software engineering course, so it may mean a blog hiatus for a while. But I should be back occasionally with news on other new software projects!

Tuesday, October 18, 2011

BusyWeek: Programming Under Pressure

It's been a busy three weeks! I completed two programming projects in that time (as well as a lot of other work). The contrast between how those two projects went provided me some interesting insights on testing and general best-practices.

Three weekends ago, I completed a hefty implementation project for an algorithms course I'm taking. I had to implement and gather runtime data on 4 abstract data types: a sorted doubly-linked list, a skip list, a binary search tree, and a red-black tree. All of the implementations adhered to the same interface. I then had to write a text-based user interface to allow a user to run performance tests as well as call methods across all 4 implementations. All told, the project took me about 30 hours over 3 or 4 weeks--though half of that time was put in over the last two days. But the project was a success. (I had one known bug at the midnight due date, but I was able to fix it and resubmit an hour later.)

I think part of that success was in how I spent my time. I started early and spent that time designing the common interface, implementing the easiest data type (linked list), and writing a single battery of JUnit tests that I could then run across all 4 implementations. When crunch-time came, I was confident that my tests were robust. I could just focus on the implementation. When all my tests passed, I felt I was on pretty solid ground. In short, test-driven development eased that last-minute stress.
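
That setup looks something like this in miniature (a Python unittest analogue of the idea; the actual project was Java and JUnit, and the implementation class here is a trivial stand-in so the sketch runs):

  import unittest

  class ContractTests:
      """One shared battery of tests, run against each implementation."""
      factory = None  # each subclass supplies a constructor

      def test_insert_then_contains(self):
          ds = self.factory()
          ds.insert(42)
          self.assertTrue(ds.contains(42))

      def test_empty_does_not_contain(self):
          self.assertFalse(self.factory().contains(7))

  class ListBacked:
      """Trivial stand-in implementation, just to make the sketch runnable."""
      def __init__(self):
          self.items = []
      def insert(self, x):
          self.items.append(x)
      def contains(self, x):
          return x in self.items

  class ListBackedTests(ContractTests, unittest.TestCase):
      factory = ListBacked

  # ...plus one small subclass per real implementation (skip list,
  # binary search tree, red-black tree, and so on).

  if __name__ == '__main__':
      unittest.main()

Because ContractTests doesn't itself extend TestCase, the battery only runs through the per-implementation subclasses, so one set of tests covers all four data types.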

My other recent project was to design and test a robocode robot.

My original design for this was named JiggersTheFuzz. (The name is an obscure reference to an old Sierra game named Police Quest.) I planned to use linear targeting, which means anticipating a target bot's location by leading its current position by an amount proportional to its velocity. I also planned to scan enemies for the energy drops that mean they'd just fired, and then try to dodge their bullets (the "jigging" of the name). If robocode bots want to "mark" another bot, they often use the getName() method. So I wanted to dig into robocode and the runtime stack a bit to see whether I could override my own getName() to return a different String each time, but in such a way that it didn't mess up the robocode simulation itself. This would be like a radar jammer (the "Fuzz" of the name): other bots would still see me, but they wouldn't be able to identify me between turns.

However, it was not to be.

Two weekends ago, I had a conference in California. While I managed to get caught up on everything else before I left, this robocode project followed me onto the plane. I tried working on it in my hotel at 9:30pm after two sleep-deprived nights and a long conference day. I tried again at 5:30am. It just wasn't happening.

I had a plan sketched out, but the details still involved too many unknowns. In short, I hadn't started early enough to be on solid ground come crunch-time. I started paring away features as the deadline neared. The first to go was the radar "fuzzing" (so I never got to look into that one). The second was scanning other bots to track their energy levels. (This is actually very hard to do in robocode when extending Robot rather than AdvancedRobot, since a Robot can only do one thing at a time: turn its radar, move, or fire.) I was down to just moving regularly in the hope of dodging bullets and generally being unpredictable... but I wasn't handling wall collisions at all. There was little left to pare away and still have a functional bot.

At this point, I called it: This project is going to be late.

So I renamed it BusyWeek. I kept the linear targeting and the naive bullet-dodging, and, once I got back to my normal life, I sat down to do it right. Last weekend, I finally finished my bot.

Movement: BusyWeek assumes 1-on-1 combat. Once I scan an enemy, I turn to be perpendicular to him. Then I move 20px at a time, which is a touch more than the width of my bot. Thus, if he happens to shoot just before I move, I'll be out of the way by the time the bullet arrives. Of course, this assumes he's at least 140px away and firing small, fast shots; he can be only 77px away if firing big, slow shots. (In robocode, bullet speed is 20 - 3 * power, so a minimum-power bullet covers about 140px in the same seven turns a full-power bullet needs to cover 77px.)

To stay at this optimal distance, my bot tries to remain between 140px and 180px away. (Any farther than that, and it's too hard to land my own shots reliably.) If I'm currently outside this range, I'll turn to 45 degrees (instead of 90) from my enemy's heading and move that way. I handle walls by moving off to the opposite side of the enemy's heading. If I've been backed into a corner and can't avoid a wall collision in either direction, then I try to ram the opponent instead.

Targeting: As mentioned, I used a form of linear targeting to lead the enemy based on his current heading, velocity, and distance from me. I decided to include a "fudge factor" of treating his velocity as one less than it actually is, since stops and turns seem to be more common behaviors than continuing along a straight path.
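
In rough form, the lead calculation goes something like this (a Python sketch of generic linear targeting, for illustration only; the bot itself is Java, and this uses standard math angles rather than robocode's clockwise-from-north headings):

  import math

  def linear_lead(my_x, my_y, e_x, e_y, e_heading, e_velocity, bullet_speed):
      """Return the angle to fire at, leading a target moving in a line."""
      dist = math.hypot(e_x - my_x, e_y - my_y)
      t = dist / bullet_speed  # approximate bullet flight time
      # Project the enemy forward along its heading for that long.
      future_x = e_x + e_velocity * t * math.cos(e_heading)
      future_y = e_y + e_velocity * t * math.sin(e_heading)
      return math.atan2(future_y - my_y, future_x - my_x)

The "fudge factor" then just means feeding in a velocity one less than what was actually observed.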

Firing: BusyWeek shoots bullets with a power proportional to its own current energy level. This way I don't shoot myself to the point of becoming disabled. Since smaller bullets move faster, it also means I hopefully become more accurate as things get more desperate. Also, I don't even aim at an enemy if my gun is still too hot to fire; that saves time for scanning and moving instead.
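
The proportional-power idea reduces to a couple of lines (a sketch with a hypothetical scaling constant; robocode clamps fire power to the range 0.1 to 3.0):

  def choose_power(energy):
      # Scale bullet power with remaining energy, clamped to
      # robocode's legal range of [0.1, 3.0].
      return max(0.1, min(3.0, energy / 20.0))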

This design worked fairly well against the standard sample bots. Here are my win rates over 100 non-deterministic rounds each:

vs sample.SittingDuck: 100%
vs sample.Tracker: 79%
vs sample.Corners: 85%
vs sample.Fire: 99%
vs sample.Crazy: 100%
vs sample.SpinBot: 67%
vs sample.RamFire: 60%
vs sample.Walls: 76%

Testing this bot proved to be a trial, though. I used a number of computational methods--such as finding a point on the board given a starting point, a heading, and a distance. Testing these with JUnit went fine; I only wish I'd written the tests first. I actually wrote most of these methods for some of my kata bots weeks ago, so I was already confident they worked. This takes all the fun out of test-writing.
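
That kind of method is small and purely computational, which is exactly what makes it easy to unit-test. A Python rendering of the idea (the real version is Java, and robocode measures headings clockwise from north; this sketch uses standard math angles):

  import math

  def project(x, y, heading, distance):
      """Return the point `distance` away from (x, y) along `heading`."""
      return (x + distance * math.cos(heading),
              y + distance * math.sin(heading))

  # A test is just a handful of known answers:
  assert project(0, 0, 0, 10) == (10.0, 0.0)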

Testing the behavior of my bot during the simulation proved to be trickier. For example, early in my design I decided that it'd be easy to test that my proportional-bullet-power feature was working correctly. (Testing this for bugs was a bit like looking for lost keys only under a streetlight--not because I expected to find them there but simply because searching in the light is much easier!) I simply paired BusyWeek with a bot that always fires shots of power 3. Then I used the RobotTestBed to snapshot all the bullets in the air each turn. I figured that if some of the bullets decreased in power as the battle progressed, then my algorithm was working. The test passed. But when I then set my own bot to fire only power-3 shots as well, I couldn't make the test fail! After some digging, it turns out that a bot--even a bot like SittingDuck that never fires--effectively fires a single shot of power 1 when it explodes. So I figured I'd just exclude all bullets of power 1 from my test. But then I realized that sometimes the other bot would fire lower-powered shots near the end of the match, when it didn't have enough energy left to make a full-powered shot. Sorting through all these details took over an hour--effectively debugging the flickering of the streetlight rather than looking for my keys. Finally, I gave up: this was all to test a single line of code that I could just eyeball!

So, in conclusion, I learned the following:

  • Start early and get the design and necessary research out of the way. This will let you make a more realistic estimate of how much time the rest will take you.
  • Write tests before you write code. (Or, at very least, write them together.) Writing tests after the fact is drudgery.
  • Write code so that as much as possible can be unit-tested. Trying to test emergent or complex behavior through a simulation or a user interface is much harder.
  • Make sure your test can fail before you rely on it when it passes.
  • When you're at a conference, you're not going to have either the time or the energy to work on anything else.

Admittedly, these lessons are all very standard software engineering tips. And I knew most of them, at least consciously. But it's nice to learn viscerally that they really are valid!