MackelWhat?: 2006

Monday, December 04, 2006

How Many Digits In That Number?

This is the first article in a series of “interesting things I’ve found that I haven’t seen used elsewhere”. This will discuss a technique I used a couple of times in college when writing contrived software for CS and EE assignments.

One of the assignments I had required formatting numbers for output. For the algorithm I was using, it was important to know how many digits were in an integer before I output it, since that made the formatting significantly easier. There were several ways to find out. The first was to take the number modulo 10 and divide by 10 until the number reached 0. This gave the length and picked out the individual digits nicely, but it started from the least significant digit and worked up. I wanted something that would go the other direction.

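I remembered my algebra class, and the log (base 10) function. I remembered that if log x = n, then 10^n = x. In thinking about this, I realized that the floor of log x, plus one, is the number of digits in x. (The ceiling gives the same answer except for exact powers of 10, such as 1000, where it comes up one short.) In other words, since log 12345 = 4.something, 12345 has 5 digits. This meant I could code something like the following:

#include <cmath>

// Number of base-10 digits in a positive integer.
int GetLength(int num) {
    return static_cast<int>(floor(log10(num))) + 1;
}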

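And I could print out the digits of the number, most significant first, like this:

int divisor = 1;
for (int j = GetLength(num); j > 1; j--) {
    divisor *= 10;           // ends as 10^(length-1)
}
for (; divisor > 0; divisor /= 10) {
    cout << num / divisor;   // most significant remaining digit
    num %= divisor;          // drop that digit
}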

So for base 10, it worked really well. The epiphany occurred when I needed to do the same thing with binary (base 2), octal (base 8), and hexadecimal (base 16) numbers. It turns out that this algorithm is extensible to arbitrary bases. To get that, we just have to make a couple of minor changes:

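int GetLength(int num, int base) {
    // floor(log_base(num)) + 1, using the change-of-base rule.
    // Rounding in the division can leave this off by one when
    // num is an exact power of base.
    return static_cast<int>(floor(log(num) / log(base))) + 1;
}

int num = 12345;
int base = 2;
int divisor = 1;
for (int j = GetLength(num, base); j > 1; j--) {
    divisor *= base;         // ends as base^(length-1)
}
for (; divisor > 0; divisor /= base) {
    cout << num / divisor;   // most significant remaining digit
    num %= divisor;          // drop that digit
}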

There are a couple of special cases that you have to take care of when using this. Specifically:

Negative numbers and zero – you need to make sure to pass positive numbers into the digit counter. Logs of negative numbers aren’t real, the log of zero is undefined, and most math libraries don’t handle either case well.
Fractional numbers – for numbers with decimal places, this technique doesn’t work right. To make them work, you’d have to decide how many decimal places you want, multiply by the matching power of the base (10^places for decimal) to start, then put the decimal point in the “right” place when emitting the value.
Digit representation – for bases greater than 10, you’d need a mapping to handle the digits to output. In the example given above (with no digit mapping), the cout will emit a “13” instead of a “D” as expected for a hexadecimal conversion. Some kind of simple digit mapping scheme would have to be added to take care of this case.
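
The pieces above can be pulled together in one place. What follows is a minimal sketch, not the original assignment code: the names IntPow, DigitCount, and ToBase are made up for illustration, and it assumes a positive num and a base between 2 and 36. It uses the logarithm only as a first guess at the length, then corrects that guess with integer arithmetic, because the floating-point division can land just below a whole number for exact powers of the base. It also adds a simple digit mapping for bases above 10.

```cpp
#include <cmath>
#include <string>

// base raised to exp, computed with integers only (hypothetical helper).
long long IntPow(int base, int exp) {
    long long result = 1;
    for (int i = 0; i < exp; ++i) result *= base;
    return result;
}

// Digit count of a positive integer in the given base (assumes num > 0, base >= 2).
int DigitCount(int num, int base) {
    // First guess from the logarithm: floor(log_base(num)) + 1.
    int digits = static_cast<int>(std::floor(std::log(num) / std::log(base))) + 1;
    if (digits < 1) digits = 1;
    // Fix an off-by-one in either direction caused by floating-point rounding.
    while (digits > 1 && IntPow(base, digits - 1) > num) --digits;
    while (IntPow(base, digits) <= num) ++digits;
    return digits;
}

// Text form of num in the given base, most significant digit first
// (assumes num > 0 and 2 <= base <= 36).
std::string ToBase(int num, int base) {
    const char* kDigits = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    std::string out;
    long long divisor = IntPow(base, DigitCount(num, base) - 1);
    for (; divisor > 0; divisor /= base) {
        out += kDigits[num / divisor];  // map the digit value to a character
        num %= divisor;                 // drop the digit just emitted
    }
    return out;
}
```

With these helpers, ToBase(12345, 16) yields "3039" and ToBase(255, 16) yields "FF". A negative number could be handled by emitting a minus sign and converting its absolute value.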

Thursday, November 02, 2006

USB Drive Adapter

Like any true geek, I have a bunch of old hard drives lying around my office. It's a pain to install them in a system just to copy data to them or to get data from them, but this week I bought a very cool device from NewEgg. It's a small square adapter that has hard drive plugs around three of its sides: one for 2.5 inch laptop drives, one for 3.5 inch normal system drives (which also works with CD/DVD drives), and another for SATA drives. There is a power supply for the drive motor, since many 3.5 inch drives draw more than USB can supply. It's a BYTECC BT-300 USB 2.0 to IDE/SATA Adapter, and it's less than $25 at NewEgg.

This will make migrating stuff between computers a breeze. I wish I had thought of this first.

Wednesday, October 18, 2006

I Code Like A Rock Star!

Originally Posted On: 10/07/2005 11:27:57

Today I was working with one of my office mates on reproducing a problem reported at a field site. He had already gotten the source from the tip, built, and tested it, and then added a test case to demonstrate the problem we were seeing in the field. The conversation went something like this:

Him: "I added the test, and it all works."
Me: "Really? That change should have caused a failure. Let's take a look."
We looked at the code, and sure enough, the test that should fail was in the test case.
Me: "This is really strange..."
I added a fail(...) to the test case, and it still didn't cause any failures.
Me: "This is really bothering me..."
I added a compile time error to the test case, and the compiler did complain this time.

Finally, we looked in the test suite (this is a fairly old class that's still using NUnit 1.x) and found that the suite wasn't including this test class. We added the class to the suite, and the failures happened in the way we were expecting. We then backed out all the cruft we had just added to the test class and worked on the problem from the beginning. It was very disconcerting to be able to add errors to the test class but not have them show up.

Two lessons I learned from this episode:

Don't Take Short Cuts. There's a great article on The Three Rules Of TDD by Bob Martin. We should not have added anything to a class that wasn't really being called. There are ways to validate that the classes are actually included. This problem would be somewhat alleviated by using NUnit 2, since the tests there are tagged instead of named, and the Suite class is unneeded.
Don't Trust Anything. Whenever something smells wrong, there's probably a good reason. Start with the simplest thing (like: are the editor, the compiler, and the test runner all pointing at the same place?) and work up in complexity to find where things start to stink. Then iterate around there until you find the problem.

Another One Bites The Dust

Originally Posted On: 09/18/2005 10:59:45

When I was in high school in the early eighties, a friend of mine (David Jackson) told me that there were hidden lyrics in a song by the rock group Queen. I didn't believe him, so he put my copy of the LP (we had vinyl LP records then) on my record player, and turned the record backwards by hand. On the song "Another One Bites The Dust", there were some very interesting lyrics. Take a listen to both versions and see if you can hear what I thought I heard in the backwards version. (Note before you listen to any of this: the song is (c) Queen, and I have no rights to it whatsoever. I'm just making an interesting observation about a 5 second snippet of it.)

Here's the snippet I'm talking about played normally.
Here's the exact same snippet played backwards.

To me, it sounds like the backwards version says "it's fun to smoke marijuana" several times. Now that you know what I think I heard, listen to the backwards version again. Did the things I heard affect what you heard? There are a couple of interesting thoughts here:

If it really says that, how did the guys in the band manage to make it say that?
If it doesn't, how much does your expectation that something will say something affect what you actually hear?

Personally I can imagine a bunch of guys messing around with audio equipment, trying to find something to write a song about and finding out that "i like to smoke marijuana" sounds like "another one bites the dust" backwards, and using that in a song.

In the end, it doesn't really matter. I seriously doubt that there's any subconscious decoding that will take lyrics backwards, and put them in your memory, and further, I still think that the song is a fun song.

Excellent Fowler Article

I saw an excellent article by Martin Fowler today. It hits the nail on the head of why it's so hard to do something new even though it has a very good chance of being easier/better than how we do it today. This was my experience in learning Test Driven Development.

Tuesday, October 10, 2006

Automatically Dismissing The Unhandled Exception Dialog

On our build system, we were having two problems. First, there was a smoke test that was throwing an unhandled exception. Second, the .NET Framework was presenting a dialog to ask the user what to do about the exception. In the spirit of fail fast, this was causing a big problem for the entire build, since we normally don't have somebody staring at the build box, waiting to give input. It's very important that the build box be headless and input-less. It's OK to have a management console like CruiseControl to request builds, but there should never be an interactive dialog.

I googled for an answer on how to automatically dismiss the dialog, and finally found the answer. Essentially, all you have to do is modify the registry value:

HKEY_LOCAL_MACHINE\Software\Microsoft\.NETFramework\DbgJITDebugLaunchSetting

and set it to 1. This will allow the process throwing the unhandled exception to terminate with a stack dump with no user interaction.

Friday, October 06, 2006

Upgrading From NAnt 0.84+ to NAnt 0.85 rc4

I converted our nightly and integration builds from NAnt 0.84 (with some local mods) to NAnt 0.85 rc4 today. Overall, the upgrade was pretty painless. There were several changes that I had to make to our build file to make it work, plus there were many warnings that I wanted to remove after trying things out. Here's the list of things I did:

Copied the support dlls from the NAnt contrib project into the nant/bin directory. This was required for several support tasks and attributes. It also makes it easier to upgrade the build file later since everything will be available.
Copied support tasks created locally into the nant/bin directory. These tasks were still compatible with the new version of NAnt, and didn't even require a rebuild.
We used to have to set an environment variable with inline C# code, but we can now take advantage of the setenv task. This even speeds up the build slightly since the dll that sets the env. variable doesn't have to be compiled and loaded.
Changed all includes and excludes to include and exclude respectively. There were many filesets in the build file. This was by far the most common change. This was not required, but removed many warnings.
Changed all user in vssget and vsslabel to username. Again, this was not required, but the old name generated deprecation warnings.
Had to add some code to ensure the read-only bit was off before touching some files to change the timestamp (this was necessary to make sure IIS saw the web service files as new). This was a required change due to some new behavior in NAnt.
Used to have some code to remove a service before we installed it for Smoke Tests. With the newer NAnt (in the contrib project), there is a service::is-installed() check we can use. This eliminates several of the warnings we used to get on our Smoke Tests. This was a nice enhancement made possible by the new check.
In our nunit2 tasks, we formerly used haltonerror attributes. These are now "failonerror".
In our Acceptance Tests, we formerly only used the basedir attribute. With the newer NAnt, we now need to set both the basedir and workingdir attributes. This is necessary only for programs that expect data/config files to be in the executable directory. If running a program that takes no input, this would not be necessary. This was a required change to allow the tests to run.
All of the nunit (aka NUnit1) test suites are tagged as deprecated, and generate warnings in the build logs. I like this because it will encourage us to upgrade those tests over the next while.

Friday, September 29, 2006

Google Reader Update

Google has updated their RSS reader. I liked what they had before, but the new features are pretty cool. They had "starred" items (i.e. items you wanted to keep track of for later, which you can see in the sidebar of this blog), but now there's a new feature called "shared" items. This is like a poor man's plagiaristic blog. You can even add an RSS feed for your shared items.

Overall, I'm very happy with it. I tried several readers before going with Google's. I normally use two or three different computers each day, so it's really nice to have all my feeds synced.

Check it out if you're looking for a reader.

Thursday, September 28, 2006

Windows Live Writer Test Post

With images no less!

This picture of Richard was from summer camp last July. I believe this was on the Nantahala River.

Wednesday, September 20, 2006

VMWare Appliances

I've installed the (free) VMWare server on my development server at home and I frequently use a couple of their appliances on it. Specifically:

TWiki - Being able to install and use TWiki without all the hassles of getting it going on a server is an awesome idea, especially if your server is Windows based. This appliance already has Samba installed, so getting files on and off the box is very simple, as are backups. It also sets itself up with a NetBIOS name, so connecting to it is cake.
Helix - This has Trac installed along with Subversion. Two very nice tools for managing development. I'm currently only using Subversion (since Trac is a little over the top for what I do by myself).

I've installed the TWiki appliance at work, and converted some fairly old data (from a 2+ year old version of TWiki) to the new appliance. This took about 3 hours in all to do.

Overall, I'm very happy with this tool as a solution.

Tests Don't Save You From Being Stupid

Originally Posted On: 08/29/2005 04:18:31

A couple of weeks back I worked on adding some new capabilities to a software system at work called the 'listener'. Essentially, this software receives socket messages (events), assembles them, parses them (from various protocols), generates 'model' events, and passes these off to a subsystem to let various business logic handlers deal with the events and do interesting things with them. The piping internal to the system to move events around includes (among other things) the Visitor Pattern, which normally fails very explicitly when a new type has been added and something can't handle it.

The capabilities I needed to add depended on creating a new event type that the visitor pattern would pass to the appropriate handler on the back end (like visitor patterns do). I built the acceptance test, ran it, watched everything fail as I expected, then got into modifying the interfaces to force the creation of the new event type, and force the various implementations to implement a 'visit' method for the new event. All of this went swimmingly, and after working for about three or four hours, I had the whole system back together with new UTs for the classes that needed changing, and with the original AT that started the whole thing working.

So, I had added about 50 UTs, and about 3 ATs and everything looked copacetic. I took the finished code off to run it on a test system with real data.

It didn't work.

After spending a couple of hours digging through the logs, tests, and finally the code, I found that when we built the system, we had added a base class beneath some of the implementations of the visitor for some reason. This base class provided default behavior for unknown types coming in. Since this base class was embedded in the plumbing, the default handler was taking the new event type, and wasn't letting it get to the endpoint implementation, so the new event wasn't being handled. Once I found that, I found several tests that I had missed on the first pass through the system that would have pointed out these problems, but I had glossed over them.

If I knew then what I know now, I'd:

Make sure that if I'm using Visitor, I make the compiler warn me about problems. The fact that I'm using Visitor means that I'm SURE I'll be adding new types. Let the compiler do its job when the types aren't wired up correctly.
If there is a compelling reason to stick a base class in there (i.e. a default visit handler), add a test that uses reflection to compare the number of visit methods on the end point implementations against the number on the interface. This might be a little hairy, but the 2-3 hours spent will save me several times in the future.
If you're in a situation like this, and the base class is causing you pain, remove the base class.
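
The first item on that list can be illustrated with a small sketch. It is not the listener code: every name in it (EventVisitor, Event, ConnectEvent, HeartbeatEvent, LoggingHandler) is hypothetical. The point it tries to show is that when every visit method on the interface is pure virtual and no base class supplies a default, a handler that forgets the new event type stops compiling as soon as someone tries to create it.

```cpp
#include <string>

struct ConnectEvent;     // hypothetical existing event type
struct HeartbeatEvent;   // hypothetical newly added event type

// Every Visit overload is pure virtual, so adding an overload here breaks
// the build for any instantiated handler that has not implemented it yet.
class EventVisitor {
public:
    virtual ~EventVisitor() = default;
    virtual void Visit(const ConnectEvent& event) = 0;
    virtual void Visit(const HeartbeatEvent& event) = 0;
};

struct Event {
    virtual ~Event() = default;
    virtual void Accept(EventVisitor& visitor) const = 0;
};

struct ConnectEvent : Event {
    void Accept(EventVisitor& visitor) const override { visitor.Visit(*this); }
};

struct HeartbeatEvent : Event {
    void Accept(EventVisitor& visitor) const override { visitor.Visit(*this); }
};

// A concrete end point. Deleting either override leaves this class abstract,
// and the compiler rejects any attempt to instantiate it.
class LoggingHandler : public EventVisitor {
public:
    void Visit(const ConnectEvent&) override { seen += "connect;"; }
    void Visit(const HeartbeatEvent&) override { seen += "heartbeat;"; }
    std::string seen;  // records which handlers ran, for illustration only
};
```

A base class with a catch-all default between EventVisitor and LoggingHandler would satisfy the compiler on the handler's behalf, which is the situation described above.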

It was a good learning experience. It just points out once again that hindsight is awesome.

Kent Beck / Alan Cooper Smackdown

Originally Posted On: 08/22/2005 09:11:55

I read a very interesting article today. One of the most interesting things I've read in the last several months. I completely agree with Beck that the XP methodologies would work best and provide a nice process for solving the difficult problems of software development, but my experience has been that getting a "customer" is one of the most painfully difficult things there is. In my experience, it's NEVER been successful for more than a few weeks. I had lunch with some XPers from GSK (GlaxoSmithKline) a couple of weeks ago, and their experience mirrored mine.

Computer Science Challenge List

Originally Posted On: 06/20/2005 03:05:45

Several years ago, I found a document called Challenges for Theoretical Computer Science. It describes a bunch of 'challenges' that haven't been solved in the computer science area. The information written in this list is 5+ years old now, and it's amazing how many items on the list are still outstanding. I don't claim to be able to understand all of the challenges (like 'Make Quantum Computing a Reality'), but there are a lot of them that apply directly to TDD methodologies (like 'Software Specification'). It would be interesting to track this list (or one like it) over many years.

I think it's useful to take a step back every once in a while and see where we've come, and where we're headed. A couple of useful things can come out of it:

Correction/Validation of your current course.

Realization that most problems have been around longer than our recognition of them, and that some smart people may have spent time thinking about them.

Insight into a problem outside of your current vicinity can give a great new angle on your immediate problem.

New product ideas. You may have the insight to solve some problem that nobody else has.

Mickey Mouse Show

Originally Posted On: 06/17/2005 04:22:42
Rob Selph commented to me that he liked the name of the blog, but that I should explain a little about the name. My name is Mark Mackelprang, and with a last name like that, the blog name is pretty natural. Growing up, it was intimidating to learn to spell an 11-character surname, so my older sister came up with a nice mnemonic trick to do it. It just so happens that there are also 11 letters in the name "Mickey Mouse", and that the cadence and flow of the letters is very similar if you sing the Mickey Mouse Club Theme Song with the letters of my last name (What's the surname we adore that's made for you and me? M - A - C - K - E - L - P - R - A - N - G...). It's how I learned it, how my wife and kids learned it, and it's how I teach people I meet now to remember to spell it. Songs are amazing tools for learning things.

Namespace Refactoring

Originally Posted On: 06/16/2005 12:05:04
Over the last week or so, I've spent some time refactoring some namespaces in the C# codebase at work. There were some things that had just been languishing in the wrong place for a long time. We had a few days to spend cleaning up things that we normally have to live with, so this is one of the things we decided to fix. Like any other refactoring, NS refactoring shouldn't change any system functionality, but just improve the overall design. It's a lot like renaming a class, and maybe it boils down to the exact same thing in the end. Luckily, this codebase is well covered with tests. We have unit tests (class level functionality), acceptance tests (system or subsystem level functionality), and smoke tests (installation functionality).

This process would have taken many times longer with much higher risks without the tests. Even with these tests, there was plenty of pain to feel. I'm a sharing kind of guy, so here's what hurt:

File Names: Don't ever use an object type for any part of the name of a configuration file, log file (thanks to Andy Sipe for the catch here), etc. For the actual code itself, there isn't a problem since all you have to do is rename the file. Whenever the config or log file depends on the namespace, however, there are several locations you have to go to fix things up again. This bubbles through the system, clear up to the install process. For upgrades, there will be difficulties merging from old to new when configuration file names change from version to version. For me, it helped to think about the fully qualified type as a unique ID in the db. Whenever that ID is used for something other than identity, trouble follows. The same is true for types. The rule here is to name all files after their intended use, not after their location or namespace.

Object Identifiers: We used a persistable object layer to insulate the code from the db in the system. When we implemented the object identifier subsystem, we had the object ID's generated from the type of the object. This means that we had an OID table in the database, with type names in one of the columns. This makes the design very resistant to NS refactoring, since it would require a db upgrade to change the code. We ended up putting a mapping layer in the OID layer that takes a type and returns the "name" of the OID object to use. For now, we're using the "old" OID names to prevent an immediate database upgrade. In a future refactoring, that's another thing we'll want to fix.

Configuration File Locations: Use the "once and only once" (DRY) principle ruthlessly when setting up config files. We use VSS (don't start with me) and had originally shared various files between all of the projects that needed them. After going through the process of NS refactoring, I believe the correct way to do this would be to store any shared configuration files in a single location, then make it part of the build process (NAnt) to copy the configuration file from the one location to the test location before running the test. For files that are used only once, put them where they're used, but if they ever become shared, spend the time to put them in the single location, and change the build/test to get them there.

Tests: Don't even try to do this over a large set of files without good tests around them. The unit tests aren't really useful for this refactoring. The compiler tells you most of the errors that are at this level. The acceptance tests are good since they pull all the parts of the system together, but they can be fairly long running (some of ours take many minutes to run). The best tests for this kind of refactoring are the smoke tests. These validate the installation for the various end products, and generally only take a few seconds to run for each one (plus file copy & startup time).

Conclusion: Overall I think this kind of maintenance should be a regular activity. Company names change, and the overall design changes. Failure to update the NS is another "broken window" or "smell" of the kind Andy Hunt and Dave Thomas mention in their book The Pragmatic Programmer.
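
The mapping layer described under Object Identifiers above can be sketched roughly as follows. This is an illustration in C++ rather than the C# of the real system, and the class name OidNameMap, its methods, and the type names in the usage note are all invented for the example. The idea it tries to capture is that the code asks for an OID name by type, and the map answers with whatever name is already stored in the database, so a namespace move changes one registration instead of forcing a database upgrade.

```cpp
#include <string>
#include <unordered_map>

// Maps a fully qualified type name to the stable OID name kept in the database.
class OidNameMap {
public:
    // Record that objects of typeName are stored under oidName.
    void Register(const std::string& typeName, const std::string& oidName) {
        names_[typeName] = oidName;
    }

    // Look up the OID name for a type. Falling back to the type's own name
    // when nothing is registered is an assumption made for this sketch.
    std::string OidNameFor(const std::string& typeName) const {
        auto it = names_.find(typeName);
        return it != names_.end() ? it->second : typeName;
    }

private:
    std::unordered_map<std::string, std::string> names_;
};
```

After a namespace refactoring, a call such as Register("NewCo.Orders.Order", "OldCo.Sales.Order") would keep the renamed type pointed at the existing OID row.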

Tuesday, September 19, 2006

Trying out Blogger...

Instead of hosting my blog on my home computer, I'm seriously thinking about simplifying things and using Blogger. We'll see how hard it is to import everything into it before I make any permanent decisions...