Mark Shead – Page 49 – Technical Agile Coach and Trainer

Thinking about Version Control

Practice. Expect to practice with your version control system before you go live. This means you need to have a copy of your data. If you are exporting data from CVS to Subversion, it is probably a good idea to keep the import files around for a while in case you have to go back. Version control systems are generally forgiving (that is the whole point of having version control, right?), but you can still mess things up. For example, if you accidentally import several GB of binary files and then delete them, the repository still holds that data in its history, which can cause problems with your backups.
Training. Make sure everyone gets at least some training on how to use the system. A short list of how to accomplish common tasks can make everyone’s life a lot easier and may head off problems down the road.
Backups. Your data must be backed up automatically. Just copying a directory doesn’t mean you’ve backed up anything useful, and you don’t really know if you’ve got a backup until you test it. When you are setting up your backup, do a test restore to another machine and then check out your data to make sure it works. You should probably repeat this every few months, depending on how important your data is.
Decide what to keep. You need a clear policy on what goes into version control and what doesn’t. Some people recommend putting everything under version control. However, I can’t think of many reasons to keep a binary ISO image under version control. Generally you want to put things in the repository that might have other versions. Personally, I don’t like keeping binary libraries in Subversion. I have a separate repository for those types of files. I do think it is a good idea to keep your configuration files in the system. It is a real lifesaver to be able to go back to a previous version of a backup script to see why it is failing now, or to see what has changed in httpd.conf that is making Apache act funny.
Remote access. If you want to make your repository accessible over the internet, you should plan for it when you set it up. With Subversion this is easy because you can plug it into Apache 2. Keep in mind that you want to use the same URL from inside the network as outside. This is usually just a matter of putting a DNS entry on your internal network that points to your private IP address while the public DNS entry points to your public address.
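The test restore described under Backups can be partly automated. Here is a minimal sketch in Python, assuming you have already restored the repository on another machine and checked out a working copy from it. The function names are made up for illustration, and the comparison skips the `.svn` and `CVS` metadata directories, since those are expected to differ between checkouts.

```python
import hashlib
from pathlib import Path


def tree_digest(root, ignore_dirs=(".svn", "CVS")):
    """Map each file path (relative to root) to the SHA-256 of its contents.

    Files inside version control metadata directories are skipped.
    """
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root)
        if any(part in ignore_dirs for part in rel.parts):
            continue
        if path.is_file():
            digests[rel.as_posix()] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def compare_trees(original, restored):
    """Return (missing, changed): files absent from the restored checkout,
    and files present in both whose contents differ."""
    a = tree_digest(original)
    b = tree_digest(restored)
    missing = sorted(set(a) - set(b))
    changed = sorted(name for name in a.keys() & b.keys() if a[name] != b[name])
    return missing, changed
```

Calling `compare_trees(working_copy, restored_checkout)` and getting two empty lists back is a reasonable sign that the restore produced the same files. It does not prove the history is intact, so it is still worth checking out an older revision by hand.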

Nightmare Programming Project

Here is a scary scenario for a programmer. Unfortunately, it is probably not that uncommon. Imagine you are starting a new job. Before accepting the position you ask all the right questions, but on your first day you discover the following:

The lead programmer left two months ago. (You already knew this.) The only form of documentation is a few comments and javadocs. (You didn’t know this.)
While the project has its own CVS server, the backups were taken without shutting down CVS and had never been tested. The server crashed and all the backups were corrupt. Someone had a copy of the source code on their local hard drive, but there is no history information and no copy of the code for the last stable release.
30% of the unit tests fail. Another 30% have errors. Some of the tests are just wrong; others rely on having resources configured in a specific way. There are also some tests that aren’t actually testing code, but instead do things like populating the databases.
The core framework used in the project is based on a six-year-old proprietary binary library that doesn’t have any documentation.
The bug tracking data was lost in the server crash. Since you don’t have the source code from a stable release, there are a bunch of little bugs, but none of them are documented.

Sound like a nightmare? Definitely. But it is also an opportunity. Here are the steps I would take under the above circumstances. I’d enjoy hearing what others have to say as well.

Implement Version Control. Personally I prefer Subversion, but regardless of what you use, getting it up and running should be your first priority. And just having it functional isn’t adequate. It needs to be running, automatically backing up, and you must test doing a restore from the backup data. If you haven’t tested your backup, you are begging for disaster to strike. (more on version control)
Set Up an Issue Tracker. You need a place to keep track of the problems you find. You don’t want to be reminded of a nasty bug only when a customer calls to tell you about it. The issue tracker should be used for more than just bugs. It should be your central repository for everything the developers are working on. That means it should hold features, tasks, ideas for improvements, etc. Set milestones and assign issues to those milestones. If you are tracking estimated time for each issue, you’ll be able to set realistic schedules.
Document your setup. You need to know how to set up the build and run environment from scratch. I would suggest starting with a clean machine and setting everything up. If your project builds fine on the old developer’s machine, but not on any new machines you set up, you have a problem. You may find yourself spending several days hunting down all the dependencies and configuration settings.
Fix the Unit Tests. If tests are failing, the developers will ignore them. All of your active tests have to pass every time you run them. If this means you go from 400 tests down to 50, that is fine. Tests that don’t pass can be prefixed with “pending” so they won’t run, and you can re-implement them over time. If a test fails, you need to consider the build to be broken.
Set Up Automatic Builds. This doesn’t need to be anything fancy, but if code compiles and tests fine on your machine and fails on others, you need to know about it right away. Ideally the code should be rebuilt whenever there are changes in the version control system. You should also make the build fail if the tests fail. When a build fails, you can have it send email, flash the lights, or turn on the overhead sprinklers, but you should not ignore it.
Document the code. Digging through someone else’s code and trying to make sense of it can be extremely painful. But you really can’t start writing your own code until you understand how the old code works. Even if you want to replace parts of the old code, you will need to understand what the code you plan to replace is actually doing. Make sure you record your findings so they can benefit others. You may get called away to rescue another troubled project, and you want to make sure your successors don’t have to start over.


Cobertura

Cobertura is a fork of jCoverage. It runs reports to let you see how much of your code is being tested by unit tests. This is incredibly useful to find areas of your code where a bug would go undetected.

It looks like there is a plugin for Maven already, so I’m going to have to give it a try sometime.

Looking for an Issue Tracking System

One of the things I need to do at my new job is get an issue tracking system in place. At Reslife, we put in Cerberus for tracking help desk requests. Coming into a place that doesn’t have any type of system to track issues made me realize how much I took Cerberus for granted. Cerberus (www.cerberusweb.com) is a nice, simple ticket tracking program. It worked very well for the church’s needs, but for software development we need something more powerful and specialized. Specifically, here are some of the things I’m looking for:

Integration with version control. If someone commits a change to version control and the comments reference a particular issue, I want a notice to show up in the tracking system that links to the change. This needs to happen automatically.
Tracking estimated and actual time spent on each issue. For some reason most tracking systems don’t have this feature, but I think it is very important if a team is going to learn to estimate well. By looking at several months’ worth of estimates and comparing them with the actual time spent, developers should become better at creating realistic schedules.
Email notification. Obviously the system should email you if something changes on an issue that you are “watching”. It would be nice if you could respond to an email and have it posted back to the ticket. (Cerberus did a good job of this.)
Dependency tracking. If item A can’t be completed until item B is done, the system should keep track of this.
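Two of these requirements are easy to picture in code. The sketch below, in Python, assumes a commit-comment convention where issues are referenced as `#123`. That convention and both function names are assumptions for illustration, not the behavior of any tracker listed here.

```python
import re

# Assumed convention: an issue is referenced in a commit comment as "#<number>".
ISSUE_REF = re.compile(r"#(\d+)")


def referenced_issues(commit_message):
    """Return the issue numbers mentioned in a commit comment,
    in order of first appearance and without duplicates."""
    seen = []
    for match in ISSUE_REF.finditer(commit_message):
        number = int(match.group(1))
        if number not in seen:
            seen.append(number)
    return seen


def blocked_issues(depends_on, done):
    """Return the open issues that cannot be completed yet.

    depends_on maps an issue number to the issues it depends on;
    done is the set of issues already completed.
    """
    return sorted(
        issue
        for issue, deps in depends_on.items()
        if issue not in done and any(dep not in done for dep in deps)
    )
```

A commit hook could pass each new comment to `referenced_issues` and post a link to the change on every issue returned, while `blocked_issues` is the kind of check behind "item A can’t be completed until item B is done".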

Some of the systems I’m looking at:

Scarab
Trac
JIRA
Bugzilla

So far JIRA looks like the best option. But since it isn’t free, I’ll probably need to use something else for a while. I installed Trac, but it doesn’t seem to do a very good job of keeping track of users. It is also fairly complicated to get running on Windows.

We’ll probably end up using Scarab. It seems stable and has a decent user interface. There are some scripts that will allow it to interface with Subversion and it handles email notifications and dependencies. It doesn’t do time tracking out of the box, but there may be a way to configure this.

Scarab was created as a general tracking platform. Almost everything is configurable. You can specify a multi-step process for entering a new item and include things like checking for duplicate entries. It isn’t as feature rich as JIRA, but it looks like several people have started developing on it again, so hopefully it will continue to grow.

Entity Expansion Limit in Maven

When using Maven to generate a multiproject site, I’ve run into an error saying that it reached the “entity expansion limit” of 64,000. This appears to be coming from the SAX parser, which is validating all the HTML pages. I’ve looked for a way to turn this parsing off, but so far I’ve been unable to find one. Launching Maven like:
maven multiproject:site -DentityExpansion=10000000
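seems to solve the issue. Note that the JDK system property for this limit is spelled entityExpansionLimit, so if the flag above has no effect, -DentityExpansionLimit=10000000 is worth trying instead.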