Tapestry and testng.xml

You can use a testng.xml file to help control the tests and browser used by Selenium in Tapestry.  However, you have to tell Maven to look for the testng.xml file or it will ignore it.  You do this by putting the following in your pom.xml -> build -> plugins:
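
A typical way to do this is through the Surefire plugin's suiteXmlFiles setting. The fragment below is a sketch rather than the exact configuration from any particular project: the path src/test/conf/testng.xml is an assumption, so point it at wherever your testng.xml actually lives.

```xml
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-surefire-plugin</artifactId>
    <configuration>
        <suiteXmlFiles>
            <!-- Assumed location; change this to the path of your own testng.xml -->
            <suiteXmlFile>src/test/conf/testng.xml</suiteXmlFile>
        </suiteXmlFiles>
    </configuration>
</plugin>
```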


Lucene MoreLikeThis Example Code

I was recently working on a simple application where the user will enter famous quotations.  Obviously we want to avoid duplicates so I needed a way to check for quotations that were substantially similar before a new quote was added to the database.

The idea was to show the top 5 most similar quotes before letting the user save the new quotation to the db. I used Lucene for this, which allowed me to punt on the more difficult task of figuring out whether two quotes were similar. I left that up to Lucene and only had to worry about how to get my information in and out of Lucene in a usable manner.

Below is the interesting method that uses Lucene to build an index of all the quotes in the system and then returns the five quotes that are most similar to the new quote text.  Obviously creating a new index each time a quote is added isn’t particularly efficient, but it makes it easier to demonstrate how things work, and processor efficiency isn’t much of an issue with this particular task.

    public List<Quote> getSimilarQuotes() throws CorruptIndexException, IOException {

        String quoteText = quote.getText();
        logger.info("creating RAMDirectory");
        RAMDirectory idx = new RAMDirectory();
        IndexWriterConfig indexWriterConfig = new IndexWriterConfig(Version.LUCENE_31, new StandardAnalyzer(Version.LUCENE_31));
        IndexWriter writer = new IndexWriter(idx, indexWriterConfig);

        List<Quote> quotes =  session.createCriteria(Quote.class).list();

        //Create a Lucene document for each quote and add them to the
        //RAMDirectory Index.  We include the db id so we can retrieve the
        //similar quotes before returning them to the client.
        for (Quote quote : quotes) {
            Document doc = new Document();
            doc.add(new Field("contents", quote.getText(), Field.Store.YES, Field.Index.ANALYZED));
            doc.add(new Field("id", quote.getId().toString(), Field.Store.YES, Field.Index.ANALYZED));
            writer.addDocument(doc);
        }

        //We are done writing documents to the index at this point
        writer.close();

        //Open the index
        IndexReader ir = IndexReader.open(idx);
        logger.info("ir has " + ir.numDocs() + " docs in it");
        IndexSearcher is = new IndexSearcher(idx, true);

        MoreLikeThis mlt = new MoreLikeThis(ir);

        //Lower some settings so MoreLikeThis will work with very short
        //quotes.
        mlt.setMinTermFreq(1);
        mlt.setMinDocFreq(1);

        //We need a Reader to create the Query so we'll create one
        //using the string quoteText.
        Reader reader = new StringReader(quoteText);

        //Create the query that we can then use to search the index
        Query query = mlt.like( reader);

        //Search the index using the query and get the top 5 results
        TopDocs topDocs = is.search(query,5);
        logger.info("found " + topDocs.totalHits + " topDocs");

        //Create an array to hold the quotes we are going to
        //pass back to the client
        List<Quote> foundQuotes = new ArrayList<Quote>();
        for ( ScoreDoc scoreDoc : topDocs.scoreDocs ) {
            //This retrieves the actual Document from the index using
            //the document number. (scoreDoc.doc is an int that is the
            //doc's id.)
            Document doc = is.doc( scoreDoc.doc );

            //Get the id that we previously stored in the document from
            //hibernate and parse it back to a long.
            String idField =  doc.get("id");
            long id = Long.parseLong(idField);

            //retrieve the quote from Hibernate so we can pass
            //back an Array of actual Quote objects.
            Quote thisQuote = (Quote)session.get(Quote.class, id);

            //Add the quote to the array we'll pass back to the client
            foundQuotes.add(thisQuote);
        }

        return foundQuotes;
    }
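
If you are curious what "similar" can mean at its simplest, here is a dependency-free sketch of the same "top N most similar" idea. It is not what Lucene does: MoreLikeThis picks out interesting terms and weights them, while this stand-in just measures word overlap (Jaccard similarity). The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SimilarQuotesSketch {

    // Lower-case the text and split it into a set of distinct words.
    static Set<String> words(String text) {
        Set<String> result = new HashSet<String>();
        for (String token : text.toLowerCase().split("[^a-z0-9']+")) {
            if (!token.isEmpty()) {
                result.add(token);
            }
        }
        return result;
    }

    // Jaccard similarity: shared words divided by all distinct words.
    static double similarity(String a, String b) {
        Set<String> wordsA = words(a);
        Set<String> wordsB = words(b);
        Set<String> union = new HashSet<String>(wordsA);
        union.addAll(wordsB);
        if (union.isEmpty()) {
            return 0.0;
        }
        Set<String> shared = new HashSet<String>(wordsA);
        shared.retainAll(wordsB);
        return (double) shared.size() / union.size();
    }

    // Return up to 'limit' existing quotes, most similar first.
    static List<String> mostSimilar(final String newQuote, List<String> existing, int limit) {
        List<String> sorted = new ArrayList<String>(existing);
        sorted.sort((x, y) -> Double.compare(similarity(newQuote, y), similarity(newQuote, x)));
        return new ArrayList<String>(sorted.subList(0, Math.min(limit, sorted.size())));
    }

    public static void main(String[] args) {
        List<String> existing = Arrays.asList(
                "The only thing we have to fear is fear itself",
                "To be, or not to be: that is the question.");
        // The near-duplicate should be listed first.
        for (String quote : mostSimilar("To be or not to be that is the question", existing, 5)) {
            System.out.println(quote);
        }
    }
}
```

Word overlap treats every word as equally important, so common words like "the" count as much as rare ones. That is a big part of why I was happy to leave the real work to Lucene.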

Tapestry 5 – 10 Minute Demo of Apache Tapestry

This is a demonstration of about 10 minutes of programming in Tapestry 5 creating a sample application. The app is pretty basic and just lets you add URLs to a list and then vote on them, similar to the idea behind Digg or Reddit. I didn’t really explain things in great detail so it is more of a demonstration of some of the things you can do than it is a step by step tutorial.

How to Charge for Programming

Charging for programming can be a bit tricky.  Clients obviously would prefer a solid quote. However, any experienced programmer knows that what the client wants is going to change as the software takes shape. Furthermore, any non-trivial program is going to involve some unknown problems that the programmer may be able to solve quickly, but that might also take a long time. It isn’t easy to estimate how long it will take to find an algorithm that does X in Y seconds.

In order to feel safe giving a firm quote, a programmer is going to have to pad the numbers quite a bit to deal with unknowns, client changes, etc.  Of course this might push the price up to the point where the client decides it isn’t worth it.

An experienced client has probably already been through a project where they paid lots of money working off of nothing more than an estimate, only to end up with software that doesn’t work and a code base that no one else is willing to touch. They can either keep sinking money into the project or just write off the whole thing as a failure.

So how can you compromise between what the client and the programmer both need?

1. Small bites

If you can determine the smallest amount of functionality for the software to be useful to the client, do that as an independent project first.  Instead of trying to quote for an entire year-long programming job, pick the part that solves the greatest number of issues with the least amount of work and try to have something delivered in 2 to 4 weeks or less.

The client likes this approach because you are delivering solutions to problems instead of just a program. If they can get their biggest problems solved early on with the least amount of expense, so much the better. They may want you to do x, y and z, but if x gives them 90% of the benefits, you should definitely do that as a project and then let them decide if they want to proceed with y and z.

What you learn in completing the first project will make your estimate of the next project much more accurate. The chances of you getting anywhere near an accurate estimate when estimating x, y and z all together are very low.  The chances of you being able to accurately estimate y after having completed x are much greater.

You as the programmer face a lot less risk with a 2 to 4 week project as long as the requirements are defined and the scope is clear to both parties.  Keep in mind you need to clearly articulate not only what the software will do, but also what it will not do. Make sure the client knows what is being left out and what will be part of future projects should they choose to proceed.

2. Range estimate with upper limit

The client doesn’t want to pay twice as much just because you won’t know how long the project will take until you are half way done.  If you give a single quote that factors in the risk, they may feel they are being taken advantage of.  On the other hand, you don’t want to give a low bid only to find you drastically underestimated how difficult it would be to solve particular problems, interface with their existing system, etc.

Giving an estimate such as $5,000 to $10,000 with a cap of $10,000 is often a fair way to handle the needs of both sides. The greater the unknown risks, the wider the gap between the first and second number.  Basically the second number is the amount you’d want if you were giving a hard quote.  It should factor in all the risks of dealing with unknowns, the chance that the customer will change their mind a lot, etc.  The first number should be a reasonable estimate if everything goes ok with only a few minor problems.

The client will need to commit to paying up to the higher number.  If the project is worth it to them at $5,000, but not at $10,000 then you need to reduce the scope until the value and your upper limit match.
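
Here is a minimal sketch of the arithmetic behind a capped range estimate. It assumes you bill for actual hours up to the cap, which is one reasonable reading of the arrangement rather than the only one. The $100 hourly rate and the class name are made up for illustration.

```java
public class CappedEstimate {

    // Amount billed when work is charged by the hour but never exceeds the cap.
    static long amountBilled(long hoursWorked, long hourlyRate, long cap) {
        return Math.min(hoursWorked * hourlyRate, cap);
    }

    public static void main(String[] args) {
        long rate = 100;   // hypothetical hourly rate in dollars
        long cap = 10000;  // the upper number of the $5,000 to $10,000 estimate

        // Everything goes ok: 50 hours lands on the low end of the range.
        System.out.println(amountBilled(50, rate, cap));

        // Things go badly: 140 hours of work is still billed at the cap.
        System.out.println(amountBilled(140, rate, cap));
    }
}
```

The second case is why the upper number has to be one you could live with as a hard quote: anything past the cap comes out of your pocket, not the client's.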

3. Charging for programming time

Some people like charging a low rate, but charging for every minute they spend on the project. They operate with the idea that, “If I’m doing something I wouldn’t be doing on my own, the clock is running.” This seems reasonable from the programmer’s standpoint.  However, imagine you are personally hiring a graphic designer to create some designs in HTML 5. The designer is good, but has mainly done work in HTML 4.  As a result they are going to be spending at least some of their time learning exactly how to use HTML 5. Do you want to be paying for their learning curve? Probably not.

Chances are very high that not all of your time on a programming project is going to be spent actually coding.  You’ll probably spend time doing the following:

  • Reading the documentation for your framework
  • Looking for open source libraries that will do what you need
  • Looking through mailing list archives to see how other people solved similar problems
  • etc.

These all might sound like reasonable things to charge for, but they are investments in making you a better programmer. My preference is to charge a much higher rate as an expert, but not bill for learning anything new.

I use a time tracking tool (Mylyn and TaskTop) that measures how long I spend working on a particular issue in my IDE.  That way I’m only measuring the time spent coding and not any of the time I spend doing research. The tool I use times out after three minutes.  So if I just need to jump over to look up the specifics of an API call, all the time is counted.  However, if I head off on a 30 minute search for a better open source expression language to embed in my program, it isn’t.
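
The idle timeout idea can be sketched in a few lines. This is not how Mylyn or TaskTop are implemented; it is just an illustration that assumes we have a sorted list of timestamps (in seconds) for activity in the IDE, counts any gap of three minutes or less as billable, and drops longer gaps entirely. A real tool may well count the first few minutes of a long gap before it times out. The names here are made up.

```java
public class IdleTimeoutTracker {

    // Gaps longer than this are treated as time away from the task.
    static final long IDLE_TIMEOUT_SECONDS = 3 * 60;

    // activitySeconds must be sorted ascending; each entry is one moment of IDE activity.
    static long billableSeconds(long[] activitySeconds) {
        long total = 0;
        for (int i = 1; i < activitySeconds.length; i++) {
            long gap = activitySeconds[i] - activitySeconds[i - 1];
            if (gap <= IDLE_TIMEOUT_SECONDS) {
                // Short pause (for example an API lookup): count it.
                total += gap;
            }
            // Longer pause (for example 30 minutes of research): skip it.
        }
        return total;
    }

    public static void main(String[] args) {
        // Two minutes of work, a 30 minute research break, then one more minute.
        long[] activity = {0, 60, 120, 1920, 1980};
        System.out.println(billableSeconds(activity) + " billable seconds");
    }
}
```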

There are three consequences of this type of billing:

  • The hourly rate I charge is much higher than some of my peers.
  • 20 to 30 billable programming hours is a full week.
  • My estimates are much more accurate because I’m only estimating actual coding time.

4. Getting paid

With known and trusted clients, I bill at the end of the month for any work performed during that month.  With unknown clients, I prefer that they pay ahead of time.  When my work depletes their deposit, they need to deposit some more money.

5. Partnerships

The world is full of people with no programming skills who are trying to invent the next Facebook and feel that the only thing holding them back is the fact that they can’t afford a programmer willing to work for a percentage of the company. You want to avoid people like that.  Ideas are cheap, and generally you don’t want to partner with someone who expects you to bring the man hours while they bring the “ideas”.

However, there are people that have something significant to offer that you can’t get easily on your own.  In particular this comes in the form of marketing and existing customers. The hard part of making money from selling software isn’t writing the software–it is getting noticed.  Partnerships with people who can help your product get noticed can be valuable.

You want to avoid partnerships where you front all the risk though. Your partner needs to have some type of incentive to use their connections, mailing list, clout or whatever to promote the product once it is ready to go.  One easy way to do this is to split the development cost.  You approach things the same way you do with a client, but you and your partner are both responsible for half the bill.  If you are operating as a sole proprietor, then you simply don’t pay your half of the bill.  If you have a corporation you may pay your half personally.

This type of arrangement means you both have something invested.  You have your time and they have their money, albeit a lot less money than they would have spent had they tried to do it on their own.

If you deliver the software, but your partner turns out to be incapable of actually marketing it, you still have half of what you’d normally bill.

Partnerships are tricky and you need to be very clear and very careful how you set things up, so make sure you do your homework.  The main point here is to not overlook partnerships when the partner has something genuine to offer–and very rarely is that going to be their idea.

Tapestry Facebook Share Component

I am working on a project where we need to be able to do a Facebook share on a particular page for an event from multiple places within a web application. Putting this functionality into a Tapestry component makes it easy to call it from wherever it is needed and the code makes for a nice concise example of how components work in Tapestry.

In Tapestry, components consist of two parts. The first part is the .tml file and the second is the .java file. In general the .tml file is concerned with how things render out in HTML and the .java file is concerned with the programmatic side of things and retrieving the data.

The end result is going to be a tag that we can use in other .tml files that will look like this:

<t:facebookshare event="myEvent"/>

First let’s look at the .tml file. This file is saved under:

<t:container xmlns:t="http://tapestry.apache.org/schema/tapestry_5_1_0.xsd" >
<a href="#" onclick="window.open('http://www.facebook.com/sharer.php?u=${facebookShareURL}&amp;display=popup','fbshare','menubar=0,location=0,resizable=1,width=700,height=350')">
	Please share this event on Facebook
</a>
</t:container>
The ${facebookShareURL} expansion is what calls getFacebookShareURL() on the Java class in order to fill out the component with the data it needs. In this case it will be an encoded URL.

The .java component file is saved in:

import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

import javax.persistence.Transient;

import net.xeric.register.entities.Event;

import org.apache.tapestry5.Link;
import org.apache.tapestry5.annotations.Parameter;
import org.apache.tapestry5.annotations.Property;
import org.apache.tapestry5.ioc.annotations.Inject;
import org.apache.tapestry5.services.PageRenderLinkSource;

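public abstract class FacebookCommon {
    //Tapestry service used to build links to pages
    @Inject
    private PageRenderLinkSource linkSource;

    //The event to share; it must be bound wherever the component is used
    @Parameter(required = true)
    private Event event;

    public String getFacebookShareURL() {
        Link link = linkSource.createPageRenderLinkWithContext("EventInfo", event);
        String linkURL = "";
        try {
            linkURL =  URLEncoder.encode(link.toAbsoluteURI(),"UTF-8");
        } catch (UnsupportedEncodingException e) {
            //UTF-8 is always supported, so this should never happen
        }
        return linkURL;
    }
}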

I have a page called EventInfo that shows all the information about a particular event. So this component takes an event as a parameter and then constructs the link to share on the EventInfo page.

This is how we tell Tapestry that we have a required parameter that needs to be bound to an event:

    @Parameter(required = true)
    private Event event;

That means that when we use the component like this:

<t:facebookshare event="myEvent"/>

Tapestry is going to expect the parameter event to be bound to an event object.

The first line of getFacebookShareURL() constructs our link back to the EventInfo page using the context of the event that we’ve passed in as a parameter to the component.

        Link link = linkSource.createPageRenderLinkWithContext("EventInfo", event);
        String linkURL = "";
        try {
            linkURL =  URLEncoder.encode(link.toAbsoluteURI(),"UTF-8");

The rest of the getFacebookShareURL method simply encodes the URL as a string and returns it. This is the value that replaces the ${facebookShareURL} in the template when the component is rendered.
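
To see why the encoding step matters, here is a small standalone sketch that does the same thing without Tapestry. The example.com event URL and the class name are made up for illustration. Without encoding, any ? or & in the event link would be read as part of the query string for sharer.php instead of as part of the u parameter.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class ShareUrlSketch {

    // Build the sharer.php address for a page, encoding the page URL so it
    // survives as a single query parameter.
    static String shareLink(String pageUrl) {
        try {
            return "http://www.facebook.com/sharer.php?u="
                    + URLEncoder.encode(pageUrl, "UTF-8")
                    + "&display=popup";
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is required on every Java platform.
            throw new IllegalStateException("UTF-8 is not available", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical absolute URL for an EventInfo page.
        System.out.println(shareLink("http://example.com/eventinfo/42"));
        // prints http://www.facebook.com/sharer.php?u=http%3A%2F%2Fexample.com%2Feventinfo%2F42&display=popup
    }
}
```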