Release philosophies

There are two different philosophical approaches to when a release should occur.

1) Release when featureset X is complete
2) Release at time Y

Agile literature talks about turning the “normal” development process on its head by cutting features to meet a timeframe. But in reality, time is always the deciding variable (as anyone who has been through a waterfall death march can attest).

[Quality and budget are other variables, but can be thought of as special cases of features and time, respectively.]

One thing Agile does try to stress is awareness of this reality up front, and mitigating it by using frequent iterations of small featuresets. That way you have a limited set of completed, and hopefully high quality, features at any point in time in case the budget (time) runs out, rather than a larger featureset of incomplete and potentially lower quality features.

I started this as a comment on a previous post called Modeling Products and Projects.

I described the following models:

Product > Release > Features
Project > Milestone > Tasks

restated as complete sentences:

The release of a product contains a set of features that have been implemented.
A milestone for a project consists of a series of tasks that have been completed.

But then a milestone is a marker describing a featureset, and a release is the completion of a series of tasks. Does this conflict with my model?

[There is definitely a correlation between release and milestone, since a release typically happens at a milestone, using philosophy 1 (feature-based release). But I stated that philosophy 2 (time-based release) is almost always the reality. [Now I’m confusing time and accomplishment; scoping is the exercise of reconciling the two.] The idea is to get a realistic featureset completed within your timeframe, and the proposed solution is to build iteratively with smaller featuresets. But still, why does this model essentially “cross”?]

Configuring an Apache VirtualHost to use Tomcat

I was talking to a friend who asked about configuring an Apache virtual host to use Tomcat. I have Apache httpd 2.2.8 running on port 80 and Tomcat 5.5.26 running on port 8080 on Windows XP Pro, so out of the box I had the following:

http://server/ <– handled by Apache

http://server:8080/ <– handled by Tomcat

At first, I didn’t quite understand what he was saying. I thought he wanted something like this:

http://server/ <– to be handled by Apache
http://server/tomcat/ <– to be handled by Tomcat
http://server/any/other/path <– to be handled by Apache

He said he didn’t care about port 8080 being displayed in the URL, so I figured I could just use mod_rewrite.

Here’s the rule I came up with, and added to httpd.conf:

RewriteEngine On
RewriteRule ^/tomcat/(.*) http://server:8080/$1

So whenever someone types a URL like http://server/tomcat/foo, they get redirected to http://server:8080/foo, which is handled by Tomcat. They can continue to navigate relative URLs as long as they stay within http://server:8080/. Any URL on http://server/ (without port 8080 in it) will still be handled by Apache, as will any URL starting with http://server:80/.

This is the simplest possible solution.
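Since the target of the rewrite is an absolute URL on another port, mod_rewrite issues an external redirect automatically. A minimal variant with explicit flags (my addition, not strictly required) makes that behavior obvious and stops further rule processing:

RewriteEngine On
# The target is an absolute URL on another host:port, so this is an external
# redirect; [R] makes that explicit and [L] stops processing further rules.
RewriteRule ^/tomcat/(.*) http://server:8080/$1 [R,L]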

I then found out that what he actually wanted was to use one server (OS) to host two domains, one using Apache and the other Tomcat. This was easy enough to solve by creating two virtual hosts, one for each server.

http://apache/ <– handled by Apache
http://tomcat/ <– handled by Tomcat

First, I needed to create the hosts. He was using a registered domain, but I just added both names to my /etc/hosts file (actually C:\WINDOWS\System32\drivers\etc\hosts):

127.0.0.1 localhost apache tomcat

Do whatever you need to do to make it resolve.

Then I created two Apache config files, one for each virtual host. I added the rewrite rule only to the tomcat vhost.

C:/apache/conf/vhosts/apache.conf

<VirtualHost 127.0.0.1:*>

ServerName apache
DocumentRoot "C:/Apache/htdocs"

<Directory />
Options All
Order deny,allow
Allow from all
</Directory>

</VirtualHost>

C:/apache/conf/vhosts/tomcat.conf

<VirtualHost 127.0.0.1:*>

ServerName tomcat
DocumentRoot "C:/Apache/htdocs"
RewriteEngine On
RewriteRule ^/(.*) http://tomcat:8080/$1

<Directory />
Options All
Order deny,allow
Allow from all
</Directory>

</VirtualHost>

I added the following to my httpd.conf, so that they would be included.

Include "C:/Apache/httpd/2.2/conf/vhosts/*.conf"

Also make sure you have the following in httpd.conf to handle virtual hosts on the correct IP:

NameVirtualHost 127.0.0.1:*

and that mod_rewrite is enabled.
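On my Windows install, enabling mod_rewrite just meant making sure this LoadModule line in httpd.conf was uncommented:

# mod_rewrite ships with Apache 2.2; uncomment this line to enable it
LoadModule rewrite_module modules/mod_rewrite.so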

I restarted Apache and everything was working. Requests to http://apache/ get handled by Apache, and requests to http://tomcat/ get rewritten to http://tomcat:8080/ and then handled by Tomcat.

I then created exceptions to the rules, so that requests to http://tomcat/exception/* are handled by Apache and requests to http://apache/tomcat/* are handled by Tomcat.

I added the following to apache.conf:

RewriteEngine On
RewriteRule ^/tomcat/(.*) http://apache:8080/$1

and added to tomcat.conf (placed before the other RewriteRule):

RewriteRule ^/exception - [L]
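So the relevant part of tomcat.conf ends up looking like this; the order matters, since the exception has to come before the catch-all rule:

RewriteEngine On
# pass /exception/* through untouched so Apache serves it
RewriteRule ^/exception - [L]
# everything else gets sent to Tomcat on port 8080
RewriteRule ^/(.*) http://tomcat:8080/$1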

This is simple enough. But what if we didn’t want the port number displayed, or any other rewrite confusion?

The Apache httpd server is commonly used as a proxy in front of Tomcat or other servers. My friend said he’d heard about a module that handles this. There are in fact several approaches, some of which are out of date.

The original (to my knowledge) was mod_jserv, which is out of date. For a while, the two common methods (that I know of) were mod_jk and mod_proxy. The first ended up being faster because it used a binary protocol, AJP (Apache JServ Protocol), but was more difficult to configure. Another module, mod_jk2, was created with the goal of making that easier; it has since been deprecated in favor of the original mod_jk (currently at version 1.2.6).

A newer module, available only for Apache 2.2, is mod_proxy_ajp. It works as a proxy but also uses the AJP binary protocol, so it is faster and keeps connections open between Apache and Tomcat. This would possibly be the best of both worlds, but it is newer and doesn’t have some of the advanced features that mod_jk has. Load balancing and HTTPS are two areas that mod_jk apparently handles better than mod_proxy, and mod_jk also has some additional management and error reporting. Also, I think large request sizes (more than 8k) can be handled by mod_jk (but only up to 64k), but not by mod_proxy_ajp. This may have been fixed already.

Here are some links:

http://tomcat.apache.org/connectors-doc/

http://httpd.apache.org/docs/2.2/mod/mod_proxy_ajp.html

http://directwebremoting.org/blog/joe/2006/02/01/mod_jk_is_dead_long_live_mod_proxy_ajp.html

http://blogs.jboss.com/blog/mturk/2007/07/16/Comparing_mod_proxy_and_mod_jk.txt

http://www.mail-archive.com/users@tomcat.apache.org/msg03570.html

http://anilsaldhana.blogspot.com/2006/04/modjk-versus-modproxy.html

http://confluence.atlassian.com/display/DOC/Running+Confluence+behind+Apache

Here’s how I set it up to work on my localhost.

I created the following hosts:

127.0.0.1 localhost apache tomcat modjk modproxyajp

(okay, I actually set up apache.ae tomcat.ae modjk.ae modproxyajp.ae, but deleted the ‘.ae’ from this example for clarity.)

I downloaded mod_jk from here. The latest version is for Apache 2.2.4, but I haven’t seen a problem. I renamed it to mod_jk.so and copied it to my Apache modules directory.

C:\Apache\modules\mod_jk.so

Then I added the following to my httpd.conf (so that I can reference mod_jk outside the vhost):

LoadModule jk_module modules/mod_jk.so
JkWorkersFile "C:/Apache/conf/workers.properties"
JkLogFile "C:/Apache/logs/mod_jk.log"
JkLogLevel info
JkOptions +ForwardURICompatUnparsed

Next I created the workers.properties file referred to in my httpd.conf:

C:/Apache/conf/workers.properties

worker.list=worker1
worker.worker1.host=localhost
worker.worker1.port=8009
worker.worker1.type=ajp13

I then set up a vhost for modjk:

C:/Apache/conf/vhosts/modjk.conf

<VirtualHost 127.0.0.1:*>
ServerName modjk
DocumentRoot "C:/Tomcat/webapps/"
JkMount /* worker1
</VirtualHost>

The JkMount directive says that every URL (/*) is handled by the jk worker worker1. You can define multiple workers in workers.properties, each pointing at an AJP connector in a Tomcat server.xml (see the sketch below the connector). The AJP connector is included by default in my server.xml:

<Connector port="8009" enableLookups="false" redirectPort="8443" protocol="AJP/1.3" />
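For example, if there were a second Tomcat instance on another machine (the hostname here is hypothetical), workers.properties could define two workers, and different paths could then be JkMounted to each one:

worker.list=worker1,worker2
worker.worker1.type=ajp13
worker.worker1.host=localhost
worker.worker1.port=8009
# a second, hypothetical Tomcat instance on another host
worker.worker2.type=ajp13
worker.worker2.host=otherhost
worker.worker2.port=8009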

After restarting Apache, when I go to http://modjk I get the Tomcat homepage.

I also added the following to my original apache.conf so that I can reference mod_jk from a certain directory:

C:/Apache/conf/vhosts/apache.conf

<VirtualHost 127.0.0.1:*>

ServerName apache
DocumentRoot "C:/Apache/htdocs"

RewriteEngine On
RewriteRule ^/tomcat/(.*) http://apache:8080/$1

<Directory />
Options All
Order deny,allow
Allow from all
</Directory>

JkMount /modjk/* worker1

</VirtualHost>

When I go to http://apache/modjk/ it uses Tomcat, but I get a 404, because my path is messed up: mod_jk passes the /modjk/... URI through to Tomcat unchanged, and Tomcat has no webapp deployed at that context path. I either need to specify the document root for the <Location /modjk> or have Tomcat expect this path.

TODO: fix this.
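One approach that should work (I haven’t tried it yet): since mod_jk forwards the URI unchanged, mount a path that matches a context Tomcat actually has deployed, instead of /modjk/*. For example, using Tomcat’s default docs webapp:

# /docs/* on this vhost lines up with Tomcat's /docs context path
JkMount /docs/* worker1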

Then I tried out mod_proxy_ajp. It’s already included with Apache 2.2, so I didn’t need to download anything. I created the following vhost config file:

C:/Apache/conf/vhosts/modproxyajp.conf

<VirtualHost 127.0.0.1:*>
ServerName modproxyajp

DocumentRoot "C:/tomcat/webapps/"

ProxyRequests Off

<Proxy *>
Order deny,allow
Allow from all
</Proxy>

ProxyPass / ajp://localhost:8009/
ProxyPassReverse / ajp://localhost:8009/

<Location />
Order allow,deny
Allow from all
</Location>

</VirtualHost>

Notice the ProxyPass directive, which is kind of like a rewrite, but it also specifies the ajp:// protocol. If Tomcat were on a different server than Apache, you’d specify that hostname instead of localhost. Port 8009 is the default, and you’ll remember that it’s the port set for the Tomcat connector in server.xml.

Now I can restart Apache and go to http://modproxyajp/ and it serves Tomcat content (Tomcat’s HTTP connector is still on port 8080) through Apache (on port 80), using the same AJP binary protocol that mod_jk uses, but via the built-in proxy module.

I also edited my original apache.conf to be able to serve Tomcat via mod_proxy_ajp from that host as well.

<VirtualHost 127.0.0.1:*>

ServerName apache
DocumentRoot "C:/Apache/htdocs"

<Directory />
Options All
Order deny,allow
Allow from all
</Directory>

# use mod_rewrite to forward directly to tomcat on port 8080
RewriteEngine On
RewriteRule ^/tomcat/(.*) http://apache:8080/$1

# use mod_jk to forward to tomcat
JkMount /modjk/* worker1

# use mod_proxy_ajp to forward to tomcat
ProxyPass /modproxyajp ajp://localhost:8009/

ProxyPassReverse /modproxyajp ajp://localhost:8009/

</VirtualHost>
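One small thing I would probably tweak here (I haven’t hit a problem with it yet): keep the trailing slashes consistent on both arguments of ProxyPass and ProxyPassReverse, so that /modproxyajp/foo maps cleanly onto /foo and a path like /modproxyajpfoo can’t accidentally match:

# trailing slashes on both sides keep the path mapping clean
ProxyPass /modproxyajp/ ajp://localhost:8009/
ProxyPassReverse /modproxyajp/ ajp://localhost:8009/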

I restarted Apache once again and typed http://apache/modproxyajp/ into my browser.

I don’t know whether mod_jk or mod_proxy_ajp is better. My guess is it depends on your definition of “better.” mod_jk has the advantage of being older, and mod_proxy_ajp has the advantage of being built in. For simple uses, either appears fine.

I haven’t gotten into any details about load balancing, complex management, or session issues because I don’t know enough about them yet. If there’s demand, I might post a follow-up.

SiliconIndia

I got some spam the other day from someone asking to join their network on a social networking site. So I deleted it.

It wasn’t Facebook or MySpace or LinkedIn; it was a site called SiliconIndia. I got some more spam, and eventually curiosity got the better of me and I looked at the site. It looked like a real community with real users, so I created a profile.

I’m not from India, though I know some people who are (mostly co-workers). I don’t plan to do any outsourcing in India or take a job and move there. (But who knows, maybe someday I will?)

I just thought, there are technology people in India, I’m cool with meeting more technology people, maybe I could find some business contacts, or talk with people who share a passion for testing and open source tools.

Anyway, the point of the story is, I joined some testing groups, and then posted on a blog on that site and got to thinking, and the result was my previous post: Why open source businesses succeed.

It might just be wishful thinking and incoherence, but I thought I’d repost it here. I speculated that maybe someone read my blog, looked at my resume, or found http://one-shore.com. More likely it was a bot fishing for email addresses, but if not, and you’re a member of SiliconIndia, I’d like to hear from you.

Why open source businesses succeed

There have been a lot of spectacular open source business successes.  MySQL and JBoss come to mind as recent enormous payoffs through acquisition, but there are plenty of others that get less press.  Sun and Redhat themselves are huge open source drivers.

Just like anything else, I’m sure there are plenty of open source failures, but there seem to be a lot of successes these days.

SugarCRM, Alfresco, Pentaho…what makes a business choose to go open source?  There are a lot of reasons, including marketing, user base, license mandates (e.g., GPL based), belief system, etc. Ask any individual business and they can probably tell you.

But what makes an open source product into a successful business? I think there is, if not a universal answer, at least a common denominator, and it has nothing to do with shaggy hippies in Birkenstocks, even if the founders are shaggy hippies in Birkenstocks.

Actually, since a number of them are just that, or other varieties of anti- or asocial types, their business success is all the more remarkable. What I think is the critical factor has nothing to do with how they dress or what they believe.

The successful conversion of an open source project into a business happens because you already have a built-in team. The “team” is what makes the business, and the team is self-selected. That’s a huge advantage.

People don’t apply for a job to work on an open source project. They don’t do it for the money. First, they start to use it because they’re interested in the product. That’s market research that can’t be bought. Second, they stay and contribute because they like the team. So if you have an open source team that works well together, you’re way ahead.

How many jobs have you had where you liked everyone you worked with? How many would you still show up for if they stopped paying you? An open source team can answer yes to that. Everything else is just accounting.

Running a business vs. building a business to sell

Saturday morning I sat down and wrote a rambling user story with a bunch of aside comments for QA Site / Fluffy. Then I spent the rest of the weekend studying the Agile/Scrum/XP literature on user stories.

But in between, I called my first draft user story iteration 1, and did a little personal retrospective. I realized that I want to:

  • define a process
  • to build tools
  • to improve the process
  • of software development, testing, and project management
  • so I can develop better tools
  • to build software
  • so I can build software
  • so that I can start a business
  • selling software
  • so that I can work from anywhere in the world
  • developing software
  • and running a business
  • helping other people develop software
  • so I can get rich
  • and not have to work
  • so I can develop software
  • in whatever process I want to
  • wherever I want to

Then I realized that there are two points I missed.

The first is that I don’t want to just produce tools. There are lots of tools, and even if I can make better ones, I’m still not directly creating anything with them. I want to create something, not just something that can be used to create something.

The second is more important. I want to do all this so that I don’t have to work. I think. I’ve just worked myself into a corner that would leave me unfulfilled at a ripe old age when I could finally hope to enjoy the fruits of my labor.

The truth is that most people at that ripe old age wish they could still be working — doing something meaningful and productive.

And then I also realized that my real objection to working all this time is that I’m not creating, which is what I’d rather be doing. Of course I’d rather be doing less of it, but when I sail away over the horizon, I hope to be able, on most days (weather permitting), to be working on something.

Granted, I want to take a break to land that Marlin, go for a swim, and climb the volcano to toss a virgin or two in, but I want to be productive too. And I think I’ll be more productive than ever doing the other stuff I want to do as well.

Anyway, I guess that puts me more in the David Heinemeier Hansson camp than the Paul Graham camp.

And I think the old-fashioned way is not only the “honest” way to do it, but it’s more fulfilling. And you’ll create a better product if you’re doing what you want. Because a million monkeys might be able to type a lot more than you’d ever get done, but they’ll never produce a Shakespeare. If you don’t have a masterpiece in you, then by all means, go for the quick money.

But you’re probably better off (statistically) going back to college and hoping to make the NBA draft.

User Stories

I really like this formulation from Mike Cohn’s blog:

As a <type of user>, I want <some goal> so that <some reason>.
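For example (a made-up instance of my own): As a registered user, I want to reset my password by email so that I can get back into my account when I forget it.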

It’s a great way to formulate a requirement.  But I don’t like the name “user story” as applied to it.  Why?  Because it’s not a story.  It’s one plot point in a story.  I realize that the Agile $entity came up with (or at least popularized) the idea of user stories, but I’m a semantic hair splitter.

I don’t want to take something that’s already been named and give it a different, less widely accepted name, or take a widely accepted usage of a name and use it to describe something else. So what do I call the something else, the actual “story” that describes the user interaction, from which a number of requirements can be derived? I’m talking about the narrative that describes one user doing a series of tasks, not a formal use case.

The who/what/why of the “user story” described by Cohn is more of an individual requirement, but I think of a requirement as something different, maybe just something more formal. I don’t want needing to know the template to become a barrier to writing user stories.

A story should have characters, plot, and resolution.  Not roles, reasons, and acceptance criteria.  But if I call a story a story then what would I call a “user story”?

Here are some links about user stories:

http://blog.mountaingoatsoftware.com/?p=24

http://www.mountaingoatsoftware.com/article_view/27-advantages-of-user-stories-for-requirements

http://c2.com/cgi/wiki?UserStoryDiscussion

http://en.wikipedia.org/wiki/User_stories

http://www.extremeprogramming.org/rules/userstories.html

Flex component reference

I found this via an article on theserverside.com. It’s a Flex application that provides a reference to Flex components. Very handy. I’m working on testing tools for Flex and experimenting with the framework. I call this project “fluffy“. I was going to try to create a QA site dashboard in Flex, but I’m having trouble seeing the advantage over HTML. I think I’ll try a Scrum-style “task” board.

More on Canoo Web Test and other tools

My last post drew a hostile comment from Marc Guillemot, one of the committers to Canoo WebTest and HTMLUnit. I may have made some errors, but I am not aware of them. I think he may have been confused because I mentioned that HTMLUnit uses HTTPClient, and assumed I meant that HTTPClient has all the features of HTMLUnit.

On his blog I found a biased comparison of Canoo and Selenium that essentially backs up the points I was trying to make. It seems one of his chief frustrations is people not being aware of the two different ways web applications can be tested, which was the point of my last post. From what I can tell, his stance is that I’m an idiot, but that he agrees with me.

It’s nice to have company.

Through my own searching, I found out that there is in fact a tool that records WebTest scripts, and I hope to try it out soon.

On my current projects, I’m committed to Selenium, and as most of its fans know, it’s more than just a browser record & playback toy. As Marc and the Canoo company like to quote: “Record/Playback is the least cost-effective method of automating test cases.”

I don’t have the time or inclination to debate the one true testing tool, and I disagree that the answer is a complex browser stub, though I commend the Canoo team for their efforts. I have used Canoo and will undoubtedly use it again in the future.

The reality is that printing out the HTML is the least complex part of an application’s functionality, followed by querying the database. The user interface does in fact play a significant role, and there is often more complexity in the JavaScript presentation than in the remainder of the logic in most business applications (assuming network communications, transactions, queuing, etc. are abstracted away in frameworks, which tools like Canoo WebTest are no better at validating than browser-driving tools like Selenium and Watir).

There is a place for both types of tools, and I had hoped to have stated that clearly in my last post.

I also learned about some other interesting tools: Celerity, which uses HTMLUnit and JRuby and has Watir-like syntax; and CubicTest, an Eclipse plugin for writing Selenium and Watir tests.

Another interesting idea I found on some mailing list archive (can’t find the link) is to use Selenium IDE to generate  WebTest scripts.

two ways to automate web testing

There are two ways to automate web testing. The goal is to test the functionality of a web application. One way is to write automation that drives a browser. The second is to use a library that imitates a browser session and communicates directly with the server.

Tools that use the first method include open source applications such as Selenium, Watir, and Samie, and commercial products from HP/Mercury, IBM/Rational, and Borland/Segue. There are also two ways to drive the browser. The way used by Watir, Samie/Pamie, and presumably by the commercial applications, is to use the automation APIs provided by the browser, typically Microsoft COM (which means IE). Selenium uses a different method, driving the browser through JavaScript. It can use a proxy server to inject JavaScript code into the browser, or a JavaScript library can be included on the server.

The other method is to cut the browser out of the equation. Tools such as HTTPUnit, HTMLUnit, Canoo WebTest, WebDriver, and TestMaker use this method. A similar method is used by other tools including JMeter, LibWWW, and curl. The obvious disadvantage of this method is that it can miss browser compatibility issues.

The HTTPClient library (used by many of the above) is virtually a browser in its own right, with its own quirks. It has support for cookies, JavaScript, the DOM, and HTTPS. However, WebDriver doesn’t enable this by default.

Tools that use macros to drive a browser would fall into the first category. Tools that have an “expect”-like dialogue would fall into the latter.

I usually advocate the first method, because nothing beats having a real browser exercise your application.

The often overlooked advantage of the second method is that you can more easily run browserless tests as integration and smoke tests. Because it doesn’t have the overhead of the full browser, it is lightweight and client-independent. It runs faster and can more easily be run concurrently (which makes it useful for performance testing). Browser timing issues (and crashes) are less of a problem.

I’d actually recommend type 2 tools for testing links, page flow, content (text/images), and principal functionality. But if user interface testing is needed, I’d use type 1 tools. However, the truth is that UI and Ajax timing-related issues are much easier to find with manual testing. I’d guess 3/4 of what automation buys you can be done with a browserless tool.

The advantage that browser-driven testing buys you is recording tools. The penalty is the stability and speed of your tests.

Code Coverage tools

Some code coverage (unit test coverage) tools:

EMMA – open source

Cobertura – open source

Clover – Atlassian

Hansel & Gretel – only found info on an IBM developerworks article

Quilt – open source

Jester – open source

Jester takes a very interesting approach. It actually changes the code and then checks whether your tests catch the change. For instance, it might change if (x>y) to if (false).

All of the above target Java. What about other languages like Perl, PHP, Python, and Ruby? What about Smalltalk?

PHPCoverage – Spikesource

(here is an interesting list of tools for PHP)

rcov – for Ruby

http://www.semdesigns.com/products/testcoverage/PHPTestCoverage.html

Heckle is a Ruby version of Jester. Link here; another link to an article on Heckle.

Of course there’s controversy about the value of code coverage tools, but really it’s an issue of misusing them. They are useful, and they give people something to aim for. A more interesting idea is a “functionality” coverage tool, which would have to be built more manually. An interesting article mentions RSpec, but that’s not really what I meant, though it’s still an interesting idea.

A requirements coverage matrix shouldn’t be a crutch any more than a code coverage report, but the combination could be powerful.