Web Sites aren’t Made for Ampersands

Today I've had a little fun with ampersands. Those little buggers are nasty to get right in the URLs, HTML pages, and XML config files of a typical Tomcat web site. After doing it enough times today, I think the rules are pretty simple; they just need to be followed to the letter.

In URLs - Escape, Escape, Escape!

If you're in a JSP, or Java in general, the easiest thing for a URL is to use the URLEncoder (java.net.URLEncoder) that's in the standard JDK. It's possible to do the replacement by hand, but URLEncoder is so easy to use that there's really no reason to do it the hard way.

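  // URLEncoder is in java.net. The one-argument encode() is deprecated, so
  // name the charset explicitly (this form throws UnsupportedEncodingException).
  StringBuilder  vars = new StringBuilder();

  vars.append("report=").append(URLEncoder.encode(report, "UTF-8"));
  vars.append("&page=").append(URLEncoder.encode(page, "UTF-8"));
  vars.append("&name=").append(URLEncoder.encode(name, "UTF-8"));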

Only the values get encoded; the ampersands that separate the parameters stay as they are. It's so easy that there's no reason not to do it, yet a surprising number of developers forget this simple step.
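
To see what the encoder actually does with an ampersand, here is a small self-contained sketch. The class name AmpersandUrlDemo, the helpers enc and buildQuery, and the sample values are all made up for illustration; only URLEncoder itself comes from the JDK.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class AmpersandUrlDemo {
    // Hypothetical helper: encode one query-string value as UTF-8.
    static String enc(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every JVM, so this should never happen.
            throw new IllegalStateException(e);
        }
    }

    // Build the same three-parameter query string as above.
    // Only the values are encoded; the separators stay as plain '&'.
    static String buildQuery(String report, String page, String name) {
        StringBuilder vars = new StringBuilder();
        vars.append("report=").append(enc(report));
        vars.append("&page=").append(enc(page));
        vars.append("&name=").append(enc(name));
        return vars.toString();
    }

    public static void main(String[] args) {
        System.out.println(buildQuery("R&D", "1", "Tom & Jerry"));
        // prints report=R%26D&page=1&name=Tom+%26+Jerry
    }
}
```

The ampersands inside the values come out as %26 (and the spaces as +), so the receiving side can still tell where one parameter ends and the next begins.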

In HTML Pages Go Verbose

It's been said that &amp; is one of the most verbose HTML escape sequences (five characters to stand in for one), and I have to agree. It's a mess, but it has to exist because the ampersand is the character that starts every escape sequence. So it goes. In HTML, use it. It's just what you have to do.
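
A minimal sketch of the HTML side, assuming you're building the markup by hand in Java. The class AmpersandHtmlDemo and the helper escapeHtml are hypothetical names, not part of the JDK or Tomcat; a real project might prefer a library or JSTL for this.

```java
class AmpersandHtmlDemo {
    // Hypothetical helper: escape the characters that are special in HTML.
    // The ampersand must be replaced first, or the '&' in the other
    // escape sequences would get escaped a second time.
    static String escapeHtml(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;")
                   .replace("\"", "&quot;");
    }

    public static void main(String[] args) {
        // A URL whose value was already URL-encoded (%26), but whose
        // parameter separator is still a raw ampersand.
        String url = "report.jsp?report=R%26D&page=1";
        System.out.println("<a href=\"" + escapeHtml(url) + "\">"
                + escapeHtml("R&D") + "</a>");
        // prints <a href="report.jsp?report=R%26D&amp;page=1">R&amp;D</a>
    }
}
```

Note that the two rules stack: the value inside the URL is URL-encoded, and then the separator ampersand is HTML-escaped when the whole URL lands in the page.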

In XML Config Go Verbose Again

It makes a little bit of sense that HTML and XML use the same escape sequence for the ampersand (&amp;), but as with other things, I would not have been surprised if the two markups had turned out to be different. What did surprise me is that the URL escape code (%26) doesn't stand in for an ampersand in the XML config files: the XML parser doesn't decode it, so it stays three literal characters. Then again, I guess it's exclusively for URLs.
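
Here is a quick way to check that behavior with the XML parser that ships with the JDK. This is only a sketch: the param element, its value attribute, and the class AmpersandXmlDemo are invented for the example and are not from any real Tomcat config file.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class AmpersandXmlDemo {
    // Parse a one-element XML snippet and return its "value" attribute
    // the way an application reading the config would see it.
    static String readValue(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(
                            xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement().getAttribute("value");
        } catch (Exception e) {
            // A raw, unescaped '&' ends up here: the XML is not well-formed.
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // &amp; is decoded by the parser into a real ampersand.
        System.out.println(readValue("<param value=\"Tom &amp; Jerry\"/>"));
        // prints Tom & Jerry

        // %26 means nothing to an XML parser, so it is passed through as-is.
        System.out.println(readValue("<param value=\"Tom %26 Jerry\"/>"));
        // prints Tom %26 Jerry
    }
}
```

Passing a raw ampersand, as in value="Tom & Jerry", makes the parser reject the document, which is why the verbose form is the only one that works here.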

That's what I learned today. It doesn't sound like much, but it was a pain to pin down.