URL.hashCode() Considered Harmful

I just cut HtmlUnit’s build time by about 20% by changing four lines of code. How? HtmlUnit keeps a small cache of web requests in a HashMap, keyed on the request URL. The problem with this is twofold:

  1. The URL.hashCode() method is synchronized.
  2. The URL.hashCode() method triggers DNS lookups for the URL hosts.

The impact of item 2 was magnified by the fact that some of the HtmlUnit unit tests use a mock web connection to connect to fake URLs. DNS (non)resolution of these fake URLs took an especially long time.

The fix was to key the map entries on the value of URL.toString() instead. Apparently I’m not the first person to stumble across this problem. So think twice before coding your next HashMap<URL, XXX> ;-)
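As a rough illustration of the fix described above, here is a minimal sketch of a cache keyed on URL.toString() rather than on the URL itself. The class name UrlKeyedCache, its methods and the fake host name are hypothetical, made up for this example; this is not HtmlUnit's actual code.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.util.HashMap;
import java.util.Map;

// Hypothetical cache, for illustration only (not HtmlUnit's real class).
public class UrlKeyedCache {
    // Keys are plain Strings, so hashing and equality checks are ordinary
    // String operations and never go through URL.hashCode() or URL.equals().
    private final Map<String, String> responses = new HashMap<>();

    public void put(URL url, String response) {
        responses.put(url.toString(), response);
    }

    public String get(URL url) {
        return responses.get(url.toString());
    }

    public static void main(String[] args) throws MalformedURLException {
        UrlKeyedCache cache = new UrlKeyedCache();
        // A fake host, similar in spirit to the mock URLs used in unit tests.
        URL fake = URI.create("http://no-such-host.invalid/index.html").toURL();
        cache.put(fake, "<html>mock</html>");
        URL sameAddress = URI.create("http://no-such-host.invalid/index.html").toURL();
        System.out.println(cache.get(sameAddress));
    }
}
```

One trade-off to keep in mind: URL.equals() treats two host names that resolve to the same IP address as equivalent, whereas string keys only match when the text of the URL is identical. For a request cache that is usually what you want anyway.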

HtmlUnit 2.1 Released

The HtmlUnit team is pleased to announce a new release of HtmlUnit. This latest version includes a number of bug fixes and performance enhancements, and sports excellent support for GWT, jQuery and Sarissa, decent support for Prototype and Dojo, and basic support for YUI. Please see the changelog for more details.

In related news, we’ve (temporarily) forked the Rhino JavaScript engine in order to add browser-compatible JavaScript behavior, which is slowly making its way into the Rhino project proper. The most important of these changes (so far) is definition-order property iteration. All of this should be available in the next version; many thanks to Marc Guillemot for his work in this area.

Anyway, give it a whirl and let us know what you think!

Thomas Paine on Software Design

I draw my idea of the form of government software from a principle in nature which no art can overturn, viz. that the more simple any thing is, the less liable it is to be disordered, and the easier repaired when disordered;…

Thomas Paine, Common Sense (1776)
