URL.hashCode() Considered Harmful

I just cut HtmlUnit’s build time by about 20% by changing four lines of code. How? HtmlUnit keeps a small cache of web requests in a HashMap, keyed on the request URL. The problem with this is twofold:

  1. The URL.hashCode() method is synchronized.
  2. The URL.hashCode() method triggers DNS lookups for the URL hosts.

The impact of item 2 was magnified by the fact that some of the HtmlUnit unit tests use a mock web connection to connect to fake URLs. DNS (non)resolution of these fake URLs took an especially long time.

The fix was to key the map entries on the value of URL.toString() instead. Apparently I’m not the first person to stumble across this problem. So think twice before coding your next HashMap<URL, XXX> ;-)
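Here is a minimal sketch of the idea, not HtmlUnit's actual code. The class name StringKeyedUrlCache and its methods are made up for illustration. The only point is that the map key is the String returned by URL.toString(), so hashing and equality checks never touch the resolver.

```java
import java.net.URL;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical cache keyed on the URL's string form rather than on the URL itself. */
final class StringKeyedUrlCache<V> {
    private final Map<String, V> entries = new HashMap<>();

    /** Stores a value under url.toString(); String.hashCode() involves no DNS lookup. */
    void put(URL url, V value) {
        entries.put(url.toString(), value);
    }

    /** Looks a value up by the same string form; returns null when absent. */
    V get(URL url) {
        return entries.get(url.toString());
    }

    int size() {
        return entries.size();
    }
}
```

One trade-off to keep in mind: string keys compare URLs purely by their text, so two spellings of the same address count as different entries. For a request cache that is usually what you want anyway.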

HtmlUnit 2.1 Released

The HtmlUnit team is pleased to announce a new release of HtmlUnit. This latest version includes a number of bug fixes and performance enhancements, and sports excellent support for GWT, jQuery and Sarissa, decent support for Prototype and Dojo, and basic support for YUI. Please see the changelog for more details.

In related news, we’ve (temporarily) forked the Rhino JavaScript engine in order to add browser-compatible JavaScript behavior which is slowly making its way into the Rhino project proper. The most important of these changes (so far) is definition-order property iteration. All of this should be available in the next version; many thanks to Marc Guillemot for his work in this area.

Anyway, give it a whirl and let us know what you think!

Thomas Paine on Software Design

I draw my idea of the form of government software from a principle in nature which no art can overturn, viz. that the more simple any thing is, the less liable it is to be disordered, and the easier repaired when disordered;…

Thomas Paine, Common Sense (1776)

Hibernate: Trouble in Paradise

I’ve written before about the problems which seem to crop up when you introduce Hibernate into your project. Unfortunately, I’m becoming more and more convinced that these are not isolated issues. Hibernate has a dependency management problem.

Things were not always this way. Hibernate was once independent, unbeholden to outside influences. Sure, you’ve always needed a somewhat-thicker-than-average skin in order to file bugs or ask questions in the forums. An extra helping of patience never hurt, either. But nowadays it seems like you need all of these things, plus the time and skill necessary to patch a JAR or two.

We recently upgraded to the latest Hibernate production JARs: Hibernate Core 3.2.6.GA, Hibernate Annotations 3.3.0.GA and Hibernate EntityManager 3.3.1.GA. Having already benefited from our earlier experience regarding conflicts between Spring and Hibernate, I figured this would take about five minutes: modify our root Maven2 POM, do a full build, verify everything works. Done! Not quite.

The first problem we encountered, reported 4 months ago, is that the production Hibernate EntityManager POM excludes a transitive dependency which it should not exclude. Net result? You have to hack together a custom version of Hibernate EntityManager which does not exclude this transitive dependency. Of course, this transitive dependency exists in the JBoss Maven2 repo, but not in the central repo. Nice.

Next, if you’re using a standard Maven2 Windows installation, you’ll run into this bug, because Hibernate now refuses to load JARs from directories with spaces in them (the standard location for Maven2 local repositories in Windows is in “Documents and Settings”). Very nice. Welcome to 1998!

You may notice a trend here: an old ASM dependency which causes conflicts with other libraries, a required dependency that is mistakenly excluded, and a dependency on a deprecated class in a buggy JBoss utility library.

Two of my co-workers are already suggesting we switch to TopLink. Jokingly, of course. For now. Napoleon is reputed to have said that “if they want peace, nations should avoid the pin-pricks which precede cannon shots.” Third-party libraries should likewise avoid annoying their users with irritating minutiae, or they may find these users mobilizing the artillery.

Space vs Time

A long, long time ago I took a college course, the title of which was Languages and Translation. The content of this sophomore-level course? A smörgåsbord of systems programming, heaps and stacks, pointers, *nix system calls, compilers, lex and yacc, grammars, lexical and semantic analysis, code optimization, and data representation — all taught and learned in C. While learning C. Oh, and Lisp.

This course made quite an impression, mainly because of my initial inexperience. I should explain that the path which led me to L&T went something like this:

  1. I’m a senior in high school. I’m applying to college. I need to choose a major. Hmm… writing that Blackjack game on my TI calculator was pretty cool. It was a whole 50 lines of code! Plus I’m in that typing class, closing in on 20 words per minute. Maybe I’ll try Computer Science!
  2. I’m a freshman at Georgia Tech. First semester. I’m taking Intro to Computing. Man, this HTML nonsense sure uses a lot of brackets! And this Microsoft Access program is impossible to use!
  3. Still a freshman at Tech. Second semester. This Intro to Programming class is pretty crazy! We’re using Java for the assignments. The TA mentioned in passing that there are no pointers in Java, but I have no idea what a pointer is, so I couldn’t care less. I’m beginning to grok object-oriented programming.
  4. Welcome to Languages and Translation! Malloc, Malloc, Malloc! Realloc, Calloc, Malloc! Bwahahahahahaha!

Psychological damage aside, this was a great class. Jim Greenlee, who taught the course, was both an evil bastard and a great teacher. One of the tenets of code optimization which he often highlighted was “space versus time,” the idea that you can often optimize for one at the expense of the other, but rarely for both at the same time.

For example, a compiler can decide to inline a short function in order to avoid time-consuming stack allocations, but the compiled program will be larger (less time, but more space). Of course, if space (memory) is at a premium, your compiler might instead try to recognize common code sequences and hoist them into artificial functions (less space, but more time).

Flash forward 8 years. We have a client/server application at work which uses DTOs to transfer data to and from the client, and we use Hibernate on the server to persist our BOs. A specific server call, invoked in the presence of a large amount of data, brings the application to its knees.

Immediately we jumped to conclusions — Arrgh! Hibernate is such a hog! If you don’t code things perfectly, you can’t scale! And sometimes not even then! A quick profiling session confirmed our fears. Three hours later we had bypassed Hibernate in this specific instance, coding to the JDBC API instead. Unfortunately, this wasn’t the last of our performance problems. A second profiling session indicated that we had another bottleneck in our DTO-to-BO conversion routines!

Now, something which must be understood about Hibernate’s collection semantics is that when you use Hibernate to load BO A, which has an X-to-N relationship with BO B, you should (usually) use the collection of Bs provided by Hibernate. For example, if you use Hibernate to load a UserGroup, and you want to modify the list of Users associated with said UserGroup, you should modify the existing list of Users. You should not create a new list, add Users to it, and then give the UserGroup the new list of Users. Why? Because creating a new list results in a one-shot delete, followed by N insert statements. This is usually not desired.

However, a naive approach to modifying the collection provided by Hibernate (clearing it and then adding the BOs which you know you want) is just as bad, because a call to collection.clear() also results in a one-shot delete. The best approach is one of minimal modification to the existing collection.

In the case of DTO-to-BO conversion, where the DTO representation of an object is being transferred to the corresponding BO representation, this means adding items to the BO’s collection that are in the DTO’s collection but not in the BO’s collection, and removing items from the BO’s collection that are in the BO’s collection but not in the DTO’s collection. Elements that are in both collections are simply ignored.

The obvious implementation of this algorithm looks something like this:

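// Add a BO for every DTO that has no matching BO yet.
for(DTO dto : dtos) {
 if(!contains(bos, dto)) add(bos, dto);
}

// Remove every BO that no longer has a matching DTO. An explicit
// iterator is used because removing from bos inside a for-each loop
// over bos would throw a ConcurrentModificationException.
for(Iterator<BO> i = bos.iterator(); i.hasNext();) {
 if(!contains(dtos, i.next())) i.remove();
}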

Unfortunately, when using lists, the contains() calls above hide a nested loop, resulting in O(n²) performance. Once n gets into the thousands, things start to get sluuuuggish. Veeeeery sluuuuggish.

The solution? Trade a little space for a lot of time! By constructing HashMaps which contain all of the BOs in the lists, keyed on business keys which uniquely identify the BOs, the contains() calls above can be performed in constant time by invoking map.containsKey(). The result is O(n) performance. Much better!
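A rough sketch of the map-based version follows. It is not our production code: the class CollectionSync, the sync method and the three function parameters (boKey, dtoKey, toBo) are hypothetical names, and the business key is assumed to have sane equals and hashCode implementations. The existing list is modified in place, so elements present on both sides are never touched.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical helper: minimal-modification sync of a BO list against a DTO list. */
final class CollectionSync {

    static <K, B, D> void sync(List<B> bos, List<D> dtos,
                               Function<B, K> boKey,
                               Function<D, K> dtoKey,
                               Function<D, B> toBo) {
        // Index the DTOs by business key (extra space, constant-time lookups).
        Map<K, D> dtosByKey = new LinkedHashMap<>();
        for (D dto : dtos) {
            dtosByKey.put(dtoKey.apply(dto), dto);
        }

        // Remember which keys the BO list already holds before changing it.
        Set<K> boKeys = new HashSet<>();
        for (B bo : bos) {
            boKeys.add(boKey.apply(bo));
        }

        // Remove BOs that have no matching DTO.
        bos.removeIf(bo -> !dtosByKey.containsKey(boKey.apply(bo)));

        // Add BOs for DTOs that had no matching BO.
        for (Map.Entry<K, D> entry : dtosByKey.entrySet()) {
            if (!boKeys.contains(entry.getKey())) {
                bos.add(toBo.apply(entry.getValue()));
            }
        }
    }
}
```

The lookups are now hash-based, so each pass is linear. How cheap the removals themselves are still depends on the list implementation; on a plain ArrayList, removeIf does its work in a single pass.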

Java Remoting: Protocol Benchmarks

I’ve been analyzing Java remoting protocols at work over the past couple of days, and thought I’d share some of the results here. Specifically, I’m going to share the results of our protocol benchmarking.

Every application will have different requirements in this area, but most criteria will include performance. At the very least, you will want to know how much (if any) performance you are sacrificing for the sake of your other requirements.

The Protocols

The protocols under consideration are Java’s RMI/JRMP, Oracle’s ORMI (with and without HTTP tunneling enabled), Spring’s HttpInvoker, Caucho’s Hessian and Burlap, and three flavors of Apache XML-RPC (Sun-based, HttpClient-based and Lite-based).

RMI/JRMP is Java’s default binary remoting protocol, and its big advantage is that it supports full object serialization of Serializable and Externalizable objects. RMI/JRMP has historically not been as easy to use as some of the other options, but if you’re using Spring (as we are), it’s as easy to use as anything else (except Apache XML-RPC; see below).

ORMI is OC4J’s EJB remoting protocol, which we’re currently using in production. This protocol was originally developed as part of the Orion application server, whose codebase formed the basis for OC4J. The only differentiator between ORMI and RMI/JRMP is likely to be performance. Of course, this protocol probably won’t be an option unless your serverside component is deployed on OC4J. ORMI supports tunneling via HTTP and HTTPS, should you run into firewall or proxy problems.

Spring’s HttpInvoker is another Java-to-Java binary remoting protocol. It’s also very easy to use and supports full object serialization of Serializable and Externalizable objects. The big difference between HttpInvoker and RMI/JRMP is that HttpInvoker transports data via HTTP, which makes it easier to use when you start running into proxy and firewall issues. Unfortunately, HttpInvoker isn’t an option if you’re not using Spring on both the server and the client.

Caucho’s Hessian protocol is a slim, binary cross-platform remoting protocol. Most cross-platform protocols are XML-based, and thus sacrifice a significant amount of performance in order to achieve interoperability. Hessian’s draw is that it achieves cross-platform interoperability with minimal performance degradation. However, Hessian uses a custom reflection-based serialization mechanism which can be troublesome in certain scenarios (think Hibernate proxies).

Currently in a draft state, Hessian 2 is the second incarnation of the Hessian protocol. You probably don’t want to use it in your production applications quite yet, but it looks very promising. Of course, it’s so easy to switch between Hessian and Hessian 2 that you might want to have a peek just for the heck of it [1].

As far as I can tell, Caucho’s Burlap is essentially Hessian-in-XML. As such, Burlap is not a binary protocol and should be expected to suffer accordingly in the performance department. This being the case, I haven’t been able to figure out exactly why anyone would choose to use Burlap instead of Hessian… aside from a requirement from the marketing department that your RPC be buzzword compliant.

Apache XML-RPC is Apache’s implementation of XML-RPC, an XML-based RPC specification which focuses on interoperability. Apache XML-RPC may be a bit more difficult to set up than the other options if you’re using Spring. The infrastructure for the other protocols is built into the Spring framework, but you’ll have to write the Apache XML-RPC / Spring integration yourself.

Apache XML-RPC supports a number of transport factories under the covers. The three transport factories tested were the default transport factory (based on Sun’s HttpURLConnection), the HttpClient transport factory (based on HttpClient) and the Lite transport factory (based on an Apache XML-RPC minimal HTTP client implementation).

The Configuration

The test consisted of remote invocations to a single method which takes an integer size parameter and returns a list of the specified size. The objects in the returned list each contained 10 instance variables (4 strings of about 40 characters each, 2 dates, 1 long, 1 integer, 1 double and 1 float, none of which were null). The lists returned were static, so there was no list construction overhead.
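For the curious, the remote interface and payload might have looked roughly like the sketch below. The names (BenchmarkItem, BenchmarkService, StaticBenchmarkService) and the field values are invented for illustration; only the shape (10 non-null fields, prebuilt lists) comes from the description above.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Date;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical payload: 4 strings, 2 dates, 1 long, 1 int, 1 double, 1 float. */
final class BenchmarkItem implements Serializable {
    private static final long serialVersionUID = 1L;

    String s1, s2, s3, s4;
    Date created = new Date(), modified = new Date();
    long id = 1L;
    int count = 2;
    double weight = 3.0;
    float ratio = 4.0f;

    BenchmarkItem() {
        char[] chars = new char[40];           // strings of about 40 characters
        Arrays.fill(chars, 'x');
        s1 = s2 = s3 = s4 = new String(chars);
    }
}

/** Hypothetical remote interface: one method, one integer size parameter. */
interface BenchmarkService {
    List<BenchmarkItem> getItems(int size);
}

/** Builds every list once up front, so invocations pay no construction cost. */
final class StaticBenchmarkService implements BenchmarkService {
    private final Map<Integer, List<BenchmarkItem>> lists = new HashMap<>();

    StaticBenchmarkService(int... sizes) {
        for (int size : sizes) {
            List<BenchmarkItem> items = new ArrayList<>(size);
            for (int i = 0; i < size; i++) {
                items.add(new BenchmarkItem());
            }
            lists.put(size, Collections.unmodifiableList(items));
        }
    }

    @Override
    public List<BenchmarkItem> getItems(int size) {
        return lists.get(size);                // null for sizes that were not prebuilt
    }
}
```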

The results of the first 10 invocations were thrown away, in order to allow the VMs to “warm up.” The next 100 invocations were tallied and averaged.
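The measurement loop is simple enough to sketch. This is an illustration of the warm-up-then-average scheme, not the harness we actually ran; InvocationBenchmark and averageMillis are made-up names, and the Runnable stands in for a single remote call.

```java
/** Hypothetical harness: discard warm-up runs, then average the measured runs. */
final class InvocationBenchmark {

    /** Returns the mean duration, in milliseconds, of the measured invocations. */
    static double averageMillis(Runnable invocation, int warmUpRuns, int measuredRuns) {
        // Warm-up invocations: executed but not timed.
        for (int i = 0; i < warmUpRuns; i++) {
            invocation.run();
        }

        // Measured invocations: time each one and accumulate.
        long totalNanos = 0;
        for (int i = 0; i < measuredRuns; i++) {
            long start = System.nanoTime();
            invocation.run();
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / (double) measuredRuns / 1_000_000.0;
    }
}
```

With the numbers above, a call would look like averageMillis(call, 10, 100).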

The tests were performed on a Windows XP SP2 machine with 2 GB of RAM and a 2 GHz Core 2 Duo CPU, using Sun’s JVM 1.5.0_09. The client and server were run in two separate VMs on the same host, in order to avoid spurious networking artifacts [2].

The following library versions were used: Spring 2.0.6, Jetty 6.1.5, OC4J 10.1.3.2.0, slf4j 1.4.3, log4j 1.2.14, Apache XML-RPC 3.1, Commons HttpClient 3.1, and Caucho Hessian 3.1.3.

Spring was used on both the clientside and on the serverside in order to hide the remoting details from the interface consumer and implementor. All serverside components were run inside a single Jetty servlet container, except for the ORMI serverside testing components which were run in OC4J.

Vendor extensions and GZIP requesting were enabled for all Apache XML-RPC tests. Streaming and GZIP compressing were enabled for all Apache XML-RPC tests except the Lite tests (the Lite transport factory does not support these options). These configurations represent the best performance optimizations available to the various Apache XML-RPC flavors.

The Results

The following graph illustrates the response times of the various protocols for invocations returning smaller lists:

[Graph: protocol response times, smaller lists (protocol-benchmark-smaller-lists-3.png)]

The following graph illustrates the response times of the various protocols for invocations returning larger lists:

[Graph: protocol response times, larger lists (protocol-benchmark-larger-lists-3.png)]

Conclusion

It’s important to note that no general conclusions can be derived from the absolute numbers represented above. Rather, the numbers must be examined relative to each other.

That said, the following observations may be made:

  1. The binary protocols (RMI, ORMI, HttpInvoker and Hessian) are always faster than the XML-based protocols (Burlap and the Apache XML-RPC variants) — except for ORMI with HTTP tunneling enabled.
  2. Performance is pretty even amongst the binary protocols — except for Hessian, which performs well only when compared to the XML-based protocols, and ORMI with HTTP tunneling enabled, which performs on a par with the XML-RPC variants.
  3. Burlap has much better performance than the XML-RPC variants.
  4. Native RMI and Hessian 2 have the best performance until the remote method invocations start returning larger lists, at which point vanilla ORMI takes a slight lead.
  5. Changing Apache XML-RPC’s transport factory does not seem to have a very large effect on performance.
  6. It’s amazing how fast standard ORMI is, compared to ORMI with HTTP tunneling enabled.
  7. Hessian 2 bears watching!

Other Interesting Links

JBoss Remoting Framework Benchmarks: Tom Elrod’s benchmark of the JBoss Remoting framework. Includes part of the Spring Remoting framework.

Nominet’s Protocol Benchmarks: Interesting benchmark of many of the protocols here considered. Intriguingly, HttpInvoker is found to have consistently better performance than RMI/JRMP, a finding which contradicts our results.

Footnotes

[1] If you’re using Spring’s Hessian integration and the latest Hessian JARs, moving to Hessian 2 is as easy as adding the following property to your HessianProxyFactoryBean:

    <property name="proxyFactory">
        <bean class="com.caucho.hessian.client.HessianProxyFactory">
            <property name="hessian2Request" value="true" />
            <property name="hessian2Reply" value="true" />
        </bean>
    </property>

[2] The tests were later run on two separate machines over a controlled network, in order to verify that the local tests are representative; they are.

Embedding Jetty 6 in Spring

There’s not much info out there on embedding a Jetty server in Spring, and much of what does exist is for Jetty 5 (not Jetty 6) or uses helper classes that nobody wants to write. However, there is a decent example hidden in the Jetty Subversion repository (which requires an exploded WAR directory structure), and the Jetty documentation is pretty nice. The API docs are useful, as well.

I had two goals: embed Jetty 6 (not 5), and do so without having to rearrange my directories into an exploded WAR structure. After a bit of tinkering, this is the file I ended up with. It contains 5 servlets and their mappings, just so that you can get a feel for the file, but you can certainly use it with any number of servlets.

Note that I had to change the file extension from .xml to .txt in order to appease my blogging software… Note also that this setup does not require the exploded WAR directory structure. The parts you’ll probably want to customize are the servlets list, the servletMappings list, the resourceBase property, and possibly the contextPath property.

Creating a DTO DSL With ANTLR 3

We recently began a new development cycle at my day job. The project in question is a JEE client/server application which uses DTOs to transfer information between the client and the server. One of the lessons learned from the previous cycle was that developers tend to want to generalize and reuse code, including DTO code.

Code reuse is generally a good thing. It’s one of a handful of principles which can more generally be summarized as “lazy programmers are good programmers” [1]. But in our specific scenario, DTOs were being extended and reused to such a degree that we were ending up with a near 1:1 correspondence between DTOs and the BOs from which they were derived.

As a result, we ended up with two problems. First, because the DTOs were no longer owned exclusively by a specific use case, screen or scenario, it became much harder to modify them. Developers who were responsible for use cases whose requirements changed no longer had the luxury of a quick DTO change. They had to get in touch with all the other consumers of the affected DTOs and come to some sort of consensus prior to implementing the change.

The second problem was a bit more insidious. Can you guess what it was? A hint: We’re using Hibernate to map our BOs to a database. Another hint: One of the biggest factors in making ORM scalable (and even feasible) is lazy loading.

You can probably guess what our problem was at this point. By developing these Swiss Army knife DTOs which essentially replicated the BO object graph, we forfeited all lazy loading advantages. In fact, because we continued to use lazy loading, we were actually getting worse performance than we could have achieved by enabling eager loading throughout the entire BOM!

The solution is a strict ban of (most) code reuse in DTOs. Unfortunately, that means a lot of replicated code. Even more unfortunately, this replicated code is boilerplate snooze-code that you couldn’t pay an intern to write, let alone real ;-) programmers. Even more double-plus unfortunately, our system is implemented in Java, which is nearly as verbose as my wife [2].

So what’s the solution to the solution? Well, if Groovy supported bound properties, it might be an option [3]. But it doesn’t. JRuby, focusing as it does on the “Ruby” side of the equation and not so much on the “J” side of things, isn’t an option either. There we stood, lost in a sea of problematic solutions and questionable answers, when Martin Fowler and Neal Ford came to me in a vision and told me to write my own DSL! Not really, but I did decide to write my own DSL.

The DSL is just an ANTLR grammar which allows a programmer to specify the core pieces of information regarding a DTO in a format which Java programmers will find vaguely familiar. Our DTOs require four or five artifacts per property: a constant for the property name (to be used during property binding on the client), an instance variable, a setter, a getter and possibly inclusion in the hashCode, equals and toString methods (depending on whether or not the property is part of the entity’s business key).

During testing, I found that a large (but not uncommonly so) DTO with 30 properties took a little over 1000 lines of Java code. A lot of that was comments, but meaningful comments are required everywhere in this project. That same DTO could be written in 62 lines of code using the DTO DSL: 2 for the class itself and 2 per property. Here’s an example of the DSL:

sorting.ReceptacleDto {
   /** The receptacle number. */
   BIZ int number;
   /** The name of the receptacle type. */
   String type;
   /** The name of the receptacle status. */
   String status;
   /** The receptacle's gross weight. */
   Quantity grossWeight;
   /** The receptacle's destination. */
   String destination;
}

The first line identifies the DTO’s package and class name. Each line thereafter is part of a two-line comment/property combination. The comment is used to generate meaningful JavaDoc comments for the property name constants, instance variables, setters and getters. Each property declaration identifies the property type and name. The property declarations may also include an optional BIZ keyword which flags the property as being part of the entity’s business key (and therefore part of the equals, hashCode and toString implementations).
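To give a feel for the expansion, here is roughly the kind of Java that two of those properties (number and type) could turn into. This is a hand-written approximation, not the generator's real output: the constant names, the use of java.beans.PropertyChangeSupport for the bound properties, and the exact equals/hashCode/toString bodies are all assumptions.

```java
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;

/** Hand-written approximation of a generated DTO (two properties only). */
final class ReceptacleDto {
    /** Name of the property holding the receptacle number. */
    public static final String PROPERTY_NUMBER = "number";
    /** Name of the property holding the name of the receptacle type. */
    public static final String PROPERTY_TYPE = "type";

    private final PropertyChangeSupport changes = new PropertyChangeSupport(this);

    /** The receptacle number (part of the business key). */
    private int number;
    /** The name of the receptacle type. */
    private String type;

    public int getNumber() { return number; }

    /** Bound setter: fires a property change event when the value changes. */
    public void setNumber(int number) {
        int old = this.number;
        this.number = number;
        changes.firePropertyChange(PROPERTY_NUMBER, old, number);
    }

    public String getType() { return type; }

    public void setType(String type) {
        String old = this.type;
        this.type = type;
        changes.firePropertyChange(PROPERTY_TYPE, old, type);
    }

    public void addPropertyChangeListener(PropertyChangeListener listener) {
        changes.addPropertyChangeListener(listener);
    }

    // Only BIZ properties take part in equals, hashCode and toString.
    @Override public boolean equals(Object other) {
        return other instanceof ReceptacleDto && ((ReceptacleDto) other).number == number;
    }

    @Override public int hashCode() { return Integer.hashCode(number); }

    @Override public String toString() { return "ReceptacleDto[number=" + number + "]"; }
}
```

Multiply that by 30 properties, add the JavaDoc, and the 1000-line figure stops looking surprising.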

I should note that this is explicitly not a round-trip solution. Developers write their DTO using the DSL, generate the Java code, and from that point on maintain the Java code manually. Further, no type checks are performed — in the example above, the grammar has no idea whether or not String is a valid type. It just generates the Java syntax, devoid of semantic checks, and the developer can use an IDE to automatically organize the required imports.

The ANTLR grammar file [4] which parses the DSL is only 340 lines of code, 60 lines of which are lexer and parser rules. Play with it, modify it, change it to fit your needs. Enjoy!

[1] Also: Don’t Repeat Yourself; Premature optimization is the root of all evil; etc.

[2] I kid, I kid!

[3] Bound properties (i.e. setters trigger property change events) are required because we bind the DTOs directly to the view via the JGoodies Binding library.

[4] The original filename is Dto.g, but my blogging software considers grammar files to be dangerous beasts, so I had to disguise it as a text file.

SQL Injection Illustrated

Heh.

HtmlUnit in the Wild: New Features

I’ve been using HtmlUnit to crawl the web for the past couple of weeks. This interesting experience has led to two new features:

First, I’ve added an insecure SSL handler which trusts anyone and everyone. Why? Because websites often have misconfigured or expired SSL certificates, and the standard Java behavior is to throw a bunch of exceptions when this happens. Not very nice. So now you can call WebClient.setUseInsecureSSL(true) instead and continue crawling, happily oblivious to the webmaster’s incompetence.
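For anyone wondering what "trusts anyone and everyone" means in plain JSSE terms, the sketch below shows the general technique. It is not HtmlUnit's implementation; InsecureSsl, TRUST_ALL and newContext are names I made up. It turns certificate validation off entirely, so it is only appropriate for things like crawling, never for anything that handles sensitive data.

```java
import java.security.GeneralSecurityException;
import java.security.cert.X509Certificate;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManager;
import javax.net.ssl.X509TrustManager;

/** Hypothetical helper that builds an SSLContext which accepts every certificate. */
final class InsecureSsl {

    /** Trust manager whose checks never fail: expired or self-signed, all accepted. */
    static final X509TrustManager TRUST_ALL = new X509TrustManager() {
        @Override
        public void checkClientTrusted(X509Certificate[] chain, String authType) {
            // Intentionally empty: no validation is performed.
        }

        @Override
        public void checkServerTrusted(X509Certificate[] chain, String authType) {
            // Intentionally empty: no validation is performed.
        }

        @Override
        public X509Certificate[] getAcceptedIssuers() {
            return new X509Certificate[0];
        }
    };

    /** Creates a TLS context wired to the trust-everything manager. */
    static SSLContext newContext() throws GeneralSecurityException {
        SSLContext context = SSLContext.getInstance("TLS");
        context.init(null, new TrustManager[] { TRUST_ALL }, null);
        return context;
    }
}
```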

Second, I’ve added a popup blocker. Lots of sites send a bunch of popups your way, and even though they’re not quite as annoying when you’re using a headless browser like HtmlUnit, they still waste time and bandwidth. So now you can call WebClient.setPopupBlockerEnabled(true), and your crawler will be that much faster.

These features will be available in HtmlUnit 1.14, or you can just grab the latest snapshot build here. Enjoy!
