Creating a DTO DSL With ANTLR 3

We recently began a new development cycle at my day job. The project in question is a JEE client/server application which uses DTOs to transfer information between the client and the server. One of the lessons learned from the previous cycle was that developers tend to want to generalize and reuse code, including DTO code.

Code reuse is generally a good thing. It’s one of a handful of principles which can more generally be summarized as “lazy programmers are good programmers” [1]. But in our specific scenario, DTOs were being extended and reused to such a degree that we were ending up with a near 1:1 correspondence between DTOs and the BOs from which they were derived.

As a result, we ended up with two problems. First, because the DTOs were no longer owned exclusively by a specific use case, screen or scenario, it became much harder to modify them. Developers who were responsible for use cases whose requirements changed no longer had the luxury of a quick DTO change. They had to get in touch with all the other consumers of the affected DTOs and come to some sort of consensus prior to implementing the change.

The second problem was a bit more insidious. Can you guess what it was? A hint: We’re using Hibernate to map our BOs to a database. Another hint: One of the biggest factors in making ORM scalable (and even feasible) is lazy loading.

You can probably guess what our problem was at this point. By developing these Swiss Army knife DTOs which essentially replicated the BO object graph, we forfeited all lazy loading advantages. In fact, because we continued to use lazy loading, we were actually getting worse performance than we could have achieved by enabling eager loading throughout the entire BOM!

The solution is a strict ban of (most) code reuse in DTOs. Unfortunately, that means a lot of replicated code. Even more unfortunately, this replicated code is boilerplate snooze-code that you couldn’t pay an intern to write, let alone real ;-) programmers. Even more double-plus unfortunately, our system is implemented in Java, which is nearly as verbose as my wife [2].

So what’s the solution to the solution? Well, if Groovy supported bound properties, it might be an option [3]. But it doesn’t. JRuby, focusing as it does on the “Ruby” side of the equation and not so much on the “J” side of things, isn’t an option either. There we stood, lost in a sea of problematic solutions and questionable answers, when Martin Fowler and Neal Ford came to me in a vision and told me to write my own DSL! Not really, but I did decide to write my own DSL.

The DSL is just an ANTLR grammar which allows a programmer to specify the core pieces of information regarding a DTO in a format which Java programmers will find vaguely familiar. Our DTOs require four or five artifacts per property: a constant for the property name (to be used during property binding on the client), an instance variable, a setter, a getter and possibly inclusion in the hashCode, equals and toString methods (depending on whether or not the property is part of the entity’s business key).
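To make that list concrete, here’s a hand-written sketch of what those artifacts might look like for two properties. It is not actual output from the grammar: the PROPERTY_ constant naming, the class name ReceptacleDtoSketch and the use of java.beans.PropertyChangeSupport for the bound setters are all assumptions made for illustration.

```java
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;
import java.util.ArrayList;
import java.util.List;

/** Hand-written illustration of the per-property artifacts (hypothetical names). */
class ReceptacleDtoSketch {

    /** Property name constant for the receptacle number, used when binding. */
    public static final String PROPERTY_NUMBER = "number";
    /** Property name constant for the receptacle type, used when binding. */
    public static final String PROPERTY_TYPE = "type";

    // Assumption: bound properties are implemented with PropertyChangeSupport.
    private final PropertyChangeSupport changes = new PropertyChangeSupport(this);

    /** The receptacle number (business key). */
    private int number;
    /** The name of the receptacle type. */
    private String type;

    public int getNumber() { return number; }

    public void setNumber(int number) {
        int old = this.number;
        this.number = number;
        changes.firePropertyChange(PROPERTY_NUMBER, old, number);
    }

    public String getType() { return type; }

    public void setType(String type) {
        String old = this.type;
        this.type = type;
        changes.firePropertyChange(PROPERTY_TYPE, old, type);
    }

    public void addPropertyChangeListener(PropertyChangeListener listener) {
        changes.addPropertyChangeListener(listener);
    }

    // Only the business key property takes part in equals, hashCode and toString.
    @Override
    public boolean equals(Object o) {
        return o instanceof ReceptacleDtoSketch
                && ((ReceptacleDtoSketch) o).number == number;
    }

    @Override
    public int hashCode() { return Integer.hashCode(number); }

    @Override
    public String toString() { return "ReceptacleDtoSketch[number=" + number + "]"; }
}

class ReceptacleDtoSketchDemo {
    public static void main(String[] args) {
        ReceptacleDtoSketch dto = new ReceptacleDtoSketch();
        List<String> events = new ArrayList<>();
        // Record the name of every property that fires a change event.
        dto.addPropertyChangeListener(e -> events.add(e.getPropertyName()));
        dto.setNumber(42);
        dto.setType("BAG");
        System.out.println(events);
        System.out.println(dto);
    }
}
```

Even with the comments trimmed down, two properties already take a fair amount of code, which is the whole motivation for generating it.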

During testing, I found that a large (but not uncommonly so) DTO with 30 properties took a little over 1000 lines of Java code. A lot of that was comments, but meaningful comments are required everywhere in this project. That same DTO could be written in 62 lines of code using the DTO DSL: 2 for the class itself and 2 per property. Here’s an example of the DSL:

sorting.ReceptacleDto {
   /** The receptacle number. */
   BIZ int number;
   /** The name of the receptacle type. */
   String type;
   /** The name of the receptacle status. */
   String status;
   /** The receptacle's gross weight. */
   Quantity grossWeight;
   /** The receptacle's destination. */
   String destination;
}

The first line identifies the DTO’s package and class name. Each line thereafter is part of a two-line comment/property combination. The comment is used to generate meaningful JavaDoc comments for the property name constants, instance variables, setters and getters. Each property declaration identifies the property type and name. The property declarations may also include an optional BIZ keyword which flags the property as being part of the entity’s business key (and therefore part of the equals, hashCode and toString implementations).

I should note that this is explicitly not a round-trip solution. Developers write their DTO using the DSL, generate the Java code, and from that point on maintain the Java code manually. Further, no type checks are performed: in the example above, the grammar has no idea whether or not String is a valid type. It just generates the Java syntax, devoid of semantic checks, and the developer can use an IDE to automatically organize the required imports.
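To illustrate what “devoid of semantic checks” means in practice, here’s a rough sketch of a purely syntactic translation of a single property declaration. It is a regex-based stand-in written for this explanation, not the ANTLR grammar itself; the class and method names (DtoPropertySketch, generate, isBusinessKey) are hypothetical, and the property name constants and property change events are left out to keep it short.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative stand-in for the grammar: translates text without checking types. */
class DtoPropertySketch {

    // Optional BIZ flag, then a type token, then a property name, then ';'.
    private static final Pattern PROPERTY =
            Pattern.compile("^\\s*(BIZ\\s+)?(\\S+)\\s+(\\w+)\\s*;\\s*$");

    private static Matcher parse(String declaration) {
        Matcher m = PROPERTY.matcher(declaration);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a property declaration: " + declaration);
        }
        return m;
    }

    /** Returns true if the declaration carries the BIZ keyword. */
    static boolean isBusinessKey(String declaration) {
        return parse(declaration).group(1) != null;
    }

    /** Turns a comment text and a declaration into a field, a getter and a setter. */
    static String generate(String comment, String declaration) {
        Matcher m = parse(declaration);
        String type = m.group(2); // copied verbatim, never validated
        String name = m.group(3);
        String cap = Character.toUpperCase(name.charAt(0)) + name.substring(1);
        return "/** " + comment + " */\n"
                + "private " + type + " " + name + ";\n"
                + "public " + type + " get" + cap + "() { return " + name + "; }\n"
                + "public void set" + cap + "(" + type + " " + name + ") { this."
                + name + " = " + name + "; }\n";
    }
}

class DtoPropertySketchDemo {
    public static void main(String[] args) {
        System.out.println(DtoPropertySketch.generate(
                "The receptacle number.", "   BIZ int number;"));
        // A misspelled type passes straight through; only the Java compiler will object.
        System.out.println(DtoPropertySketch.generate(
                "The name of the receptacle type.", "   Strng type;"));
    }
}
```

The second call shows the trade-off: a typo in a type name ends up in the generated code unchanged, and it’s the compiler or the IDE that catches it later.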

The ANTLR grammar file [4] which parses the DSL is only 340 lines of code, 60 lines of which are lexer and parser rules. Play with it, modify it, change it to fit your needs. Enjoy!

[1] Also: Don’t Repeat Yourself; Premature optimization is the root of all evil; etc.

[2] I kid, I kid!

[3] Bound properties (i.e. setters trigger property change events) are required because we bind the DTOs directly to the view via the JGoodies Binding library.

[4] The original filename is Dto.g, but my blogging software considers grammar files to be dangerous beasts, so I had to disguise it as a text file.

5 Comments

  1. akash :-) said,

    December 19, 2007 at 11:28 pm

    Another approach to this mundane task is to use XSLT and XML. Construct an XML file with the property names, comments, etc., apply the XSLT, and you’re done. I developed this for one of my previous projects.

  2. Daniel Gredler said,

    December 20, 2007 at 3:18 pm

    True. But do you really want to code in XML? ;-)

  3. Jon Smith said,

    January 23, 2008 at 1:44 pm

    Guess this only makes sense if you have a large volume of different DTOs that are being passed outside of the JVM. Inside the JVM you can just use the BOs / models directly, right up to the MVC layer.

  4. Daniel Gredler said,

    January 24, 2008 at 12:46 pm

    Yep, DTOs are useful when you don’t want to send detached BOs to remote JVMs. If everything is in a single JVM / transaction / persistence context, or you don’t mind sending BOs back and forth, forget DTOs!

  5. Timo Westkämper said,

    February 22, 2008 at 7:28 am

    Alternatively, you might just declare classes like this:

    public class ReceptacleDto {

    /** The name of the receptacle type. */
    String type;
    /** The receptacle’s gross weight. */
    Quantity grossWeight;

    }

    and use CGLIB to construct proxy subclasses with accessors for the fields. Any other metadata you can provide via annotations.

    Binding to fields with no public visibility is not so convenient in Java code, but the accessors can be used when binding and access happens via reflection (e.g. usage in view templates or Dozer)
