May 12, 2007 at 11:47 am (HtmlUnit, Java)
Robert Martin has some constructive criticism for HtmlUnit:
I’m using HtmlUnit to parse and interpret HTML web pages. I’ve been very impressed with this library so far. And I appreciate the hard work and dedication of people who give their software away for free. So, although this blog is a complaint, it should not be misconstrued into anything more than constructive criticism.
What I want to do with HtmlUnit is quite simple. Given a string containing HTML, I’d like to query that HTML for certain tags and attributes… Sweet, simple, uncomplicated. Just create the DOM from an HTML String, and then query that DOM. Unfortunately, HtmlUnit does not appear to be that simple.
Robert’s central complaint is that it’s hard to feed HtmlUnit a String and get back an HtmlPage. The problem is that WebClient, the central HtmlUnit class, does in fact have a getPage(String) method — but it’s geared to the vast majority of users who want to be able to call getPage(“http://my.testable.site”), rather than getPage(“<html><head><title>My Testable Site</title>…</html>”).
Robert claims that HtmlUnit violates the principle of least surprise, i.e. “do the least surprising thing.” Surprise is a subjective beast, but I would submit that catering to the 95% of users who use HtmlUnit as it was intended is definitely the least surprising route to take. That’s not to say that we can’t accommodate alternate uses (perhaps by adding a getPageForHtml(String) to WebClient), but not at the expense of our core competency.
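To make the trade-off concrete, here is a minimal sketch in plain Java. It does not use HtmlUnit at all: SketchClient, SketchPage and the URI-based check are hypothetical stand-ins I made up for illustration, and getPageForHtml is only the name floated above, not a method that exists on WebClient. The point is simply that two explicitly named entry points leave nothing to guess about what a String argument means.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Small demo driver; everything in this file is hypothetical. */
final class LeastSurpriseSketch {
    public static void main(String[] args) {
        SketchClient client = new SketchClient();
        // The common case: the String is an address.
        System.out.println(client.getPage("http://my.testable.site").origin);
        // The less common case: the String is the markup itself.
        System.out.println(client.getPageForHtml(
                "<html><head><title>My Testable Site</title></head></html>").origin);
    }
}

/** Hypothetical stand-in for a loaded page; this is not an HtmlUnit class. */
final class SketchPage {
    enum Origin { URL, HTML }

    final Origin origin;  // how the page was obtained
    final String input;   // the URL or the raw markup that was passed in

    SketchPage(Origin origin, String input) {
        this.origin = origin;
        this.input = input;
    }
}

/** Hypothetical client showing the two entry points side by side. */
final class SketchClient {

    /** Treats the argument as an address, which is what most callers expect. */
    SketchPage getPage(String url) {
        try {
            URI uri = new URI(url);
            if (!uri.isAbsolute()) {
                throw new IllegalArgumentException("Not an absolute URL: " + url);
            }
        } catch (URISyntaxException e) {
            // Raw markup such as "<html>..." lands here: '<' is illegal in a URI.
            throw new IllegalArgumentException("Not a URL: " + url, e);
        }
        // A real client would fetch and parse the page here; the sketch only records it.
        return new SketchPage(SketchPage.Origin.URL, url);
    }

    /** Treats the argument as markup; the name says so, so nothing has to be guessed. */
    SketchPage getPageForHtml(String html) {
        // A real client would parse the markup here; the sketch only records it.
        return new SketchPage(SketchPage.Origin.HTML, html);
    }
}
```

In this sketch, passing markup to getPage fails fast with an IllegalArgumentException instead of being silently misread as an address, while the URL-based call keeps the short, obvious name for the majority use case.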
May 5, 2007 at 10:49 pm (HtmlUnit, Java)
As an HtmlUnit committer, every month or so I spend about half an hour searching the web for new mentions of the project, just to get an idea of the latest buzz, unreported problems, unsung praises, etc. During the latest of these searches, I ended up at ohloh, an online directory for open source software projects.
The site lists a large number of projects, and includes interesting metrics for each of them, including codebase size, estimated effort and cost necessary to reproduce the codebase, level of documentation, related projects, user ratings and reviews, etc. Here are some of the metrics, as of today, for HtmlUnit and some related projects:
It’ll be interesting to see if this site takes off or not. Developers often have to pull information from a wide array of sources in order to make informed decisions as to the libraries that they need to include in their stack. While ohloh provides some of this information, there is one glaring omission: community size and growth trends. This is usually measured via mailing list activity — lots of posts to the mailing lists imply a large user community. It would be nice to have this information listed as well, and not just as a factoid in the summary section.
Regardless of this small omission, ohloh is a fun site to browse. It’s interesting to see their take on the various software projects out there, and it’s entertaining to compare competing libraries (SVN vs CVS!). If lots of people actually start using it to recommend and critique software, it might become an extremely useful website.
May 1, 2007 at 9:56 pm (Java, Tapestry)
Well, it’s official: I can now add features to Tapestry without a middleman! If Howard hadn’t beaten me to it, I’d probably start out by adding a BeanForm component to Tapestry 5. As things stand, I’ll probably start small: fix some bugs, add some small features, etc.
I enjoy writing (to a degree), so one of my self-imposed tasks may also be to polish the documentation for Tapestry 5. One of the more prominent criticisms of Tapestry 4 has been that the learning curve is steep and the documentation is subpar. It’d be nice to turn that around and move Tapestry 5 to the opposite side of the spectrum: a gentle learning curve facilitated by excellent documentation. We’ll see how it goes…