Online MS CS at Georgia Tech

Plans become reality.

My alma mater is tackling the future head-on!

Groovy + Selenium + Smack = Cleverbot Answering Service

There was a post on Reddit the other day about using Python to get Cleverbot to respond to incoming IMs. It looked like an interesting hack, but it used Python, Pidgin and D-Bus, none of which I have on hand right now… so I made something similar using Groovy, Selenium and Smack.

You can have it wait for someone to IM you, or you can tell it to troll someone on startup; either way, it automatically saves the chat transcript to disk on shutdown. Here’s the code:

@Grapes([
	@Grab('org.igniterealtime.smack:smack:3.2.1'),
	@Grab('org.igniterealtime.smack:smackx:3.2.1'),
	@Grab('org.seleniumhq.selenium:selenium-firefox-driver:2.5.0'),
	@Grab('org.seleniumhq.selenium:selenium-support:2.5.0')])
import org.jivesoftware.smack.XMPPConnection
import org.jivesoftware.smack.ChatManagerListener
import org.jivesoftware.smack.MessageListener
import org.jivesoftware.smack.Chat
import org.openqa.selenium.firefox.FirefoxDriver
import org.openqa.selenium.support.ui.WebDriverWait
import org.openqa.selenium.By
import com.google.common.base.Predicate
import java.util.concurrent.TimeUnit
import groovy.ui.SystemOutputInterceptor

if(args.length < 3 || args.length > 4) {
	println 'Usage: groovy chat.groovy server username password [userToTroll]'
	return
}

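// Wrap Chat.sendMessage so that every outgoing message is also echoed to
// stdout (and therefore ends up in the transcript).
def oldSendMessage = Chat.metaClass.getMetaMethod('sendMessage', [String] as Class[])
Chat.metaClass.sendMessage = { String s ->
	println "cleverbot/me: $s"
	oldSendMessage.invoke(delegate, s)
}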

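// Capture everything printed to stdout so it can be saved on shutdown;
// returning true lets the output still reach the console.
def transcript = ''
new SystemOutputInterceptor({ transcript += it; true }).start()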

def driver = new FirefoxDriver()
driver.manage().timeouts().implicitlyWait(30, TimeUnit.SECONDS)
driver.get 'http://www.cleverbot.com/'

def conn = new XMPPConnection(args[0])
conn.connect()
conn.login(args[1], args[2])

def messageListener = { chat, message ->
	println "$message.from: $message.body"
	def response = say message.body, driver
	chat.sendMessage response
} as MessageListener

def chatListener = { chat, createdLocally ->
	chat.addMessageListener messageListener
} as ChatManagerListener

conn.chatManager.addChatListener chatListener

addShutdownHook {
	driver.quit()
	conn.disconnect()
	new File('transcript.txt').write transcript
}

if(args.length >= 4) {
	def chat = conn.chatManager.createChat(args[3], null)
	chat.sendMessage 'Hey!'
}

while(true) { Thread.sleep 1000 }

def say(input, driver) {

	def field = driver.findElement(By.id('stimulus'))
	field.sendKeys input
	field.submit()

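	// Cleverbot types its answer out gradually, so poll until two consecutive
	// reads return the same text and that text is not an empty/cursor placeholder.
	def response = null
	def condition = {
		def element = driver.findElement(By.id('typArea'))
		def finished = (response == element.text
			&& !['', '|'].contains(element.text))
		response = element.text
		return finished
	} as Predicate
	new WebDriverWait(driver, 30).until(condition)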

	return response
}
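
The only subtle part of the script is how say() decides that Cleverbot has finished typing. Here is a minimal sketch of that idea on its own, with no browser involved. The waitForStableText helper, the maxPolls limit and the canned frames list are all made up for illustration; the real script delegates the polling to WebDriverWait.

```groovy
// Hypothetical helper: polls read() until two consecutive readings match
// and the reading is not one of the placeholder values ('' or '|').
def waitForStableText(Closure<String> read, int maxPolls = 50) {
    def previous = null
    for (int n = 0; n < maxPolls; n++) {
        def current = read()
        if (current == previous && !(current in ['', '|'])) {
            return current
        }
        previous = current
    }
    throw new IllegalStateException("text never settled after $maxPolls polls")
}

// Canned readings standing in for the page element's text over time.
def frames = ['', '|', 'He', 'Hell', 'Hello.', 'Hello.']
def index = 0
def result = waitForStableText({ frames[Math.min(index++, frames.size() - 1)] })
println result
```

Requiring two identical reads in a row is a cheap heuristic: it can be fooled if the page stalls mid-sentence, but it avoids depending on any "done" signal from the page.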

Here’s the first conversation it had with my brother:

cleverbot/me: Hey!
my brother: sup?
cleverbot/me: What?
my brother: huh?
cleverbot/me: Never mind.

Never mind?!? Come on, Cleverbot! You can do better than that! Let’s try again.

cleverbot/me: Hey!
my brother: sup?
cleverbot/me: Sep.

At this point my brother called me on the phone to ask what I was smoking… I didn’t pick up. Third time’s the charm, right? I tried again, tweaking the initial message to try to draw him into a conversation.

cleverbot/me: did you just call?
my brother: yeah
cleverbot/me: Yeah yeah yeah.
[ed.: this isn't looking so good...]
my brother: was seeing why you’re acting all goofy
cleverbot/me: Did you know what a hijab is?
[ed.: wha? where did that come from?]
my brother: but seriously, when i get the number from the sheet, if there is a 0 at the beginning it’s not included
cleverbot/me: But you continue to lie. You are a computer. You do not get up in the morning, you don’t look in the mirror… You can’t feel… And I feel sorry for you.
[ed.: facepalm...]
my brother: and your sheet won’t bring back values, so i fixed it but have to use an “if” statement which won’t work with your macro
cleverbot/me: Okay.

Okay, indeed. Bah. My brother was asking serious questions about a script I wrote for him, and all Cleverbot could think to talk about was hijabs and chatbot philosophy.

I think I’ll answer my own IMs… for now.

Gradle: Keeping Libraries Up To Date

After a successful proof of concept earlier this year, we’ve started using Gradle (instead of Maven) for new projects at work. Two of the first things you might notice about Gradle are that (a) it’s new enough that there isn’t a plugin for every random requirement out there, and (b) it’s flexible enough that the missing plugins aren’t usually a problem: you just implement the functionality yourself.

Here’s a quick task that will let you know if your dependencies need to be updated, a la mvn versions:display-dependency-updates. It’s actually a bit longer than most Gradle snippets, so I suspect there’s a way to express it more succinctly. The irony, of course, is that it relies on the Maven central repo to determine whether or not any of your dependencies are out of date.

// Find any 3rd party libraries which have released new versions
// to the central Maven repo since we last upgraded.
task checkLibVersions << {
  def checked = [:]
  allprojects {
    configurations.each { configuration ->
      configuration.allDependencies.each { dependency ->
        def version = dependency.version
        // Skip dependencies with no version (e.g. file dependencies) as well as snapshots.
        if(version && !version.contains('SNAPSHOT') && !checked[dependency]) {
          def group = dependency.group
          def path = group.replace('.', '/')
          def name = dependency.name
          def url = "http://repo1.maven.org/maven2/$path/$name/maven-metadata.xml"
          try {
            def metadata = new XmlSlurper().parseText(url.toURL().text)
            def versions = metadata.versioning.versions.version.collect { it.text() }
            versions.removeAll { it.toLowerCase().contains('alpha') }
            versions.removeAll { it.toLowerCase().contains('beta') }
            versions.removeAll { it.toLowerCase().contains('rc') }
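            // Caveat: max() compares Strings lexicographically, so '1.9' ranks above '1.10'.
            def newest = versions.max()
            // newest is null when every listed version was filtered out above.
            if(newest && version != newest) {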
              println "$group:$name $version -> $newest"
            }
          } catch(FileNotFoundException e) {
            logger.debug "Unable to download $url: $e.message"
          } catch(org.xml.sax.SAXParseException e) {
            logger.debug "Unable to parse $url: $e.message"
          }
          checked[dependency] = true
        }
      }
    }
  }
}
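
One weakness of the task above is that versions.max() compares version strings as plain text. A possible fix, sketched below, is to compare dotted versions one segment at a time, numerically where possible. The compareVersions helper is hypothetical (it is not part of Gradle or of the task above), and it makes no attempt to handle qualifiers the way Maven does.

```groovy
// Hypothetical helper: compares two dotted version strings segment by segment.
// Numeric segments are compared as integers; anything else falls back to
// plain String comparison. Missing segments count as '0' (so 1.2 == 1.2.0).
def compareVersions(String a, String b) {
    def left = a.tokenize('.')
    def right = b.tokenize('.')
    for (int n = 0; n < Math.max(left.size(), right.size()); n++) {
        def l = n < left.size() ? left[n] : '0'
        def r = n < right.size() ? right[n] : '0'
        def c = (l.isInteger() && r.isInteger()) ?
            (l.toInteger() <=> r.toInteger()) : (l <=> r)
        if (c != 0) return c
    }
    return 0
}

def sample = ['1.2', '1.10', '1.9']
def naiveNewest = sample.max()                                    // String comparison
def numericNewest = sample.max { a, b -> compareVersions(a, b) }  // segment comparison
println "naive: $naiveNewest, numeric: $numericNewest"
```

Dropping this into the task would mean replacing the versions.max() call with the comparator form shown on the last lines.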

Google Summer of Code 2011 Highlights

This year’s list of GSoC projects has been published, and the sheer volume is astounding. Different projects will appeal to different people, but here are some of the efforts that jumped out at me (most interesting to least interesting):

This list is so subjective… does anyone think I’m missing something really cool?

Hudson and Jenkins: Two Months Later

For some reason I’m finding the Hudson / Jenkins split to be particularly interesting, so I’ve updated the chart from my last post on the subject to span two months, rather than the initial two weeks:

It still looks like Jenkins is outpacing Hudson. To be fair, Jason van Zyl (who is working on Hudson) gave advance warning that we’d see something like this:

We are moving more carefully and probably slower then we might like, but we feel that, in order to aggressively add features in the future, the testing infrastructure, development infrastructure, and core features need to be in place. All this work I’m talking about will likely take a release or two to get in place but once that is done we will be moving at a radically different pace.

Nevertheless, one wonders how much longer this can go on before the momentum becomes irreversible.

Hudson and Jenkins: Two Weeks Later

Most Java developers have probably heard about the recent Hudson/Jenkins split. InfoQ’s most recent article on the topic, by Alex Blewitt, ends on an upbeat note:

With the commercial support of both Oracle and Sonatype behind the development of Hudson, the future looks good for the eponymous continuous integration tool. However, Jenkins continues to evolve as well [etc, etc]…

I’m not so sure about Hudson’s future. Fortunately, there are a number of easily measurable indicators that should allow us to gauge the progress of these two rival projects.

First, we can look at commit counts. Specifically, commit counts should provide a decent leading indicator of the health of each project, given that they generally foretell whether or not the project is likely to address user needs in new releases.

Second, we can look at user mailing list post counts, which generally tell the story of user uptake after all of the hard work has gone into fixing bugs, adding features and cutting new releases. I would consider user mailing list usage to be a lagging indicator.

So after two weeks, what does the commit picture look like? Well, Hudson has seen 40 commits since the split [1], while Jenkins has seen 166 commits. It’s not even close. As a leading indicator, this bodes well for Jenkins’ future.

What about user mailing list activity? Remember that logically this should be a lagging indicator. As such, I expected the Hudson user list to remain more active than the Jenkins user list for quite some time, even if Jenkins ends up overshadowing Hudson over time. Astonishingly, the Hudson user list has seen 55 posts since the split, while the Jenkins user list has seen 563 posts in the same time frame! Have users already decided, en masse, to adopt Jenkins over Hudson? That’s certainly the picture painted by these numbers.
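
For a sense of scale, here is the back-of-the-envelope arithmetic on the counts quoted above. The map and variable names are just for illustration.

```groovy
import java.math.RoundingMode

// Counts quoted in this post (counting from February 1st, 2011).
def commits = [hudson: 40, jenkins: 166]
def posts   = [hudson: 55, jenkins: 563]

// Dividing two integers in Groovy yields a BigDecimal rather than truncating.
def ratio = { Map counts ->
    (counts.jenkins / counts.hudson).setScale(1, RoundingMode.HALF_UP)
}

def commitRatio = ratio(commits)
def postRatio = ratio(posts)
println "Jenkins: ${commitRatio}x the commits, ${postRatio}x the user list posts"
```

In other words, Jenkins saw roughly four times the commits and roughly ten times the user list traffic over the same two weeks.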

Two weeks is an extremely small track record on which to base any conclusions, and some of these numbers may be distorted by infrastructure issues on the Hudson side, but I think it’s safe to say that Hudson’s future is a little murkier than Alex’s optimism might suggest. The good thing is that we’ll be able to revisit these numbers in 6 months and have a pretty clear idea of where things are headed :-)

[1] All of the counts in this post are as of the time of writing, and begin counting on February 1st, 2011.

UPDATE (4/4/2011): I’ve posted a newer commit graph here.

Optimizing Spring Application Startup Time

The application I’m currently working on is built around Spring. A couple of weeks ago, as I waited for it to start up for the 27th time that day, I began wondering if there was any way to get it to start faster. Coincidentally, our whole development team had just received upgraded laptops, which was a fresh reminder that the numbers are stagnating in the “Megahertz” department but not in the “CPU cores” department.

Further, I had just read a series of articles by Cédric Beust regarding recent improvements to the TestNG threading engine. In short, TestNG allows you to run tests on multiple threads in order to shorten the test suite run time. However, when you start creating tests with dependencies on other tests, TestNG punts and runs all of these groups of dependent tests serially, on a single thread. Cédric wanted to remove this limitation, and approached the enhancement as a topological sort problem, viewing the tests as nodes in a directed acyclic graph.

Most of the startup time in Spring-centric applications involves the initialization of singleton beans. It wasn’t too much of a mental leap to wonder if the same graph/sort approach could be applied to the application startup optimization question — with the Spring beans acting as the nodes in the graph, instead of the tests.
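
To make the graph idea concrete, here is a minimal sketch of grouping beans into layers, where every bean in a layer depends only on beans in earlier layers and could therefore be initialized in parallel with its layer-mates. This is not the code from the project; the bean names and the layers method are invented for illustration, and a real implementation would read the dependencies from Spring’s bean definitions.

```groovy
// Hypothetical bean graph: each bean name maps to the beans it depends on.
def beanDeps = [
    dataSource : [],
    cache      : [],
    userDao    : ['dataSource'],
    userService: ['userDao', 'cache']
]

// Kahn-style layering: repeatedly peel off every bean whose dependencies
// have all been initialized already. Each peeled-off group is one layer.
def layers(Map<String, List<String>> deps) {
    def remaining = new LinkedHashMap(deps)
    def done = [] as Set
    def result = []
    while (remaining) {
        def ready = remaining.findAll { name, needs -> done.containsAll(needs) }.keySet() as List
        if (!ready) {
            throw new IllegalStateException("dependency cycle among: ${remaining.keySet()}")
        }
        result << ready
        done.addAll(ready)
        ready.each { remaining.remove(it) }
    }
    return result
}

def plan = layers(beanDeps)
plan.eachWithIndex { layer, n -> println "layer $n: $layer" }
```

Each layer could then be handed to a thread pool, with a barrier between layers. Finer-grained scheduling (starting a bean as soon as its own dependencies are ready) is possible too, but layering is the simplest way to see where the parallelism comes from.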

It turns out to be (sort of) possible, and (sort of) worth the time. I’ve posted the details, including the source, here. Feel free to check it out and make suggestions. The Spring framework does contain a very coarse lock that required a pretty horrible hack in order to get any performance gains out of the experiment, but once this is worked out I don’t see why an implementation of this approach couldn’t make it into real applications. Again, see the project page for more details.

On a side note, this is the first time that I’ve used GitHub to host any code, and I have to say that I really like the fact that the README file serves the dual purposes of local project documentation and homepage markup!

HtmlUnit 2.6 Released

HtmlUnit 2.6 has just been released! A ton of work has gone into this release, so I hope people find it useful.

Some highlights:

  • Marc Guillemot tweaked the caching infrastructure a bit, which should result in improved performance, especially for those of you hitting remote websites for scraping and such.
  • Ahmed Ashour has fixed quite a few bugs in a focused effort to make it easier for the GWT folks to incorporate HtmlUnit.
  • Our WebClient serialization support was very broken in 2.5; this has now been fixed.
  • Many, many more bugfixes!

Of course, HtmlUnit continues to support use with a wide variety of JavaScript libraries.

Give it a try and let us know what you think!

Selenium 2.0

As per Simon’s post on the WebDriver mailing list, WebDriver and Selenium are starting the merge that will define Selenium 2.0… wasn’t Selenium 1.0 just released a couple of months ago? And didn’t they take something like 6 years to get to 1.0? Things are certainly speeding up! Anyone care to place a bet on how far out 2.0 is?

HtmlUnit @ JavaOne

Ahmed Ashour and I will be talking about HtmlUnit today at 2:50 PM (room 301). We’ll be discussing browser driving tools and browser simulation tools, their pros and cons, and where HtmlUnit fits into the landscape. Stop by if you’re interested in web application integration testing in general, and HtmlUnit in particular!
