“Be a good chance of sticking a fork in my eye. Temptations off so that people with garden implements.”
This is just one of a practically infinite number of New Year's resolutions that can be generated by my latest project, A Random Resolution for 2011. I collected about 50,000 tweets matching the term “resolution” on December 31st, 2010 and January 1st, 2011, used a simple grep to extract substrings that looked like resolutions, and fed the results into a Markov chain text generator.
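For the curious, the extraction step might look something like the sketch below. This is not the actual grep I ran; it is a Python stand-in, and the pattern (`RESOLUTION_RE`) and the function name `extract_resolutions` are made up for illustration. The assumption is simply that a resolution is whatever follows a phrase like “resolution is” or “resolution:”.

```python
import re

# Hypothetical pattern (an assumption, not the original grep expression):
# capture the text that follows "resolution is", "resolution:" or
# "resolution -", skipping an optional leading "to".
RESOLUTION_RE = re.compile(
    r"resolution(?:s)?\s*(?:is|:|-)\s*(?:to\s+)?(.+)",
    re.IGNORECASE,
)

def extract_resolutions(tweets):
    """Return the substrings of each tweet that look like resolutions."""
    found = []
    for tweet in tweets:
        match = RESOLUTION_RE.search(tweet)
        if match:
            text = match.group(1).strip()
            if text:
                found.append(text)
    return found

if __name__ == "__main__":
    sample = [
        "My new year's resolution is to read more books.",
        "No resolutions here",
        "Resolution: stop procrastinating",
    ]
    for resolution in extract_resolutions(sample):
        print(resolution)
```

A pattern this loose will let plenty of junk through, which is fine here: the text generator only needs a big pile of resolution-shaped strings.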
I love using Markov chain text generators on a corpus like this because they highlight the similarities among the items in the corpus (any given string of characters is likely to have occurred more than once) while juxtaposing parts of seemingly unrelated items in surprising (and often amusing) ways.
Technical details: I built the application in a few hours using Tornado, an open-source web framework from the FriendFeed team at Facebook. The application runs behind an nginx server on an EC2 micro instance (a product I’ve wanted to try since Amazon released it last year). I’m amazed at how quick and easy these tools made it to throw the whole thing together. The text generator uses n-grams of eight characters; I chose eight because seven or fewer characters produced too many non-words, while nine too frequently reproduced tweets unchanged from the source text.
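Here is a minimal sketch of a character-level Markov chain generator of the kind described above. It is not the code running on the site; the names `build_model` and `generate`, the `None` end-of-line marker, and the 140-character cap are all my own assumptions for the example. The `order` parameter is the n-gram length, so `order=8` corresponds to the eight-character setting.

```python
import random
from collections import defaultdict

def build_model(lines, order=8):
    """Map each `order`-character window to the characters that follow it."""
    model = defaultdict(list)
    starts = []
    for line in lines:
        if len(line) <= order:
            continue  # too short to contribute a transition
        starts.append(line[:order])
        for i in range(len(line) - order):
            model[line[i:i + order]].append(line[i + order])
        # None marks "the source line ended here".
        model[line[-order:]].append(None)
    return model, starts

def generate(model, starts, max_len=140, rng=random):
    """Start from the opening of a random source line and walk the chain."""
    state = rng.choice(starts)
    order = len(state)
    out = state
    while len(out) < max_len:
        nxt = rng.choice(model[state])
        if nxt is None:
            break
        out += nxt
        state = out[-order:]
    return out

if __name__ == "__main__":
    corpus = [
        "stop eating so much junk food",
        "read more books every month",
    ]
    model, starts = build_model(corpus, order=3)
    print(generate(model, starts))
```

The trade-off in choosing the n-gram length shows up directly in `order`: with a small value, many windows have several possible successors and the output wanders into non-words; with a large value, most windows have only one successor and the walk just replays a source line.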