Wed, 09 Dec 2009
Speaking of Twitter...
Like my recent TweetZombie post on Twitter vocabulary analysis (see what I did there? :) ), this entry is also Twitter-related...
I've had my @followr Twitter account for a while, but it's protected and I generally limit followers to people I know or have met IRL. That's not entirely satisfactory when I'm happy for people to follow along for general technical content. What I really want is a per-Tweet flag for protected or not, but seeing as that's not a possibility, I'm going to experiment with a second account.
So, if you'd like to follow along check out @RancidBacon for your viewing pleasure. While you'll miss out on such deeply personal insights as "Did I mention I really like warm sunshiney days like this one? :)" you will get most of the technical content. I'm slowly making my way through the backlog of follow requests and will follow them via the RancidBacon account.
Thanks for your interest...
Posted at: 18:35 | category: /
Tue, 08 Dec 2009
TweetZombie — eating your brain. one tweet at a time.
TweetZombie is a site that does some very basic vocabulary analysis of an individual's Twitter messages. It will tell you the size of the vocabulary that the person uses and provide a vocabulary rating (v-rating). The exact rating calculation method is of course a closely guarded trade secret. :) (And yes, you can try to game it with antidisestablishmentarianism if you really want to do so. You wouldn't be the first.)
A handy pie chart shows you at a glance how often the person replies or retweets. Last I looked, the highest rating was 51,801 and the biggest vocabulary was 1,240 words.
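For the curious, here is a rough sketch of the two non-secret parts: counting distinct words and sorting tweets into replies, retweets and everything else. This is not TweetZombie's actual code. The tokenising rules, the function names and the "starts with `RT @`" / "starts with `@`" tests are all assumptions made for illustration, and the v-rating is deliberately left out.

```python
import re
from collections import Counter

# Assumption: mentions, hashtags and URLs are not "vocabulary", so drop them.
_NOISE = re.compile(r"(https?://\S+|[@#]\w+)")
_WORD = re.compile(r"[a-z']+")


def words(tweet):
    """Split one tweet into lowercase words (hypothetical tokeniser)."""
    cleaned = _NOISE.sub(" ", tweet.lower())
    tokens = (w.strip("'") for w in _WORD.findall(cleaned))
    # Skip empty tokens and the "rt" marker left over from old-style retweets.
    return [w for w in tokens if w and w != "rt"]


def vocabulary(tweets):
    """Count how often each distinct word appears across all tweets."""
    counts = Counter()
    for tweet in tweets:
        counts.update(words(tweet))
    return counts


def tweet_kinds(tweets):
    """Tally replies, retweets and other tweets (pie chart input)."""
    kinds = Counter()
    for tweet in tweets:
        if tweet.startswith("RT @"):
            kinds["retweet"] += 1
        elif tweet.startswith("@"):
            kinds["reply"] += 1
        else:
            kinds["other"] += 1
    return kinds
```

The vocabulary size is then just `len(vocabulary(tweets))`, and the three tallies from `tweet_kinds` are the slices of the pie.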
Applying new technologies
Development of TweetZombie was an exercise in integrating and learning more about a number of technologies. It was originally developed using Django, jQuery, the Twitter API (via tweepy) and SQLite, but then ported to run on Google App Engine with Google App Engine Helper for Django and a side order of Google AdSense. (What do you mean assimilated? :) )
The porting exercise was interesting: the App Engine Datastore's non-SQL approach to queries forced a change in how I thought about data retrieval. The main change was pre-calculating more values up front rather than working them out at query time.
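To give a flavour of the "pre-calculate up front" idea, here is a minimal sketch in plain Python. It does not use the App Engine API at all: the `UserStats` class, the `store` dictionary standing in for a key-based datastore, and `record_tweet` are hypothetical names invented for this example, and it is written in present-day Python rather than what App Engine ran at the time.

```python
from dataclasses import dataclass, field


@dataclass
class UserStats:
    """Hypothetical per-account record with derived values stored alongside."""
    screen_name: str
    tweet_count: int = 0
    reply_count: int = 0
    vocabulary: set = field(default_factory=set)
    vocabulary_size: int = 0  # derived value, kept up to date on every write

    def add_tweet(self, text):
        # Do the aggregation when the data arrives...
        self.tweet_count += 1
        if text.startswith("@"):
            self.reply_count += 1
        self.vocabulary.update(text.lower().split())
        self.vocabulary_size = len(self.vocabulary)


# Stand-in for a datastore that is only good at "fetch by key".
store = {}


def record_tweet(screen_name, text):
    """Update the stored record so later reads are a plain key lookup."""
    stats = store.setdefault(screen_name, UserStats(screen_name))
    stats.add_tweet(text)
    return stats
```

With SQL the temptation is to store raw rows and `COUNT` them on demand. Here a page view only needs `store[screen_name]`, at the cost of a little more work on each write.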
I also took a brief look at making use of the Python Natural Language Toolkit for more sophisticated vocabulary analysis (e.g. n-grams) but have not integrated it yet.
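As a taste of what n-gram analysis involves, here is a small sketch in plain Python. It is not part of TweetZombie and does not use NLTK (which ships its own n-gram helper); the function names and the whitespace tokenising are assumptions made to keep the example self-contained.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return list(zip(*(tokens[i:] for i in range(n))))


def top_ngrams(tweets, n=2, k=3):
    """Count n-grams within each tweet and return the k most common."""
    counts = Counter()
    for tweet in tweets:
        # Assumption: naive lowercase whitespace split, no punctuation handling.
        counts.update(ngrams(tweet.lower().split(), n))
    return counts.most_common(k)
```

Counting within each tweet separately avoids inventing n-grams that straddle the end of one tweet and the start of the next.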
Related Wiki Updates
During the development process I added a few related pages to my Wiki/Notebook:
- Learning About Google App Engine
- Learning About Django
- Learning About Twitter API (with Python)
- Learning About NLTK (Python Natural Language Toolkit)
- Project Log: TweetZombie
Try it yourself
Head to TweetZombie and try it on your own account or on the accounts of your friends and then brag about how superior your intelligence must be. Or something.
Posted at: 20:35 | category: /
Thu, 03 Dec 2009
No Chumby for me (yet).
The Chumby is an...internet-connected thing, created by Andrew "bunnie" Huang, known in some circles for his console reverse engineering. The new version looks like this (unfortunately not so soft and cuddly as the previous version, but apparently being soft costs too much):
Anyway, recently on his blog bunnie had a competition to guess the number of vias on the new printed circuit board in order to win a new Chumby One. Now, guessing seemed far too slapdash to me but after briefly considering writing something using OpenCV to automate detection I decided to just count things by hand.
Yes, it was as tedious as it sounds. :) It was also complicated by the fact that the board is double-sided and the two images supplied didn't overlap fully. So, first I had to locate the vias on each side and then match the pairs. In some cases only one side of the via was visible. It also wasn't always easy to match up the pairs because, even after resizing/scaling, the match wasn't precise. I did briefly consider using a technique similar to georectification to align the images but apparently even I have some limits to my perfectionism. :)
Now, the one issue I never dealt with specifically was that it was possible for vias to exist but be hidden on both sides of the board by components. I didn't allow for this. I could've worked out some arbitrary method for guessing the number of hidden vias but just stuck with the visible via count for my submission.
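Had I automated the matching step, it might have looked something like the sketch below. To be clear, I did all of this by hand, so this is purely hypothetical: it assumes the via centres on each side have already been marked as (x, y) pixel coordinates, that the back image is a left-to-right mirror of the front, and that a pair counts as "the same via" if the two marks land within a made-up `tolerance` of each other.

```python
import math


def count_vias(front, back, board_width, tolerance=3.0):
    """Count visible vias given marked centres on each side of the board.

    front, back: lists of (x, y) centres; board_width: image width used
    to mirror the back side. All names and defaults are illustrative.
    """
    # Mirror the back so both sides share the front's coordinate frame.
    unmatched_back = [(board_width - x, y) for x, y in back]
    pairs = 0
    for fx, fy in front:
        best, best_dist = None, tolerance
        # Greedy nearest neighbour: take the closest unclaimed back-side mark.
        for candidate in unmatched_back:
            dist = math.hypot(fx - candidate[0], fy - candidate[1])
            if dist <= best_dist:
                best, best_dist = candidate, dist
        if best is not None:
            unmatched_back.remove(best)
            pairs += 1
    front_only = len(front) - pairs
    back_only = len(unmatched_back)
    # Vias hidden on both sides are not counted, as in my submission.
    return pairs + front_only + back_only
```

The tolerance is what stands in for the imprecise alignment: too small and one via gets counted twice, too large and two neighbouring vias collapse into one.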
I used GIMP to perform my image manipulation. My Chumby competition submission comment included links to both a layered GIMP file and a flattened JPEG with my via count workings:
(Yeah, it wasn't pretty. :)
My guess was 729 vias which, alas, was not close enough to the actual documented total of 785, as mentioned in a follow-up post announcing the winner. The closest guess was 781 vias.
But the exercise was still a partial success in my book as part of my reason for documenting the ridiculous extremes I had gone to was a fairly transparent attempt to be noticed even if I didn't win, which earned this remark from bunnie:
I wasn't actually thinking anyone would try to count all the visible vias — kudos to those who put in that effort (omg follower I can't believe you did that!)...
And, this post is, of course, a totally transparent attempt to get you to notice me and think, "Hey, I've got a problem that needs that sort of mindset to solve" and email me with a freelance contract offer. :)
Posted at: 20:50 | category: /
