Ideas and Code

Thursday, 24 September 2009

My Personalized Tech News

I've been working on BuzzBox for a while, and I now feel comfortable sharing the first links to it.
I'm currently using BuzzBox to read the most popular tech news every day.
News from BuzzBox is:
- personalized: I picked my favourite web sites
- filtered: I get only the most popular news every day
- clustered: so I don't get the same news twice

I like to describe this first stage as a "personalized Techmeme" (Techmeme is a tech news aggregator).

You can see My BuzzBox at My BuzzBox



I've put the RSS feed in my feed reader, and I read the news there. Others are pushing their news from BuzzBox to Twitter. See, for example, http://twitter.com/anigamBuzzBox or Anu's tech BuzzBox.

We have a long roadmap ahead. We want to bring personalized news to social networks, and we want to be a preferred place to share and comment on news with your friends and everybody else.

Check out the site and let me know what you think.

Sunday, 20 September 2009

A video blog on App Engine

I'm working on this basic idea: an automatic video blog created from a search on YouTube.

I like following the interviews from the David Letterman Show on YouTube, but it's quite difficult to subscribe to a good RSS feed for that. A simple query returns old and new results mixed together, with many duplicate videos. Some interviews are only partial, and others are of poor quality. So I started building a filtering engine, using the Google Data API.

I started the project on Google App Engine, in the Java environment.
It's still pretty basic, but you can already see the results:

http://videovertigo.appspot.com/letterman/

I start with the query:
"+letterman 2009|09 -monologue -"top 10" -"top ten""

Then I look in the title and in the description for a date. The parsing is performed by ANTLR (thanks to Piercarlo for implementing this part).
Then I assign a rank to each keyword, based on its frequency in the result set.
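
As a rough illustration of these two steps, here is a minimal Python sketch. It is not the project's code: the real date parser is written with ANTLR and runs on Java, while this sketch swaps in a plain regular expression that only understands numeric month/day/year dates. The function names, the stopword list, and the choice to count each keyword once per video are all assumptions made for the example.

```python
import re
from collections import Counter
from datetime import date

# Assumption: dates appear as month/day/year with /, . or - as separators.
DATE_NUMERIC = re.compile(r"\b(\d{1,2})[/.-](\d{1,2})[/.-](\d{2,4})\b")

# Assumption: a tiny hand-picked stopword list, just for the example.
STOPWORDS = {"the", "a", "on", "of", "and", "with", "in"}
WORD = re.compile(r"[a-z]+")


def extract_date(text):
    """Return the first month/day/year date found in text, or None."""
    match = DATE_NUMERIC.search(text)
    if match is None:
        return None
    month, day, year = (int(group) for group in match.groups())
    if year < 100:
        year += 2000  # treat "09" as 2009
    try:
        return date(year, month, day)
    except ValueError:
        return None  # e.g. month 13 or day 45


def keywords(text):
    """Return the set of lowercase words in text, minus stopwords."""
    return {w for w in WORD.findall(text.lower()) if w not in STOPWORDS}


def keyword_frequency(videos):
    """Count in how many videos each keyword appears (title + description)."""
    counts = Counter()
    for video in videos:
        counts.update(keywords(video["title"] + " " + video["description"]))
    return counts
```

Calling `keyword_frequency(videos).most_common()` lists the keywords from most to least frequent. Whether a frequent keyword should rank higher or lower is a design choice this sketch leaves open.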

Finally I try to cluster the videos that look similar, based on keywords and date.
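
To make the idea concrete, here is one possible way to group near-duplicate videos, sketched in Python. It is only an assumption about how such clustering could work, not the engine's actual algorithm: each video is a dict with a hypothetical `keywords` set and a `date` (or `None`), similarity is the Jaccard overlap of the keyword sets, and the `min_overlap` and `max_days` thresholds are arbitrary values picked for the example.

```python
from datetime import date


def jaccard(a, b):
    """Jaccard similarity of two sets: shared items / all items."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similar(v1, v2, min_overlap=0.5, max_days=1):
    """Two videos look alike if their dates are close and keywords overlap."""
    d1, d2 = v1["date"], v2["date"]
    # Only compare dates when both videos have one.
    if d1 is not None and d2 is not None and abs((d1 - d2).days) > max_days:
        return False
    return jaccard(v1["keywords"], v2["keywords"]) >= min_overlap


def cluster(videos, min_overlap=0.5, max_days=1):
    """Greedy clustering: join the first cluster whose first video is similar."""
    clusters = []
    for video in videos:
        for group in clusters:
            if similar(group[0], video, min_overlap, max_days):
                group.append(video)
                break
        else:
            # No existing cluster matched, so start a new one.
            clusters.append([video])
    return clusters
```

A greedy pass like this is simple and fast, but the result depends on the order of the input, so sorting the videos first (for example by date) keeps the output stable.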

I'm still playing with the clustering to make it as general as possible. I would like users to be able to build their own video blog from a complex query, using tools like information extraction and clustering.

On the home page there is a simple search function you can use to play with the engine: http://videovertigo.appspot.com/

What do you think? Any ideas on how to improve the product? Are you an engineer, and would you like to contribute? Please contact me!