Research question: What blogs should you read to stay up to date on newsworthy stories?
Given a budget of 100 blogs, the biggest bang for the buck belonged to the popular Instapundit blog, which featured more than 4,500 postings throughout the year. Assuming a budget of 5,000 posts, however, the top-scoring blog was the less well-known sisu site, which featured only 331 posts for all of 2006.
How do you find something like this? How do you even go looking for this information?
During the past couple of weeks, I have been reading about Sentiment Analysis. It started with a post by Seth Grimes on the Text Analytics mailing list, which was followed by many good posts with links. I read a few and generally understood the concept. It is a fascinating idea. More on this in a later blog post.
I came across this article today. It is somewhat different, but very useful. How do you find what blogs to read? Researchers at Carnegie Mellon seem to have a solution: an algorithm called Cascades.
A team of researchers and graduate students from Carnegie Mellon eventually created a complex mathematical equation called the cost-effective lazy forward-selection algorithm, later dubbed the Cascades algorithm for simplicity’s sake.
One part seeks to maximize reward, in this case detecting the most news in the least amount of time. Within the algorithm, that reward concept is captured by tallying the number of people who read a news item after it appears on a specific blog. If 10 million people read a story after its initial posting on Blog A but only 1,000 had read it beforehand, the story would be deemed both newsworthy and early-breaking for Blog A’s readers.
A second part of the algorithm seeks to minimize cost, namely the inordinate time that could be spent reading blogs. The team also exploited a mathematical relationship known as the law of diminishing returns.
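To make the two parts concrete for myself, here is a rough Python sketch of a lazy, cost-aware greedy selection. It is not the researchers' code. The data layout is my own assumption: each story is a dict mapping a blog to the number of people who read the story after it appeared there, the cost of a blog is its number of posts, and the names reward, lazy_greedy and select_blogs are made up for illustration. The "lazy" part relies on diminishing returns: a blog's gain can only shrink as more blogs are picked, so an old score is a safe upper bound and only the top candidate needs re-scoring.

```python
import heapq

def reward(selected, stories):
    """Sum, over stories, of the largest 'readers after' count among selected blogs."""
    return sum(max((s.get(b, 0) for b in selected), default=0) for s in stories)

def lazy_greedy(stories, costs, budget, use_cost_ratio):
    """Greedy selection under a budget, re-scoring a candidate only when it reaches the top."""
    selected, spent, current = [], 0, 0
    heap = []  # entries: (-score, blog, len(selected) when the score was computed)
    for blog, cost in costs.items():
        gain = reward([blog], stories)
        score = gain / cost if use_cost_ratio else gain
        heap.append((-score, blog, 0))
    heapq.heapify(heap)
    while heap:
        neg_score, blog, stamp = heapq.heappop(heap)
        if spent + costs[blog] > budget:
            continue  # spending never decreases, so this blog stays unaffordable
        if stamp == len(selected):
            if neg_score == 0:
                break  # the best fresh candidate adds nothing, so nothing else will
            selected.append(blog)
            spent += costs[blog]
            current = reward(selected, stories)
        else:
            # Stale score: recompute the marginal gain and push the blog back.
            gain = reward(selected + [blog], stories) - current
            score = gain / costs[blog] if use_cost_ratio else gain
            heapq.heappush(heap, (-score, blog, len(selected)))
    return selected, current

def select_blogs(stories, costs, budget):
    """Run the gain-per-cost and plain-gain variants and keep the better result."""
    by_ratio = lazy_greedy(stories, costs, budget, use_cost_ratio=True)
    by_gain = lazy_greedy(stories, costs, budget, use_cost_ratio=False)
    return max([by_ratio, by_gain], key=lambda result: result[1])

if __name__ == "__main__":
    # Made-up numbers: three stories, three blogs, cost = posts you would have to read.
    stories = [{"A": 100, "B": 90}, {"A": 50, "C": 80}, {"B": 10, "C": 70}]
    costs = {"A": 10, "B": 1, "C": 2}
    print(select_blogs(stories, costs, budget=10))
```

With these toy numbers, the plain-gain variant spends the whole budget of 10 on the big blog A, while the gain-per-cost variant picks the two small blogs B and C and catches more readers for fewer posts. Running both and keeping the better one guards against either variant doing badly on its own.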
This algorithm is useful not only for finding newsworthy blogs but also for detecting water pollution. The sensors are just different.
You can read it all in this story on MSNBC: From ranking blogs to predicting posture.
meta: stumble path
acm technews -> article about cascade algorithm