Wednesday, September 26, 2012

One of my all-time favourite products from Google, Google News, turned 10 earlier this week. There are multiple reasons why it is one of my favourite products.

Firstly, last week I wrote an introductory article about information retrieval (IR) - a field I am quite excited about. I think Google News is a classic example of a product fully based on IR (though this is not the topic of discussion for this article).

Secondly, Google News was one of the truly innovative products of its time, launched without any sort of precedent - remember, as path-breaking as Google search was, it was not a totally novel idea that no one had thought of, whereas Google News was.

In my opinion, there are fewer conflicts when the minds of people are prepared to take in diverse opinions and viewpoints. Google News prepares you for exactly that - it helps readers absorb and respect differences of opinion! I know of no other news site that provides this kind of unbiased, all-round perspective on news happening across the world. Personally, I learnt a lot about how various news organizations operate to cover/uncover various events (at times even block coverage!), their inherent biases, the coverage depth (or shallowness) of the various publishing houses, and the best and (unfortunately quite a few times) the worst of Indian journalism, thanks to the powerful tool that Google News is.

And I was super lucky to work on this product for more than 4 years, lead its ranking efforts and write a lot of its scoring code and algorithms that probably power it to this day :)

Okay, let's rewind a decade - Google News was born in the aftermath of the September 11 attacks. When the unfortunate incidents of 9/11 happened, the whole world was clamouring for news about what was going on. People were browsing various news channels and surfing multiple news destinations on the web to somehow get minute-by-minute updates, unclear about what exactly the situation was. Krishna Bharat, a research scientist at Google, came up with the mind-blowing, novel idea of having one place on the web that would aggregate news articles and show all articles on the same story clustered together. He used the legendary 20% project time at Google to come up with a prototype, very soon Google decided to launch it as a product, and the rest, as they say, is history!

Google News is a product that is almost an implementation of every chapter of a classical textbook on IR. As Krishna Bharat explains here, Google keeps crawling several thousand sources across the world all the time (so that it is real-time!), groups all articles on the same story together using an IR technique called clustering, classifies them according to the topic they belong to using machine learning techniques, ranks clusters in the order of their relevance and importance in every edition and ranks the articles in every cluster - thereby choosing the articles that best reflect the latest update on the story from the most authoritative source on the subject.
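To make the clustering and ranking steps a little more concrete, here is a toy sketch in Python. Let me be clear that this is not how Google News actually does it: the TF-IDF weighting, the greedy single-pass clustering, the 0.3 similarity threshold, and the `authority` and `published` fields are all assumptions I picked purely for illustration, and the crawling and topic classification steps are left out entirely.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    words = (w.strip(".,:;!?\"'()").lower() for w in text.split())
    return [w for w in words if w]


def tfidf_vectors(texts):
    """Build one sparse TF-IDF vector (term -> weight) per text."""
    tokenized = [tokenize(t) for t in texts]
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    n = len(texts)
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Smoothed IDF, so a term present in every text still gets weight 1.
        vectors.append({
            term: count * (math.log((1 + n) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_articles(articles, threshold=0.3):
    """Greedy single-pass clustering: join the first cluster whose seed
    article is similar enough, otherwise start a new cluster.
    The threshold is an arbitrary, illustrative value."""
    vectors = tfidf_vectors([a["title"] for a in articles])
    clusters = []  # lists of article indices; index 0 is the seed
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return [[articles[i] for i in members] for members in clusters]


def rank_edition(clusters):
    """Rank articles inside each cluster by (hypothetical) source authority,
    then recency, and rank clusters by how many articles cover the story."""
    ranked = [
        sorted(c, key=lambda a: (a["authority"], a["published"]), reverse=True)
        for c in clusters
    ]
    return sorted(ranked, key=len, reverse=True)


# Made-up sample data; "authority" and "published" are invented fields.
articles = [
    {"source": "Source A", "title": "Supreme Court rules on state land policy",
     "authority": 0.9, "published": 3},
    {"source": "Source B", "title": "Supreme Court ruling on state land policy draws reaction",
     "authority": 0.7, "published": 5},
    {"source": "Source C", "title": "Monsoon rains flood coastal towns",
     "authority": 0.8, "published": 4},
]

for story in rank_edition(cluster_articles(articles)):
    print([a["source"] for a in story])
```

With this sample data, the two Supreme Court headlines share most of their words and end up in one cluster, while the monsoon story forms its own. A real system would obviously need far more than headline word overlap, but the shape of the pipeline (vectorize, cluster, rank within and across clusters) is the same idea.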

As a reader, I would say that the clustered news experience that Google News offers is unparalleled. On one occasion, I saw a cluster about a Supreme Court judgement on a certain state government’s action. The cluster had articles from the nation’s top two news sources. One source said - “SC lauds state government’s efforts” and the other said - “SC rebukes state government’s actions”. Mind you, these were not opinion articles, but articles on the top story of the day. One would expect that an SC judgement would be as objective as it can get, and here was such contradictory reporting happening in the media! On another occasion, I learnt from Google News about the shocking human rights violations and genocide that happened almost at our doorstep in a neighbouring island country, while there was a strange, deafening silence from the entire Indian media about it. These examples should help drive home the power and unique features that Google News brings to the table.

And coming from Google, let me remind you that all of Google News is automated without any human involvement: unlike in a pressroom, the whole process explained earlier is done by a cluster of machines! Did I say machines? That reminds me of a fun incident that happened when we folks from Google News met the New York Times journalists at the NYTimes headquarters in NY. It was a hall packed with NYTimes journalists, and during the course of our discussions I remember one of us from Google saying, “You human journalists may be good at <so and so>, while our algorithms ...” The whole NYTimes team was quite amused at our usage of the phrase “human journalists”; they burst out laughing, and for the rest of the day they kept making comical references to our jargon!

But let me tell you - the day is not far off when machines actually start gathering news and covering news events, and that coverage would be far richer in content - filled with text, sound, visuals, temperature and what not! It would have the potential of telling nothing but the truth, with no biases or personal agendas inserted in between. Wait, I forgot - who is going to own those “machine journalists”? Probably it all depends on that! For now, let's not worry about that: sit back, sip that coffee and read the news written by “we humans” but powered by some mind-boggling algorithms from Google!

