Kaggle


The article I chose to read this week is about Kaggle, a platform for predictive data modeling competitions. According to Leena Rao, Kaggle has raised $11 million in Series A financing led by Index Ventures and Khosla Ventures.

For those who are not familiar with Kaggle, here is how it works:

First, the competition host prepares the data and a description of the problem. Kaggle offers a consulting service that can help the host do this, as well as frame the competition, anonymize the data, and integrate the winning model into the host’s operations. For example, companies and organizations can post large data sets to the platform and ask data scientists to answer a question from the data. The thousands of data scientists who participate in Kaggle competitions then develop algorithms to solve these large-scale problems and submit iterations of their algorithms throughout each competition.

Participants experiment with different techniques and compete against each other to produce the best models. For most competitions, submissions are scored immediately (based on their predictive accuracy relative to a hidden solution file) and summarized on a live leaderboard. Because the leaderboard shows each competition’s standings in real time, competitors are motivated to beat the current benchmark until the competition closes. Once a competition ends, the sponsoring organization has a solution, and the field’s top entrants take home the competition prize. Thus far, data scientists from all over the world have submitted nearly 47,000 entries to various Kaggle competitions.
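
To make the scoring step more concrete, here is a minimal Python sketch of how submissions could be scored against a hidden solution file and ranked on a leaderboard. It is only an illustration: the article does not describe Kaggle's actual scoring code, so the metric (plain accuracy), the rule of keeping each team's best score, the names `Submission`, `accuracy`, and `leaderboard`, and the sample data are all my own assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Submission:
    team: str               # hypothetical team name
    predictions: List[int]  # one predicted label per row of the test set


def accuracy(predictions: List[int], solution: List[int]) -> float:
    """Fraction of predictions that match the hidden solution."""
    if len(predictions) != len(solution):
        raise ValueError("submission length must match the solution file")
    correct = sum(p == s for p, s in zip(predictions, solution))
    return correct / len(solution)


def leaderboard(submissions: List[Submission],
                solution: List[int]) -> List[Tuple[str, float]]:
    """Keep each team's best score and rank teams from best to worst."""
    best: Dict[str, float] = {}
    for sub in submissions:
        score = accuracy(sub.predictions, solution)
        best[sub.team] = max(score, best.get(sub.team, 0.0))
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


# Made-up data: the host keeps hidden_solution private.
hidden_solution = [1, 0, 1, 1, 0]
entries = [
    Submission("team_a", [1, 0, 0, 1, 0]),  # first attempt
    Submission("team_b", [1, 1, 0, 0, 0]),
    Submission("team_a", [1, 0, 1, 1, 0]),  # improved second attempt
]

board = leaderboard(entries, hidden_solution)
for rank, (team, score) in enumerate(board, start=1):
    print(rank, team, f"{score:.2f}")
```

In this sketch a team can submit as many times as it likes, and only its best score counts, which matches the idea of submitting iterations throughout the competition.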

Finally, after the deadline passes, the competition host pays the prize money in exchange for the winning model.

Kaggle’s platform helps companies, governments, and researchers identify solutions to some of the world’s hardest data problems by posting them as competitions to a community of more than 17,000 PhD-level data scientists located around the world.

I chose this article not only because it relates to the data modeling we cover in class, but also because Kaggle is working on one of the most exciting opportunities in big data analytics. As the old saying goes, “two heads are better than one.” Having thousands of PhD-level scientists work on the same problem should not only lead to more accurate models but also build a strong community that can take on a large number of big problems. Therefore, I would love to share this post with everyone who is interested in this area.

Source: Rao, Leena (November 2nd, 2011). “Index And Khosla Lead $11M Round In Kaggle, A Platform For Data Modeling Competitions.”

Retrieved from: http://techcrunch.com/2011/11/02/index-and-khosla-lead-11m-round-in-kaggle-a-platform-for-data-modeling-competitions/