by Allen D
The article that I chose to talk about this week is called “How Big Is Facebook’s Data? 2.5 Billion Pieces of Content and 500+ Terabytes Ingested Every Day”, by Josh Constine. The title says it all. Facebook revealed to reporters that its systems process over 2.5 billion pieces of content and more than 500 terabytes of data per day. The author also reports that the company’s systems handle approximately 2.7 billion ‘Like’ actions and 300 million photos per day. The Vice President of Engineering, Jay Parikh, revealed that over 100 petabytes of data are stored in the company’s data warehouse. To support data-intensive activities and distributed applications, Facebook uses a software framework called Apache Hadoop. Hadoop provides very high aggregate bandwidth across the cluster and enables applications to process petabytes of data across thousands of independent computers. Parikh told reporters that Facebook operates the single largest Hadoop system in the world, one that’s even larger than Yahoo’s.
I found this article interesting, particularly where Jay Parikh explained how Facebook takes advantage of the data it collects by understanding user reactions, modifying designs in real time, and implementing new products. At the same time, it’s quite astonishing that the company is able to process this amount of data within minutes and respond effectively to problems in real time. Now that we know how much information Facebook collects from us and its potential to profit from this data, is it absolutely safe to keep disclosing our information?
Constine, J. (2012, August 22). How big is Facebook’s data? 2.5 billion pieces of content and 500+ terabytes ingested every day. Retrieved from http://techcrunch.com/2012/08/22/how-big-is-facebooks-data-2-5-billion-pieces-of-content-and-500-terabytes-ingested-every-day/