Size of Facebook’s Data

by Allen D
 

The article I chose to talk about this week is called “How Big Is Facebook’s Data? 2.5 Billion Pieces of Content and 500+ Terabytes Ingested Every Day,” by Josh Constine. The title says it all. Facebook revealed to reporters that its systems process over 2.5 billion pieces of content, worth 500+ terabytes of data, every day. The author notes that the company’s systems also handle approximately 2.7 billion ‘Like’ actions and 300 million photos per day. The Vice President of Engineering, Jay Parikh, revealed that over 100 petabytes of data are stored in Facebook’s data warehouse. To support these data-intensive, distributed applications, Facebook uses a software framework called Apache Hadoop. Hadoop provides very high aggregate bandwidth across a cluster and enables applications to process petabytes of data spread over thousands of independent computers. Parikh told reporters that Facebook operates the single largest Hadoop system in the world, one that’s even larger than Yahoo’s.
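To give a feel for the kind of processing Hadoop makes possible, here is a minimal sketch of a MapReduce-style job written for Hadoop Streaming, which lets you write the map and reduce steps as plain scripts that read standard input and write standard output. The script name (action_counts.py), the log format, and the field names are all hypothetical examples; nothing here describes Facebook’s actual pipeline, it only illustrates the general counting pattern (e.g., tallying ‘Like’ actions or photo uploads) that such a framework supports.

```python
#!/usr/bin/env python
"""
Minimal Hadoop Streaming sketch: counting user actions (likes, photo
uploads, etc.) from activity log lines. The log format and field names
are illustrative assumptions, not Facebook's real schema.
"""
import sys


def mapper():
    # Each input line is assumed to look like: "<timestamp>\t<user_id>\t<action>"
    for line in sys.stdin:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 3:
            continue  # skip malformed lines
        action = parts[2]
        # Emit one key/value pair per action; Hadoop sorts these by key
        # before they reach the reducer.
        print("%s\t1" % action)


def reducer():
    # Input arrives sorted by key, so equal keys are adjacent and can be
    # summed with a single running counter.
    current_action, count = None, 0
    for line in sys.stdin:
        action, value = line.rstrip("\n").split("\t")
        if action != current_action:
            if current_action is not None:
                print("%s\t%d" % (current_action, count))
            current_action, count = action, 0
        count += int(value)
    if current_action is not None:
        print("%s\t%d" % (current_action, count))


if __name__ == "__main__":
    # Run as "action_counts.py map" for the map phase or
    # "action_counts.py reduce" for the reduce phase.
    if len(sys.argv) > 1 and sys.argv[1] == "reduce":
        reducer()
    else:
        mapper()
```

You can test the same logic locally with a pipeline like `cat sample.log | python action_counts.py map | sort | python action_counts.py reduce`; on a real cluster the script would be handed to Hadoop Streaming as both the mapper and the reducer, and Hadoop takes care of splitting the input across thousands of machines, shuffling, and sorting, which is what lets the same simple counting logic scale to petabytes.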

I found this article interesting when Jay Parikh explained how Facebook is taking advantage of its collected data by understanding user reactions, modifying designs in real time, and implementing new products. At the same time, it’s quite astonishing that they are able to process this amount of data within minutes and respond effectively to problems in real time. Now that we know how much information Facebook collects from us, and how much it stands to profit from that data, is it really safe to keep disclosing our information?

 

Citation:

Constine, J. (2012, August 22). How big is Facebook’s data? 2.5 billion pieces of content and 500+ terabytes ingested every day. TechCrunch. Retrieved from http://techcrunch.com/2012/08/22/how-big-is-facebooks-data-2-5-billion-pieces-of-content-and-500-terabytes-ingested-every-day/

4 thoughts on “Size of Facebook’s Data”

  • November 26, 2012 at 12:14 am

    I remember writing about this same article for my first blog post, and at the time I had no real understanding of it. Looking back, I can now completely understand what the article was talking about. Big Data and Hadoop definitely go hand in hand. With Hadoop, no data is too big. Its breakthrough advantages mean that businesses and organizations can now find value in data that until recently was considered useless.

  • November 26, 2012 at 12:21 am

    It is amazing to see how Facebook generates 500+ terabytes of data each day. The sheer amount of data amazes me, as does how it is all handled in real time. I read somewhere that Facebook’s Oregon data center has to add servers daily in order to keep up with the growth. However, storing the raw data seems incredibly storage-intensive, since they have to keep backups as well.

  • November 26, 2012 at 4:22 am

    It is no surprise how much data Facebook accumulates, especially after reading this article on Facebook’s privacy policy: http://www.digitaltrends.com/social-media/terms-conditions-facebooks-data-use-policy-explained/2/ . The article mentions Facebook’s tracking technology, which consists of pixels embedded in code that follow every move you make on the web. The facial recognition technology must also add to the bulk of the data processed every day. I think you are right to question whether it is safe to disclose any more of our information.

  • November 26, 2012 at 11:37 pm

    I think the amount of data Facebook generates in a day is amazing. I understand people are constantly uploading pictures, videos, and text, but actually hundreds of terabytes a day? It makes me wonder how Facebook can constantly update their servers to handle all that raw data.
