Problem in The Process of Getting Data

by Phuong H
The article I picked for this week is “The Big Problem with Big Data,” written by Jill Duffy. In the article, the author describes several problems that come with big data. It is good to have a lot of information, but most of the time people don’t know what to do with it. One of the biggest problems with big data is data generation. Almost everyone has an account on Facebook, Google+, or Twitter. Users are asked to provide or share their information in exchange for some kind of service. In the end, the information the companies collect is whatever users hand over to get the service, not necessarily the information the companies wanted to collect. In other words, users generate data to get what they want. That information might not represent the user, because they provide it for a reason rather than voluntarily. For example, you are asked to fill out a form and answer some survey questions before you can download a piece of software. How many people would fill in their “real” information? Most of the time, people just fill in the blanks to get past the form. According to a survey cited in the article written by Richard Karpinski, “The survey found that more than 50% of buyers said they provide a valid name, email address, industry, job title and company name when they register; although less than 40% provide accurate phone numbers” (Karpinski, 2007).

[Image: “How much data is created every minute?”] (Spencer, 2012)

Another problem with big data is collection. Companies spend a lot of time collecting data, but in the meantime they don’t know what to do with the information collected. Take basketball statistics, for example. The numbers don’t really show much about a player: there are days when players feel very good and perform better, and there are days when they are not doing well. Having a lot of information but not knowing how to share it can be a disadvantage. Especially when a company has spent so much time and money collecting data, it is hard for it to share that data with others. Releasing data at the right time is also important. We do not want information that is not relevant or useful. Lastly, there is the problem of translation. We can have tons of data, but if we don’t know how to interpret it, it does not mean anything.

I chose this article because it is related to our class discussion. We know the pros and cons of having a database, but I hadn’t realized the disadvantages discussed by the writer. Every day, I participate in generating data without realizing it. Most of the time, I enter information that is not really mine, because I don’t feel like giving out my information and it is for one-time use. Also, I don’t like sharing things that I find, because if too many people know about them, they are no longer special.



Duffy, J. (2012, March 11). The big problem with big data. Retrieved from,2817,2401425,00.asp

Karpinski, R. (2007, June 04). With content the conduit to customers, marketers must keep up with user trends. Retrieved from

Spencer, N. (Photographer). (2012). How much data is created every minute? [Print Photo]. Retrieved from

4 thoughts on “Problem in The Process of Getting Data”

  • September 30, 2012 at 3:57 am

    It is kinda scary to know that information about me is quantified and stored somewhere in the web by the big companies. I, too, believe that having too much data is unnecessary and even burdensome especially when the data are inaccurate. Good post and thanks for the insight!

  • September 30, 2012 at 4:20 pm

    I liked your blog post because it was very similar to mine in a way. It brings up other ways of how too much data can be a bad thing which made it very interesting and enjoyable to read.

    • October 4, 2012 at 1:30 pm

      I would also like to add that I thought the Translation section was interesting because it brought up new questions. For example, they used the Nike armband. It collects personal and private information about you, and because you own the device you can use that information. But when you get an MRI done, it collects personal and private info about you as well, and we don’t get the same freedom of access to that information because it is not owned by us. Shouldn’t we still get access to it? Those kinds of questions were brought up, and I thought they were interesting to think about.

  • October 1, 2012 at 10:43 pm

    Information overflow, duplication, and untrustworthy data present big problems for the Web today. The recent Google Panda and Penguin updates, for example, were attempts to lower the rank of low-quality sites and return higher-quality sites near the top of the search results.
