Google’s BigQuery

by Ronny W
BigQuery is a cloud-based service from Google. It is used for analyzing very large sets of data. It was unveiled at the Google I/O developer conference two years ago, and it is now publicly available after a period of limited-availability testing. BigQuery is an online analytical processing system that is able to process large amounts of data in real time. Pricing varies according to how much data is queried and how much data is stored. BigQuery is not meant for OLTP (online transaction processing) tasks.

Querying is one of the most common functions of a database. It can take up a lot of time, depending on the system that is being used. It also depends on the petitioning of the database, as discussed in class. A database is essential for every business. The BigQuery service combines a database with the accessibility of the cloud.

BigQuery can help many businesses observe large amounts of data in real time. The service is in the cloud, so it can be accessed easily without being in the office. I think the downside to this is internet speed. Internet speed might be the biggest issue with this service because of the amount of data that needs to be transferred over the web. The whole process is only as fast as its slowest part. If the connection cannot download a terabyte of queried data quickly, then the fast querying speed will not really matter.
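
To put rough numbers on that bottleneck, here is a minimal Python sketch. The 100 Mbit/s connection and the 30-second query time are assumptions picked for illustration, not figures from Google, and the function names are hypothetical.

```python
def transfer_seconds(size_bytes, link_bits_per_second):
    """Time needed to move size_bytes over a link of the given speed."""
    return size_bytes * 8 / link_bits_per_second


def end_to_end_seconds(query_seconds, size_bytes, link_bits_per_second):
    """Total wait: the query runs first, then the result is downloaded."""
    return query_seconds + transfer_seconds(size_bytes, link_bits_per_second)


TERABYTE = 10**12             # bytes (decimal terabyte)
LINK_100_MBPS = 100 * 10**6   # assumed office connection, in bits per second
QUERY_SECONDS = 30            # assumed query time, purely illustrative

download = transfer_seconds(TERABYTE, LINK_100_MBPS)
total = end_to_end_seconds(QUERY_SECONDS, TERABYTE, LINK_100_MBPS)

print(f"download alone: {download / 3600:.1f} hours")
print(f"share of total time spent downloading: {download / total:.2%}")
```

Under these assumptions the download takes about 22 hours, so nearly all of the waiting comes from the connection rather than the query. The effect shrinks if the query returns a small summary instead of the raw data.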

  1. How interesting. It’s good to hear that Google is advancing its cloud-based services and adding a querying feature. I agree that this service can help many organizations, because time is so limited and companies are now looking for systems that let them easily grab the data they want.

  2. I think you mean partitioning of the database. Although Internet speed is definitely a limiting factor, there’s also the time to retrieve information across the different partitions. That can slow down the information transfer process as well. Does BigQuery allow you to compress the data before download?

  3. Interesting article. One of the bits the summary missed that intrigued me is that storage is charged at pennies per gigabyte and queries at fractions of a penny per gigabyte, so small jobs would be dirt cheap. Even more intriguing, Google has devised a pricing system that charges by the columns a query actually uses, not by all the columns in the table (so if you only use the first 4 columns of a 50-column table, you get charged for 4 columns instead of 50). Perhaps if this catches on, there would be a nice bonus for IS employees who can minimize the number of columns used per query!
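
The column-based charging described in the last comment can be sketched in a few lines of Python. Everything numeric here is an assumption for illustration: the table shape, the 8 bytes per value, and especially the price per gigabyte, which is a placeholder and not Google's actual rate.

```python
def bytes_scanned(row_count, column_sizes, columns_used):
    """Bytes read when a query touches only the named columns."""
    return row_count * sum(column_sizes[name] for name in columns_used)


def query_cost(scanned_bytes, price_per_gb):
    """Cost of a query billed by the amount of data it reads."""
    return scanned_bytes / 10**9 * price_per_gb


# Hypothetical table: 50 columns, 8 bytes per value, one billion rows.
column_sizes = {f"col{i}": 8 for i in range(50)}
ROWS = 10**9
PRICE_PER_GB = 0.01  # placeholder price in dollars, not a real rate

first_four = [f"col{i}" for i in range(4)]
narrow = bytes_scanned(ROWS, column_sizes, first_four)
full = bytes_scanned(ROWS, column_sizes, list(column_sizes))

print(f"4 columns:  ${query_cost(narrow, PRICE_PER_GB):.2f}")
print(f"50 columns: ${query_cost(full, PRICE_PER_GB):.2f}")
```

With these made-up numbers, the 4-column query reads 32 GB and the full-table query reads 400 GB, so selecting only the needed columns is 12.5 times cheaper.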

