Database Design and Development

Data Quality Information {1}

by Ming X
The article I read is called “The Impact of Experience and Time on the Use of Data Quality Information in Decision Making.” “Data Quality Information (DQI) is metadata that can be included with data to provide the user with information regarding the quality of that data.” The article focuses on how the experience of the decision maker and the available processing time influence the use of DQI in decision making. Chengalur-Smith et al. (1999) define data quality information (DQI) as metadata that addresses the data’s quality. Chengalur-Smith, Ballou and Pazer (1998) explored the consequences of informing decision makers about the quality of their data. Their project studied two formats of data quality information (DQI), two decision strategies, and both simple and complex levels of decision complexity. Their study found variations in the amount of influence across research designs. Organizations wishing to begin a program of using DQI should be aware that there was a lack of consensus when experts were presented with DQI. Organizations can predict that adding information about data quality to a database is likely to change the decision made, but they cannot predict what that new decision may be. read more...
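To make the idea concrete, here is a minimal sketch (my own illustration, not the authors’ design) of DQI stored as metadata fields alongside the data it describes; all field names, records, and ratings are hypothetical:

```python
# Hypothetical sketch of Data Quality Information (DQI) kept as metadata
# attached to each record. Field names and values are illustrative only.
from dataclasses import dataclass

@dataclass
class CustomerRecord:
    name: str
    annual_revenue: float
    # DQI metadata carried with the data itself
    source: str             # where the value came from
    accuracy_rating: float  # 0.0 (unreliable) to 1.0 (verified)
    last_verified: str      # date the value was last checked

records = [
    CustomerRecord("Acme Co", 1_200_000, "audited filing", 0.95, "2012-01-15"),
    CustomerRecord("Bravo LLC", 800_000, "sales rep estimate", 0.40, "2010-06-02"),
]

# A decision maker (or a rule) can weigh each record by its quality metadata.
trusted = [r for r in records if r.accuracy_rating >= 0.8]
print(f"{len(trusted)} of {len(records)} records meet the quality threshold")
```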

Data Integration – Mashup Service {1}

by Andrew M
The article I read was entitled “Service-Oriented Architecture for High-Dimensional Private Data Mashup” by B. Fung, T. Trojer, P. Hung, L. Xiong, K. Al-Hussaeni and R. Dssouli. This article talks about the integration of data in mashup services. A mashup combines and integrates data from multiple sources and databases for web applications. The authors use a social media website such as Facebook as an example. Facebook uses mashups in the sense that it collects data from the user in the form of status updates, Instagram photos, Spotify songs and check-ins at locations. All of these are examples of data being sent in from multiple sources and combined together. Throughout the article, the authors discuss the many issues involved in protecting users’ data in this type of service. While users usually never give out phone numbers, addresses or Social Security Numbers, much important information is still given out. For example, when a person checks in at a location, they are broadcasting to everyone that they are at that location. This gives potentially unwanted people a reference as to where this user is from. read more...
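As a rough illustration of both the integration and the privacy concern, here is a toy sketch (not the authors’ architecture) that merges records from several hypothetical feeds into one view and strips precise coordinates before the merged view is shared:

```python
# Toy mashup-style integration: combine feeds from several sources into one
# timeline, then redact location data before sharing. All source names and
# fields are hypothetical.
status_updates = [{"user": "alice", "type": "status", "text": "Studying for finals"}]
photos         = [{"user": "alice", "type": "photo", "url": "http://example.com/p/1"}]
checkins       = [{"user": "alice", "type": "checkin", "place": "Campus Library",
                   "coords": (34.06, -117.82)}]

def merge_feeds(*sources):
    """Combine records from multiple sources into a single feed."""
    feed = []
    for source in sources:
        feed.extend(source)
    return feed

def redact_locations(feed):
    """Strip precise coordinates so the merged view does not broadcast location."""
    for item in feed:
        item.pop("coords", None)
    return feed

combined = redact_locations(merge_feeds(status_updates, photos, checkins))
print(combined)
```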

Quality Data and Managing It {1}

by Katheryn T
The article I chose to write about was called “Managing Data Source Quality for Data Warehouse in Manufacturing Services”. This article spoke about the standards and quality management system that need to be in place in order to comply with the International Organization for Standardization (ISO). First there has to be a quality management system for the data source. This is a process in which a model is put in place that has “several steps to ensure optimal data quality and is enriched with data quality management steps in order to fulfill customer requirements” (Idris & Ahmad, 2011). Human resources, technical infrastructure, and a suitable work environment are needed to carry out the DQM process correctly. The attributes of high-quality data are described as accurate, reliable, important, consistent, precise, understandable, and useful. One of the problems with data quality can come from a lack of understanding of the origin of the data. The people who manage the data need to be able to identify it correctly and to understand what kind of data it is and how to work with it. Data source quality can be improved by working with the data owner, determining the cause of the data quality problem, and correcting the data source. read more...
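Here is a small sketch of what automated checks for a few of those attributes (consistency, accuracy) might look like; the rules, fields, and sample rows are my own assumptions, not the article’s method:

```python
# Minimal data quality profiling sketch: flag rows that break a few simple
# consistency and plausibility rules. Illustrative only.
rows = [
    {"part_id": "A-100", "weight_kg": 2.50, "country": "US"},
    {"part_id": "A-100", "weight_kg": 2.51, "country": "USA"},   # inconsistent coding
    {"part_id": "B-200", "weight_kg": -1.0, "country": "US"},    # implausible value
]

def check_quality(rows):
    issues = []
    valid_countries = {"US", "CA", "MX"}          # consistency: one coding standard
    for i, row in enumerate(rows):
        if row["country"] not in valid_countries:
            issues.append((i, "inconsistent country code"))
        if not (0 < row["weight_kg"] < 1000):     # accuracy: plausible range
            issues.append((i, "weight outside plausible range"))
    return issues

for row_index, problem in check_quality(rows):
    print(f"row {row_index}: {problem}")
```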

Getting Top Quality Data for your Database {1}

by Leonardo S
The article I chose for this week is titled “Managing Data Source Quality for Data Warehouse in Manufacturing Services”. The main topic of the article is data quality. According to the article, one of the primary success factors of a data warehouse is the quality of its data. There are a few downsides to collecting low-quality data. Often, someone will have to go back into the database and fix the mistakes. This can take a lot of time and effort that could be better spent on other projects. Another flaw that comes from having bad data is that your data analysis will be wrong. Database analysts will end up wasting even more time reviewing the data again after it has been fixed. The article mentions a few ways to reduce the amount of low-quality data going into your database. Two methods of doing this are Total Data Quality Management and Quality Management System requirements. From what I understand, these are guidelines for collecting and inputting data that help limit the amount of low-quality data. read more...
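As a rough illustration of the “catch bad data at the point of entry” idea, here is a minimal validation sketch; the fields and rules are hypothetical and are not taken from TDQM or the article:

```python
# Validate incoming records before they are loaded, instead of cleaning them
# up in the warehouse afterward. Fields and rules are hypothetical.
import re

REQUIRED_FIELDS = {"order_id", "customer_email", "quantity"}

def validate(record):
    errors = []
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        errors.append(f"missing fields: {sorted(missing)}")
    if "customer_email" in record and not re.match(r"[^@]+@[^@]+\.[^@]+",
                                                   record["customer_email"]):
        errors.append("malformed email")
    if record.get("quantity", 0) <= 0:
        errors.append("quantity must be positive")
    return errors

incoming = {"order_id": "1001", "customer_email": "bad-address", "quantity": 0}
problems = validate(incoming)
if problems:
    print("rejected:", problems)   # send back to the source for correction
else:
    print("loaded into the warehouse")
```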

The Forces that Motivate DBAs {1}

by Arlyn R
In the 2012 article “Motivating the DBA,” the author discusses the database administrator’s role and motivations. Lockwood Lyon begins the article by briefly describing the workday of the following IT professionals: programmers, quality assurance analysts, systems analysts, and project managers. According to Lyon, programmers tackle problems during the application development lifecycle, which includes “…design, coding and testing through implementation and software upgrades” (Lyon). Quality assurance analysts in a typical workday deal with conformance processes, such as programming standards and developing and analyzing test cases. Systems analysts figure out how to have the current technology assets meet the business requirements, along with ensuring the application meets the desired performance. Project managers, in the course of a workday, manage resources, both technical and human. Project managers also make sure all requirements are met at each iteration of the project, within budget and by the deadline.
The author then takes an in-depth look at the database administrator’s (DBA) workday. Lyon provides a few examples of the problems DBAs typically face, such as ensuring that applications follow database standards and that information requirements are met. DBAs play a major role in the design of the database and in choosing the right application to meet business requirements and needs. Managing DBMS software support, such as updating to a newer version and implementing vendor software patches, also falls under the DBA’s responsibility. Another example provided by the author is the 24/7 availability required of the DBA to fix system issues and failures. read more...

Google’s Solution to Unify Their Databases {4}

by Brian B
The article I chose this week is titled “Google Spans Entire Planet With GPS-Powered Database” by Cade Metz. The article starts off by talking about a Google engineer named Vijay Gill at a conference. He was asked how he would change “Google’s datacenters if he had a magic wand (Metz, 2012).” His answer was that “he would use that magic wand to build a single system that could automatically and instantly juggle information across all of Google’s data centers (Metz, 2012).” The interesting part of this article is that Google has done just that. The solution he described is called Spanner. Spanner is a system that lets Google “juggle data across as many as 10 million servers sitting in ‘hundreds to thousands’ of data centers across the globe (Metz, 2012).” The power of Spanner is that it lets many people handle the data around the world, while “all users see the same collection of information at all times (Metz, 2012).” Spanner accomplishes this task with its TrueTime API. Along with this API, Google has also gone to the trouble of setting up master servers with built-in atomic clocks coupled with GPS to ensure accurate server times. This allows the entire network to stay roughly synced up across all of the different parts of Google’s data infrastructure. The article goes on to say that companies will usually just use a third party as their clock instead of installing their own. It ends on the fact that this kind of approach would cost too much for most companies to implement, but that Google tends to be ahead of the curve. read more...
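TrueTime itself is internal to Google, but the core idea, exposing clock uncertainty as an interval and waiting that uncertainty out before a commit becomes visible, can be sketched roughly as follows (a toy illustration, not Google’s actual API; the uncertainty bound is an assumed value):

```python
# Toy illustration of the TrueTime idea: the clock returns an interval
# [earliest, latest] reflecting its uncertainty, and a transaction waits out
# that uncertainty before its commit timestamp is treated as final.
import time

CLOCK_UNCERTAINTY_MS = 7  # assumed bound, as if provided by GPS/atomic-clock masters

def true_time_now():
    """Return the current time as an uncertainty interval (earliest, latest)."""
    now = time.time()
    eps = CLOCK_UNCERTAINTY_MS / 1000.0
    return now - eps, now + eps

def commit_with_wait():
    """Pick a commit timestamp, then wait until it is certainly in the past."""
    _, latest = true_time_now()
    commit_ts = latest
    # "Commit wait": hold the transaction until every clock has surely passed
    # commit_ts, so timestamp order matches real-time order across data centers.
    while true_time_now()[0] < commit_ts:
        time.sleep(0.001)
    return commit_ts

print("committed at", commit_with_wait())
```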

Data as a New Market {3}

by Phuong H
In the article “Data Markets: The Emerging Data Economy” written by Gil Elbaz, the author talks about how people turn data into a new market in which they collect, analyze and sell it. There are advantages for both parties: one can make money from it, while the other can use the data without having to maintain it. The author also gives two examples of data markets: Jigsaw and Kaggle. Jigsaw is a collection of contact information collected from individuals and organizations. Kaggle, on the other hand, is more of a community where a company provides the data and people from around the world join to analyze it, make predictions, find patterns, or pursue whatever the goal of the project is. In return, these contributors get a reward. read more...

IBM’s Data Integration Solution {2}

by Andrew S
The author talks about how IBM bought Ascential Software for $1.1 billion in order to expand its offerings in the market for corporate software. Ascential makes software that helps companies gather and combine information from different computer sources into one system. This purchase would make IBM one of the leading companies in the emerging business of data integration. Data integration is one of the fastest growing trends in the technological world, and acquiring Ascential would fill a hole in IBM’s enterprise software offerings. It will allow IBM to offer its corporate customers a unified view of their data, regardless of where that data resides. The need for data integration is increasing rapidly across every industry, and this is just one way IBM is adapting to the situation. read more...

McAfee and Sentrigo Data Security {3}

by Andrew H
For my last blog post I read an article by Leena Rao called “Intel’s McAfee Acquires Sentrigo to Boost Database Security Offerings.” The article talks about how McAfee announced the acquisition of Sentrigo. Sentrigo had raised $19 million in venture funding and offers host-based software that protects enterprise databases. It does so by monitoring all of the database’s activity in real time and then providing alerts, an audit trail, virtual patching, and automatic intrusion capabilities. This was all done via a program called Hedgehog, which has a very small footprint and allows for complete monitoring of all database activities with no interference of any kind. McAfee had actually partnered with Sentrigo in 2010, and in 2011 they acquired them. McAfee’s Vulnerability Manager is a product that both companies had worked on before the acquisition; it automatically discovers all databases on a network, scans them, collects a full inventory of configuration details, and determines if the latest patches have been applied. Their Database Activity Monitoring (DAM) not only tracks changes in the database but also protects the data from external threats with real-time alerts and session termination. read more...
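Conceptually, database activity monitoring boils down to inspecting statements as they run, raising real-time alerts on policy violations, and cutting off the offending session. The sketch below illustrates that idea only; it is not how Hedgehog or McAfee’s products are implemented, and the patterns and session names are made up:

```python
# Conceptual sketch of database activity monitoring (DAM): watch each
# statement, alert on suspicious activity, and terminate the session.
SUSPICIOUS_PATTERNS = ["drop table", "select * from credit_cards", "xp_cmdshell"]

active_sessions = {"session-42": True}

def terminate(session_id):
    active_sessions[session_id] = False
    print(f"[action] session {session_id} terminated")

def monitor(session_id, sql):
    """Inspect a statement in real time and react to policy violations."""
    for pattern in SUSPICIOUS_PATTERNS:
        if pattern in sql.lower():
            print(f"[alert] {session_id}: suspicious statement: {sql!r}")
            terminate(session_id)
            return False
    print(f"[audit] {session_id}: {sql}")  # every statement goes to the audit trail
    return True

monitor("session-42", "SELECT name FROM employees WHERE id = 7")
monitor("session-42", "DROP TABLE employees")
```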

Planning for Disaster {4}

by Rudy P
The article I am blogging about this week is titled “Disaster Preparedness: Planning Ahead,” written by Samara Lynn from PCMag.com. This article gives a few suggestions for planning for a disaster and really stresses the importance of the planning phase. The article gives an example of an earthquake doing damage to a database server, and states that without prior planning, IT may be “scrambling to find a place to set up a replacement server, take a copy of the data and applications from the damaged server, and then restore that data and re-install mission-critical apps to give end-users the alternative access they need to continue key operations” (Lynn, 2012). The author states that a disaster preparedness plan must keep the data and apps required for day-to-day operations running in a remote location and ready to be accessed. The author gives three steps for a company to be prepared for a disaster. The company should think about and prepare for the disasters most likely to affect the immediate area, such as a hurricane hitting a location near the coast. It should determine how those disasters would impact its IT infrastructure and systems. The company should also hold interdepartmental meetings in order to keep the non-IT departments involved in the planning. Today, disaster preparedness is easier than ever to deploy because of technological advances such as cloud computing, virtualization and the increasing power of mobile/portable devices. read more...
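As a simple illustration of keeping a restore point ready at a remote location, here is a rough sketch; the paths and file names are hypothetical, and a real plan would more likely rely on replication or a cloud backup service than plain file copies:

```python
# Sketch of the "keep a copy ready off-site" idea: ship the nightly database
# dump to remote storage and verify the copy is intact. Paths are hypothetical.
import hashlib
import shutil
from pathlib import Path

LOCAL_BACKUP = Path("/backups/sales_db_nightly.dump")
REMOTE_COPY  = Path("/mnt/offsite/sales_db_nightly.dump")  # e.g. cloud-mounted storage

def checksum(path):
    return hashlib.sha256(path.read_bytes()).hexdigest()

def replicate_offsite():
    """Copy the latest dump off-site and confirm it can be read back unchanged."""
    shutil.copy2(LOCAL_BACKUP, REMOTE_COPY)
    if checksum(LOCAL_BACKUP) != checksum(REMOTE_COPY):
        raise RuntimeError("off-site copy is corrupt; disaster plan is not ready")
    print("off-site copy verified; restore point is ready")

if __name__ == "__main__":
    replicate_offsite()
```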