Database Design

Big Data and Analytics

by Jennifer R
The author discusses how the business intelligence tools used to interpret data are changing as people continually adjust databases to process massive amounts of data. Traditionally, the data to be analyzed was relational and stored in cubes, with retrieved information “delivered as standard reports”. There is now demand for tools with ‘data discovery’ properties, which work with near-real-time data “to create adhoc reports and graphs”. Where databases are stored is changing along with the analytic tools. Databases used to be stored on disk because their size made storage in RAM impractical. Improvements have made storing databases in memory feasible; the author describes such databases as in-memory databases. There has also been talk of ‘predictive analytics’ tools that “try to anticipate what will happen based on trends they spot in the data”.

Facial Recognition Database

by Jennifer R
The authors discuss the growing interest in facial recognition databases and argue that the databases researchers currently study do not mimic real-world scenarios. Researchers should take into account that lighting may not be ideal for a clear image, or that the subject may be moving or looking in a direction that obscures the face or results in a poor-quality image. The authors also point out that the subject's distance from the camera affects the image, as does the type of camera being used. They created their own database by setting up six cameras of differing quality, with one high-quality camera set aside for mug shots. The article details the criteria they used for naming images, such that the names were unique and carried information about the image. Participants followed a set of instructions ensuring that they walked past the cameras in the same way as everyone else. The authors also captured images at night. The purpose of such a database is to explore the problems of facial recognition programs that have yet to be addressed and to show that factors in the real-world environment can greatly affect recognition performance.

What to do With Too Much Data

by Tyler K

In the article, the author discusses how the modern database often extends beyond a few hundred entities; modern-day companies regularly wade through terabytes of information, trying to drag useful and meaningful context out of it. Several major problems are brought up: searching through the data is tedious and yields irrelevant results; metadata varies in usefulness, and its context might not be understood by others; attributes can mean the same thing but be sorted separately (for example, Mac, Macintosh, Apple Computer, and iMac could all be different ways to describe the same product); and it is very difficult to standardize the data, to determine who regulates and incorporates the standardization, and to decide whether it is even worth the time to do so. The solution offered is simple: relax the standard. Allow a little differentiation, create unified product descriptions that can catch multiple ways of describing the same object, and determine who is responsible for ensuring data integrity. Even then, there is no hard solution. The conclusion is that future database management systems will need to form patterns and relationships with data, keep well-documented information on where data originates, and provide a way to understand how much is being lost to inaccuracies in the data.
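As a rough illustration of the "unified product description" idea, the sketch below maps several spellings of a product to one description and reports values that no rule covers, so that whoever is responsible for data integrity can review them. The canonical name "Apple Macintosh", the CANONICAL table, and the function names are assumptions made for this example; the article does not prescribe an implementation.

```python
# Hypothetical synonym table: each known variant maps to one unified description.
CANONICAL = {
    "mac": "Apple Macintosh",
    "macintosh": "Apple Macintosh",
    "apple computer": "Apple Macintosh",
    "imac": "Apple Macintosh",
}


def normalize_key(description: str) -> str:
    """Lower-case the text and collapse runs of whitespace."""
    return " ".join(description.lower().split())


def unify(description: str) -> str:
    """Return the unified description, or the trimmed input if no variant matches."""
    return CANONICAL.get(normalize_key(description), description.strip())


def unmatched(descriptions):
    """List values that no rule covers, so a person can review them."""
    return [d for d in descriptions if normalize_key(d) not in CANONICAL]


raw = ["Mac", "  Macintosh", "Apple  Computer", "iMac", "ThinkPad"]
unified = [unify(d) for d in raw]
needs_review = unmatched(raw)
```

Keeping the unmatched values visible, rather than silently dropping them, is one way to see how much is being lost to inconsistent descriptions.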

Using RFID to Create the Hospital of the Future

by Tyler K
This article takes on the rather daunting task of restructuring a hospital’s information management systems. Most American hospitals currently rely on a clerk-based system in which medical history and health information are stored in a rather unstructured manner, with the level of detail varying from one medical instance to the next and even more from patient to patient. For instance, one patient might have a note attached to his medical history stating “smokes 2 cigarettes per day,” whereas another might have “smokes habitually.” Each has a similar meaning, but how much detail is actually stored? In addition, a hospital has a few staff members dealing with multitudes of patients, very archaic means of storing information (as on any medical show, patient information is often still kept on a clipboard in the patient’s room), and very confusing structures for staff, since doctors continually deal with changing patient rosters. The article suggests using wearable RFID tags as well as photo-based technology to track patients and store information. It also introduces a better method of recording patient-staff interactions: interactions are logged by location, so a doctor can start working with a patient right away instead of logging in and waiting to receive patient information, and new patient records can be created instantly where necessary.
This topic is relevant to the material covered this week, which included segments on both physical design and implementation, possibly an implementation that could be facilitated via SQL. The article mentions how location-based interactions with a patient could automatically query the hospital’s patient database and bring up all relevant information, formatting it into a useful context (all tasks that would require some sort of structured query). In addition, there is a segment giving a topical overview of the physical structure and usage of the system, as well as an infrastructure overview; both topics are relevant to what we have already learned and to what we might learn in the future.
Since several other students have written about the future of information systems and America’s medical system, the ideas expressed in this article seem likely to be put to use in the near future. As such, it may be prudent to read and learn about such implementation structures, as we may well be translating theory into application before long.
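To make the location-based lookup concrete, here is a minimal sketch of what such a structured query might look like, using Python's built-in sqlite3 module. The table names, columns, tag identifiers, and the on_tag_read function are all hypothetical; the article does not describe its schema, so this is only one way the idea could be arranged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE patient (
        tag_id TEXT PRIMARY KEY,   -- hypothetical RFID tag identifier
        name   TEXT NOT NULL,
        notes  TEXT
    );
    CREATE TABLE interaction (
        staff_id TEXT NOT NULL,
        tag_id   TEXT NOT NULL REFERENCES patient(tag_id),
        room     TEXT NOT NULL
    );
""")
conn.execute(
    "INSERT INTO patient VALUES (?, ?, ?)",
    ("TAG-001", "Sample Patient", "smokes 2 cigarettes per day"),
)


def on_tag_read(staff_id, tag_id, room):
    """Log the interaction and return (name, notes); create a stub record for an unknown tag."""
    row = conn.execute(
        "SELECT name, notes FROM patient WHERE tag_id = ?", (tag_id,)
    ).fetchone()
    if row is None:
        # Unknown tag: create a new patient record on the spot.
        conn.execute(
            "INSERT INTO patient (tag_id, name) VALUES (?, ?)", (tag_id, "UNKNOWN")
        )
        row = ("UNKNOWN", None)
    conn.execute("INSERT INTO interaction VALUES (?, ?, ?)", (staff_id, tag_id, room))
    return row


known = on_tag_read("DR-7", "TAG-001", "Room 12")
new = on_tag_read("DR-7", "TAG-999", "Room 3")
```

The point of the sketch is that the tag read itself triggers both the query and the log entry, so the doctor never has to log in first.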

Medical Application of Database Management

by Jennifer R
The article describes how database management provided a solution for a medical practice seeking to update its technology and improve its efficiency and patient care. The author describes the practice, consisting of 4 physicians and 55 workers, which for three decades did most of its communication by faxing and hand-carrying information. He goes on to point out that the practice “lacked e-mail and an intranet, and had no system to centrally manage technology, data storage, or security.” Much of the work was done on paper and then transcribed onto the computer. The solution was implemented by a consulting firm, itSynergy, who “established a Microsoft Windows Small Business Server-based network as a wide area network.” They also installed a server at each of the three offices running Microsoft Windows Server 2003 and Microsoft SQL Server 2005. The medical practice now uses gloEMR, an electronic medical records management package, to update and manage patient records. The new system cut the cost of transcribing medical records and improved communication within the practice.

Stolen-Cellphone Database to Combat Mobile Theft

by Jennifer R
 

The article describes how the major wireless service providers are making a joint effort to build a database that keeps track of stolen cellphones. As technology improves cellphones, it also increases the value of and demand for smartphones on the market. According to the article, the Metropolitan Police Department says that “in Washington, D.C., cellphone-related robberies jumped 54% from 2007 to 2011”. The providers will launch their own databases before merging them into a single national database. The article lists some important issues that need to be addressed in designing the database, starting with how to deal with the different technologies the providers use. It also notes that the way some phones are currently identified, such as by SIM card for AT&T and T-Mobile phones, can make them more attractive to thieves. Stolen-cellphone databases are already in use in the United Kingdom, Germany, France, and Australia.

Tips on how to choose the correct data types

by Willen L
In this article the author focuses on the importance of choosing proper data types in order to maintain data quality. He gives some general tips and rules to follow to ensure the correct type is chosen. First, if the data is numeric, favor the SMALLINT, INTEGER, or DECIMAL data types. Second, if the data is character data, use the CHAR or VARCHAR data types. Third, if the data is a date or time, use the DATE, TIME, or TIMESTAMP data types. Lastly, if the data is multimedia, use the GRAPHIC, VARGRAPHIC, BLOB, or DBCLOB data types. These rules seem simple enough, but he states that the use of improper data types is a widespread problem. The most common cases he encountered were CHAR used for date data and CHAR used for numeric data. Choosing the wrong data type may slow down the system, and in general it is best to assign the data type that best matches the values in the domain to improve overall data quality.
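The two common mistakes named above, character types used for numeric data and for date data, can be illustrated without a database at all. The short sketch below uses plain Python values as stand-ins for column types; the sample values and the is_valid_date helper are made up for illustration and do not come from the article.

```python
from datetime import date

# Numeric values held as character data are compared character by character.
as_char = ["9", "10", "100"]
as_int = [9, 10, 100]
sorted_char = sorted(as_char)   # ['10', '100', '9'], not numeric order
sorted_int = sorted(as_int)     # [9, 10, 100]

# Dates held as MM/DD/YYYY text compare by month first, not chronologically.
char_dates = ["12/01/2011", "01/15/2012"]
real_dates = [date(2011, 12, 1), date(2012, 1, 15)]
earliest_char = min(char_dates)  # picks the 2012 entry
earliest_real = min(real_dates)  # picks the 2011 entry


def is_valid_date(year, month, day):
    """A real date type rejects impossible values; a character field would accept them."""
    try:
        date(year, month, day)
        return True
    except ValueError:
        return False
```

A character column would happily store "02/30/2012", whereas a date type refuses it, which is the data-quality argument in miniature.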

ERD vs UML? what do employers want?

by Willen L
In this article the author discusses employer demand for ERD versus UML and whether employers prefer one over the other. In the fast-changing IT field it is sometimes difficult to tell which skills are preferred in the profession, but online job search tools make it possible to gauge where the demand is. The author analyzed data obtained from SkillPROOF going back to the beginning of 2004 and wrote the article two years later, in 2006. The data was collected daily from 137 IT-focused companies, and a total of 35,932 jobs were recorded. The study used a keyword categorization of the historical data and a sampled content analysis over one week to dig deeper into the matter. It found that data modeling, when searched for without a specific methodology, is one of the required knowledge bases. In other words, many jobs ask for data modeling without specifying the type of modeling (ERD or UML). UML appears more on the application development side and is often listed as a critical skill. ERD tends to be associated with database design and maintenance and is often accompanied by skills in software such as ERwin, Visio, and TOAD.

How to Design a Better Database

by Penny P
The authors of this article discuss how physical database design is under-studied because database administrators have to maintain databases on a daily basis. The idea is to design databases that can adjust themselves to the characteristics of applications, for example by tuning index definitions or automatically gathering information from SQL workloads. The authors explain that self-tuning logical database design is needed as well. Most databases contain information that causes data redundancies and null values. The two main ways of keeping databases efficient are to 1) reduce the length of the join paths “without sacrificing the normal form based on functional dependencies” and 2) reduce the length of the join paths “by introducing data redundancy” (Marchi, Hacid & Petit). Null values may not be of much importance to database designers, but they matter greatly to the database programmers who write the SQL queries. The authors conclude that a good design cannot be obtained at the time the database is first designed; a better design can, however, be produced afterward. Using SQL workloads, they can tune the database and separate the information that is needed from the information that is not. The SQL statements should be used to do three main things: minimize null values, maximize the efficiency of the queries performed, and preserve data integrity.
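Why null values matter so much to the programmer writing queries can be seen in a small example. The sketch below uses Python's built-in sqlite3 module with a made-up employee table; it illustrates standard SQL null behavior and is not the authors' tuning method.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT NOT NULL, dept TEXT)"
)
conn.executemany(
    "INSERT INTO employee (name, dept) VALUES (?, ?)",
    [("Ana", "Sales"), ("Ben", "IT"), ("Cy", None)],  # Cy has no department (NULL)
)

# COUNT(*) counts rows; COUNT(dept) silently skips rows where dept is NULL.
total = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
with_dept = conn.execute("SELECT COUNT(dept) FROM employee").fetchone()[0]

# A comparison against NULL is neither true nor false, so Cy is not returned here.
not_sales = [
    r[0]
    for r in conn.execute(
        "SELECT name FROM employee WHERE dept <> 'Sales' ORDER BY name"
    )
]

# The programmer has to handle the NULL case explicitly to get every non-Sales row.
not_sales_or_null = [
    r[0]
    for r in conn.execute(
        "SELECT name FROM employee "
        "WHERE dept <> 'Sales' OR dept IS NULL ORDER BY name"
    )
]
```

Every nullable column forces this kind of extra condition onto each query that touches it, which is one reason minimizing null values is listed as a goal.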

Reference Data

by Alexander V
Summary:

The article discusses advances in database design and modeling and why modeling is a necessary skill. It then asks whether different data models have different rates of accuracy and whether a data model can hold all the design information for a database. The author states that there are limits on what models can do and that failing to understand those limits can lead to data management problems. He says data modeling in general is focused on the logical level, which is a good thing. The problem the author brings to light arises when there is a divide between the logical and physical database design and the data entered into the database acts to “specify a layer of design.” This type of data is called reference data and is commonly referred to as “code tables, lookup tables or domain values.” Reference data typically consists of a “code,” which is the primary key, and a description. Reference data has many important properties that other types of data do not have; one is that a “code” value usually has a definition. The author defines reference data as “any kind of data that is used solely to categorize other data found in a database or solely for relating data in a database to information beyond the boundaries of the enterprise.” One way that reference data specifies database design is by “effectively replacing attributes in entities.” He says one of the biggest problems with reference data design is the failure to assign definitions to data values, which leads to the divide between logical and physical models. Data models and databases are full of reference data tables, and business users are usually left to deal with data values that are not found in the data model yet are needed to understand the database. The author concludes that better tools and techniques are needed to deal with this problem.
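A code table of the kind described above can be sketched in a few lines. The example below uses Python's built-in sqlite3 module; the order_status and customer_order tables, the codes, and the try_insert helper are all invented for illustration, and the definition column is one possible way to address the missing-definitions problem, not something the article specifies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE order_status (           -- hypothetical reference (code) table
        code        TEXT PRIMARY KEY,
        description TEXT NOT NULL,
        definition  TEXT NOT NULL         -- what the code actually means
    );
    CREATE TABLE customer_order (
        id          INTEGER PRIMARY KEY,
        status_code TEXT NOT NULL REFERENCES order_status(code)
    );
    INSERT INTO order_status VALUES
        ('P', 'Pending', 'Order received but not yet shipped'),
        ('S', 'Shipped', 'Order handed to the carrier');
    INSERT INTO customer_order (status_code) VALUES ('P'), ('S'), ('P');
""")

# The codes categorize other data; the join makes them readable to business users.
rows = conn.execute(
    "SELECT o.id, s.description FROM customer_order o "
    "JOIN order_status s ON s.code = o.status_code ORDER BY o.id"
).fetchall()


def try_insert(status_code):
    """Return True if the order is accepted, False if the code is not in the reference table."""
    try:
        conn.execute(
            "INSERT INTO customer_order (status_code) VALUES (?)", (status_code,)
        )
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        conn.rollback()
        return False
```

Here the set of valid statuses lives in the data rather than in the model, which is the sense in which reference data acts as a layer of design.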