normalization

You Can Still Use RDBMS over NoSQL {1}

by Hieu H
NoSQL is definitely the buzz in the database world. Open source packages such as MongoDB and FoundationDB make NoSQL as accessible as it can get. There are still benefits to using relational databases, such as normalization, shared data, and maturity. A relational database is the better choice over NoSQL when you’re building smaller databases that are still going to change over time, when there is so much duplicate data that you have to normalize, and when there is no cost advantage to moving away from already proven technology. read more...

Why Do We Need Normalization? {4}

by Hongde H
The article I read explains why we need database normalization, an important part of database design that helps reduce redundancy and dependency. Normalization divides large tables into smaller ones, which reduces redundancy and better defines the relationships between the data. In other words, it isolates data so that additions, deletions, and modifications of a field can be made in just one table and then propagated through the rest of the database via the defined relationships. read more...
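To make the "change it in one table" idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The customers and orders tables and their contents are made up for illustration and do not come from the article.

```python
import sqlite3

# Hypothetical schema: the customer's city is stored once, in customers.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        city        TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        item        TEXT NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Ada', 'Boston');
    INSERT INTO orders VALUES (10, 1, 'keyboard'), (11, 1, 'monitor');
""")

# A single UPDATE against a single table...
conn.execute("UPDATE customers SET city = 'Denver' WHERE customer_id = 1")

# ...is visible from every order that is related to that customer.
rows = conn.execute("""
    SELECT o.order_id, c.name, c.city
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    ORDER BY o.order_id
""").fetchall()
print(rows)
```

Both orders report the new city even though only one row was modified, because the relationship, not a copy of the data, connects them.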

Relational Database 101 {2}

by Kathy S
For students who have no prior experience with database design, this is a great read. The author explains the introductory terms and how data is organized and represented in a relational database. The following are the basics one needs to know. In a relational database, data is stored in a two-dimensional matrix (a table) made up of columns and rows. Relational Database Management System (RDBMS) software gives users the ability to read and manipulate that data. The RDBMS relies on SQL (Structured Query Language) constructs and keywords to access the tables and the data contained within their columns and rows. The article clarifies that each table in a relational database contains information about a single type of data and has a unique name that is distinct from all other tables in that schema (a grouping of objects/tables that serve a similar business function). The author then points out the keys to good relations. The primary key is very important; it is a column that ensures uniqueness for every row in a table. The foreign key is the connector that relates tables and organizes information across them: it identifies a column or set of columns in one table that refers to a column or set of columns in another table. The author then states that the key to understanding relational databases is knowledge of data normalization and table relationships. The objective of normalization is to eliminate redundancy and thereby avoid future problems with data manipulation. Five normal forms are commonly accepted, but many programmers, analysts, and designers do not normalize beyond the 3rd normal form, although experienced database designers may. The author goes on to describe what the 1st, 2nd, and 3rd normal forms look like. Lastly, the article mentions how SQL fits in: SQL is used to create new data, delete old data, modify existing data, and retrieve data from a relational database. read more...
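As a rough illustration of what primary and foreign keys enforce, the sketch below uses Python's built-in sqlite3 module. The departments and employees tables and the try_insert helper are hypothetical and only serve the example. Note that SQLite checks foreign keys only after they are switched on for the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign key checks off by default
conn.executescript("""
    CREATE TABLE departments (
        dept_id   INTEGER PRIMARY KEY,
        dept_name TEXT NOT NULL
    );
    CREATE TABLE employees (
        emp_id   INTEGER PRIMARY KEY,
        emp_name TEXT NOT NULL,
        dept_id  INTEGER NOT NULL REFERENCES departments(dept_id)
    );
    INSERT INTO departments VALUES (1, 'Accounting');
    INSERT INTO employees VALUES (100, 'Lee', 1);
""")

def try_insert(sql):
    """Hypothetical helper: run one INSERT and report whether the database accepted it."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return type(exc).__name__

# Primary key: emp_id 100 already exists, so a second row with it is rejected.
duplicate_key = try_insert("INSERT INTO employees VALUES (100, 'Kim', 1)")
# Foreign key: department 99 does not exist, so the reference is rejected.
dangling_ref = try_insert("INSERT INTO employees VALUES (101, 'Kim', 99)")
# A new emp_id that points at an existing department is accepted.
valid_row = try_insert("INSERT INTO employees VALUES (101, 'Kim', 1)")
print(duplicate_key, dangling_ref, valid_row)
```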

When Is Denormalization Good? {3}

by Hieu H
Database normalization is a process in which redundant and duplicate data is reduced. Normalization provides a more readable and organized database. In fact, it is what most database professionals are taught to do and is normally considered a good practice. Denormalization is the exact opposite. In the denormalization process, you take the separate tables of a normalized database and “combine” them, essentially going backwards through the normal forms. There are certain instances in which this may be beneficial. The pro of denormalization is that access to records is generally faster: SELECT queries perform better and are less complicated because there are fewer JOINs. The con is that INSERT and UPDATE queries take longer. In instances where there are drastically more SELECT queries than INSERT or UPDATE queries, denormalization makes sense and the performance gains greatly outweigh the cons. read more...
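The trade-off described above can be sketched with Python's built-in sqlite3 module. The tables and names below are invented for the example, and row counts stand in for "work done"; this is not a benchmark.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized: the author's name is stored once.
    CREATE TABLE authors (author_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE posts (
        post_id   INTEGER PRIMARY KEY,
        author_id INTEGER NOT NULL REFERENCES authors(author_id),
        title     TEXT NOT NULL
    );
    -- Denormalized: the author's name is copied into every post row.
    CREATE TABLE posts_denorm (
        post_id     INTEGER PRIMARY KEY,
        author_name TEXT NOT NULL,
        title       TEXT NOT NULL
    );
    INSERT INTO authors VALUES (1, 'Ana');
    INSERT INTO posts VALUES (1, 1, 'Joins'), (2, 1, 'Indexes'), (3, 1, 'Keys');
    INSERT INTO posts_denorm VALUES
        (1, 'Ana', 'Joins'), (2, 'Ana', 'Indexes'), (3, 'Ana', 'Keys');
""")

# Reading: the normalized schema needs a JOIN, the denormalized one does not.
joined = conn.execute("""
    SELECT p.title, a.name
    FROM posts AS p JOIN authors AS a ON a.author_id = p.author_id
    ORDER BY p.post_id
""").fetchall()
flat = conn.execute(
    "SELECT title, author_name FROM posts_denorm ORDER BY post_id"
).fetchall()

# Writing: renaming the author touches one row when normalized,
# but every copy of the name when denormalized.
normalized_writes = conn.execute(
    "UPDATE authors SET name = 'Ana M' WHERE author_id = 1"
).rowcount
denorm_writes = conn.execute(
    "UPDATE posts_denorm SET author_name = 'Ana M' WHERE author_name = 'Ana'"
).rowcount
print(normalized_writes, denorm_writes)
```

Both queries return the same rows, but the simpler read is paid for with a wider write, which is why the ratio of SELECT to UPDATE traffic matters.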

Database Design Mistakes {3}

by Hieu H
A poorly designed database can lead to many problems down the road. The problems may not be apparent at first, while there is very little data, but as the database grows you may start to experience inefficiencies and poor performance. The seven most common mistakes that database professionals make are not spending enough (or any) time on documentation, little or no normalization, building before designing, improper storage of reference data, not using foreign keys or constraints, not using naming standards, and improperly choosing primary keys. read more...

Denormalization: Intermediate Step {5}

by Jasmine C
The article I read is very informative. It discusses denormalization techniques and the pros and cons of normalization versus denormalization. Today, normalization is the standard way to design a relational database: data is organized so that updates are minimal and data is easily accessible. However, the biggest disadvantage of normalization is that system performance can be poor. At the moment, there are no concrete guidelines for the denormalization process, yet denormalization has a positive effect on a database’s performance. It has been proposed that denormalization be used in addition to normalization, as an intermediate step that helps system performance. The article describes three approaches used to review denormalization strategies, each of which shows how denormalization positively affects databases. read more...

Normalization vs Denormalization {1}

by Antonio M
This article talks about the differences between normalization and denormalization.
The author also lists the pros and cons of each technique. One advantage of
normalized data is that updates are much faster, because there is no duplicated
data to change. Likewise, an INSERT statement writes the data to a single
location, and a SELECT statement that reads from a single table stays simple.
One issue with normalization is that queries must join tables, and indexing
strategies do not work as well for joins because the data is spread out among
several tables. Denormalization is mostly beneficial when the database carries a
heavy read load. The reason is that most of the data a query needs is present
in the table being selected from, so there is no need for time-consuming joins.
However, a denormalized table contains duplicate data, which makes updates
more complex. The author says that the best approach is a mixture of both, and
that the balance depends on whether the database is read more or updated
more. One suggestion the author mentions is to use triggers on denormalized
tables: when information needs to be updated, the change is made in the table
the duplicate data comes from and carried over to the copies, which saves time
compared with going through each row and updating it one by one. read more...
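One way to picture the trigger suggestion is the sketch below, written with Python's built-in sqlite3 module. The table names, the trigger name sync_author_name, and the data are assumptions made for illustration; the article's own example may look different.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Source table: the single authoritative copy of each author's name.
    CREATE TABLE authors (author_id INTEGER PRIMARY KEY, name TEXT NOT NULL);

    -- Denormalized table: author_name is a duplicate kept for fast reads.
    CREATE TABLE posts_denorm (
        post_id     INTEGER PRIMARY KEY,
        author_id   INTEGER NOT NULL,
        author_name TEXT NOT NULL,
        title       TEXT NOT NULL
    );

    -- Hypothetical trigger: when the source name changes, refresh the copies.
    CREATE TRIGGER sync_author_name
    AFTER UPDATE OF name ON authors
    BEGIN
        UPDATE posts_denorm
        SET author_name = NEW.name
        WHERE author_id = NEW.author_id;
    END;

    INSERT INTO authors VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO posts_denorm VALUES
        (1, 1, 'Ana', 'Joins'), (2, 1, 'Ana', 'Indexes'), (3, 2, 'Ben', 'Keys');
""")

# The application only updates the source table...
conn.execute("UPDATE authors SET name = 'Ana M' WHERE author_id = 1")

# ...and the trigger has already brought the duplicated column up to date.
synced = conn.execute(
    "SELECT post_id, author_name FROM posts_denorm ORDER BY post_id"
).fetchall()
print(synced)
```

The database still rewrites every duplicated row, but the application issues one statement and cannot forget a copy.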

Normalization, Confusion among Textbooks {2}

by Kevin Q
The peer-reviewed journal article I chose is about normalization in general and how textbooks may give different reasons for normalizing. Carpenter begins by discussing “E.F. Codd, the acknowledged father of relational database and normalization.” Codd’s normalization attaches attributes to primary keys, as we have discussed in class. Carpenter then turns to the books by Hoffer et al. and their goals of normalization. They are: “(1) Minimize data redundancy, (2) Simplify the enforcement of referential integrity constraints, (3) Make it easier to maintain data, and (4) Provide a better design that is an improved representation of the real world and a stronger basis for future growth.” Carpenter agrees that these goals are great overall but feels that the real goal of normalization should be “to produce correctly structured relations.” He believes that clarity on the subject of normalization would help students grasp the concept more easily and clear up confusion. Carpenter says that not all data redundancies should be eliminated, as some, though rare, are essential. Carpenter then moves on to the question of what level of normalization we are supposed to reach, and how even textbooks can’t give a unified answer. read more...

Normalization… Why? {1}

by Evin C
Normalizing data has become an important aspect of our course, which is why I chose this article. The full article gives you not only a perspective on normalizing data but also unique ways to approach it. We have already gone over the ways to normalize data in class; the article summarizes that information again in a way that should appeal to more people and help everyone better understand the process. Although it does not have labeled examples, it does have in-depth details about each individual normal form and how each is reached and applied. The article then goes on to evaluate how normalization affects a database's accuracy and performance, saying “a poorly normalized database and poorly normalized tables can cause problems ranging from excessive disk I/O and subsequent poor system performance to inaccurate data.” It even goes on to say “an improperly normalized condition can result in extensive data redundancy, which puts a burden on all programs that modify the data.” read more...

Normalization Made Easy {1}

by Vincent S
The article I read teaches the basics of normalizing data. In this class we recently learned the benefits of normalizing data. If data exists in more than one location in a database, then one should be able to change it in one location and have that change applied to the rest of its matching entries in the database. That is why strong links between redundant entries and formal rules of structure are so important in a database. For data to be considered normalized, it must reach the third normal form. The first normal form requires that a database eliminate repeating groups, create a separate table for each set of related data, and identify each set of related data with a primary key. One would think this set of requirements is expected of any professional database. The second normal form requires that a separate table be created for values that apply to multiple records, and that these tables be related through a foreign key. The third normal form requires that one eliminate any field that does not depend on the key. read more...
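The three steps can be walked through on a small made-up dataset. The sketch below uses plain Python collections rather than a database, and the student, advisor, and class values are hypothetical; it follows the simplified rules as summarized above, not a formal definition of the normal forms.

```python
# Unnormalized: each student row carries a repeating group of classes.
unnormalized = [
    {"student": 1022, "advisor": "Jones", "adv_room": 412,
     "classes": ["101-07", "143-01", "159-02"]},
    {"student": 4123, "advisor": "Smith", "adv_room": 216,
     "classes": ["101-07", "143-01", "179-04"]},
]

# First normal form: eliminate the repeating group, one row per (student, class).
first_nf = [
    (row["student"], row["advisor"], row["adv_room"], cls)
    for row in unnormalized
    for cls in row["classes"]
]

# Second normal form: the advisor data applies to many of those rows, so it
# moves to its own table; registrations keep only the student key and the class.
students_2nf = sorted({(s, adv, room) for s, adv, room, _ in first_nf})
registrations = sorted({(s, cls) for s, _, _, cls in first_nf})

# Third normal form: adv_room depends on the advisor, not on the student key,
# so it moves to a faculty table and students keep only a reference to it.
students = sorted({(s, adv) for s, adv, _ in students_2nf})
faculty = sorted({(adv, room) for _, adv, room in students_2nf})

print(students)
print(faculty)
print(registrations)
```

After the last step an advisor's room is stored exactly once, so moving an advisor to a new room is a change to a single row.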