Parallel Algorithms for Automatic Database Normalization

by Jungh K
For this week’s blog assignment, I read an article called “Parallel Algorithms for Automatic Database Normalization,” written by Amir-H. Bahmani, S. Karem Shekofteh, Mahmoud Naghibzadeh, and Hossein Deldari.  Since sequential normalization is a time-consuming practice, the authors propose mechanizing the normalization process across multiple computers through parallel algorithms, implemented with MPI (Message Passing Interface).  The normalization process starts by scanning each row of the dependency matrix to find partial dependencies.  Once partial dependencies are found, new relations are created to eliminate them, and 2NF is thereby achieved throughout the relational database.  To reach the more desirable 3NF, each row of the dependency matrix is scanned again to find a determinant key whose dependency is “neither partial nor wholly dependent on part of the primary key”.  Once such a determinant key is found, a separate, but not duplicate, table is formed to hold the transitive attributes.  In a scenario developed by the authors, increasing the number of processors greatly reduced the time consumed by the normalization process.  However, when only 2 processors were used, the communication overhead actually slowed down the whole process.
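The article doesn’t include source code, but the two checks it describes, finding partial dependencies for 2NF and transitive dependencies for 3NF, can be sketched roughly in Python. Everything below (the example relation, the `find_*` helper names, and the way functional dependencies are represented as set pairs) is my own illustration, not the authors’ implementation:

```python
# Hypothetical sketch: detect 2NF and 3NF violations from a list of
# functional dependencies (lhs -> rhs), given a composite primary key.
from itertools import combinations

def proper_subsets(key):
    """Yield every non-empty proper subset of the composite primary key."""
    for r in range(1, len(key)):
        for combo in combinations(sorted(key), r):
            yield frozenset(combo)

def find_partial_dependencies(fds, primary_key):
    """FDs whose determinant is a proper subset of the primary key and
    whose dependents are non-key attributes (2NF violations)."""
    pk = frozenset(primary_key)
    subsets = set(proper_subsets(pk))
    return [(lhs, rhs) for lhs, rhs in fds
            if frozenset(lhs) in subsets and not set(rhs) & pk]

def find_transitive_dependencies(fds, primary_key):
    """FDs where a non-key attribute determines another non-key
    attribute (3NF violations)."""
    pk = frozenset(primary_key)
    return [(lhs, rhs) for lhs, rhs in fds
            if not set(lhs) & pk and not set(rhs) & pk]

# Example relation: Order(order_id, product_id, qty, product_name,
#                         cust_id, cust_city)
fds = [
    ({"order_id", "product_id"}, {"qty"}),       # full dependency, fine
    ({"product_id"}, {"product_name"}),          # partial -> split for 2NF
    ({"order_id"}, {"cust_id"}),                 # partial -> split for 2NF
    ({"cust_id"}, {"cust_city"}),                # transitive -> split for 3NF
]
pk = {"order_id", "product_id"}

print(len(find_partial_dependencies(fds, pk)))     # 2 partial dependencies
print(len(find_transitive_dependencies(fds, pk)))  # 1 transitive dependency
```

Each dependency found this way would become its own relation, with the determinant as the new primary key, which is the decomposition step the paper parallelizes across the matrix rows.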

This week’s topic on normalization provided me with the essential knowledge to understand the article.  I believe that the normalization process can become very time consuming and complex as the number of tables in a relational database grows dynamically to accommodate user-created content.  However, parallel algorithms for automatic database normalization can reduce data redundancy and preserve integrity while exploiting the efficiency of multiple processors.  While reading this, I was wondering how multithreaded programming, optimized for 6 or 8 cores, would compete in a real-life situation.
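For the multicore question, the row-scan step the paper distributes via MPI maps naturally onto a thread pool on a single machine. This is just my own toy illustration of partitioning the rows of a dependency matrix across workers, not anything from the article; `scan_row` stands in for the real per-row dependency check:

```python
# Hypothetical sketch: scan dependency-matrix rows in parallel on one
# machine, mimicking the row-wise partitioning the paper does with MPI.
from concurrent.futures import ThreadPoolExecutor

# Toy dependency matrix: 1 marks a dependency flag in that row.
matrix = [
    [1, 0, 1],
    [0, 0, 1],
    [1, 1, 1],
    [0, 1, 0],
]

def scan_row(row):
    """Stand-in for the per-row check; here it just counts flags."""
    return sum(row)

# Each worker scans a disjoint share of the rows.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(scan_row, matrix))

print(results)  # [2, 1, 3, 1]
```

Since the rows are independent, there is no shared state to synchronize, which is exactly why the per-row scan parallelizes well; the paper’s observation that 2 processors can be slower than 1 suggests the per-row work has to outweigh the communication (or thread-dispatch) overhead.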

I have never used more than a few tables to develop an application in real life.  However, the AdventureWorks sample provided by Microsoft for SQL Server shows that the sheer number of tables in a relational database can be overwhelming, especially for a novice.  A simple script for automatic normalization, I believe, could help developers when rapid development is needed.


Bahmani, A.-H.; Shekofteh, S. K.; Naghibzadeh, M.; Deldari, H., “Parallel algorithms for automatic database normalization,” The 2nd International Conference on Computer and Automation Engineering (ICCAE), vol. 2, pp. 157-161, 26-28 Feb. 2010.
doi: 10.1109/ICCAE.2010.5451437

One thought on “Parallel Algorithms for Automatic Database Normalization”

  • October 21, 2012 at 11:47 pm

Thanks for the article. It’s great to know that there are tools and systems that can help us normalize a database. During the lecture, I was thinking how horrible it would be to sit down and manually normalize a large database that has been in production for a while. I almost cringed at the thought of having to sit down for hours on end, creating new tables, deleting duplicate data, etc. With automated tools such as those in your article, life would be much more enjoyable as a database admin.
