Parallel Algorithms for Automatic Database Normalization{1}


For this week’s blog assignment, I read an article called “Parallel Algorithms for Automatic Database Normalization”, written by Amir-H. Bahmani, S. Karem Shekofteh, Mahmoud Naghibzadeh, and Hossein Deldari.  Since normalizing sequentially is a time-consuming practice, the authors propose automating the normalization process across multiple computers using parallel algorithms.  The authors use MPI, which stands for Message Passing Interface, to implement the parallel algorithm.  The process of normalization starts by scanning each row of the dependency matrix to find partial dependencies.  Once partial dependencies are found, new relations are created to eliminate them, which brings the relational database into 2NF.  To reach 3NF, each row of the dependency matrix is scanned once again to find a determinant key whose dependency is “neither partial nor wholly dependent on part of the primary key”.  Once the determinant key is found, a separate, but not duplicate, table is formed to hold the transitive attributes.  In a scenario developed by the authors, adding processors greatly reduced the time spent on the database normalization process.  However, when only 2 processors were used, the communication overhead actually slowed down the whole process.
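To make the 2NF step concrete for myself, here is a minimal Python sketch of the idea of scanning dependencies for partial ones and splitting them into new relations. It is not the authors' algorithm or their dependency matrix format: the dictionary representation of functional dependencies, the function names, and the student/course example are all my own assumptions for illustration.

```python
from typing import Dict, FrozenSet, List, Set

# Hypothetical representation: each "row" maps a determinant (a set of
# attributes) to the set of attributes it determines.
FDs = Dict[FrozenSet[str], Set[str]]


def find_partial_dependencies(primary_key: FrozenSet[str], fds: FDs) -> FDs:
    """Return dependencies whose determinant is a proper subset of the key."""
    partial: FDs = {}
    for determinant, dependents in fds.items():
        non_key = dependents - primary_key
        # "<" on sets means proper subset, i.e. only part of the primary key.
        if determinant < primary_key and non_key:
            partial[determinant] = non_key
    return partial


def decompose_to_2nf(attributes: Set[str], primary_key: FrozenSet[str],
                     fds: FDs) -> List[Set[str]]:
    """Move partially dependent attributes into their own relations."""
    new_relations: List[Set[str]] = []
    moved: Set[str] = set()
    for determinant, dependents in find_partial_dependencies(primary_key, fds).items():
        new_relations.append(set(determinant) | dependents)
        moved |= dependents
    # The original relation keeps the key plus whatever was not moved out.
    return [attributes - moved] + new_relations


# Assumed example: student_name depends on only part of the composite key.
attributes = {"student_id", "course_id", "student_name", "grade"}
primary_key = frozenset({"student_id", "course_id"})
fds = {
    frozenset({"student_id"}): {"student_name"},
    frozenset({"student_id", "course_id"}): {"grade"},
}

for relation in decompose_to_2nf(attributes, primary_key, fds):
    print(sorted(relation))
```

In this example the student name moves into its own relation keyed by student_id, while the grade stays with the full composite key. A 3NF pass would need a similar scan for determinants that are not part of the key at all, which this sketch does not attempt.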

This week’s topic on normalization gave me the background I needed to understand the article.  I believe that the normalization process can be very time consuming and complex as the number of tables in a relational database grows to accommodate user-created content.  However, using parallel algorithms for automatic database normalization can reduce data redundancy and improve data integrity while making efficient use of multiple processors.  While reading this, I wondered how multithreaded programming, optimized for 6 or 8 cores, would compete in a real-life situation.
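As a rough way to picture the multithreaded alternative, the sketch below splits the rows of a dependency table across a small thread pool, with each worker scanning its own share for partial dependencies. This is my own illustration, not the MPI implementation from the article, and the names scan_rows and parallel_find_partial are hypothetical. Note that in standard CPython, threads do not run CPU-bound Python code in parallel, so this shows the work-splitting structure rather than a real speedup; a process pool or MPI would be needed for that.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, FrozenSet, List, Set, Tuple

Row = Tuple[FrozenSet[str], Set[str]]


def scan_rows(rows: List[Row], primary_key: FrozenSet[str]) -> Dict[FrozenSet[str], Set[str]]:
    """Scan one worker's share of rows for partial dependencies."""
    hits: Dict[FrozenSet[str], Set[str]] = {}
    for determinant, dependents in rows:
        non_key = dependents - primary_key
        if determinant < primary_key and non_key:
            hits[determinant] = non_key
    return hits


def parallel_find_partial(fds: Dict[FrozenSet[str], Set[str]],
                          primary_key: FrozenSet[str],
                          workers: int = 4) -> Dict[FrozenSet[str], Set[str]]:
    """Split the rows into one chunk per worker and merge the results."""
    rows = list(fds.items())
    chunks = [rows[i::workers] for i in range(workers)]
    merged: Dict[FrozenSet[str], Set[str]] = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for hits in pool.map(lambda chunk: scan_rows(chunk, primary_key), chunks):
            merged.update(hits)
    return merged


# Assumed toy input: two attributes each depend on half of the key.
sample_key = frozenset({"a", "b"})
sample_fds = {
    frozenset({"a"}): {"x"},
    frozenset({"b"}): {"y"},
    frozenset({"a", "b"}): {"z"},
}
print(parallel_find_partial(sample_fds, sample_key, workers=2))
```

The merge step at the end is the multithreaded counterpart of the communication cost the article describes: with very few workers, the cost of splitting and combining can outweigh the time saved on scanning.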

I have never used more than a few tables to develop a real-life application.  However, the AdventureWorks sample provided by Microsoft for SQL Server shows that the sheer number of tables in a relational database can be overwhelming, especially for a novice.  A simple script for automatic normalization, I believe, could help developers when rapid development is needed.

 

Bahmani, A.-H.; Shekofteh, S.K.; Naghibzadeh, M.; Deldari, H., “Parallel algorithms for automatic database normalization,” Computer and Automation Engineering (ICCAE), 2010 The 2nd International Conference on, vol. 2, pp. 157-161, 26-28 Feb. 2010
doi: 10.1109/ICCAE.2010.5451437