Incorporating Concepts for Bioinformatics Data Modeling into EER Models

by Jungh K
For this week’s blog assignment, I found an article titled “Incorporating Concepts for Bioinformatics Data Modeling into EER Models”, which discusses a variation of ER conceptual modeling designed to accommodate bioinformatics data. The authors suggest that extending a traditional ER model with two special relationships, called ordering and functional, enables better modeling of bioinformatics data at the conceptual level. In biological modeling, order is especially important because protein synthesis occurs as a sequence of events. An advantage of the ordering relationship is that it lets users map out the whole DNA more easily when DNA fragments (>= 100,000 bases) are stored in this fashion. The functional relationships represent crucial processes such as “translation” and “transcription”. The “translation” relationship notation states that two data inputs, tRNA and mRNA, are translated into an output, protein. By compartmentalizing each activity into these processes, the data modeler can more efficiently translate complex activities into a database. In conclusion, the EER model this article proposes doesn’t change the fundamentals of a traditional ER model. However, just by adding new relationships and their notations, the data modeler can design a database suited to the characteristics of bioinformatics data.
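
To make the two relationship types a bit more concrete, here is a rough Python sketch of how I picture them. This is only my own illustration, not the notation or implementation from the article: the class names (Fragment, OrderedSequence, FunctionalRelationship), the position field, and the tiny example fragments are all assumptions I made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Fragment:
    """A stored DNA fragment; `position` (hypothetical) is its rank in the ordering."""
    position: int
    bases: str


@dataclass
class OrderedSequence:
    """Hypothetical stand-in for an ordering relationship over fragments."""
    fragments: list = field(default_factory=list)

    def add(self, fragment):
        self.fragments.append(fragment)

    def assemble(self):
        # Sort by position so the fragments are joined in sequence order,
        # regardless of the order in which they were stored.
        ordered = sorted(self.fragments, key=lambda f: f.position)
        return "".join(f.bases for f in ordered)


@dataclass
class FunctionalRelationship:
    """Hypothetical stand-in for a functional relationship: named inputs, one output."""
    name: str
    inputs: tuple
    output: str

    def describe(self):
        return f"{self.name}: {' + '.join(self.inputs)} -> {self.output}"


# Fragments are added out of order; the ordering restores the full sequence.
dna = OrderedSequence()
dna.add(Fragment(2, "GGCC"))
dna.add(Fragment(1, "ATTA"))
print(dna.assemble())

# "Translation" takes tRNA and mRNA as inputs and yields protein as output.
translation = FunctionalRelationship("translation", ("tRNA", "mRNA"), "protein")
print(translation.describe())
```

The point of the sketch is only that an ordering relationship carries position information alongside the data, while a functional relationship names a process together with its inputs and its output.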

The article on EER (Extended or Enhanced ER) directly complements this week’s class lecture. Having some basic background knowledge of EER helped me understand the article’s main points more easily. Also, seeing how EER is applied in a real-life situation helped me grasp its concepts and advantages. Despite other students’ posts about how the ER model is becoming obsolete for the complex and enormous data readily available these days, I believe that extending the traditional ER model in specific ways can overcome its inherent limitations.

Only 4 types of nucleotides (A, T, C, and G) make up DNA, and only 20 amino acids make up proteins. Given how complex and enormous the data written in our genes is, it is amazing that a simple model like the one shown in the article can accommodate 3 to 4 billion DNA base pairs.

*The figure shown above is taken directly from the article.

Source: Feng Ji; Elmasri, R.; Yiming Zhang; Ritesh, B.; Raja, Z., “Incorporating concepts for bioinformatics data modeling into EER models,” Computer Systems and Applications, 2005. The 3rd ACS/IEEE International Conference on, pp. 189-192, 6-6 Jan. 2005
doi: 10.1109/AICCSA.2005.1387039