Data modeling

Data Warehousing and the Best Practices for It

Data Warehousing Practices to Support Business Initiatives and Needs

The article I read covers data warehousing architecture and the practices that businesses use. It mentions two methods, the Bill Inmon style and the Ralph Kimball style, and explains in detail how a major U.S. retail company came to choose the Inmon style. The Inmon style calls for storing extracted and transformed data in an atomic-level, third-normal-form relational format. The company judged this method to be the most useful and applicable to its needs. The author also explores best practices for data warehousing, covering data modeling, loading, attributes, and other important factors. The article concludes with the results of these practices: many departments benefit from queries and requests against the data warehouse, which has become a valuable source of data for the entire company.
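
To make "atomic-level, third-normal-form" a little more concrete, here is a minimal sketch in Python using the standard sqlite3 module. The tables, columns, and sample rows are hypothetical and are not taken from the article or from the retail company it describes. Each fact is stored once (a city only in the store table, a price only in the product table), and every sale is kept as an individual line rather than a pre-computed total.

```python
import sqlite3


def build_warehouse() -> sqlite3.Connection:
    """Create a tiny in-memory database with hypothetical 3NF tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Each non-key column depends only on its own table's key.
        CREATE TABLE store (
            store_id INTEGER PRIMARY KEY,
            city     TEXT NOT NULL
        );
        CREATE TABLE product (
            product_id       INTEGER PRIMARY KEY,
            name             TEXT NOT NULL,
            unit_price_cents INTEGER NOT NULL
        );
        -- Atomic level: one row per item sold, no stored totals.
        CREATE TABLE sale_line (
            sale_line_id INTEGER PRIMARY KEY,
            store_id     INTEGER NOT NULL REFERENCES store(store_id),
            product_id   INTEGER NOT NULL REFERENCES product(product_id),
            quantity     INTEGER NOT NULL
        );
        """
    )
    # Made-up sample data for illustration only.
    conn.executemany("INSERT INTO store VALUES (?, ?)",
                     [(1, "Austin"), (2, "Denver")])
    conn.executemany("INSERT INTO product VALUES (?, ?, ?)",
                     [(1, "Mug", 500), (2, "Lamp", 2000)])
    conn.executemany("INSERT INTO sale_line VALUES (?, ?, ?, ?)",
                     [(1, 1, 1, 3), (2, 1, 2, 1), (3, 2, 1, 2)])
    conn.commit()
    return conn


def revenue_by_city(conn: sqlite3.Connection) -> list:
    """Derive revenue per city at query time by joining the atomic rows."""
    rows = conn.execute(
        """
        SELECT s.city, SUM(l.quantity * p.unit_price_cents)
        FROM sale_line AS l
        JOIN store   AS s ON s.store_id = l.store_id
        JOIN product AS p ON p.product_id = l.product_id
        GROUP BY s.city
        ORDER BY s.city
        """
    ).fetchall()
    return rows
```

With this sample data, revenue_by_city(build_warehouse()) gives 3500 cents for Austin and 1000 cents for Denver. Summaries like this are computed from the atomic rows when a department asks for them, which is one reason a normalized store can serve many different requests.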

Description or Design: Which is Better for Data Modeling?

The article I read focuses on the question of how data modeling is characterized: is it design, or is it description? The researchers surveyed hundreds of people to find out what they thought. Questions ranged from what data modelers believe the scope of the design process to be, to whether different data modelers will produce different conceptual data models for the same scenario (Simsion, Milton, & Shanks, 2012). Over the course of the surveys, the researchers discovered that many data modelers view database modeling as design, while other subjects, such as the business problem, were seen as more descriptive than creative. The contributors who felt problems can be handled with design found that businesses don’t really know what they want, which was an opportunity to introduce creative and new perspectives. Overall, the researchers found that opinions on challenging business requirements were split between design and description, that data modeling was considered a creative activity, and that data modeling does not have one single right answer (Simsion, Milton, & Shanks, 2012).

Defining Terms in Data Models

There’s a challenge when it comes to defining terms for data models. The author of the article asks the question, “Does defining the actions something performs solve our definition issues? Or are we instead adding complexities, for example, assigning more than one meaning to the same data element.” The responses to those questions were grouped into three categories. “Defining a term by its actions is an effective technique,” according to Madhu Sumkarpalli, a business intelligence consultant. He says it is better because it lets modelers be specific about the term, or close to specific, rather than generic and abstract. Basing the term on its actions can define it appropriately and paint the proper picture. “Defining something by its actions is part of the solution,” according to Amarjeet Virdi, a data architect. He says data entities are meant to represent real-life objects, and those objects perform functions. A new question then comes up: when the object ceases to perform its function, does it cease to exist or have no value to the business anymore? Complexity increases. “Defining something by its actions is not recommended,” according to Wade Baskin, a senior database architect. He says mixing process with data is a dangerous practice. Data should have only one definition regardless of the process. If the data changes as it matures, the change should be reflected as a different data element. It is not good to change the current definition of an element based on process or location, and fields with multiple meanings are dangerous and should be avoided. The author feels that defining a term by what it does is effective, or at least a starting point, because most business professionals define things by the roles they play, for example a person playing the role of a customer. The problem, though, is that this approach may eventually lead to data integration issues, hidden business logic, and the question of what will happen to the term itself when the activity it performs stops.
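
One common way to reason about the "person playing the role of a customer" example is to keep the person and the role as separate things, so that the person still has a single definition when the role ends. The sketch below is only an illustration of that idea under my own assumptions; the class names, fields, and dates are hypothetical and do not come from the article or from any of the people quoted in it.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Role:
    """A role someone plays for a period of time (hypothetical structure)."""
    name: str
    start: date
    end: Optional[date] = None  # None means the role is still active

    def active_on(self, day: date) -> bool:
        # Active from the start date up to, but not including, the end date.
        return self.start <= day and (self.end is None or day < self.end)


@dataclass
class Person:
    """The person has one definition, independent of any role."""
    person_id: int
    full_name: str
    roles: List[Role] = field(default_factory=list)

    def plays(self, role_name: str, day: date) -> bool:
        # True if any role with this name is active on the given day.
        return any(r.name == role_name and r.active_on(day) for r in self.roles)


# Made-up example: the customer role ends, but the person record remains.
ana = Person(1, "Ana Diaz",
             [Role("customer", date(2020, 1, 1), date(2022, 1, 1))])
was_customer = ana.plays("customer", date(2021, 6, 1))
is_customer_later = ana.plays("customer", date(2023, 1, 1))
```

Here was_customer is True and is_customer_later is False, while the Person record itself is unchanged. Whether this separation is worth the extra structure is the kind of trade-off the three groups of respondents disagree about.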

Kaggle

The article I chose to read this week is about Kaggle, a platform for predictive data modeling competitions. According to Leena Rao, Kaggle has raised $11 million in Series A financing led by Index Ventures and Khosla Ventures.

For those who are not familiar with Kaggle, here is how it works:

read more...

Data Modeling is too subjective.

Data modeling is a lengthy process for designers that involves a thorough analysis of the business requirements and entity relationships. Most of this is done during the conceptual data modeling phase, in which a simple diagram (an Entity Relationship Diagram) illustrates the relationships between the main entities of the business. In “Necessity of Conceptual Data Modeling for Information Quality,” Pete Stiglich points out some typical problems that occur in conceptual data modeling and the approach that designers must take to create a more realistic model. The author describes a conceptual data model as the “picture on the puzzle box” that outlines what the “information puzzle” (the logical and physical models) should look like. Therefore, a lot of thought must go into the conceptual data model to avoid potential problems when developing the final product.

Importance of Data Modeling

The article I read argues that to be good at working with databases, it is important to acquire knowledge of security management, data integration, and data recovery in case of a disaster. Designing a good database doesn’t consist only of linking tables together; it requires many skills combined to pull off a complete database structure that will run optimally. Many companies fail when trying to design a good database because developing a well-modeled design takes a lot of time, and most companies don’t have the time to allocate to the data modeling phase.

An Education in Notation

The article I chose this week is entitled “Notation Usage in Data Modeling Education” by Michael Mannino. The article starts off by saying that there has never been a true standard of notation in the data modeling field. There have been attempts in the past, but none of them has ever gained wide usage throughout the industry. He notes that there is a wide variety of notation styles, a variety increased by the differences among textbooks and the CASE tools used in the industry. The author believes that creating a standard now would be difficult because of the current “diversity” in modeling notation. He says that students should be taught both ERD and UML, but in separate classes. The article goes on to discuss ways to instruct students in subject matter such as design errors and ways to advance their knowledge of data modeling. He finishes the article by saying that “These guidelines are just building blocks to develop data modeling skills” (Mannino, Spring 2006). He then offers some ideas on how data modeling education can be advanced. One suggestion is that students be given real-world examples of data modeling to prepare them for what is in store in their field of work.

Universal Patterns for Data Modeling: A Structural Theme

Len Silverston and Paul Agnew (2009), authors of “What are Universal Patterns for Data Modeling?”, explain how patterns can enhance data models. The authors equate Universal Patterns for Data Modeling to the schematics of a suspension bridge. Although bridge exteriors may differ, the interior construction patterns of most suspension bridges are designed the same way. Silverston and Agnew describe a data modeling pattern “as a template that can serve as a guide for developing data models.” The authors also briefly explain universal data models, which are reusable data models for common business rules. So instead of reinventing the wheel, database designers can reuse models that are fundamental to an industry. In contrast, “Universal Patterns in Data Modeling provide the underlying structural themes so that the modelers can reuse these to build any model” (Silverston, Agnew). Therefore, Universal Patterns can be applied to any data model and can improve upon a generic Universal Data Model. Silverston and Agnew find that Universal Patterns benefit organizations by letting them build on a universal data model that has proven essential to their industry while directing resources toward the specifics of their business.

The Data Model Owner

The author of the article I read is a data modeling consultant and instructor, and he asked the question, “Who owns the data model?” He received over 100 responses. Consider an organization with a team of skilled developers who do the data modeling. When a developer from this organization asks, “Who owns the data model?”, how does one respond? 37% say the business should own the model because “the modelers don’t ‘own’ the data model – they are only the caretakers of the model.” 22% say the development team, because as the owners they would keep all the members informed about changes while maintaining the ability for everyone to make changes to the model as needed. 15% say the individual developer, because they have the skill and knowledge to manage the data model well. 11% say the application manager or database administrator, because they are the ones who feel the pain when the application is not working properly and the first to be contacted when it isn’t working as it should. 15% say no one owns the model, simply because no one may have the big picture in mind.

Introducing UML

The article I read this week was “Conceptual Data Modeling in the Introductory Database Course: Is it Time for UML?” by James Suleiman and Monica Garfield. The article gives a few reasons why the Unified Modeling Language (UML) should be taught in introductory database courses at universities. When teaching conceptual data modeling, a majority of schools teach the Entity Relationship (ER) notation. This is for various reasons, including preference and textbook support, as many textbooks dedicate only a single chapter or part of an appendix to UML. There has been an increase in the teaching of UML in courses, though not as the primary modeling notation. The main fact backing the claim that UML should be taught in introductory database courses is its use in industry. The authors claim that UML has become an increasingly popular notation in the workforce and thus should be taught in more detail in academia.