What to Expect in Data Warehousing{2}


by Penny P
Data warehouses play a critical role in corporation/institutions. They allow users to quickly retrieve data to be used for data mining or for data analysis tools. The results that they get from these tools are then used to support business decisions that would be made. The authors of the article designed a case study that would help beginners understand the basics of data warehousing. Using the enrollment data from Universities, the main objective was to prepare the data for a mining system that would be used for predicting student enrollment.

In the design process of a data warehouse, two important steps include: 1) clearly identifying the objective of the data warehousing project and 2) determining the user requirements. Two types of summary reports that they are looking for include 1) the enrollment reports for graduate and undergraduate students for the last 30 years and 2) the annual reports of the demographic impact factors on enrollment. The design that the authors use are based on a star schema with each group placed in a dimension table. Aggregation was also used to forecast the number of students for next term. The authors gave two strategies on how aggregation could be used: 1) aggregation based on the fact that the tables should be equal to the number of aggregate functions or 2) aggregation based on the idea that an aggregate table is designed to be a dimension table. That way, one of the aggregated tables will contain keys to multiple aggregate functions that will be referenced by the base fact table. Aggregation helps improve the performance of a data warehouse because it reduces the number of scans on the records. As a result, it helps make the queries more efficient.

This article started out with some basic info about data warehousing and how it is used to help companies make business decisions. When the authors talked about the design, it really described how complex data warehousing could be. There are so many things that the designers have to consider and the idea of working with multiple tables and dimensions makes it sound like a very difficult task. Data warehousing is definitely a great tool for businesses to use and it looks like it holds many benefits for the users. But since we are still new to this, designing one seems like it will be quite a challenge.

 

Reference:

Kulkarni, M., Lu, M., Zhang, D. (2010). “A Case-Based Data Warehousing Courseware.” Information Reuse and Integration (IRI). (August 2010) P. 245-248. Retrieved from http://0-ieeexplore.ieee.org.opac.library.csupomona.edu/stamp/stamp.jsp?tp=&arnumber=5558932.