Database Design and Development

DBA, a great position for a CIS major {4}

by Kevin S
The main purpose of a DBA is to perform maintenance and optimization tasks on a daily basis. However, according to Craig Mullins, DBA’s often become much more than that. Because the DBA is often relied upon by both IT and business associates, the realm of what is asked and/or expected constantly grows. A DBA should expect this, and accept it as it makes him/her more valuable to a company while also extending their own personal abilities. Opportunities to grow may include (in addition to the standard DBA duties):  experience with new technologies, a better understanding of the meaning of data, actively participating in application development, or perhaps just a better understanding of business. read more...

Data Warehouses for Educational Analysis {1}

by Eric C
In a peer reviewed journal entitled “Building a Data Warehouse to Analyze Entrance Exams” written by Dr. Kornelije Rabuzin, explains building a data warehouse dedicated to student data for analysis. Using business intelligence and the combination of databases and data warehouses, it is possible to analyze student data for educational purposes. Such analysis will determine, based on an entrance exam administered to high school students, to see if students are ready for college and what materials do they know or do not know. Having such data and information can give administrators a handful of reports to determine what the best steps are for students going to college after high school. Based on the types of information mentioned in the article, it didn’t seem much can be done to analyze data based on entrance exam scores, high school grades, and regular exam scores. However, it was difficult to extract such information from various databases and it took much programming and two months. The data was loaded and organized into a star schema structure. But after loading everything into the new database warehouse, it was very useful to determine the student selection process and targeting high schools for certain skills in students. read more...

A Strong Database Tool {1}

by Shigom H
Regis Charlot, author of peer-reviewed journal “Providing an Infrastructure For A Cross-Database Management Tool”  presents his teams very own software tool called dbAnalyst as a solution to the many problems encountered in managing a  large database. Database software tools are vendor-specific and require extensive knowledge from experienced Database professionals. dbAnalyst is a database software tool with many features such as the ability to reverse engineer, generate database alert, and explore database content across different databases. The authors make a case for dbAnalyst by presenting several real-life examples in where utilizing their tool can be beneficial. For example, an electronic medical record that consist of multiple software from different vendors can be a headache to deal with.  As discussed in class, some of the challenges with heterogeneous data might include maintaining similar schema structures across databases,  and  migrating database content from different databases. Although the task might seem simple, it  is actually costly and requires a handful of experienced database administrators. Thus, dbAnalyst is an open-architecture software tool that works across different platforms to address these problems. read more...

A Look at SQL Queries {Comments Off on A Look at SQL Queries}

by Andrew H
For this weeks blog post I read an academic journal by Mohammad Dadashzadeh called, “A Simpler Approach to Set Comparison Queries in SQL.” The article talks about set comparison queries which are queries that have results that can only be determined by comparing two sets of data. An example of a comparison query would be Which suppliers will be able to supply all the parts that they are currently shipping or which suppliers are shipping exactly the same parts as supplier SI? In contrast to these two queries a query such as “which suppliers are shipping at least one red part” can be obtained by matching or joining data from one table. The article explains how queries involving set comparison are very difficult to formulate in relational query languages because they must use the error prone EXISTS function. Their solution is to avoid the EXIST function and use the built in SET function as well as the COUNT function. read more...

Another Tool to Help Students Understand Data Warehousing {Comments Off on Another Tool to Help Students Understand Data Warehousing}

by Katheryn T
The article I choose to write on was about a new method of helping students learn about data warehousing better. It is a new tool from the University of California, Sacramento. Since every business is taking advantage of the data mining that is going on, there needs to be well educated people to take care of that data. This article talked about how the courseware developed will help students and beginners understand the beginning phases of data warehousing and the importance of doing it right. There are dimensional models that help students visually see what they are doing. These models can be changed for the progression of the chosen company. This tool helps students learn about their designs and how they need to change with the data. read more...

Data Warehousing: Inmon Style and Kimball Style {2}

by Leonardo S
The article I chose for this week is titled “Best Practices in Data Warehousing to Support Business Initiatives and Needs”. The article talks about two different styles used in data warehousing and lists when you would use one versus the other. The first style is the “Bill Inmon Style” which uses a top-down approach. In contrast, the “Ralph Kimball Style” uses a bottom-up approach. The article uses a major U.S. retail company as an example. This company uses the Inmon style simply because it is what fits well for that particular business. This style of data warehousing demands that you have a third-normal form relational format for your data. The Kimball style, on the other hand, requires that you use a multidimensional style arrangement. In the case of this retail company, the application neutral aspect of the Inmon style made it an easy pick. read more...

Hadoop or EDW {1}

by Brian B
The article that I picked with week is called “Big Data Debate: End Near for Data Warehousing?” by Doug Henschen. The article starts off by giving some background to EDW (Enterprise Data Warehouse). It says that while the technology behind EDW is time tested and thoroughly developed it remains rigid and inflexible when you have to go back and make changes to your data model.  It also says that this is often a very time consuming process that often costs a lot of money to plan out and implement if you ever finish actually modeling and developing the system. It then talks about Hadoop, “which lets you store data on a massive scale at low cost (compared with similarly scaled commercial databases) (Henschen, 2012).” The author says that this is an improvement over normal EDW because it allows more flexibility when it comes to making changes down the road. The problem is that it is not as developed as EDW so it can be difficult to find people who have an intimate knowledge of the software. The article then opens up into a debate between Ben Werther (Pro Hadoop) and Scott Gnau (Pro EDW). Werther essentially says that EDW is a dated technology because by the time you push out the model and get everything implemented you have what amounts to a view of the world a year or more ago, which may or may not be applicable to your business needs today, wasting your companies time and resources. Gnau’s argument boils down to the fact that while Hadoop maybe more flexible it does not allow you to have very good control over the data you have collected. He says that with all of that data being un-modeled it will cause issues for analyst’s to view and sort the data, which is why EDW will stick around to make their jobs more manageable. read more...

Data Cleansing {4}

by Garcello D
 

The article I decided to blog about this week is called “Data Cleansing for Data Warehousing,” it was written by Ari Baumgarten on February 27, 2007. The author opens up with an analogy comparing a politician to data cleansing by stating how Politicians raise money can be compared to data cleansing a warehouse in the sense that one cannot exist without the other. Data Cleansing is also known to be the most time intensive and contentious process for data warehousing projects but what really is data cleaning? Well I’m about to break it down for you. read more...

Data Management for Science Field {3}

by Tseng H. K.
The journal I read this week is called “Portable Data Management Cloud for Field Science” by Yuma Matsui and Aaron Gidding. The author basiclly talks about how data portability and accessibility important in modern science field, espcially in archaeology. These days, cloud computing system has provide great computing infrastructure for many scientific fields that have to work both in field sites and the lab.Moreover, data portability brought the fields with great accessibility for their data. The author talks about how efficiently exchange their data for analysis between work sites and their lab via portable data management system. read more...

SQL Packages and Data Warehousing {1}

by Ming X
The article I read is called the development of ordered SQL packages to support data warehousing, by Wilfred Ng and Mark Levene. “Data warehousing is a corporate strategy that needs to integrate information from several sources of separately developed Database Management Systems.” The authors mention that future database management systems should provide adequate facilities to manage a wide range of information arising from such integration. Since the order of data is usually involved in business queries; users can extend the relational model to incorporate partial orderings into data domains and describe the ordered relational model. User can use OSQL, which allows querying over ordered relational databases. Ordered SQL (OSQL) is an extension of the Data Definition Language (DDL) and Data Manipulation Language (DML) of SQL for the ordered relational model. OSQL can be applied to solve various problems that arise in relational DBMSs involving applications of temporal information, incomplete information and fuzzy information under the unifying framework of the ordered relational model. There are three OSQL packages: OSQL_TIME, OSQL_INCOMP and OSQL_FUZZY. read more...