Database Design and Development

Other People’s data {4}

by Sam T
In this peer-reviewed journal article, the author discusses how every company bases some of its decisions on external data sources. These sources range from publicly available web services to sales data to government census data. Because there is so much external data, and so many ways to obtain it, it has to be treated differently from internal data. To accommodate every user’s data needs in a timely manner, data integration involves a trade-off among three factors: flexibility, quality, and cost. read more...

DBA’s move to Cloud {5}

by Garcello D
My final blog post is about “Database Administrators prepare to move to the Cloud,” an article written by Maxwell Cooter for techworld.com about a year ago. The article starts off by stating that cloud computing is expected to transform the use of databases within enterprises. According to a survey on database trends, more than a third of database professionals think that cloud computing will have the biggest transformational effect on database technology. Seventy-three percent of the individuals who took the survey voted for the cloud, meaning they believe that moving to the cloud would have the most effect on their lives. The survey also found that production database performance was the biggest factor keeping staff awake at night, with 43 percent placing it at the top of their list. read more...

Technique for quick access from data warehouse. {3}

by Hongde H
The article I chose to read this week is about techniques for quick access from a data warehouse. In it, the author describes the common problems faced by data warehouse administrators and users. He outlines some query performance techniques that minimize response time and improve the overall efficiency of a data warehouse, particularly when the warehouse is accessed and updated frequently. By and large, these techniques improve the performance of the system without accessing the original information sources, which makes them good strategies for building a better data warehouse. read more...

Amazon’s Redshift {1}

by Hieu H
Cloud-based hosted data warehousing services are gaining in popularity. The primary driver for this movement is that older enterprise data warehouse systems are expensive and difficult to maintain. Amazon looks to fill that void with its new hosted data warehouse service, Redshift. What makes it unique is that it costs about a tenth as much as regular data warehouses and it automates deployment and maintenance. It is also compatible with many popular business intelligence tools, so people will not have to spend resources learning new tools. Since Redshift runs on Amazon Web Services (AWS), it gets the added benefits of massive failover and redundancy clusters. Customers will not have to worry about data management, as it is already taken care of by Amazon. read more...

The Importance of Data Quality and the Measures Taken

by Andrew S
In the article that I read this week, the author talked about the importance of data quality and data source management in a data warehouse project. Many data warehouse projects fail due to the poor quality of their data, and this article explains that quality characteristics form the backbone of quality management. The article goes in depth on the five activities used in quality management: quality policy, quality planning, quality control, quality assurance, and quality improvement. The author details each of these activities in order to identify and understand it. A proper procedure must be followed in managing data sources in order to provide a framework and implement tools that reach a company’s goals and objectives. read more...

Oracle’s new Finance Data Warehouse

by Sam T
Oracle has begun offering a data warehouse for the financial services industry. Oracle claims that this data warehouse is more geared towards the needs of the financial environment. The warehouse is specialized for financial organizations, making it easier to store financial data, generate reports, manage metadata, and carry out any other financial data needs. Oracle developed this data warehouse over 15 years using a financial services data model, so it can be used for analysis, testing, reporting, and assessing possible risks. read more...

BAM: A Real-Time BPM System

by Arlyn R
At the 2008 3rd ICCIT International Conference, Jin Gu Kang and Kwan Hee Han proposed a business activity monitoring (BAM) system in the article (2008) “A Business Activity Monitoring System Supporting Real-Time Business Performance Management.” The authors’ proposed BAM system design and prototype were implemented at a global automotive company. This real-world case explicitly shows their BAM framework applied as a real-time business performance management system. Han and Kang report that once the structure of the organization’s enterprise information system (EIS) had been thoroughly examined, the BAM system was categorized as an OLAP/analytical processing system. The authors then describe the four-step procedure for designing the BAM system. The first step is to select and define the monitoring objects whose performance is measured in real time. In this case, these include the key performance indicator (KPI) current sales inventory, which aids in determining the company’s operational efficiency. The authors also monitor the business process of equipment management in order to have real-time information on the status of equipment failures. In step two, the conceptual design of the dashboard is created. In step three, business and/or technical events are defined in order to capture the trend and status of the KPIs selected in step one. Finally, step four of the design procedure defines how data is extracted for event processing and how it will be displayed on the BAM system’s dashboard. The prototype was then implemented with the following commercial solutions: Oracle BAM (BAM type), Oracle Database 11g (database type), WebMethods (EAI tool), and Java (programming language for the UI). Han and Kang present the results of their BAM system with two dashboard screenshots that cover the KPI status and the business process status for equipment problem management.
read more...
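To make steps one and three of the design procedure a little more concrete, here is a minimal Python sketch of how a monitored KPI could be turned into status events. It is not the authors’ implementation (their prototype used Oracle BAM and Java); the names `KpiEvent`, `classify`, and `detect_events`, and the threshold values in the usage example, are assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class KpiEvent:
    """A status change for one monitored KPI (hypothetical structure)."""
    kpi: str
    value: float
    status: str


def classify(value, warn_below, alert_below):
    """Map a KPI reading to a status using two assumed thresholds."""
    if value < alert_below:
        return "alert"
    if value < warn_below:
        return "warning"
    return "normal"


def detect_events(kpi, readings, warn_below, alert_below):
    """Emit an event only when the KPI status changes between readings."""
    events = []
    previous_status = None
    for value in readings:
        status = classify(value, warn_below, alert_below)
        if status != previous_status:
            events.append(KpiEvent(kpi, value, status))
        previous_status = status
    return events


# Usage: made-up inventory readings and thresholds.
events = detect_events("current sales inventory", [120, 90, 80, 40, 130],
                       warn_below=100, alert_below=50)
for event in events:
    print(event.kpi, event.value, event.status)
```

Emitting an event only on a status change keeps the dashboard feed short; a real BAM tool would define its events and extraction rules in its own configuration.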

Application of Web Data Mining and Data Warehouse in E-Commerce

by Jungh K
For this week’s blog assignment, I chose an article titled “Application of Web Data Mining and Data Warehouse in E-Commerce”. The authors provide an overview of the data warehouse and how it is used in E-Commerce environments. According to the authors, W. H. Inmon, who is considered to be the founder of the data warehouse, defines a data warehouse as a “data collection which is subject-oriented, integrated, and non-volatile and time variant and it is used to support for management decision”. In a data warehouse, the data is organized by subject. Once the original dispersed data are collected and cleaned, the refined data are stored in the data warehouse in a consistent manner. The data in the warehouse are rarely deleted or modified, even though the warehouse is updated in real time. The warehouse stores historical information, and its analysis allows users to forecast future trends. The authors give examples using customer management modules, which are commonly used in E-Commerce, to demonstrate how a data warehouse is used. In the example, the modules are divided by their specific purposes; however, they are set up in a way that provides a comprehensive understanding of the customers. read more...
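The “non-volatile” and “time variant” properties in the definition above can be illustrated with a minimal Python sketch. It is not taken from the article: the `CustomerFact` and `CustomerHistory` names, the `segment` field, and the sample dates are assumptions chosen for illustration. New information is appended as a dated row instead of overwriting an old one, so the store can answer “what did we know as of this date?”.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CustomerFact:
    """One immutable, dated record about a customer (hypothetical schema)."""
    customer_id: int
    segment: str
    valid_from: date


class CustomerHistory:
    """Append-only store: rows are added, never modified or deleted."""

    def __init__(self):
        self._rows = []

    def load(self, row):
        # Non-volatile: a change arrives as a new row, old rows stay intact.
        self._rows.append(row)

    def row_count(self):
        return len(self._rows)

    def as_of(self, customer_id, when):
        # Time variant: return the latest row that was valid on the given date.
        candidates = [r for r in self._rows
                      if r.customer_id == customer_id and r.valid_from <= when]
        return max(candidates, key=lambda r: r.valid_from, default=None)


# Usage: the customer's segment changes, but the earlier state is still queryable.
history = CustomerHistory()
history.load(CustomerFact(1, "new", date(2012, 1, 1)))
history.load(CustomerFact(1, "loyal", date(2012, 6, 1)))
print(history.as_of(1, date(2012, 3, 1)).segment)
print(history.as_of(1, date(2012, 12, 31)).segment)
```

A production warehouse would express the same idea with dated fact or dimension tables in SQL; the in-memory list here only stands in for that storage.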

A Cloud-Based Data Warehouse Service from Treasure Data {2}

by Kathy S
The author of this article focuses on the cloud-based data warehouse company Treasure Data. The company received $1.5 million in funding, which includes an investment from Yukihiro “Matz” Matsumoto, the creator of the Ruby programming language. Treasure Data has developed a service that brings high-end analysis to businesses that don’t have the resources to afford solutions from major companies like IBM, Oracle, or Teradata. According to the CEO of Treasure Data, Hiro Yoshikawa, the total cost of ownership for a data warehouse suite from one of the enterprise players can be as much as $5 million. Treasure Data is a subscription service that, at the low end, costs $1,500 per month, or $1,200 per month with a 12-month commitment. Yoshikawa says that, on average, the cost over time is more than 10 times lower than what an enterprise data warehouse offering would cost. Treasure Data has more than 10 customers, including Fortune 500 companies; it has more than 100 billion records stored and is processing 10,000 messages per second. Treasure Data also borrows from Hadoop, but with a twist: unlike Hadoop, it does not require an infrastructure investment. read more...
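The pricing figures quoted above can be put side by side with a few lines of Python. This is only a rough sketch: the constant and function names are made up for illustration, the $5 million figure is the upper bound the CEO quoted, and the excerpt gives no time horizon, so the code does not try to confirm the “10 times” claim.

```python
MONTHLY_RATE = 1_500            # low-end subscription, USD per month
COMMITTED_MONTHLY_RATE = 1_200  # with a 12-month commitment, USD per month
ENTERPRISE_TCO = 5_000_000      # quoted upper bound for an enterprise suite, USD


def annual_cost(monthly_rate):
    """Cost of twelve months of subscription at the given monthly rate."""
    return monthly_rate * 12


def months_covered(budget, monthly_rate):
    """Whole months of subscription that a fixed budget would pay for."""
    return budget // monthly_rate


standard = annual_cost(MONTHLY_RATE)
committed = annual_cost(COMMITTED_MONTHLY_RATE)

print("Standard plan, one year:", standard)
print("Committed plan, one year:", committed)
print("Saved by committing:", standard - committed)
print("Months of standard plan covered by the enterprise figure:",
      months_covered(ENTERPRISE_TCO, MONTHLY_RATE))
```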

Size of Facebook’s Data {4}

by Allen D
 

The article that I chose to talk about this week is called “How Big Is Facebook’s Data? 2.5 Billion Pieces of Content and 500+ Terabytes Ingested Every Day”, by Josh Constine. The title says it all. Facebook revealed to reporters that its system processes over 2.5 billion pieces of content, amounting to 500+ terabytes of data, per day. The author talks about how the company’s system processes approximately 2.7 billion ‘Like’ actions and 300 million photos per day. The Vice President of Engineering, Jay Parikh, revealed that over 100 petabytes of data are stored in the company’s data warehouse. To support data-intensive activities and distributed applications, Facebook uses a software framework called Apache Hadoop. Hadoop provides very high bandwidth across the cluster and enables applications to work with petabytes of data across thousands of independent computers. Parikh told the reporters that Facebook operates the single largest Hadoop system in the world, one that is even larger than Yahoo’s. read more...