Data Mining

Database and Data Mining for Consumer Products and Services {0}

By Jonathan C.

A successful business needs quality products and services. To deliver both efficiently, a business must have a way to collect, store, analyze, and share data. The introduction of computers gave businesses a new way to handle growing volumes of data instead of pen and paper. Some companies have had databases since the 1970s. Per Jianfeng Wang, Wal-Mart’s database of programs and control systems was activated in 1972 for use at the store and management levels. In the early stage, databases were mostly used for product distribution and communication between suppliers and retailers. Moreover, per Constance L. Hays of The New York Times, databases and data mining for consumers did not start until the early 2000s. Now many major business sectors, such as retail, banking, insurance, health, and many more, use databases and data mining to improve their relationships with their customers. The world of marketing has changed from product orientation to customer orientation. read more...

Data Mining Within E-Commerce {0}

By Gary C.

In e-commerce, data mining is essential for keeping up with the rapidly growing competition among retailers. E-commerce is the exchange of data online in order to conduct business transactions. Patterns and trends among shoppers are analyzed and broken down to build strategies for a multitude of situations, from what customers may like based on a previously purchased product to why customers tend to avoid a certain product. The amount of raw data processed through data mining is astounding, and it takes a tremendous amount of research to get the most out of every likely scenario. read more...

Data Mining and its Use in Everyday Life {14}

by Erin S
Every day 2.5 quintillion bytes of data are created, and 90 percent of the data in the world today was produced within the past two years. Because the amount of data is growing at such a rapid rate, handling it with tools such as data mining has become more and more complex, and there is a constant need to scale up to the large volume of data that must be interpreted. With this large influx of new data and information come many new opportunities to apply data mining. It is most often applied in a business sense, in order to “improve customer service, better target marketing campaigns, identify high-risk clients, and improve production processes” or, in other words, to make money. For example, Walmart learned that people tend to buy more Pop Tarts when there is a hurricane warning in the affected area, and instructed store managers to place Pop Tarts near the entrance during hurricane season in order to boost sales. Other companies such as Facebook and Twitter make use of this data by selling it to other companies, which then apply data mining to better market their products by finding new customers or by better targeting their products to existing ones. However, data mining isn’t only useful to businesses. It can also affect different aspects of a person’s everyday life. read more...

Intelligent Decision Making Based on Data Mining using Differential Evolution Algorithms and Framework for ETL Workflow Management

by Jungh K
For this week’s blog assignment, I chose an article titled “Intelligent Decision Making Based on Data Mining using Differential Evolution Algorithms and Framework for ETL Workflow Management”. The authors propose an integrated DSS, which utilizes a data mining technique and a framework for effective ETL workflows. The specific data mining technique proposed by the authors is to add a specialized component, known as the Artificial Intelligence Component (AIC), to the business intelligence system. The AIC utilizes Differential Evolution Algorithms (DEA), which replace the option chosen for the current situation with an optimized option, if one exists. Through this procedure, the authors argue, the DEA will adapt itself to improve the intelligent decision making process over time. On top of the data mining discussed in the article, the authors propose to add two layers, application and workflow scheduling, to workflow management. The application layer receives ETL jobs directly from the data generator. The authors state that there are a number of considerations that must be taken into account for ETL processes. The considerations include source availability, target availability, priority, job duration, upper bound, required resources, and prerequisite jobs. The workflow management layer is divided into two parts: workflow scheduling and workflow execution. By incorporating the aforementioned considerations for ETL processes, the workflow scheduling layer utilizes various algorithms to optimize scheduling. The workflow execution layer tracks the different ETL jobs and distributes them across the available servers. read more...
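To make the "replace the current option with an optimized one, if one exists" idea concrete, here is a minimal sketch of a generic differential evolution loop (the common rand/1/bin variant). It is not the authors' AIC; the function names, the parameter values, and the toy objective are all assumptions chosen for illustration.

```python
import random


def differential_evolution(objective, bounds, pop_size=20, f=0.8, cr=0.9,
                           generations=200, seed=0):
    """Minimize `objective` over box `bounds` with a basic rand/1/bin scheme.

    All parameter defaults are illustrative assumptions, not values from the article.
    """
    rng = random.Random(seed)
    dim = len(bounds)
    # Start from random candidate "options" inside the bounds.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [objective(ind) for ind in pop]

    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct candidates other than the current one.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one coordinate always mutates
            trial = []
            for d in range(dim):
                if d == forced or rng.random() < cr:
                    value = pop[a][d] + f * (pop[b][d] - pop[c][d])
                    lo, hi = bounds[d]
                    value = min(max(value, lo), hi)  # clamp to the bounds
                else:
                    value = pop[i][d]
                trial.append(value)
            trial_score = objective(trial)
            # Keep the trial only if it is at least as good as the current option.
            if trial_score <= scores[i]:
                pop[i], scores[i] = trial, trial_score

    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


def sphere(x):
    """Toy objective (hypothetical): smallest at the origin."""
    return sum(v * v for v in x)


best_option, best_score = differential_evolution(sphere, [(-5.0, 5.0)] * 3)
```

The selection step is the part that matches the excerpt: a candidate is only ever swapped out when a better (or equal) one has been found, so the population can improve over time but never gets worse.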

Application of Web Data Mining and Data Warehouse in E-Commerce

by Jungh K
For this week’s blog assignment, I chose an article titled “Application of Web Data Mining and Data Warehouse in E-Commerce”. The authors provide an overview of the data warehouse and how it is used in E-Commerce environments. According to the authors, W. H. Inmon, who is considered to be the founder of the data warehouse, defines it as “data collection which is subject-oriented, integrated, and non-volatile and time variant and it is used to support for management decision”. In the data warehouse, the data is organized by subject. Once the originally dispersed data are collected and cleaned, the refined data are stored in the data warehouse in a consistent manner. The data in the data warehouse are rarely deleted or modified even though they are updated in real time. The data in the data warehouse store historical information, and their qualitative analysis allows users to forecast future tendencies. The authors give examples using customer management modules, which are commonly used in E-Commerce, to demonstrate how the data warehouse is used. In the example, the modules are divided by specific purpose; however, they are set up in a way that provides a comprehensive understanding of the customers. read more...

Another Tool to Help Students Understand Data Warehousing

by Katheryn T
The article I chose to write on was about a new method of helping students learn about data warehousing. It is a new tool from the University of California, Sacramento. Since every business is taking advantage of data mining, there need to be well-educated people to take care of that data. This article talked about how the courseware that was developed will help students and beginners understand the beginning phases of data warehousing and the importance of doing it right. There are dimensional models that help students see what they are doing. These models can be changed as the chosen company progresses. This tool helps students learn about their designs and how they need to change with the data. read more...

Should We Hire or Train IT Professional To Analyze Data {5}

by Phuong H
In the article “Big Data Analysts: Do You Hire or Train for it?” written by Mary Shacklett, the author asks whether a company should hire or train an IT professional to analyze big data. There are pros and cons to both. Hiring an IT professional would definitely speed up the process because they are used to the tools and the process. However, hiring an experienced IT professional comes at a high cost. On the other hand, training an IT professional will save money, but it will take time. The author also gives scenarios where it would be a good idea to train an IT professional rather than hire one, and vice versa. For example, when the work involves data analysis where accuracy is critical (as at a pharmaceutical company) or developing complex algorithms, it would be best to hire a professional IT analyst. When it comes to data mining, the company should be “looking internally at their own engineering and research people” because they are experts in that field, and doing data mining can also help them search for new trends. For instance, “A market research analyst in Marketing can potentially be stretched into a big data analytic expert on customer and product trends.” read more...

Getting Better Results for Web Searches {2}

by Andrew M
The article I decided to talk about this week is entitled “Entity Synonyms for Structured Web Search” and is written by Tao Cheng, Hady Lauw and Stelios Paparizos. This article talks about how users who run web searches do not always get the information they want. Currently, developers manually enter synonyms for search words or use dictionaries and lookup tables. Another method that can be used is called content analysis, which analyzes the search itself and returns more accurate data. All of the above methods, though, are costly and time consuming, and many times do not return the desired results. The authors’ proposed solution is to use something named “entity synonyms.” This in effect links search terms to certain entities so that when someone later searches for one of these terms, the results will be more accurate. This is done by mining previous search data. Two methods are used to mine the search data: Intersecting Page Count (IPC) and Intersecting Click Ratio (ICR). IPC measures how closely a website is related to what the user was searching for. If a user stays at a page for a while, this is seen as a possible match and is logged. ICR measures how much a user clicks around a website. The more the user clicks around, the more the website is seen as a possible match. Next, a cleaning process is started in which all unneeded terms are cleared from the entity synonym. An example would be a user searching for “Looper Trailer.” The search would look for synonyms for the movie “Looper” and return all possible websites. The word “trailer” would be removed because it is unneeded information. read more...
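The cleaning step in the “Looper Trailer” example can be sketched in a few lines of Python. This is only an illustration of the idea as described in the excerpt, not the authors' implementation: the word list, the synonym table, the entity identifier, and the function name are all hypothetical.

```python
# Hypothetical list of words that describe what the user wants about an
# entity rather than naming the entity itself.
CONTEXT_WORDS = {"trailer", "review", "showtimes"}

# Hypothetical table of mined synonyms, mapping a cleaned search string to an entity.
ENTITY_SYNONYMS = {
    "looper": "movie:Looper",
    "looper movie": "movie:Looper",
}


def resolve_entity(query):
    """Drop unneeded terms from the query, then look up the remaining text."""
    terms = [t for t in query.lower().split() if t not in CONTEXT_WORDS]
    cleaned = " ".join(terms)
    return ENTITY_SYNONYMS.get(cleaned)  # None when no synonym is known


looper_match = resolve_entity("Looper Trailer")
```

In a real system the synonym table would be built by mining the search logs rather than typed in by hand, which is the point of the article.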

Vertical to Horizontal is that the question? {1}

by CyberChic
According to Chen and Ordonez in their article “Horizontal Aggregations in SQL to Prepare Data Sets for Data Mining Analysis,” there has been difficulty in the past preparing database information for data mining. Data mining usually uses aggregate information to start the mining process. With SQL, the aggregate data returned by a query is usually in a vertical view. They give detailed information in their paper explaining how to evaluate and optimize horizontal aggregations of data. Some of the benefits of doing this process in the database itself are reducing manual work in data preparation and improving data security. read more...
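The difference between a vertical and a horizontal aggregate is easier to see with a small example. The sketch below uses Python's built-in `sqlite3` module and the standard `SUM(CASE WHEN ...)` pivot idiom; the `sales` table, its columns, and the data are made up for illustration and are not taken from the paper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (store TEXT, quarter TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("A", "Q1", 10.0), ("A", "Q2", 20.0), ("A", "Q1", 5.0),
     ("B", "Q1", 7.0), ("B", "Q2", 3.0)],
)

# Vertical aggregation: one row per (store, quarter) combination.
vertical = conn.execute(
    "SELECT store, quarter, SUM(amount) FROM sales "
    "GROUP BY store, quarter ORDER BY store, quarter"
).fetchall()

# Horizontal aggregation: one row per store, one column per quarter.
horizontal = conn.execute(
    """
    SELECT store,
           SUM(CASE WHEN quarter = 'Q1' THEN amount ELSE 0 END) AS q1_total,
           SUM(CASE WHEN quarter = 'Q2' THEN amount ELSE 0 END) AS q2_total
    FROM sales
    GROUP BY store
    ORDER BY store
    """
).fetchall()

conn.close()
```

The horizontal result has one row per store, which is the kind of layout most data mining tools expect as input, and it is produced inside the database rather than by exporting and reshaping the data by hand.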

What to Expect in Data Warehousing {2}

by Penny P
Data warehouses play a critical role in corporations and institutions. They allow users to quickly retrieve data for data mining or for data analysis tools. The results from these tools are then used to support business decisions. The authors of the article designed a case study to help beginners understand the basics of data warehousing. Using enrollment data from universities, the main objective was to prepare the data for a mining system that would be used to predict student enrollment. read more...