# Wiley - Data Mining with Microsoft SQL Server 2008 (2009)02

Shared by: Hoang Nhan | Date: | File type: PDF | Pages: 10


## Text content: Wiley - Data Mining with Microsoft SQL Server 2008 (2009)02

### Chapter 1: Introduction to Data Mining in SQL Server 2008

Figure 1-1 Student table

In contrast, the data mining approach to this problem is almost the reverse of the query-and-explore method. Instead of guessing a hypothesis and trying it out in different ways, you ask the question in terms of the data that can support many hypotheses, and allow your data mining system to explore them for you. In this case, you indicate that the columns IQ, Gender, ParentIncome, and ParentEncouragement are to be used as hypotheses in determining CollegePlans. As the data mining system passes over the data, it analyzes the influence of each input column on the target column.

Figure 1-2 shows the hypothetical result of a decision tree algorithm operating on this data set. In this case, each path from the root node to a leaf node forms a rule about the data. Looking at this tree, you see that students with IQs greater than 100 who are encouraged by their parents are highly likely to attend college (94 percent in this example). In this case, you have extracted knowledge from the data.

As shown here, data mining applies algorithms such as decision trees, clustering, association, and time series to a data set and analyzes its contents. This analysis produces patterns, which can be explored for valuable information. Depending on the underlying algorithm, these patterns can take the form of trees, rules, clusters, or simply a set of mathematical formulas. The information found in the patterns can be used for reporting (to guide marketing strategies, for instance) and for prediction. For example, if you could collect data about undecided students, you could select those who are likely to be interested in continued education and preemptively market to that audience.

- Attend College: 55% Yes, 45% No
  - IQ > 100: Attend College: 79% Yes, 21% No
    - Encouragement = Encouraged: Attend College: 94% Yes, 6% No
    - Encouragement = Not Encouraged: Attend College: 69% Yes, 31% No
  - IQ ≤ 100: Attend College: 35% Yes, 65% No

Figure 1-2 Decision tree

#### Business Problems for Data Mining

Data mining techniques can be used in virtually all business applications, answering various types of business questions. In truth, given the software available today, all you need is the motivation and the know-how. In general, data mining can be applied whenever something could be known, but is not. The following examples describe some scenarios:

- Recommendation generation: What products or services should you offer to your customers? Generating recommendations is an important business challenge for retailers and service providers. Customers who receive appropriate and timely recommendations are likely to be more valuable (because they purchase more) and more loyal (because they feel a stronger relationship to the vendor). For example, if you go to online stores such as Amazon.com or Barnesandnoble.com to purchase an item, you are shown recommendations for other items you may be interested in. These recommendations are derived by using data mining to analyze the purchase behavior of all of the retailer's customers, and then applying the derived rules to your personal information.
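The idea of deriving rules from all customers' purchases and then applying them to one customer can be illustrated with a minimal Python sketch. It assumes a simple co-occurrence confidence measure, which is not necessarily what SQL Server or any particular retailer uses. The function names `derive_rules` and `recommend`, the `min_confidence` threshold, and the sample baskets are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import permutations


def derive_rules(baskets, min_confidence=0.5):
    """Derive 'bought A, also buys B' rules from purchase baskets.

    confidence(A -> B) = (# baskets with both A and B) / (# baskets with A)
    """
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = set(basket)  # ignore duplicate items within one basket
        item_counts.update(items)
        # Count every ordered pair (a, b) of distinct items in the basket.
        pair_counts.update(permutations(sorted(items), 2))

    rules = {}
    for (a, b), both in pair_counts.items():
        confidence = both / item_counts[a]
        if confidence >= min_confidence:
            rules.setdefault(a, []).append((b, confidence))
    return rules


def recommend(rules, purchased, top_n=3):
    """Apply the derived rules to one customer's purchases."""
    scores = {}
    for item in purchased:
        for other, confidence in rules.get(item, []):
            if other not in purchased:
                # Keep the strongest rule that suggests this item.
                scores[other] = max(scores.get(other, 0.0), confidence)
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


# Hypothetical purchase history for all customers (illustrative data only).
baskets = [
    ["sql book", "data mining book"],
    ["sql book", "data mining book", "statistics book"],
    ["sql book", "cookbook"],
    ["data mining book", "statistics book"],
]

rules = derive_rules(baskets)
print(recommend(rules, {"sql book"}))  # ['data mining book']
```

Here two of the three baskets that contain "sql book" also contain "data mining book", so that rule passes the 0.5 threshold, while the rules pointing to "statistics book" and "cookbook" (one basket in three each) do not.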