
Data Mining with Microsoft SQL Server 2000 Technical Reference
Summary
Contents
Acknowledgments Page xi
Introduction Page xiii
PART I INTRODUCING DATA MINING
1 Understanding Data Mining Page 3
What Is Data Mining? Page 3
Why Use Data Mining? Page 4
How Data Mining Is Currently Used Page 6
Defining the Terms Page 7
Data Mining Methodology Page 9
Analyzing the Problem Page 10
Extracting and Cleansing the Data Page 10
Validating the Data Page 10
Creating and Training the Model Page 10
Querying the Data Mining Model Data Page 10
Maintaining the Validity of the Data-Mining Model Page 10
Overview of Microsoft Data Mining Page 11
Data Mining vs. OLAP Page 11
Data-Mining Models Page 11
Data-Mining Algorithms Page 12
Using SQL Server Syntax to Data Mine Page 14
Summary Page 14
2 Microsoft SQL Server Analysis Services Architecture Page 15
Introduction to OLAP Page 16
MOLAP Page 18
ROLAP Page 18
HOLAP Page 19
Server Architecture Page 20
Data Mining Services Within Analysis Services Page 20
Client Architecture Page 21
PivotTable Service Page 22
OLE DB Page 23
Decision Support Objects (DSO) Page 24
Multidimensional Expressions (MDX) Page 25
Prediction Joins Page 25
Summary Page 26
3 Data Storage Models Page 27
Why Data Mining Needs a Data Warehouse Page 27
Maintaining Data Integrity Page 28
Reporting Against OLTP Data Can Be Hazardous to Your Performance Page 31
Data Warehousing Architecture for Data Mining Page 33
Creating the Warehouse from OLTP Data Page 33
Optimizing Data for Mining Page 36
Physical Data Mining Structure Page 42
Three-Tier Architecture Page 43
Relational Data Warehouse Page 43
Advantages of Relational Data Storage Page 44
Building Supporting Tables for Data Mining Page 45
OLAP Cubes Page 46
How Data Mining Uses OLAP Structures Page 46
Advantages of OLAP Storage Page 47
When OLAP Is Not Appropriate for Data Mining Page 49
Summary Page 49
4 Approaches to Data Mining Page 51
Directed Data Mining Page 51
Undirected Data Mining Page 52
Data Mining vs. Statistics Page 52
Learning from Historical Data Page 57
Predicting the Future Page 59
Training Data-Mining Models Page 61
Evaluating the Models and Avoiding Errors Page 62
Summary Page 65
PART II DATA-MINING METHODS
5 Microsoft Decision Trees Page 69
Creating the Model Page 69
Analysis Manager Page 70
Visualizing the Model Page 87
Dependency Network Browser Page 94
Inside the Decision Tree Algorithm Page 97
How Predictions Are Derived Page 109
Navigating the Tree Page 109
Navigation vs. Rules Page 112
When to Use Decision Trees Page 113
Summary Page 114
6 Creating Decision Trees with OLAP Page 115
Creating the Model Page 115
Select Source Type Page 116
Select Source Cube and Data-Mining Technique Page 116
Select Case Page 118
Select Predicted Entity Page 119
Select Training Data Page 121
Select Dimension and Virtual Cube Page 121
Completing the Data-Mining Model Page 123
OLAP Mining Model Editor Page 125
Content Detail Pane Page 126
Structure Panel Page 126
Prediction Tree List Page 126
Analyzing Data with the OLAP Data-Mining Model Page 126
Using the Generated Virtual Cube Page 128
Using the Generated Dimension Page 129
Summary Page 133
7 Microsoft Clustering Page 135
The Search for Order Page 136
Looking for Ways to Understand Data Page 136
Clustering as an Undirected Data-Mining Technique Page 137
How Clustering Works Page 138
Overview of the Algorithm Page 138
The K-Means Method Clustering Algorithm Page 138
What Is Being Measured Exactly? Page 142
Clustering Factors Page 142
Measuring "Closeness" Page 143
When to Use Clustering Page 146
Visualize Relationships Page 146
Highlight Anomalies Page 146
Create Samples for Other Data-Mining Efforts Page 148
Weaknesses of Clustering Page 148
Creating a Data-Mining Model Using Clustering Page 149
Select Source Type Page 150
Select the Table or Tables for Your Mining Model Page 150
Select the Data-Mining Technique Page 151
Edit Joins Page 152
Select the Case Key Column for Your Mining Model Page 152
Select the Input and Predictable Columns Page 152
Viewing the Model Page 154
Organization of the Cluster Nodes Page 154
Order of the Cluster Nodes Page 156
Analyzing the Data Page 156
Summary Page 158
PART III CREATING DATA-MINING APPLICATIONS WITH CODE
8 Using Microsoft Data Transformation Services (DTS) Page 161
What Is DTS? Page 162
DTS Tasks Page 162
Transform Page 162
Bulk Insert Page 163
Data Driven Query Page 163
Execute Package Page 164
Connections Page 167
Sources Page 167
Configuring a Connection Page 168
DTS Package Workflow Page 169
DTS Package Steps Page 169
Precedence Constraints Page 170
DTS Designer Page 171
Opening the DTS Designer Page 171
Saving a DTS Package Page 172
dtsrun Utility Page 174
Using DTS to Create a Data-Mining Model Page 177
Preparing the SQL Server Environment Page 178
Creating the Package Page 182
Summary Page 208
9 Using Decision Support Objects (DSO) Page 209
Scripting vs. Visual Basic Page 210
The Server Object Page 211
The Database Object Page 219
Creating the Relational Data-Mining Model Using DSO Page 221
Creating the OLAP Data-Mining Model Using DSO Page 230
The DataSource Object Page 232
Data-Mining Model (Decision Support Objects) Page 233
Adding a New Data Source Page 233
Analysis Server Roles Page 234
Data-Mining Model Roles Page 235
Summary Page 236
10 Understanding Data-Mining Structures Page 237
The Structure of the Data-Mining Model Case Page 237
Data-Mining Models Look Like Tables Page 237
Using Code to Browse Data-Mining Models Page 238
Using the Schema Rowsets Page 243
MINING_MODELS Schema Rowset Page 243
MINING_COLUMNS Schema Rowset Page 249
MINING_MODEL_CONTENT Schema Rowset Page 259
MINING_SERVICES Schema Rowset Page 262
SERVICE_PARAMETERS Schema Rowset Page 266
MODEL_CONTENT_PMML Schema Rowset Page 268
Summary Page 269
11 Data Mining Using PivotTable Service Page 271
Redistributing Components Page 272
Installing and Registering Components Page 273
File Locations Page 274
Installation Registry Settings Page 275
Redistribution Setup Programs Page 275
Connecting to the PivotTable Service Page 276
Connect to Analysis Services Using PivotTable Service Page 276
Connect to Analysis Services Using HTTP Page 280
Building a Local Data-Mining Model Page 280
Storage of Local Mining Models Page 284
SELECT INTO Statement Page 286
INSERT INTO Statement Page 286
OPENROWSET Syntax Page 287
Nested Tables and the SHAPE Statement Page 289
Using XML in Data Mining Page 290
The PMML Standard Page 290
Summary Page 296
12 Data-Mining Queries Page 297
Components of a Prediction Query Page 297
The Basic Prediction Query Page 298
Specifying the Test Case Source Page 298
Specifying Columns Page 300
The PREDICTION JOIN Clause Page 300
Using Functions as Columns Page 304
Using Tabular Values as Columns Page 304
The WHERE Clause Page 306
Prediction Functions Page 307
Predict Page 307
PredictProbability Page 308
PredictSupport Page 308
PredictVariance Page 309
PredictStdev Page 310
PredictProbabilityVariance Page 310
PredictProbabilityStdev Page 310
PredictHistogram Page 310
TopCount Page 313
TopSum Page 313
TopPercent Page 314
RangeMin Page 314
RangeMid Page 314
RangeMax Page 314
PredictScore Page 314
PredictNodeId Page 315
Prediction Queries with Clustering Models Page 315
Cluster Page 315
ClusterProbability Page 316
ClusterDistance Page 316
Using DTS to Run Prediction Queries Page 317
Summary Page 322
APPENDIX Page 325
GLOSSARY Page 349
INDEX Page 359
Technical specifications
PRINT | |
Publisher | Microsoft Press |
Author | Claude Seidman |
Publication date | 29/11/2001 |
Pages | 368 |
Format | 19.2 x 24 |
Cover | Hardcover |
Weight | 1050 g |
Interior | Black and white |
EAN13 | 9780735612716 |
ISBN13 | 978-0-7356-1271-6 |