Data Mining for Business Analytics: Concepts, Techniques, and Applications with JMP Pro

Galit Shmueli, Peter C. Bruce, Mia L. Stephens, Nitin R. Patel

ISBN: 978-1-118-87752-4

May 2016

464 pages

$108.99

Description

Data Mining for Business Analytics: Concepts, Techniques, and Applications with JMP Pro® presents an applied and interactive approach to data mining.

Featuring hands-on applications with JMP Pro®, a statistical package from the SAS Institute, the book uses engaging, real-world examples to build a theoretical and practical understanding of key data mining methods, especially predictive models for classification and prediction. Topics include data visualization, dimension reduction techniques, clustering, linear and logistic regression, classification and regression trees, discriminant analysis, naive Bayes, neural networks, uplift modeling, ensemble models, and time series forecasting.

Data Mining for Business Analytics: Concepts, Techniques, and Applications with JMP Pro® also includes:

  • Detailed summaries that supply an outline of key topics at the beginning of each chapter
  • End-of-chapter examples and exercises that allow readers to expand their comprehension of the presented material
  • Data-rich case studies to illustrate various applications of data mining techniques
  • A companion website with over two dozen data sets, exercises and case study solutions, and slides for instructors

Data Mining for Business Analytics: Concepts, Techniques, and Applications with JMP Pro® is an excellent textbook for advanced undergraduate and graduate-level courses on data mining, predictive analytics, and business analytics. The book is also a one-of-a-kind resource for data scientists, analysts, researchers, and practitioners working with analytics in the fields of management, finance, marketing, information technology, healthcare, education, and any other data-rich field.

Galit Shmueli, PhD, is Distinguished Professor at National Tsing Hua University’s Institute of Service Science. She has designed and instructed data mining courses since 2004 at University of Maryland, Statistics.com, Indian School of Business, and National Tsing Hua University, Taiwan. Professor Shmueli is known for her research and teaching in business analytics, with a focus on statistical and data mining methods in information systems and healthcare. She has authored over 70 journal articles, books, textbooks, and book chapters, including Data Mining for Business Analytics: Concepts, Techniques, and Applications in XLMiner®, Third Edition, also published by Wiley.

Peter C. Bruce is President and Founder of the Institute for Statistics Education at www.statistics.com. He has written multiple journal articles and is the developer of Resampling Stats software. He is the author of Introductory Statistics and Analytics: A Resampling Perspective and co-author of Data Mining for Business Analytics: Concepts, Techniques, and Applications in XLMiner®, Third Edition, both published by Wiley.

Mia Stephens is Academic Ambassador at JMP®, a division of SAS Institute. Prior to joining SAS, she was an adjunct professor of statistics at the University of New Hampshire and a founding member of the North Haven Group LLC, a statistical training and consulting company. She is the co-author of three other books, including Visual Six Sigma: Making Data Analysis Lean, Second Edition, also published by Wiley.

Nitin R. Patel, PhD, is Chairman and cofounder of Cytel, Inc., based in Cambridge, Massachusetts. A Fellow of the American Statistical Association, Dr. Patel has also served as a Visiting Professor at the Massachusetts Institute of Technology and at Harvard University. He is a Fellow of the Computer Society of India and was a professor at the Indian Institute of Management, Ahmedabad, for 15 years. He is co-author of Data Mining for Business Analytics: Concepts, Techniques, and Applications in XLMiner®, Third Edition, also published by Wiley.

Dedication i

Foreword xvii

Preface xviii

Acknowledgments xx

PART I PRELIMINARIES

CHAPTER 1 Introduction 3

1.1 What is Business Analytics? 3

1.2 What is Data Mining? 5

1.3 Data Mining and Related Terms 5

1.4 Big Data 6

1.5 Data Science 7

1.6 Why Are There So Many Different Methods? 8

1.7 Terminology and Notation 9

1.8 Road Maps to This Book 11

Order of Topics 12

CHAPTER 2 Overview of the Data Mining Process 15

2.1 Introduction 15

2.2 Core Ideas in Data Mining 16

2.3 The Steps in Data Mining 19

2.4 Preliminary Steps 20

2.5 Predictive Power and Overfitting 28

2.6 Building a Predictive Model with JMP Pro 33

2.7 Using JMP Pro for Data Mining 42

2.8 Automating Data Mining Solutions 42

Data Mining Software Tools (Herb Edelstein) 44

Problems 47

PART II DATA EXPLORATION AND DIMENSION REDUCTION

CHAPTER 3 Data Visualization 52

3.1 Uses of Data Visualization 52

3.2 Data Examples 54

Example 1: Boston Housing Data 54

Example 2: Ridership on Amtrak Trains 55

3.3 Basic Charts: Bar Charts, Line Graphs, and Scatterplots 55

Distribution Plots 58

Heatmaps: Visualizing Correlations and Missing Values 61

3.4 Multi-Dimensional Visualization 63

Adding Variables: Color, Hue, Size, Shape, Multiple Panels, Animation 63

Manipulations: Re-scaling, Aggregation and Hierarchies, Zooming and Panning, Filtering 67

Reference: Trend Line and Labels 70

Scaling Up: Large Datasets 72

Multivariate Plot: Parallel Coordinates Plot 73

Interactive Visualization 74

3.5 Specialized Visualizations 76

Visualizing Networked Data 76

Visualizing Hierarchical Data: Treemaps 77

Visualizing Geographical Data: Maps 78

3.6 Summary of Major Visualizations and Operations, According to Data Mining Goal 80

Prediction 80

Classification 81

Time Series Forecasting 81

Unsupervised Learning 82

Problems 83

CHAPTER 4 Dimension Reduction 85

4.1 Introduction 85

4.2 Curse of Dimensionality 86

4.3 Practical Considerations 86

Example 1: House Prices in Boston 87

4.4 Data Summaries 88

4.5 Correlation Analysis 91

4.6 Reducing the Number of Categories in Categorical Variables 92

4.7 Converting a Categorical Variable to a Continuous Variable 94

4.8 Principal Components Analysis 94

Example 2: Breakfast Cereals 95

Principal Components 101

Normalizing the Data 102

Using Principal Components for Classification and Prediction 104

4.9 Dimension Reduction Using Regression Models 104

4.10 Dimension Reduction Using Classification and Regression Trees 106

Problems 107

PART III PERFORMANCE EVALUATION

CHAPTER 5 Evaluating Predictive Performance 111

5.1 Introduction 111

5.2 Evaluating Predictive Performance 112

Benchmark: The Average 112

Prediction Accuracy Measures 113

5.3 Judging Classifier Performance 115

Benchmark: The Naive Rule 115

Class Separation 115

The Classification Matrix 116

Using the Validation Data 117

Accuracy Measures 117

Cutoff for Classification 118

Performance in Unequal Importance of Classes 122

Asymmetric Misclassification Costs 123

5.4 Judging Ranking Performance 127

5.5 Oversampling 131

Problems 138

PART IV PREDICTION AND CLASSIFICATION METHODS

CHAPTER 6 Multiple Linear Regression 141

6.1 Introduction 141

6.2 Explanatory vs. Predictive Modeling 142

6.3 Estimating the Regression Equation and Prediction 143

Example: Predicting the Price of Used Toyota Corolla Automobiles 144

6.4 Variable Selection in Linear Regression 149

Reducing the Number of Predictors 149

How to Reduce the Number of Predictors 150

Manual Variable Selection 151

Automated Variable Selection 151

Problems 160

CHAPTER 7 k-Nearest Neighbors (k-NN) 165

7.1 The k-NN Classifier (categorical outcome) 165

Determining Neighbors 165

Classification Rule 166

Example: Riding Mowers 166

Choosing k 167

Setting the Cutoff Value 169

7.2 k-NN for a Numerical Response 171

7.3 Advantages and Shortcomings of k-NN Algorithms 172

Problems 174

CHAPTER 8 The Naive Bayes Classifier 176

8.1 Introduction 176

Example 1: Predicting Fraudulent Financial Reporting 177

8.2 Applying the Full (Exact) Bayesian Classifier 178

8.3 Advantages and Shortcomings of the Naive Bayes Classifier 187

Problems 191

CHAPTER 9 Classification and Regression Trees 194

9.1 Introduction 194

9.2 Classification Trees 195

Example 1: Riding Mowers 196

9.3 Growing a Tree 198

Growing a Tree Example 198

Growing a Tree with CART 203

9.4 Evaluating the Performance of a Classification Tree 203

Example 2: Acceptance of Personal Loan 203

9.5 Avoiding Overfitting 204

Stopping Tree Growth: CHAID 205

Pruning the Tree 207

9.6 Classification Rules from Trees 208

9.7 Classification Trees for More Than Two Classes 210

9.8 Regression Trees 210

Prediction 213

Evaluating Performance 214

9.9 Advantages and Weaknesses of a Tree 214

9.10 Improving Prediction: Multiple Trees 216

9.11 CART, and Measures of Impurity 218

Measuring Impurity 218

Problems 221

CHAPTER 10 Logistic Regression 224

10.1 Introduction 224

10.2 The Logistic Regression Model 226

Example: Acceptance of Personal Loan 227

Model with a Single Predictor 229

Estimating the Logistic Model from Data: Computing Parameter Estimates 231

10.3 Evaluating Classification Performance 234

Variable Selection 236

10.4 Example of Complete Analysis: Predicting Delayed Flights 237

Data Preprocessing 240

Model Fitting, Estimation and Interpretation - A Simple Model 240

Model Fitting, Estimation and Interpretation - The Full Model 241

Model Performance 243

Variable Selection 245

10.5 Appendix: Logistic Regression for Profiling 249

Appendix A: Why Linear Regression Is Inappropriate for a Categorical Response 249

Appendix B: Evaluating Explanatory Power 250

Appendix C: Logistic Regression for More Than Two Classes 253

Problems 257

CHAPTER 11 Neural Nets 260

11.1 Introduction 260

11.2 Concept and Structure of a Neural Network 261

11.3 Fitting a Network to Data 261

Example 1: Tiny Dataset 262

Computing Output of Nodes 263

Preprocessing the Data 266

Training the Model 267

Using the Output for Prediction and Classification 272

Example 2: Classifying Accident Severity 273

Avoiding Overfitting 275

11.4 User Input in JMP Pro 277

11.5 Exploring the Relationship Between Predictors and Response 280

11.6 Advantages and Weaknesses of Neural Networks 281

Problems 282

CHAPTER 12 Discriminant Analysis 284

12.1 Introduction 284

Example 1: Riding Mowers 285

Example 2: Personal Loan Acceptance 285

12.2 Distance of an Observation from a Class 286

12.3 From Distances to Propensities and Classifications 288

12.4 Classification Performance of Discriminant Analysis 292

12.5 Prior Probabilities 293

12.6 Classifying More Than Two Classes 294

Example 3: Medical Dispatch to Accident Scenes 294

12.7 Advantages and Weaknesses 296

Problems 299

CHAPTER 13 Combining Methods: Ensembles and Uplift Modeling 302

13.1 Ensembles 303

Why Ensembles Can Improve Predictive Power 303

Simple Averaging 305

Bagging 306

Boosting 306

Advantages and Weaknesses of Ensembles 307

13.2 Uplift (Persuasion) Modeling 308

A-B Testing 308

Uplift 308

Gathering the Data 309

A Simple Model 310

Modeling Individual Uplift 311

Using the Results of an Uplift Model 312

Creating Uplift Models in JMP Pro 313

13.3 Summary 315

Problems 316

PART V MINING RELATIONSHIPS AMONG RECORDS

CHAPTER 14 Cluster Analysis 320

14.1 Introduction 320

Example: Public Utilities 322

14.2 Measuring Distance Between Two Observations 324

Euclidean Distance 324

Normalizing Numerical Measurements 324

Other Distance Measures for Numerical Data 326

Distance Measures for Categorical Data 327

Distance Measures for Mixed Data 327

14.3 Measuring Distance Between Two Clusters 328

14.4 Hierarchical (Agglomerative) Clustering 330

Single Linkage 332

Complete Linkage 332

Average Linkage 333

Centroid Linkage 333

Dendrograms: Displaying Clustering Process and Results 334

Validating Clusters 335

Limitations of Hierarchical Clustering 339

14.5 Nonhierarchical Clustering: The k-Means Algorithm 340

Initial Partition into k Clusters 342

Problems 350

PART VI FORECASTING TIME SERIES

CHAPTER 15 Handling Time Series 355

15.1 Introduction 355

15.2 Descriptive vs. Predictive Modeling 356

15.3 Popular Forecasting Methods in Business 357

Combining Methods 357

15.4 Time Series Components 358

Example: Ridership on Amtrak Trains 358

15.5 Data Partitioning and Performance Evaluation 362

Benchmark Performance: Naive Forecasts 362

Generating Future Forecasts 363

Problems 365

CHAPTER 16 Regression-Based Forecasting 368

16.1 A Model with Trend 368

Linear Trend 368

Exponential Trend 372

Polynomial Trend 374

16.2 A Model with Seasonality 375

16.3 A Model with Trend and Seasonality 378

16.4 Autocorrelation and ARIMA Models 378

Computing Autocorrelation 380

Improving Forecasts by Integrating Autocorrelation Information 383

Fitting AR Models to Residuals 384

Evaluating Predictability 387

Problems 389

CHAPTER 17 Smoothing Methods 399

17.1 Introduction 399

17.2 Moving Average 400

Centered Moving Average for Visualization 400

Trailing Moving Average for Forecasting 401

Choosing Window Width (w) 404

17.3 Simple Exponential Smoothing 405

Choosing Smoothing Parameter 406

Relation Between Moving Average and Simple Exponential Smoothing 408

17.4 Advanced Exponential Smoothing 409

Series with a Trend 409

Series with a Trend and Seasonality 410

Problems 414

PART VII CASES

CHAPTER 18 Cases 425

18.1 Charles Book Club 425

18.2 German Credit 434

Background 434

Data 434

18.3 Tayko Software Cataloger 439

18.4 Political Persuasion 442

Background 442

Predictive Analytics Arrives in US Politics 442

Political Targeting 442

Uplift 443

Data 444

Assignment 444

18.5 Taxi Cancellations 446

Business Situation 446

Assignment 446

18.6 Segmenting Consumers of Bath Soap 448

Appendix 451

18.7 Direct-Mail Fundraising 452

18.8 Predicting Bankruptcy 455

18.9 Time Series Case: Forecasting Public Transportation Demand 458

References 460

Data Files Used in the Book 461

Index 463