Knowledge Discovery with Support Vector Machines
ISBN: 9780470371923
246 pages
August 2009

This book provides an in-depth, easy-to-follow introduction to support vector machines, drawing only from minimal, carefully motivated technical and mathematical background material. It begins with a cohesive discussion of machine learning and goes on to cover:

Knowledge discovery environments

Describing data mathematically

Linear decision surfaces and functions

Perceptron learning

Maximum margin classifiers

Support vector machines

Elements of statistical learning theory

Multiclass classification

Regression with support vector machines

Novelty detection
Complemented with hands-on exercises, algorithm descriptions, and data sets, Knowledge Discovery with Support Vector Machines is an invaluable textbook for advanced undergraduate and graduate courses. It is also an excellent tutorial on support vector machines for professionals who are pursuing research in machine learning and related areas.
PART I.
1 What is Knowledge Discovery?
1.1 Machine Learning.
1.2 The Structure of the Universe X.
1.3 Inductive Learning.
1.4 Model Representations.
Exercises.
Bibliographic Notes.
2 Knowledge Discovery Environments.
2.1 Computational Aspects of Knowledge Discovery.
2.1.1 Data Access.
2.1.2 Visualization.
2.1.3 Data Manipulation.
2.1.4 Model Building and Evaluation.
2.1.5 Model Deployment.
2.2 Other Toolsets.
Exercises.
Bibliographic Notes.
3 Describing Data Mathematically.
3.1 From Data Sets to Vector Spaces.
3.1.1 Vectors.
3.1.2 Vector Spaces.
3.2 The Dot Product as a Similarity Score.
3.3 Lines, Planes, and Hyperplanes.
Exercises.
Bibliographic Notes.
4 Linear Decision Surfaces and Functions.
4.1 From Data Sets to Decision Functions.
4.1.1 Linear Decision Surfaces through the Origin.
4.1.2 Decision Surfaces with an Offset Term.
4.2 A Simple Learning Algorithm.
4.3 Discussion.
Exercises.
Bibliographic Notes.
5 Perceptron Learning.
5.1 Perceptron Architecture and Training.
5.2 Duality.
5.3 Discussion.
Exercises.
Bibliographic Notes.
6 Maximum Margin Classifiers.
6.1 Optimization Problems.
6.2 Maximum Margins.
6.3 Optimizing the Margin.
6.4 Quadratic Programming.
6.5 Discussion.
Exercises.
Bibliographic Notes.
PART II.
7 Support Vector Machines.
7.1 The Lagrangian Dual.
7.2 Dual Maximum-Margin Optimization.
7.2.1 The Dual Decision Function.
7.3 Linear Support Vector Machines.
7.4 Non-Linear Support Vector Machines.
7.4.1 The Kernel Trick.
7.4.2 Feature Search.
7.4.3 A Closer Look at Kernels.
7.5 Soft-Margin Classifiers.
7.5.1 The Dual Setting for Soft-Margin Classifiers.
7.6 Tool Support.
7.6.1 WEKA.
7.6.2 R.
7.7 Discussion.
Exercises.
Bibliographic Notes.
8 Implementation.
8.1 Gradient Ascent.
8.1.1 The Kernel-Adatron Algorithm.
8.2 Quadratic Programming.
8.2.1 Chunking.
8.3 Sequential Minimal Optimization.
8.4 Discussion.
Exercises.
Bibliographic Notes.
9 Evaluating What has been Learned.
9.1 Performance Metrics.
9.1.1 The Confusion Matrix.
9.2 Model Evaluation.
9.2.1 The Hold-Out Method.
9.2.2 The Leave-One-Out Method.
9.2.3 N-Fold Cross-Validation.
9.3 Error Confidence Intervals.
9.3.1 Model Comparisons.
9.4 Model Evaluation in Practice.
9.4.1 WEKA.
9.4.2 R.
Exercises.
Bibliographic Notes.
10 Elements of Statistical Learning Theory.
10.1 The VC-Dimension and Model Complexity.
10.2 A Theoretical Setting for Machine Learning.
10.3 Empirical Risk Minimization.
10.4 VC-Confidence.
10.5 Structural Risk Minimization.
10.6 Discussion.
Exercises.
Bibliographic Notes.
PART III.
11 Multi-Class Classification.
11.1 One-versus-the-Rest Classification.
11.2 Pairwise Classification.
11.3 Discussion.
Exercises.
Bibliographic Notes.
12 Regression with Support Vector Machines.
12.1 Regression as Machine Learning.
12.2 Simple and Multiple Linear Regression.
12.3 Regression with Maximum Margin Machines.
12.4 Regression with Support Vector Machines.
12.5 Model Evaluation.
12.6 Tool Support.
12.6.1 WEKA.
12.6.2 R.
Exercises.
Bibliographic Notes.
13 Novelty Detection.
13.1 Maximum Margin Machines.
13.2 The Dual Setting.
13.3 Novelty Detection in R.
Exercises.
Bibliographic Notes.
Appendix A: Notation.
Appendix B: A Tutorial Introduction to R.
B.1 Programming Constructs.
B.2 Data Constructs.
B.3 Basic Data Analysis.
Bibliographic Notes.
References.
Index.