Imbalanced Learning: Foundations, Algorithms, and Applications

Haibo He (Editor), Yunqian Ma (Editor)

ISBN: 978-1-118-64620-5 | May 2013 | Wiley-IEEE Press | 216 Pages

Description

The first book of its kind to review the current status and future direction of the exciting new branch of machine learning/data mining called imbalanced learning

Imbalanced learning focuses on how an intelligent system can learn when it is provided with imbalanced data. Solving imbalanced learning problems is critical in numerous data-intensive networked systems, in domains such as surveillance, security, the Internet, finance, biomedicine, and defense. Because imbalanced datasets are inherently complex, learning from such data requires new understandings, principles, algorithms, and tools to transform vast amounts of raw data efficiently into information and knowledge representation.

The first comprehensive look at this new branch of machine learning, this book offers a critical review of the problem of imbalanced learning, covering the state of the art in techniques, principles, and real-world applications. Featuring contributions from experts in both academia and industry, Imbalanced Learning: Foundations, Algorithms, and Applications provides chapter coverage on:

  • Foundations of Imbalanced Learning
  • Imbalanced Datasets: From Sampling to Classifiers
  • Ensemble Methods for Class Imbalance Learning
  • Class Imbalance Learning Methods for Support Vector Machines
  • Class Imbalance and Active Learning
  • Nonstationary Stream Data Learning with Imbalanced Class Distribution
  • Assessment Metrics for Imbalanced Learning

Imbalanced Learning: Foundations, Algorithms, and Applications will help scientists and engineers learn how to tackle the problem of learning from imbalanced datasets, and gain insight into current developments in the field as well as future research directions.

Preface

Contributors

1 Introduction
Haibo He

1.1 Problem Formulation

1.2 State-of-the-Art Research

1.3 Looking Ahead: Challenges and Opportunities

1.4 Acknowledgments

References

2 Foundations of Imbalanced Learning
Gary M. Weiss

2.1 Introduction

2.2 Background

2.3 Foundational Issues

2.4 Methods for Addressing Imbalanced Data

2.5 Mapping Foundational Issues to Solutions

2.6 Misconceptions About Sampling Methods

2.7 Recommendations and Guidelines

References

3 Imbalanced Datasets: From Sampling to Classifiers
T. Ryan Hoens and Nitesh V. Chawla

3.1 Introduction

3.2 Sampling Methods

3.3 Skew-Insensitive Classifiers for Class Imbalance

3.4 Evaluation Metrics

3.5 Discussion

References

4 Ensemble Methods for Class Imbalance Learning
Xu-Ying Liu and Zhi-Hua Zhou

4.1 Introduction

4.2 Ensemble Methods

4.3 Ensemble Methods for Class Imbalance Learning

4.4 Empirical Study

4.5 Concluding Remarks

References

5 Class Imbalance Learning Methods for Support Vector Machines
Rukshan Batuwita and Vasile Palade

5.1 Introduction

5.2 Introduction to Support Vector Machines

5.3 SVMs and Class Imbalance

5.4 External Imbalance Learning Methods for SVMs: Data Preprocessing Methods

5.5 Internal Imbalance Learning Methods for SVMs: Algorithmic Methods

5.6 Summary

References

6 Class Imbalance and Active Learning
Josh Attenberg and Seyda Ertekin

6.1 Introduction

6.2 Active Learning for Imbalanced Problems

6.3 Active Learning for Imbalanced Data Classification

6.4 Adaptive Resampling with Active Learning

6.5 Difficulties with Extreme Class Imbalance

6.6 Dealing with Disjunctive Classes

6.7 Starting Cold

6.8 Alternatives to Active Learning for Imbalanced Problems

6.9 Conclusion

References

7 Nonstationary Stream Data Learning with Imbalanced Class Distribution
Sheng Chen and Haibo He

7.1 Introduction

7.2 Preliminaries

7.3 Algorithms

7.4 Simulation

7.5 Conclusion

7.6 Acknowledgments

References

8 Assessment Metrics for Imbalanced Learning
Nathalie Japkowicz

8.1 Introduction

8.2 A Review of Evaluation Metric Families and their Applicability to the Class Imbalance Problem

8.3 Threshold Metrics: Multiple- Versus Single-Class Focus

8.4 Ranking Methods and Metrics: Taking Uncertainty into Consideration

8.5 Conclusion

8.6 Acknowledgments

References

Index

“This book certainly qualifies as a reference for graduate studies in machine learning. Research students are sure to find it highly valuable and a prized possession, especially taking into account the wealth of supporting literature that the authors have brought to the fore.” (Computing Reviews, 27 March 2014)