Nonlinear Regression Modeling for Engineering Applications: Modeling, Model Validation, and Enabling Design of Experiments

ISBN: 978-1-118-59796-5
400 pages
September 2016

Description

Since mathematical models express our understanding of how nature behaves, we use them to validate our understanding of the fundamentals of systems (which could be processes, equipment, procedures, devices, or products). Once validated, a model is also useful for engineering applications related to diagnosis, design, and optimization.

First, we postulate a mechanism, then derive a model grounded in that mechanistic understanding. If the model does not fit the data, our understanding of the mechanism was wrong or incomplete, and patterns in the residuals can guide model improvement. Alternatively, when the model fits the data, our understanding is sufficient and the model can be used with confidence in engineering applications.
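The fit-then-inspect-residuals loop described above can be illustrated with a minimal Python sketch. It is not taken from the book: the saturating model y = a(1 - exp(-b x)), the synthetic data, and the helper names `fit_line`, `fit_saturating`, and `count_runs` are all assumptions chosen for illustration. A straight line stands in for an incomplete mechanism, and a simple scan over b (with a solved by linear least squares at each b) stands in for a real nonlinear optimizer.

```python
import math
import random


def sum_sq(residuals):
    """Sum of squared deviations (SSD)."""
    return sum(r * r for r in residuals)


def count_runs(residuals):
    """Count runs of same-signed residuals; very few runs hint at a systematic pattern."""
    signs = [r > 0 for r in residuals]
    return 1 + sum(1 for s0, s1 in zip(signs, signs[1:]) if s0 != s1)


def fit_line(x, y):
    """Ordinary least squares for y = c0 + c1*x (the deliberately wrong mechanism)."""
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    c1 = (sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
          / sum((xi - x_mean) ** 2 for xi in x))
    return y_mean - c1 * x_mean, c1


def fit_saturating(x, y, b_grid):
    """Fit y = a*(1 - exp(-b*x)): scan b; for each b the best a is linear least squares."""
    best = None
    for b in b_grid:
        g = [1.0 - math.exp(-b * xi) for xi in x]
        a = sum(gi * yi for gi, yi in zip(g, y)) / sum(gi * gi for gi in g)
        ssd = sum_sq([yi - a * gi for gi, yi in zip(g, y)])
        if best is None or ssd < best[0]:
            best = (ssd, a, b)
    return best[1], best[2]


# Synthetic data (invented for illustration): a = 2.0, b = 0.5, small Gaussian noise.
rng = random.Random(1)
x = [0.5 * k for k in range(1, 21)]
y = [2.0 * (1.0 - math.exp(-0.5 * xi)) + rng.gauss(0.0, 0.02) for xi in x]

# Candidate 1: straight line. Its residuals should show a curved, systematic pattern.
c0, c1 = fit_line(x, y)
res_line = [yi - (c0 + c1 * xi) for xi, yi in zip(x, y)]

# Candidate 2: saturating model, with b scanned over 0.01 to 3.00.
a_hat, b_hat = fit_saturating(x, y, [0.01 * k for k in range(1, 301)])
res_sat = [yi - a_hat * (1.0 - math.exp(-b_hat * xi)) for xi, yi in zip(x, y)]

ssd_line, ssd_sat = sum_sq(res_line), sum_sq(res_sat)
print("line:       SSD =", round(ssd_line, 4), " runs =", count_runs(res_line))
print("saturating: SSD =", round(ssd_sat, 4), " runs =", count_runs(res_sat))
print("estimates:  a =", round(a_hat, 3), " b =", round(b_hat, 2))
```

In this sketch, a large SSD together with only a few long runs of same-signed residuals is the signal that the postulated mechanism is incomplete. The run count is a crude stand-in for a formal runs test and is used here only to make the idea concrete.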

This book details methods of nonlinear regression, computational algorithms, model validation, interpretation of residuals, and useful experimental design. The focus is on practical applications, with relevant methods supported by fundamental analysis.

This book will help academic and industrial practitioners properly classify the system, choose among the available modeling options and regression objectives, design experiments to obtain data capturing critical system behaviors, fit the model parameters based on that data, and statistically characterize the resulting model. The author has used the material in the undergraduate unit operations lab course and in advanced control applications.

Table of Contents

Series Preface

Preface

Acknowledgments

Nomenclature

Symbols

Part I INTRODUCTION

1 Introductory Concepts

1.1 Illustrative Example – Traditional Linear Least-Squares Regression

1.2 How Models Are Used

1.3 Nonlinear Regression

1.4 Variable Types

1.5 Simulation

1.6 Issues

1.7 Takeaway

Exercises

2 Model Types

2.1 Model Terminology

2.2 A Classification of Mathematical Model Types

2.3 Steady-State and Dynamic Models

2.3.1 Steady-State Models

2.3.2 Dynamic Models (Time-Dependent, Transient)

2.4 Pseudo-First Principles – Appropriated First Principles

2.5 Pseudo-First Principles – Pseudo-Components

2.6 Empirical Models with Theoretical Grounding

2.6.1 Empirical Steady State

2.6.2 Empirical Time-Dependent

2.7 Empirical Models with No Theoretical Grounding

2.8 Partitioned Models

2.9 Empirical or Phenomenological?

2.10 Ensemble Models

2.11 Simulators

2.12 Stochastic and Probabilistic Models

2.13 Linearity

2.14 Discrete or Continuous

2.15 Constraints

2.16 Model Design (Architecture, Functionality, Structure)

2.17 Takeaway

Exercises

Part II PREPARATION FOR UNDERLYING SKILLS

3 Propagation of Uncertainty

3.1 Introduction

3.2 Sources of Error and Uncertainty

3.2.1 Estimation

3.2.2 Discrimination

3.2.3 Calibration Drift

3.2.4 Accuracy

3.2.5 Technique

3.2.6 Constants and Data

3.2.7 Noise

3.2.8 Model and Equations

3.2.9 Humans

3.3 Significant Digits

3.4 Rounding Off

3.5 Estimating Uncertainty on Values

3.5.1 Caution

3.6 Propagation of Uncertainty – Overview – Two Types, Two Ways Each

3.6.1 Maximum Uncertainty

3.6.2 Probable Uncertainty

3.6.3 Generality

3.7 Which to Report? Maximum or Probable Uncertainty

3.8 Bootstrapping

3.9 Bias and Precision

3.10 Takeaway

Exercises

4 Essential Probability and Statistics

4.1 Variation and Its Role in Topics

4.2 Histogram and Its PDF and CDF Views

4.3 Constructing a Data-Based View of PDF and CDF

4.4 Parameters that Characterize the Distribution

4.5 Some Representative Distributions

4.5.1 Gaussian Distribution

4.5.2 Log-Normal Distribution

4.5.3 Logistic Distribution

4.5.4 Exponential Distribution

4.5.5 Binomial Distribution

4.6 Confidence Interval

4.7 Central Limit Theorem

4.8 Hypothesis and Testing

4.9 Type I and Type II Errors, Alpha and Beta

4.10 Essential Statistics for This Text

4.10.1 t-Test for Bias

4.10.2 Wilcoxon Signed Rank Test for Bias

4.10.3 r-lag-1 Autocorrelation Test

4.10.4 Runs Test

4.10.5 Test for Steady State in a Noisy Signal

4.10.6 Chi-Square Contingency Test

4.10.7 Kolmogorov–Smirnov Distribution Test

4.10.8 Test for Proportion

4.10.9 F-Test for Equal Variance

4.11 Takeaway

Exercises

5 Simulation

5.1 Introduction

5.2 Three Sources of Deviation: Measurement, Inputs, Coefficients

5.3 Two Types of Perturbations: Noise (Independent) and Drifts (Persistence)

5.4 Two Types of Influence: Additive and Scaled with Level

5.5 Using the Inverse CDF to Generate n and u from UID(0, 1)

5.6 Takeaway

Exercises

6 Steady and Transient State Detection

6.1 Introduction

6.1.1 General Applications

6.1.2 Concepts and Issues in Detecting Steady State

6.1.3 Approaches and Issues to SSID and TSID

6.2 Method

6.2.1 Conceptual Model

6.2.2 Equations

6.2.3 Coefficient, Threshold, and Sample Frequency Values

6.2.4 Noiseless Data

6.3 Applications

6.3.1 Applications of the R-Statistic Approach for Process Monitoring

6.3.2 Applications of the R-Statistic Approach for Determining Regression Convergence

6.4 Takeaway

Exercises

Part III REGRESSION, VALIDATION, DESIGN

7 Regression Target – Objective Function

7.1 Introduction

7.2 Experimental and Measurement Uncertainty – Static and Continuous Valued

7.3 Likelihood

7.4 Maximum Likelihood

7.5 Estimating σx and σy Values

7.6 Vertical SSD – A Limiting Consideration of Variability Only in the Response Measurement

7.7 r-Square as a Measure of Fit

7.8 Normal, Total, or Perpendicular SSD

7.9 Akaho’s Method

7.10 Using a Model Inverse for Regression

7.11 Choosing the Dependent Variable

7.12 Model Prediction with Dynamic Models

7.13 Model Prediction with Classification Models

7.14 Model Prediction with Rank Models

7.15 Probabilistic Models

7.16 Stochastic Models

7.17 Takeaway

Exercises

8 Constraints

8.1 Introduction

8.2 Constraint Types

8.3 Expressing Hard Constraints in the Optimization Statement

8.4 Expressing Soft Constraints in the Optimization Statement

8.5 Equality Constraints

8.6 Takeaway

Exercises

9 The Distortion of Linearizing Transforms

9.1 Linearizing Coefficient Expression in Nonlinear Functions

9.2 The Associated Distortion

9.3 Sequential Coefficient Evaluation

9.4 Takeaway

Exercises

10 Optimization Algorithms

10.1 Introduction

10.2 Optimization Concepts

10.3 Gradient-Based Optimization

10.3.1 Numerical Derivative Evaluation

10.3.2 Steepest Descent – The Gradient

10.3.3 Cauchy’s Method

10.3.4 Incremental Steepest Descent (ISD)

10.3.5 Newton–Raphson (NR)

10.3.6 Levenberg–Marquardt (LM)

10.3.7 Modified LM

10.3.8 Generalized Reduced Gradient (GRG)

10.3.9 Work Assessment

10.3.10 Successive Quadratic (SQ)

10.3.11 Perspective

10.4 Direct Search Optimizers

10.4.1 Cyclic Heuristic Direct Search

10.4.2 Multiplayer Direct Search Algorithms

10.4.3 Leapfrogging

10.5 Takeaway

11 Multiple Optima

11.1 Introduction

11.2 Quantifying the Probability of Finding the Global Best

11.3 Approaches to Find the Global Optimum

11.4 Best-of-N Rule for Regression Starts

11.5 Interpreting the CDF

11.6 Takeaway

12 Regression Convergence Criteria

12.1 Introduction

12.2 Convergence versus Stopping

12.3 Traditional Criteria for Claiming Convergence

12.4 Combining DV Influence on OF

12.5 Use Relative Impact as Convergence Criterion

12.6 Steady-State Convergence Criterion

12.7 Neural Network Validation

12.8 Takeaway

Exercises

13 Model Design – Desired and Undesired Model Characteristics and Effects

13.1 Introduction

13.2 Redundant Coefficients

13.3 Coefficient Correlation

13.4 Asymptotic and Uncertainty Effects When Model is Inverted

13.5 Irrelevant Coefficients

13.6 Poles and Sign Flips w.r.t. the DV

13.7 Too Many Adjustable Coefficients or Too Many Regressors

13.8 Irrelevant Model Coefficients

13.8.1 Standard Error of the Estimate

13.8.2 Backward Elimination

13.8.3 Logical Tests

13.8.4 Propagation of Uncertainty

13.8.5 Bootstrapping

13.9 Scale-Up or Scale-Down Transition to New Phenomena

13.10 Takeaway

Exercises

14 Data Pre- and Post-processing

14.1 Introduction

14.2 Pre-processing Techniques

14.2.1 Steady- and Transient-State Selection

14.2.2 Internal Consistency

14.2.3 Truncation

14.2.4 Averaging and Voting

14.2.5 Data Reconciliation

14.2.6 Real-Time Noise Filtering for Noise Reduction (MA, FoF, STF)

14.2.7 Real-Time Noise Filtering for Outlier Removal (Median Filter)

14.2.8 Real-Time Noise Filtering, Statistical Process Control

14.2.9 Imputation of Input Data

14.3 Post-processing

14.3.1 Outliers and Rejection Criterion

14.3.2 Bimodal Residual Distributions

14.3.3 Imputation of Response Data

14.4 Takeaway

Exercises

15 Incremental Model Adjustment

15.1 Introduction

15.2 Choosing the Adjustable Coefficient in Phenomenological Models

15.3 Simple Approach

15.4 An Alternate Approach

15.5 Other Approaches

15.6 Takeaway

Exercises

16 Model and Experimental Validation

16.1 Introduction

16.1.1 Concepts

16.1.2 Deterministic Models

16.1.3 Stochastic Models

16.1.4 Reality!

16.2 Logic-Based Validation Criteria

16.3 Data-Based Validation Criteria and Statistical Tests

16.3.1 Continuous-Valued, Deterministic, Steady State, or End-of-Batch

16.3.2 Continuous-Valued, Deterministic, Transient

16.3.3 Class/Discrete/Rank-Valued, Deterministic, Batch, or Steady State

16.3.4 Continuous-Valued, Stochastic, Batch, or Steady State

16.3.5 Test for Normally Distributed Residuals

16.3.6 Experimental Procedure Validation

16.4 Model Discrimination

16.4.1 Mechanistic Models

16.4.2 Purely Empirical Models

16.5 Procedure Summary

16.6 Alternate Validation Approaches

16.7 Takeaway

Exercises

17 Model Prediction Uncertainty

17.1 Introduction

17.2 Bootstrapping

17.3 Takeaway

18 Design of Experiments for Model Development and Validation

18.1 Concept – Plan and Data

18.2 Sufficiently Small Experimental Uncertainty – Methodology

18.3 Screening Designs – A Good Plan for an Alternate Purpose

18.4 Experimental Design – A Plan for Validation and Discrimination

18.4.1 Continually Redesign

18.4.2 Experimental Plan

18.5 EHS&LP

18.6 Visual Examples of Undesired Designs

18.7 Example for an Experimental Plan

18.8 Takeaway

Exercises

19 Utility versus Perfection

19.1 Competing and Conflicting Measures of Excellence

19.2 Attributes for Model Utility Evaluation

19.3 Takeaway

Exercises

20 Troubleshooting

20.1 Introduction

20.2 Bimodal and Multimodal Residuals

20.3 Trends in the Residuals

20.4 Parameter Correlation

20.5 Convergence Criterion – Too Tight, Too Loose

20.6 Overfitting (Memorization)

20.7 Solution Procedure Encounters Execution Errors

20.8 Not a Sharp CDF (OF)

20.9 Outliers

20.10 Average Residual Not Zero

20.11 Irrelevant Model Coefficients

20.12 Data Work-Up after the Trials

20.13 Too Many rs!

20.14 Propagation of Uncertainty Does Not Match Residuals

20.15 Multiple Optima

20.16 Very Slow Progress

20.17 All Residuals are Zero

20.18 Takeaway

Exercises

Part IV CASE STUDIES AND DATA

21 Case Studies

21.1 Valve Characterization

21.2 CO2 Orifice Calibration

21.3 Enrollment Trend

21.4 Algae Response to Sunlight Intensity

21.5 Batch Reaction Kinetics

Appendix A: VBA Primer: Brief on VBA Programming – Excel in Office 2013

Appendix B: Leapfrogging Optimizer Code for Steady-State Models

Appendix C: Bootstrapping with Static Model

References and Further Reading

Index

Author Information

R. Russell Rhinehart, Oklahoma State University, USA.
Professor Rhinehart obtained his Ph.D. in Chemical Engineering in 1985 from North Carolina State University, USA. His research interests include process improvement (modeling, optimization, and control) and product improvement (modeling and design). In 2004 he was named one of InTECH's 50 most influential industry innovators of the past 50 years, and he was inducted into the Automation Hall of Fame for the Process Industries in 2005. He has written numerous journal and refereed articles.
