# Reliability and Risk Models: Setting Reliability Requirements, 2nd Edition

ISBN: 978-1-118-87332-8 | November 2015 | 456 Pages

## Description

Includes:

- A unique set of 46 generic principles for reducing technical risk
- Monte Carlo simulation algorithms for improving reliability and reducing risk
- Methods for setting reliability requirements based on the cost of failure
- New reliability measures based on a minimal separation of random events on a time interval
- Overstress reliability integral for determining the time to failure caused by overstress failure modes
- A powerful equation for determining the probability of failure controlled by defects in loaded components with complex shape
- Comparative methods for improving reliability which do not require reliability data
- Optimal allocation of limited resources to achieve a maximum risk reduction
- Improving system reliability based solely on a permutation of interchangeable components

## Table of contents

**Series Preface xvii**

**Preface xix**

**1 Failure Modes: Building Reliability Networks 1**

1.1 Failure Modes 1

1.2 Series and Parallel Arrangement of the Components in a Reliability Network 5

1.3 Building Reliability Networks: Difference between a Physical and Logical Arrangement 6

1.4 Complex Reliability Networks Which Cannot Be Presented as a Combination of Series and Parallel Arrangements 10

1.5 Drawbacks of the Traditional Representation of the Reliability Block Diagrams 11

*1.5.1 Reliability Networks Which Require More Than a Single Terminal Node *11

*1.5.2 Reliability Networks Which Require the Use of Undirected Edges Only, Directed Edges Only or a Mixture of Undirected and Directed Edges *13

*1.5.3 Reliability Networks Which Require Different Edges Referring to the Same Component *16

*1.5.4 Reliability Networks Which Require Negative‐State Components *17

**2 Basic Concepts 21**

2.1 Reliability (Survival) Function, Cumulative Distribution and Probability Density Function of the Times to Failure 21

2.2 Random Events in Reliability and Risk Modelling 23

*2.2.1 Reliability and Risk Modelling Using Intersection of Statistically Independent Random Events *23

*2.2.2 Reliability and Risk Modelling Using a Union of Mutually Exclusive Random Events *25

*2.2.3 Reliability of a System with Components Logically Arranged in Series *27

*2.2.4 Reliability of a System with Components Logically Arranged in Parallel *29

*2.2.5 Reliability of a System with Components Logically Arranged in Series and Parallel *31

*2.2.6 Using Finite Sets to Infer Component Reliability *32

2.3 Statistically Dependent Events and Conditional Probability in Reliability and Risk Modelling 33

2.4 Total Probability Theorem in Reliability and Risk Modelling. Reliability of Systems with Complex Reliability Networks 36

2.5 Reliability and Risk Modelling Using Bayesian Transform and Bayesian Updating 43

*2.5.1 Bayesian Transform *43

*2.5.2 Bayesian Updating *44

**3 Common Reliability and Risk Models and Their Applications 47**

3.1 General Framework for Reliability and Risk Analysis Based on Controlling Random Variables 47

3.2 Binomial Model 48

*3.2.1 Application: A Voting System *52

3.3 Homogeneous Poisson Process and Poisson Distribution 53

3.4 Negative Exponential Distribution 56

*3.4.1 Memoryless Property of the Negative Exponential Distribution *57

3.5 Hazard Rate 58

*3.5.1 Difference between Failure Density and Hazard Rate *60

*3.5.2 Reliability of a Series Arrangement Including Components with Constant Hazard Rates *61

3.6 Mean Time to Failure 61

3.7 Gamma Distribution 63

3.8 Uncertainty Associated with the MTTF 65

3.9 Mean Time between Failures 67

3.10 Problems with the MTTF and MTBF Reliability Measures 67

3.11 BX% Life 68

3.12 Minimum Failure‐Free Operation Period 69

3.13 Availability 70

*3.13.1 Availability on Demand *70

*3.13.2 Production Availability *71

3.14 Uniform Distribution Model 72

3.15 Normal (Gaussian) Distribution Model 73

3.16 Log‐Normal Distribution Model 77

3.17 Weibull Distribution Model of the Time to Failure 79

3.18 Extreme Value Distribution Model 81

3.19 Reliability Bathtub Curve 82

**4 Reliability and Risk Models Based on Distribution Mixtures 87**

4.1 Distribution of a Property from Multiple Sources 87

4.2 Variance of a Property from Multiple Sources 89

4.3 Variance Upper Bound Theorem 91

*4.3.1 Determining the Source Whose Removal Results in the Largest Decrease of the Variance Upper Bound *92

4.4 Applications of the Variance Upper Bound Theorem 93

*4.4.1 Using the Variance Upper Bound Theorem for Increasing the Robustness of Products and Processes *93

*4.4.2 Using the Variance Upper Bound Theorem for Developing Six‐Sigma Products and Processes *97

Appendix 4.1: Derivation of the Variance Upper Bound Theorem 99

Appendix 4.2: An Algorithm for Determining the Upper Bound of the Variance of Properties from Sampling Multiple Sources 101

**5 Building Reliability and Risk Models 103**

5.1 General Rules for Reliability Data Analysis 103

5.2 Probability Plotting 107

*5.2.1 Testing for Consistency with the Uniform Distribution Model *109

*5.2.2 Testing for Consistency with the Exponential Model *109

*5.2.3 Testing for Consistency with the Weibull Distribution *110

*5.2.4 Testing for Consistency with the Type I Extreme Value Distribution *111

*5.2.5 Testing for Consistency with the Normal Distribution *111

5.3 Estimating Model Parameters Using the Method of Maximum Likelihood 113

5.4 Estimating the Parameters of a Three‐Parameter Power Law 114

*5.4.1 Some Applications of the Three‐Parameter Power Law *116

**6 Load–Strength (Demand‐Capacity) Models 119**

6.1 A General Reliability Model 119

6.2 The Load–Strength Interference Model 120

6.3 Load–Strength (Demand‐Capacity) Integrals 122

6.4 Evaluating the Load–Strength Integral Using Numerical Methods 124

6.5 Normally Distributed and Statistically Independent Load and Strength 125

6.6 Reliability and Risk Analysis Based on the Load–Strength Interference Approach 130

*6.6.1 Influence of Strength Variability on Reliability *130

*6.6.2 Critical Weaknesses of the Traditional Reliability Measures ‘Safety Margin’ and ‘Loading Roughness’ *134

*6.6.3 Interaction between the Upper Tail of the Load Distribution and the Lower Tail of the Strength Distribution *136

**7 Overstress Reliability Integral and Damage Factorisation Law 139**

7.1 Reliability Associated with Overstress Failure Mechanisms 139

*7.1.1 The Link between the Negative Exponential Distribution and the Overstress Reliability Integral *141

7.2 Damage Factorisation Law 143

**8 Solving Reliability and Risk Models Using a Monte Carlo Simulation 147**

8.1 Monte Carlo Simulation Algorithms 147

*8.1.1 Monte Carlo Simulation and the Weak Law of Large Numbers *147

*8.1.2 Monte Carlo Simulation and the Central Limit Theorem *149

*8.1.3 Adopted Conventions in Describing the Monte Carlo Simulation Algorithms *149

8.2 Simulation of Random Variables 151

*8.2.1 Simulation of a Uniformly Distributed Random Variable *151

*8.2.2 Generation of a Random Subset *152

*8.2.3 Inverse Transformation Method for Simulation of Continuous Random Variables *153

*8.2.4 Simulation of a Random Variable following the Negative Exponential Distribution *154

*8.2.5 Simulation of a Random Variable following the Gamma Distribution *154

*8.2.6 Simulation of a Random Variable following a Homogeneous Poisson Process in a Finite Interval *155

*8.2.7 Simulation of a Discrete Random Variable with a Specified Distribution *156

*8.2.8 Selection of a Point at Random in the N‐Dimensional Space Region *157

*8.2.9 Simulation of Random Locations following a Homogeneous Poisson Process in a Finite Domain *158

*8.2.10 Simulation of a Random Direction in Space *158

*8.2.11 Generating Random Points on a Disc and in a Sphere *160

*8.2.12 Simulation of a Random Variable following the Three‐Parameter Weibull Distribution *162

*8.2.13 Simulation of a Random Variable following the Maximum Extreme Value Distribution *162

*8.2.14 Simulation of a Gaussian Random Variable *162

*8.2.15 Simulation of a Log‐Normal Random Variable *163

*8.2.16 Conditional Probability Technique for Bivariate Sampling *164

*8.2.17 Von Neumann’s Method for Sampling Continuous Random Variables *165

*8.2.18 Sampling from a Mixture Distribution *166

Appendix 8.1 166

**9 Evaluating Reliability and Probability of a Faulty Assembly Using Monte Carlo Simulation 169**

9.1 A General Algorithm for Determining Reliability Controlled by Statistically Independent Random Variables 169

9.2 Evaluation of the Reliability Controlled by a Load–Strength Interference 170

*9.2.1 Evaluation of the Reliability on Demand, with No Time Included *170

*9.2.2 Evaluation of the Reliability Controlled by Random Shocks on a Time Interval *171

9.3 A Virtual Testing Method for Determining the Probability of Faulty Assembly 173

9.4 Optimal Replacement to Minimise the Probability of a System Failure 177

**10 Evaluating the Reliability of Complex Systems and Virtual Accelerated Life Testing Using Monte Carlo Simulation 181**

10.1 Evaluating the Reliability of Complex Systems 181

10.2 Virtual Accelerated Life Testing of Complex Systems 183

*10.2.1 Acceleration Stresses and Their Impact on the Time to Failure of Components *183

*10.2.2 Arrhenius Stress–Life Relationship and Arrhenius‐Type Acceleration Life Models *185

*10.2.3 Inverse Power Law Relationship and Inverse Power Law‐Type Acceleration Life Models *185

*10.2.4 Eyring Stress–Life Relationship and Eyring‐Type Acceleration Life Models *185

**11 Generic Principles for Reducing Technical Risk 189**

11.1 Preventive Principles: Reducing Mainly the Likelihood of Failure 191

*11.1.1 Building in High Reliability in Processes, Components and Systems with Large Failure Consequences *191

*11.1.2 Simplifying at a System and Component Level *192

*11.1.2.1 Reducing the Number of Moving Parts *193

*11.1.3 Root Cause Failure Analysis *193

*11.1.4 Identifying and Removing Potential Failure Modes *194

*11.1.5 Mitigating the Harmful Effect of the Environment *194

*11.1.6 Building in Redundancy *195

*11.1.7 Reliability and Risk Modelling and Optimisation *197

*11.1.7.1 Building and Analysing Comparative Reliability Models *197

*11.1.7.2 Building and Analysing Physics of Failure Models *198

*11.1.7.3 Minimising Technical Risk through Optimisation and Optimal Replacement *199

*11.1.7.4 Maximising System Reliability and Availability by Appropriate Permutations of Interchangeable Components *199

*11.1.7.5 Maximising the Availability and Throughput Flow Reliability by Altering the Network Topology *199

*11.1.8 Reducing Variability of Risk-Critical Parameters and Preventing them from Reaching Dangerous Values *199

*11.1.9 Altering the Component Geometry *200

*11.1.10 Strengthening or Eliminating Weak Links *201

*11.1.11 Eliminating Factors Promoting Human Errors *202

*11.1.12 Reducing Risk by Introducing Inverse States *203

*11.1.12.1 Inverse States Cancelling the Anticipated State with a Negative Impact *203

*11.1.12.2 Inverse States Buffering the Anticipated State with a Negative Impact *203

*11.1.12.3 Inverting the Relative Position of Objects and the Direction of Flows *204

*11.1.12.4 Inverse State as a Counterbalancing Force *205

*11.1.13 Failure Prevention Interlocks *206

*11.1.14 Reducing the Number of Latent Faults *206

*11.1.15 Increasing the Level of Balancing *208

*11.1.16 Reducing the Negative Impact of Temperature by Thermal Design *209

*11.1.17 Self‐Stability *211

*11.1.18 Maintaining the Continuity of a Working State *212

*11.1.19 Substituting Mechanical Assemblies with Electrical, Optical or Acoustic Assemblies and Software *212

*11.1.20 Improving the Load Distribution *212

*11.1.21 Reducing the Sensitivity of Designs to the Variation of Design Parameters *212

*11.1.22 Vibration Control *216

*11.1.23 Built‐In Prevention *216

11.2 Dual Principles: Reduce Both the Likelihood of Failure and the Magnitude of Consequences 217

*11.2.1 Separating Critical Properties, Functions and Factors *217

*11.2.2 Reducing the Likelihood of Unfavourable Combinations of Risk‐Critical Random Variables *218

*11.2.3 Condition Monitoring *219

*11.2.4 Reducing the Time of Exposure or the Space of Exposure *219

*11.2.4.1 Time of Exposure *219

*11.2.4.2 Length of Exposure and Space of Exposure *220

*11.2.5 Discovering and Eliminating a Common Cause: Diversity in Design *220

*11.2.6 Eliminating Vulnerabilities *222

*11.2.7 Self‐Reinforcement *223

*11.2.8 Using Available Local Resources *223

*11.2.9 Derating *224

*11.2.10 Selecting Appropriate Materials and Microstructures *225

*11.2.11 Segmentation *225

*11.2.11.1 Segmentation Improves the Load Distribution *225

*11.2.11.2 Segmentation Reduces the Vulnerability to a Single Failure *225

*11.2.11.3 Segmentation Reduces the Damage Escalation *226

*11.2.11.4 Segmentation Limits the Hazard Potential *226

*11.2.12 Reducing the Vulnerability of Targets *226

*11.2.13 Making Zones Experiencing High Damage/Failure Rates Replaceable *227

*11.2.14 Reducing the Hazard Potential *227

*11.2.15 Integrated Risk Management *227

11.3 Protective Principles: Minimise the Consequences of Failure 229

*11.3.1 Fault‐Tolerant System Design *229

*11.3.2 Preventing Damage Escalation and Reducing the Rate of Deterioration *229

*11.3.3 Using Fail‐Safe Designs *230

*11.3.4 Deliberately Designed Weak Links *231

*11.3.5 Built‐In Protection *231

*11.3.6 Troubleshooting Procedures and Systems *232

*11.3.7 Simulation of the Consequences from Failure *232

*11.3.8 Risk Planning and Training *233

**12 Physics of Failure Models 235**

12.1 Fast Fracture 235

*12.1.1 Fast Fracture: Driving Forces behind Fast Fracture *235

*12.1.2 Reducing the Likelihood of Fast Fracture *241

*12.1.2.1 Basic Ways of Reducing the Likelihood of Fast Fracture *242

*12.1.2.2 Avoidance of Stress Raisers or Mitigating Their Harmful Effect *244

*12.1.2.3 Selecting Materials Which Fail in a Ductile Fashion *245

*12.1.3 Reducing the Consequences of Fast Fracture *247

*12.1.3.1 By Using Fail-Safe Designs *247

*12.1.3.2 By Using Crack Arrestors *250

12.2 Fatigue Fracture 251

*12.2.1 Reducing the Risk of Fatigue Fracture *257

*12.2.1.1 Reducing the Size of the Flaws *257

*12.2.1.2 Increasing the Final Fatigue Crack Length by Selecting Material with a Higher Fracture Toughness *257

*12.2.1.3 Reducing the Stress Range by an Appropriate Design *257

*12.2.1.4 Reducing the Stress Range by Restricting the Springback of Elastic Components *258

*12.2.1.5 Reducing the Stress Range by Reducing the Magnitude of Thermal Stresses *259

*12.2.1.6 Reducing the Stress Range by Introducing Compressive Residual Stresses at the Surface *261

*12.2.1.7 Reducing the Stress Range by Avoiding Excessive Bending *262

*12.2.1.8 Reducing the Stress Range by Avoiding Stress Concentrators *263

*12.2.1.9 Improving the Condition of the Surface and Eliminating Low-Strength Surfaces *263

*12.2.1.10 Increasing the Fatigue Life of Automotive Suspension Springs *264

12.3 Early‐Life Failures 265

*12.3.1 Influence of the Design on Early‐Life Failures *265

*12.3.2 Influence of the Variability of Critical Design Parameters on Early‐Life Failures *266

**13 Probability of Failure Initiated by Flaws 269**

13.1 Distribution of the Minimum Fracture Stress and a Mathematical Formulation of the Weakest‐Link Concept 269

13.2 The Stress Hazard Density as an Alternative of the Weibull Distribution 274

13.3 General Equation Related to the Probability of Failure of a Stressed Component with Complex Shape 276

13.4 Link between the Stress Hazard Density and the Conditional Individual Probability of Initiating Failure 278

13.5 Probability of Failure Initiated by Defects in Components with Complex Shape 279

13.6 Limiting the Vulnerability of Designs to Failure Caused by Flaws 280

**14 A Comparative Method for Improving the Reliability and Availability of Components and Systems 283**

14.1 Advantages of the Comparative Method to Traditional Methods 283

14.2 A Comparative Method for Improving the Reliability of Components Whose Failure is Initiated by Flaws 285

14.3 A Comparative Method for Improving System Reliability 289

14.4 A Comparative Method for Improving the Availability of Flow Networks 290

**15 Reliability Governed by the Relative Locations of Random Variables in a Finite Domain 293**

15.1 Reliability Dependent on the Relative Configurations of Random Variables 293

15.2 A Generic Equation Related to Reliability Dependent on the Relative Locations of a Fixed Number of Random Variables 293

15.3 A Given Number of Uniformly Distributed Random Variables in a Finite Interval (Conditional Case) 297

15.4 Probability of Clustering of a Fixed Number Uniformly Distributed Random Events 298

15.5 Probability of Unsatisfied Demand in the Case of One Available Source and Many Consumers 302

15.6 Reliability Governed by the Relative Locations of Random Variables following a Homogeneous Poisson Process in a Finite Domain 304

Appendix 15.1 305

**16 Reliability and Risk Dependent on the Existence of Minimum Separation Intervals between the Locations of Random Variables on a Finite Interval 307**

16.1 Applications Requiring Minimum Separation Intervals and Minimum Failure‐Free Operating Periods 307

16.2 Minimum Separation Intervals and Rolling MFFOP Reliability Measures 309

16.3 General Equations Related to Random Variables following a Homogeneous Poisson Process in a Finite Interval 310

16.4 Application Examples 312

*16.4.1 Setting Reliability Requirements to Guarantee a Specified MFFOP *312

*16.4.2 Reliability Assurance That a Specified MFFOP Has Been Met *312

*16.4.3 Specifying a Number Density Envelope to Guarantee Probability of Unsatisfied Random Demand below a Maximum Acceptable Level *314

*16.4.4 Insensitivity of the Probability of Unsatisfied Demand to the Variance of the Demand Time *315

16.5 Setting Reliability Requirements to Guarantee a Rolling MFFOP Followed by a Downtime 317

16.6 Setting Reliability Requirements to Guarantee an Availability Target 320

16.7 Closed-Form Expression for the Expected Fraction of the Time of Unsatisfied Demand 323

**17 Reliability Analysis and Setting Reliability Requirements Based on the Cost of Failure 327**

17.1 The Need for a Cost‐of‐Failure‐Based Approach 327

17.2 Risk of Failure 328

17.3 Setting Reliability Requirements Based on a Constant Cost of Failure 330

17.4 Drawbacks of the Expected Loss as a Measure of the Potential Loss from Failure 332

17.5 Potential Loss, Conditional Loss and Risk of Failure 333

17.6 Risk Associated with Multiple Failure Modes 336

*17.6.1 An Important Special Case *337

17.7 Expected Potential Loss Associated with Repairable Systems Whose Component Failures Follow a Homogeneous Poisson Process 338

17.8 A Counterexample Related to Repairable Systems 341

17.9 Guaranteeing Multiple Reliability Requirements for Systems with Components Logically Arranged in Series 342

**18 Potential Loss, Potential Profit and Risk 345**

18.1 Deficiencies of the Maximum Expected Profit Criterion in Selecting a Risky Prospect 345

18.2 Risk of a Net Loss and Expected Potential Reward Associated with a Limited Number of Statistically Independent Risk–Reward Bets in a Risky Prospect 346

18.3 Probability and Risk of a Net Loss Associated with a Small Number of Opportunity Bets 348

18.4 Samuelson’s Sequence of Good Bets Revisited 351

18.5 Variation of the Risk of a Net Loss Associated with a Small Number of Opportunity Bets 352

18.6 Distribution of the Potential Profit from a Limited Number of Risk–Reward Activities 353

**19 Optimal Allocation of Limited Resources among Discrete Risk Reduction Options 357**

19.1 Statement of the Problem 357

19.2 Weaknesses of the Standard (0‐1) Knapsack Dynamic Programming Approach 359

*19.2.1 A Counterexample *359

*19.2.2 The New Formulation of the Optimal Safety Budget Allocation Problem *360

*19.2.3 Dependence of the Removed System Risk on the Appropriate Selection of Combinations of Risk Reduction Options *361

*19.2.4 A Dynamic Algorithm for Solving the Optimal Safety Budget Allocation Problem *365

19.3 Validation of the Model by a Recursive Backtracking 369

**Appendix A 373**

A.1 Random Events 373

A.2 Union of Events 375

A.3 Intersection of Events 376

A.4 Probability 378

A.5 Probability of a Union and Intersection of Mutually Exclusive Events 379

A.6 Conditional Probability 380

A.7 Probability of a Union of Non‐disjoint Events 383

A.8 Statistically Dependent Events 384

A.9 Statistically Independent Events 384

A.10 Probability of a Union of Independent Events 385

A.11 Boolean Variables and Boolean Algebra 385

**Appendix B 391**

B.1 Random Variables: Basic Properties 391

B.2 Boolean Random Variables 392

B.3 Continuous Random Variables 392

B.4 Probability Density Function 392

B.5 Cumulative Distribution Function 393

B.6 Joint Distribution of Continuous Random Variables 393

B.7 Correlated Random Variables 394

B.8 Statistically Independent Random Variables 395

B.9 Properties of the Expectations and Variances of Random Variables 396

B.10 Important Theoretical Results Regarding the Sample Mean 397

**Appendix C: Cumulative Distribution Function of the Standard Normal Distribution 399**

**Appendix D: χ²‐Distribution 401**

**References 407**

**Index 413**