
Maintaining Mission Critical Systems in a 24/7 Environment



Peter M. Curtis

ISBN: 978-0-470-08903-3 | April 2007 | Wiley-IEEE Press | 300 Pages


The latest tested and proven strategies to maintain business resiliency and sustainability for our ever-growing global digital economy

Here is a comprehensive study of the fundamentals of mission critical systems, which are designed to maintain ultra-high reliability, availability, and resiliency of electrical, mechanical, and digital systems and to eliminate costly downtime. Readers learn the skills needed to design, fine-tune, operate, and maintain mission critical equipment and systems. Practical in focus, the text helps readers configure and customize their designs to match their organizations' unique needs and risk tolerance. Specific strategies are provided to deal with a wide range of contingencies, from power failures to human error to fire. In addition, the author highlights measures that are mandated by policy and regulation.

The author of this text has worked in mission critical facilities engineering for more than twenty years, serving clients in banking, defense, utilities, energy, and education environments. His recommendations for maintaining essential operations are based on firsthand experience of what works and what does not.

Most chapters in this text concentrate on an individual component of the mission critical system, including standby generators, automatic transfer switches, uninterruptible power supplies, and fuel, fire, and battery systems. For each component, the author sets forth applications, available models, design choices, standard operating procedures, emergency action plans, maintenance procedures, and applicable codes and standards. Extensive use of photographs and diagrams illustrates how individual components and integrated systems work.

With the rapid growth of e-commerce and 24/7 business operations, mission critical systems have moved to the forefront of concerns among both private and public operations. Facilities engineers, senior administrators, and business continuity professionals involved in information technology and data center design should consult this text regularly to ensure they have done everything they can to protect and sustain their operations against human error, equipment failures, and other critical events. Adapted from material the author has used in academic and professional training programs, this guide is also an ideal desktop reference and textbook.



1. An Overview of Reliability and Resilience in Today’s Mission Critical Facilities.

1.1 Introduction.

1.2 Risk Assessment.

1.3 Capital Costs Versus Operation Costs.

1.4 Change Management.

1.5 Testing and Commissioning.

1.6 Documentation and Human Factor.

1.7 Education and Training.

1.8 Operation and Maintenance.

1.9 Employee Certification.

1.10 Standard and Benchmarking.

2. Policies and Regulations.

2.1 Executive Summary.

2.2 Introduction.

2.3 Industry Regulations and Policies.

2.3.1 U.S. Patriot Act.

2.3.2 The National Strategy for the Physical Protection of Critical Infrastructures and Key Assets.

2.3.3 U.S. Securities and Exchange Commission (SEC).

2.3.4 Sound Practices to Strengthen the Resilience of the U.S. Financial System.

2.3.5 Federal Real Property Council (FRPC).

2.3.6 Basel II Accord.

2.3.7 Sarbanes–Oxley (SOX).

2.3.8 NFPA 1600.

3. Mission Critical Facilities Engineering.

3.1 Introduction.

3.2 Companies’ Expectations: Risk Tolerance and Reliability.

3.3 Identifying the Appropriate Redundancy in a Mission Critical Facility.

3.4 Improving Reliability, Maintainability, and Proactive Preventative Maintenance.

3.5 The Mission Critical Facilities Manager and the Importance of the Boardroom.

3.6 Quantifying Reliability and Availability.

3.6.1 Review of Reliability Versus Availability.

3.7 Design Considerations for the Mission Critical Data Center.

3.8 Mission Critical Facility Start-Up.

3.9 The Evolution of Mission Critical Facility Design.

4. Mission Critical Electrical Systems Maintenance.

4.1 Introduction.

4.2 The History of the Maintenance Supervisor and the Evolution of the Mission Critical Facilities Engineer.

4.3 Internal Building Deficiencies and Analysis.

4.4 Evaluating Your System.

4.5 Choosing a Maintenance Approach.

4.6 Standards and Regulations Affecting How Safe Electrical Maintenance Is Performed.

4.7 Maintenance of Typical Electrical Distribution Equipment.

4.7.1 Infrared Scanning.

4.7.2 15-Kilovolt Class Equipment.

4.7.3 480-Volt Switchgear.

4.7.4 Motor Control Centers and Panel Boards.

4.7.5 Automatic Transfer Switches.

4.7.6 Automatic Static Transfer Switches (ASTS).

4.7.7 Power Distribution Units.

4.7.8 277/480-Volt Transformers.

4.7.9 Uninterruptible Power Systems.

4.7.10 A Final Point on Servicing Equipment.

4.8 Being Proactive in Evaluating the Test Reports.

4.9 Data Center Reliability.

5. Standby Generators: Technology, Applications, and Maintenance.

5.1 Introduction.

5.2 The Necessity for Standby Power.

5.3 Emergency, Legally Required, and Optional Systems.

5.4 Standby Systems that Are Legally Required.

5.5 Optional Standby Systems.

5.6 Understanding Your Power Requirements.

5.7 Management Commitment and Training.

5.7.1 Lockout/Tagout.

5.7.2 Training.

5.8 Standby Generator Systems Maintenance Procedures.

5.8.1 Maintenance Record Keeping and Data Trending.

5.8.2 Load Bank Testing.

5.9 Documentation Plan.

5.9.1 Proper Documentation and Forms.

5.9.2 Record Keeping.

5.10 Emergency Procedures.

5.11 Cold Start and Load Acceptance.

5.12 Nonlinear Load Problems.

5.13 Conclusions.

6. Fuel Systems and Design and Maintenance for Fuel Oil (Howard L. Chesneau, Edward English III, and Ron Ritorto).

6.1 Fuel Systems and Fuel Oil.

6.1.1 Fuel Supply Maintenance Items.

6.1.2 Fuel Supply Typical Design Criteria.

6.2 Bulk Storage Tank Selection.

6.3 Codes and Standards.

6.4 Recommended Practices for all Tanks.

6.5 Fuel Distribution System Configuration.

6.5.1 Day Tank Control System.

7. Automatic Transfer Switch Technology, Application, and Maintenance.

7.1 Introduction.

7.2 Overview.

7.3 Transfer Switch Technology and Applications.

7.3.1 Types of Transfer Switches.

7.3.2 Bypass-Isolation Transfer Switches.

7.3.3 Breaker Pair ATSs.

7.4 Control Devices.

7.4.1 Time Delays.

7.4.2 In-Phase Monitor.

7.4.3 Programmed (Delayed) Transition.

7.4.4 Closed Transition Transfer (Parallel Transfer).

7.4.5 Test Switches.

7.4.6 Exercise Clock.

7.4.7 Voltage and Frequency Sensing Controls.

7.5 Optional Accessories and Features.

7.6 ATS Required Capabilities.

7.6.1 Close Against High In-Rush Currents.

7.6.2 Withstand and Closing Rating (WCR).

7.6.3 Carry Full Rated Current Continuously.

7.6.4 Interrupt Current.

7.7 Additional Characteristics and Ratings of ATSs.

7.7.1 NEMA Classification.

7.7.2 System Voltage Ratings.

7.7.3 ATS Sizing.

7.7.4 Seismic Requirement.

7.8 Installation, Maintenance, and Safety.

7.8.1 Installation Procedures.

7.8.2 Maintenance Safety.

7.8.3 Maintenance.

7.8.4 Drawings and Manuals.

7.8.5 Testing and Training.

7.9 General Recommendations.

8. The Static Transfer Switch.

8.1 Introduction.

8.2 Overview.

8.2.1 Major Components.

8.3 Typical Static Switch One Line.

8.3.1 Normal Operation.

8.3.2 STS and STS/Transformer Configurations.

8.4 STS Technology and Application.

8.4.1 General Parameters.

8.4.2 STS Location and Type.

8.4.3 Advantages and Disadvantages of the Primary and Secondary STS/Transformer Systems.

8.4.4 Monitoring and Data Logging and Data Management.

8.4.5 STS Remote Communication.

8.4.6 Security.

8.4.7 Human Engineering and Eliminating Human Errors.

8.4.8 Reliability and Availability.

8.4.9 Reparability and Maintainability.

8.4.10 Fault Tolerance and Abnormal Operation.

8.5 Testing.

8.6 Conclusion.

9. The Fundamentals of Power Quality and their Associated Problems.

9.1 Introduction.

9.2 Electricity Basics.

9.2.1 Basic Circuit.

9.3 Transmission of Power.

9.3.1 Life Cycle of Electricity.

9.3.2 Single- and Three-Phase Power Basics.

9.3.3 Unreliable Power Versus Reliable Power.

9.4 Understanding Power Problems.

9.4.1 Power Quality Transients.

9.4.2 RMS Variations.

9.4.3 Causes of Power Line Disturbances.

9.4.4 Power Line Disturbance Levels.

9.5 Tolerances of Computer Equipment.

9.5.1 CBEMA Curve.

9.5.2 ITIC Curve.

9.5.3 Purpose of Curves.

9.6 Power Monitoring.

9.6.1 Example Power Monitoring Equipment.

9.7 The Deregulation Wildcard.

9.8 Troubleshooting Power Quality.

10. An Overview of UPS Systems: Technology, Application, and Maintenance.

10.1 Introduction.

10.2 Purpose of UPS Systems.

10.3 General Description of UPS Systems.

10.3.1 What Is a UPS System?

10.3.2 How Does a UPS System Work?

10.4 Static UPS Systems.

10.4.1 Online.

10.4.2 Double Conversion.

10.4.3 UPS Power Path.

10.5 Components of a Static UPS System.

10.5.1 Power Control Devices.

10.5.2 Line Interactive UPS Systems.

10.6 Rotary Systems.

10.6.1 Rotary UPS Systems.

10.6.2 UPSs Using Diesel.

10.7 Redundancy and Configurations.

10.7.1 Redundancy.

10.7.2 Isolated Redundant.

10.7.3 Tie Systems.

10.8 Batteries and Energy Storage Systems.

10.8.1 Battery.

10.8.2 Flywheel Energy.

10.9 UPS Maintenance and Testing.

10.9.1 Steady-State Load Test.

10.9.2 Harmonic Analysis.

10.9.3 Filter Integrity.

10.9.4 Transient Response Load Test.

10.9.5 Module Fault Test.

10.9.6 Battery Run Down Test.

10.10 Static UPS and Maintenance.

10.10.1 Semi-Annual Checks and Services.

10.11 UPS Management.

10.12 Additional Topics.

10.12.1 Offline (Standby).

11. Data Center Cooling: Systems and Components (Don Beaty).

11.1 Introduction.

11.2 Building Cooling Overview.

11.3 Cooling Within Datacom Rooms.

11.4 Cooling Systems.

11.4.1 Airside.

11.4.2 Waterside.

11.4.3 Air- and Liquid-Cooling Distribution Systems.

11.5 Components Outside the Datacom Room.

11.5.1 Refrigeration Equipment—Chillers.

11.5.2 Heat Rejection Equipment.

11.5.3 Energy Recovery Equipment.

11.6 Components Inside Datacom Room.

11.6.1 CRAC Units.

12. Raised Access Floors (Dan Catalfu).

12.1 Introduction.

12.1.1 What Is an Access Floor?

12.1.2 What Are the Typical Applications for Access Floors?

12.1.3 Why Use an Access Floor?

12.2 Design Considerations.

12.2.1 Determine the Structural Performance Required.

12.2.2 Determine the Required Finished Floor Height.

12.2.3 Determine the Understructure Support Design Type Required.

12.2.4 Determine the Appropriate Floor Finish.

12.2.5 Airflow Requirements.

12.3 Safety Concerns.

12.3.1 Removal and Reinstallation of Panels.

12.3.2 Removing Panels.

12.3.3 Reinstalling Panels.

12.3.4 Stringer Systems.

12.3.5 Protecting the Floor from Heavy Loads.

12.3.6 Grounding the Access Floor.

12.3.7 Fire Protection.

12.3.8 Zinc Whiskers.

12.4 Panel Cutting.

12.4.1 Safety Requirements for Cutting Panels.

12.4.2 Guidelines for Cutting Panels.

12.4.3 Cutout Locations in Panels; Supplemental Support for Cut Panels.

12.4.4 Saws and Blades for Panel Cutting.

12.4.5 Interior Cutout Procedure.

12.4.6 Round Cutout Procedure.

12.4.7 Installing Protective Trim Around Cut Edges.

12.5 Access Floor Maintenance.

12.5.1 Standard High-Pressure Laminate Floor Tile (HPL).

12.5.2 Vinyl Conductive and Static Dissipative Tile.

12.5.3 Cleaning the Floor Cavity.

12.5.4 Removing Liquid from the Floor Cavity.

12.6 Troubleshooting.

12.6.1 Making Pedestal Height Adjustments.

12.6.2 Rocking Panel Condition.

12.6.3 Panel Lipping Condition (Panel Sitting High).

12.6.4 Out-of-Square Stringer Grid (Twisted Grid).

12.6.5 Tipping at Perimeter Panels.

12.6.6 Tight Floor or Loose Floor: Floor Systems Laminated with HPL Tile.

13. Fire Protection in Mission Critical Infrastructures (Brian K. Fabel).

13.1 Introduction.

13.2 Philosophy.

13.2.1 Alarm and Notification.

13.2.2 Early Detection.

13.2.3 Fire Suppression.

13.3 Systems Design.

13.3.1 System Types.

13.3.2 Fire and Building Codes.

13.4 Fire Detection.

13.5 Fire Suppression Systems.

13.5.1 Watermist Systems.

13.5.2 Carbon Dioxide Systems.

13.5.3 Clean Agent Systems.

13.5.4 Inert Gas Agents.

13.5.5 IG-541.

13.5.6 IG-55.

13.5.7 Chemical Clean Agents.

13.5.8 Fire Extinguishers.


Appendix A: Critical Power.

Appendix B: BITS Guide to Business-Critical Power.

Appendix C: Syska Criticality Levels.



FTP Site: Study guide (PDF), chapter tests (PDF and Word), and PowerPoint files