Relational Database Index Design and the Optimizers
Over the last few years, hardware and software have advanced beyond all recognition, so it's hardly surprising that relational database performance now receives much less attention. Unfortunately, improved hardware hasn't kept pace with the ever-increasing quantity of data processed today. Although disk packing densities have increased enormously, making storage costs extremely low and sequential reads very fast, random reads are still painfully slow. Many of the old design recommendations are therefore no longer valid; the optimal point of indexing has shifted a long way. Consequently, many of the old problems haven't actually gone away; they have simply changed their appearance.
This book provides an easy but effective approach to the design of indexes and tables. Using many examples and case studies, the authors describe how the DB2, Oracle, and SQL Server optimizers determine how to access data, and how CPU and response times for the resulting access paths can be quickly estimated. This makes it possible to compare the various design alternatives and choose the most appropriate one.
This book is intended for anyone who wants to understand the issues of SQL performance or how to design tables and indexes effectively. With this title, readers with many years of experience of relational systems will be able to better grasp the implications that have been brought into play by the introduction of new hardware.
An Instructor's Manual presenting detailed solutions to all the problems in the book is available online from the Wiley editorial department.
An Instructor Support FTP site is also available.
1. Introduction.
Another Book About SQL Performance!
Myths and Misconceptions.
2. Table and Index Organization.
Buffer Pools and Disk I/Os.
3. SQL Processing.
Optimizers and Access Paths.
Materializing the Result Rows.
4. Deriving the Ideal Index for a SELECT.
Basic Assumptions for Disk and CPU Times.
Three-Star Index—The Ideal Index for a SELECT.
Algorithm to Derive the Best Index for a SELECT.
Ideal Index for Every SELECT?
Cost of an Additional Index.
5. Proactive Index Design.
Detection of Inadequate Indexing.
Basic Question (BQ).
Quick Upper-Bound Estimate (QUBE).
Cheapest Adequate Index or Best Possible Index: Example 1.
Cheapest Adequate Index or Best Possible Index: Example 2.
When to Use the QUBE.
6. Factors Affecting the Index Design Process.
I/O Time Estimate Verification.
Multiple Thin Index Slices.
Filter Factor Pitfall.
Filter Factor Pitfall Example.
7. Reactive Index Design.
EXPLAIN Describes the Selected Access Paths.
Monitoring Reveals the Reality.
LRT-Level Exception Monitoring.
Call-Level Exception Monitoring.
DBMS-Specific Monitoring Issues.
8. Indexing for Table Joins.
Two Simple Joins.
Impact of Table Access Order on Index Design.
Basic Join Question (BJQ).
Predicting the Table Access Order.
Merge Scan Joins and Hash Joins.
Nested-Loop Joins Versus MS/HJ and Ideal Indexes.
Joining More Than Two Tables.
Why Joins Often Perform Poorly.
Designing Indexes for Subqueries.
Designing Indexes for Unions.
Table Design Considerations.
9. Star Join Considerations.
Indexes on Dimension Tables.
Huge Impact of the Table Access Order.
Indexes on Fact Tables.
10. Multiple Index Access.
11. Indexes and Reorganization.
Physical Structure of a B-Tree Index.
How the DBMS Finds an Index Row.
What Happens When a Row Is Inserted?
Are Leaf Page Splits Serious?
When Should an Index Be Reorganized?
Volatile Index Columns.
Long Index Rows.
Example: Order-Sensitive Batch Job.
Table Rows Stored in Leaf Pages.
Cost of Index Reorganization.
12. DBMS-Specific Indexing Restrictions.
Number of Index Columns.
Total Length of the Index Columns.
Number of Indexes per Table.
Maximum Index Size.
Index Row Suppression.
DBMS Index Creation Examples.
13. DBMS-Specific Indexing Options.
Index Row Suppression.
Additional Index Columns After the Index Key.
Constraints to Enforce Uniqueness.
DBMS Able to Read an Index in Both Directions.
Index Key Truncation.
Index Skip Scan.
Data-Partitioned Secondary Indexes.
14. Optimizers Are Not Perfect.
Optimizers Do Not Always See the Best Alternative.
Optimizers’ Cost Estimates May Be Very Wrong.
Cost Estimate Formulas.
Do Optimizer Problems Affect Index Design?
15. Additional Estimation Considerations.
Assumptions Behind the QUBE Formula.
Nonleaf Index Pages in Memory.
When the Actual Response Time Can Be Much Shorter Than the QUBE.
Estimating CPU Time (CQUBE).
CPU Estimation Examples.
16. Organizing the Index Design Process.
Computer-Assisted Index Design.
Nine Steps Toward Excellent Indexes.
MICHAEL LEACH, BSc, is a relational database consultant. He retired from IBM with twenty years' experience teaching application and database classes at IBM locations worldwide. Both authors have seen their material translated into many languages for widespread use. Their approach to index design has been successfully applied in numerous performance-critical systems.