Beginning Regular Expressions

ISBN: 978-0-7645-7489-4
768 pages
February 2005


This book introduces the component parts of a regular expression pattern, explains what each part means, and walks you through working examples that show how the patterns work and why they behave as they do. By working through the examples, you will learn to write regular expressions that do what you intend and to avoid patterns that don’t.
Beginning chapters introduce regular expressions and show you a method you can use to break down a text manipulation problem into component parts so that you can make an intelligent choice about constructing a regular expression pattern that matches what you want it to match and avoids matching unwanted text.
To solve more complex problems, you should set out a problem definition and progressively refine it to express it in English in a way that corresponds to a regular expression pattern that does what you want it to do.
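As a rough illustration of that refinement process (the date-matching task and the names `loose`, `refined`, and `find_dates` are assumptions made for this sketch, not examples taken from the book), a loosely worded problem definition can be tightened step by step until the English statement maps directly onto a pattern:

```python
import re

# Refinement 1 (too loose): "match a date" read as "any run of digits and hyphens".
loose = re.compile(r"[\d-]+")

# Refinement 2: "four digits, a hyphen, two digits, a hyphen, two digits".
# Refinement 3: "...standing on its own, not embedded in a longer number",
# which is expressed with the word-boundary metacharacter \b.
refined = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


def find_dates(text):
    """Return every substring of text that matches the refined definition."""
    return refined.findall(text)


sample = "Order 12345 shipped 2005-02-11, ref 99-1."
print(loose.findall(sample))   # the loose pattern also picks up unwanted text
print(find_dates(sample))      # the refined pattern keeps only the date
```

Each refinement of the English statement corresponds to one visible change in the pattern, which makes it easier to see why unwanted text stops matching.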
The second part of the book devotes a chapter to each of several technologies available on the Windows platform. You are shown how to use each tool or language with regular expressions (for example, how to do a lookahead in Perl or create a named variable in C#).
Regular expressions can be useful in applications such as Microsoft Word, OpenOffice.org Writer, Microsoft Excel, and Microsoft Access. A chapter is devoted to each.
In addition, tools such as the little-known Windows findstr utility and the commercial PowerGrep tool each have a chapter showing how they can be used to solve text manipulation tasks that span multiple files.
The use of regular expressions in the MySQL and Microsoft SQL Server databases is also demonstrated.
Several programming languages each have a chapter describing the metacharacters available in that language, together with demonstrations of how its objects or classes can be used with regular expressions. The languages covered are VBScript, JavaScript, Visual Basic .NET, C#, PHP, Java, and Perl.
XML is used increasingly to store textual data. The W3C XML Schema definition language can use regular expressions to automatically validate data in an XML document. A chapter on W3C XML Schema demonstrates how regular expressions can be used with the xs:pattern element.
Chapters 1 through 10 describe the component parts of regular expression patterns and show you what they do and how they can be used with a variety of text manipulation tools and languages. You should work through these chapters in order and build up your understanding of regular expressions.
The book then devotes a chapter to each of several text manipulation tools and programming languages. These chapters assume knowledge from Chapters 1 through 10, but you can dip into the tool-specific and language-specific chapters in any order you want.

Table of Contents


Chapter 1: Introduction to Regular Expressions.

Chapter 2: Regular Expression Tools and an Approach to Using Them.

Chapter 3: Simple Regular Expressions.

Chapter 4: Metacharacters and Modifiers.

Chapter 5: Character Classes.

Chapter 6: String, Line, and Word Boundaries.

Chapter 7: Parentheses in Regular Expressions.

Chapter 8: Lookahead and Lookbehind.

Chapter 9: Sensitivity and Specificity of Regular Expressions.

Chapter 10: Documenting and Debugging Regular Expressions.

Chapter 11: Regular Expressions in Microsoft Word.

Chapter 12: Regular Expressions in StarOffice/OpenOffice.org Writer.

Chapter 13: Regular Expressions Using findstr.

Chapter 14: PowerGREP.

Chapter 15: Wildcards in Microsoft Excel.

Chapter 16: Regular Expression Functionality in SQL Server 2000.

Chapter 17: Using Regular Expressions with MySQL.

Chapter 18: Regular Expressions and Microsoft Access.

Chapter 19: Regular Expressions in JScript and JavaScript.

Chapter 20: Regular Expressions and VBScript.

Chapter 21: Visual Basic .NET and Regular Expressions.

Chapter 22: C# and Regular Expressions.

Chapter 23: PHP and Regular Expressions.

Chapter 24: Regular Expressions in W3C XML Schema.

Chapter 25: Regular Expressions in Java.

Chapter 26: Regular Expressions in Perl.

Appendix A: Exercise Answers.


Author Information

Andrew Watt is an independent consultant and experienced author with an interest and expertise in XML and Web technologies. He has written and coauthored more than 10 books on Web development and XML, including XPath Essentials and XML Schema Essentials. He has been programming since 1984, moving to Web development technologies in 1994. He’s a well-known voice in several influential online technical communities and is a frequent contributor to many Web development specifications.

Downloads

Download the Source Code (226.55 KB)
To use this file, download it to your local machine and unzip it. Windows users can use the built-in Windows ZIP utilities or a third-party utility such as WinZip. Be sure to use the correct option in your ZIP utility to preserve the directory structure when decompressing the archive.

Corrected Figure 14 (14.17 KB)
This is a corrected Figure 14 from Chapter 8.

Errata

Do you think you've discovered an error in this book? Please check the list of errata below to see if we've already addressed the error. If not, please submit the error via our Errata Form. We will attempt to verify your error; if you're right, we will post a correction below.

Page 37: Error in Text
The word "probabl" appears in the middle of page 37. It should read "probably".

Page 215: Corrected Figure 14
Figure 8-14 has been replaced by a corrected figure.

