Perl Compatible Regular Expressions
Perl Compatible Regular Expressions ('''PCRE''') is a regular expression C library inspired by Perl's external interface, written by Philip Hazel. PCRE is much richer than classic regular expression libraries, which is why it has been adopted by many modern programming languages. Its syntax is much more powerful and flexible than that of POSIX regular expressions. The name is something of a misnomer, because PCRE is "Perl Compatible" only if you consider a subset of PCRE's settings and a subset of Perl's regular expression facilities. PCRE settings also permit PCRE to emulate regular expression libraries other than Perl's, for example by selecting whether a backslash enables (Emacs-like) or disables (Perl-like) special characters such as the vertical bar. C and C++ interfaces are provided by the library itself. The PCRE library is incorporated into a number of prominent open-source programs, such as the Apache web server and the PHP scripting language. As of Perl 5.9.4, PCRE is also available as a replacement for Perl's default regular expression engine through the re::engine::PCRE module.

FEATURES

PCRE has developed an extensive and in some ways unique feature set. Although it was originally intended to be feature equivalent with Perl, over time a number of features were first implemented in PCRE and only much later added to Perl. During the PCRE 7.x and Perl 5.9.x (development track) phase, the two projects coordinated development and are, to the extent possible, feature equivalent. In some cases PCRE has included in mainline releases features that originated with Perl 5.9.x, and in some cases Perl 5.9.x has included features that were previously available only in PCRE.
Currently, the following features are available:
;Consistent escaping rules :Like Perl, PCRE has consistent escaping rules: any non-alphanumeric character may be escaped to mean its literal value by prefixing it with a \ (backslash), whereas a backslash before an alphanumeric character typically gives that character a special meaning. Where such a sequence has not been defined as special, it is treated as a literal; however, this usage is not forward compatible, because new versions of PCRE may give such sequences a special meaning. A good example is \R, which had no special meaning prior to PCRE 7. In POSIX regular expressions, by contrast, a backslash sometimes escapes a non-alphanumeric character (e.g. \.) and sometimes introduces a special feature (e.g. \(\)).
;Extended character classes :Single-letter character classes are supported in addition to the longer POSIX names. For example, \d matches any digit exactly as [[:digit:]] would in POSIX regular expressions.
;Multiline matching :^ and $ can match at the beginning and end of the string only, or at the start and end of each "line" within the string, depending on which options are set.
;Newline/linebreak options :When PCRE is compiled, a newline default is selected. The newline convention in effect determines where PCRE detects line beginnings for ^ and line ends for $ (in multiline mode), as well as what the dot matches (regardless of multiline mode, unless the dotall (?s) option is set). It also affects PCRE's matching procedure (since version 7.0): when an unanchored pattern fails to match at the start of a newline sequence, PCRE advances past the entire newline sequence before retrying the match. If the newline convention in effect includes CRLF as one of the valid linebreaks, PCRE does not skip the \n in a CRLF if the pattern contains specific \r or \n references (since version 7.3).
:The newline option can be altered with external options when a pattern is compiled as well as when it is run. Few applications using PCRE give users the means to apply this external option, so, new in version 7.3, the newline option can also be stated at the start of the pattern itself.
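Several of the behaviours listed above (literal escapes, the \d class, multiline anchors and the dotall option) can be sketched in a few lines. The example below is only an illustration and makes one important assumption: it uses Python's standard `re` module, which is not PCRE, but which follows the same Perl-style conventions for these particular features. The sample strings and variable names are made up for the example, and the PCRE newline options are not covered because `re` has no equivalent setting.

```python
import re

# NOTE: Python's `re` module is used as a stand-in; it is NOT PCRE.
# Only behaviours that `re` shares with Perl-style engines are shown.

text = "first line\nsecond line\nthird line"

# Default mode: ^ anchors at the very start of the string only.
whole_string_starts = re.findall(r"^\w+", text)

# Multiline mode: ^ and $ also anchor at the start and end of each line.
per_line_starts = re.findall(r"^\w+", text, flags=re.MULTILINE)
per_line_ends = re.findall(r"\w+$", text, flags=re.MULTILINE)

# Escaping: a backslash before a non-alphanumeric character makes it literal.
literal_dot = re.findall(r"\.", "a.b")   # matches only the dot itself
any_char = re.findall(r".", "a.b")       # unescaped dot matches each character

# A backslash before an alphanumeric character gives it a special meaning:
# \d is the single-letter digit class.
digits = re.findall(r"\d+", "order 66, aisle 7")

# The dot does not match a newline unless the dotall option (?s) is set.
dot_default = re.findall(r"a.b", "a\nb")
dot_dotall = re.findall(r"(?s)a.b", "a\nb")

print(whole_string_starts, per_line_starts, per_line_ends)
print(literal_dot, any_char, digits)
print(dot_default, dot_dotall)
```

An application embedding PCRE itself would set the corresponding options through the library's C interface rather than through `re` flags; the matching behaviour for these cases should be the same.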