Man page - regexp(7)
NAME
regexp - Plan 9 regular expression notation
DESCRIPTION
This manual page describes the regular expression syntax used by the Plan 9 regular expression library regexp(3). It is the form used by egrep(1) before egrep got complicated.
A regular expression specifies a set of strings of characters. A member of this set of strings is said to be matched by the regular expression. In many applications a delimiter character, commonly /, bounds a regular expression. In the following specification for regular expressions the word 'character' means any character (rune) but newline.
The syntax for a regular expression e0 is
    e3:  literal | charclass | '.' | '^' | '$' | '(' e0 ')'
    e2:  e3
      |  e2 REP
    REP: '*' | '+' | '?'
    e1:  e2
      |  e1 e2
    e0:  e1
      |  e0 '|' e1
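The grammar encodes precedence: repetition binds tighter than concatenation, which binds tighter than alternation. The following Python sketch is a minimal recursive-descent parser with one function per nonterminal. It is illustrative only and is not the parser in regexp(3); the function name parse and the tuple shapes are hypothetical, character classes are omitted, and a backslash is assumed to make any following character a literal.

```python
# Minimal sketch of the e0..e3 grammar. Assumptions: no character classes,
# and a backslash turns the following character into a literal.
META = set(".*+?[]()|\\^$")


def parse(pattern):
    """Return a nested-tuple syntax tree for pattern, or raise ValueError."""
    pos = 0

    def peek():
        return pattern[pos] if pos < len(pattern) else None

    def e3():
        # e3: literal | '.' | '^' | '$' | '(' e0 ')'
        nonlocal pos
        ch = peek()
        if ch is None:
            raise ValueError("unexpected end of pattern")
        if ch == "\\":
            if pos + 1 >= len(pattern):
                raise ValueError("trailing backslash")
            pos += 2
            return ("lit", pattern[pos - 1])
        if ch == "(":
            pos += 1
            node = e0()
            if peek() != ")":
                raise ValueError("missing ')'")
            pos += 1
            return node
        if ch in ".^$":
            pos += 1
            return ("op", ch)
        if ch in META:
            raise ValueError(f"unexpected {ch!r} at offset {pos}")
        pos += 1
        return ("lit", ch)

    def e2():
        # e2: e3 | e2 REP
        nonlocal pos
        node = e3()
        while peek() is not None and peek() in "*+?":
            node = ("rep", pattern[pos], node)
            pos += 1
        return node

    def e1():
        # e1: e2 | e1 e2
        node = e2()
        while peek() is not None and peek() not in "|)":
            node = ("cat", node, e2())
        return node

    def e0():
        # e0: e1 | e0 '|' e1
        nonlocal pos
        node = e1()
        while peek() == "|":
            pos += 1
            node = ("alt", node, e1())
        return node

    tree = e0()
    if pos != len(pattern):
        raise ValueError(f"unexpected {pattern[pos]!r} at offset {pos}")
    return tree
```

With this sketch, parse("ab|c*") groups as an alternation whose left branch is the concatenation of a and b and whose right branch is c under the * operator, which is the grouping the grammar prescribes.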
A literal is any non-metacharacter, or a metacharacter (one of .*+?[]()|\^$), or the delimiter preceded by \.
A charclass is a nonempty string s bracketed [s] (or [^s]); it matches any character in (or not in) s. A negated character class never matches newline. A substring a-b, with a and b in ascending order, stands for the inclusive range of characters between a and b. In s, the metacharacters -, ], an initial ^, and the regular expression delimiter must be preceded by a \; other metacharacters have no special meaning and may appear unescaped.
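The character class rules can be sketched as a small membership test. This Python function is a hypothetical illustration, not code from regexp(3): the name in_charclass and its signature are assumptions, s is the text between the brackets without any leading negation marker, and an unescaped hyphen at the start or end of s is simply treated as a literal.

```python
def in_charclass(s, ch, negated=False):
    """Return True if the single character ch matches [s], or the negated class."""
    if not s:
        raise ValueError("a character class must be nonempty")
    if negated and ch == "\n":
        return False  # a negated class never matches newline

    # Resolve backslash escapes, remembering which characters were escaped so
    # that an escaped '-' is a literal rather than a range operator.
    items = []
    i = 0
    while i < len(s):
        if s[i] == "\\" and i + 1 < len(s):
            items.append((s[i + 1], True))
            i += 2
        else:
            items.append((s[i], False))
            i += 1

    found = False
    j = 0
    while j < len(items):
        lo = items[j][0]
        if j + 2 < len(items) and items[j + 1] == ("-", False):
            # a-b stands for the inclusive range between a and b
            hi = items[j + 2][0]
            if lo > hi:
                raise ValueError(f"range {lo}-{hi} is not in ascending order")
            found = found or lo <= ch <= hi
            j += 3
        else:
            found = found or ch == lo
            j += 1
    return found != negated
```

For example, in_charclass("0-9a-f", "c") is true, while in_charclass("a-z", "\n", negated=True) is false because of the newline rule.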
A . matches any character.
A ^ matches the beginning of a line; $ matches the end of the line.
The REP operators match zero or more (*), one or more (+), or zero or one (?) instances, respectively, of the preceding regular expression e2.
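The behavior of the any-character operator, the line anchors, and the REP operators can be tried out with Python's re module as a stand-in. This is an assumption made for illustration: re is not the Plan 9 library, but it agrees with the descriptions above for these simple patterns. The MULTILINE flag is used so that the anchors apply per line, and the helper name matches is hypothetical.

```python
import re


def matches(pattern, s):
    """True if pattern matches the whole of s (hypothetical helper)."""
    return re.fullmatch(pattern, s) is not None


# The any-character operator matches any single character except newline.
dot_plain = matches(r"a.c", "abc")
dot_newline = matches(r"a.c", "a\nc")

# Line anchors: first and last character of every line.
text = "one\ntwo\nsix"
line_starts = re.findall(r"^.", text, re.MULTILINE)
line_ends = re.findall(r".$", text, re.MULTILINE)

# REP operators applied to the literal b, tried against ac, abc, and abbc.
rep_results = {
    p: [matches(p, s) for s in ("ac", "abc", "abbc")]
    for p in ("ab*c", "ab+c", "ab?c")
}
```

Here * accepts all three strings, + rejects only ac, and ? rejects only abbc.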
A concatenated regular expression, e1e2, matches a match to e1 followed by a match to e2.
An alternative regular expression, e0|e1, matches either a match to e0 or a match to e1.
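A short Python sketch, again using the re module as an assumed stand-in for the Plan 9 library, shows how concatenation and alternation combine, and how parentheses change the grouping. The helper name matches is hypothetical.

```python
import re


def matches(pattern, s):
    """True if pattern matches the whole of s (hypothetical helper)."""
    return re.fullmatch(pattern, s) is not None


# Concatenation binds tighter than alternation: ab|cd means (ab)|(cd).
flat = {s: matches(r"ab|cd", s) for s in ("ab", "cd", "abd", "acd")}

# Parentheses restrict the alternation to the middle character.
grouped = {s: matches(r"a(b|c)d", s) for s in ("ab", "cd", "abd", "acd")}
```

The ungrouped pattern accepts only ab and cd, whereas the parenthesized pattern accepts only abd and acd.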
A match to any part of a regular expression extends as far as possible without preventing a match to the remainder of the regular expression.
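The rule that each part extends as far as possible, without preventing the rest from matching, can be observed with Python's re module. This is a sketch under the assumption that re behaves like the Plan 9 library for these patterns; the two engines are not guaranteed to agree in general, particularly for alternation, so the example uses repetition only.

```python
import re

# On its own, a* takes every leading a.
alone = re.match(r"a*", "aaab").group()

# Followed by ab, a* must leave one a behind so the remainder can match.
first, second = re.match(r"(a*)(ab)", "aaab").groups()

# .* takes as much as it can while still leaving one b for b+.
head, tail = re.match(r"(.*)(b+)", "abbb").groups()
```

In the second match a* yields aa rather than aaa, because taking the third a would leave nothing for ab to match.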
SEE ALSO
regexp(3)