Search Appliance SBE
The characters "\=+*?{},[]^$.-!" are special and must be escaped with
a "\" if they are meant to be taken literally. The string ">>" is also
special and if it is to be matched, it should be written "\>>". Not
all of these characters are special all the time; if an entire string
is to be escaped so it will be interpreted literally, only the
characters "\=?+*{[^$.!>" need be escaped.

"\" followed by an "R" or an "I" means to begin respecting or ignoring
alphabetic case distinction. (Ignoring case is the default.) These
switches stay in effect until the end of the subexpression. They do
not apply to characters inside range brackets.

"\" followed by an "L" indicates that the characters following are to
be taken literally, case-sensitive, up to the next "\L". The purpose
of this operation is to remove the special meanings from characters.

"\F" (followed by) or "\P" (preceded by) can be used to root the rest
of an expression to which it is tied. It means to look for the rest of
the expression "as long as followed by ..." or "as long as preceded
by ..." the subexpression following the \F or \P.
Subexpressions before and including one with \P, and subexpressions
after and including one with \F, will be considered excluded from the
located expression itself.

"\" followed by one of the following C language character classes
matches any character in that class: alpha, upper, lower, digit,
xdigit, alnum, space, punct, print, graph, cntrl, ascii. Note that the
definition of these classes may be affected by the current locale.

"\" followed by one of the following special characters will assume
the following meaning: n=newline, t=tab, v=vertical tab, b=backspace,
r=carriage return, f=form feed, 0=the null character.

"\" followed by Xn or Xnn, where n is a hexadecimal digit, will match
that character.

"\" followed by any single character (not one of the above) matches
that character. Escaping a character that is not a special escape is
not recommended, as the expression could change meaning if the
character becomes an escape in a future release.

"^" placed anywhere in an expression (except after a "[") matches the
beginning of a line (same as \x0A in Unix or \x0D\x0A in Windows).

"$" placed anywhere in an expression matches the end of a line (\x0A
in Unix, \x0D\x0A in Windows).
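
The escaping rule stated above (only the characters "\=?+*{[^$.!>"
need be escaped when an entire string is to be interpreted literally)
can be sketched as a small helper. This is an illustrative Python
sketch, not part of REX: the name rex_escape is hypothetical, and it
escapes each ">" individually rather than writing "\>>".

```python
# Characters the text lists as needing a "\" when a whole string is
# to be interpreted literally.
REX_LITERAL_SPECIALS = set("\\=?+*{[^$.!>")


def rex_escape(text: str) -> str:
    """Prefix each listed special character with a backslash.

    Hypothetical helper for illustration; REX itself provides "\\L"
    for taking a run of characters literally.
    """
    return "".join(
        "\\" + ch if ch in REX_LITERAL_SPECIALS else ch for ch in text
    )


print(rex_escape("a+b=c"))  # a\+b\=c
print(rex_escape("x>>y"))   # x\>\>y
```

Characters such as "," or "-" are left alone here because the text
does not list them among those that must be escaped in a fully literal
string.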
Note: The beginning of line ("^") and end of line ("$") notation
expressions for Windows are both identified as a 2-character notation;
i.e., REX under Windows matches "\x0D\x0A" (carriage return, line
feed) as beginning and end of line, rather than "\x0A" as beginning,
and "\x0D" as end.

"." matches any character.

A string enclosed in square brackets ("[]") is a set, and matches any
single character from the string. Ranges of ASCII character codes may
be abbreviated with a dash, as in "[a-z]" or "[0-9]". A "^" occurring
as the first character of the set will invert the meaning of the set,
i.e. any character not in the set will match instead.
A literal "-" must be preceded by a "\". The case of alphabetic
characters is always respected within brackets.
A double-dash ("--") may be used inside a bracketed set to subtract
characters from the set; e.g. "[\alpha--x]" for all alphabetic
characters except "x". The left-hand side of a set subtraction must be
a range, character class, or another set subtraction. The right-hand
side of a set subtraction must be a range, character class, or a
single character. Set subtraction groups left-to-right. The range
operator "-" has precedence over set subtraction.
Set subtraction was added in Texis version 6.
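
The grouping rules for set subtraction can be modelled with ordinary
Python sets. This is only a sketch of the semantics described above,
not REX code: the helper names are hypothetical, "\alpha" is assumed
to mean the ASCII letters (the real class may vary with locale), and
the second pattern is an example constructed from the stated rules.

```python
import string
from functools import reduce


def char_range(first: str, last: str) -> set:
    """Expand a range such as a-z into the set of characters it covers."""
    return {chr(code) for code in range(ord(first), ord(last) + 1)}


def subtract(*operands: set) -> set:
    """Apply set subtraction left-to-right: ((A -- B) -- C) ..."""
    return reduce(lambda left, right: left - right, operands)


# Assumption: \alpha is modelled as the ASCII letters only.
ALPHA = set(string.ascii_letters)

# Model of [\alpha--x]: every letter except lowercase "x"
# (case is always respected within brackets, so "X" stays).
no_x = subtract(ALPHA, {"x"})

# Model of [a-z--m-p--x]: the ranges are expanded first, then the
# subtractions group left-to-right.
lower_trimmed = subtract(char_range("a", "z"), char_range("m", "p"), {"x"})

print("x" in no_x, "X" in no_x)        # False True
print("".join(sorted(lower_trimmed)))  # abcdefghijklqrstuvwyz
```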

The ">>" operator in the first position of a fixed expression will
force REX to use that expression as the "root" expression off which
the other fixed expressions are matched. This operator overrides one
of the optimizers in REX. This operator can be quite handy if you are
trying to match an expression with a "!" operator or if you are
matching an item that is surrounded by other items. For example:
"x+>>y+z+" would force REX to find the "y"s first then go backwards
and forwards for the leading "x"s and trailing "z"s.

An empty expression such as "=" (i.e. 1 occurrence of nothing) is
meaningless. However, if such an empty expression is the first or last
in the list, and is the root expression (i.e. contains ">>"), it will
constrain the whole expression list to only match at the start or end
of the buffer. For example: ">>=first" would only match the string
"first" if it occurs at the start of the search buffer. Similarly,
"last=>>=" would only match "last" at the end of the buffer.

The "!" character in the first position of an expression means that it
is not to match the following fixed expression. For example:
"start=!finish+" would match the word "start" and anything past it up
to (but not including) the word "finish". Usually operations involving
the NOT operator involve knowing what direction the pattern is being
matched in. In these cases the ">>" operator comes in handy. If the
">>" operator is used, it comes before the "!". For example:
">>start=!finish+finish" would match anything that began with "start"
and ended with "finish".
The NOT operator cannot be used by itself in an expression, or as the
root expression in a compound expression.
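
For readers more familiar with Perl-style regular expressions, the two
NOT examples above can be loosely approximated with a negative
lookahead. This is a rough analogue written for Python's re module,
which is a different engine from REX; the translation of "!finish+"
into a lookahead loop is an assumption made for illustration only.

```python
import re

# Rough Python analogue (an assumption, not REX itself):
#   "!finish+"  ~  one or more characters, none of which begins "finish"
# IGNORECASE mirrors REX ignoring case by default; DOTALL lets "."
# cross line boundaries.
FLAGS = re.IGNORECASE | re.DOTALL
NOT_FINISH = r"(?:(?!finish).)+"

# ~ start=!finish+
up_to_finish = re.compile("start" + NOT_FINISH, FLAGS)
# ~ >>start=!finish+finish
through_finish = re.compile("start" + NOT_FINISH + "finish", FLAGS)

text = "Please START the job and finish now"
print(repr(up_to_finish.search(text).group()))    # 'START the job and '
print(repr(through_finish.search(text).group()))  # 'START the job and finish'
```

The first pattern stops just before "finish"; the second includes it,
matching the behaviour the two REX examples describe.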
Note that "!" expressions match a character at a time, so their
repetition operators count characters, not expression-lengths as with
normal expressions. E.g. "!finish{2,4}" matches 2 to 4 characters,
whereas "finish{2,4}" matches 2 to 4 times the length of "finish".
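
The difference in what the repetition operator counts can be checked
with a little arithmetic. The sketch below again uses Python's re
module as a stand-in, with a negative lookahead as an assumed analogue
of the "!" operator; it is not REX syntax.

```python
import re

# Normal expression: {2,4} repeats the whole expression "finish".
normal = re.compile(r"(?:finish){2,4}")
# NOT analogue: {2,4} counts single characters, each one checked not
# to begin the string "finish".
negated = re.compile(r"(?:(?!finish).){2,4}")

unit = len("finish")
print(2 * unit, 4 * unit)                         # 12 24
print(len(normal.search("finish" * 5).group()))   # 24
print(len(negated.search("abcdefgh").group()))    # 4
```

So the normal form spans 12 to 24 characters, while the negated form
spans only 2 to 4.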