Regex Syntax
See also Language Tour - Regexes
Basics
Literal characters within a pattern are matched as-is, and are case sensitive.
| Pattern | Matches |
| xyz | abcxyz |
| ran | orange |
Quantifiers
A quantifier symbol matches a variable number of repetitions of the character (or group) immediately before it.
| Symbol | Match |
| * | Zero or more times |
| + | One or more times |
| ? | Zero or one time |
| {n} | Exact number of times |
| {n,} | N or more times |
| {n1,n2} | Between n1 and n2 times |
Examples:
| Pattern | Matches |
| ab*z | az, abz, abbbbz |
| ab+z | abz, abbbbz |
| ab?z | az, abz |
| ab{3}z | abbbz |
| ab{2,}z | abbz, abbbbbbbz |
| ab{1,2}z | abz, abbz |
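The quantifier examples above can be checked mechanically. This is a minimal sketch that uses Python's `re` module as a stand-in engine (an assumption; it does not use this page's own `rx'...'` syntax), and the names `cases` and `all_match` are illustrative only.

```python
import re

# Patterns and sample strings taken from the examples table above.
cases = {
    r"ab*z": ["az", "abz", "abbbbz"],
    r"ab+z": ["abz", "abbbbz"],
    r"ab?z": ["az", "abz"],
    r"ab{3}z": ["abbbz"],
    r"ab{2,}z": ["abbz", "abbbbbbbz"],
    r"ab{1,2}z": ["abz", "abbz"],
}

def all_match(pattern, candidates):
    """Return True if the pattern matches every candidate string in full."""
    return all(re.fullmatch(pattern, s) is not None for s in candidates)

results = {pattern: all_match(pattern, strings) for pattern, strings in cases.items()}

# Counter-examples: the bounds of each quantifier are strict.
plus_rejects_zero = re.fullmatch(r"ab+z", "az") is None        # '+' needs at least one 'b'
exact_rejects_two = re.fullmatch(r"ab{3}z", "abbz") is None    # '{3}' needs exactly three
range_rejects_three = re.fullmatch(r"ab{1,2}z", "abbbz") is None  # '{1,2}' allows at most two

print(results)
```

`re.fullmatch` is used so that each sample has to match from start to end, which makes the counter-examples meaningful.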
Meta Characters
Meta-characters let you match characters by their type.
You can also escape special characters using backslash \.
| Symbol | Match |
| . | Any single character |
| \d | Digit [0-9] |
| \D | Non-digit |
| \w | Alphanumeric character or underscore |
| \W | Any character that is not alphanumeric or an underscore |
| \s | Whitespace character |
| \S | Non-whitespace character |
| \n | Newline |
| \ | Escape a special character |
Examples:
| Pattern | Matches |
| a.z | abz, a2z, a z, a#z |
| a\d+z | a1z, a2345z |
| 123\D | 123x, 123! |
| \w+ | apple, c64 |
| \w+\W | hello!, correct. |
| \w+\s\w+ | red fish, abc 123 |
| \d+ \+ \d | 2 + 3 |
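As a rough check of the meta-character examples above, here is a minimal sketch in Python's `re` module (used as a stand-in engine, which is an assumption; the helper `matches` is hypothetical and not part of this page's language).

```python
import re

def matches(pattern, text):
    """Return True if the pattern matches the whole text."""
    return re.fullmatch(pattern, text) is not None

# '.' matches any single character, including a space or '#'.
dot_ok = all(matches(r"a.z", s) for s in ["abz", "a2z", "a z", "a#z"])

# '\d+' matches one or more digits.
digits_ok = matches(r"a\d+z", "a1z") and matches(r"a\d+z", "a2345z")

# '\D' matches a non-digit, so a fourth digit is rejected.
non_digit_ok = matches(r"123\D", "123!") and not matches(r"123\D", "1234")

# A backslash turns the '+' quantifier into a literal plus sign.
escaped_plus_ok = matches(r"\d+ \+ \d", "2 + 3")

# '\w' also accepts the underscore.
underscore_is_word = matches(r"\w+", "snake_case")

# Without a single-line flag, '.' does not match a newline.
dot_skips_newline = not matches(r"a.z", "a\nz")

print(dot_ok, digits_ok, non_digit_ok, escaped_plus_ok)
```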
Character Classes
You can specify a list of characters to match by using a character class [...].
To match all characters that are NOT in the list, start the class with a caret ^.
| Symbol | Match |
| [abc] | Any character that is 'a', 'b', or 'c' |
| [a-z] | Any character between 'a' and 'z' (ASCII order) |
| [^abc] | Any character that is NOT 'a', 'b', or 'c' |
| [+*?.] | Special characters are treated as literals |
| [0-9a-f] | Any character between 0 and 9, or 'a' through 'f' |
Examples:
| Pattern | Matches |
| [tdl]ime | time, dime, lime |
| [a-z]+zzle | fizzle, drizzle, muzzle |
| fla[^w] | flag, flat, fla2 |
| \w+[?!.] | Whoa!, Huh?, okay. |
| [A-Z0-9]+ | C3P0, TRS80, AW3SOM3 |
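The character class rules above can be exercised with a short sketch. It uses Python's `re` module as a stand-in engine (an assumption), and the helper name `matches` is illustrative.

```python
import re

def matches(pattern, text):
    """Return True if the pattern matches the whole text."""
    return re.fullmatch(pattern, text) is not None

# A list class accepts any one of the listed characters.
list_ok = all(matches(r"[tdl]ime", w) for w in ["time", "dime", "lime"])
list_rejects_other = not matches(r"[tdl]ime", "mime")

# A negated class accepts anything except the listed characters.
negated_ok = matches(r"fla[^w]", "flag") and not matches(r"fla[^w]", "flaw")

# Inside a class, '.' is a literal dot rather than "any character".
dot_is_literal = matches(r"[+*?.]", ".") and not matches(r"[+*?.]", "a")

# Ranges can be combined: digits plus the letters 'a' through 'f'.
hex_ok = matches(r"[0-9a-f]+", "c0ffee") and not matches(r"[0-9a-f]+", "coffee")

print(list_ok, negated_ok, dot_is_literal, hex_ok)
```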
Non-Greedy Match
By default, quantifiers slurp in as many characters as possible.
Add a `?` to a quantifier to make it non-greedy.
Examples:
| Pattern | Matches | Without '?' |
| .*?/ | abc/ (in abc/def/xyz) | abc/def/ (in abc/def/xyz) |
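To make the greedy versus non-greedy difference concrete, the sketch below runs both forms of the pattern against the same text. It uses Python's `re` module as a stand-in engine (an assumption), and the variable names are illustrative.

```python
import re

text = "abc/def/xyz"

# Non-greedy: '.*?' stops at the first slash it can reach.
lazy = re.match(r".*?/", text).group(0)

# Greedy: '.*' runs to the last slash in the text.
greedy = re.match(r".*/", text).group(0)

print(lazy)    # abc/
print(greedy)  # abc/def/
```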
Unicode Characters
To match a specific Unicode code point, use \x{number}.
To match a built-in Unicode character class (see below), use \p{...}.
To match characters that are not in the character class, use uppercase \P{...}.
For more information, see this Unicode Regex Reference.
| Code | Matches |
| \x{1234} | Unicode code point U+1234 |
| \p{L} | Any kind of letter from any language. |
| \p{Z} | Any kind of whitespace or invisible separator. |
| \p{N} | Any kind of numeric character in any script. |
| \p{P} | Any kind of punctuation character. |
Anchor Symbols
Anchor symbols are not characters, but represent positions within the string (between characters).
| Symbol | Match |
| ^ | Beginning of string |
| $ | End of string |
| \b | Word boundary |
| \B | Not a word boundary |
Examples:
| Pattern | Matches | Does Not Match |
| ^gr | great, green grape | agriculture |
| ine$ | this is fine | is this fine? |
| s+\b | less is more, hissss | finesse |
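The anchor examples above can be sketched as follows, using Python's `re` module as a stand-in engine (an assumption). The helper `found` is hypothetical. It searches anywhere in the text, so only the anchors restrict where a match may occur.

```python
import re

def found(pattern, text):
    """Return True if the pattern matches anywhere in the text."""
    return re.search(pattern, text) is not None

# '^' pins the match to the beginning of the string.
start_ok = found(r"^gr", "green grape") and not found(r"^gr", "agriculture")

# '$' pins the match to the end of the string.
end_ok = found(r"ine$", "this is fine") and not found(r"ine$", "is this fine?")

# '\b' requires a word boundary right after the run of 's' characters.
boundary_ok = found(r"s+\b", "less is more") and not found(r"s+\b", "finesse")

# '\B' requires that there is NOT a word boundary at that position.
non_boundary_ok = found(r"\Bran", "orange") and not found(r"\Bran", "ran")

print(start_ok, end_ok, boundary_ok, non_boundary_ok)
```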
Match Groups
You can group together subpatterns with parens (...).
This allows you to do two things:
- Capture the inner match for later use
- Provide "OR" logic via the | separator
| Symbol | Match |
| (\w+) | Capture word match |
| (ab|cd) | Capture 'ab' or 'cd' |
| (ab|cd|xy) | Capture 'ab' or 'cd' or 'xy' |
| (abc|[0-9]*) | Capture 'abc' or any number of digits (including none) |
Example:
$text = 'Product: Tomatoes, Count: 33'
$pattern = rx'Product: (\w+), Count: (\d+|none)'
$matches = $text.match($pattern)
print($matches[1])
//= 'Tomatoes'
print($matches[2])
//= '33'
Modifiers
Modifiers are flags appended to the end of a regex string that change the overall behavior of the pattern.
| Suffix | Name | Effect |
| i | Ignore Case | Case-insensitive match. |
| m | Multiline | ^ and $ match start/end of a line, not the full string. |
| s | Single Line | Dot . also matches newlines. |
| x | Extended | Ignore literal spaces between pattern symbols, so complex patterns can be spaced out for clarity. |
$poem = '''
Roses are Red
Violets are Blue
'''
$poem.contains(rx'rose'i)
//= true
$poem.contains(rx'red$'mi)
//= true (the 'm' flag lets $ match at the end of a line)
$poem.contains(rx'red.*blue'si)
//= true (the 's' flag lets . match across lines)
$line = 'ID:ABC-123'
$line.match(rx'\w+ : (\w+ - \d+)'x)
//= { 1: 'ABC-123' }