A regular expression (regex) is a sequence of characters that forms a search pattern. Regular expressions are used for text searching, replacing, and validation. In JavaScript, a regular expression literal is written between two forward slashes, as in /pattern/i, where pattern is the pattern to search for and i is an optional modifier flag (here, i means case-insensitive).
Methods
In JavaScript, the three string methods that most commonly use regular expressions are search(), match() and replace().
- search(): Looks for the pattern in a string. Returns the index of the first match, or -1 if not found.
- match(): Looks for the pattern in a string. Returns an array containing the match if found, or null if not found. With the g flag, the array contains every match.
- replace(): Looks for the pattern in a string. Returns a new string with the first match replaced (every match with the g flag), or an unchanged copy if not found.
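The three methods can be compared side by side in a short sketch. The sample text and the variable names are made up for illustration.

```javascript
// Sample text and pattern are illustrative assumptions.
const text = "The quick brown fox";
const pattern = /quick/i;

// search() returns the index of the first match, or -1 if there is none.
const foundAt = text.search(pattern);            // 4
const notFound = text.search(/slow/);            // -1

// match() returns an array describing the match, or null if there is none.
const matched = text.match(pattern);             // matched[0] is "quick"
const noMatch = text.match(/slow/);              // null

// replace() returns a new string; the original string is never modified.
const replaced = text.replace(pattern, "slow");  // "The slow brown fox"
const unchanged = text.replace(/slow/, "fast");  // "The quick brown fox"

console.log(foundAt, notFound, matched[0], noMatch, replaced, unchanged);
```

Because match() returns null when nothing is found, it is usually worth checking the result before indexing into it.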
Alternatives
In a regular expression, alternatives are separated with the vertical bar character |. The pattern matches any one of the alternatives, so /cat|dog/ matches either cat or dog.
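As a minimal sketch of alternatives, the pattern below accepts any of three words. The word list and sample sentences are assumptions chosen for the example.

```javascript
// Matches "cat", "dog", or "bird", whichever appears first in the string.
const animal = /cat|dog|bird/;

// The leftmost match in the string wins, not the first alternative listed.
const firstAnimal = "I have a dog and a cat".match(animal)[0];  // "dog"

// No alternative matches, so match() returns null.
const noAnimal = "I have a fish".match(animal);                 // null

console.log(firstAnimal, noAnimal);
```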
Modifiers (Flags)
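Regular expression flags are optional modifiers, written after the closing slash, that change how a regex pattern is interpreted and applied.
- /g : Performs a global match (find all).
- /i : Performs case-insensitive matching.
- /u : Enables full Unicode support (the pattern is matched against Unicode code points).
- /m : Performs multiline matching (^ and $ also match at the start and end of each line).
- /d : Records the start and end indices of each match and of each captured substring.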
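The effect of each flag can be seen by running the same pattern with and without it. This is a sketch with made-up sample strings; the d flag is assumed to be available, which requires a reasonably recent JavaScript engine.

```javascript
const sample = "Cat, cat, CAT";

// Without flags: only the first case-sensitive match is returned.
const firstOnly = sample.match(/cat/);           // firstOnly[0] is "cat", firstOnly.index is 5

// g: every match. i: ignore case. They can be combined.
const allLower = sample.match(/cat/g);           // ["cat"]
const allAnyCase = sample.match(/cat/gi);        // ["Cat", "cat", "CAT"]

// m: ^ also matches at the start of each line.
const lines = "one\ntwo";
const stringStart = lines.match(/^\w+/g);        // ["one"]
const lineStarts = lines.match(/^\w+/gm);        // ["one", "two"]

// u: the dot matches one whole code point instead of one UTF-16 code unit.
const emoji = "\u{1F600}";
const withoutU = emoji.match(/^.$/);             // null
const withU = emoji.match(/^.$/u) !== null;      // true

// d: the result gains an indices array of [start, end] pairs.
const span = "a cat".match(/cat/d).indices[0];   // [2, 5]

console.log(firstOnly.index, allLower, allAnyCase, stringStart, lineStarts, withoutU, withU, span);
```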
Metacharacters
Metacharacters are characters with a special meaning. They can be used to match digits, words, spaces, and more.
- \d : Matches digits.
- \D : Matches any non-digit.
- \w : Matches word characters: letters, digits, and the underscore "_".
- \W : Matches any character that is not a word character.
- \s : Matches any whitespace character (space, tab, newline).
- \S : Matches any non-whitespace character.
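A short sketch of the metacharacters in use follows. The sample strings are assumptions, and the + quantifier (one or more) and the g flag are used so that whole runs of characters are picked up.

```javascript
const entry = "Order 66 shipped";

// \d and \D: digits and everything that is not a digit.
const digits = entry.match(/\d+/)[0];            // "66"
const digitsKept = entry.replace(/\D/g, "");     // "66"

// \w and \W: word characters and everything else.
const words = entry.match(/\w+/g);               // ["Order", "66", "shipped"]
const wordOnly = "a_b-c!".replace(/\W/g, "");    // "a_bc"

// \s and \S: whitespace and everything else.
const noSpaces = entry.replace(/\s/g, "");       // "Order66shipped"
const chunks = entry.match(/\S+/g);              // ["Order", "66", "shipped"]

console.log(digits, digitsKept, words, wordOnly, noSpaces, chunks);
```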
Character Classes
Character classes are enclosed in square brackets []. A character class matches any single character from the set within the brackets.
- [acd] : Matches any one of the characters 'a', 'c' or 'd'.
- [^xyz] : Matches any character except 'x', 'y' or 'z'. A leading ^ inside the brackets means NOT.
- [a-k] : Matches any character between 'a' and 'k' inclusive.
- [0-9] : Matches any digit character (same as \d).
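The following sketch exercises each kind of character class on made-up sample strings. It also shows that a space written inside the brackets is itself part of the set.

```javascript
// A set: any one of the listed characters.
const inSet = "abcdxyz".match(/[acd]/g);       // ["a", "c", "d"]

// A negated set: any character not listed.
const notInSet = "abcdxyz".match(/[^xyz]/g);   // ["a", "b", "c", "d"]

// A range: both end points are included.
const inRange = "jkl".match(/[a-k]/g);         // ["j", "k"]

// A digit range, equivalent to \d.
const digitChars = "a1b22".match(/[0-9]/g);    // ["1", "2", "2"]

// Spaces inside the brackets are literal members of the set.
const withSpace = "a b".match(/[a c]/g);       // ["a", " "]

console.log(inSet, notInSet, inRange, digitChars, withSpace);
```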
Boundaries
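Anchors and boundaries match positions in a string rather than characters. The dot is listed here as well, although it matches a character:
- ^ : Matches the beginning of a string (or of a line when the m flag is set).
- $ : Matches the end of a string (or of a line when the m flag is set).
- . : Matches any one character except a newline.
- \b : Matches a word boundary, that is, the beginning or the end of a word.
- \B : Matches any position that is not a word boundary.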
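A small sketch makes the difference between these position matchers concrete. The phrase used here is an assumption chosen so that "cat" appears both as a whole word and inside another word.

```javascript
const phrase = "cat concat";

// ^ and $ anchor the pattern to the start and end of the string.
const atStart = phrase.search(/^cat/);           // 0
const atEnd = phrase.search(/cat$/);             // 7

// \b requires a word boundary, so only the standalone word matches.
const wholeWord = phrase.match(/\bcat\b/g);      // ["cat"]

// \B requires the absence of a boundary, so only the "cat" inside "concat" matches.
const insideWord = phrase.search(/\Bcat/);       // 7

// The dot matches any single character except a newline.
const anyChar = "cut cot c\nt".match(/c.t/g);    // ["cut", "cot"]

console.log(atStart, atEnd, wholeWord, insideWord, anyChar);
```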
Quantifiers
Quantifiers define how many times a character or expression must occur. A quantifier applies to the pattern immediately preceding it. Some examples with the letter 'a' as the pattern:
- a+ : One or more 'a' characters in a row.
- a* : Zero or more 'a' characters in a row.
- a? : Zero or one 'a' character.
- a{n} : Exactly n 'a' characters in a row.
- a{n,m} : Between n and m 'a' characters in a row, inclusive.
- a{n,} : n or more 'a' characters in a row.
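Each quantifier can be tried on short strings, as in the sketch below. The sample strings are assumptions; note that quantifiers match as many characters as they are allowed to.

```javascript
// +: one or more, so at least one 'a' is required.
const onePlus = "caaat".match(/ca+t/)[0];        // "caaat"
const onePlusNone = "ct".match(/ca+t/);          // null

// *: zero or more, so the 'a' may be absent.
const zeroPlus = "ct".match(/ca*t/)[0];          // "ct"

// ?: zero or one, so two 'a' characters are too many.
const optional = "cat".match(/ca?t/)[0];         // "cat"
const optionalTooMany = "caat".match(/ca?t/);    // null

// {n}: exactly n.
const exactly = "aaaa".match(/a{3}/)[0];         // "aaa"
const tooFew = "aa".match(/a{3}/);               // null

// {n,m}: between n and m, taking as many as possible.
const between = "aaaa".match(/a{2,3}/)[0];       // "aaa"

// {n,}: n or more.
const atLeast = "aaaa".match(/a{2,}/)[0];        // "aaaa"

console.log(onePlus, onePlusNone, zeroPlus, optional, optionalTooMany, exactly, tooFew, between, atLeast);
```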
Escape Sequences
Some characters like ^, $, \, ., *, +, ?, (, ), [, ], {, }, | have special meanings in regular expressions and need to be escaped with a backslash (\) if you want to match them literally.
For example, /3\.1416/ matches the exact 3.1416 pattern, but /3.1416/ matches 3 followed by any single character and then 1416, so 3f1416 would be a match.
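The same comparison can be run directly, as a minimal sketch. The variable names and the extra "2+2=4" example are assumptions added for illustration.

```javascript
const escapedDot = /3\.1416/;   // the dot must be a literal "."
const plainDot = /3.1416/;      // the dot stands for any single character

// Both patterns find the exact text.
const exactEscaped = "3.1416".search(escapedDot);   // 0
const exactPlain = "3.1416".search(plainDot);       // 0

// Only the unescaped pattern accepts another character in place of the dot.
const looseEscaped = "3f1416".search(escapedDot);   // -1
const loosePlain = "3f1416".search(plainDot);       // 0

// Other special characters are escaped the same way, here a literal plus sign.
const spelledOut = "2+2=4".replace(/\+/, " plus "); // "2 plus 2=4"

console.log(exactEscaped, exactPlain, looseEscaped, loosePlain, spelledOut);
```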