The three parts of Regex
Regex stands for ‘Regular expressions’, which are expressions in programming that you can use to match parts of strings. The snippet that you are ‘looking for’ is the Regex.
I like to begin understanding code by thinking about the main components that are involved. These will be explained further below, but they help understand what we are dealing with.
To look for a pattern (regex) inside of a string (or strings), you need to have:
1. A regex. The regex is the pattern you are looking for: the words, letters, or numbers you want to find inside one or several strings. It is often stored in a variable.
2. A string (or strings) you want to look for this pattern within
3. A method: You might want to find matches for different reasons, and this influences which method to use. For example, the .test() method returns true or false, whereas .match() returns the matches themselves.
2 methods you use for Regex
JavaScript has multiple ways to use regexes – the FreeCodeCamp course introduces two main methods: test and match, explained below.
.test()
The .test() method takes the regex, applies it to a string (which is placed inside the parentheses), and returns true or false.
So it’s useful when you want to know whether there is a match.
/regex/.test('string');
Example:
let testStr = "Hello, my name is Kevin.";
let testRegex = /Kevin/;
testRegex.test(testStr);
//This would return true, because there is a "Kevin" inside the testStr
.match()
The match method is useful when you want to extract the matches found. The match function checks if there is a match, and returns the matches found.
To use the .match() method, apply the method on a string and pass in the regex inside the parentheses.
'string'.match(/regex/);
Example:
"Hello, World!".match(/Hello/);
let ourStr = "Regular expressions";
let ourRegex = /expressions/;
ourStr.match(ourRegex);
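To make the return value of .match() more concrete, here is a small sketch. The variable names found and missing are made up for this illustration; the behaviour shown (an array on success, null on failure) is standard JavaScript.

```javascript
const ourStr = "Regular expressions";

// A successful match returns an array-like result.
const found = ourStr.match(/expressions/);
console.log(found[0]);    // "expressions" (the text that matched)
console.log(found.index); // 8 (where the match starts in the string)

// No match at all returns null, not an empty array.
const missing = ourStr.match(/Kevin/);
console.log(missing);     // null
```

Because a failed match is null, it is usually safest to check the result before reading found[0].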
How to make the Regex find what you’re looking for: Syntax variations
The regex is the ‘pattern’ you are looking for within strings. There are many ways you can write your Regex in order to find different things. This section contains a summary of the ones introduced in the FreeCodeCamp JavaScript introduction to Regex.
/pattern/
Find X inside of a string
In JavaScript, adding / on each side of your regex means you are looking for that exact text inside the string(s) you’re checking.
This option does take capitalisation into account, so if your regex is /myDesiredWord/, it will not be able to find MYdesiredWORD inside the string you’re considering.
let myRegex = /the/
/a|b|c/
Look for either X OR Y in the string
You can add | (pipe symbol) within the / / of your regex to look for several words at the same time. It functions as an OR operator. This means that it’s a match if it finds ANY of the words inside, even if it’s just one of them.
let myRegex = /yes|no|maybe/
For example, if a string contained only green, it would still match the lookForTwoRegex below, which lists both blue and green, even though there is no blue in the string.
let lookForTwoRegex = /blue|green/
let lookForSeveralWordsRegex = /yes|no|maybe/
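As a quick sketch of the OR behaviour, the test sentences below are invented for illustration:

```javascript
const lookForTwoRegex = /blue|green/;

// One of the alternatives is enough for a match.
const onlyGreen = lookForTwoRegex.test("The grass is green");
const onlyBlue = lookForTwoRegex.test("The sky is blue");
// Neither word appears here, so there is no match.
const neither = lookForTwoRegex.test("The sun is yellow");

console.log(onlyGreen, onlyBlue, neither); // true true false
```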
/pAtTeRn/i
Search for a match without caring about capitalisation
By adding an i after your regex slashes, you will be able to find matches even when there are differences in capitalisation. You can remember it by thinking that it stands for ‘ignore capitalisation differences’!
let ignoreCapitalisationRegex = /PoRk/i
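A minimal comparison of a case-sensitive regex and one with the i flag. The menu string and the strictRegex name are made up for this sketch.

```javascript
const strictRegex = /pork/;                // case-sensitive
const ignoreCapitalisationRegex = /PoRk/i; // i flag: capitalisation is ignored
const menu = "Today: PORK chops";

const strictResult = strictRegex.test(menu);
const relaxedResult = ignoreCapitalisationRegex.test(menu);
// .match() still returns the text exactly as it appears in the string.
const matchedText = menu.match(ignoreCapitalisationRegex)[0];

console.log(strictResult);  // false
console.log(relaxedResult); // true
console.log(matchedText);   // "PORK"
```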
/repeat/g
Don’t stop at the first match: find all of the instances!
Want to extract every single instance a word occurs? Write a g at the end of your regex to say you want ALL of the matches returned.
Notably, this does not change the result of a single .test() call, because it only returns true or false, so the outcome is the same regardless of how many repetitions there are. (A regex with g does remember where its last match ended, so calling .test() several times on the same regex object can give different answers.) However, when you use .match() you might want to extract every single instance of the match, and this is where g becomes useful.
Note that you can also combine this with other end-letters, including ‘i’, as shown in the second example below.
let findMultipleInstances = /phrase/g
let findMultipleInstancesAnyCase = /pHrAsE/gi
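Here is a small sketch of what .match() returns with and without g. The sample sentence is invented for illustration.

```javascript
const text = "One phrase, another Phrase, and a final phrase.";

// Without g: only the first match (plus details such as its index).
const first = text.match(/phrase/);
// With g: every case-sensitive match.
const all = text.match(/phrase/g);
// With g and i combined: every match, ignoring capitalisation.
const allAnyCase = text.match(/pHrAsE/gi);

console.log(first[0]);   // "phrase"
console.log(all);        // ["phrase", "phrase"]
console.log(allAnyCase); // ["phrase", "Phrase", "phrase"]
```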
/./
Use . to match any character
Want to find a number of similar words? Perhaps find words despite misspellings? A . will match any single character (except a line break).
let findSimilarWords = /hu./g
//could match with both hug, hum, hua…
let findSimilarWordsUn = /.un/g
//could match with both fun, run, sun…
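To see the dot at work, here is a short sketch with a made-up sentence:

```javascript
const sentence = "We hug, we hum, we run for fun in the sun.";

// "hu" followed by any one character
const huWords = sentence.match(/hu./g);
// any one character followed by "un"
const unWords = sentence.match(/.un/g);

console.log(huWords); // ["hug", "hum"]
console.log(unWords); // ["run", "fun", "sun"]
```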
/[abc]/
Create a regex that looks for several alternatives, but not as many as “.”
While a dot lets you match any character, you can decide to accept only a few specific characters instead.
Just add the alternatives you are looking for inside of brackets [].
let matchSomeLetters = /b[aiu]g/
//this example from FCC would be a match with bag, big, and bug – but not with bog.
Restrict open matches to letters, numbers, or both
You could, of course, write this manually as well by placing all possible options inside brackets. However, the - (hyphen) allows you to pick a range.
This is especially useful if you are looking for any match, but not spaces or signs.
let matchRangeOfLetters = /b[a-e]g/
//this would match a b, followed by one letter between a and e in the alphabet, followed by a g.
let matchLettersAndNumbers = /b[a-z0-9]g/
//here the middle character can be any lowercase letter or number, but not weird signs.
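The bracket variations can be compared side by side in a short sketch. The variable names and test words are chosen for this illustration only.

```javascript
const listRegex = /b[aiu]g/;       // middle letter must be a, i, or u
const rangeRegex = /b[a-e]g/;      // middle letter must be between a and e
const mixedRegex = /b[a-z0-9]g/;   // middle character: lowercase letter or digit

const results = {
  big: listRegex.test("big"),          // true
  bog: listRegex.test("bog"),          // false: o is not in the list
  beg: rangeRegex.test("beg"),         // true
  bug: rangeRegex.test("bug"),         // false: u is outside a-e
  b4g: mixedRegex.test("b4g"),         // true
  bBang: mixedRegex.test("b!g"),       // false: ! is neither letter nor digit
};

console.log(results);
```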
/\w/
Restrict open matches to letters, numbers, or both… in a simpler way.
There is a shortcut for [A-Za-z0-9_]: just write \w instead!
Note that this also, for some reason, includes underscore.
On its own, \w matches a single character. Add a + (\w+) if you want to match a whole run of these characters, such as a complete word 🙂
let shortHand = /\w/;
Avoid all letters, numbers, or both… in a simpler way.
A capitalised \W does the opposite of its lower case pal. It is shorthand for [^A-Za-z0-9_]
let shortHand = /\W/;
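A brief sketch of \w and \W on a made-up string, which also shows that the underscore counts as a “word” character:

```javascript
const label = "snake_case rocks!";

// \w+ grabs each run of letters, digits, and underscores.
const wordRuns = label.match(/\w+/g);
// \W picks out every single character that is NOT one of those.
const nonWordChars = label.match(/\W/g);

console.log(wordRuns);     // ["snake_case", "rocks"]
console.log(nonWordChars); // [" ", "!"]
```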
/\d/
Match numbers only
\d is shorthand for [0-9]. Conversely, like \w, its capitalised brother \D does the opposite: it is shorthand for [^0-9], which matches anything that is not a number.
let onlyNumbers = /\d/;
let noNumbers = /\D/;
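For example, with an invented order string, the two shorthands split the text into its number and non-number parts:

```javascript
const order = "3 apples and 12 pears";

// \d+ matches each run of digits.
const numbers = order.match(/\d+/g);
// \D+ matches each run of everything else (letters and spaces here).
const notNumbers = order.match(/\D+/g);

console.log(numbers);    // ["3", "12"]
console.log(notNumbers); // [" apples and ", " pears"]
```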
/[^a]/
Include information about what you do NOT want matches for
Sometimes it’s easier to exclude the one or few things you do not want, instead of stating every possible option you do want. Add ^ after the opening bracket. Remember that this should happen inside the brackets!
let noLettersRegExp = /[^a-z]/i;
//This would match any character that’s not a letter of the alphabet
let noVowelsOrNumbersRegExp = /[^aeiou0-9]/;
//This would match any character that is neither a number nor a (lowercase) vowel.
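A short sketch of negated brackets in action. The sample string is made up, and the g flag is added here only so that every excluded-character match is listed.

```javascript
const code = "a1b2-e3";

// Everything that is not a letter (i makes this cover capitals too).
const notLetters = code.match(/[^a-z]/gi);
// Everything that is neither a lowercase vowel nor a digit.
const notVowelOrDigit = code.match(/[^aeiou0-9]/g);

console.log(notLetters);      // ["1", "2", "-", "3"]
console.log(notVowelOrDigit); // ["b", "-"]
```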
/go+gle/
Find consecutive instances of a character (1 or more)
The + allows you to look for one or more consecutive instances of a character.
This allows you to use one regex to find “google”, “gogle”, or “gooooooooogle”.
let myRegex = /go+gle/gi;
//would find matches with both “gogle”, “gooooogle” or “google”
/go*gle/
Get a match even if the character occurs 0 times
In JavaScript, the asterisk * is used in regular expressions (regex) as a special character to match zero or more occurrences of the preceding character or pattern. It allows you to specify that a particular character or pattern can repeat zero or more times in the text you’re searching for a match.
let myRegex = /go*gle/gi;
//also matches “ggle”, where there are zero o’s, so it is a bit more flexible than +!
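The difference between + and * only shows up when the character is missing entirely. A small sketch follows; the g flag is left out here because it is not needed for .test().

```javascript
const oneOrMore = /go+gle/i;  // at least one o
const zeroOrMore = /go*gle/i; // any number of o's, including none

const plusResults = [
  oneOrMore.test("gogle"),     // true
  oneOrMore.test("Gooooogle"), // true
  oneOrMore.test("ggle"),      // false: + needs at least one o
];
const starResults = [
  zeroOrMore.test("Gooooogle"), // true
  zeroOrMore.test("ggle"),      // true: zero o's is fine for *
  zeroOrMore.test("gargle"),    // false: other letters still break the pattern
];

console.log(plusResults, starResults);
```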
/go{3,5}gle/
Find a specific number of characters
If you want to match the regex only if the character in the string occurs a certain number of times, you can add this range after the character within curly braces, separated by a comma.
The curly braces say that the character before them is a match if the number of times it occurs is between the first and second number (inclusive).
let A4 = "aaaah";
let A2 = "aah";
let multipleA = /a{3,5}h/;
multipleA.test(A4);
multipleA.test(A2);
//The first test call would return true, while the second would return false.
Establish only the upper or lower limits of consecutive characters
Building on the example above, you can set only one of the limits.
To set only the lower limit, leave out the maximum but keep the comma. Leaving out the minimum does not work the same way: JavaScript does not read {,6} as a quantifier and looks for the literal characters {,6} instead. To set only an upper limit, write 0 as the minimum.
let atLeastThreeA = /ha{3,}h/;
//will match if there are at least 3 ‘a’
let atMostSixA = /ha{0,6}h/;
//will match if there are at most 6 ‘a’ (including none)
Decide the exact amount of times a character can occur
With a number inside curly braces and no comma, you say that the letter in front of it is only a match if it occurs this exact number of times.
let exactlySixA = /ha{6}h/;
//will match if there are exactly 6 ‘a’
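The curly-brace variations can be checked with a small helper. The hah() function is a made-up convenience for this sketch that builds “h”, a given number of “a”s, and a final “h”. The upper limit is written as {0,6} here, since a quantifier without a minimum is treated as literal text in a regex without the u flag.

```javascript
// Hypothetical helper: hah(3) gives "haaah".
const hah = (count) => "h" + "a".repeat(count) + "h";

const atLeastThreeA = /ha{3,}h/;
const atMostSixA = /ha{0,6}h/;
const exactlySixA = /ha{6}h/;

const lower = [atLeastThreeA.test(hah(2)), atLeastThreeA.test(hah(3)), atLeastThreeA.test(hah(10))];
const upper = [atMostSixA.test(hah(0)), atMostSixA.test(hah(6)), atMostSixA.test(hah(7))];
const exact = [exactlySixA.test(hah(5)), exactlySixA.test(hah(6)), exactlySixA.test(hah(7))];

console.log(lower); // [false, true, true]
console.log(upper); // [true, true, false]
console.log(exact); // [false, true, false]

// Pitfall: without a minimum, the braces are matched as plain characters.
const literalBraces = /ha{,6}h/;
console.log(literalBraces.test(hah(3)));    // false
console.log(literalBraces.test("ha{,6}h")); // true
```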
/H.*?!/
Lazy matching: Find the shortest phrase in the string that matches your regex pattern
Regexes are by default “greedy”. This means that when using .match() (without g), they will look for the longest match possible in the string and return that. You can make a quantifier such as * or + “lazy” by adding ? right after it.
For example, there might be several “matches” to the regex pattern /H.*!/, which starts with a capital H, continues with any characters, and ends with !.
If you have a string like: “Hey! I love you!” – a .match() with the regex above would return the WHOLE string – because it is greedy.
However, you can make it look for the shortest way to match the regex by adding a ?, like this: /H.*?!/
By adding the question mark, the .match() would return “Hey!” because it also fulfils the pattern – but is shorter.
let lazyRegex = /H.*?!/;
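A minimal sketch of the greedy versus lazy difference. It uses the any-character dot so that the greedy version is able to run across the spaces and the first exclamation mark; the variable names are made up.

```javascript
const greeting = "Hey! I love you!";

// Greedy: .* grabs as much as it can, so the match ends at the LAST "!".
const greedyMatch = greeting.match(/H.*!/)[0];
// Lazy: .*? grabs as little as it can, so the match ends at the FIRST "!".
const lazyMatch = greeting.match(/H.*?!/)[0];

console.log(greedyMatch); // "Hey! I love you!"
console.log(lazyMatch);   // "Hey!"
```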
/^My/
Look for patterns at the beginning of your string
You can use ^ to find matches that are at the very beginning of a string.
Note that this is the same symbol that is used when you are matching any character except specific ones – but in that case, it’s used inside brackets like this: /p[^i]e/ (this would match a p and an e with any single character between them, except i, so not pie). When used outside of brackets, the same symbol has a different meaning.
let firstRegex = /^Ricky/;
//would find a match if the string STARTS with Ricky 🙂
/end$/
Look for patterns at the end of a string
Conversely to the example above, a $ at the end of your regex would only find the match if it is at the end of the string.
let storyRegex = /story$/;
//would find a match if ‘story’ is at the END of the string.
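Both anchors can be tried out in one short sketch. The test sentences are invented for illustration.

```javascript
const firstRegex = /^Ricky/;
const storyRegex = /story$/;

const startsWith = [
  firstRegex.test("Ricky is first."),        // true
  firstRegex.test("Where did Ricky go?"),    // false: Ricky is not at the start
];
const endsWith = [
  storyRegex.test("This is a never ending story"), // true
  storyRegex.test("Every story has to end"),       // false: story is not at the end
];

console.log(startsWith, endsWith);
```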
/\s/
Look for whitespace as part of your pattern
/\s/
Look for non-whitespace as part of your pattern
/\S/
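As a rough illustration, \s matches spaces as well as tabs and line breaks, while \S matches everything else. The strings below are made up for this sketch.

```javascript
const phrase = "Whitespace is important";

// Each single whitespace character.
const spaces = phrase.match(/\s/g);
// Each run of non-whitespace characters, which here means each word.
const chunks = phrase.match(/\S+/g);
// Tabs and newlines count as whitespace too.
const tabAndNewline = "tab\tand\nnewline".match(/\s/g).length;

console.log(spaces);        // [" ", " "]
console.log(chunks);        // ["Whitespace", "is", "important"]
console.log(tabAndNewline); // 2
```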
/colou?r/
Make something optional
A ? notes that the character before it is optional. You might want to check something that isn’t necessarily there… One example used is how English and American spellings sometimes differ: Color and colour are both legit spellings. Another example is if you have a set of product numbers where most only have 4 characters, but some have an additional letter in the middle. This part might not be necessary for the product number to be valid in your test, but you might have rules that IF it exists, it must only be certain kinds of letters.
let american = "color";
let british = "colour";
let rainbowRegex= /colou?r/;
rainbowRegex.test(american);
rainbowRegex.test(british);
//Both test calls would return true.
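The product number idea could look something like the sketch below. The format is entirely made up for illustration: two digits, an optional letter that must be A, B, or C, then two more digits. The ^ and $ make sure the whole string is checked.

```javascript
// Hypothetical product number rule, assumed only for this example.
const productRegex = /^\d{2}[A-C]?\d{2}$/;

const plain = productRegex.test("1234");        // true: the letter is optional
const withLetter = productRegex.test("12B34");  // true: B is an allowed letter
const wrongLetter = productRegex.test("12X34"); // false: X is not allowed

console.log(plain, withLetter, wrongLetter); // true true false
```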
/q(?=u)/
Make your match conditional on patterns further ahead in the string
A lookahead will “look” further ahead in the string to check whether a pattern exists there, and the result decides whether this is a match.
You may add several patterns that must, or must not, exist further ahead in order to satisfy the regex.
A positive lookahead is used as (?=...) where the ... is the required part (this part is required for a match, but is not included in the match).
Conversely, a negative lookahead, written (?!...), states that the ... part must NOT come next.
let quRegex= /q(?=u)/;
let qRegex = /q(?!u)/;
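A small sketch of both lookaheads. Notice that the looked-ahead part is checked but never included in the returned match. The passwordRegex at the end is a made-up rule (3 to 6 word characters ahead, and at least one number somewhere ahead) to show two lookaheads combined.

```javascript
const quRegex = /q(?=u)/; // q only if a u comes next
const qRegex = /q(?!u)/;  // q only if a u does NOT come next

const positiveHit = "quit".match(quRegex)[0]; // "q" (the u is not part of the match)
const positiveMiss = "qt".match(quRegex);     // null
const negativeHit = "qt".match(qRegex)[0];    // "q"
const negativeMiss = "quit".match(qRegex);    // null

// Hypothetical rule combining two positive lookaheads.
const passwordRegex = /(?=\w{3,6})(?=\D*\d)/;
const hasNumber = passwordRegex.test("abc123"); // true
const noNumber = passwordRegex.test("abcdef");  // false: no number ahead

console.log(positiveHit, positiveMiss, negativeHit, negativeMiss, hasNumber, noNumber);
```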
/(P(engu|umpk)in)/
Find either one pattern, or another
Using the example from FreeCodeCamp: If you want to find either Penguin or Pumpkin in a string, you can use the following Regular Expression: /P(engu|umpk)in/g
let testStr = "Pumpkin";
let testRegex = /P(engu|umpk)in/;
testRegex.test(testStr);
//would return true
/(\w+) \1 \1/
Capture groups to ‘capture’ patterns for reuse
Capturing Values: When the regex finds a match, it remembers the specific parts that match each capture group.
For example: Want to find repeating words, but don’t want to restrict it to a specific word – but any? Well, by using parentheses (), you can ‘capture’ this pattern – and then use it again later.
The ‘group’ gets its ‘name’ based on its order: the first group can be reused by writing \1, the second as \2, and so on.
let repeatRegex = /(\w+) \1 \1/;
//captures the first word, and then looks for this same word repeated two more times!
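To see what gets captured, here is a short sketch with a made-up sentence. With .match() (and no g flag), the captured groups appear in the result after the full match.

```javascript
const repeatRegex = /(\w+) \1 \1/;
const song = "row row row your boat";

const hasTriple = repeatRegex.test(song);           // true
const noTriple = repeatRegex.test("row your boat"); // false: the word is not repeated

const result = song.match(repeatRegex);
console.log(result[0]); // "row row row" (the whole match)
console.log(result[1]); // "row" (what the first capture group remembered)
```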