RegEx Basics Guide

Reading Time: 6 minutes

This guide is a quick compilation of Regular Expression examples and how to use them to craft powerful, time-saving RegEx. It starts with the most basic concepts, so you can follow along even if you know nothing about RegEx yet.

Out in the wild

While working on my latest project, in the middle of implementing a feature to ensure a string contained only letters, I ran into code like this particular piece of gibberish:

/[.*+?^${}()|[\]\\]/g

— Well, it looks like gibberish at least to me 😅

In fact, you cannot read it as English because it is not meant to be English. This little arrangement of symbols, letters, and sometimes numbers is called a ‘Regular Expression’.

I know this can sound and look scary (and it can be sometimes), but in most cases it is pretty inoffensive.


Unveiling the truth

At first glance, it may look like using this kind of tool is a useless waste of time, because you already know how to loop and manipulate strings. At this point you may be thinking: “What is the relevance of learning this overly complicated syntax?”

Well, that is partly true: the goal is the same, and for plain strings a simple loop will do. In fact:

Regular Expressions, or "RegEx", are just patterns used to match character combinations in strings.

For a plain text string, then, there is no meaningful difference, and arguably it is better to just use a string method like includes() or indexOf().
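As a rough illustration of that point, here is a minimal sketch comparing the plain string methods with a RegEx on fixed text. The variable name sentence is just an assumption for the example.

```javascript
const sentence = "my cat is cute";

// Plain substring checks: no RegEx needed for fixed text.
const hasCat = sentence.includes("cat");   // true
const catIndex = sentence.indexOf("cat");  // 3
const dogIndex = sentence.indexOf("dog");  // -1 means "not found"

// The RegEx version gives the same yes/no answer here.
const regexHasCat = /cat/.test(sentence);  // true

console.log(hasCat, catIndex, dogIndex, regexHasCat);
```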

However, bear with me for a moment. Let's say you have to validate a password for account creation, checking for special characters, letters, digits, and length.

Would you be willing to loop over each word and then each letter to see if it contains a specific character? That would just be a hassle.

Wouldn’t it be better if, for complex situations like this, a one-liner did the job? Well, this one does exactly that:

/^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*\W).{8,}$/g
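Here is a minimal sketch of how that pattern could be used. The helper name isStrongPassword and the sample passwords are made up for this example. The sketch also drops the g flag, because a RegEx with g remembers its lastIndex between test() calls, which can give surprising results when the same RegEx object is reused.

```javascript
// Same pattern as above, minus the g flag (an assumption of this sketch).
const strongPassword = /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*\W).{8,}$/;

// Hypothetical helper: true when the candidate has a lowercase letter,
// an uppercase letter, a digit, a non-word character, and 8+ characters.
function isStrongPassword(candidate) {
  return strongPassword.test(candidate);
}

console.log(isStrongPassword("Passw0rd!")); // true
console.log(isStrongPassword("password"));  // false: no uppercase, digit, or symbol
console.log(isStrongPassword("Sh0rt!"));    // false: fewer than 8 characters
```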

Having this skill at your disposal can transform and shrink some implementations, going from this:

function pigIt(str) {
  var arrayWord = str.split(' ');
  return arrayWord.map(function(word) {
    var firstLetter = word.charAt(0);
    return word.slice(1) + firstLetter + 'ay';
  }).join(' ');
}

to this:

function pigIt(str){
  return str.replace(/(\w)(\w*)(\s|$)/g, "$2$1ay$3");
}
pigIt('Pig latin is cool'); // igPay atinlay siay oolcay

Both return the same string: they move the first letter of each word to the end of it and then add “ay” (I know it is a silly one, but you get the point).

It saves you a few lines of logic, and it is a clever approach for very complex string matching.

Basic use

Let’s not get ahead of ourselves here (as I am not an expert myself), and first try to learn the basic uses and syntax to cover the most common cases.

You can identify these search patterns because they are surrounded by slashes, like /example/, with some kind of matching string inside. And in case you wonder: no, they do not need quotes.

Test method

In JavaScript there are multiple ways to use a RegEx. One of them is the test() method: it is called on a RegEx, applies it to the string, and returns a boolean:

myRegex.test(myString) // format to follow
let string = "my cat is cute";
let myRegex = /cat/;
console.log(myRegex.test(string));
> true

or operator |

RegEx has the ability to search for one or another match in the expression using the or operator | :

let string = "my rabbit and dog are cute";
let myRegex = /cat|dog/;
console.log(myRegex.test(string));
> true

BE CAREFUL!

One thing to consider is that these expressions are case sensitive, meaning that if I were to search for Cat (with a capital C) in a string that only contains cat, I would not get a result. But don't worry, later in the blog post we will fix this.
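A quick sketch of that pitfall, reusing the cat sentence from the test() example. The variable names are only for illustration.

```javascript
const sentence = "my cat is cute";

const lowerCase = /cat/.test(sentence); // true: the casing matches
const upperCase = /Cat/.test(sentence); // false: capital C is not in the string

console.log(lowerCase, upperCase);
```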

Match method

We have just checked whether a pattern exists inside a string but, just like a magic trick, we can also extract the actual match or matches that fulfill the RegEx. How do you do that? Let me show you.

This time we call match() on the whole string and pass it the RegEx, the inverse of the test method:

myString.match(myRegex) // format to follow
let string = "Please extract my cat from here";
let myRegex = /cat/;
console.log(string.match(myRegex));
// [object Array] (1)
["cat"]
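Two details are worth keeping in mind when using match() without flags, sketched below under the assumption of the same sentence: the result also carries the position of the match, and when nothing matches you get null instead of an empty array.

```javascript
const sentence = "Please extract my cat from here";

// A successful match: element 0 is the matched text, index is its position.
const found = sentence.match(/cat/);
console.log(found[0]);    // "cat"
console.log(found.index); // 18

// No match at all returns null, so check before reading from the result.
const missing = sentence.match(/dog/);
console.log(missing);     // null
```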

Flags

If you want your RegEx to match in a more specific way, you can use flags, which add extra functionality to the search. There are quite a few of them, but we will cover just a couple so you can get the hang of it. They all follow the same basic pattern:

var regex = /pattern/flags;

Case-insensitive search: (i)

This flag lets you search within your string without worrying about casing. So, returning to our lovely cat example:

let string = "my CAT is cute";
let myRegex = /cat/i;
console.log(myRegex.test(string));
> true

Global search (g)

So far we have only been checking for one instance of the string we are looking for, but what happens if we want more than one?

For that we have the g flag and, as its name suggests, it lets you make a global search within the whole string, so you can retrieve all the instances of a match.

let string = "I have, a cat, another cat and another cat";
let myRegex = /cat/g;
console.log(string.match(myRegex));
// [object Array] (3)
["cat","cat","cat"]
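To make the difference visible, here is a small sketch that runs the same search with and without the g flag. The variable names are assumptions for the example.

```javascript
const sentence = "I have, a cat, another cat and another cat";

const firstOnly = sentence.match(/cat/);  // no g: stops at the first match
const allCats = sentence.match(/cat/g);   // g: collects every match

console.log(firstOnly.length); // 1
console.log(firstOnly.index);  // 10
console.log(allCats.length);   // 3
```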

Matching range of characters [abc] / [a-c]

It is possible to allow only certain characters at a given position in a match; for that we use a character set, written in square brackets.

let string = "bag bug big beg";
let myRegex = /b[aiue]g/g;
console.log(string.match(myRegex));
// [object Array] (4)
["bag","bug","big","beg"]

Notice that the expression is searching for any word that starts with the letter b, ends with g, and contains a, i, u, or e in the middle.

Instead of writing out every letter you want, this kind of matching can also use a range:

  • all the alphabet: [a-zA-Z]
  • all the numbers: [0-9]
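The ranges above can be tried out with a sketch like this one. The sample string is invented for the example.

```javascript
const order = "Room 42, seat B7";

const digits = order.match(/[0-9]/g);       // every single digit
const letters = order.match(/[a-zA-Z]/g);   // every letter, either case
const aToC = order.match(/[a-c]/g);         // only lowercase a, b, or c

console.log(digits);           // ["4", "2", "7"]
console.log(letters.join("")); // "RoomseatB"
console.log(aToC);             // ["a"]
```

Note that [a-c] does not pick up the uppercase B, because ranges are case sensitive too.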

The wildcard operator (.)

“With great power comes great responsibility”: that is the feeling this operator gives me. The wildcard character is used to match anything, you heard it here first folks, ANYTHING (well, any single character except a line break). Let me show you what I mean.

let string = "Ill tell you a pun and have some fun";
let myRegex = /.un/g;
console.log(string.match(myRegex));
// [object Array] (2)
["pun","fun"]

So here you can see that the expression /.un/ will match any three characters ending in un; whatever single character comes before un will pass the check.
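The wildcard is not limited to letters. The sketch below, with made-up sample strings, shows it accepting a digit, a symbol, and a space, while a line break is the one thing it refuses by default.

```javascript
const sample = "pun 1un #un";

const matches = sample.match(/.un/g);
console.log(matches); // ["pun", "1un", "#un"]

const afterSpace = /.un/.test(" un");    // true: a space counts as a character
const afterNewline = /.un/.test("\nun"); // false: . does not match a line break
console.log(afterSpace, afterNewline);
```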

Negated Character sets ( [^] ).

What about when we do not want to match certain characters in a string? Do not worry, I got you! We can use the ^ symbol to exclude those pesky letters or numbers that we do not need.

let string = "Give me 10 bags of ice";
let myRegex = /[^0-9aiueo]/g;
console.log(string.match(myRegex));
// [object Array] (13)
["G","v"," ","m"," "," ","b","g","s"," ","f"," ","c"]

The ^ needs to be the first character inside the brackets, so that it excludes everything listed after it. In this example, I excluded every number and lowercase vowel.
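Where the ^ sits matters. As a small sketch (variable names are assumptions), inside the brackets it negates the set, while outside the brackets it anchors the match to the start of the string, which is how it is used in the password pattern earlier.

```javascript
const text = "Give me 10 bags of ice";

// Inside brackets: remove everything that is NOT a digit.
const digitsOnly = text.replace(/[^0-9]/g, "");
console.log(digitsOnly); // "10"

// Outside brackets: ^ means "the string starts here".
const startsWithG = /^G/.test(text);         // true
const startsWithDigit = /^[0-9]/.test(text); // false
console.log(startsWithG, startsWithDigit);
```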

Recurring appearance characters: (+) and (*)

This comes in handy when you need to match a character that appears in a string 1 or more times; for that we use the + character:

let string = "success";
let myRegex = /s+/g;
console.log(string.match(myRegex));
// [object Array] (2)
["s","ss"]

There is also a variant for when you want to check for 0 or more occurrences of a character in the string. In this case we use *, placed after the character we want to check.

let string = "lets gooooooo!";
let myRegex = /go*/g;
console.log(string.match(myRegex));
// [object Array] (1)
["gooooooo"]

Here you can see it brings back the g and all the o's following that letter. On the other hand...

let string = "lets g!";
let myRegex = /go*/g;
console.log(string.match(myRegex));
// [object Array] (1)
["g"]

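Because * also accepts 0 occurrences, the lone g still matches. The same applies the other way around if you swap the letter order in the RegEx (for example, /og*/ matches an o followed by zero or more g's).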
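To see + and * side by side, here is a short sketch with invented sample strings. With +, the lone g is no longer enough, and swapping the letters changes which character is required.

```javascript
const lonely = "lets g!";

const star = lonely.match(/go*/g); // g followed by zero or more o
const plus = lonely.match(/go+/g); // g followed by at least one o
console.log(star); // ["g"]
console.log(plus); // null

// Swapped order: o is required, g is optional and may repeat.
const swapped = "dog go".match(/og*/g);
console.log(swapped); // ["og", "o"]
```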

For this basic guide I think this is enough to process, so just have fun with them.

Together we are strong

Now you have seen a little bit of RegEx’s power. To fully exploit its potential, you can mix search patterns and flags. It is as easy as placing the flags together at the end of the expression.

Case-insensitive global search:

let string = "Twinkle twinkle little star";
let myRegex = /twinkle/gi;
console.log(string.match(myRegex));
// [object Array] (2)
["Twinkle","twinkle"]

A case-insensitive global search that also gives me every letter g followed by zero or more o's, plus every run of the letter s:

let string = "lets goo! Mississipi team!";
let myRegex = /go*|s+/gi;
console.log(string.match(myRegex));
// [object Array] (4)
["s","goo","ss","ss"]

And so on... the sky is the limit.

Disclaimer

Throughout this blog post I have said nothing but good things about RegEx but, like all tools in the world, it is not perfect. It can be useful in some cases, but in others it can be overkill, just like:

“You can hammer a nail into the wall using a sledgehammer or a maul; surely both will get the job done.”

But there are simpler and easier ways to do it.

Regular expressions are not used because they do things faster than plain string operations; they are used because they can do very complicated operations with little code and reasonably small overhead.

What I am trying to say is that RegEx is great, but only use it when you need to. Solve your problems with the tool that best fits the task.

There is an old saying by Jamie Zawinski that goes something like this:

Some people, when confronted with a problem, think “I know, I’ll use regular expressions.” Now they have two problems.

Work smart and give them a chance. If you try them here and there, where no one gets hurt by it, you will eventually find the perfect situation to use them, just the way I did.

Till next time, Happy Coding!
