All About Using Regular Expressions in JMeter - Part 1

A regular expression is a combination of letters, digits and special characters that forms a pattern, which can be searched for within a longer piece of text such as a response.

Here are the building blocks used to create a regular expression.

1. The Backslash (\)

A backslash is the bridge between regular-expression syntax and plain text. Putting a “\” in front of a special character turns it into plain text.

For example, suppose we have to match “/image?id=0123” in the response body. The “?” has its own meaning in a regular expression, so we put a “\” in front of it in order to match the literal text in the response.

e.g. /image\?id=0123

In this way the ? is treated as a normal character.
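The effect of the escape can be checked outside JMeter. The following is a minimal sketch using Python's re module as a stand-in for JMeter's regex engine (an assumption; this simple pattern should behave the same way in both). The response string is made up for illustration.

```python
import re

# Hypothetical response body; only the path /image?id=0123 comes from the text.
response = '<img src="/image?id=0123"> <img src="/imagid=0123">'

# Unescaped: "?" makes the preceding "e" optional, so the literal "?" is never matched.
unescaped = re.findall(r"/image?id=0123", response)

# Escaped: "\?" matches a literal question mark.
escaped = re.findall(r"/image\?id=0123", response)

print(unescaped)  # ['/imagid=0123']
print(escaped)    # ['/image?id=0123']
```

Without the backslash the pattern silently matches the wrong text, which is why special characters in URLs need escaping.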

2. Pipe (|)

The pipe means “or”: it matches either everything on the left side of the pipe or everything on the right side.

Example: suppose we have a page response in which we need to match all the occurrences of the words cocktail and mocktail. Here is the expression that would work:

cocktail|mocktail
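As a quick check of the alternation, here is a small sketch in Python's re module (used here only to illustrate the pattern, not JMeter itself). The sample sentence is invented.

```python
import re

# Hypothetical page text containing both words.
text = "The menu lists one cocktail, two mocktail options and a coffee."

# The pipe matches either alternative wherever it occurs.
matches = re.findall(r"cocktail|mocktail", text)

print(matches)  # ['cocktail', 'mocktail']
```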

3. The Question Mark (?)

It means the item or character just before the ? is optional. In other words, the ? matches zero or one occurrence of whatever comes immediately before it.

Example: we have a page that uses both spellings, “smoulder” and “smolder”, and our goal is to match every occurrence of the word in the page.

smou?lder : here the ? allows zero or one occurrence of the letter “u” in the word.

Hence, using the above pattern, we can find both smoulder and smolder in the page.
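A short sketch of the optional “u”, written with Python's re module as an illustration (the sample sentence is an assumption, not taken from any real page):

```python
import re

# Hypothetical page text using both spellings.
text = "Embers smoulder in one article and smolder in another."

# "u?" matches zero or one "u", so both spellings are found.
matches = re.findall(r"smou?lder", text)

print(matches)  # ['smoulder', 'smolder']
```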

4. Parentheses ()

Parentheses group part of an expression. Combined with the pipe, they limit the “or” to the part inside the parentheses instead of the whole expression.

Example: say we want to match two URLs,

/foldertwo/thanks and /folderone/thanks

We can write the regular expression as

/folder(one|two)/thanks

This regular expression matches either the thanks page in folderone or the thanks page in foldertwo, and it is the () that allows us to group the alternatives.
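The grouping can be tried out with a few sample paths. This is a sketch in Python's re module rather than JMeter; the third path, /folderthree/thanks, is a made-up negative case.

```python
import re

# The first two paths come from the text; the third is a hypothetical non-match.
paths = ["/folderone/thanks", "/foldertwo/thanks", "/folderthree/thanks"]

pattern = re.compile(r"/folder(one|two)/thanks")

# Keep only the paths that match the whole pattern.
matched = [p for p in paths if pattern.fullmatch(p)]

# group(1) returns what the parentheses captured.
groups = [pattern.fullmatch(p).group(1) for p in matched]

print(matched)  # ['/folderone/thanks', '/foldertwo/thanks']
print(groups)   # ['one', 'two']
```

In JMeter's Regular Expression Extractor, a captured group like this is what a template such as $1$ refers to.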

5. Square Brackets and Dashes

Using square brackets we can make a simple list of characters; the brackets then match any one character from that list.

Example: say we have the list [aiu]. If we use it as p[aiu]n, we get a match for pan, pin and pun, but not for pain, because the brackets stand for exactly one character.

In a similar way, we can use “-” inside the brackets to create a range of characters.

Example:

[a-z] -> all lowercase letters.

[A-Z] -> all uppercase letters.

[a-zA-Z0-9] -> all uppercase letters, lowercase letters and digits.

A dash is thus a shorthand for listing every item in a range.

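For example, suppose we have a product group of shoes, say loafers, and each product name has a number appended to it in the URL, such as loafers012, loafers013, and so on.

We can write the expression as loafers[0-9] : this matches the product name loafers followed by one digit from 0 to 9. Because the brackets stand for a single character, only the first digit of loafers012 is covered; to match the whole number, add a repetition, as in loafers[0-9]+ (the + sign is explained in section 8).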
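The following sketch, in Python's re module as a stand-in for JMeter, shows that a bracket list stands for exactly one character. The URL text and the word “pen” are invented for the example.

```python
import re

# "pen" is a hypothetical extra word that should not match.
words = ["pan", "pin", "pun", "pain", "pen"]
short = [w for w in words if re.fullmatch(r"p[aiu]n", w)]

# Hypothetical URLs; the product names follow the loafers012 style from the text.
text = "/shoes/loafers012 /shoes/loafers013 /shoes/sandals7"

# [0-9] alone consumes a single digit after "loafers".
one_digit = re.findall(r"loafers[0-9]", text)

# Adding + repeats the digit class and captures the full number.
all_digits = re.findall(r"loafers[0-9]+", text)

print(short)       # ['pan', 'pin', 'pun']
print(one_digit)   # ['loafers0', 'loafers0']
print(all_digits)  # ['loafers012', 'loafers013']
```

Note that loafers[0-9] still finds both products, but it only consumes the first digit of each number.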

6. Braces {}

Braces repeat the last item a specific number of times.

If the braces have two numbers in them, like {x,y}, then the last item is repeated at least “x” times and not more than “y” times. If the braces have one number in them, like {x}, then the last item is repeated exactly “x” times.

Suppose a company has multiple IP addresses that differ only in the last part. A regular expression for picking these IP addresses out of the response page is:

123\.105\.169\.[0-9]{1,2}

This means that a digit from 0 to 9 is repeated at least once and not more than twice, so the last part of the address has one or two digits.

Now for the case of one number in the braces:

Suppose a page has multiple 10-digit mobile numbers, and our goal is to match the mobile numbers using a regular expression.

Then we can use \d{10} : \d matches a single digit, and {10} repeats it exactly 10 times.
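Both forms of the braces can be seen in one small sketch. It uses Python's re module for illustration, and the response text (addresses and phone numbers) is entirely made up.

```python
import re

# Hypothetical response text; all addresses and numbers are invented.
text = "Hosts: 123.105.169.7, 123.105.169.42, 10.0.0.1. Call 9876543210 or 12345."

# {1,2}: the last part of the address must be one or two digits.
ips = re.findall(r"123\.105\.169\.[0-9]{1,2}", text)

# {10}: exactly ten consecutive digits.
mobiles = re.findall(r"\d{10}", text)

print(ips)      # ['123.105.169.7', '123.105.169.42']
print(mobiles)  # ['9876543210']
```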

7. Dot (.)

A dot matches any one character: a digit, a letter, a special character, or even a whitespace character. By default it usually does not match a line break.

For example, the regular expression “.ite” would match “site”, “lite”, “kite” and “bite”.

The dot must match exactly one character, so .ite won't match “ite” on its own: there is no character before “ite” for the “.” to match.
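A sketch of the dot's behaviour, using Python's re module as an illustration. The entries “ ite” (with a leading space) and “7ite” are assumptions added to show that whitespace and digits also count as one character.

```python
import re

# The last three entries are hypothetical edge cases.
words = ["site", "lite", "kite", "bite", "ite", " ite", "7ite"]

# The dot needs exactly one character in front of "ite".
matched = [w for w in words if re.fullmatch(r".ite", w)]

print(matched)  # ['site', 'lite', 'kite', 'bite', ' ite', '7ite']
```

The bare “ite” is the only entry left out, because it gives the dot nothing to match.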

8. Plus Sign (+)

It matches one or more of the previous item.

For example, suppose a page has words such as “aargh”, “aaargh” and “aaaaargh”.

We can write the regular expression as aa+rgh. The words that match are “aargh”, “aaargh” and “aaaaargh”, but “argh” does not match, because the pattern needs one “a” followed by one or more further “a” characters. The + sign always applies to the single item just before it.
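It helps to count how many “a” characters a + pattern really requires. The sketch below uses Python's re module for illustration and compares two patterns: aa+rgh needs at least two “a” characters, while the extra literal “a” in aa+argh raises the minimum to three.

```python
import re

words = ["argh", "aargh", "aaargh", "aaaaargh"]

# "a" then "a+" : at least two a's in total.
two_or_more = [w for w in words if re.fullmatch(r"aa+rgh", w)]

# "a", "a+", then another "a" : at least three a's in total.
three_or_more = [w for w in words if re.fullmatch(r"aa+argh", w)]

print(two_or_more)    # ['aargh', 'aaargh', 'aaaaargh']
print(three_or_more)  # ['aaargh', 'aaaaargh']
```

Neither pattern accepts “argh”, since + never allows zero occurrences.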

9. Star (*)

It matches zero or more of the previous item, whereas the + sign matches one or more and never zero.

If a page has words like aargh, argh and aaaargh, then a regular expression that can match them is aa*rgh, which matches argh, aargh and aaaargh.

Here argh is matched because * allows zero occurrences of the second “a”.
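The difference between * and + shows up when both are run on the same words. This is a sketch in Python's re module, not JMeter; the word “rgh” is a made-up negative case.

```python
import re

# "rgh" is a hypothetical word with no "a" at all.
words = ["rgh", "argh", "aargh", "aaaargh"]

# "a" then "a*" : one a, followed by zero or more further a's.
star = [w for w in words if re.fullmatch(r"aa*rgh", w)]

# "a" then "a+" : one a, followed by at least one further a.
plus = [w for w in words if re.fullmatch(r"aa+rgh", w)]

print(star)  # ['argh', 'aargh', 'aaaargh']
print(plus)  # ['aargh', 'aaaargh']
```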

10. Dot Star (.*)

“.” followed by “*” combines two regular-expression symbols; put together, they match everything.

Example: /folderone/.*index\.php

In the above example, the regular expression matches everything that starts with “/folderone/” and ends with “index.php”. This means that a page in the /folderone directory that ends with “.html” won't be matched.

“.” means any one character, and “*” means repeat the last item zero or more times.

In short, the “.” matches any single character, be it a digit, a letter or a special character, and the “*” right after it repeats that match zero or more times. Together they match any run of characters.
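To see what .* absorbs, here is a sketch using Python's re module as an illustration. The four paths are hypothetical; only the pattern itself comes from the example above.

```python
import re

# Hypothetical paths for testing the pattern.
paths = [
    "/folderone/index.php",
    "/folderone/admin/index.php",
    "/folderone/about.html",
    "/foldertwo/index.php",
]

pattern = re.compile(r"/folderone/.*index\.php")

# ".*" may match nothing at all, or any run of characters such as "admin/".
matched = [p for p in paths if pattern.fullmatch(p)]

print(matched)  # ['/folderone/index.php', '/folderone/admin/index.php']
```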

11. Caret (^)

A ^ anchors the match to the beginning of the string: the pattern matches only if it is found right at the start.

Let's explain this with a simple example.

Suppose we have a web page that contains strings that start with a number, such as:

123RTGS

114FTUR

143MISS

Our goal is to match the number at the beginning of each string, so we can write the expression as follows:

(^\d+) -> ^ indicates the beginning of the string.

-> \d matches a single digit.

-> + matches one or more of those digits.

The caret can also be used as the first character inside square brackets, where it means: match a single character that is not contained within the brackets. Example: [^abc] matches any character other than “a”, “b” and “c”.

In simple words, if we put the ^ right after the opening square bracket, the brackets match only characters that are not listed after the caret. So [^0-9] matches any single character that is not a digit.
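Both uses of the caret can be compared side by side. The sketch below relies on Python's re module as a stand-in for JMeter; the string “ABC999” and the word “cabbage” are assumptions added for illustration.

```python
import re

# The first three strings come from the text; "ABC999" is a hypothetical non-match.
lines = ["123RTGS", "114FTUR", "143MISS", "ABC999"]

# ^ outside brackets: digits count only at the start of the string.
leading_numbers = []
for line in lines:
    m = re.search(r"^\d+", line)
    if m:
        leading_numbers.append(m.group())

# ^ inside brackets: match any single character NOT in the list.
non_digits = re.findall(r"[^0-9]", "123RTGS")
not_abc = re.findall(r"[^abc]", "cabbage")

print(leading_numbers)  # ['123', '114', '143']
print(non_digits)       # ['R', 'T', 'G', 'S']
print(not_abc)          # ['g', 'e']
```

A negated list still matches character by character: “123RTGS” contains digits, yet [^0-9] finds its four letters.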

12. Dollar Sign ($)

A $ anchors the match to the end of the string: whatever comes before the $ in the pattern must be the last thing in the target string, so a string with anything added after that point won't match.

For example:

Suppose a website has pages whose names end with .txt, .txtphp, .php and .html, and our goal is to match only the pages that end with .txt. Then we use the following regular expression.

.*\.txt$ -> this means every page that ends with .txt is matched and the rest of the pages are not.
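The end anchor can be checked against a few file names. This is a sketch in Python's re module for illustration only; the page names are invented to cover the four endings mentioned above.

```python
import re

# Hypothetical page names, one per ending.
pages = ["notes.txt", "notes.txtphp", "index.php", "home.html"]

pattern = re.compile(r".*\.txt$")

# "$" requires ".txt" to be the very end of the string.
txt_pages = [p for p in pages if pattern.search(p)]

print(txt_pages)  # ['notes.txt']
```

Without the $ the same pattern would also accept “notes.txtphp”, since “.txt” appears inside it.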

13. Whitespace (\s)

Matches a whitespace character. In ASCII these are tab, line feed, form feed, carriage return and space. In Unicode mode it also matches no-break spaces, next-line and the variable-width spaces.

14. Digits (\d)

Matches a single digit, the same as [0-9] in ASCII. If followed by +, as in \d+, it matches one or more consecutive digits in a line or string.
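The two shorthand classes can be tried on one short string. This sketch uses Python's re module as an illustration; the sample text, with a space, a tab and a line feed, is made up.

```python
import re

# Hypothetical text containing a space, a tab and a line feed.
text = "order 42\tshipped\n7 items"

# \d matches one digit at a time; \d+ groups consecutive digits.
single_digits = re.findall(r"\d", text)
numbers = re.findall(r"\d+", text)

# \s+ treats any run of whitespace as one separator.
parts = re.split(r"\s+", text)

print(single_digits)  # ['4', '2', '7']
print(numbers)        # ['42', '7']
print(parts)          # ['order', '42', 'shipped', '7', 'items']
```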

This is all about the basics of regular expressions: how to use the various characters to build our own regular expression, match it against the target text, and continue testing.
