Detailed explanation of regular expression usage in JScript (with example: syntax highlighting in JScript)

Author: Eve Cole Update Time: 2009-06-08 18:34:28

Haha, let me start with a bit of background. Last year I used C# to make a syntax highlighting tool. Based on the information in a configuration file, it formatted the given code into HTML so that a web page could show the same syntax element highlighting as an editor, with support for code folding. That's right, it's similar to what you see on the blog. I was using MSN Space at the time, which did not provide this feature, so I had to write one myself.

I wrote it in C#. At first I used super cumbersome basic statements such as for, while, switch, and if to detect keywords and so on. Don't laugh at me. I was naive and didn't know what a regular expression was at the time, so I could only use this crude method. Of course, the crude method still worked; it was just one very long function that would be very difficult to maintain. I figured other software could not possibly be written like this, so I searched Google for a while, found some code and open source projects with syntax highlighting, and started taking a look... Wow, everything was so complicated. To be honest, the thing I like doing least is reading other people's code. I'm not being pretentious; it's just really confusing. Unless there is very detailed documentation, I don't want to look at it at all. At most, I look at how other people design the interface and then guess how it is implemented internally.

Although the search was not very helpful, it did introduce me to regular expressions; I forget exactly where I first saw them. From then on, I studied regular expressions while improving my "broken stuff". Not long after that, I started blogging again on Blog Park and finally enabled Blog Park's own syntax highlighting, so I lost a major motivation for writing my own code to highlight code as HTML. Besides, the syntax highlighting module made in C# can only run on the server side or in a WinForms program, while what I ultimately want is HTML to display on a page. I think client-side script is best suited to this job. It's a pity that I don't know much about JS... After that I got busy with other things and did not improve the syntax highlighting module.

Last night I came home after working overtime. I originally planned to continue learning UML and studying design patterns. Then I remembered that a module at work needed to strip all HTML tags from the results returned by the database, so I opened the regular expression tool RegexBuddy. There, in RegexBuddy's help document, I saw a simple tutorial on using regular expressions in JScript. My curiosity was aroused again, so I opened UltraEdit-32 and started writing some simple JavaScript to experiment.

I won't go over the details of my testing process here, because it involved a lot of repeated tests and many detours. Instead, I will go straight to the usage of regular expressions in JScript that I summarized from the testing.

Enough with the nonsense, let’s get to the point!

First, let's talk about JScript's regular expression object, RegExp.

The class name that provides regular expression operations in JScript is RegExp, and objects of the RegExp type can be instantiated in two ways.

Method 1, constructor instantiation:

var myRegex = new RegExp("\\w+", "igm");
// \w+ is the actual regular expression. Note that the first \ is for escaping. igm means ignore case, global search, and multi-line search respectively. This will be explained later.
Method 2, literal syntax:

var myRegex = /\w+/igm;
// The effect is the same as the previous statement, except that there is no need for escape characters here. The regular expression is written exactly as it is. igm has the same effect as the igm in the previous example.
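To check that the two forms really describe the same pattern, here is a minimal sketch. The variable names and the sample text are made up for illustration; only the \w+ pattern and the igm flags come from the statements above.

```javascript
// Two ways to build the same pattern. In the string form the backslash
// must be doubled, because "\w" inside a string literal is just "w".
var fromConstructor = new RegExp("\\w+", "igm");
var fromLiteral = /\w+/igm;

// Both objects hold the same pattern text.
var samePattern = fromConstructor.source === fromLiteral.source;

// Forgetting the second backslash silently changes the pattern to w+.
var unescaped = new RegExp("\w+", "igm");

var sample = "Hello World";                  // hypothetical test string
var wordsA = sample.match(fromConstructor);  // every run of word characters
var wordsB = sample.match(fromLiteral);      // same result as wordsA
var onlyW = sample.match(unescaped);         // only runs of the letter w
```

The unescaped variant is the mistake to watch for: it raises no error, it just matches something else.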
Which one to use is a matter of preference. Personally, I think the second form is easier to read when writing a regex, and the RegexBuddy help document also recommends it. The RegExp object provides the following operations:

exec(string str): performs a regular expression match and returns the result. According to the example results given by MSDN, when the g flag is set each call to exec starts from the end of the previous match. The return value is an array (or null when there is no match), as the RegexBuddy help says: element 0 is the whole match, the following elements are the () groups, and the array also carries index and input properties. At first the return value looked like a RegExp object to me, because exec also updates the properties of the global RegExp object listed below.

compile(string regex, string flags): precompiles a regular expression so that it runs faster. In my tests, efficiency improved noticeably after precompilation. The regex parameter is the regular expression, and flags can be any combination of the following three values:
g – global search. In my tests, without the g flag only the first matching string is found.
i – ignore case.
m – multi-line search. It seemed to me to be on by default, but that is because a search always scans the whole string, line breaks included; what m changes is that ^ and $ also match at the start and end of each line.

test(string str): returns true if str matches the regular expression, otherwise false. It is the quickest way to check for a match, much like calling the String object's match method and checking whether anything was found.
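The behaviour described above can be checked with a short sketch. It assumes a standard ECMAScript engine; the pattern and string are borrowed from the MSDN sample quoted at the end of this article, and everything else (variable names, the two-line sample string) is made up for illustration.

```javascript
var re = /d(b+)(d)/ig;
var str = "cdbBdbsbdbdz";

// With the g flag, each exec call resumes at re.lastIndex, so a loop
// walks through every match. exec returns an array, or null at the end.
var log = [];
var m;
while ((m = re.exec(str)) !== null) {
    // m[0] is the whole match, m[1] and m[2] are the () groups.
    log.push(m[0] + " at " + m.index + ", groups " + m[1] + "/" + m[2] +
             ", next search from " + re.lastIndex);
}

// test only reports whether there is a match.
var hasPeter = /peter/i.test("Hi Peter!");

// The m flag does not switch "searching several lines" on or off; it makes
// ^ and $ match at every line break instead of only at the string's ends.
var lines = "first\nsecond";
var withoutM = lines.match(/^\w+/g);
var withM = lines.match(/^\w+/gm);
```

Here log ends up with two entries (dbBd at position 1 and dbd at position 8), withoutM contains only first, and withM contains both first and second.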

The RegExp object has the following properties:

index: the position in the string of the first match, initially -1
input: the target string the regular expression is matched against; note that it is read-only
lastIndex: the position where the next match begins. The original text is "Returns the character position where the next match begins in a searched string." I am not sure my reading is right, because I have not used this property.
lastMatch: the last string that matched the expression
lastParen: the last matched submatch. For example, if a regular expression has several () groups, lastParen holds the result of the last group that matched.
leftContext: All characters from the beginning of the target string to the starting position of the last match.
rightContext: All characters from the end of the last match to the end of the entire target string.
$1...$9: the result of the nth group. This is useful when there are multiple () groups in the regular expression.
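Most of these values can also be read from the array that exec returns, which may be handy on engines where the properties of the global RegExp object are unavailable. A minimal sketch, assuming a standard ECMAScript engine; the abcdefg string and the variable names are only for illustration:

```javascript
var str = "abcdefg";
var m = /(d)(e)/.exec(str);   // matches "de" starting at position 3

var index = m.index;                                      // like index
var input = m.input;                                      // like input
var lastMatch = m[0];                                     // like lastMatch
var firstGroup = m[1];                                    // like $1
var lastParen = m[m.length - 1];                          // like lastParen
var leftContext = str.substring(0, m.index);              // like leftContext
var rightContext = str.substring(m.index + m[0].length);  // like rightContext
```

With this input, leftContext is abc and rightContext is fg.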

Next, let’s talk about the operations related to the String object and regular expressions in JScript:

match(string regex): accepts a regular expression and returns the matches as an array, or null if the string does not match the expression.
replace(string regex, string str): replaces the substrings matching the regular expression with str. This function seems simple, but it also hides more advanced usage. Please see the following example.
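Before moving on to replace, a minimal sketch of what match hands back, assuming a standard ECMAScript engine. The sample string is modeled on the one used in the replace examples; the variable names and the extra patterns are hypothetical.

```javascript
var text = "A:My name is Peter!\nB:Hi Peter!";

// With the g flag, match returns every matching substring.
var all = text.match(/Peter/g);

// Without g, it returns the first match plus its groups and position.
var first = text.match(/(Pe)ter/);

// When nothing matches, the result is null rather than an empty array.
var none = text.match(/Paul/);
```

Because a failed match gives null, check the result before indexing into it.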
Example 1:

var str1 = "A:My name is Peter!\nB:Hi Peter!";
str1 = str1.replace(/Peter/g,"Jack");
alert(str1);
This example simply replaces one string with another. The power of replace is of course not limited to this. Once you are skilled with it, you can use it to complete many tasks that previously required a lot of code, for example adding highlighting HTML tags before and after code keywords. Judging from the previous example, replace can only swap matching text for new text, so how can it insert tags before and after keywords? Think about it: if the matched text can be used in the replacement, things become easy. Just replace each keyword with: opening tag + keyword + closing tag.

But how do we use the result of a regular expression match in replace?

This is where "matching variables" come in. Matching variables represent the results of a match. They are described below:
$& -- the entire matched substring, that is, everything the whole regular expression matched, not just one () group.
$$ -- the $ character itself. Because matching variables use the $ character, it needs to be escaped.
$n -- similar to the earlier $1...$9; the result of the nth group.
$nn -- quite simply, the result of the nnth group.
$` -- the leftContext mentioned earlier. For example, if abcdefg is matched with d, then abc is its leftContext.
$' -- very similar to the one above, so don't misread it! This is the rightContext. By analogy, efg is the rightContext in the example above.

So now it is very simple to insert tags before and after the keywords:

var str1 = "A:My name is Peter!\nB:Hi Peter!";
str1 = str1.replace(/Peter/g, "<b>$&</b>");
alert(str1);
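The other matching variables work the same way. The following is a minimal sketch, assuming a standard ECMAScript engine; the sample strings, the keyword list, and the kw class name are made up for illustration and are not taken from my highlighter.

```javascript
// $` and $' insert the text before and after the match.
var left = "abcdefg".replace(/d/, "[$`]");
var right = "abcdefg".replace(/d/, "[$']");

// $1 and $2 refer to the () groups, so they can be reordered.
var swapped = "Peter Parker".replace(/(\w+) (\w+)/, "$2, $1");

// $$ produces a literal $, here placed in front of the first group.
var price = "price: 5".replace(/(\d+)/, "$$$1");

// Wrapping keywords in a tag: opening tag + keyword + closing tag.
var code = "var x = function () { return 1; };";
var highlighted = code.replace(/\b(var|function|return)\b/g,
                               '<span class="kw">$1</span>');
```

The \b anchors keep the keyword pattern from matching inside longer identifiers such as variable. A real highlighter would also have to skip strings and comments, which this sketch ignores.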
It’s already 0:39. . . Let’s stop here.

Regex tool download (password: regex): regex buddy 2.06.zip
Please see the example I wrote: JScript syntax highlighting (code streamlining)

Here are some examples from MSDN:

function matchDemo()
{
    var s; // Declare variable.
    var re = new RegExp("d(b+)(d)", "ig"); // Regular expression pattern.
    var str = "cdbBdbsbdbdz"; // String to be searched.
    var arr = re.exec(str); // Perform the search.
    s = "$1 returns: " + RegExp.$1 + "\n";
    s += "$2 returns: " + RegExp.$2 + "\n";
    s += "$3 returns: " + RegExp.$3 + "\n";
    s += "input returns: " + RegExp.input + "\n";
    s += "lastMatch returns: " + RegExp.lastMatch + "\n";
    s += "leftContext returns: " + RegExp.leftContext + "\n";
    s += "rightContext returns: " + RegExp.rightContext + "\n";
    s += "lastParen returns: " + RegExp.lastParen + "\n";
    return(s); // Return results.
}
document.write(matchDemo());
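For reference, here is a sketch of the values this demo should report, written as plain assignments so it can also run outside a browser. It assumes an engine that still supports the legacy static properties on RegExp, which are not guaranteed everywhere; the variable names are hypothetical.

```javascript
var re = new RegExp("d(b+)(d)", "ig");
var str = "cdbBdbsbdbdz";
var arr = re.exec(str);       // first match only: "dbBd" at position 1

var group1 = RegExp.$1;                  // "bB"
var group2 = RegExp.$2;                  // "d"
var group3 = RegExp.$3;                  // "" (the pattern has only two groups)
var input = RegExp.input;                // the searched string
var lastMatch = RegExp.lastMatch;        // "dbBd"
var leftContext = RegExp.leftContext;    // "c"
var rightContext = RegExp.rightContext;  // "bsbdbdz"
var lastParen = RegExp.lastParen;        // "d"
```

The same values are available from arr itself (arr[0], arr[1], arr[2], arr.index), which does not depend on the static properties.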

If any heroes passing by have any opinions on this article, please feel free to post them here. Let’s learn and make progress together.