RegexOne - Learn Regular Expressions - Lesson 15: Other special characters

Lesson 15: Other special characters

This lesson will cover some extra metacharacters, as well as the results of captured groups.

We have already learned the most common metacharacters to capture digits using \d, whitespace using \s, and alphanumeric letters and digits using \w, but regular expressions also provides a way of specifying the opposite sets of each of these metacharacters by using their upper case letters. For example, \D represents any non-digit character, \S any non-whitespace character, and \W any non-alphanumeric character (such as punctuation). Depending on how you compose your regular expression, it may be easier to use one or the other.

Additionally, there is a special metacharacter \b which matches the boundary between a word and a non-word character. It's most useful in capturing entire words (for example by using the pattern \w+\b).

One concept that we will not explore in great detail in these lessons is back referencing, mostly because it varies depending on the implementation. However, many systems allow you to reference your captured groups by using \0 (usually the full matched text), \1 (group 1), \2 (group 2), etc. This is useful for example when you are in a text editor and doing a search and replace using regular expressions to swap two numbers, you can search for "(\d+)-(\d+)" and replace it with "\2-\1" to put the second captured number first, and the first captured number second for example.

Below are a number of different strings, try out the different types of metacharacters or anything we've learned in the previous lessons and continue on when you are ready.

Exercise 15: Matching other special characters

Task	Text
match	The quick brown fox jumps over the lazy dog.
match	There were 614 instances of students getting 90.0% or above.
match	The FCC had to censor the network for saying &$#*@!.

Solution

This lesson is more of a sandbox for you to play with some sample text. The lazy solution to this, can be the simple expression .* :)

Solve the above task to continue on to the next problem, or read the Solution.

Lesson Notes

	abc…	Letters
	123…	Digits
	\d	Any Digit
	\D	Any Non-digit character
	.	Any Character
	\.	Period
	[abc]	Only a, b, or c
	[^abc]	Not a, b, nor c
	[a-z]	Characters a to z
	[0-9]	Numbers 0 to 9
	\w	Any Alphanumeric character
	\W	Any Non-alphanumeric character
	{m}	m Repetitions
	{m,n}	m to n Repetitions
	*	Zero or more repetitions
	+	One or more repetitions
	?	Optional character
	\s	Any Whitespace
	\S	Any Non-whitespace character
	^…$	Starts and ends
	(…)	Capture Group
	(a(bc))	Capture Sub-group
	(.*)	Capture all
	(abc\|def)	Matches abc or def