Lesson 7: Mr. Kleene, Mr. Kleene

A powerful concept in regular expressions is the ability to match an arbitrary number of characters. For example, imagine that you wrote a form that has a donation field that takes a numerical value in dollars. A wealthy user may drop by and want to donate $25,000, while a normal user may want to donate $25.

One way to express such a pattern is to use the Kleene Star (*) and the Kleene Plus (+), which match zero or more and one or more, respectively, of the character or group they follow (a quantifier always follows a character or group). For example, to match the donations above, we can use the pattern \d* to match any number of digits, but a tighter regular expression would be \d+, which ensures that the input string has at least one digit.
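To make the difference between the two quantifiers concrete, here is a minimal sketch in Python using the standard re module. The helper name is_valid_amount and the sample inputs are assumptions made for illustration; they are not part of the lesson.

```python
import re

# \d* allows zero or more digits, \d+ requires at least one.
star = re.compile(r"\d*")
plus = re.compile(r"\d+")

def is_valid_amount(pattern, text):
    """Hypothetical helper: True if the whole string fits the pattern."""
    return pattern.fullmatch(text) is not None

for amount in ["25", "25000", ""]:
    # The empty string passes \d* but is rejected by \d+.
    print(repr(amount), is_valid_amount(star, amount), is_valid_amount(plus, amount))
```

Note that neither pattern accepts a thousands separator, so an input written as 25,000 would need to be normalized to 25000 before this check.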

These quantifiers can be used with any character or special metacharacter; for example, a+ (one or more a's), [abc]+ (one or more of any a, b, or c character) and .* (zero or more of any character).
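The three patterns above can be tried out with a short Python sketch. The sample strings and variable names are made up for illustration.

```python
import re

# a+ : each run of one or more a's is returned as a single match
runs_of_a = re.findall(r"a+", "caaandy baa")

# [abc]+ : runs made up of any mix of a, b, or c
runs_of_abc = re.findall(r"[abc]+", "cabbage batch")

# .* : zero or more of any character, so it also matches the empty string
whole_line = re.match(r".*", "anything at all").group()
empty_line = re.match(r".*", "").group()

print(runs_of_a)    # ['aaa', 'aa']
print(runs_of_abc)  # ['cabba', 'ba', 'c']
print(repr(whole_line), repr(empty_line))
```

By default the dot does not match a newline in Python, so .* stops at the end of a line unless the re.DOTALL flag is passed.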

Below are a few simple strings that you can match using both the star and plus metacharacters.

Exercise 7: Matching repeated characters
Task   Text
match  aaaabcc
match  aabbbbc
match  aacc
skip   a
Solution

Each line to match contains at least two a's, zero or more b's, and at least one c, so you can use the expression aa+b*c+ to represent this exactly. The line to skip has only a single a, so it fails the aa+ part.

Alternatively, an even more restrictive expression would be a{2,4}b{0,4}c{1,2} which puts both an upper and lower bound on the number of each of the characters.
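As a quick check, the following Python sketch runs both solution patterns against the exercise strings. The check function and the list names are assumptions introduced for this example; the exercise itself only asks for the pattern.

```python
import re

should_match = ["aaaabcc", "aabbbbc", "aacc"]
should_skip = ["a"]

patterns = [r"aa+b*c+", r"a{2,4}b{0,4}c{1,2}"]

def check(pattern):
    """Hypothetical helper: True if the pattern matches every 'match'
    line in full and none of the 'skip' lines."""
    regex = re.compile(pattern)
    matches_all = all(regex.fullmatch(s) for s in should_match)
    skips_all = not any(regex.fullmatch(s) for s in should_skip)
    return matches_all and skips_all

for pattern in patterns:
    print(pattern, check(pattern))
```

The sketch uses fullmatch so that the entire line has to fit the pattern, which is a stricter test than finding the pattern somewhere inside the line.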
