Lesson 12: Nested groups

When you are working with complex data, you can easily find yourself having to extract multiple layers of information, which can result in nested groups. Generally, the results of the captured groups are in the order in which they are defined (in order by open parenthesis).

Take the example from the previous lesson, of capturing the filenames of all the image files you have in a list. If each of these image files had a sequential picture number in the filename, you could extract both the filename and the picture number using the same pattern by writing an expression like ^(IMG(\d+))\.png$ (using a nested parenthesis to capture the digits).

The nested groups are read from left to right in the pattern, with the first capture group being the contents of the first parentheses group, etc.

For the following strings, write an expression that matches and captures both the full date, as well as the year of the date.

Exercise 12: Matching nested groups
Task Text Capture Groups  
capture Jan 1987 Jan 1987 1987 To be completed
capture May 1969 May 1969 1969 To be completed
capture Aug 2011 Aug 2011 2011 To be completed
Solution

This expression requires capturing two parts of the data, both the year and the whole date. This requires using nested capture groups, as in the expression (\w+ (\d+)).

We can alternatively use \s+ in lieu of the space, to catch any number of whitespace between the month and the year.

Solve the above task to continue on to the next problem, or read the Solution.