Problem 3: Matching emails

When you are dealing with HTML forms, it's often useful to validate the form input against regular expressions. Emails in particular are difficult to match correctly because the specification is complex, so I would recommend using a built-in language or framework function instead of rolling your own. However, using what we've learned so far, you can fairly easily build a regular expression that matches a great many common emails.

One thing to watch out for is that many people use plus addressing for one-time use, such as "name+filter@gmail.com", which is delivered directly to "name@gmail.com" but can be filtered using the extra information. In addition, some domains have more than one component: for example, you can register a domain at "hellokitty.hk.com" and have an email of the form "ilove@hellokitty.hk.com", so you will have to be careful when matching the domain portion of the email.

Below are a few common emails. In this example, try to capture the name of each email, excluding the filter (the + character and everything after it) and the domain (the @ character and everything after it).

Exercise 3: Matching emails
Task      Text                                  Capture Groups
capture   tom@hogwarts.com                      tom
capture   tom.riddle@hogwarts.com               tom.riddle
capture   tom.riddle+regexone@hogwarts.com      tom.riddle
capture   tom@hogwarts.eu.com                   tom
capture   potter@hogwarts.com                   potter
capture   harry@hogwarts.com                    harry
capture   hermione+regexone@hogwarts.com        hermione
Solution

To extract the beginning of each email, we can use a simple expression ^([\w\.]*) which captures the leading run of word characters (letters, digits, and underscores) and periods. The match stops at the first character outside that set, which in these examples is an '@' or a '+'.
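
To check the expression against the exercise rows, here is a minimal Python sketch using the standard re module. The names NAME_PATTERN and capture_name are made up for this illustration and are not part of the lesson.

```python
import re

# The expression from the solution: capture the leading run of
# word characters and periods at the start of the string.
NAME_PATTERN = re.compile(r"^([\w\.]*)")

# The sample emails from the exercise table.
emails = [
    "tom@hogwarts.com",
    "tom.riddle@hogwarts.com",
    "tom.riddle+regexone@hogwarts.com",
    "tom@hogwarts.eu.com",
    "potter@hogwarts.com",
    "harry@hogwarts.com",
    "hermione+regexone@hogwarts.com",
]


def capture_name(email: str) -> str:
    """Return the first capture group (hypothetical helper for this sketch)."""
    match = NAME_PATTERN.match(email)
    # The * quantifier allows an empty match, so match is never None here,
    # but the guard keeps the helper safe if the pattern is changed.
    return match.group(1) if match else ""


names = [capture_name(email) for email in emails]

for email, name in zip(emails, names):
    print(f"{email} -> {name}")
```

Because the character class contains neither '+' nor '@', the capture ends at whichever of the two comes first, so the filter and the domain are both left out.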

Again, you should probably use a framework to match emails!
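
As one illustration of leaning on a library rather than a hand-written pattern, the sketch below uses parseaddr from Python's standard email.utils module to pull the address out of a string and then splits it with plain string methods. The split_address helper is hypothetical, and this only separates the parts; it does not validate that the address is well formed or deliverable.

```python
from email.utils import parseaddr
from typing import Tuple


def split_address(raw: str) -> Tuple[str, str, str]:
    """Split one address into (name, filter, domain).

    Hypothetical helper for illustration; it performs no validation.
    """
    # parseaddr returns (display name, address); only the address is used.
    _display, address = parseaddr(raw)
    # Split on the last '@' to separate the local part from the domain.
    local, _, domain = address.rpartition("@")
    # Split the local part on the first '+' to separate the plus-address filter.
    name, _, tag = local.partition("+")
    return name, tag, domain


parts = split_address("Tom Riddle <tom.riddle+regexone@hogwarts.eu.com>")
print(parts)
```

Splitting on the last '@' keeps a multi-component domain such as "hogwarts.eu.com" in one piece, and an address without a '+' simply yields an empty filter.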
