At first glance, writing a regular expression to match a number should be easy, right?
We have the \d special character to match any digit, so all we need to add is the decimal point, right? For simple numbers that may be true, but when working with scientific or financial numbers, you often have to deal with positive and negative values, significant digits, exponents, and even different representations (like the comma used to separate thousands and millions).
Below are a few different number formats that you might encounter. Notice that you will have to match the decimal point itself by escaping it as \. because the bare dot metacharacter matches an arbitrary character. If you are having trouble skipping the last string, notice how that line ends compared with the rest: it finishes with a letter rather than a digit.
Task | Text
match | 3.14529
match | -255.34
match | 128
match | 1.9e10
match | 123,340.00
skip | 720p
Solution | The expression for this can become quite complicated once you account for fractional parts, exponents, and more. For the above examples, the expression ^-?\d+(,\d+)*(\.\d+(e\d+)?)?$ matches a string that starts with an optional negative sign, followed by one or more digits, then zero or more groups of a comma and more digits, then an optional fractional component consisting of a period and one or more digits, which may itself end with an optional exponent: the letter e followed by more digits. The ^ and $ anchors require the whole line to match, which is why 720p is skipped. This is not the only solution; many other expressions match this set of number strings.
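
To check the expression against the sample strings, it can help to run it in a real regex engine. The sketch below uses Python's `re` module, which is an assumption on our part since the lesson itself is engine-agnostic. The names `NUMBER_RE`, `LOOSE_RE`, `STRICT_RE`, and `is_number` are made up for illustration.

```python
import re

# The pattern quoted in the solution, compiled unchanged.
NUMBER_RE = re.compile(r"^-?\d+(,\d+)*(\.\d+(e\d+)?)?$")

# Hypothetical patterns that show why the decimal point must be escaped:
# an unescaped dot matches any character, not just a period.
LOOSE_RE = re.compile(r"^\d+.\d+$")
STRICT_RE = re.compile(r"^\d+\.\d+$")

# The sample strings from the exercise table.
SAMPLES = ["3.14529", "-255.34", "128", "1.9e10", "123,340.00", "720p"]


def is_number(text: str) -> bool:
    """Return True if the whole string matches the solution pattern."""
    return NUMBER_RE.match(text) is not None


if __name__ == "__main__":
    for text in SAMPLES:
        label = "match" if is_number(text) else "skip"
        print(f"{label:5} | {text}")

    # "3x14" slips through the unescaped dot but not the escaped one.
    print(bool(LOOSE_RE.match("3x14")), bool(STRICT_RE.match("3x14")))
```

Testing the pattern this way also exposes its limits. It accepts loosely grouped input such as 1,2,3 because the comma groups are not restricted to three digits, and it rejects exponents without a fractional part, such as 1e5. Whether that matters depends on the numbers you expect to see.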