A regular expression is a combination of symbols and letters that represents a large variety of patterns in a succinct manner. The symbols and special letters can be combined with non-special letters and symbols in countless ways to match part of a character string.
Many beginning programmers have trouble understanding what a regular expression does. Rather than writing a series of if-statements, one for each string that fits a given word combination, we can use a regular expression to express that combination in a single phrase.
Let's say we had a series of three-letter words: cat, car, cap, cab, cad, can, and caw. If we just wanted those words that ended with a letter from the first half of the alphabet, how would we go about it?
If we used if-statements, we would have to use thirteen of them -- one for every letter in the first half of the alphabet. It would look something like this:
words = ['cat', 'car', 'cap', 'cab', 'cad', 'can', 'caw']
for x in words:
    if x == "caa":
        print(x)  # do something with the word
    if x == "cab":
        print(x)
    if x == "cac":
        print(x)
    # ... and so on, through "cam"
Needless to say, this is less than optimal.
Regular expressions, on the other hand, allow us to express all of those if-statements in one expression and two lines:
import re
words = ['cat', 'car', 'cap', 'cab', 'cad', 'can', 'caw']
# [a-m] covers the thirteen letters in the first half of the alphabet
pattern = re.compile('ca[a-m]')
for word in words:
    if pattern.match(word):
        print(word)
The output:
cab
cad
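One detail worth keeping in mind: match only checks the beginning of a string, so a pattern of this kind would also accept a longer word that merely starts with a matching prefix. The following is a minimal sketch of the difference, assuming Python 3.4 or later (where fullmatch is available). The longer words in the list are made up for illustration.

```python
import re

# Hypothetical word list: a few three-letter words plus two longer ones.
words = ['cat', 'cab', 'cad', 'cabin', 'cadet']
pattern = re.compile('ca[a-m]')

# match() anchors only at the start, so longer words slip through.
prefix_hits = [w for w in words if pattern.match(w)]

# fullmatch() requires the entire string to fit the pattern.
exact_hits = [w for w in words if pattern.fullmatch(w)]

print(prefix_hits)
print(exact_hits)
```

For a list that contains only three-letter words, as in the example above, the two methods give the same result.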
To use the compile and match methods, you must import the re module, part of the Python Standard Library. If you find that this brief discussion has whetted your appetite for more about regular expressions in Python, you should read "Forming a Regular Expression in Python." When you understand what regular expressions are and what they can do for you, you will find the "Python Regex Glossary" to be a helpful reference for forming regular expressions in Python.
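As a small illustration of those two methods: compile returns a reusable pattern object, and match returns a match object on success or None on failure, which is why its result can be tested directly in an if-statement. This is only a sketch using the standard re module. The variable names are illustrative.

```python
import re

pattern = re.compile('ca[a-m]')

hit = pattern.match('cab')    # a match object, which is truthy
miss = pattern.match('cat')   # None, because 't' falls outside a-m

print(hit.group())            # the text that matched
print(miss)

# The module-level function gives the same answer without an explicit compile step.
same_result = re.match('ca[a-m]', 'cad') is not None
print(same_result)
```

Compiling first is mainly a convenience when the same pattern is applied to many strings, as in the loop shown earlier.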


