Regular expressions originated in mathematics and automata theory. A regular expression is simply a notation that defines a regular language. Without going too far into what "regular" means, think of a language this way:
- A language is made up of strings. English is a language, for example, and it's made of strings.
- Those strings are made of symbols, and the set of available symbols is called an alphabet. So a string is just a concatenation of symbols from the alphabet.
So any given string (which is, remember, just a concatenation of symbols) is either part of a given language or it is not.
So let's say you have an alphabet made of 2 symbols: "0" and "1". And let's say you want to create a language using the symbols in that alphabet. You could create the following rule: "In order for a string to be in my language, it must have only 0's and 1's in it."
So these strings are in your language:
These would not be in your language:
That's a pretty simple language. How about this: "In my language, each string [analogous to a valid 'word' in English] must begin with a 0, and can then be followed by any number of 0's or 1's."
These are in the language:
- 0111111
- 0000000
- 0101010110001
These are not:
Well, rather than defining the language using words (and these languages might get very complex: "1 followed by two 0's, followed by any combination of 1's and 0's, ending with a 1"), we came up with a syntax called "regular expressions" to define the language.
The first language would have been:
(0|1)*
(0 or 1, repeated any number of times, including zero)
The next: 0(0|1)*
(0, followed by any number of 0's and 1's).
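To make this concrete, here is a minimal Python sketch (Python is just my choice for illustration, and the helper name in_language is made up). It checks a few strings against both expressions. The last two test strings, "1000" and "0120", are my own additions. re.fullmatch succeeds only when the entire string matches the pattern, which is the same as asking whether the string is in the language.

```python
import re

# The two languages from above, written as regular expressions.
ONLY_BITS = re.compile(r"(0|1)*")           # any number of 0's and 1's
STARTS_WITH_ZERO = re.compile(r"0(0|1)*")   # a 0, then any number of 0's and 1's


def in_language(pattern, s):
    """Return True if the whole string s matches pattern (hypothetical helper)."""
    return pattern.fullmatch(s) is not None


# For each string, print whether it is in the first and the second language.
for s in ["0111111", "0000000", "0101010110001", "1000", "0120"]:
    print(s, in_language(ONLY_BITS, s), in_language(STARTS_WITH_ZERO, s))
```

Note that (0|1)* also accepts the empty string, because * allows zero repetitions.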
So let's think of programming now. When you create a regex, you are saying "Look at this text. Return to me the strings which match this pattern." Which is really saying "I have defined a language. Return to me all strings within this document which are in my language."
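Here is a rough sketch of that "return all strings in my language" idea, again in Python. The sample text is made up, and I am assuming that a "string within the document" means a whole whitespace-separated token, which is why the pattern is wrapped in \b word boundaries.

```python
import re

# Hypothetical sample text; the pattern is the "begins with 0" language from above.
document = "ids: 0101 1100 0 0012 777 0110"

# \b marks a word boundary, so only whole tokens are reported.
# (?:...) groups without capturing, so findall returns the full matches.
pattern = re.compile(r"\b0(?:0|1)*\b")

matches = pattern.findall(document)
print(matches)  # ['0101', '0', '0110']
```

"1100" is skipped because it does not begin with 0, and "0012" is skipped because it contains a symbol outside the alphabet.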
So when you create a "regex", you are actually defining a regular language, which is a mathematical concept. (In actuality, Perl-style regexes can define nonregular languages, because of features such as backreferences, but that is a separate issue.)
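If you are curious about that parenthetical, here is a small sketch of one way a Perl-style regex goes beyond regular languages. It uses a backreference, which Python's re module supports. The language "some 0's, then a 1, then exactly the same number of 0's" is not regular, yet the pattern below accepts it. The example strings are my own.

```python
import re

# (0+) captures a run of 0's; \1 demands exactly the same run again after the 1.
same_zeros = re.compile(r"(0+)1\1")

print(same_zeros.fullmatch("00100") is not None)  # True: two 0's on each side
print(same_zeros.fullmatch("0010") is not None)   # False: the counts differ
```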
By learning the syntax of regex, you are learning the ins and outs of how to define a language, so that later you can check whether a given string is "in" the language. This is why people commonly say that regexes are for pattern matching: checking whether a string "matches" a pattern is the same as checking whether it follows the rules of your language.
(This was long. Does it answer your question at all?)