Text is always made up of symbols. Letters are symbols. Punctuation marks are symbols. When using regexes, we break symbols down into two types: characters and metacharacters, which you can think of as 'normal' and 'special'.
Here's the tricky bit: When we use regexes, we are invariably using characters and metacharacters in the same piece of text.
So there are a bunch of symbols that can mean either what they normally mean in English (or any other language) or a special pattern matching instruction. (btw - yes, there is an easy way to tell the difference. I'll get to that.)
Here's an example: The symbols . and ?
AS CHARACTERS
. means 'this is the end of the sentence'
? means 'this is the end of the sentence, and this sentence is a question'
AS METACHARACTERS
. means 'any character'
? means 'zero or one of the character before the ?'
Totally different. The best thing to do is ignore your lifelong understanding of what . and ? mean as characters. The relationship between the character meaning and the metacharacter meaning is completely arbitrary.
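To see the metacharacter meanings in action, here's a quick sketch. I'm assuming Python and its built-in re module purely for illustration, and the sample strings are made up; other regex flavours behave the same way for these two symbols.

```python
import re

# '.' as a metacharacter: any single character
# (in Python, by default, anything except a newline)
dot_matches = re.findall(r"c.t", "cat cot c.t cut ct")
print(dot_matches)  # ['cat', 'cot', 'c.t', 'cut']

# '?' as a metacharacter: zero or one of the character before the ?
# Here the 'u' is optional, so both spellings match.
q_matches = re.findall(r"colou?r", "color colour colouur")
print(q_matches)  # ['color', 'colour']
```

Notice that 'ct' doesn't match the first pattern (the . insists on exactly one character in the middle) and 'colouur' doesn't match the second (the ? allows at most one u).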
This still leaves us with the question: If they're the same symbols, how do you tell them apart?
