Pattern matching and regular expressions

Character Classes

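Character classes are written as [:class:] inside a [] glob. Bash accepts the following class names, most of which are defined by the POSIX standard (ascii and word are extensions):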

alnum alpha ascii blank cntrl digit graph lower print punct space upper word xdigit

Inside [] more than one character class or range can be used, e.g.,

$ echo a[a-z[:blank:]0-9]*

will match any file whose name starts with an a, followed by a lowercase letter, a blank, or a digit.
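As a minimal sketch of the pattern above, the following script creates a few sample files in a scratch directory and expands the glob into an array. The file names and the variable names (tmpdir, matches) are made up for illustration. LC_ALL=C is set on the assumption that a-z should cover only the lowercase ASCII letters; other locales may treat ranges differently.

```shell
# Work in a throwaway directory so the example does not touch real files.
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
cd "$tmpdir" || exit 1

# Assumption: the C locale, so that a-z means exactly the ASCII lowercase letters.
LC_ALL=C

# Hypothetical sample files.
touch ab a1 'a b' aB a_ b1

# Expand the glob into an array; each element is one matching file name.
matches=(a[a-z[:blank:]0-9]*)

# Prints "a b", "a1" and "ab", one per line; aB, a_ and b1 are not matched.
printf '%s\n' "${matches[@]}"
```

Collecting the result in an array rather than passing it to echo keeps a name such as "a b" intact as a single element.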

Keep in mind, though, that a [] glob can only be negated as a whole, not in part. The negating character must be the first character following the opening [. For example, this expression matches all files that do not start with an a:

$ echo [^a]*

The following matches all files that start with either a letter or a ^:

$ echo [[:alpha:]^a]*

It does not restrict the match to files or folders that start with a letter other than a, because the ^ is not the first character after the opening [ and is therefore interpreted as a literal ^.
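The two negation cases can be compared side by side. This is only a sketch: the file names and the array names (negated, literal) are invented for the example, and the C locale is assumed so that the sort order of the results is predictable.

```shell
# Scratch directory with a few hypothetical files.
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
cd "$tmpdir" || exit 1
LC_ALL=C

touch apple banana 1one '^caret'

# ^ directly after the opening [ negates the whole set:
# every name that does not start with an a.
negated=([^a]*)

# ^ later in the set is a literal character:
# every name that starts with a letter or with a ^.
literal=([[:alpha:]^a]*)

printf 'negated: %s\n' "${negated[*]}"   # negated: 1one ^caret banana
printf 'literal: %s\n' "${literal[*]}"   # literal: ^caret apple banana
```

Note that apple appears in the second result: the trailing a adds nothing to the set, because a is already covered by [:alpha:].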

Escaping glob characters

A file or folder name may itself contain a glob character. In this case the glob character can be escaped with a preceding \ so that it is matched literally. Another approach is to enclose the name in double ("") or single ('') quotes. Bash does not process globs that are enclosed within "" or ''.
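A short sketch of the three escaping styles, assuming a hypothetical file whose name contains a literal * (the file and array names are illustrative only):

```shell
# Scratch directory with one awkward name and one ordinary name.
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
cd "$tmpdir" || exit 1
LC_ALL=C

touch 'a*b' axb

unescaped=(a*b)    # * acts as a glob and matches both files
escaped=(a\*b)     # backslash makes the * literal
dquoted=("a*b")    # double quotes suppress globbing
squoted=('a*b')    # single quotes suppress globbing

printf 'unescaped: %s\n' "${unescaped[*]}"   # unescaped: a*b axb
printf 'escaped:   %s\n' "${escaped[*]}"     # escaped:   a*b
printf 'dquoted:   %s\n' "${dquoted[*]}"     # dquoted:   a*b
printf 'squoted:   %s\n' "${squoted[*]}"     # squoted:   a*b
```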

Difference to Regular Expressions

The most significant difference between globs and Regular Expressions is that a Regular Expression separates a qualifier from a quantifier. A qualifier identifies what to match and a quantifier tells how often to match the qualifier; when the quantifier is omitted, the qualifier is matched exactly once. The equivalent RegEx to the * glob is .* where . stands for any character and * stands for zero or more matches of the preceding qualifier. The equivalent RegEx for the ? glob is . or, with the quantifier written out, .{1}. As before, the qualifier . matches any character and the {1} indicates to match the preceding qualifier exactly once. This should not be confused with the ? quantifier, which matches the preceding qualifier zero times or once in a RegEx. The [] glob can be used in the same way in a RegEx, where it may be followed by a quantifier.
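The correspondence described above can be checked directly in Bash, which compares a string against a glob with == and against a POSIX extended regular expression with =~ inside [[ ]]. The helper names glob_match and regex_match are hypothetical and exist only for this sketch. Because =~ also succeeds on a partial match, the regular expressions are anchored with ^ and $ so that they cover the whole string, as a glob does.

```shell
# Hypothetical helpers: print "yes" or "no" depending on whether the string
# in $1 matches the glob (or regular expression) in $2.
glob_match()  { [[ $1 == $2 ]] && echo yes || echo no; }
regex_match() { [[ $1 =~ $2 ]] && echo yes || echo no; }

# Glob * corresponds to RegEx .*
glob_match  abbbc 'a*c'        # yes
regex_match abbbc '^a.*c$'     # yes

# Glob ? corresponds to RegEx . (or .{1})
glob_match  abc 'a?c'          # yes
regex_match abc '^a.c$'        # yes
regex_match abc '^a.{1}c$'     # yes

# The RegEx ? is a quantifier: the preceding b is optional.
regex_match ac '^ab?c$'        # yes
# The glob ? still demands exactly one character after the b.
glob_match  ac 'ab?c'          # no

# [] works in both; in a RegEx a quantifier such as + may follow it.
glob_match  a5   'a[0-9]'      # yes
regex_match a5   '^a[0-9]$'    # yes
regex_match a555 '^a[0-9]+$'   # yes
```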

Equivalent Regular Expressions

| Glob | RegEx |
|------|-------|
| * | .* |
| ? | . |
| [] | [] |

