MySQL provides standard SQL pattern matching as well as a form of pattern matching based on extended regular expressions similar to those used by Unix utilities such as vi, grep, and sed.
SQL pattern matching allows you to use
“_
” to match any single
character and “%
” to match an
arbitrary number of characters (including zero characters). In
MySQL, SQL patterns are case-insensitive by default. Some
examples are shown here. You do not use =
or <>
when you use SQL patterns; use
the LIKE
or
NOT LIKE
comparison operators
instead.
To find names beginning with
“b
”:
mysql> SELECT * FROM pet WHERE name LIKE 'b%';
+--------+--------+---------+------+------------+------------+
| name | owner | species | sex | birth | death |
+--------+--------+---------+------+------------+------------+
| Buffy | Harold | dog | f | 1989-05-13 | NULL |
| Bowser | Diane | dog | m | 1989-08-31 | 1995-07-29 |
+--------+--------+---------+------+------------+------------+
To find names ending with
“fy
”:
mysql> SELECT * FROM pet WHERE name LIKE '%fy';
+--------+--------+---------+------+------------+-------+
| name | owner | species | sex | birth | death |
+--------+--------+---------+------+------------+-------+
| Fluffy | Harold | cat | f | 1993-02-04 | NULL |
| Buffy | Harold | dog | f | 1989-05-13 | NULL |
+--------+--------+---------+------+------------+-------+
To find names containing a
“w
”:
mysql> SELECT * FROM pet WHERE name LIKE '%w%';
+----------+-------+---------+------+------------+------------+
| name | owner | species | sex | birth | death |
+----------+-------+---------+------+------------+------------+
| Claws | Gwen | cat | m | 1994-03-17 | NULL |
| Bowser | Diane | dog | m | 1989-08-31 | 1995-07-29 |
| Whistler | Gwen | bird | NULL | 1997-12-09 | NULL |
+----------+-------+---------+------+------------+------------+
To find names containing exactly five characters, use five
instances of the “_
” pattern
character:
mysql> SELECT * FROM pet WHERE name LIKE '_____';
+-------+--------+---------+------+------------+-------+
| name | owner | species | sex | birth | death |
+-------+--------+---------+------+------------+-------+
| Claws | Gwen | cat | m | 1994-03-17 | NULL |
| Buffy | Harold | dog | f | 1989-05-13 | NULL |
+-------+--------+---------+------+------------+-------+
The other type of pattern matching provided by MySQL uses
extended regular expressions. When you test for a match for
this type of pattern, use the
REGEXP
and NOT
REGEXP
operators (or
RLIKE
and
NOT RLIKE
,
which are synonyms).
The following list describes some characteristics of extended regular expressions:
“.
” matches any single
character.
A character class “[...]
”
matches any character within the brackets. For example,
“[abc]
” matches
“a
”,
“b
”, or
“c
”. To name a range of
characters, use a dash.
“[a-z]
” matches any
letter, whereas “[0-9]
”
matches any digit.
“*
” matches zero or more
instances of the thing preceding it. For example,
“x*
” matches any number of
“x
” characters,
“[0-9]*
” matches any
number of digits, and “.*
”
matches any number of anything.
A REGEXP
pattern match
succeeds if the pattern matches anywhere in the value
being tested. (This differs from a
LIKE
pattern match, which
succeeds only if the pattern matches the entire value.)
To anchor a pattern so that it must match the beginning or
end of the value being tested, use
“^
” at the beginning or
“$
” at the end of the
pattern.
To demonstrate how extended regular expressions work, the
LIKE
queries shown previously are
rewritten here to use REGEXP
.
To find names beginning with
“b
”, use
“^
” to match the beginning of
the name:
mysql> SELECT * FROM pet WHERE name REGEXP '^b';
+--------+--------+---------+------+------------+------------+
| name | owner | species | sex | birth | death |
+--------+--------+---------+------+------------+------------+
| Buffy | Harold | dog | f | 1989-05-13 | NULL |
| Bowser | Diane | dog | m | 1989-08-31 | 1995-07-29 |
+--------+--------+---------+------+------------+------------+
Prior to MySQL 3.23.4, REGEXP
is
case sensitive, and the previous query will return no rows. In
this case, to match either lowercase or uppercase
“b
”, use this query instead:
mysql> SELECT * FROM pet WHERE name REGEXP '^[bB]';
From MySQL 3.23.4 on, if you really want to force a
REGEXP
comparison to be case
sensitive, use the BINARY
keyword
to make one of the strings a binary string. This query matches
only lowercase “b
” at the
beginning of a name:
mysql> SELECT * FROM pet WHERE name REGEXP BINARY '^b';
To find names ending with
“fy
”, use
“$
” to match the end of the
name:
mysql> SELECT * FROM pet WHERE name REGEXP 'fy$';
+--------+--------+---------+------+------------+-------+
| name | owner | species | sex | birth | death |
+--------+--------+---------+------+------------+-------+
| Fluffy | Harold | cat | f | 1993-02-04 | NULL |
| Buffy | Harold | dog | f | 1989-05-13 | NULL |
+--------+--------+---------+------+------------+-------+
To find names containing a
“w
”, use this query:
mysql> SELECT * FROM pet WHERE name REGEXP 'w';
+----------+-------+---------+------+------------+------------+
| name | owner | species | sex | birth | death |
+----------+-------+---------+------+------------+------------+
| Claws | Gwen | cat | m | 1994-03-17 | NULL |
| Bowser | Diane | dog | m | 1989-08-31 | 1995-07-29 |
| Whistler | Gwen | bird | NULL | 1997-12-09 | NULL |
+----------+-------+---------+------+------------+------------+
Because a regular expression pattern matches if it occurs anywhere in the value, it is not necessary in the previous query to put a wildcard on either side of the pattern to get it to match the entire value like it would be if you used an SQL pattern.
To find names containing exactly five characters, use
“^
” and
“$
” to match the beginning and
end of the name, and five instances of
“.
” in between:
mysql> SELECT * FROM pet WHERE name REGEXP '^.....$';
+-------+--------+---------+------+------------+-------+
| name | owner | species | sex | birth | death |
+-------+--------+---------+------+------------+-------+
| Claws | Gwen | cat | m | 1994-03-17 | NULL |
| Buffy | Harold | dog | f | 1989-05-13 | NULL |
+-------+--------+---------+------+------------+-------+
You could also write the previous query using the
{
(“repeat-n
}n
-times”)
operator:
mysql> SELECT * FROM pet WHERE name REGEXP '^.{5}$';
+-------+--------+---------+------+------------+-------+
| name | owner | species | sex | birth | death |
+-------+--------+---------+------+------------+-------+
| Claws | Gwen | cat | m | 1994-03-17 | NULL |
| Buffy | Harold | dog | f | 1989-05-13 | NULL |
+-------+--------+---------+------+------------+-------+
Section 11.4.2, “Regular Expressions”, provides more information about the syntax for regular expressions.
User Comments
Note for regular users of Regex. MySQL doesn't seem to support use of ? as a greedy operator.
ie use this:
UPDATE pages SET sitesection = 'Softwarey' WHERE pages.path REGEXP '^/for_sale/[^/]*\\.html';
NOT
UPDATE pages SET sitesection = 'Softwarey' WHERE pages.path REGEXP '^/for_sale/[^/]*?\\.html';
In reply to the previews comment.
I believe that what you have in your example is a PERL extension which is not part of the POSIX specification for regular expressions and is not important unless you need to extract the matched substrings (which is not the case).
Match a correctly-formatted email address:
RLIKE '^[[:alnum:][.period.][.hyphen.][.underscore.]]+@([[:alnum:][.hyphen.][.underscore.]]+[.period.])+[[:alnum:][.hyphen.]]{2,3}$
I was finding that I was getting an empty set (zero results) when attempting the 'find names ending with 'fy' example' (and any other queries for ends of names). I discovered that the cause was that my names had trailing whitespace, presumably caused by the loading data from file process in section 3.3.3 of the documentation.
Trailing whitespace can be removed by using 'TRIM()' like so:
UPDATE pet SET name = TRIM(name);
After I performed this I find the results matched that of the example. You may wish to do the same with the other fields, which can be done by replacing 'name' with the field name.
Whether you encounter this problem will depend on the version of MySQL you are using. According to '11.4.1. The CHAR and VARCHAR Types' of the documentation (http://dev.mysql.com/doc/refman/5.0/en/char.html) as of version 5.0.3 of MySQL trailing spaces are retained when values are stored and retrieved.
I think it is probably better to trim the data before storing it in the database as it would save some storage and simplify the query.
Right way to found correctly-formatted email address is
RLIKE "^[[:alnum:][.period.][.hyphen.][.underscore.]]+@([[:alnum:][.hyphen.][.underscore.]]+[.period.])+[[:alnum:][.hyphen.]]{2,4}$"
Don't forget about ".info" domains
with recent introduction of .travel domains, the correct email matching regexp should end with {2,6}$
Guys the above you'll realise it will allow addresses like markx@test
therefore I tried
RLIKE '^[[:alnum:][.period.][.hyphen.][.underscore.]]+@([[:alnum:][.hyphen.][.underscore.]]+[.period.])+.([[:alnum:][.hyphen.]]){2,4}$'
and it worked perfect
To find names ending with “fy”:
mysql> SELECT * FROM pet WHERE name LIKE '%fy';
Here I had to use [...] LIKE '%fy_';
Apperantly there is an extra character, probably the tab or something, that is included.
Add your own comment.