Trudy Pelzer is a senior software architect with MySQL AB, co-author of SQL Performance Tuning, and lives in Edmonton, Alberta.
Jim Winstead is a lead web developer with MySQL AB, and lives in Los Angeles, California.
This is the final article of our series of articles that explain some the new features in MySQL 4.1, which is now available as a generally-available (GA, or production) release.
By Trudy Pelzer and Jim Winstead
We've covered all of the major new features, but there's still a slew of other new features that are more minor in scope (but not necessarily in utility).
The new HELP
statement allows you to get help on various topics
from the server. You can use HELP contents
to get a list of
available topics.
mysql> HELP contents; You asked for help about help category: "Contents" For more information, type 'help- ', where item is one of the following categories : Administration Column Types Data Definition Data Manipulation Functions Geographic features Transactions mysql> HELP Functions; You asked for help about help category: "Functions" For more information, type 'help
- ', where item is one of the following topics : CHAR BYTE DUAL FUNCTION TRUE FALSE categories : Bit Functions Comparison operators Control flow functions Date and Time Functions Encryption Functions Functions and Modifiers for Use with GROUP BY Clauses Information Functions Logical operators Miscellaneous Functions Numeric Functions String Functions mysql> HELP DUAL; Name: 'DUAL' Description: SELECT ... FROM DUAL is an alias for SELECT .... (To be compatible with some other databases).
If you upgraded to MySQL 4.1 from an earlier version, you may need to make
sure that the tables used to store the HELP information have been loaded with
data. The file fill_help_tables.sql
is included with the binary
distributions, and it can simply be loaded in the mysql
database.
mysql -uroot mysql < /path/to/fill_help_tables.sql
A recurring question on the MySQL mailing
lists is how to find out what exactly the warnings are that are reported in
summary form after running a command like LOAD DATA INFILE
. Now
with MySQL 4.1's SHOW WARNINGS
command, you can get access to the
notes, warnings and errors that a statement might generate. (The SHOW ERRORS
statement will just show errors.)
mysql> DROP TABLE IF EXISTS no_such_table; Query OK, 0 rows affected, 1 warning (0.00 sec) mysql> SHOW WARNINGS; +-------+------+-------------------------------+ | Level | Code | Message | +-------+------+-------------------------------+ | Note | 1051 | Unknown table 'no_such_table' | +-------+------+-------------------------------+ 1 row in set (0.01 sec)
The SHOW WARNINGS Syntax
section of the reference manual has an example of the warnings reported when
data is truncated during an INSERT
statement. (Which you'll be
able to turn into errors in MySQL 5.0, but that's a topic for our "New in
5.0" series....)
The new ON DUPLICATE KEY UPDATE ...
clause for
INSERT
statements allows special handling when an inserted row
would cause a duplicate value in a UNIQUE
index (or a
PRIMARY KEY
). For example, you could record votes in a poll system
using a statement like the following:
mysql> INSERT INTO poll (favorite, votes) VALUES ('Hamburger', 1) -> ON DUPLICATE KEY UPDATE votes = votes + 1;
You can use the new VALUES()
function to refer to column values
from the INSERT
part of the query.
mysql> INSERT INTO poll (favorite, votes, last_voter) VALUES ('Hamburger', 1, 'Wimpy') -> ON DUPLICATE KEY UPDATE votes = votes + 1, last_voter = VALUES(last_voter);
A CREATE TABLE ... LIKE
statement is a quick way of cloning the
structure of an existing table, including any indexes (but not foreign keys). This can be particularly
useful in creating new tables to be added to an existing MERGE table.
mysql> CREATE TABLE log_20041124 LIKE log_20041123;
ROLLUP
is just one of the many features that, together, make up
what is known as On-Line Analytical Processing, or OLAP, capability.
ROLLUP
provides summary rows for each GROUP BY
level. In effect, it requires the server to return a single group for all
columns named in the GROUP BY
clause, by rolling each group up
into the next until only one remains. The summary rows thus represent
higher-level (or super-aggregate) summary operations.
The new syntax looks like this. Start with GROUP BY
, followed
by a list of columns and/or expressions to group, and add WITH
ROLLUP
at the very end.
[GROUP BY {column_name | expression | column_position} [ASC | DESC], ... [WITH ROLLUP]]
Let's look at a simple example. Assume you have a table, called
T_rollup
, which looks like this.
mysql> SELECT * FROM T_rollup; +------+------+------+ | col1 | col2 | col3 | +------+------+------+ | 1 | a | 0.55 | | 1 | a | 0.55 | | 1 | b | 1.00 | | 1 | b | 1.35 | | 2 | a | 6.00 | | 2 | a | 1.77 | +------+------+------+ 6 rows in set (0.04 sec)
An ordinary SELECT ... GROUP BY
on this table gives this
result.
mysql> SELECT col1, col2, SUM(col3) AS sumcol3 -> FROM T_rollup GROUP BY col1,col2; +------+------+---------+ | col1 | col2 | sumcol3 | +------+------+---------+ | 1 | a | 1.10 | | 1 | b | 2.35 | | 2 | a | 7.77 | +------+------+---------+ 3 rows in set (0.03 sec)
In contrast, the same query, using WITH ROLLUP
, gives this
result.
mysql> SELECT col1, col2, SUM(col3) AS sumcol3 -> FROM T_rollup GROUP BY col1,col2 WITH ROLLUP; +------+------+---------+ | col1 | col2 | sumcol3 | +------+------+---------+ | 1 | a | 1.10 | | 1 | b | 2.35 | | 1 | NULL | 3.45 | | 2 | a | 7.77 | | 2 | NULL | 7.77 | | NULL | NULL | 11.22 | +------+------+---------+ 6 rows in set (0.04)
As the result shows, in addition to the groups returned by the regular
GROUP BY
query, GROUP BY ... WITH ROLLUP
also returns
a summary row for each group.
Thus, for the group where col1
equals one (1),
col2
gets a NULL to indicate a summary row, and col3
gets the sum for the group {col1=1}
.
Then, for the group where col1
equals two (2),
col2
gets a NULL to indicate a summary row, and col3
,
once again, gets the sum for the group {col1=2}
.
The final row shows NULLs for both col1
and col2
,
to indicate a summary row for the whole result, while col3
gets
the sum for the entire result set.
WITH ROLLUP
has some limitations:
ORDER BY
clause to a query that contains
WITH ROLLUP
. You can, however, get around this limitation
somewhat by using MySQL's sort option in the GROUP BY
clause
itself.LIMIT
is applied after ROLLUP
. So a query that
uses LIMIT
to restrict the number of rows returned to the client
may have the summary rows produced by ROLLUP
cut off.A pair of new operators were added in MySQL 4.1, making the results of integer division available.
DIV
divisorFirst we have DIV
, which operates on two numbers, new in
MySQL 4.1.0.
The DIV
operator does integer division on its two numeric
operands. That is, DIV
takes the dividend value, divides it by
the divisor, and returns only the integer portion of the result. This
contrasts with division done by the /
(divide) operator, which returns the
full result of such a division.
DIV
works correctly with numbers up to the
BIGINT
range. It returns NULL if either operand is NULL. Oh, and
there's one other result of this implementation: DIV
is now a
reserved keyword in MySQL.
mysql> SELECT 1025 DIV 13; +-------------+ | 1025 DIV 13 | +-------------+ | 78 | +-------------+ 1 row in set (0.02 sec) mysql> SELECT 1025/13; +---------+ | 1025/13 | +---------+ | 78.85 | +---------+ 1 row in set (0.02 sec)
MOD
divisorThe complement to DIV
is MOD
, and in MySQL 4.1.0
we added new syntax as a synonym for the MOD()
function that
we've supported for some time. In contrast to DIV
,
MOD
takes its dividend operand, divides it by the divisor, and
returns only the remainder portion of the result.
MOD
now works correctly with numbers up to the
BIGINT
range. It returns NULL if either operand is NULL.
mysql> SELECT 1025 MOD 13; +-------------+ | 1025 MOD 13 | +-------------+ | 11 | +-------------+ 1 row in set (0.02 sec)
Both DIV
and MOD
have the same precedence as the
*
, /
and %
operators.
Besides the new date and time, character encoding and collation, and spatial extension functions, several other new functions are now available in MySQL 4.1.
COMPRESS
(string_expression)The COMPRESS
function, as indicated by the name, compresses
the string given as its character string argument. This only works if MySQL
was compiled with the zlib
compression library. If this
isn't the case, COMPRESS
just returns NULL. It also returns NULL if its
argument resolves to NULL.
To re-expand compressed strings, a complementary function,
UNCOMPRESS
, was also added to MySQL 4.1.1. We'll get to that in a
moment.
When storing a compressed string, MySQL follows these rules:
CHAR
or VARCHAR
column.Despite the last rule, it is best that compressed strings not be stored in
CHAR
or VARCHAR
columns. A better alternative is to
store compressed strings in BLOB
columns.
mysql> SELECT LENGTH(COMPRESS(REPEAT('a',1000))); +------------------------------------+ | LENGTH(COMPRESS(REPEAT('a',1000))) | +------------------------------------+ | 21 | +------------------------------------+ 1 row in set (0.02 sec)
UNCOMPRESS
(compressed_string)Here's the complement to COMPRESS
. The UNCOMPRESS
function takes a string that was compressed by the COMPRESS
function, and expands it back to its original size.
As with COMPRESS
, UNCOMPRESS
works only if MySQL
was compiled with the zlib
compression library. If this is
not the case, UNCOMPRESS
just returns NULL. The function also
returns NULL if its argument is not a compressed string, or if the string
expression resolves to NULL.
mysql> SELECT UNCOMPRESS(COMPRESS('aaaa')); +------------------------------+ | UNCOMPRESS(COMPRESS('aaaa')) | +------------------------------+ | aaaa | +------------------------------+ 1 row in set (0.02 sec)
UNCOMPRESSED_LENGTH
(compressed_string)The final new 4.1.1 function that deals with string compression is
UNCOMPRESSED_LENGTH
, which returns the original length of a
compressed string; that is, UNCOMPRESSED_LENGTH
takes a
compressed string as its argument and returns the length of the string before
it was compressed. The function returns NULL if the argument resolves to
NULL.
mysql> SELECT UNCOMPRESSED_LENGTH(COMPRESS(REPEAT('A',1000))); +-------------------------------------------------+ | UNCOMPRESSED_LENGTH(COMPRESS(REPEAT('a',1000))) | +-------------------------------------------------+ | 1000 | +-------------------------------------------------+ 1 row in set (0.02 sec)
BIT_XOR
(expression)The BIT_XOR
function returns the bitwise exclusive-OR (XOR) of
all bits in its single argument, which can be any type of expression. The
calculation is performed with 64-bit (BIGINT
) precision.
BIT_XOR
returns zero (0) if there are no matching rows, or if the
argument resolves to NULL.
(A simple way of thinking about how this works is that each bit in the result will only be set if it was set in an odd number of the values being grouped.)
mysql> SELECT val FROM t; +------+ | val | +------+ | 1 | | 3 | | 5 | | 4 | +------+ 4 rows in set (0.00 sec) mysql> SELECT BIT_XOR(val) FROM t; +--------------+ | BIT_XOR(val) | +--------------+ | 3 | +--------------+ 1 row in set (0.00 sec)
GROUP_CONCAT
(<syntax options>)The new GROUP_CONCAT
function is a biggie. First of all,
rather than just accepting one or two simple arguments, it offers numerous
syntax options.
Specifically, one starts with GROUP_CONCAT
and a pair of
parentheses.
Inside the parentheses, a comma-delimited list of expressions to
concatenate may be preceded by DISTINCT
, and/or followed by an
ORDER BY
option that uses the same syntax as the ORDER
BY
clause in a SELECT
statement. To end, the
SEPARATOR
keyword, followed by a one-character string, may be
added.
GROUP_CONCAT([DISTINCT] expression [,expression ...] [ORDER BY {unsigned_integer | column_name | expression} [ASC | DESC] [,column_name ...]] [SEPARATOR string_value])
GROUP_CONCAT
provides the ability to retrieve concatenated
values of combinations of expressions, which are usually column names. So it
does for expression results what the CONCAT
function does for
strings: concatenates the results together, and returns the result as a
single group. Here's an example; assume you have this table.
mysql> SELECT * FROM xx; +------+------+------+ | col1 | col2 | col3 | +------+------+------+ | one | 10 | 25 | | two | 10 | 50 | | two | 10 | 50 | | one | 20 | 25 | | one | 30 | 25 | +------+------+------+ 5 rows in set (0.06 sec)
A query using GROUP_CONCAT
on the second and third columns
returns this result. The second column shown combines the data from
col2
and col3
. A comma separates the grouped values
for each row represented by the col1
value.
mysql> SELECT col1,GROUP_CONCAT(col2,col3) FROM xx GROUP BY col1; +-------+-------------------------+ | col1 | GROUP_CONCAT(col2,col3) | +-------+-------------------------+ | one | 1025,2025,3025 | | two | 1050,1050 | +-------+-------------------------+ 2 rows in set (0.03 sec)
The DISTINCT
option eliminates duplicate values, just like it
does in SELECT
. Note that the second row now shows just one
group of 1050.
mysql> SELECT col1,GROUP_CONCAT(DISTINCT col2,col3) FROM xx GROUP BY col1; +------+----------------------------------+ | col1 | GROUP_CONCAT(DISTINCT col2,col3) | +------+----------------------------------+ | one | 1025,2025,3025 | | two | 1050 | +------+----------------------------------+ 2 rows in set (0.03 sec)
You can sort the concatenated result by adding an ORDER BY
clause right in the GROUP_CONCAT
function.
mysql> SELECT col1,GROUP_CONCAT(DISTINCT col2,col3 ORDER BY col2 DESC) -> FROM xx GROUP BY col1; +------+-----------------------------------------------------+ | col1 | GROUP_CONCAT(DISTINCT col2,col3 ORDER BY col2 DESC) | +------+-----------------------------------------------------+ | one | 3025,2025,1025 | | two | 1050 | +------+-----------------------------------------------------+ 2 rows in set (0.03 sec)
In such cases, ORDER BY
works exactly like you'd expect from
experience with the ORDER BY
clause in a SELECT
statement.
You can also change the separator that MySQL places between groups. The
default, as we've seen in these examples so far, is a comma — but you
can change this with the SEPARATOR
option, which must be
followed by a string showing the separator you'd like MySQL to use. This
example shows a separator of percent (%) rather than comma (,).
mysql> SELECT col1,GROUP_CONCAT(DISTINCT col2,col3 ORDER BY col2 DESC SEPARATOR '%') -> FROM xx GROUP BY col1; +------+-------------------------------------------------------------------+ | col1 | GROUP_CONCAT(DISTINCT col2,col3 ORDER BY col2 DESC SEPARATOR '%') | +------+-------------------------------------------------------------------+ | one | 3025%2025%1025 | | two | 1050 | +------+-------------------------------------------------------------------+ 2 rows in set (0.03 sec)
If you don't want a separator at all, place two single quotes (that is, an empty string) after the SEPARATOR
option.
Using GROUP_CONCAT
can result in some really big values. To
avoid problems with this, one can set a maximum allowed length for
GROUP_CONCAT
results. This is done at runtime, using the
group_concat_max_len
system variable.
mysql> SET group_concat_max_len=8; Query OK, 0 rows affected (0.01 sec) mysql> SELECT col1,GROUP_CONCAT(DISTINCT col2,col3 ORDER BY col2 DESC SEPARATOR '') -> FROM xx GROUP BY col1; +------+------------------------------------------------------------------+ | col1 | GROUP_CONCAT(DISTINCT col2,col3 ORDER BY col2 DESC SEPARATOR '') | +------+------------------------------------------------------------------+ | one | 30252025 | | two | 1050 | +------+------------------------------------------------------------------+ 2 rows in set, 1 warning (0.03 sec)
Once a maximum length has been set, the GROUP_CONCAT
result
is truncated to this maximum length. So the result here only includes eight
characters for the grouped column in the first row.
VARIANCE
(numeric_expression)Another new aggregate function added in MySQL 4.1.0 is
VARIANCE
. The VARIANCE
function returns the standard
population variance of its argument, which must be either a numeric expression
or an expression with a data type that MySQL can convert to a numeric data
type.
VARIANCE
considers rows as the whole population, not as a
sample; this means that it uses the number of rows returned as the denominator
in its calculation. This, by the way, is different from the way Oracle
calculates variances.
mysql> SELECT col2 FROM xx; +------+ | col2 | +------+ | 10 | | 10 | | 10 | | 20 | | 30 | +------+ 5 rows in set (0.03 sec) mysql> SELECT VARIANCE(col2) FROM xx; +----------------+ | VARIANCE(col2) | +----------------+ | 64.0000 | +----------------+ 1 row in set (0.02 sec)
CRC32
(string_expression)The CRC32
function returns a 32-bit unsigned value. The
function's purpose is to compute a cyclic redundancy check value for a given
string. The single argument can resolve to a column name or character string
expression. The function returns NULL if the argument resolves to NULL.
mysql> SELECT CRC32('trudy'); +----------------+ | CRC32('trudy') | +----------------+ | 1899238533 | +----------------+ 1 row in set (0.02 sec)
Cyclic redundancy checks are used to run quick error tests. For example,
you could determine whether a BLOB
's bytes are changed or
corrupted by an operation by comparing the CRC32()
value prior to
the operation to the value afterwards. Or you could do a quick
BLOB
comparison; it's quicker to compare two BLOB
s'
CRC32
values rather than comparing each byte of the
BLOB
s themselves.
DEFAULT
(column_name)The DEFAULT
function, new in MySQL 4.1.0, returns the default
value defined for a given column.
mysql> CREATE TABLE xz (col1 VARCHAR(10) DEFAULT 'hello'); Query OK, 0 rows affected (0.09 sec) mysql> INSERT INTO xz VALUES ('bob'); Query OK, 1 row affected (0.01 sec) mysql> SELECT DEFAULT(col1) FROM xz; +---------------+ | DEFAULT(col1) | +---------------+ | hello | +---------------+ 1 row in set (0.02 sec)
IS_USED_LOCK
(string_expression)The IS_USED_LOCK
function checks whether a given lock is in
use. The single argument must resolve to a string that identifies a lock. If
the lock is in use, IS_USED_LOCK
returns the connection
identifier of the client that holds the lock. Otherwise, the function returns
NULL. It also returns NULL if the argument resolves to NULL.
mysql> SELECT IS_USED_LOCK('lock1'); +-----------------------+ | IS_USED_LOCK('lock1') | +-----------------------+ | NULL | +-----------------------+ 1 row in set (0.19 sec)
UNHEX
(string_expression)The UNHEX
function is the opposite of the familiar
HEX
function that's been part of MySQL since version 3.22.4.
UNHEX
interprets each pair of hexadecimal digits in its string
argument as a number, converts that number to the character it represents,
and returns the result as a binary string. UNHEX
returns NULL if
the argument resolves to NULL.
mysql> SELECT UNHEX('4D7953514C'); +---------------------+ | UNHEX('4D7953514C') | +---------------------+ | MySQL | +---------------------+ 1 row in set (0.02 sec)
UUID
()The UUID()
function — note the mandatory parentheses
— returns a Universal Unique Identifier (UUID), generated according to
a specification published by The Open Group. A UUID is a 128-bit number that
is globally unique in both space and time. This means that two separate calls
to UUID()
should always return two different values, regardless
of the circumstances.
mysql> SELECT UUID(); +--------------------------------------+ | UUID() | +--------------------------------------+ | 9c87d546-5891-1027-b8d6-dce9fa28dace | +--------------------------------------+ 1 row in set (0.02 sec)
UUIDs are represented by a string of five hexadecimal numbers in the format shown here — 8 hex digits, then 3 numbers with 4 hex digits each, and then 12 digits.
UUID()
does not work with replication at this time.
Before MySQL 4.1, the MEMORY (ne้ HEAP) storage engine only supported
HASH
indexes, which are not generally useful with queries that
access a range of indexed values. Performance of hash indexes with a high
degree of key duplication can also be less than ideal.
Now you can declare the type of an index when creating a table that uses the MEMORY storage engine (and also the NDB, or cluster, storage engine).
mysql> CREATE TABLE lookup (id INT, INDEX USING BTREE (id)) ENGINE = MEMORY;