About the Authors

Trudy Pelzer is a senior software architect with MySQL AB, co-author of SQL Performance Tuning, and lives in Edmonton, Alberta.

Jim Winstead is a lead web developer with MySQL AB, and lives in Los Angeles, California.

We are always looking for interesting articles about MySQL! Have you written something and would like to it published here? Please contact us via feedback form »

This article in Japanese

All the Little Features

This is the final article of our series of articles that explain some the new features in MySQL 4.1, which is now available as a generally-available (GA, or production) release.

Discuss this article in our online forums!

By Trudy Pelzer and Jim Winstead

We've covered all of the major new features, but there's still a slew of other new features that are more minor in scope (but not necessarily in utility).

New Statements

HELP

The new HELP statement allows you to get help on various topics from the server. You can use HELP contents to get a list of available topics.

mysql> HELP contents;
You asked for help about help category: "Contents"
For more information, type 'help ', where item is one of the following
categories :
   Administration
   Column Types
   Data Definition
   Data Manipulation
   Functions
   Geographic features
   Transactions

mysql> HELP Functions;
You asked for help about help category: "Functions"
For more information, type 'help ', where item is one of the following
topics :
   CHAR BYTE
   DUAL
   FUNCTION
   TRUE FALSE
categories :
   Bit Functions
   Comparison operators
   Control flow functions
   Date and Time Functions
   Encryption Functions
   Functions and Modifiers for Use with GROUP BY Clauses
   Information Functions
   Logical operators
   Miscellaneous Functions
   Numeric Functions
   String Functions

mysql> HELP DUAL;
Name: 'DUAL'
Description:
SELECT ... FROM DUAL is an alias for SELECT ....
(To be compatible with some other databases).

If you upgraded to MySQL 4.1 from an earlier version, you may need to make sure that the tables used to store the HELP information have been loaded with data. The file fill_help_tables.sql is included with the binary distributions, and it can simply be loaded in the mysql database.

mysql -uroot mysql < /path/to/fill_help_tables.sql

SHOW WARNINGS and SHOW ERRORS

A recurring question on the MySQL mailing lists is how to find out what exactly the warnings are that are reported in summary form after running a command like LOAD DATA INFILE. Now with MySQL 4.1's SHOW WARNINGS command, you can get access to the notes, warnings and errors that a statement might generate. (The SHOW ERRORS statement will just show errors.)

mysql> DROP TABLE IF EXISTS no_such_table;
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> SHOW WARNINGS;
+-------+------+-------------------------------+
| Level | Code | Message                       |
+-------+------+-------------------------------+
| Note  | 1051 | Unknown table 'no_such_table' |
+-------+------+-------------------------------+
1 row in set (0.01 sec)

The SHOW WARNINGS Syntax section of the reference manual has an example of the warnings reported when data is truncated during an INSERT statement. (Which you'll be able to turn into errors in MySQL 5.0, but that's a topic for our "New in 5.0" series....)

INSERT ... ON DUPLICATE KEY UPDATE ...

The new ON DUPLICATE KEY UPDATE ... clause for INSERT statements allows special handling when an inserted row would cause a duplicate value in a UNIQUE index (or a PRIMARY KEY). For example, you could record votes in a poll system using a statement like the following:

mysql> INSERT INTO poll (favorite, votes) VALUES ('Hamburger', 1)
    -> ON DUPLICATE KEY UPDATE votes = votes + 1;

You can use the new VALUES() function to refer to column values from the INSERT part of the query.

mysql> INSERT INTO poll (favorite, votes, last_voter)
                        VALUES ('Hamburger', 1, 'Wimpy')
    -> ON DUPLICATE KEY UPDATE votes = votes + 1,
                                  last_voter = VALUES(last_voter);

CREATE TABLE ... LIKE

A CREATE TABLE ... LIKE statement is a quick way of cloning the structure of an existing table, including any indexes (but not foreign keys). This can be particularly useful in creating new tables to be added to an existing MERGE table.

mysql> CREATE TABLE log_20041124 LIKE log_20041123;

GROUP BY ... WITH ROLLUP

ROLLUP is just one of the many features that, together, make up what is known as On-Line Analytical Processing, or OLAP, capability.

ROLLUP provides summary rows for each GROUP BY level. In effect, it requires the server to return a single group for all columns named in the GROUP BY clause, by rolling each group up into the next until only one remains. The summary rows thus represent higher-level (or super-aggregate) summary operations.

The new syntax looks like this. Start with GROUP BY, followed by a list of columns and/or expressions to group, and add WITH ROLLUP at the very end.

[GROUP BY {column_name | expression | column_position}
        [ASC | DESC], ... [WITH ROLLUP]]

Let's look at a simple example. Assume you have a table, called T_rollup, which looks like this.

mysql> SELECT * FROM T_rollup;
+------+------+------+
| col1 | col2 | col3 |
+------+------+------+
|    1 | a    | 0.55 |
|    1 | a    | 0.55 |
|    1 | b    | 1.00 |
|    1 | b    | 1.35 |
|    2 | a    | 6.00 |
|    2 | a    | 1.77 |
+------+------+------+
6 rows in set (0.04 sec)

An ordinary SELECT ... GROUP BY on this table gives this result.

mysql> SELECT col1, col2, SUM(col3) AS sumcol3
    -> FROM T_rollup GROUP BY col1,col2;
+------+------+---------+
| col1 | col2 | sumcol3 |
+------+------+---------+
|    1 | a    |    1.10 |
|    1 | b    |    2.35 |
|    2 | a    |    7.77 |
+------+------+---------+
3 rows in set (0.03 sec)

In contrast, the same query, using WITH ROLLUP, gives this result.

mysql> SELECT col1, col2, SUM(col3) AS sumcol3
    -> FROM T_rollup GROUP BY col1,col2 WITH ROLLUP;
+------+------+---------+
| col1 | col2 | sumcol3 |
+------+------+---------+
|    1 | a    |    1.10 |
|    1 | b    |    2.35 |
|    1 | NULL |    3.45 |
|    2 | a    |    7.77 |
|    2 | NULL |    7.77 |
| NULL | NULL |   11.22 |
+------+------+---------+
6 rows in set (0.04)

As the result shows, in addition to the groups returned by the regular GROUP BY query, GROUP BY ... WITH ROLLUP also returns a summary row for each group.

Thus, for the group where col1 equals one (1), col2 gets a NULL to indicate a summary row, and col3 gets the sum for the group {col1=1}.

Then, for the group where col1 equals two (2), col2 gets a NULL to indicate a summary row, and col3, once again, gets the sum for the group {col1=2}.

The final row shows NULLs for both col1 and col2, to indicate a summary row for the whole result, while col3 gets the sum for the entire result set.

WITH ROLLUP has some limitations:

You cannot add an ORDER BY clause to a query that contains WITH ROLLUP. You can, however, get around this limitation somewhat by using MySQL's sort option in the GROUP BY clause itself.
LIMIT is applied after ROLLUP. So a query that uses LIMIT to restrict the number of rows returned to the client may have the summary rows produced by ROLLUP cut off.
The NULLs in each summary row are produced when the row is sent to the client. Because these NULL values become part of the result set at such a late stage in query processing, it isn't possible to include them in NULL-value tests within the query.

New Operators

A pair of new operators were added in MySQL 4.1, making the results of integer division available.

dividend DIV divisor

First we have DIV, which operates on two numbers, new in MySQL 4.1.0.

The DIV operator does integer division on its two numeric operands. That is, DIV takes the dividend value, divides it by the divisor, and returns only the integer portion of the result. This contrasts with division done by the / (divide) operator, which returns the full result of such a division.

DIV works correctly with numbers up to the BIGINT range. It returns NULL if either operand is NULL. Oh, and there's one other result of this implementation: DIV is now a reserved keyword in MySQL.

mysql> SELECT 1025 DIV 13;
+-------------+
| 1025 DIV 13 |
+-------------+
|          78 |
+-------------+
1 row in set (0.02 sec)

mysql> SELECT 1025/13;
+---------+
| 1025/13 |
+---------+
|   78.85 |
+---------+
1 row in set (0.02 sec)

dividend MOD divisor

The complement to DIV is MOD, and in MySQL 4.1.0 we added new syntax as a synonym for the MOD() function that we've supported for some time. In contrast to DIV, MOD takes its dividend operand, divides it by the divisor, and returns only the remainder portion of the result.

MOD now works correctly with numbers up to the BIGINT range. It returns NULL if either operand is NULL.

mysql> SELECT 1025 MOD 13;
+-------------+
| 1025 MOD 13 |
+-------------+
|          11 |
+-------------+
1 row in set (0.02 sec)

Both DIV and MOD have the same precedence as the *, / and % operators.

New Functions

Besides the new date and time, character encoding and collation, and spatial extension functions, several other new functions are now available in MySQL 4.1.

Compression

COMPRESS(string_expression)

The COMPRESS function, as indicated by the name, compresses the string given as its character string argument. This only works if MySQL was compiled with the zlib compression library. If this isn't the case, COMPRESS just returns NULL. It also returns NULL if its argument resolves to NULL.

To re-expand compressed strings, a complementary function, UNCOMPRESS, was also added to MySQL 4.1.1. We'll get to that in a moment.

When storing a compressed string, MySQL follows these rules:

Rule 1: Empty strings are stored as empty strings.
Rule 2: Non-empty strings are stored with a four-byte length prefix (low byte first), followed by the compressed string itself. The length prefix tells the server the length of the uncompressed string.
Rule 3: A string ending with a space gets an extra '.' (period character) added to the end, to avoid problems with endspace trimming should the result be stored in a CHAR or VARCHAR column.

Despite the last rule, it is best that compressed strings not be stored in CHAR or VARCHAR columns. A better alternative is to store compressed strings in BLOB columns.

mysql> SELECT LENGTH(COMPRESS(REPEAT('a',1000)));
+------------------------------------+
| LENGTH(COMPRESS(REPEAT('a',1000))) |
+------------------------------------+
|                                 21 |
+------------------------------------+
1 row in set (0.02 sec)

UNCOMPRESS(compressed_string)

Here's the complement to COMPRESS. The UNCOMPRESS function takes a string that was compressed by the COMPRESS function, and expands it back to its original size.

As with COMPRESS, UNCOMPRESS works only if MySQL was compiled with the zlib compression library. If this is not the case, UNCOMPRESS just returns NULL. The function also returns NULL if its argument is not a compressed string, or if the string expression resolves to NULL.

mysql> SELECT UNCOMPRESS(COMPRESS('aaaa'));
+------------------------------+
| UNCOMPRESS(COMPRESS('aaaa')) |
+------------------------------+
| aaaa                         |
+------------------------------+
1 row in set (0.02 sec)

UNCOMPRESSED_LENGTH(compressed_string)

The final new 4.1.1 function that deals with string compression is UNCOMPRESSED_LENGTH, which returns the original length of a compressed string; that is, UNCOMPRESSED_LENGTH takes a compressed string as its argument and returns the length of the string before it was compressed. The function returns NULL if the argument resolves to NULL.

mysql> SELECT UNCOMPRESSED_LENGTH(COMPRESS(REPEAT('A',1000)));
+-------------------------------------------------+
| UNCOMPRESSED_LENGTH(COMPRESS(REPEAT('a',1000))) |
+-------------------------------------------------+
|                                            1000 |
+-------------------------------------------------+
1 row in set (0.02 sec)

Aggregate

BIT_XOR(expression)

The BIT_XOR function returns the bitwise exclusive-OR (XOR) of all bits in its single argument, which can be any type of expression. The calculation is performed with 64-bit (BIGINT) precision. BIT_XOR returns zero (0) if there are no matching rows, or if the argument resolves to NULL.

(A simple way of thinking about how this works is that each bit in the result will only be set if it was set in an odd number of the values being grouped.)

mysql> SELECT val FROM t;
+------+
| val  |
+------+
|    1 |
|    3 |
|    5 |
|    4 |
+------+
4 rows in set (0.00 sec)

mysql> SELECT BIT_XOR(val) FROM t;
+--------------+
| BIT_XOR(val) |
+--------------+
|            3 |
+--------------+
1 row in set (0.00 sec)

GROUP_CONCAT(<syntax options>)

The new GROUP_CONCAT function is a biggie. First of all, rather than just accepting one or two simple arguments, it offers numerous syntax options.

Specifically, one starts with GROUP_CONCAT and a pair of parentheses.

Inside the parentheses, a comma-delimited list of expressions to concatenate may be preceded by DISTINCT, and/or followed by an ORDER BY option that uses the same syntax as the ORDER BY clause in a SELECT statement. To end, the SEPARATOR keyword, followed by a one-character string, may be added.

GROUP_CONCAT([DISTINCT] expression [,expression ...]
             [ORDER BY {unsigned_integer | column_name | expression}
                 [ASC | DESC] [,column_name ...]]
             [SEPARATOR string_value])

GROUP_CONCAT provides the ability to retrieve concatenated values of combinations of expressions, which are usually column names. So it does for expression results what the CONCAT function does for strings: concatenates the results together, and returns the result as a single group. Here's an example; assume you have this table.

mysql> SELECT * FROM xx;
+------+------+------+
| col1 | col2 | col3 |
+------+------+------+
| one  |   10 |   25 |
| two  |   10 |   50 |
| two  |   10 |   50 |
| one  |   20 |   25 |
| one  |   30 |   25 |
+------+------+------+
5 rows in set (0.06 sec)

A query using GROUP_CONCAT on the second and third columns returns this result. The second column shown combines the data from col2 and col3. A comma separates the grouped values for each row represented by the col1 value.

mysql> SELECT col1,GROUP_CONCAT(col2,col3) FROM xx GROUP BY col1;
+-------+-------------------------+
| col1  | GROUP_CONCAT(col2,col3) |
+-------+-------------------------+
| one   | 1025,2025,3025          |
| two   | 1050,1050               |
+-------+-------------------------+
2 rows in set (0.03 sec)

The DISTINCT option eliminates duplicate values, just like it does in SELECT. Note that the second row now shows just one group of 1050.

mysql> SELECT col1,GROUP_CONCAT(DISTINCT col2,col3) FROM xx GROUP BY col1;
+------+----------------------------------+
| col1 | GROUP_CONCAT(DISTINCT col2,col3) |
+------+----------------------------------+
| one  | 1025,2025,3025                   |
| two  | 1050                             |
+------+----------------------------------+
2 rows in set (0.03 sec)

You can sort the concatenated result by adding an ORDER BY clause right in the GROUP_CONCAT function.

mysql> SELECT col1,GROUP_CONCAT(DISTINCT col2,col3 ORDER BY col2 DESC)
    -> FROM xx GROUP BY col1;
+------+-----------------------------------------------------+
| col1 | GROUP_CONCAT(DISTINCT col2,col3 ORDER BY col2 DESC) |
+------+-----------------------------------------------------+
| one  | 3025,2025,1025                                      |
| two  | 1050                                                |
+------+-----------------------------------------------------+
2 rows in set (0.03 sec)

In such cases, ORDER BY works exactly like you'd expect from experience with the ORDER BY clause in a SELECT statement.

You can also change the separator that MySQL places between groups. The default, as we've seen in these examples so far, is a comma — but you can change this with the SEPARATOR option, which must be followed by a string showing the separator you'd like MySQL to use. This example shows a separator of percent (%) rather than comma (,).

mysql> SELECT col1,GROUP_CONCAT(DISTINCT col2,col3 ORDER BY col2 DESC SEPARATOR '%')
    -> FROM xx GROUP BY col1;
+------+-------------------------------------------------------------------+
| col1 | GROUP_CONCAT(DISTINCT col2,col3 ORDER BY col2 DESC SEPARATOR '%') |
+------+-------------------------------------------------------------------+
| one  | 3025%2025%1025                                                    |
| two  | 1050                                                              |
+------+-------------------------------------------------------------------+
2 rows in set (0.03 sec)

If you don't want a separator at all, place two single quotes (that is, an empty string) after the SEPARATOR option.

Using GROUP_CONCAT can result in some really big values. To avoid problems with this, one can set a maximum allowed length for GROUP_CONCAT results. This is done at runtime, using the group_concat_max_len system variable.

mysql> SET group_concat_max_len=8;
Query OK, 0 rows affected (0.01 sec)

mysql> SELECT col1,GROUP_CONCAT(DISTINCT col2,col3 ORDER BY col2 DESC SEPARATOR '')
    -> FROM xx GROUP BY col1;
+------+------------------------------------------------------------------+
| col1 | GROUP_CONCAT(DISTINCT col2,col3 ORDER BY col2 DESC SEPARATOR '') |
+------+------------------------------------------------------------------+
| one  | 30252025                                                         |
| two  | 1050                                                             |
+------+------------------------------------------------------------------+
2 rows in set, 1 warning (0.03 sec)

Once a maximum length has been set, the GROUP_CONCAT result is truncated to this maximum length. So the result here only includes eight characters for the grouped column in the first row.

VARIANCE(numeric_expression)

Another new aggregate function added in MySQL 4.1.0 is VARIANCE. The VARIANCE function returns the standard population variance of its argument, which must be either a numeric expression or an expression with a data type that MySQL can convert to a numeric data type.

VARIANCE considers rows as the whole population, not as a sample; this means that it uses the number of rows returned as the denominator in its calculation. This, by the way, is different from the way Oracle calculates variances.

mysql> SELECT col2 FROM xx;
+------+
| col2 |
+------+
|   10 |
|   10 |
|   10 |
|   20 |
|   30 |
+------+
5 rows in set (0.03 sec)

mysql> SELECT VARIANCE(col2) FROM xx;
+----------------+
| VARIANCE(col2) |
+----------------+
|        64.0000 |
+----------------+
1 row in set (0.02 sec)

Miscellaneous

CRC32(string_expression)

The CRC32 function returns a 32-bit unsigned value. The function's purpose is to compute a cyclic redundancy check value for a given string. The single argument can resolve to a column name or character string expression. The function returns NULL if the argument resolves to NULL.

mysql> SELECT CRC32('trudy');
+----------------+
| CRC32('trudy') |
+----------------+
|     1899238533 |
+----------------+
1 row in set (0.02 sec)

Cyclic redundancy checks are used to run quick error tests. For example, you could determine whether a BLOB's bytes are changed or corrupted by an operation by comparing the CRC32() value prior to the operation to the value afterwards. Or you could do a quick BLOB comparison; it's quicker to compare two BLOBs' CRC32 values rather than comparing each byte of the BLOBs themselves.

DEFAULT(column_name)

The DEFAULT function, new in MySQL 4.1.0, returns the default value defined for a given column.

mysql> CREATE TABLE xz (col1 VARCHAR(10) DEFAULT 'hello');
Query OK, 0 rows affected (0.09 sec)

mysql> INSERT INTO xz VALUES ('bob');
Query OK, 1 row affected (0.01 sec)

mysql> SELECT DEFAULT(col1) FROM xz;
+---------------+
| DEFAULT(col1) |
+---------------+
| hello         |
+---------------+
1 row in set (0.02 sec)

IS_USED_LOCK(string_expression)

The IS_USED_LOCK function checks whether a given lock is in use. The single argument must resolve to a string that identifies a lock. If the lock is in use, IS_USED_LOCK returns the connection identifier of the client that holds the lock. Otherwise, the function returns NULL. It also returns NULL if the argument resolves to NULL.

mysql> SELECT IS_USED_LOCK('lock1');
+-----------------------+
| IS_USED_LOCK('lock1') |
+-----------------------+
|                  NULL |
+-----------------------+
1 row in set (0.19 sec)

UNHEX(string_expression)

The UNHEX function is the opposite of the familiar HEX function that's been part of MySQL since version 3.22.4. UNHEX interprets each pair of hexadecimal digits in its string argument as a number, converts that number to the character it represents, and returns the result as a binary string. UNHEX returns NULL if the argument resolves to NULL.

mysql> SELECT UNHEX('4D7953514C');
+---------------------+
| UNHEX('4D7953514C') |
+---------------------+
| MySQL               |
+---------------------+
1 row in set (0.02 sec)

UUID()

The UUID() function — note the mandatory parentheses — returns a Universal Unique Identifier (UUID), generated according to a specification published by The Open Group. A UUID is a 128-bit number that is globally unique in both space and time. This means that two separate calls to UUID() should always return two different values, regardless of the circumstances.

mysql> SELECT UUID();
+--------------------------------------+
| UUID()                               |
+--------------------------------------+
| 9c87d546-5891-1027-b8d6-dce9fa28dace |
+--------------------------------------+
1 row in set (0.02 sec)

UUIDs are represented by a string of five hexadecimal numbers in the format shown here — 8 hex digits, then 3 numbers with 4 hex digits each, and then 12 digits.

The first three numbers are generated from a timestamp.
The fourth number preserves temporal uniqueness in cases where the timestamp may not be monotonic (for example, due to daylight savings time).
The fifth number is an IEEE 802 node number that provides spatial uniqueness.
If the node number isn't available (for example, because the host computer has no Ethernet card), MySQL normally substitutes a randomly generated 48-bit number.

UUID() does not work with replication at this time.

Other New Features

BTREE for MEMORY

Before MySQL 4.1, the MEMORY (ne� HEAP) storage engine only supported HASH indexes, which are not generally useful with queries that access a range of indexed values. Performance of hash indexes with a high degree of key duplication can also be less than ideal.

Now you can declare the type of an index when creating a table that uses the MEMORY storage engine (and also the NDB, or cluster, storage engine).

mysql> CREATE TABLE lookup (id INT, INDEX USING BTREE (id)) ENGINE = MEMORY;