Harrison Fisk is a trainer and consultant with MySQL AB and lives in Binghamton, NY.
This is the fourth of our on-going series of articles that explain some the new features in MySQL 4.1, which as of this writing is in the gamma phase of its development cycle, on the way to a production-ready release in the near future.
By Harrison Fisk
Server-side prepared statements are an exciting new feature in MySQL 4.1. In this article, I will explain what, why, when, and how to use prepared statements.
Prepared statements are the ability to set up a statement once, and then execute it many times with different parameters. They are designed to replace building ad hoc query strings, and do so in a more secure and efficient manner. A typical prepared statement would look something like:
SELECT * FROM Country WHERE code = ?
The ?
is what is a called a placeholder. When you
execute the above query, you would need to supply the value for it,
which would replace the ?
in the query above.
There are numerous advantages to using prepared statements in your applications, both for security and performance reasons.
Prepared statements can help increase security by separating SQL logic from the data being supplied. This separation of logic and data can help prevent a very common type of vulnerability called an SQL injection attack. Normally when you are dealing with an ad hoc query, you need to be very careful when handling the data that you received from the user. This entails using functions that escape all of the necessary trouble characters, such as the single quote, double quote, and backslash characters. This is unnecessary when dealing with prepared statements. The separation of the data allows MySQL to automatically take into account these characters and they do not need to be escaped using any special function.
The increase in performance in prepared statements can come from a few
different features. First is the need to only parse the query a single time.
When you initially prepare the statement, MySQL will parse the statement to
check the syntax and set up the query to be run. Then if you execute the query
many times, it will no longer have that overhead. This pre-parsing can lead to
a speed increase if you need to run the same query many times, such as when
doing many INSERT
statements.
(Note: While it will not happen with MySQL 4.1, future versions will also cache the execution plan for prepared statements, eliminating another bit of overhead you currently pay for each query execution.)
The second place where performance may increase is through the use of the new binary protocol that prepared statements can use. The traditional protocol in MySQL always converts everything into strings before sending them across the network. This means that the client converts the data into strings, which are often larger than the original data, sends it over the network (or other transport) to the server, which finally decodes the string into the correct datatype. The binary protocol removes this conversion overhead. All types are sent in a native binary form, which saves the conversion CPU usage, and can also cut down on network usage.
Prepared statements can be useful for all of the above reasons, however they
should not (and can not) be used for everything in your application. First
off, the type of queries that they work on is limited to DML
(INSERT
, REPLACE
, UPDATE
, and
DELETE
), CREATE TABLE
, and SELECT
queries. Support for additional query types will be added in further versions,
to make the prepared statements API more general.
Sometimes prepared statements can actually be slower than regular queries. The reason for this is that there are two round-trips to the server, which can slow down simple queries that are only executed a single time. In cases like that, one has to decide if it is worth trading off the performance impact of this extra round-trip in order to gain the security benefits of using prepared statements.
Most of the common programming languages used with MySQL already have support for server-side prepared statements, and many others having support coming soon.
The first API where prepared statements were implemented was in the C API. The API underwent quite a few changes throughout the alpha and beta lifecycle, but is now considered stable for general use. You can read more about the C API in the "C API Prepared Statements" section of the MySQL Reference Manual.
Another API that has prepared statement support is PHP. PHP 5 has a new MySQL interface called "mysqli". You can read more about the mysqli extension in the mysqli section of the PHP Manual. The API provided by the mysqli extension is also nearly a one-for-one match with the C API, so the documentation for the C API may also be useful in learning about the PHP API.
Those of you that are Perl or Java users have had prepared statements for quite a long time. However, those were client-side prepared statements. The client-side prepared statements provide the same security benefit, but none of the performance increases. Don't worry though, MySQL Connector/J has support for server-side prepared statements in the new 3.1 release. Perl's DBD::mysql driver will have support in the next release of the 2.9 tree. The best part is that your code is already written to use them, so all you have to do to take advantage of the new feature is to upgrade the driver behind the scenes.
The new MySQL Connector/Net provider for the .NET platform also includes support for prepared statements (both server-side for use with MySQL 4.1 and later, and client-side prepared statements for earlier versions of the server).
There is not currently a release of Connector/ODBC that supports server-side prepared statements, but the current versions emulate prepared statements within the driver. Server-side prepared statement support will be included in the next major release of Connector/ODBC, version 3.53, which is currently under development.
Finally, there is also an SQL interface to the prepared
statements. That means you can do prepared statements without any special API
support from your programming language. There are three new SQL statements to
support this: PREPARE
, EXECUTE
, and
DEALLOCATE PREPARE
. This doesn't take advantage of the new
binary protocol, so it should only be used for testing or when a native API
isn't available.
mysql> PREPARE stmt_name FROM "SELECT name FROM Country WHERE code = ?"; Query OK, 0 rows affected (0.09 sec) Statement prepared mysql> SET @test_parm = "FIN"; Query OK, 0 rows affected (0.00 sec) mysql> EXECUTE stmt_name USING @test_parm; +---------+ | name | +---------+ | Finland | +---------+ 1 row in set (0.03 sec) mysql> DEALLOCATE PREPARE stmt_name; Query OK, 0 rows affected (0.00 sec)
Now you should have a good idea about the advantages and disadvantages of when to use prepared statements. They can lead to faster, more secure code if used properly; they work quite well as a complement to the current APIs that are currently available.