Robin Schumacher is MySQL's Director of Product Management and has over 13 years of database experience in DB2, MySQL, Oracle, SQL Server and other database engines. Before joining MySQL, Robin was Vice President of Product Management at Embarcadero Technologies.
No matter what they say, I don't think it's true. Database 'experts' say that 80-90% of a database's overall performance comes from the SQL code that's running against it. And I just don't buy it. Instead, I think that most of a database's performance comes from optimal design, and after that SQL code comes next. If you properly design your database, then you're way ahead of the game in terms of performance issues. That's when well-written SQL code can glide along your design like butter on a hot skillet.
For example, most DBAs and developers know that large table scans typically cause performance issues, with the remedy often being (1) well-placed indexes, (2) horizontal partitioning, or (3) column-oriented tables where data warehouse use cases are in play. No matter how much you tweak your SQL code, failing to put one of these designs in place against big tables will usually stymie your database's performance. This is why I firmly believe that design is the main contributor to a well-running database, and that when you monitor your database's performance, you're really monitoring the design you've given it.
Now – that said, I also believe that badly written SQL code is the second main culprit behind a lot of database sluggishness. The problem is, once a system enters production, how does a DBA locate and correct the poor code that's clogging the arteries of the databases they're responsible for?
Prior to MySQL 5.1, it wasn't all that easy to find the worst running SQL on a MySQL instance. And truth be told, MySQL still has a ways to go before it's an effortless process. But, the good news is there are some new aids in MySQL 5.1 and MySQL Enterprise that do make it more efficient to find and troubleshoot bad SQL code that's hiding inside a MySQL instance. Let's take a quick tour through these new features and see what help they provide.
Database performance analysts typically use three forms of analysis to interrogate a database that's ailing:

1. Bottleneck analysis – finding what the database and its sessions are waiting on
2. Workload analysis – examining the SQL and activity the server is being asked to process
3. Ratio analysis – deriving rule-of-thumb metrics from the server's accumulated statistics
The list above is ordered by usefulness: eliminating bottlenecks is job #1, analyzing and massaging workloads is #2, and ratios come last, although they aren't completely without merit. To perform each style of analysis, you first have to capture information and then evaluate it for the clues that tell you what's going on and what (if anything) needs to change to make your system run faster.
In MySQL 5.1, there are new performance objects you can query via SQL – rather than relying solely on SHOW commands – that yield data on overall MySQL performance metrics (which you can turn into ratios), as well as show you who's currently on your system and what they're running.
Starting with who's logged on, the new processlist table in the information_schema database lets you query who's connected and whether they're currently running anything:
mysql> select *
    -> from information_schema.processlist\G
*************************** 1. row ***************************
     ID: 33
   USER: root
   HOST: 154.231.0.3:1833
     DB: gim2
COMMAND: Sleep
   TIME: 17
  STATE:
   INFO: NULL
*************************** 2. row ***************************
     ID: 31
   USER: root
   HOST: 154.231.0.3:1751
     DB: NULL
COMMAND: Query
   TIME: 0
  STATE: executing
   INFO: select * from information_schema.processlist
2 rows in set (0.00 sec)
Before MySQL 5.1, you had the SHOW PROCESSLIST command, but you couldn't filter its output via SQL; now you can restrict your view to only active or idle processes and more. For example, here's a quick filter (a minimal sketch using the processlist columns shown above) that returns just the connections actively running something:
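mysql> select id, user, host, db, time, state, info
    -> from information_schema.processlist
    -> where command <> 'Sleep';

You can also query all the various MySQL global statistics to build a few ratios: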
mysql> select * from information_schema.global_status;
+------------------------+----------------+
| VARIABLE_NAME          | VARIABLE_VALUE |
+------------------------+----------------+
| ABORTED_CLIENTS        | 7              |
| ABORTED_CONNECTS       | 2              |
| BINLOG_CACHE_DISK_USE  | 0              |
| BINLOG_CACHE_USE       | 0              |
| BYTES_RECEIVED         | 14912630       |
| BYTES_SENT             | 569470         |
| COM_ADMIN_COMMANDS     | 58             |
| COM_ASSIGN_TO_KEYCACHE | 0              |
| COM_ALTER_DB           | 0              |
| COM_ALTER_DB_UPGRADE   | 0              |
| COM_ALTER_EVENT        | 0              |
| COM_ALTER_FUNCTION     | 0              |
| COM_ALTER_PROCEDURE    | 0              |
| COM_ALTER_SERVER       | 0              |
| COM_ALTER_TABLE        | 14             |
| COM_ALTER_TABLESPACE   | 0              |
| COM_ANALYZE            | 0              |
| COM_BACKUP_TABLE       | 0              |
| COM_BEGIN              | 0              |
| COM_BINLOG             | 0              |
. . .
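These counters are cumulative since server startup, so a ratio is just arithmetic over two of them. For example, here's one quick sketch – using the standard ABORTED_CONNECTS and CONNECTIONS counters – that computes the percentage of connection attempts that fail (VARIABLE_VALUE is stored as text, but MySQL casts it for the math):

mysql> select round(100 *
    ->        sum(case when variable_name = 'ABORTED_CONNECTS'
    ->                 then variable_value end) /
    ->        sum(case when variable_name = 'CONNECTIONS'
    ->                 then variable_value end), 2) pct_failed_connects
    -> from information_schema.global_status
    -> where variable_name in ('ABORTED_CONNECTS','CONNECTIONS');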
These are good for a start, but then you'll need to go a little deeper and do some more work to ferret out problem SQL – let's look at that now.
MySQL supplies two logs that let you see the SQL code being sent to a server: the general query log captures all SQL, while the slow query log only collects SQL that meets certain performance criteria (e.g., statements that run longer than a specified amount of time). Prior to MySQL 5.1, these logs weren't that simple to use: you had to set an option in the MySQL configuration file, stop and restart the server, and, when you were finished collecting what you wanted, reverse the process.
Now in 5.1, two things have changed: (1) you can dynamically start and stop tracing SQL into both logs by issuing a command from a query prompt, and (2) you have the option of tracing to a file or a table, with file (the pre-5.1 behavior) being the default. Dynamically starting and stopping a trace gives you much more control over how much data you collect, and being able to query the traced information gives you more flexibility in analyzing and working with what you collect.
To trace issued SQL into tables instead of files, issue the following command:
mysql> set global log_output='table';
Query OK, 0 rows affected (0.00 sec)
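(A side note: log_output also accepts a comma-separated combination if you want both destinations at once – set global log_output='table,file'; – and the special value 'none' suppresses the log destinations entirely.)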
Note that this tells MySQL to push all traced SQL into the appropriate log tables, which are based on the CSV (comma-separated values) storage engine. One nice aspect of the CSV tables is that you can take the underlying files and import them directly into Microsoft Excel or OpenOffice Calc to prepare reports and the like.
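If you want to grab those files, they live under the mysql database directory inside your server's data directory – for example (the path below is just an illustration; your datadir will differ):

mysql> show variables like 'datadir';
+---------------+-----------------+
| Variable_name | Value           |
+---------------+-----------------+
| datadir       | /var/lib/mysql/ |
+---------------+-----------------+
1 row in set (0.00 sec)

Here, the general query log's data would sit in /var/lib/mysql/mysql/general_log.CSV.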
Next, to enable tracing simply issue another command from a MySQL client prompt. For example, to trace all SQL sent to a MySQL Server, you can do the following:
mysql> set global general_log=1;
Query OK, 0 rows affected (0.00 sec)
You can then trace to your heart's content and once you're satisfied with what you've got, just disable tracing again:
mysql> select count(*) from gim2.broker;
+----------+
| count(*) |
+----------+
|       23 |
+----------+
1 row in set (0.00 sec)

mysql> set global general_log=0;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from mysql.general_log\G
*************************** 1. row ***************************
  event_time: 2009-01-21 09:53:03
   user_host: root[root] @ [154.231.0.3]
   thread_id: 27
   server_id: 0
command_type: Query
    argument: select count(*) from gim2.broker
*************************** 2. row ***************************
  event_time: 2009-01-21 09:53:08
   user_host: root[root] @ [154.231.0.3]
   thread_id: 27
   server_id: 0
command_type: Query
    argument: set global general_log=0
2 rows in set (0.00 sec)
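One housekeeping note: the log tables keep growing for as long as tracing is enabled, so between test runs you may want to empty them out – TRUNCATE TABLE works on the log tables for exactly this purpose:

mysql> truncate table mysql.general_log;
Query OK, 0 rows affected (0.00 sec)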
Admittedly, the MySQL general query log doesn't contain a wealth of diagnostic information to use:
mysql> desc mysql.general_log;
+--------------+-------------+------+-----+-------------------+
| Field        | Type        | Null | Key | Default           |
+--------------+-------------+------+-----+-------------------+
| event_time   | timestamp   | NO   |     | CURRENT_TIMESTAMP |
| user_host    | mediumtext  | NO   |     | NULL              |
| thread_id    | int(11)     | NO   |     | NULL              |
| server_id    | int(11)     | NO   |     | NULL              |
| command_type | varchar(64) | NO   |     | NULL              |
| argument     | mediumtext  | NO   |     | NULL              |
+--------------+-------------+------+-----+-------------------+
This isn't to say there's no value there – by parsing the argument column you can certainly ascertain workload patterns, the originating source of queries, and other diagnostics (e.g., how often a particular table is accessed). But the slow query log/table contains better stats for troubleshooting SQL:
mysql> desc mysql.slow_log;
+----------------+--------------+------+-----+-------------------+
| Field          | Type         | Null | Key | Default           |
+----------------+--------------+------+-----+-------------------+
| start_time     | timestamp    | NO   |     | CURRENT_TIMESTAMP |
| user_host      | mediumtext   | NO   |     | NULL              |
| query_time     | time         | NO   |     | NULL              |
| lock_time      | time         | NO   |     | NULL              |
| rows_sent      | int(11)      | NO   |     | NULL              |
| rows_examined  | int(11)      | NO   |     | NULL              |
| db             | varchar(512) | NO   |     | NULL              |
| last_insert_id | int(11)      | NO   |     | NULL              |
| insert_id      | int(11)      | NO   |     | NULL              |
| server_id      | int(11)      | NO   |     | NULL              |
| sql_text       | mediumtext   | NO   |     | NULL              |
+----------------+--------------+------+-----+-------------------+
Tracing 'slow' running SQL is similar to collecting all SQL, but there are one or two extra steps depending on the criteria you want to use. First, you need to tell MySQL how 'slow' a query must be in order to be collected. This is currently expressed in seconds if you trace to a table, but it can be expressed in microseconds if you send the output to a file. We'll just trace to a table for now and set the threshold to one second:
mysql> set global long_query_time=1;
Query OK, 0 rows affected (0.00 sec)
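You can confirm the threshold that's in force at any time (the fractional display below assumes a 5.1 server, where the variable carries microsecond resolution):

mysql> show global variables like 'long_query_time';
+-----------------+----------+
| Variable_name   | Value    |
+-----------------+----------+
| long_query_time | 1.000000 |
+-----------------+----------+
1 row in set (0.00 sec)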
Once you do this, you can then start tracing slow running SQL by issuing this command:
mysql> set global slow_query_log=1;
Query OK, 0 rows affected (0.00 sec)
Now you'll be capturing all SQL that takes longer than one second to execute – such as this one:
mysql> select b.client_id,
    ->        b.client_first_name,
    ->        b.client_last_name,
    ->        (select sum(number_of_units * price)
    ->         from client_transaction c
    ->         where c.action = 'buy' and
    ->               c.client_id = b.client_id) -
    ->        (select sum(number_of_units * price)
    ->         from client_transaction d
    ->         where d.action = 'sell' and
    ->               d.client_id = b.client_id) portfolio_value
    -> from client_transaction a,
    ->      client b
    -> where a.client_id = b.client_id
    -> group by b.client_id,
    ->          b.client_first_name,
    ->          b.client_last_name
    -> having portfolio_value > 1000000;
+-----------+-------------------+------------------+-----------------+
| client_id | client_first_name | client_last_name | portfolio_value |
+-----------+-------------------+------------------+-----------------+
|         5 | ABNER             | ROSSELLETT       |      1252115.50 |
|       500 | CANDICE           | BARTLETT         |      1384877.50 |
+-----------+-------------------+------------------+-----------------+
2 rows in set (3.70 sec)
mysql> select * from mysql.slow_log\G
*************************** 1. row ***************************
    start_time: 2009-01-21 10:10:50
     user_host: root[root] @ [154.231.0.3]
    query_time: 00:00:03
     lock_time: 00:00:00
     rows_sent: 2
 rows_examined: 1323751
            db: gim2
last_insert_id: 0
     insert_id: 0
     server_id: 0
      sql_text: select b.client_id, b.client_first_name, b.client_last_name, (select sum(number_of_units * price) from client_transaction c where c.action = 'buy' and c.client_id = b.client_id) - (select sum(number_of_units * price) from client_transaction d where d.action = 'sell' and d.client_id = b.client_id) portfolio_value from client_transaction a, client b where a.client_id = b.client_id group by b.client_id, b.client_first_name, b.client_last_name having portfolio_value > 1000000
1 row in set (0.02 sec)
Notice again that you've got better information to work with in the slow_log table for understanding what's going on under the covers – you can see how many rows were examined, the database being accessed, the elapsed time, and more.
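And because the trace lands in a table, you can mine it with plain SQL. For example, here's a quick way (a simple sketch against the slow_log columns shown earlier) to surface the worst offenders you've collected so far:

mysql> select sql_text, query_time, rows_examined, rows_sent
    -> from mysql.slow_log
    -> order by query_time desc, rows_examined desc
    -> limit 10;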
Now you have a bucket of bad SQL to work with – OK, what's next?
Some people absolutely love tinkering with SQL code with the goal of making it run better. And I'll admit there's a certain smell of victory in the air when you get a query to run much faster than it used to – your colleagues will admire you, your boss will be impressed, and people will want to date you. Well, the last part won't likely happen, but you get what I mean.
But the thing is, working with ugly SQL can be a pretty demanding task. There are companies out there that specialize in automatically rewriting bad SQL code, but most don't excel at it. Vendors such as Oracle have attempted the same with rewrites using stored outlines, advice on index additions, or materialized view suggestions. But sadly, it's still largely a manual process these days.
Of course, you can always start with an EXPLAIN of some bad SQL you've caught in your slow query log web:
mysql> explain select b.client_id,
    ->        b.client_first_name,
    ->        b.client_last_name,
    ->        (select sum(number_of_units * price)
    ->         from client_transaction c
    ->         where c.action = 'buy' and
    ->               c.client_id = b.client_id) -
    ->        (select sum(number_of_units * price)
    ->         from client_transaction d
    ->         where d.action = 'sell' and
    ->               d.client_id = b.client_id) portfolio_value
    -> from client_transaction a,
    ->      client b
    -> where a.client_id = b.client_id
    -> group by b.client_id,
    ->          b.client_first_name,
    ->          b.client_last_name
    -> having portfolio_value > 1000000\G
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: b
         type: ALL
possible_keys: PRIMARY
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 5682
        Extra: Using temporary; Using filesort
*************************** 2. row ***************************
           id: 1
  select_type: PRIMARY
        table: a
         type: ref
possible_keys: client_transaction_client
          key: client_transaction_client
      key_len: 4
          ref: gim2.b.client_id
         rows: 1
        Extra: Using index
*************************** 3. row ***************************
           id: 3
  select_type: DEPENDENT SUBQUERY
        table: d
         type: ref
possible_keys: client_transaction_client
          key: client_transaction_client
      key_len: 4
          ref: func
         rows: 1
        Extra: Using where
*************************** 4. row ***************************
           id: 2
  select_type: DEPENDENT SUBQUERY
        table: c
         type: ref
possible_keys: client_transaction_client
          key: client_transaction_client
      key_len: 4
          ref: func
         rows: 1
        Extra: Using where
4 rows in set (0.00 sec)
Honestly, though, there's not much info that stands out for the query in question. So what's the next step?
In MySQL 5.1, you can now dig a little deeper using the SQL profiler, a utility that interrogates the execution of a SQL query in more depth than a standard EXPLAIN does. Using the tool is very straightforward – first you enable profiling for your user session:
mysql> set profiling=1;
Query OK, 0 rows affected (0.00 sec)
Then, you execute the query as normal in the MySQL command line utility:
mysql> select b.client_id,
    ->        b.client_first_name,
    ->        b.client_last_name,
    ->        (select sum(number_of_units * price)
    ->         from client_transaction c
    ->         where c.action = 'buy' and
    ->               c.client_id = b.client_id) -
    ->        (select sum(number_of_units * price)
    ->         from client_transaction d
    ->         where d.action = 'sell' and
    ->               d.client_id = b.client_id) portfolio_value
    -> from client_transaction a,
    ->      client b
    -> where a.client_id = b.client_id
    -> group by b.client_id,
    ->          b.client_first_name,
    ->          b.client_last_name
    -> having portfolio_value > 1000000;
+-----------+-------------------+------------------+-----------------+
| client_id | client_first_name | client_last_name | portfolio_value |
+-----------+-------------------+------------------+-----------------+
|         5 | ABNER             | ROSSELLETT       |      1252115.50 |
|       500 | CANDICE           | BARTLETT         |      1384877.50 |
+-----------+-------------------+------------------+-----------------+
2 rows in set (4.03 sec)
Afterwards, if you don't want the utility to capture the execution of other statements that you run, you can then disable profiling like this:
mysql> set profiling=0;
Query OK, 0 rows affected (0.00 sec)
MySQL captures the diagnostic information in a table that you can then reference for your particular session. You have SHOW commands available, such as the following:
mysql> show profiles\G
*************************** 1. row ***************************
Query_ID: 1
Duration: 4.03080100
   Query: select b.client_id, b.client_first_name, b.client_last_name, (select sum(number_of_units * price) from client_transaction c where c.action = 'buy' and c.client_id = b.client_id) - (select sum(number_of_units * price) from client_transaction d where d.a
1 row in set (0.02 sec)
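For a quick per-step breakdown of a single statement, there's also the SHOW PROFILE command; for example:

mysql> show profile cpu for query 1;

This walks through each execution state of the query along with its elapsed and CPU time – the same data we'll slice with straight SQL next.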
And you can run more sophisticated analysis using standard SQL against the profiling table such as:
mysql> select min(seq) seq, state, count(*) numb_ops,
    ->        round(sum(duration),5) sum_dur, round(avg(duration),5) avg_dur,
    ->        round(sum(cpu_user),5) sum_cpu, round(avg(cpu_user),5) avg_cpu
    -> from information_schema.profiling
    -> where query_id = 1
    -> group by state
    -> order by seq;
+-------+----------------------+----------+---------+---------+---------+---------+
| seq   | state                | numb_ops | sum_dur | avg_dur | sum_cpu | avg_cpu |
+-------+----------------------+----------+---------+---------+---------+---------+
|     1 | starting             |        1 | 0.00015 | 0.00015 | 0.00000 | 0.00000 |
|     2 | Opening tables       |        1 | 0.00003 | 0.00003 | 0.00100 | 0.00100 |
|     3 | System lock          |        2 | 0.00001 | 0.00001 | 0.00000 | 0.00000 |
|     4 | Table lock           |        2 | 0.00011 | 0.00005 | 0.00000 | 0.00000 |
|     5 | init                 |        1 | 0.00007 | 0.00007 | 0.00000 | 0.00000 |
|     6 | optimizing           |        3 | 0.00004 | 0.00001 | 0.00000 | 0.00000 |
|     7 | statistics           |        3 | 0.00006 | 0.00002 | 0.00000 | 0.00000 |
|     8 | preparing            |        3 | 0.00004 | 0.00001 | 0.00000 | 0.00000 |
|     9 | Creating tmp table   |        1 | 0.00005 | 0.00005 | 0.00000 | 0.00000 |
|    10 | executing            |    37351 | 0.17083 | 0.00000 | 0.04899 | 0.00000 |
|    11 | Copying to tmp table |        1 | 0.00005 | 0.00005 | 0.00000 | 0.00000 |
|    16 | Sending data         |    37351 | 3.85767 | 0.00010 | 3.76143 | 0.00010 |
| 74718 | Sorting result       |        1 | 0.00148 | 0.00148 | 0.00200 | 0.00200 |
| 74720 | end                  |        2 | 0.00001 | 0.00001 | 0.00000 | 0.00000 |
| 74721 | removing tmp table   |        1 | 0.00011 | 0.00011 | 0.00000 | 0.00000 |
| 74723 | query end            |        1 | 0.00001 | 0.00001 | 0.00000 | 0.00000 |
| 74724 | freeing items        |        1 | 0.00006 | 0.00006 | 0.00000 | 0.00000 |
| 74725 | logging slow query   |        2 | 0.00001 | 0.00001 | 0.00000 | 0.00000 |
| 74727 | Opening table        |        1 | 0.00001 | 0.00001 | 0.00000 | 0.00000 |
| 74730 | cleaning up          |        1 | 0.00001 | 0.00001 | 0.00000 | 0.00000 |
+-------+----------------------+----------+---------+---------+---------+---------+
20 rows in set (0.80 sec)
From the above data, we can see that virtually all the time is spent in two operations – executing and Sending data. Sorts, copies to temp tables, and the like really aren't a problem, even though the EXPLAIN suggested they were in play.
Given that subqueries are involved in the query, it may be reasonable to try folding one of them up into the main query, so let's do that and see what happens:
mysql> set profiling=1;
Query OK, 0 rows affected (0.00 sec)

mysql> select b.client_id,
    ->        b.client_first_name,
    ->        b.client_last_name,
    ->        sum(number_of_units * price) -
    ->        case
    ->          (select sum(number_of_units * price)
    ->           from client_transaction d
    ->           where d.action = 'sell' and
    ->                 d.client_id = b.client_id)
    ->          when NULL then 0
    ->          else
    ->            (select sum(number_of_units * price)
    ->             from client_transaction d
    ->             where d.action = 'sell' and
    ->                   d.client_id = b.client_id)
    ->        end portfolio_value
    -> from client_transaction a,
    ->      client b
    -> where a.client_id = b.client_id and
    ->       action = 'buy'
    -> group by b.client_id,
    ->          b.client_first_name,
    ->          b.client_last_name
    -> having portfolio_value > 1000000;
+-----------+-------------------+------------------+-----------------+
| client_id | client_first_name | client_last_name | portfolio_value |
+-----------+-------------------+------------------+-----------------+
|         5 | ABNER             | ROSSELLETT       |      1252115.50 |
|       500 | CANDICE           | BARTLETT         |      1384877.50 |
+-----------+-------------------+------------------+-----------------+
2 rows in set (0.45 sec)

mysql> set profiling=0;
Query OK, 0 rows affected (0.00 sec)

mysql> select min(seq) seq, state, count(*) numb_ops,
    ->        round(sum(duration),5) sum_dur, round(avg(duration),5) avg_dur,
    ->        round(sum(cpu_user),5) sum_cpu, round(avg(cpu_user),5) avg_cpu
    -> from information_schema.profiling
    -> where query_id = 2
    -> group by state
    -> order by seq;
+-------+----------------------+----------+---------+---------+---------+---------+
| seq   | state                | numb_ops | sum_dur | avg_dur | sum_cpu | avg_cpu |
+-------+----------------------+----------+---------+---------+---------+---------+
|     1 | starting             |        1 | 0.00019 | 0.00019 | 0.00100 | 0.00100 |
|     2 | Opening tables       |        1 | 0.00004 | 0.00004 | 0.00000 | 0.00000 |
|     3 | System lock          |        1 | 0.00001 | 0.00001 | 0.00000 | 0.00000 |
|     4 | Table lock           |        1 | 0.00001 | 0.00001 | 0.00000 | 0.00000 |
|     5 | init                 |        1 | 0.00012 | 0.00012 | 0.00000 | 0.00000 |
|     6 | optimizing           |        3 | 0.00005 | 0.00002 | 0.00000 | 0.00000 |
|     7 | statistics           |        3 | 0.00006 | 0.00002 | 0.00000 | 0.00000 |
|     8 | preparing            |        3 | 0.00005 | 0.00002 | 0.00000 | 0.00000 |
|     9 | Creating tmp table   |        1 | 0.00006 | 0.00006 | 0.00000 | 0.00000 |
|    10 | executing            |    11193 | 0.05172 | 0.00000 | 0.01700 | 0.00000 |
|    11 | Copying to tmp table |        1 | 0.00009 | 0.00009 | 0.00000 | 0.00000 |
|    16 | Sending data         |    11193 | 0.40741 | 0.00004 | 0.37494 | 0.00003 |
| 22402 | Sorting result       |        1 | 0.00245 | 0.00245 | 0.00300 | 0.00300 |
| 22404 | end                  |        2 | 0.00001 | 0.00001 | 0.00000 | 0.00000 |
| 22405 | removing tmp table   |        1 | 0.00012 | 0.00012 | 0.00000 | 0.00000 |
| 22407 | query end            |        1 | 0.00001 | 0.00001 | 0.00000 | 0.00000 |
| 22408 | freeing items        |        1 | 0.00006 | 0.00006 | 0.00000 | 0.00000 |
| 22409 | logging slow query   |        1 | 0.00000 | 0.00000 | 0.00000 | 0.00000 |
| 22410 | cleaning up          |        1 | 0.00001 | 0.00001 | 0.00000 | 0.00000 |
+-------+----------------------+----------+---------+---------+---------+---------+
19 rows in set (0.73 sec)

mysql> explain select b.client_id,
    ->        b.client_first_name,
    ->        b.client_last_name,
    ->        sum(number_of_units * price) -
    ->        case
    ->          (select sum(number_of_units * price)
    ->           from client_transaction d
    ->           where d.action = 'sell' and
    ->                 d.client_id = b.client_id)
    ->          when NULL then 0
    ->          else
    ->            (select sum(number_of_units * price)
    ->             from client_transaction d
    ->             where d.action = 'sell' and
    ->                   d.client_id = b.client_id)
    ->        end portfolio_value
    -> from client_transaction a,
    ->      client b
    -> where a.client_id = b.client_id and
    ->       action = 'buy'
    -> group by b.client_id,
    ->          b.client_first_name,
    ->          b.client_last_name
    -> having portfolio_value > 1000000\G
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: b
         type: ALL
possible_keys: PRIMARY
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 5682
        Extra: Using temporary; Using filesort
*************************** 2. row ***************************
           id: 1
  select_type: PRIMARY
        table: a
         type: ref
possible_keys: client_transaction_client
          key: client_transaction_client
      key_len: 4
          ref: gim2.b.client_id
         rows: 1
        Extra: Using where
*************************** 3. row ***************************
           id: 3
  select_type: DEPENDENT SUBQUERY
        table: d
         type: ref
possible_keys: client_transaction_client
          key: client_transaction_client
      key_len: 4
          ref: func
         rows: 1
        Extra: Using where
*************************** 4. row ***************************
           id: 2
  select_type: DEPENDENT SUBQUERY
        table: d
         type: ref
possible_keys: client_transaction_client
          key: client_transaction_client
      key_len: 4
          ref: func
         rows: 1
        Extra: Using where
4 rows in set (0.01 sec)
That did the trick: the profiler shows much better performance (0.45 seconds vs. 4.03), even though the EXPLAIN output barely changed.
With the SQL profiler and the 5.1 additions on the general and slow query logs, you're in much better shape to find and troubleshoot bad SQL. But what if you have a lot of servers to manage or want more data to work with?
If you're a MySQL Enterprise customer, you just got a huge new benefit with the MySQL Enterprise release that shipped at the end of 2008. The new MySQL Query Analyzer was added to the MySQL Enterprise Monitor, and it packs a lot of punch for those wanting to ensure their systems are free of badly running SQL code.
I won't make a sales pitch for the new addition – you can read up on it here: http://www.mysql.com/products/enterprise/query.html. But there are a couple of things I particularly like about it from a DBA perspective.
I maintain my position that design is the biggest contributor to overall database performance, but right on its heels is well-crafted SQL code. With MySQL 5.1, you've now got more firepower at your disposal to ensure badly running SQL isn't draining the life out of your systems, and in MySQL Enterprise, the new Query Analyzer makes locating bad SQL across all your servers really easy.
To understand more about the general and slow query logs, visit http://dev.mysql.com/doc/refman/5.1/en/query-log.html and http://dev.mysql.com/doc/refman/5.1/en/slow-query-log.html. For more information on the SQL profiler, you can review the documentation found in the MySQL 5.1 manual at: http://dev.mysql.com/doc/refman/5.1/en/show-profiles.html.
If you haven't downloaded MySQL 5.1, don't wait any longer – you can get your copy here: http://dev.mysql.com/downloads/mysql/5.1.html. And you can also get a trial of the MySQL Enterprise Monitor with the new Query Analyzer at: http://www.mysql.com/trials/.
We certainly appreciate all the help and feedback you can throw our way regarding the new SQL diagnostic tools – thanks as always for your support of MySQL and Sun.