A key determinant of the efficiency of compression in reducing
          the size of data files is the nature of the data itself.
          Recall that compression works by identifying repeated strings
          of bytes in a block of data. Completely randomized data is the
          worst case. Typical data often has repeated values, and so
          compresses effectively. Character strings often compress well,
          whether defined in CHAR,
          VARCHAR, TEXT or
BLOB columns. On the other hand, tables
          containing mostly binary data (integers or floating-point
          numbers) or data that is already compressed (for example,
          JPEG or PNG images) generally do not compress well, if at
          all.
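          One quick way to observe this effect from within the server
          is MySQL's built-in COMPRESS() function,
          which is based on the same zlib library as
          InnoDB table compression. In the sketch below, a highly
          repetitive 1000-byte string shrinks to a small fraction of
          its original size (the result of COMPRESS()
          includes a 4-byte length header):

            -- Repetitive character data: the compressed length is far
            -- smaller than the 1000-byte original.
            SELECT LENGTH(REPEAT('MySQL', 200))           AS original_bytes,
                   LENGTH(COMPRESS(REPEAT('MySQL', 200))) AS compressed_bytes;

          Running the same comparison on already-compressed binary
          data, such as image file contents, would show little or no
          reduction.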
        
Compression is chosen on a table-by-table basis with the
          InnoDB storage engine, and a table and all of its indexes use the
          same (compressed) page size. It might be that the primary key
          (clustered) index, which contains the data for all columns of
          a table, compresses more effectively than the secondary
indexes. For tables with long rows, compression may result
          in long column values being stored
          “off-page”, as discussed in
          Section 5.3, “DYNAMIC Row Format”. Those overflow
          pages may compress well. Given these considerations, for many
          applications, some tables compress more effectively than
          others, and you may find that your workload performs best only
          with a subset of tables compressed.
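          For reference, a compressed table is created by specifying
          ROW_FORMAT=COMPRESSED, optionally with a
          KEY_BLOCK_SIZE; this requires
          innodb_file_per_table to be enabled and the
          Barracuda file format. The table and column names below are
          only illustrative:

            SET GLOBAL innodb_file_per_table=1;
            SET GLOBAL innodb_file_format=Barracuda;

            -- t1 is a hypothetical table; KEY_BLOCK_SIZE=8 requests
            -- 8K compressed pages (half the 16K uncompressed size).
            CREATE TABLE t1 (
              id INT PRIMARY KEY,
              c  VARCHAR(255)
            ) ENGINE=InnoDB ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8;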
        
          Experimenting is the only way to determine whether or not to
          compress a particular table. InnoDB compresses data in 16K
          chunks corresponding to the uncompressed page size, and in
          addition to user data, the page format includes some internal
          system data that is not compressed. Compression utilities
          compress an entire stream of data, and so may find more
          repeated strings across the entire input stream than InnoDB
          would find in a table compressed in 16K chunks. But you can
get a sense of compression efficiency by using a utility
          that implements LZ77 compression (such as
          gzip or WinZip) on your data file.
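          One way to produce such a data file for testing, sketched
          here with an illustrative table name and path, is to dump
          the table contents with SELECT ... INTO
          OUTFILE and then run gzip on the
          resulting file, comparing its size before and after:

            -- Write the rows of t1 (a hypothetical table) to a flat
            -- file; the output path must be writable by the server.
            SELECT * FROM t1 INTO OUTFILE '/tmp/t1_sample.txt';

          Keep in mind that a flat text dump is only an approximation
          of how the same rows are laid out in InnoDB pages, so treat
          the result as a rough indicator.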
        
          Another way to test compression on a specific table is to copy
          some data from your uncompressed table to a similar,
          compressed table (having all the same indexes) and look at the
size of the resulting file. When you do so (while nothing
          else that uses compression is running), you can examine the
          ratio of
          successful compression operations to overall compression
          operations. (In the INNODB_CMP table,
          compare COMPRESS_OPS to
          COMPRESS_OPS_OK. See
          INNODB_CMP
          for more information.) If a high percentage of compression
          operations complete successfully, the table might be a good
          candidate for compression.
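          A minimal sketch of that check, querying the
          INNODB_CMP table in
          INFORMATION_SCHEMA, might look like this:

            -- One row per compressed page size in use; a success_ratio
            -- close to 1 suggests the data compresses predictably at
            -- that page size.
            SELECT PAGE_SIZE,
                   COMPRESS_OPS,
                   COMPRESS_OPS_OK,
                   COMPRESS_OPS_OK / COMPRESS_OPS AS success_ratio
            FROM   INFORMATION_SCHEMA.INNODB_CMP
            WHERE  COMPRESS_OPS > 0;

          Selecting from the companion
          INNODB_CMP_RESET table returns the same
          data and resets the counters, which is useful for isolating
          a single test.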
        