All strings sent from the JDBC driver to the server are converted automatically from native Java Unicode form to the client character encoding. This includes all queries sent via Statement.execute(), Statement.executeUpdate(), and Statement.executeQuery(), as well as all PreparedStatement and CallableStatement parameters, except for parameters set using setBytes(), setBinaryStream(), setAsciiStream(), setUnicodeStream(), and setBlob().
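As a rough illustration of what "converted to the client character encoding" means, the sketch below encodes the same Java string into two different encodings using only the JDK. It is not the driver's internal code; the class and method names (ClientEncodingSketch, encode) are hypothetical and exist only for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class ClientEncodingSketch {
    // Encodes a Java (UTF-16) string into the bytes of the given character
    // encoding, analogous to the conversion applied to query text and
    // string parameters before they are sent to the server.
    static byte[] encode(String text, Charset clientEncoding) {
        return text.getBytes(clientEncoding);
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // "cafe" with an acute accent on the e
        // The accented character takes two bytes in UTF-8 and one in ISO-8859-1.
        System.out.println(encode(text, StandardCharsets.UTF_8).length);      // prints 5
        System.out.println(encode(text, StandardCharsets.ISO_8859_1).length); // prints 4
    }
}
```

Parameters set with methods such as setBytes() skip this step, so the bytes supplied by the application reach the server unchanged.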
Prior to MySQL Server 4.1, Connector/J supported a single character encoding per connection, which could either be automatically detected from the server configuration, or could be configured by the user through the useUnicode and characterEncoding properties.
Starting with MySQL Server 4.1, Connector/J supports a single character encoding between client and server, and any number of character encodings for data returned by the server to the client in ResultSets.
The character encoding between client and server is automatically detected upon connection. The encoding used by the driver is specified on the server via the character_set system variable for server versions older than 4.1.0 and character_set_server for server versions 4.1.0 and newer. For more information, see Section 9.1.3.1, “Server Character Set and Collation”.
To override the automatically detected encoding on the client side, use the characterEncoding property in the URL used to connect to the server.
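A minimal sketch of setting characterEncoding in the connection URL follows. The host, database name, and the helper names (EncodingUrlSketch, buildUrl, open) are placeholders chosen for illustration, and open() assumes a Connector/J driver is on the classpath and a server is reachable.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

final class EncodingUrlSketch {
    // Builds a JDBC URL that asks the driver to use the given Java-style
    // encoding name instead of the automatically detected one.
    static String buildUrl(String host, String database, String javaEncoding) {
        return "jdbc:mysql://" + host + "/" + database
                + "?useUnicode=true&characterEncoding=" + javaEncoding;
    }

    // Opens a connection with the standard JDBC call; this needs a running
    // server and the driver on the classpath, so main() below does not call it.
    static Connection open(String url, String user, String password) throws SQLException {
        return DriverManager.getConnection(url, user, password);
    }

    public static void main(String[] args) {
        // Placeholder host and database name.
        System.out.println(buildUrl("localhost", "some_db", "UTF-8"));
    }
}
```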
When specifying character encodings on the client side, Java-style names should be used. The following table lists Java-style names for MySQL character sets:
MySQL to Java Encoding Name Translations.
MySQL Character Set Name | Java-Style Character Encoding Name |
ascii | US-ASCII |
big5 | Big5 |
gbk | GBK |
sjis | SJIS (or Cp932 or MS932 for MySQL Server < 4.1.11) |
cp932 | Cp932 or MS932 (MySQL Server > 4.1.11) |
gb2312 | EUC_CN |
ujis | EUC_JP |
euckr | EUC_KR |
latin1 | Cp1252 |
latin2 | ISO8859_2 |
greek | ISO8859_7 |
hebrew | ISO8859_8 |
cp866 | Cp866 |
tis620 | TIS620 |
cp1250 | Cp1250 |
cp1251 | Cp1251 |
cp1257 | Cp1257 |
macroman | MacRoman |
macce | MacCentralEurope |
utf8 | UTF-8 |
ucs2 | UnicodeBig |
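If an application needs to translate names in code, a few rows of the table could be kept in a small lookup, as in the sketch below. The class and method names (MysqlToJavaEncoding, javaName) are hypothetical, and only a subset of the rows is included.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

final class MysqlToJavaEncoding {
    // A subset of the table above: MySQL character set name -> Java-style name.
    private static final Map<String, String> NAMES = new LinkedHashMap<>();
    static {
        NAMES.put("ascii", "US-ASCII");
        NAMES.put("latin1", "Cp1252");
        NAMES.put("latin2", "ISO8859_2");
        NAMES.put("ujis", "EUC_JP");
        NAMES.put("utf8", "UTF-8");
        NAMES.put("ucs2", "UnicodeBig");
    }

    // Returns the Java-style name for a MySQL character set name
    // (case-insensitive), or throws if the name is not in this subset.
    static String javaName(String mysqlCharset) {
        String name = NAMES.get(mysqlCharset.toLowerCase(Locale.ROOT));
        if (name == null) {
            throw new IllegalArgumentException("No mapping listed for: " + mysqlCharset);
        }
        return name;
    }

    public static void main(String[] args) {
        System.out.println(javaName("latin1")); // prints Cp1252
    }
}
```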
Do not issue the query SET NAMES with Connector/J. The driver does not detect that the character set has changed and continues to use the character set detected during the initial connection setup.
To allow multiple character sets to be sent from the client, the UTF-8 encoding should be used, either by configuring utf8 as the default server character set, or by configuring the JDBC driver to use UTF-8 through the characterEncoding property.
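One way to configure the driver for UTF-8 is to pass the properties in a java.util.Properties object rather than in the URL. The sketch below assumes that approach; the class and method names (Utf8PropertiesSketch, utf8Properties, open) and the credentials are placeholders, and open() requires a Connector/J driver and a reachable server.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Properties;

final class Utf8PropertiesSketch {
    // Collects the connection properties, including the two encoding-related
    // ones named in this section.
    static Properties utf8Properties(String user, String password) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        props.setProperty("useUnicode", "true");
        props.setProperty("characterEncoding", "UTF-8");
        return props;
    }

    // Standard JDBC call; not invoked from main() because it needs a server.
    static Connection open(String url, Properties props) throws SQLException {
        return DriverManager.getConnection(url, props);
    }

    public static void main(String[] args) {
        // Placeholder credentials, printed only to show the resulting setting.
        Properties props = utf8Properties("app_user", "app_password");
        System.out.println(props.getProperty("characterEncoding")); // prints UTF-8
    }
}
```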
User Comments
I had been having trouble getting JDBC connections to use the useUnicode and characterEncoding parameters using a JDBC URL like this:
jdbc:mysql://localhost/some_db?useUnicode=yes&characterEncoding=UTF-8
My problem turned out to be that I was using MySQL Connector/J version 3.1.11. When I migrated to 3.1.13 the approach described here worked.
I had trouble with UTF-8, too. Sending UTF-8 characters to the database works fine, but when retrieving them, the JDBC driver doesn't decode them properly into the String object, so I used the following workaround, which worked for me:
String message = new String( rs.getBytes("message"), "UTF-8");
The message column is actually a VARCHAR. Anyway, it is working fine for me, but I am still not sure if it is the right way to do it.