AU14C04-Codepages and DB2
Roland Schock
ARS Computer und Consulting GmbH
Session Code: C04
2014-09-10 | Platform: LUW
#IDUG
Overview
Character Sets
A, B, C, ... ᇹぁゆ ㌹ ㌺
agpx
A b c d 亹怔떟떥
Character Encoding
ASCII
• DBCS (double-byte character set): extends the SBCS concept from
one byte to two bytes per character
• Mainly used for Asian languages that need more than 256
characters
• Latin text takes twice the space it needs in an SBCS
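The size effect in the last bullet can be sketched in a few lines of Python. This is an illustration only: UTF-16-BE is used here as a stand-in for a fixed two-byte encoding, it is not one of the DBCS code pages, and the sample text is an assumption.

```python
# Illustration only: UTF-16-BE stands in for a fixed two-byte encoding.
# It is not a DBCS code page; it just shows the size effect.
text = "DB2 LUW"

single_byte = text.encode("latin-1")    # one byte per character
double_byte = text.encode("utf-16-be")  # two bytes per character for this text

# Number of characters, bytes in the one-byte form, bytes in the two-byte form
print(len(text), len(single_byte), len(double_byte))
```

For plain Latin text the two-byte form is exactly twice as long as the one-byte form.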
Unicode
UTF-8
Overview
• By default, DB2 servers and clients use the locale settings of the
operating system or user:
• Windows: The server process uses the default region settings of the
operating system.
• Linux/Unix: The code page is derived from the locale setting of the
instance user (i.e. the user running the database processes).
• Client (LUW): The current locale settings of the user determine the code
page used during CONNECT.
• Programming language: Java always uses Unicode when connecting
to a database via JDBC.
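To see which encoding the current user's locale implies, a small Python sketch can help. It only reports the operating system's view; the helper name client_encoding is hypothetical, and the mapping from an encoding name to a DB2 code page number is not shown.

```python
import codecs
import locale


def client_encoding() -> str:
    """Return the encoding name implied by the current user's locale."""
    try:
        # Adopt the user's locale settings (LANG, LC_ALL, ... on Linux/Unix).
        locale.setlocale(locale.LC_ALL, "")
    except locale.Error:
        pass  # the configured locale is not installed; keep the default
    name = locale.getpreferredencoding(False)
    try:
        # Normalize to Python's canonical codec name, e.g. "utf-8".
        return codecs.lookup(name).name
    except LookupError:
        return name  # encoding unknown to Python; report the raw name


print(client_encoding())
```

Running this as the instance user and as a client user shows whether the two sides start from the same locale settings.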
At prepare/bind time
Overview
Client: uses code page X
Server: uses code page Y
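What a conversion between the two sides amounts to can be sketched as follows. The choice of Windows-1252 for code page X and UTF-8 for code page Y is an assumption for illustration, and the helper convert is hypothetical; it is not DB2's conversion routine.

```python
def convert(data: bytes, source: str, target: str) -> bytes:
    """Re-encode bytes from the source code page into the target code page."""
    return data.decode(source).encode(target)


# Assumed for illustration: client uses Windows-1252, server uses UTF-8.
client_bytes = "Grüße".encode("cp1252")
server_bytes = convert(client_bytes, "cp1252", "utf-8")

# Byte length on the client side and on the server side
print(len(client_bytes), len(server_bytes))
```

The same five characters occupy five bytes on the client and seven bytes on the server, because ü and ß each need two bytes in UTF-8.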
Other considerations
More considerations
More considerations
Overview
Troubleshooting
Pitfalls
db2set DB2CODEPAGE
• Be sure you know what you are doing before you set the DB2
environment variable DB2CODEPAGE
• It tells DB2 that you will supply the right code points, regardless
of the symbols displayed.
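The risk behind this pitfall can be shown with a minimal Python sketch: bytes are produced in one code page but interpreted as another. The two code pages (UTF-8 and Windows-1252) and the sample name are assumptions; the sketch does not call DB2.

```python
original = "Müller"

# The application actually produces UTF-8 bytes ...
utf8_bytes = original.encode("utf-8")

# ... but they are interpreted as Windows-1252, as happens when the
# declared code page does not match the data that is really sent.
misread = utf8_bytes.decode("cp1252")

print(misread)  # MÃ¼ller

# The damage is reversible only while the misread text is still intact.
repaired = misread.encode("cp1252").decode("utf-8")
```

Nothing reports an error here: every byte is a valid code point in the wrong code page, so the data is stored without complaint and only looks wrong when displayed.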
db2set DB2CONSOLECP
Performance considerations
Links
• DB2 Infocenter
http://publib.boulder.ibm.com/infocenter/db2luw/v9/index.jsp
• Unicode
http://www.unicode.org
Roland Schock
ARS Computer und Consulting GmbH
C04
Code sets, NLS and character conversion vs. DB2