@@ -1,4 +1,4 @@
- <!-- $PostgreSQL: pgsql/doc/src/sgml/charset.sgml,v 2.93 2009/04/06 08:42:52 heikki Exp $ -->
+ <!-- $PostgreSQL: pgsql/doc/src/sgml/charset.sgml,v 2.94 2009/05/06 16:15:20 tgl Exp $ -->

<chapter id="charset">
<title>Localization</>
@@ -20,11 +20,9 @@

<listitem>
<para>
- Providing a number of different character sets defined in the
- <productname>PostgreSQL</productname> server, including
- multiple-byte character sets, to support storing text in all
- kinds of languages, and providing character set translation between
- client and server.
+ Providing a number of different character sets to support storing text
+ in all kinds of languages, and providing character set translation
+ between client and server.
</para>
</listitem>
</itemizedlist>
@@ -75,8 +73,8 @@ initdb --locale=sv_SE
names on your system depends on what was provided by the operating
system vendor and what was installed. On most Unix systems, the command
<literal>locale -a</> will provide a list of available locales.
- Windows uses more verbose names, such as <literal>German_Germany</>
- or <literal>Swedish_Sweden.1252</>.
+ Windows uses more verbose locale names, such as <literal>German_Germany</>
+ or <literal>Swedish_Sweden.1252</>, but the principles are the same.
</para>

<para>
@@ -133,7 +131,7 @@ initdb --locale=sv_SE
fixed when the database is created. You can use different settings
for different databases, but once a database is created, you cannot
change them for that database anymore. <literal>LC_COLLATE</literal>
- and <literal>LC_CTYPE</literal> are those categories. They affect
+ and <literal>LC_CTYPE</literal> are these categories. They affect
the sort order of indexes, so they must be kept fixed, or indexes on
text columns will become corrupt. The default values for these
categories are determined when <command>initdb</command> is run, and
@@ -169,7 +167,7 @@ initdb --locale=sv_SE
For a given locale category, say the collation, the following
environment variables are consulted in this order until one is
found to be set: <envar>LC_ALL</envar>, <envar>LC_COLLATE</envar>
- (the variable corresponding to the respective category),
+ (or the variable corresponding to the respective category),
<envar>LANG</envar>. If none of these environment variables are
set then the locale defaults to <literal>C</literal>.
</para>
@@ -186,8 +184,9 @@ initdb --locale=sv_SE

<para>
To enable messages to be translated to the user's preferred language,
- <acronym>NLS</acronym> must have been enabled at build time. This
- choice is independent of the other locale support.
+ <acronym>NLS</acronym> must have been selected at build time
+ (<literal>configure --enable-nls</>). All other locale support is
+ built in automatically.
</para>
</sect2>

@@ -325,6 +324,7 @@ initdb --locale=sv_SE
<envar>LC_COLLATE</> locale settings. For <literal>C</> or
<literal>POSIX</> locale, any character set is allowed, but for other
locales there is only one character set that will work correctly.
+ (On Windows, however, UTF-8 encoding can be used with any locale.)
</para>

<sect2 id="multibyte-charset-supported">
@@ -752,6 +752,14 @@ createdb -E EUC_KR -T template0 --lc-collate=ko_KR.euckr --lc-ctype=ko_KR.euckr
CREATE DATABASE korean WITH ENCODING 'EUC_KR' LC_COLLATE='ko_KR.euckr' LC_CTYPE='ko_KR.euckr' TEMPLATE=template0;
</programlisting>

+ Notice that the above commands specify copying the <literal>template0</>
+ database. When copying any other database, the encoding and locale
+ settings cannot be changed from those of the source database, because
+ that might result in corrupt data. For more information see
+ <xref linkend="manage-ag-templatedbs">.
+ </para>
+
+ <para>
The encoding for a database is stored in the system catalog
<literal>pg_database</literal>. You can see it by using the
<option>-l</option> option or the <command>\l</command> command
@@ -777,7 +785,7 @@ $ <userinput>psql -l</userinput>
<para>
On most modern operating systems, <productname>PostgreSQL</productname>
can determine which character set is implied by an <envar>LC_CTYPE</>
- setting, and it will enforce that only the correct database encoding is
+ setting, and it will enforce that only the matching database encoding is
used. On older systems it is your responsibility to ensure that you use
the encoding expected by the locale you have selected. A mistake in
this area is likely to lead to strange misbehavior of locale-dependent
@@ -1225,7 +1233,7 @@ RESET client_encoding;

<listitem>
<para>
- The web site of the Unicode Consortium
+ The web site of the Unicode Consortium.
</para>
</listitem>
</varlistentry>
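The lookup order touched by the hunk at line 169 (<envar>LC_ALL</envar>, then the variable for the category itself, then <envar>LANG</envar>, falling back to C) can be illustrated with a small sketch. This is not PostgreSQL code: the function name `effective_locale` and its `env` parameter are hypothetical, and treating an empty value as "not set" is an assumption beyond what the changed text says.

```python
import os


def effective_locale(category, env=None):
    """Return the locale name chosen for one category, e.g. "LC_COLLATE".

    Hypothetical helper for illustration only; it mirrors the order
    described in the documentation text, not PostgreSQL's implementation.
    """
    env = os.environ if env is None else env
    # Order from the text: LC_ALL, then the category's own variable, then LANG.
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name)
        if value:  # assumption: an empty value counts as "not set"
            return value
    # None of the variables is set, so the locale defaults to C.
    return "C"
```

Calling `effective_locale("LC_COLLATE")` with no second argument reads the real process environment; passing a plain dictionary makes the precedence easy to check in isolation.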