OpenText StreamServe 5.6.2 Code Pages and Unicode Support User Guide
Code pages and Unicode support
User Guide
Rev A
Open Text SA
40 Avenue Monterey, Luxembourg, Luxembourg L-2163
Tel: +352 264566-1
Copyright ©2014 Open Text SA and/or Open Text ULC. All Rights Reserved.
Open Text is a trademark or registered trademark of Open Text SA and/or Open Text ULC. The list of trademarks is not
exhaustive. Other trademarks, registered trademarks, product names, company names, brands and service names
mentioned herein are property of Open Text SA and/or Open Text ULC or other respective owners.
Disclaimer
Every effort has been made to ensure the accuracy of the features and techniques presented in this publication. However,
Open Text Corporation and its affiliates accept no responsibility and offer no warranty whether expressed or implied, for
the accuracy of this publication.
About code pages and Unicode support
The Unicode standard provides a code point for every character in modern use
worldwide. It enables plain text data to be transported through different
platforms, systems, and programs without corruption. Unicode standardizes
three encoding forms (UTF-8, UTF-16, and UTF-32) and seven encoding schemes
(UTF-8, UTF-16, UTF-16BE, UTF-16LE, UTF-32, UTF-32BE, and UTF-32LE).
A code page is a coded character set, in which each character is assigned a unique
code within the Unicode code space. Code pages usually cover only a small
subset of the Unicode characters.
For more information about the Unicode standard, see http://www.unicode.org.
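To make the difference between a code point, a Unicode encoding scheme, and a code page concrete, the following Python sketch encodes the euro sign (U+20AC) in several ways. It uses only Python's standard codecs and is an illustration, not part of StreamServe. The helper name encode_all and the selection of encodings are assumptions made for this example.

```python
# Illustration only: one code point, several byte representations.
SCHEMES = ["utf-8", "utf-16-be", "utf-16-le", "utf-32-be", "iso8859-15"]


def encode_all(char: str) -> dict:
    """Return the bytes produced for `char` by each encoding in SCHEMES."""
    return {name: char.encode(name) for name in SCHEMES}


if __name__ == "__main__":
    euro = "\u20ac"  # EURO SIGN
    for name, data in encode_all(euro).items():
        # Print the encoding name followed by the bytes in hexadecimal.
        print(f"{name:11} {data.hex(' ')}")
```

The same character occupies one byte in ISO 8859-15 and between two and four bytes in the Unicode encoding schemes, which is why the receiver must know which encoding was used.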
Code pages and Unicode support in StreamServe
In this example, input data is ISO 8859-15 encoded. The StreamServer converts
the input data to UTF-16, processes the data, and uses ISO 8859-15 to encode the
output data before sending it to the printer.
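The flow in this example can be sketched in Python as follows. This is a minimal illustration of the decode, process, encode sequence, not the StreamServer implementation. The function name convert, the little-endian byte order chosen for the intermediate UTF-16 step, and the absence of any real processing are assumptions.

```python
def convert(input_bytes: bytes, input_cp: str = "iso8859-15",
            output_cp: str = "iso8859-15") -> bytes:
    """Decode input, hold it as UTF-16, and re-encode it for output."""
    text = input_bytes.decode(input_cp)       # input code page -> Unicode
    internal = text.encode("utf-16-le")       # UTF-16 form (byte order assumed)
    # Real processing would operate on the Unicode text at this point.
    processed = internal.decode("utf-16-le")
    return processed.encode(output_cp)        # Unicode -> output code page


if __name__ == "__main__":
    data = b"Price: 10 \xa4"                  # 0xA4 is the euro sign in ISO 8859-15
    print(convert(data))
    print(convert(data, output_cp="utf-8"))
```

Because the intermediate form is Unicode, the output code page can differ from the input code page as long as it contains the characters that occur in the data.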
See Specifying code pages for input data on page 13 and Specifying code pages for
output data on page 21.
If you do not specify a code page for the input data, the StreamServer may fail to
process the data correctly. However, if the input data conforms to ISO 8859-1
(Latin 1), you do not have to specify a code page for the input. Similarly, if both
the input and output data conform to ISO 8859-1, you do not have to specify a
code page for the output.
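The following Python sketch illustrates what can go wrong when input is interpreted with the wrong code page. It assumes, based on the paragraph above, that ISO 8859-1 is what applies when nothing is specified. The helper name decode_with is hypothetical.

```python
def decode_with(raw: bytes, code_page: str = "iso8859-1") -> str:
    """Decode raw input bytes using the given code page."""
    return raw.decode(code_page)


if __name__ == "__main__":
    raw = b"Total: 25 \xa4"  # produced by an application that uses ISO 8859-15
    print(decode_with(raw))                # 0xA4 read as the currency sign
    print(decode_with(raw, "iso8859-15"))  # 0xA4 read as the euro sign
```

No error is raised in the first call. The byte is valid in both code pages, so the mistake only shows up as a wrong character in the output.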
Bidirectional text
Plain text data that contains Arabic or Hebrew text in logical order is treated the
same way as data that contains unidirectional left-to-right text. Arabic/Hebrew
text in visual order must be reordered to logical order before the StreamServer
processes the text. Output from the StreamServer can also be reordered from
logical to visual order if required (e.g. Arabic text in PDF output). See Bidirectional
text on page 27.
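As a rough aid for deciding whether bidirectional handling is relevant for a given piece of text, the following Python sketch checks for characters with a right-to-left bidirectional class. It relies on the standard unicodedata module. The function name contains_rtl is hypothetical, and the check cannot tell whether the text is stored in logical or visual order; that has to be known from the source application.

```python
import unicodedata

# Bidirectional classes for right-to-left letters:
# "R" (for example Hebrew) and "AL" (Arabic letters).
RTL_CLASSES = {"R", "AL"}


def contains_rtl(text: str) -> bool:
    """Return True if any character has a right-to-left bidirectional class."""
    return any(unicodedata.bidirectional(ch) in RTL_CLASSES for ch in text)


if __name__ == "__main__":
    print(contains_rtl("Invoice 42"))
    print(contains_rtl("Invoice \u05d0\u05d1"))  # Hebrew alef, bet
```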
Known limitations
The StreamServe Unicode support has some limitations:
• Unicode encoded text in overlays, created in the StreamServe Overlay
Editor, is not supported.
• The StreamServe MailOUT Process does not support Unicode. Instead, you
must use an SMTP (MIME) output connector and the appropriate Process.
• Do not use characters outside the ASCII range in executable scripts or for
variables.
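The last limitation can be checked mechanically before a script is deployed. The following Python sketch lists every character outside the ASCII range in a piece of script text. It is an external helper written for illustration only; the function name and the sample script line are assumptions and do not represent StreamServe tooling or syntax.

```python
def non_ascii_positions(script_text: str) -> list:
    """Return (line, column, character) for each non-ASCII character."""
    hits = []
    for line_no, line in enumerate(script_text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:  # outside the 7-bit ASCII range
                hits.append((line_no, col, ch))
    return hits


if __name__ == "__main__":
    sample = '$a = 1;\n$b = "\u00e9";'  # hypothetical script text
    for line_no, col, ch in non_ascii_positions(sample):
        print(f"line {line_no}, column {col}: U+{ord(ch):04X}")
```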
Code pages supported by the StreamServer
Name Description
Specifying code pages for input data
You must specify which code page the source application uses to encode the
input to the StreamServer. First, identify the code page used to encode the
input (see Identifying the code page used to encode input data on page 14). Then
select this code page when you configure the code page settings for the input in
the Design Center. Where possible, use a Unicode encoding scheme for the input
data.
Identifying the code page used to encode input data
Specifying code pages per input connector
Most significant byte first (Big Endian)
    When the input is encoded using the encoding schemes UTF-16BE (big endian
    without byte order mark) or UTF-16 (big endian with or without byte order
    mark).
Most significant byte last (Little Endian)
    When the input is encoded using the encoding schemes UTF-16LE (little endian
    without byte order mark) or UTF-16 (little endian with byte order mark).
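If you are unsure which byte order a UTF-16 file uses, the first two bytes can be inspected. The following Python sketch applies the rules listed above: a byte order mark decides the order, and data without a byte order mark is treated as big endian. It is an illustration that uses Python's standard codecs module; the function name decode_utf16 is hypothetical and the sketch does not describe how the StreamServer itself reads the data.

```python
import codecs


def decode_utf16(data: bytes) -> str:
    """Decode UTF-16 data, choosing the byte order from the byte order mark."""
    if data.startswith(codecs.BOM_UTF16_LE):   # bytes FF FE
        return data[len(codecs.BOM_UTF16_LE):].decode("utf-16-le")
    if data.startswith(codecs.BOM_UTF16_BE):   # bytes FE FF
        return data[len(codecs.BOM_UTF16_BE):].decode("utf-16-be")
    # No byte order mark: treat the data as big endian.
    return data.decode("utf-16-be")


if __name__ == "__main__":
    print(decode_utf16(b"\xff\xfeA\x00"))  # little endian with byte order mark
    print(decode_utf16(b"\xfe\xff\x00A"))  # big endian with byte order mark
    print(decode_utf16(b"\x00A"))          # no byte order mark
```

Little-endian data without a byte order mark (UTF-16LE) cannot be recognized this way, so for such input the byte order must be known in advance.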
In this example, input data received via the input connector is ISO 8859-15
encoded. A code page filter with the code page ISO 8859-15 is connected to the
input connector.
Specifying code pages per input type
Most significant byte first (Big Endian)
    When the input is encoded using the encoding schemes UTF-16BE (big endian
    without byte order mark) or UTF-16 (big endian with or without byte order
    mark).
Most significant byte last (Little Endian)
    When the input is encoded using the encoding schemes UTF-16LE (little endian
    without byte order mark) or UTF-16 (little endian with byte order mark).
In this example the input connector receives ISO 8859-15 encoded page formatted
data, and ISO 8859-2 encoded XML formatted data. A code page filter with the
code page ISO 8859-15 is connected to the PageIN branch, and a code page filter
with the code page ISO 8859-2 is connected to the XMLIN branch.
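The effect of attaching a different code page to each branch can be sketched in Python as follows. The branch names PageIN and XMLIN and the two code pages come from the example above; the dictionary, the function name, and the sample byte are assumptions made for illustration.

```python
# One code page per input type, as in the example above.
CODE_PAGE_BY_INPUT_TYPE = {
    "PageIN": "iso8859-15",
    "XMLIN": "iso8859-2",
}


def decode_for_branch(branch: str, data: bytes) -> str:
    """Decode data with the code page assigned to the given branch."""
    return data.decode(CODE_PAGE_BY_INPUT_TYPE[branch])


if __name__ == "__main__":
    same_byte = b"\xb3"
    # The same byte value is a different character in each code page.
    for branch in CODE_PAGE_BY_INPUT_TYPE:
        ch = decode_for_branch(branch, same_byte)
        print(f"{branch}: U+{ord(ch):04X}")
```

The byte 0xB3 is SUPERSCRIPT THREE in ISO 8859-15 and LATIN SMALL LETTER L WITH STROKE in ISO 8859-2, so applying one code page to both input types would corrupt one of them.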
Dynamically selecting code pages for the input to an Event
Prerequisites
• The input data must be represented by single-byte characters.
• No code page filter is added to the input connector that receives the input
data.
In this example the PageIN Event receives both ISO 8859-15 and ISO 8859-2
encoded input. The following lookup table is used to dynamically select the
appropriate code page:
Western ISO 8859-15
Eastern ISO 8859-2
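The lookup can be pictured with the following Python sketch, which reads the two-column table above and uses it to pick a code page for each piece of input. How the key (Western or Eastern) is obtained for a given input is not shown in this example, so here it is simply passed in as a parameter. All function names are hypothetical.

```python
LOOKUP_TABLE = """\
Western ISO 8859-15
Eastern ISO 8859-2
"""


def load_lookup(table_text: str) -> dict:
    """Parse 'key code-page' lines into a dictionary."""
    table = {}
    for line in table_text.splitlines():
        if not line.strip():
            continue                       # skip empty lines
        key, code_page = line.split(None, 1)
        table[key] = code_page.strip()
    return table


def to_python_codec(code_page: str) -> str:
    """Map a name such as 'ISO 8859-15' to the Python codec 'iso8859-15'."""
    return code_page.lower().replace(" ", "")


def decode_event_input(key: str, data: bytes, table: dict) -> str:
    """Decode data with the code page that the lookup table gives for key."""
    return data.decode(to_python_codec(table[key]))


if __name__ == "__main__":
    table = load_lookup(LOOKUP_TABLE)
    print(decode_event_input("Western", b"10 \xa4", table))
    print(decode_event_input("Eastern", b"z\xb3oty", table))
```

The sketch decodes one byte per character, which matches the prerequisite that the input is represented by single-byte characters.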
Specifying code pages for output data
The output can inherit the code page specified for the input, or you can specify a
new code page for the output. The code page you specify must be supported by
the output device (e.g. a printer).
Inherit code page
    If you want to use the same code page for both input and output.
Select a code page
    If you want to select a different code page for the output. This code page
    must cover at a minimum all the characters covered in the code page for the
    input.
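Whether an output code page covers every character of the input code page can be checked outside StreamServe. The following Python sketch does this for single-byte code pages by decoding each possible byte value with the input code page and attempting to encode the result with the output code page. The function name is hypothetical, and the codec names are Python's, not StreamServe's.

```python
def uncovered_characters(input_cp: str, output_cp: str) -> list:
    """Return characters of a single-byte input code page missing from output_cp."""
    missing = []
    for byte in range(256):
        try:
            char = bytes([byte]).decode(input_cp)
        except UnicodeDecodeError:
            continue                  # byte value is unassigned in input_cp
        try:
            char.encode(output_cp)
        except UnicodeEncodeError:
            missing.append(char)      # output_cp cannot represent this character
    return missing


if __name__ == "__main__":
    print(uncovered_characters("iso8859-15", "utf-8"))
    print(uncovered_characters("iso8859-15", "iso8859-1"))
```

An empty list means the output code page is a safe choice for that input code page. A Unicode encoding scheme such as UTF-8 always yields an empty list.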
Specifying code pages for support files and logs
This section describes how to specify code pages for support files and logs.
• Specifying code pages for table files on page 24
• Specifying code pages for function files on page 25
Specifying code pages for table files
//!codePage UTF8!
ENG Printer_1
SWE Printer_2
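To show how a reader of such a file can use the code page line, the following Python sketch parses the table file above. It is an external illustration, not StreamServe code. The function name, the regular expression for the directive, and the fallback to ISO 8859-1 when no directive is present are assumptions.

```python
import re

# Matches a first line such as: //!codePage UTF8!
DIRECTIVE = re.compile(rb"^//!codePage\s+([A-Za-z0-9_-]+)!", re.IGNORECASE)


def read_table_file(raw: bytes, default: str = "iso8859-1"):
    """Return (code page name, {key: value}) for the bytes of a table file."""
    first_line, _, rest = raw.partition(b"\n")
    match = DIRECTIVE.match(first_line.strip())
    if match:
        code_page = match.group(1).decode("ascii")
        body = rest
    else:
        code_page, body = default, raw   # assumed fallback, see lead-in
    rows = {}
    for line in body.decode(code_page).splitlines():
        parts = line.split(None, 1)      # key, then the rest of the line
        if len(parts) == 2:
            rows[parts[0]] = parts[1].strip()
    return code_page, rows


if __name__ == "__main__":
    raw = b"//!codePage UTF8!\nENG Printer_1\nSWE Printer_2\n"
    print(read_table_file(raw))
```

The directive line itself contains only ASCII characters, so it can be read before the encoding of the rest of the file is known.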
Specifying code pages for function files
CodePage UTF8
func timetotal()
...
Bidirectional text
Reordering visually ordered input
Reordering output to visual order
Visually ordered page formatted Arabic text is also shaped: the glyphs
for the letters are cursively joined, lam-alif ligatures are formed, and mirror
characters (e.g. parentheses and brackets) are mirrored. Contextual shaping must
therefore be performed on the text to create the correct sequences of
glyphs. Shaping is performed automatically if you enable reordering of page
formatted output. Shaping is restricted to text without vowel marks.
Note: Visually ordered Arabic PCL output must be UTF-8 encoded.
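Because shaping is restricted to text without vowel marks, it can be useful to check Arabic data for such marks before it is processed. The following Python sketch is one way to do that outside StreamServe. It assumes that the marks in question are the Arabic combining marks in the range U+064B to U+0652 (fathatan through sukun, including shadda); the function name and the range are assumptions, not a statement of what the StreamServer checks. The sketch also uses the standard unicodedata module to list which characters are flagged as mirrored in Unicode.

```python
import unicodedata

# Assumed range of Arabic vowel and related marks: U+064B .. U+0652.
ARABIC_MARKS = range(0x064B, 0x0653)


def has_vowel_marks(text: str) -> bool:
    """Return True if text contains a mark in the assumed Arabic mark range."""
    return any(ord(ch) in ARABIC_MARKS for ch in text)


def mirrored_characters(text: str) -> list:
    """Return the characters in text that Unicode flags as mirrored."""
    return [ch for ch in text if unicodedata.mirrored(ch)]


if __name__ == "__main__":
    plain = "\u0643\u062a\u0628"                     # kaf, teh, beh
    marked = "\u0643\u064e\u062a\u064e\u0628\u064e"  # same letters with fatha
    print(has_vowel_marks(plain), has_vowel_marks(marked))
    print(mirrored_characters("(a) [b]"))
```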