The C++ IO Streams and Locales
Rogue Wave Standard C++ Library Iostreams and Locale User's Guide and Reference
for
Rogue Wave's implementation of the Standard C++ Library.
Based on ANSI's Working Paper for Draft Proposed International Standard for
Information Systems--Programming Language C++.
Timothy A. Budd
Product Team:
Development:
Quality Engineering:
Manuals:
Support:
North Krimsley
Joe Delaney
Part #
RW81-01-100096
Printing Date:
October, 1996
Rogue Wave Software, Inc., 850 SW 35th St., Corvallis, Oregon, 97333 USA
Product Information:
(541) 754-3010
(800) 487-3217
Technical Support:
(541) 754-2311
FAX:
(541) 757-6650
http://www.roguewave.com
Table of Contents
1. Internationalization
1.1 How to Read this Section
1.2 Internationalization and Localization
1.2.1 Localizing Cultural Conventions
1.2.2 Character Encodings for Localizing Alphabets
1.2.3 Summary
1.3 The Standard C Locale and the Standard C++ Locales
1.3.1 The C Locale
1.3.2 The C++ Locales
1.3.3 Facets
1.3.4 Differences between the C Locale and the C++ Locales
1.3.5 Relationship between the C Locale and the C++ Locale
1.4 The Locale
1.5 The Facets
1.5.1 Creating a Facet Object
1.5.2 Accessing a Locale's Facets
1.5.3 Using a Stream's Facet
1.5.4 Creating a Facet Class for Replacement in a Locale
1.5.5 The Facet Id
1.5.6 Creating a Facet Class for Addition to a Locale
1.6 User-Defined Facets: An Example
1.6.1 A Phone Number Class
1.6.2 A Phone Number Formatting Facet Class
1.6.3 An Inserter for Phone Numbers
1.6.4 The Phone Number Facet Class Revisited
1.6.5 An Example of a Concrete Facet Class
1.6.6 Using Phone Number Facets
1.6.7 Formatting Phone Numbers
1.6.8 Improving the Inserter Function
2. Stream Input/Output
2.1 How to Read This Section
2.1.1 Code Examples
Copyright 1996 Rogue Wave Software, Inc. All rights reserved.
2.11.3 Caveat
2.12 Creating New Stream Classes by Derivation
2.12.1 Choosing a Base Class
2.12.2 Construction and Initialization
2.12.3 The Example
2.12.4 Using iword/pword for RTTI in Derived Streams
2.13 Defining A Code Conversion Facet
2.13.1 Categories of Code Conversions
2.13.2 Example 1: Defining a Tiny Character Code Conversion (ASCII <-> EBCDIC)
2.13.3 Error Indication in Code Conversion Facets
2.13.4 Example 2: Defining a Multibyte Character Code Conversion (JIS <-> Unicode)
2.14 Differences between Standard and Traditional Iostreams
2.14.1 The Character Type
2.14.2 Internationalization
2.14.3 File Streams
2.14.4 String Streams
2.14.5 Streams with Assign
2.15 Differences between Standard and Rogue Wave IOStreams
2.15.1 Extensions
2.15.2 Restrictions
2.15.3 Deprecated Features
Appendix
NOTE: See Part B for the Locale & Iostreams Reference Section (listed alphabetically)
1. Internationalization
1.2.1.1 Language
Of course, language itself varies from country to country, and even within a
country. Your program may require output messages in English, Deutsch,
Français, Italiano, or any number of languages commonly used in the world
today.
Languages may also differ in the alphabet they use. Examples of different
languages with their respective alphabets are given below:
American English:
German:
Greek:
and punctuation
1.2.1.2 Numbers
The representation of numbers depends on local customs, which vary from
country to country. For example, consider the radix character, the symbol
used to separate the integer portion of a number from the fractional portion.
Germany
10,00,000.55
Nepal
1.2.1.3 Currency
We are all aware that countries use different currencies. However, not
everyone realizes the many different ways we can represent units of
currency. For example, the symbol for a currency can vary. Here are two
different ways of representing the same amount in US dollars:
$24.99
US
USD 24.99
The placement of the currency symbol varies for different currencies, too,
appearing before, after, or even within the numeric value:
155
Japan
13,50 DM
Germany
14 19s. 6d.
-S 1,1
Austria
1,1 DM
-1,1 DM
Germany
SFr. 1.1
SFr.-1.1
Switzerland
HK$1.1
(HK$1.1)
Hong Kong
US
Hungary
29/10/96
Italy
29/10/1996
, 29 1996
Greece
29.10.96
Germany
US time
16:55 Uhr
German time
And the following example shows different representations of the same time:
11:45:15
Digital representation, US
11:45:15
1.2.1.5 Ordering
Languages may vary regarding collating sequence; that is, their rules for
ordering or sorting characters or strings. The following example shows the
same list of words ordered alphabetically by different collating sequences:
Sorted by ASCII rules [1]    Sorted by German rules
Airplane                     Airplane
Zebra                        ähnlich
bird                         bird
car                          car
ähnlich                      Zebra
The ASCII collation orders elements according to the numeric value of bytes,
which does not meet the requirements of English language dictionary
[1] ASCII stands for American Standard Code for Information
Interchange. A 7-bit code is used in the US.
chaleco      cuna
cuna         chaleco
día          día
llava        loro
loro         llava
maíz         maíz
The word llava is sorted after loro and before maíz, because in Spanish ll is
a digraph, i.e., it is treated as a single character that is sorted after l and
before m. Similarly, the digraph ch in Spanish is treated as a single character
to be sorted after c, but before d. Two characters that are paired and treated
as a single character are referred to as a two-to-one character code pair.
In other cases, one character is treated as if it were actually two characters.
The German single character ß, called the sharp s, is treated as ss. This
treatment makes a difference in the ordering, as shown in the example
below:
Rosselenker      Rosselenker
Rostbratwurst    Roßhaar
Roßhaar          Rostbratwurst
JIS X 0208-1983
JIS X 0208-1990
JIS X 0212-1990
JIS-ROMAN
ASCII
[Figure: JIS shift sequences. The sequence <ESC>$B shifts to Kanji (JIS X 0208-1983, two-byte characters); a further shift sequence shifts back to ASCII (one-byte characters).]
1. Any byte having a value in the range 0x21-7E is assumed to be a one-byte ASCII/JIS Roman character.
2. Any byte having a value in the range 0xA1-DF is assumed to be a one-byte half-width katakana character.
3.
While this encoding is more compact than JIS, it cannot represent as many
characters as JIS. In fact, Shift-JIS cannot represent any characters in the
supplemental character set JIS X 0212-1990, which contains more than 6,000
characters.
1. Any byte having a value in the range 0x21-7E is assumed to be a one-byte ASCII/JIS Roman character.
2. Any byte having a value in the range 0xA1-FE is assumed to be the first
byte of a two-byte character from the set JIS X0208-1990. The second
byte must also have a value in that range.
3.
4. Any byte having the value 0x8F is assumed to be followed by two more
bytes with values in the range 0xA1-FE, which together represent a
character from the set JIS X0212-1990.
The last two cases involve a prefix byte with values 0x8E and 0x8F,
respectively. These bytes are somewhat like shift sequences in that they
introduce a change in subsequent byte interpretation. However, unlike the
shift sequences in JIS which introduce a sequence, these prefix bytes must
precede every multibyte character, not just the first in a sequence. For this
reason, each multibyte character encoded in this manner stands alone and
EUC is not considered to involve shift states.
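Because each EUC character stands alone, its length can be read off its first byte without any shift state. The following is a minimal sketch of that idea, not a complete decoder: the function names are hypothetical, trailing bytes are not validated, and the length of 2 for the 0x8E prefix is an assumption (only the 0x8F case is spelled out in the rules above).

```cpp
#include <cstddef>

// Length in bytes of the EUC-encoded character that starts with `lead`,
// or 0 if `lead` cannot start a character under the rules above.
// Assumption: the 0x8E prefix introduces a two-byte character
// (prefix plus one byte); the text only details the 0x8F case.
std::size_t euc_char_length(unsigned char lead)
{
    if (lead >= 0x21 && lead <= 0x7E) return 1;  // ASCII/JIS Roman
    if (lead == 0x8E) return 2;                  // prefix + one byte (assumed)
    if (lead == 0x8F) return 3;                  // prefix + two bytes, JIS X0212-1990
    if (lead >= 0xA1 && lead <= 0xFE) return 2;  // JIS X0208-1990
    return 0;                                    // not covered by the rules
}

// Count the characters in an EUC byte sequence of n bytes.
// Returns -1 if a lead byte is invalid or a character is cut off.
// No state is carried from one character to the next.
long euc_count_chars(const unsigned char* p, std::size_t n)
{
    long count = 0;
    std::size_t i = 0;
    while (i < n) {
        std::size_t len = euc_char_length(p[i]);
        if (len == 0 || i + len > n) return -1;
        i += len;
        ++count;
    }
    return count;
}
```

A JIS decoder, by contrast, would need an extra variable that remembers the most recent shift sequence between calls.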
There are many wide character standards, including those shown below:
ISO 10646.UCS-2      16-bit characters
ISO 10646.UCS-4      32-bit characters
Unicode              16-bit characters
The programming language C++ supports wide characters; their native type
in C++ is called wchar_t. The syntax for wide character constants and wide
character strings is similar to that for ordinary, tiny character constants and
strings:
L'a' is a wide character constant, and
L"abc" is a wide character string.
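As a small illustration of the L prefix, the sketch below declares one wide character constant and one wide string and measures the string with the C library function wcslen(). The variable and function names are made up for this example.

```cpp
#include <cwchar>
#include <string>

const wchar_t wide_a = L'a';        // wide character constant
const wchar_t* wide_abc = L"abc";   // wide character string literal

// Number of wide characters in a null-terminated wide string.
std::size_t wide_length(const wchar_t* s)
{
    return std::wcslen(s);
}
```

The length is counted in wchar_t units, not in bytes; sizeof(wchar_t) varies between platforms.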
[Figure: text in an external file, encoded in JIS with the shift sequence <ESC>$B, is converted to Unicode in the internal buffer.]
1.2.3 Summary
In this section, we discussed a variety of issues involved in developing
software for worldwide use. For all of these areas in which cultural
conventions differ from one region to another, the Standard C++ Library
provides services that enable you to easily internationalize your C++
programs. These services include:
Content
LC_NUMERIC
LC_TIME
LC_MONETARY
LC_CTYPE
LC_COLLATE
Collation sequence
LC_MESSAGE
[Figure: the C locale. setlocale() loads the external representation of a locale into categories such as LC_NUMERIC, LC_MONETARY, and LC_TIME; struct lconv holds items such as decimal_point, thousand_separator, currency_symbol, and negative_sign; C library services such as scanf(), printf(), mbtowc(), isdigit(), and strftime() use this information.]
Information covered
setlocale(), ...
Character classification
strftime(), ...
strfmon()
Monetary functions
String collation
Multibyte functions
Message retrieval
[Figure: a locale in the C++ library holding the facets time_get<> (get_time(), get_date(), ...), time_put<> (put(), ...), and codecvt<> (convert(), ...).]
Figure 5. A C++ locale is a container of facets
1.3.3 Facets
Facet classes encapsulate data that represents a set of culture and language
dependencies, and offer a set of related internationalization services. Facet
classes are very flexible. They can contain just about any internationalization
service you can invent. The Standard C++ Library offers a number of
predefined standard facets, which provide services similar to those contained
in the C library. However, you could bundle additional internationalization
services into a new facet class, or purchase a facet library.
The names of the standard facets obey certain naming rules. The get facet
classes, like num_get and time_get, handle parsing. The put facet classes
handle formatting. The punct facet classes, like numpunct and moneypunct,
represent rules and symbols.
The Standard C locale is a global resource: there is only one locale for the
entire application. This makes it hard to build an application that has to
handle several locales at a time.
To explore this difference in further detail, let us see how locales are typically
used.
global locale must change between input and output operations. Before a
price is read from the English price list, the locale must be switched from the
German locale used for printing the invoice to a US English locale. Before
inserting the price into the invoice, the global locale must be switched back to
the German locale. To read the next input from the price list, the locale must
be switched back to English, and so forth. Figure 6 summarizes this activity:
[Figure 6: a price list read in a US English locale (49.99, 1200.00) and an invoice written in a German locale (DM 73,98, DM 1.776,00).]
[Figure: an English locale attached to the price list (49.99, 1200.00) and a German locale attached to the invoice (DM 73,98, DM 1.776,00).]
Because the examples given above are brief, switching locales might look like
a minor inconvenience. However, it is a major problem once code
conversions are involved.
To underscore the point, let us revisit the JIS encoding scheme using the shift
sequence described in Figure 2, and repeated below. With these encodings,
you will recall that you must maintain a shift state while parsing a character
sequence, as shown in Figure 8:
[Figure 8: JIS shift sequences. The sequence <ESC>$B shifts to Kanji (JIS X 0208-1983, two-byte characters); a further shift sequence shifts back to ASCII (one-byte characters).]
[Figure: an external file containing Japanese text in JIS, including the shift sequence <ESC>$B, is parsed into an internal buffer under the global locale.]
Figure 9. Parsing input from a multibyte file using the global C locale
The global C locale can be switched during parsing; for example, from a
locale object specifying the input to be in JIS encoding, to a locale object using
EUC encoding instead. The current shift state becomes invalid each time the
locale is switched, and you have to carefully maintain the shift state in an
application that switches locales.
As long as the locale switches are intentional, this problem can presumably
be solved. However, in multithreaded environments, the global C locale
may impose a severe problem, as it can be switched inadvertently by another
otherwise unrelated thread of execution. For this reason, internationalizing a
C program for a multithreaded environment is difficult.
If you use C++ locales, on the other hand, the problem simply goes away.
You can imbue each stream with a separate locale object, making inadvertent
switches impossible.
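The idea of one locale per stream can be sketched as follows. To stay independent of the locale names installed on a particular system, the German conventions are supplied by a small hypothetical facet, german_numpunct, rather than by a named locale; the function name format_both is likewise invented for this example.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet for illustration: German-style number punctuation.
class german_numpunct : public std::numpunct<char> {
protected:
    char do_decimal_point() const override { return ','; }
    char do_thousands_sep() const override { return '.'; }
    std::string do_grouping() const override { return "\3"; }
};

// Format the same value through two streams, each with its own locale.
// Neither stream touches the global locale.
void format_both(double value, std::string& english, std::string& german)
{
    std::ostringstream en, de;
    en.imbue(std::locale::classic());
    de.imbue(std::locale(std::locale::classic(), new german_numpunct));
    en.setf(std::ios_base::fixed, std::ios_base::floatfield);
    de.setf(std::ios_base::fixed, std::ios_base::floatfield);
    en.precision(2);
    de.precision(2);
    en << value;
    de << value;
    english = en.str();
    german = de.str();
}
```

Called with 1776.0, the first stream yields 1776.00 and the second 1.776,00. No locale has to be switched back and forth between the two insertions.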
Let us now see how C++ locales are intended to be used.
Global locale. There is a global locale in C++, as there is in C. You can make
a given locale object global by calling locale::global(). You can create
snapshots of the current global locale by calling the default constructor for a
locale locale::locale(). Snapshots are immutable locale objects and are
not affected by any subsequent changes to the global locale.
Internationalized components like iostreams use it as a default. If you do not
explicitly imbue your streams with any particular locale object, a snapshot of
the global locale is used.
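The snapshot behavior described above can be checked with a short sketch. The facet yes_no and the function snapshot_is_unaffected are hypothetical; the facet exists only to produce a global locale that differs visibly from the classic one.

```cpp
#include <locale>
#include <string>

// Hypothetical facet, used only to build a locale distinct from the classic one.
class yes_no : public std::numpunct<char> {
protected:
    std::string do_truename() const override { return "yes"; }
    std::string do_falsename() const override { return "no"; }
};

// Returns true if a snapshot taken before a call to locale::global()
// keeps its old contents while a later snapshot sees the new ones.
bool snapshot_is_unaffected()
{
    std::locale previous = std::locale::global(std::locale::classic());
    std::locale snapshot;   // default constructor: snapshot of the global locale
    std::locale::global(std::locale(std::locale::classic(), new yes_no));
    std::locale current;    // a new snapshot sees the new global locale
    bool ok = std::use_facet<std::numpunct<char> >(snapshot).truename() == "true"
           && std::use_facet<std::numpunct<char> >(current).truename() == "yes";
    std::locale::global(previous);   // restore the caller's global locale
    return ok;
}
```

Note that locale::global() returns the previous global locale, which makes it easy to restore the earlier state.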
Using the global C++ locale, you can work much as you did in C. You
activate the native locale once at program start (in other words, you make it
global) and use snapshots of it thereafter for all tasks that are locale-dependent. The following code demonstrates this procedure:
locale::global(locale(""));
locale::global(locale("Fr_CH"));
//1 You can create a locale object from a C locale's external representation.
The constructor locale::locale(const char* std_name) takes the
name of a C locale. This locale name is like the one you would use for a
call to the C library function setlocale().
//2 You can also use a predefined locale object, locale::classic(),
The following example shows how you can construct a locale object as a
copy of the classic locale object, and take the numeric facet objects from a
German locale object:
locale loc(locale::classic(), locale("De_DE"), LC_NUMERIC);
[Figure: locale objects as handles to an implementation that holds a vector<facet*>. The copy locale l2(l1) shares the implementation of l1; the combined locale l3(l2, locale("fr"), LC_TIME) has its own vector, with its time_get<> and time_put<> facets taken from the French locale.]
Buy a facet library, which provides you with facet classes and objects.
Standard C++ Library, as shown in Figure 11:
[Figure: a locale's table of ids and facets, filled with byname facets (time_get_byname<>, time_put_byname<>, codecvt_byname<>) that the Standard C++ Library builds from the external representation of a C locale, and with facets from a facet library, such as an ASCII-EBCDIC code conversion facet (convert(), ...) and a phone number formatting facet (put(), get_area_code(), ...).]
Figure 11. Creating facet objects
Facets are interdependent. For example, the num_get and num_put facet
objects rely on a numpunct facet object. In most cases, facet objects are not
9 A byname facet creates a facet from the external representation of
a C locale. See section 1.3.3.1 for the naming conventions
of facet names.
representation.
//2 The member function compare() of this facet object is used for string
comparison.
The string class in the Standard C++ Library does not provide any service for
locale-sensitive string comparisons. Hence, you will generally use a collate
facet's compare service, as demonstrated above, or the locale's function call
operator instead:
string name1("Peter Gartner");
string name2("Peter Gärtner");
locale loc("De_DE");
if ( loc(name1, name2) )
{ }
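Since the locale's function call operator compares two strings, a locale object can be handed directly to an algorithm that expects a comparison function. The sketch below assumes nothing beyond the standard library; the function name sort_names is invented here, and the classic locale is used in the usage note only because it is available on every system.

```cpp
#include <algorithm>
#include <locale>
#include <string>
#include <vector>

// Sort names using a locale object as the comparison function.
// The locale forwards each comparison to its collate facet.
void sort_names(std::vector<std::string>& names, const std::locale& loc)
{
    std::sort(names.begin(), names.end(), loc);
}
```

With locale::classic(), the names bird, Zebra, Airplane come out in byte order: Airplane, Zebra, bird. With a German locale, if one is installed, the collate facet of that locale decides the order instead.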
template <class Facet> const Facet& use_facet(const locale&);
template <class Facet> bool has_facet(const locale&);
The code below demonstrates how they are used. It is an example of the
ctype facet's usage; all upper case letters of a string read from the standard
input stream are converted to lower case letters and written to the standard
output stream.
string in;
cin >> in;
if (has_facet< ctype<char> >(locale()))
{ use_facet< ctype<char> >(locale())
    .tolower(&in[0], &in[0] + in.size());
  cout << in;
}
cout.imbue(locale("De_CH"));
cout << aDate;
//1 A date object is created. It is of type tm, which is the time structure
Time formatting facet objects write the formatted output via an iterator
into an output container (see the sections on containers and iterators in
the User's Guide). In principle, this can be an arbitrary container that
has an output iterator, such as a string or a C++ array.
Here we want the time-formatting facet object to bypass the stream's
formatting layer and write directly to the output stream's underlying
stream buffer. Therefore, the output container shall be a stream buffer.
//4 We define a variable that will hold a reference to the locale object's
time_put facet object. The time formatting facet class time_put has two
template parameters:
The first template parameter is the character type used for output. Here
we provide the stream's character type as the template argument.
The second template parameter is the output iterator type. Here we
provide the stream buffer iterator type outIter_t that we defined
before.
//5 Here we get the time-formatting facet object from the stream's locale via
use_facet().
//6 We define a variable to hold the output iterator returned by the facet
(iter_type      (a)
,ios_base&      (b)
,char_type      (c)
,const tm*      (d)
,char)          (e)
The types iter_type and char_type stand for the types that
were provided as template arguments when the facet class was
instantiated. In this case, they are
ostreambuf_iterator<charT,traits> and charT, where charT and
traits are the respective stream's template arguments.
Here is the actual call:
nextpos = fac.put(os,os,os.fill(),&date,'x');
b)
c) The third parameter is the fill character. It is used when the output
has to be adjusted and blank characters have to be filled in. We
provide the stream's fill character, which one can get by calling the
stream's fill() function.
d)
e)
f)
//8 As we work with output stream buffer iterators, we can even check for
bits.
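The steps discussed above can be put together in one piece. The following is a sketch under stated assumptions, not the guide's original listing: the names put_date and format_sample_date are invented, and the error handling is reduced to setting badbit when the stream buffer iterator reports a failure. The standard only guarantees the time_put facet for the default character traits.

```cpp
#include <ctime>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

// Write `date` to `os` with the time_put facet of the stream's locale.
// `format` is a strftime-style conversion character such as 'x' or 'Y'.
template <class charT, class traits>
std::basic_ostream<charT, traits>&
put_date(std::basic_ostream<charT, traits>& os, const std::tm& date, char format)
{
    typedef std::ostreambuf_iterator<charT, traits> outIter_t;
    typedef std::time_put<charT, outIter_t> time_put_t;

    typename std::basic_ostream<charT, traits>::sentry ok(os);
    if (ok) {
        // Get the facet from the stream's locale and write through an
        // iterator onto the stream's buffer, bypassing the formatting layer.
        const time_put_t& fac = std::use_facet<time_put_t>(os.getloc());
        outIter_t nextpos = fac.put(outIter_t(os), os, os.fill(), &date, format);
        if (nextpos.failed())
            os.setstate(std::ios_base::badbit);
    }
    return os;
}

// Helper for the usage example: format 29 October 1996 in the classic locale.
std::string format_sample_date(char format)
{
    std::tm date = std::tm();
    date.tm_mday = 29;
    date.tm_mon = 9;      // months are counted from 0
    date.tm_year = 96;    // years are counted from 1900
    date.tm_wday = 2;     // Tuesday
    std::ostringstream os;
    os.imbue(std::locale::classic());
    put_date(os, date, format);
    return os.str();
}
```

For example, format_sample_date('Y') yields the four-digit year, and format_sample_date('x') yields the date in the representation chosen by the locale.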
//1 A variable of type bool is defined. Its initial value is the boolean value
of the logical expression (argc > 1), so the variable any_arguments
Now let us replace this facet. To make it more exciting, let's use not only a
different language, but also different words for true and false, such as Yes!
and No!. For just using another language, we would not need a new facet;
we would simply use the right native locale, and it would contain the right
facet.
template <class charT, charT* True, charT* False>          //1
class CustomizedBooleanNames
: public numpunct_byname<charT> {                           //2
typedef basic_string<charT> string;
protected:
string do_truename() const {return True;}                   //3
string do_falsename() const {return False;}
~CustomizedBooleanNames() {}
public:
explicit CustomizedBooleanNames(const char* LocName)        //4
: numpunct_byname<charT>(LocName) {}
};
//1 The new facet is a class template that takes the character type as a
template parameter, and the string representation for true and false as
The byname facets read the respective locale information from the
external representation of a C locale. The name provided to construct a
byname facet is the name of a locale, as you would use it in a call to
setlocale().
//3 The virtual member functions do_truename() and do_falsename() are
Now let's replace the numpunct facet object in a given locale object, as shown
in Figure 12:
[Figure 12: a locale's table of ids and facets (num_get<>, num_put<>, numpunct<>, moneypunct<>, time_get<>, time_put<>, ctype<>, codecvt<>, message<>, ...), in which the numpunct<> entry is replaced by a CustomizedBooleanNames object.]
//1 A locale object is constructed with an instance of the new facet class.
The locale object will have all facet objects from a German locale object,
except that the new facet object CustomizedBooleanNames will substitute
for the numpunct facet object.
//2 The new facet object takes all information from a German numpunct
facet object, and replaces the default native names true and false with
the provided strings Ja.(Yes.) and Nein.(No.).
Note that the facet object is created on the heap. That's because the
locale class by default manages installation, reference-counting, and
destruction of all its facet objects.
//3 The standard output stream cout is imbued with the newly created
locale object.
//4 The expression (argc > 1) yields a boolean value, which indicates
whether the program was called with arguments. This boolean value's
alphanumeric representation is printed to the standard output stream.
The output might be:
Argument vorhanden? Ja.
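A self-contained variant of this replacement is sketched below. It differs from the class template above in two ways, both chosen to keep the example independent of installed locales and of compiler support for pointer template arguments: the facet derives from numpunct<char> instead of numpunct_byname, and the two names are passed to the constructor. The names customized_boolean_names and describe are made up for this sketch.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Sketch: replace the numpunct facet's names for true and false.
// The strings are constructor arguments rather than template arguments.
class customized_boolean_names : public std::numpunct<char> {
public:
    customized_boolean_names(const std::string& t, const std::string& f)
        : true_(t), false_(f) {}
protected:
    std::string do_truename() const override { return true_; }
    std::string do_falsename() const override { return false_; }
private:
    std::string true_, false_;
};

// Print a boolean through a stream whose locale carries the new facet.
std::string describe(bool any_arguments)
{
    std::ostringstream os;
    // The facet is created on the heap; the locale manages its lifetime.
    os.imbue(std::locale(std::locale::classic(),
                         new customized_boolean_names("Ja.", "Nein.")));
    os << "Argument vorhanden? " << std::boolalpha << any_arguments;
    return os.str();
}
```

The boolalpha flag is what makes the inserter consult truename() and falsename(); without it, the value is printed as 1 or 0.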
Now let's add the new facet object to a given locale object, as shown in
Figure 13:
[Figure: a locale's table of ids and facets (num_get<>, num_put<>, numpunct<>, moneypunct<>, time_get<>, time_put<>, ctype<>, codecvt<>, message<>, ...) with an additional Umlaut entry.]
Figure 13. Adding a new facet to a locale
The code for this procedure is given below:
locale loc(locale(""),     // native locale
           new Umlaut);    // the new facet                 //1
char c,d;
while (cin >> c){
 d = use_facet<ctype<char> >(loc).tolower(c);               //2
 if (has_facet<Umlaut>(loc))                                //3
 { if (use_facet<Umlaut>(loc).is_umlaut(d))                 //4
     cout << c << " belongs to the German alphabet!" << '\n';
 }
}
//1 A locale object is constructed with an instance of the new facet class.
The locale object will have all facet objects from the native locale object,
plus an instance of the new facet class Umlaut.
//2 Let's assume our new umlaut facet class is somewhat limited; it can
called. Note that the syntax for using this newly contrived facet object
is exactly like the syntax for using the standard ctype facet.
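For a facet class to be added to a locale, it needs a base class of locale::facet and a static data member of type locale::id. The following sketch shows one possible shape of such an Umlaut class. Its interface is an assumption made for illustration; in particular, it takes a wchar_t and compares Unicode code points, whereas the code above passes a char. The helper locale_knows_umlauts is also invented.

```cpp
#include <cstddef>
#include <locale>

// Hypothetical Umlaut facet; this interface is an assumption
// made for illustration, not the guide's class definition.
class Umlaut : public std::locale::facet {
public:
    static std::locale::id id;   // identifies the facet type within a locale
    explicit Umlaut(std::size_t refs = 0) : std::locale::facet(refs) {}
    bool is_umlaut(wchar_t c) const { return do_is_umlaut(c); }
protected:
    virtual bool do_is_umlaut(wchar_t c) const
    {
        // Lower-case a-, o-, and u-umlaut as Unicode code points.
        return c == 0x00E4 || c == 0x00F6 || c == 0x00FC;
    }
};

std::locale::id Umlaut::id;

// True if the locale contains an Umlaut facet that classifies c as an umlaut.
bool locale_knows_umlauts(const std::locale& loc, wchar_t c)
{
    return std::has_facet<Umlaut>(loc) && std::use_facet<Umlaut>(loc).is_umlaut(c);
}
```

The has_facet() check matters here: unlike the standard facets, a user-defined facet is present only in locales to which it was explicitly added.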
Local
(541) 754-3010
Domestic
+1-541-754-3010
International
1-541-754-3010
Dialed in the US
001-541-754-3010
Local
(089) / 636-48018
Domestic
+49-89-636-48018
International
19-49-89-636-48018
//"de"
//"89"
//"636-48018"
US_phone_put
German_phone_put
Figure 14. The relationship of the phone_put facet to the implementing facets
Here is a first tentative declaration of the new facet class phone_put:
class phone_put: public locale::facet                      //1
{
public:
static locale::id id;                                      //2
phone_put(size_t refs = 0) : locale::facet(refs) { }       //3
string_t put(const string_t& ext
            ,const string_t& area
            ,const string_t& cnt) const;                   //4
};
//1 Derive from the base class locale::facet, so that a locale object will be
//3 Define a constructor that takes the reference count that will be handed
A facet needs access to a table of all country codes, so that one can enter
a mnemonic for the country instead of looking up the respective country
code. For example, I would like to say: This is a phone number
somewhere in Japan without having to know what the country code for
Japan is.
public:
typedef string string_t;
static locale::id id;
phone_put(size_t refs = 0) : locale::facet(refs)
                           , myCountryCode_("")
                           , intlPrefix_("")
{ }
string_t put(const string_t& ext,
             const string_t& area,
             const string_t& cnt) const;
protected:
phone_put( const string_t& myC                             //1
         , const string_t& intlP
         , size_t refs = 0)
         : locale::facet(refs)
         , myCountryCode_(myC)
         , intlPrefix_(intlP)
{ }
const string_t myCountryCode_;                             //2
const string_t intlPrefix_;                                //3
};
Note how this class serves as a base class for the facet classes that really
implement a locale-dependent phone number formatting. Hence, the public
constructor does not need to be extended, and a protected constructor is
added instead (see //1 above).
prefixMap_t
US    1
Fr    33
UK    44
De    49
Jp    81
Figure 15. Map associating country codes with mnemonics for countries' names
In the following code, we add the table of country codes:
class phone_put: public locale::facet
{
public:
class prefixMap_t : public map<string,string>              //1
{
public:
prefixMap_t() { insert(tab_t(string("US"),string("1")));
                insert(tab_t(string("De"),string("49")));
                // ...
              }
};
static const prefixMap_t* std_codes()                      //2
{ return &stdCodes_; }
protected:
static const prefixMap_t stdCodes_;                        //3
};
As the table of country codes is a constant table that is valid for all telephone
number facet objects, it is added as a static data member stdCodes_ (see //3).
The initialization of this data member is encapsulated in a class, prefixMap_t
(see //1). For convenience, a function std_codes() is added to give access to
the table (see //2).
Despite its appealing simplicity, however, having just one static country code
table might prove too inflexible. Consider that mnemonics might vary from
one locale to another due to different languages. Maybe mnemonics are not
called for, and you really need more extended names associated with the
actual country code.
In order to provide more flexibility, we can build in the ability to work with
an arbitrary table. A pointer to the respective country code table can be
provided when a facet object is constructed. The static table, shown in Figure
16 below, will serve as a default:
[Figure 16: phone_put facet objects, each holding myCountryCode_, intlPrefix_, and country_codes_, where country_codes_ points to a prefixMap_t. The example table uses French country names: Etats Unis, France 33, Grande Bretagne 44, Allemagne 49, Japon 81.]
phone_put( ...                                             //1
         , size_t refs = 0)
         : locale::facet(refs)
         , countryCodes_(tab), delete_it_(del)
         , myCountryCode_(""), intlPrefix_("")
{ if (tab) { countryCodes_ = tab;
             delete_it_ = del; }
  else     { countryCodes_ = &stdCodes_;                   //2
             delete_it_ = false; }
}
string_t put(const string_t& ext,
             const string_t& area,
             const string_t& cnt) const;
const prefixMap_t* country_codes() const                   //3
{ return countryCodes_; }
//1 The constructor is enhanced to take a pointer to the country code table,
together with the flag for memory management of the provided table.
//2 If no table is provided, the static table is installed as a default.
//3 For convenience, a function that returns a pointer to the current table is
added.
//4 The table is deleted if the memory management flag says so.
//5 Protected data members are added to hold the pointer to the current
public:
US_phone_put(
,
,
,
:
{ }
};
//1
//2
//3
//1 Imbue an output stream with a locale object that has a phone number
011-33-1-60170716
(541) 711-PARK
19 49 89 636 40938
Now consider that the locale object imbued on a stream might change, but
the cached static country code table does not. The cache is filled once, and all
changes to the stream's locale object have no effect on this inserter function's
cache. That's probably not what we want. What we do need is some kind of
notification each time a new locale object is imbued, so that we can update
the cache.
Destruction of a stream,
[Figure: the index returned by xalloc() selects a slot in the stream's parray; pword() at that index holds a pointer to the prefixMap_t cache (US 1, Fr 33, UK 44, De 49, Jp 81).]
//2 The pointer to the code table is stored in the array via pword().
//3 The callback function and the index are registered.
The actual callback function will later have access to the cache via the index
to parray.
At this point, we still need a callback function that updates the cache each
time the stream's locale is replaced. Such a callback function could look like
this:
void cacheCountryCodes(ios_base::event event
                      ,ios_base& str,int cache)
{ if (event == ios_base::imbue_event)                      //1
  {
   locale loc = str.getloc();
   const phone_put& ppFacet =
                    use_facet<phone_put> (loc);            //2
   *((phone_put::prefixMap_t*) str.pword(cache)) =
                    *(ppFacet.country_codes());            //3
  }
}
//1 It checks whether the event was a change of the imbued locale,
//2 retrieves the phone number facet from the stream's locale, and
//3 stores the country code table in the cache. The cache is accessible via
the stream's parray.
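The same mechanism can be tried out with standard facets only. The sketch below is an analogy to the phone number cache, not the guide's code: instead of a country code table it caches the decimal point of the stream's numpunct facet. All names (decimal_point_index, cache_decimal_point, attach_cache, comma_decimal) are hypothetical, and the cache is owned by the caller, who must keep it alive as long as the stream.

```cpp
#include <ios>
#include <locale>
#include <sstream>

// Hypothetical facet used in the usage example: comma as decimal point.
class comma_decimal : public std::numpunct<char> {
protected:
    char do_decimal_point() const override { return ','; }
};

// Index into the stream's pword() array; obtained once via xalloc().
int decimal_point_index()
{
    static const int index = std::ios_base::xalloc();
    return index;
}

// Callback: refresh the cached value whenever a new locale is imbued.
// Other events (erase_event, copyfmt_event) are ignored in this sketch.
void cache_decimal_point(std::ios_base::event ev, std::ios_base& str, int index)
{
    if (ev == std::ios_base::imbue_event) {
        char* cache = static_cast<char*>(str.pword(index));
        if (cache)
            *cache = std::use_facet<std::numpunct<char> >(str.getloc()).decimal_point();
    }
}

// Fill a caller-owned cache, store its address in the stream's pword()
// slot, and register the callback together with the index.
void attach_cache(std::ios_base& str, char& cache)
{
    int index = decimal_point_index();
    cache = std::use_facet<std::numpunct<char> >(str.getloc()).decimal_point();
    str.pword(index) = &cache;
    str.register_callback(cache_decimal_point, index);
}
```

After attach_cache(os, cache), a call such as os.imbue(locale(locale::classic(), new comma_decimal)) triggers the callback, and the cache changes from '.' to ','.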
2. Stream Input/Output
least a few of these features will not be supported by your own compiler.
The consequence is that some techniques demonstrated and explained in this
User's Guide will not work with your compiler either.
We include examples that might not compile, rather than omitting certain
techniques entirely, to demonstrate the full range of techniques the Standard
C++ language will support. This User's Guide was written with an eye to the
C++ of the future. Compilers will catch up, and techniques that don't work
with your current compiler will work once your compiler can understand
Standard C++. Hopefully, including these techniques will extend the
usefulness of this User's Guide to you.
Also, the code examples are simplified in that the necessary #include <>
statements and the using directive for the standard namespace ::std are
omitted. The intent is to make the examples as readable and focused as
possible rather than ceaselessly repeating the same code fragments.
2.1.2 Terminology
The Standard C++ Library consists mostly of class and function templates.
Abbreviations for these templates are used throughout this User's Guide. For
example, fstream stands for template <class charT, class traits> class
basic_fstream. A slightly more succinct notation for a class template is also
frequently used: basic_fstream <charT, traits>.
In addition to abbreviations, you will find certain contrived technical terms.
For example, file stream stands for the abstract notion of the file stream class
template; badbit stands for the state flag ios_base::badbit.
We can compare the standard iostreams not only with the traditional C++
iostreams library, but also with the I/O support in the Standard C Library.
Many former C programmers still prefer the input/output functions offered
by the C library, often referred to as C stdio. Their familiarity with the C
library is justification enough for using the C stdio instead of C++ iostreams,
but there are other reasons as well. For example, calls to the C functions
printf() and scanf() are admittedly more concise with C stdio. However,
C stdio has drawbacks, too, such as type insecurity and inability to extend
consistently for user-defined classes. We'll discuss these in more detail in the
following sections.
Since there are overloaded versions of the shift operator operator<<(), the
right operator will always be called. The function cout << i calls
operator<<(int), and cout << name calls operator<<(const char*).
Hence, the standard iostreams are typesafe.
All we need to do is overload operator<<() for this new type Pair, and we
can output pairs this way:
Pair p(5, May);
cout << p;
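A minimal sketch of such an inserter is shown here. The layout of Pair (a day number and a Month enumerator), the output format, and the helper formatPair are assumptions made for this illustration; the guide does not show the actual definition.

```cpp
#include <ostream>
#include <sstream>
#include <string>

enum Month { Jan, Feb, Mar, Apr, May };   // hypothetical, shortened

struct Pair {
    int   day;
    Month month;
    Pair(int d, Month m) : day(d), month(m) {}
};

// The inserter reuses the existing inserters for the built-in types.
template <class charT, class traits>
std::basic_ostream<charT, traits>&
operator<<(std::basic_ostream<charT, traits>& os, const Pair& p)
{
    return os << p.day << '/' << static_cast<int>(p.month) + 1;
}

// Helper that formats a Pair into a string instead of writing to cout.
std::string formatPair(const Pair& p)
{
    std::ostringstream os;
    os << p;
    return os.str();
}
```

Because the inserter is a template on the character type, the same definition would also serve wide character streams.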
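[Figure: a program connected through a communication channel to external devices such as a file or a display.]
IOStreams supports data transfer between a program and external devices.
[Figure: formatting converts program data into a character sequence such as "158 Hello world!".]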
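[Figure: code conversion in the transport layer. The external file holds the text in JIS encoding, with escape sequences such as <ESC> $ B; the buffer inside the program holds the same text in Unicode.]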
Locales. Both the formatting and the transport layers use the stream's
locale. (See the section on internationalization and locales.) The
formatting layer delegates the handling of numeric entities to the locale's
numeric facets. The transport layer uses the locale's code conversion
facet for character-wise transformation between the buffer content and
characters transported to and from the external device. Figure 22 below
shows how locales are used with iostreams:
[Figure 22: between program and buffer, the formatting layer uses the locale's numeric facets; between buffer and external device, the transport layer uses the locale's code conversion facet.]
File I/O. Iostreams can still be used for input and output to files,
although file I/O has lost some of its former importance. In the past,
alphanumeric user interfaces were often built using file input/output to
the standard input and output channels. Today almost all applications
have graphical user interfaces.
Nevertheless, iostreams are still useful for input and output to files other
than the standard input and output channels, and to all other kinds of
external media that fit into the file abstraction. For example, the Rogue
Wave class library for network communications programming, Net.h++,
uses iostreams for input and output to various kinds of communication
streams like sockets and pipes.
[Figure: class hierarchy of the stream classes. ios_base is the base class of basic_ios<charT,traits>. basic_istream<charT,traits> and basic_ostream<charT,traits> derive from basic_ios, and basic_iostream<charT,traits> derives from both. basic_istringstream and basic_ifstream derive from basic_istream; basic_ostringstream and basic_ofstream derive from basic_ostream; basic_stringstream and basic_fstream derive from basic_iostream.]
Let us discuss in more detail the components and characteristics of the class
hierarchy given in the figure:
The Iostreams Base Class ios_base. This class is the base class of all
stream classes. Independent of character type, it encapsulates
information that is needed by all streams. This information includes:
Additionally, ios_base defines several types that are used by all stream
classes, such as format flags, status bits, open mode, exception class, etc.
Note that ios is not a class anymore, as it was in the traditional iostreams.
If you have existing programs that use the old iostreams, they may no
longer be compilable with the standard iostreams. (See list of
incompatibilities in section 2.14)
The end-of-file value. For type char, the end-of-file value is
represented by an integral constant called EOF. For type wchar_t,
there is a constant defined that is called WEOF. For an arbitrary
user-defined character type, the associated character traits define what the
end-of-file value for this particular character type is.
The type of the EOF value. This needs to be a type that can hold the
EOF value. For example, for single-byte characters, this type is int,
different from the actual character type char.
The Input and Output Streams. The three stream classes for input and
output are:
basic_istream <class charT, class traits=char_traits<charT> >
basic_ostream <class charT, class traits=char_traits<charT> >
basic_iostream<class charT, class traits=char_traits<charT> >
Class istream handles input, class ostream is for output. Class iostream
deals with input and output; such a stream is called a bidirectional stream.
The three stream classes define functions for parsing and formatting,
which are overloaded versions of operator>>() for input, called
extractors, and overloaded versions of operator<<() for output, called
inserters.
Additionally, there are member functions for unformatted input and
output, like get(), put(), etc.
The File Streams. The file stream classes support input and output to
and from files. They are:
basic_ifstream<class charT, class traits=char_traits<charT> >
basic_ofstream<class charT, class traits=char_traits<charT> >
basic_fstream<class charT, class traits=char_traits<charT> >
There are functions for opening and closing files, similar to the C
functions fopen() and fclose(). Internally they use a special kind of
stream buffer, called a file buffer, to control the transport of characters
to/from the associated file. The function of the file streams is illustrated
in Figure 25:
[Figure 25: the program writes to a basic_ofstream<charT,traits> (open(), close()), which does the formatting; its basic_filebuf<charT,traits> (open(), close(), overflow(), underflow()) handles the transport to the destination, an external file.]
The String Streams. The string stream classes support in-memory I/O;
that is, reading and writing to a string held in memory. They are:
basic_istringstream<class charT, class traits=char_traits<charT> >
basic_ostringstream<class charT, class traits=char_traits<charT> >
basic_stringstream<class charT, class traits=char_traits<charT> >
There are functions for getting and setting the string to be used as a
buffer. Internally a specialized stream buffer is used. In this particular
case, the buffer and the external device are the same. Figure 26 below
illustrates how the string stream classes work:
[Figure 26: the program writes to a basic_ostringstream<charT,traits> (str(), str(basic_string<charT>&)), which does the formatting; its basic_stringbuf<charT,traits> (str(), str(basic_string<charT>&), overflow(), underflow()) handles the transport; the buffer itself is the destination.]
Classes of the transport layer are often referred to as the stream buffer
classes. Figure 27 gives the class hierarchy of all stream buffer classes:
[Figure 27. Hierarchy of the transport layer: basic_stringbuf<charT,traits,Allocator> and basic_filebuf<charT,traits> derive from the stream buffer base class.]
The stream buffer classes are responsible for transfer of characters from and
to external devices.
It does not have any knowledge about the external device. Instead, it
defines two virtual functions, overflow() and underflow(), to perform
the actual transport. These two functions have knowledge of the
peculiarities of the external device they are connected to. They have to
be overridden by all concrete stream buffer classes, like file and string
buffers.
The stream buffer class maintains two character sequences: the get area,
which represents the input sequence read from an external device, and
the put area, which is the output sequence to be written to the device.
There are functions for providing the next character from the buffer, such
as sgetc(), etc. They are typically called by the formatting layer in order
to receive characters for parsing. Accordingly, there are also functions
for placing the next character into the buffer, such as sputc(), etc.
A stream buffer also carries a locale object.
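The character-level functions mentioned above can be sketched as follows. This is an illustration only: it copies characters between the stream buffers of two string streams using sgetc(), sbumpc(), and sputc(); the function names copyBuffer and copyViaBuffers are hypothetical.

```cpp
#include <cstddef>
#include <sstream>
#include <streambuf>
#include <string>

// Copies all characters from one stream buffer to another, one at a time.
std::size_t copyBuffer(std::streambuf& from, std::streambuf& to)
{
    typedef std::streambuf::traits_type traits;
    std::size_t n = 0;
    // sgetc() looks at the current character without consuming it.
    while (!traits::eq_int_type(from.sgetc(), traits::eof())) {
        // sbumpc() returns the current character and advances the get area.
        char c = traits::to_char_type(from.sbumpc());
        // sputc() places the character into the put area.
        if (traits::eq_int_type(to.sputc(c), traits::eof()))
            break;
        ++n;
    }
    return n;
}

// Runs copyBuffer on the buffers of two string streams.
std::string copyViaBuffers(const std::string& text)
{
    std::istringstream in(text);
    std::ostringstream out;
    copyBuffer(*in.rdbuf(), *out.rdbuf());
    return out.str();
}
```

No formatting or parsing takes place here; the stream buffer only transports characters.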
The File Buffer. The file buffer classes associate the input and output
sequences with a file. A file buffer takes the form:
basic_filebuf<class charT, class traits=char_traits<charT> >
The file buffer has functions like open() and close(). The file buffer
class inherits a locale object from its stream buffer base class. It uses the
locale's code conversion facet for transforming the external character
encoding to the encoding used internally. Figure 28 shows how the file
buffer works:
[Figure 28. Character code conversion performed by the file buffer: multi-byte characters in the external file are converted to wide characters in the buffer.]
The String Stream Buffer. These classes implement the in-memory I/O:
basic_stringbuf<class charT, class traits=char_traits<charT> >
With string buffers, the internal buffer and the external device are one
and the same. The internal buffer is dynamic, in that it is extended if
necessary to hold all the characters written to it. You can obtain copies of
the internally held buffer, and you can provide a string to be copied into
the internal buffer.
[Figure: a basic_ifstream<charT,traits> contains a basic_filebuf<charT,traits>; its base class basic_ios<charT,traits> holds a basic_streambuf<charT,traits>* pointer to that buffer. Both ios_base and the stream buffer hold a locale.]
Narrow character   Wide character   Associated C
stream             stream           standard files
cin                wcin             stdin
cout               wcout            stdout
cerr               wcerr            stderr
clog               wclog            stderr
Like the C standard files, these streams are all associated by default with the
terminal.
The difference between clog and cerr is that clog is fully buffered, whereas
output to cerr is written to the external device after each formatting. With a
fully buffered stream, output to the actual external device is written only
when the buffer is full. Thus clog is more efficient for redirecting output to a
file, while cerr is mainly useful for terminal I/O. Writing to the external
device after every formatting, to the terminal in the case of cerr, serves the
purpose of synchronizing output to and input from the terminal.
The standard streams are initialized in such a way that they can be used in
constructors and destructors of static objects. Also, the predefined streams
are synchronized with their associated C standard files. See Section 2.10.5 for
details.
Both operators are overloaded for all built-in types in C++, as well as for
some of the types defined in the Standard C++ Library; for example, there
are inserters and extractors for bool, char, int, long, float, double, string,
etc. When you insert or extract a value to or from a stream, the C++ function
overload resolution chooses the correct inserter or extractor, based on the
value's type. This is what makes C++ iostreams type-safe and better than C
stdio (see Section 2.2.1.1).
Copyright 1996 Rogue Wave Software, Inc. All rights reserved.
is equivalent to:
(cout.operator<<("result: ")).operator<<(x);
and:
template<class charT, class traits>
basic_ostream<charT, traits>&
basic_ostream<charT, traits>::operator<<(type x)
{
// write x
return *this;
}
Simple input and output of units as shown above is useful, yet not sufficient
in many cases. For example, you may want to vary the way output is
formatted, or input is parsed. Iostreams allow you to control the formatting
features of its input and output operators in many ways. With iostreams,
you can specify:
The width of an output field and the adjustment of the output within this
field;
The precision and format of floating point numbers, and whether or not
the decimal point should always be included;
Whether you want to skip white spaces when reading from an input
stream;
15 The shift operators for the character types, like char and
wchar_t, are an exception to this rule; they are global
functions in the standard library namespace ::std.
The stream's format state is the main means of format control, as we will
demonstrate in the next section.
Function      Defined in base class     Effect                              Default
width()       ios_base
precision()   ios_base                  Precision of floating point values
fill()        basic_ios<charT,traits>                                       The space character
Parameters that can have only a few different values, typically two or
three. They are represented by one or more bits in a data member of
type fmtflags in class ios_base. These are usually called format flags.
You can set format flags using the setf() function in class ios_base,
clear them using unsetf(), and retrieve them through the flags()
function.
Some format flags are grouped because they are mutually exclusive; for
example, output within an output field can be adjusted to the left or to
the right, or to an internally specified adjustment. One and only one of
Format flag   Group         Effect                          Default   stdio
              adjustfield                                   left17
left                        left
right                       right
internal
              basefield                                     dec       %i
dec                         decimal base                              %d,%u
oct                         octal base                                %o
hex                         hexadecimal base                          %x
              floatfield                                    fixed     %g,%G
fixed                       in fixed-point notation                   %f
scientific                  in scientific notation                    %e,%E
boolalpha
showpos                     Generates a + sign in
                            non-negative generated
                            numeric output
showpoint                   Always generates a
                            decimal-point in generated
                            floating-point output
showbase                    Generates a prefix indicating
                            the numeric base of a
                            generated integer output
skipws
unitbuf
uppercase                   Replaces certain lowercase                %X
                            letters with their uppercase              %E
                            equivalents in generated                  %G
                            output
The effect of setting a format parameter is usually permanent; that is, the
parameter setting is in effect until the setting is explicitly changed. The only
exception to this rule is the field width. The width is automatically reset to
its default value 0 after each input or output operation that uses the field
width.18 Here is an example:
//1 Store the current format flag setting, in order to restore it later on.
//2 Change the adjustment from the default setting right to left.
//3 Set the field width from its default 0 to 10. A field width of 0 means
that no padding characters are inserted, and this is the default behavior
of all insertions.
//4 Clear the adjustment flags.
//5 Change the precision for floating-point values from its default 6 to 2,
and set yet another couple of format flags that affect floating-point
values.
//6 Restore the original flags.
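The six annotated steps can be sketched as follows. This is a reconstruction for illustration, not the guide's own listing: it writes to a string stream rather than cout so that the result can be inspected, and the function name formatDemo and the sample values are assumptions.

```cpp
#include <ios>
#include <sstream>
#include <string>

std::string formatDemo()
{
    std::ostringstream os;
    std::ios_base::fmtflags original = os.flags();              // 1: save flags
    os.setf(std::ios_base::left, std::ios_base::adjustfield);   // 2: adjust left
    os.width(10);                                               // 3: field width 10
    os << 813 << '|';        // the width applies to 813 only, then resets to 0
    os.unsetf(std::ios_base::adjustfield);                      // 4: clear adjustment
    os.precision(2);                                            // 5: precision 2 ...
    os.setf(std::ios_base::fixed, std::ios_base::floatfield);   //    ... in fixed notation
    os << 3.14159 << '|';
    os.flags(original);                                         // 6: restore flags
    return os.str();
}
```

Note that flags() restores only the format flags; the precision set in step 5 stays in effect until it is changed again.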
2.3.3.2 Manipulators
Format control requires calling a stream's member functions. Each such call
interrupts the respective shift expression. But what if you need to change
formats within a shift expression? This is possible in iostreams. Instead of
writing:
cout<< 812 << '|';
cout.setf(ios_base::left,ios_base::adjustfield);
cout.width(10);
cout<< 813 << 815 << '\n';
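A manipulator-based equivalent might look like the second function in the following sketch. Both versions write to a string stream instead of cout so that their results can be compared; the function names withMemberCalls and withManipulators are hypothetical.

```cpp
#include <iomanip>
#include <ios>
#include <sstream>
#include <string>

// Format control through member functions interrupts the shift expression.
std::string withMemberCalls()
{
    std::ostringstream os;
    os << 812 << '|';
    os.setf(std::ios_base::left, std::ios_base::adjustfield);
    os.width(10);
    os << 813 << 815 << '\n';
    return os.str();
}

// The same output, with the format changes inside one shift expression.
std::string withManipulators()
{
    std::ostringstream os;
    os << 812 << '|' << std::left << std::setw(10) << 813 << 815 << std::endl;
    return os.str();
}
```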
In this example, objects like left, setw, and endl are called manipulators. A
manipulator is an object of a certain type; let's call the type manip for the time
being. There are overloaded versions of basic_istream <charT,traits>::
operator>>() and basic_ostream <charT,traits>:: operator<<() for
type manip. Hence a manipulator can be extracted from or inserted into a
stream together with other objects that have the shift operators defined.
(Section 2.8 explains in greater detail how manipulators work and how you
can implement your own manipulators.)
The effect of a manipulator need not be an actual input to or output from the
stream. Most manipulators set just one of the above described format flags,
or do some other kind of stream manipulation. For example, an expression
like:
cout << left;
is equivalent to:
cout.setf (ios_base::left, ios_base::adjustfield);.
Nothing is inserted into the stream. The only effect is that the format flag for
adjusting the output to the left is set.
On the other hand, the manipulator endl inserts the newline character to the
stream, and flushes to the underlying stream buffer. The expression:
cout << endl;
is equivalent to:
cout << '\n'; cout.flush();
Similarly, the expression:
cout << setw(10);
is equivalent to:
cout.width(10);
In general, you can think of a manipulator as an object you can insert into or
extract from a stream, in order to manipulate that stream.
Some manipulators can be applied only to output streams, others only to
input streams. Most manipulators change format bits only in one of the
stream base classes, ios_base or basic_ios<charT,traits>. These can be
applied to input and output streams.
Table 6 below gives an overview of all manipulators defined by iostreams.
The first column, Manipulator, lists its name. All manipulators are classes
defined in the namespace ::std. The second column, Use, indicates whether
the manipulator is intended to be used with istreams (i), ostreams (o), or
both (io). The third column, Effect, summarizes the effect of the
manipulator. The last column, Equivalent, lists the corresponding call to the
stream's member function.
Note that the second column indicates only the intended use of a manipulator.
In many cases, it is possible to apply an output manipulator to an input
stream, and vice versa. Generally, this kind of non-intended manipulation is
harmless in that it has no effect. For instance, if you apply the output
manipulator showpoint to an input stream, the manipulation will simply be
ignored. However, if you use an output manipulator on a bidirectional
stream during input, the manipulation will affect not current input
operations, but subsequent output operations.
Table 6: Manipulators
Manipulator          Use  Effect                      Equivalent
boolalpha            io                               io.setf(ios_base::boolalpha)
dec                  io   Converts integers           io.setf(ios_base::dec,
                          to/from decimal               ios_base::basefield)
                          notation
endl                 o                                o.put(o.widen('\n'));
                                                      o.flush()
ends                 o                                o.put(o.widen('\0'))
fixed                o                                o.setf(ios_base::fixed,
                                                        ios_base::floatfield)
flush                o                                o.flush()
hex                  io   Converts integers           io.setf(ios_base::hex,
                          to/from hexadecimal           ios_base::basefield)
                          notation
internal             o                                o.setf(ios_base::internal,
                                                        ios_base::adjustfield)
left                 o                                o.setf(ios_base::left,
                                                        ios_base::adjustfield)
noboolalpha          io                               io.unsetf(ios_base::boolalpha)
noshowbase           o                                o.unsetf(ios_base::showbase)
noshowpoint          o                                o.unsetf(ios_base::showpoint)
noshowpos            o                                o.unsetf(ios_base::showpos)
noskipws             i                                i.unsetf(ios_base::skipws)
nounitbuf            o                                o.unsetf(ios_base::unitbuf)
nouppercase          o                                o.unsetf(ios_base::uppercase)
oct                  io   Converts to/from            io.setf(ios_base::oct,
                          octal notation                ios_base::basefield)
resetiosflags        io                               io.setf((ios_base::fmtflags)0,
(ios_base::fmtflags                                     mask)
mask)
right                o                                o.setf(ios_base::right,
                                                        ios_base::adjustfield)
scientific           o                                o.setf(ios_base::scientific,
                                                        ios_base::floatfield)
setbase              io                               io.setf(base == 8 ? ios_base::oct :
(int base)                                              base == 10 ? ios_base::dec :
                                                        base == 16 ? ios_base::hex :
                                                        ios_base::fmtflags(0),
                                                        ios_base::basefield)
setfill(charT c)     io                               io.fill(c)
setiosflags          io                               io.setf(mask)
(ios_base::fmtflags
mask)
setprecision         io   Sets precision of           io.precision(n)
(int n)                   floating point values
setw(int n)          io                               io.width(n)
showbase             o    Generates a prefix          o.setf(ios_base::showbase)
                          indicating the numeric
                          base of an integer
showpoint            o    Always generates a          o.setf(ios_base::showpoint)
                          decimal-point for
                          floating-point values
showpos              o                                o.setf(ios_base::showpos)
skipws               i                                i.setf(ios_base::skipws)
unitbuf              o                                o.setf(ios_base::unitbuf)
uppercase            o    Replaces certain            o.setf(ios_base::uppercase)
                          lowercase letters with
                          their uppercase
                          equivalents
ws
Other cultural conventions, like the grouping of digits, are irrelevant. There
is no formatting of numeric values that involves grouping.19
2. When the first relevant character is found, they extract characters from
the input stream until they find a separator; that is, a character that does
not belong to the item. White space characters in particular are
separators.
3. The separator remains in the input stream and becomes the first
character extracted in a subsequent extraction.
You can use the manipulator noskipws to switch off the automatic skipping
of white space characters. For example, extracting white space characters
may be necessary if you expect the input has a certain format, and you need
to check for violations of the format requirements. This procedure is shown
in the following code:
cin >> noskipws;
char c;
do
{ float fl;
  c = ' '; cin >> fl >> c;
  if (c == ',' || c == '\n')
    process(fl);
}
while (c == ',');
if (c != '\n') error();
If you have to skip a sequence of characters other than white spaces, you can
use the istream's member function ignore(). The call:
basic_ifstream<myChar,myTraits> InputStream("file-name");
InputStream.ignore(numeric_limits<streamsize>::max()
,myChar('\n'));
ignores all characters until the end of the line. This example uses a file
stream that is not predefined. File streams are described in Section 2.5.3.
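The same call can be tried on an in-memory stream. The following sketch assumes a narrow character string stream in place of the file stream above; the function name firstWordOfSecondLine is hypothetical.

```cpp
#include <ios>
#include <limits>
#include <sstream>
#include <string>

// Skips the first line of the text and returns the first word after it.
std::string firstWordOfSecondLine(const std::string& text)
{
    std::istringstream in(text);
    // Discard everything up to and including the first newline.
    in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    std::string word;
    in >> word;
    return word;
}
```

The delimiter itself is extracted and discarded, so the next extraction starts at the beginning of the following line.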
Note that the field width will be reset to 0 after the extraction of a string.
There are subtle differences between extracting a character sequence into a
character array and extracting it into a string object. For example:
char buf[SZ];
cin >> buf;
is different from:
string s;
cin >> s;
Flag                 Error category
ios_base::goodbit    Everything's fine
ios_base::eofbit
ios_base::failbit
ios_base::badbit
Note that the flag ios_base::goodbit is not really a flag; its value, zero,
indicates the absence of any error flag. It means the stream is OK. By
convention, all input and output operations have no effect once the stream
state is different than zero.
There are several situations when both eofbit and failbit are set; however,
the two have different meanings and do not always occur in conjunction.
The flag ios_base::eofbit is set when there is an attempt to read past the
end of an input sequence. This occurs in the following two typical examples:
1.
2. After reading the last available character, the extraction not only reads
past the end of the input sequence; it also fails to extract the requested
character. Hence, failbit is set in addition to eofbit.
In addition to these input and output operations, there are other situations
that can trigger failure. For example, file streams set failbit if the
associated file cannot be opened (see Section 2.5).
The flag ios_base::badbit indicates problems with the underlying stream
buffer. These problems could be:
Generally, you should keep in mind that badbit indicates an error situation
that is likely to be unrecoverable, whereas failbit indicates a situation that
might allow you to retry the failed operation. The flag eofbit simply
indicates the end of the input sequence.
What can you do to check for such errors? You have two possibilities for
detecting stream errors:
You can declare that you want to have an exception raised once an error
occurs in any input or output operation, or
You can actively check the stream state after each input or output
operation.
ios_base member function    Effect
bool good()
bool eof()
bool fail()
bool bad()
bool operator!()            As fail()
operator void*()
iostate rdstate()
It is a good idea to check the stream state in some central place, for example:
if (!cout) error();
The state of cout is examined with operator!(), which will return true if the
stream state indicates an error has occurred.
An ostream can also appear in a boolean position to be tested as follows:
if (cout << x) // okay!
The magic here is the operator void*(), which returns a non-zero value when
the stream state indicates no failure, that is, when fail() returns false.
Finally, the explicit member functions can also be used:
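A minimal sketch of such explicit checks follows. It extracts an int from a string stream and reports the resulting state; the function name stateAfterExtract and the returned labels are assumptions made for this illustration.

```cpp
#include <sstream>
#include <string>

// Returns a short description of the state after trying to read an int.
std::string stateAfterExtract(const std::string& input)
{
    std::istringstream in(input);
    int value = 0;
    in >> value;
    if (in.bad())  return "bad";                          // buffer problem
    if (in.fail()) return in.eof() ? "fail+eof" : "fail"; // nothing extracted
    if (in.eof())  return "eof";                          // extracted, hit the end
    return "good";
}
```

The cases show that eofbit and failbit are independent: reading "42" reaches the end of the input without failing, while an empty input sets both flags.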
//1 In calling the exceptions() function, you indicate what flags in the
stream's state shall cause an exception to be thrown.24
//2 Objects thrown by the stream's operations are of types derived from
ios_base::failure. Hence this catch clause will catch all stream
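The two annotated steps can be sketched as follows, assuming a string stream as the input source; the function name extractionThrows is hypothetical and the listing is an illustration rather than the guide's original example.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Returns true if extracting an int from the input raises an exception.
bool extractionThrows(const std::string& input)
{
    std::istringstream in(input);
    // Ask for an exception whenever failbit or badbit is set.
    in.exceptions(std::ios_base::failbit | std::ios_base::badbit);
    try {
        int value = 0;
        in >> value;
    } catch (const std::ios_base::failure&) {
        return true;    // the stream reported the error by throwing
    }
    return false;
}
```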
If an exception is thrown while the file is in use here, the file will never be
closed. With a file stream, however, the file will be closed whenever the file
stream goes out of scope, as in the following example:
void use_file(const char* fileName)
{
ofstream f(fileName);
// use file
}
Here the file will be closed even if an exception occurs during use of the open
file.
There are three class templates that implement file streams: basic_ifstream
<charT,traits>, basic_ofstream <charT,traits>, and basic_fstream
<charT,traits>. These templates are derived from the stream base class
basic_ios <charT, traits>. Therefore, they inherit all the functions for
formatted input and output described in Section 2.3, as well as the stream
state. They also have functions for opening and closing files, and a
constructor that allows opening a file and connecting it to the stream. For
convenience, there are the regular typedefs ifstream, ofstream, and
fstream, with wifstream, wofstream, and wfstream, for the respective narrow
and wide character file streams.
The buffering is done through a specialized stream buffer class,
basic_filebuf <charT,traits>.
You can reposition a file stream to arbitrary file positions. This usually
does not make any sense with the predefined streams, as they are
connected to the terminal by default.
There are two ways to create a file stream26: you can create an empty file
stream, open a file, and connect it to the stream later on; or you can open the
file and connect it to a stream at construction time. These two procedures are
demonstrated in the two following examples, respectively:
ifstream file;                                               //1
...;
file.open(argv[1]);                                          //2
if (!file) // error: unable to open file for input
or:
ifstream source("src.cpp");                                  //3
if (!source) // error: unable to open src.cpp for input
//1 A file stream is created that is not connected to any file. Any operation
//1
//2
//3
//4
}
}
//1 Open a file and connect the file stream to it.
//2 Any subsequent open on this stream will fail.
//3 Hence the failbit will be set.
//4 However, is_open() still returns true, because the file stream still is
//1
//1 Here the file stream fil goes out of scope and the file it is connected to
//1
//2
//3
//4
//5
processed.
//5 Close the file again. The file stream is empty again.
Flag                Effects
ios_base::in
ios_base::out
ios_base::ate
ios_base::app
ios_base::trunc
ios_base::binary    Binary mode
Bidirectional file streams, on the other hand, do not have the flag set
implicitly. This is because a bidirectional stream does not have to be in both
input and output mode in all cases. You might want to open a bidirectional
stream for reading only or writing only. Bidirectional file streams therefore
have no implicit input or output mode. You always have to set a
bidirectional file stream's open mode explicitly.
Input mode only works for files that already exist. Otherwise, the stream
construction will fail, as indicated by failbit set in the stream state. Files
that are opened for writing will be created if they do not yet exist. The
constructor only fails if the file cannot be created.
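This behavior can be checked with a small sketch. It assumes that the current directory is writable and that the scratch file name passed in is free to use; the function name openBehaviour is hypothetical.

```cpp
#include <cstdio>
#include <fstream>

// Returns true if input mode fails for a missing file, output mode creates
// the file, and input mode then succeeds on the newly created file.
bool openBehaviour(const char* name)
{
    std::remove(name);                 // make sure the file does not exist
    std::ifstream in(name);            // input mode: the file must exist
    bool inputFailed = !in;            // failbit is set in the stream state
    std::ofstream out(name);           // output mode: the file is created
    bool outputOk = out.is_open();
    out.close();
    std::ifstream again(name);         // now the file exists
    bool inputOk = again.is_open();
    again.close();
    std::remove(name);                 // clean up the scratch file
    return inputFailed && outputOk && inputOk;
}
```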
Open mode        C stdio equivalent    Effect
in               "r"
out|trunc        "w"
out|app          "a"
in|out           "r+"
in|out|trunc     "w+"
in|out|app       "a+"
out              "w"
File stream    Default open mode
ifstream       in
ofstream       out
fstream        in|out
As with file streams, there are three class templates that implement string
streams: basic_istringstream <charT,traits,Allocator>,
basic_ostringstream <charT,traits,Allocator>, and basic_stringstream
<charT,traits,Allocator>. These are derived from the stream base classes,
basic_istream <charT, traits>, basic_ostream <charT, traits>, and
basic_iostream <charT, traits>. Therefore they inherit all the functions for
formatted input and output described in Section 2.3, as well as the stream
state. They also have functions for setting and retrieving the string that
serves as source or sink, and constructors that allow you to set the string
at construction time. For convenience, there are the regular typedefs
istringstream, ostringstream, and stringstream, with wistringstream,
wostringstream, and wstringstream for the respective narrow and wide
character string streams.
The buffering is done through a specialized stream buffer class,
basic_stringbuf <charT,traits,Allocator>.
Output string streams are dynamic.29 The internal buffer is allocated once an
output string stream is constructed. The buffer is automatically extended
during insertion each time the internal buffer is full.
Input string streams are always static. You can extract as many items as are
available in the string you provided the string stream.
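Both directions can be sketched in a few lines. The function names buildMessage and sumOfInts are hypothetical; the sketch uses only the str() member and the standard inserters and extractors.

```cpp
#include <sstream>
#include <string>

// Output string stream: the internal buffer grows as items are inserted.
std::string buildMessage(int count, double price)
{
    std::ostringstream os;
    os << count << " items at " << price;
    return os.str();                 // a copy of the internal buffer
}

// Input string stream: extraction stops when the provided string is used up.
int sumOfInts(const std::string& text)
{
    std::istringstream is(text);     // the stream works on a copy of text
    int sum = 0;
    int n = 0;
    while (is >> n)
        sum += n;
    return sum;
}
```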
This class has private data members of type tm, which is the time structure
defined in the C library (in header file <ctime>).
29 This was different in the old iostreams, where you could have
dynamic and static output streams. See section 2.14.4 for
further details.
or
date aDate;
cout << '\n' << "Please, enter a date (day month year)" << '\n';
cin >> aDate;
cout << aDate << '\n';
For the next step, we would implement shift operators as inserters and
extractors for date objects. Here is an extractor for class date:
template<class charT, class Traits>
basic_istream<charT, Traits>&                                //1
operator>> (basic_istream<charT,Traits>& is,                 //2
            date& dat)                                       //3
{
  is >> dat.tm_date.tm_mday;                                 //4
  is >> dat.tm_date.tm_mon;
  is >> dat.tm_date.tm_year;
  return is;                                                 //5
}
//1 The returned value for extractors (and inserters) is a reference to the
extracted.
//3 The second parameter is a reference, or alternatively a pointer, to an
The inserter can be built analogously, as shown in the following code. The
only difference is that you would hand over a constant reference to a date
object, because the inserter is not supposed to modify the object it prints.
template<class charT, class Traits>
basic_ostream<charT, Traits>&
operator << (basic_ostream<charT, Traits >& os, const date& dat)
{
os << dat.tm_date.tm_mon << '-';
os << dat.tm_date.tm_mday << '-';
os << dat.tm_date.tm_year ;
return os;
}
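Put together, the simple extractor and inserter can be exercised as in the following sketch. The minimal date class shown here is an assumption: it exposes its tm member publicly only to keep the example short, whereas the class described in the text keeps it private. The helper roundTrip is hypothetical.

```cpp
#include <ctime>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

class date {
public:
    date() : tm_date() {}            // zero-initialize the time structure
    std::tm tm_date;                 // public only for this sketch
};

template <class charT, class Traits>
std::basic_istream<charT, Traits>&
operator>>(std::basic_istream<charT, Traits>& is, date& dat)
{
    is >> dat.tm_date.tm_mday;
    is >> dat.tm_date.tm_mon;
    is >> dat.tm_date.tm_year;
    return is;
}

template <class charT, class Traits>
std::basic_ostream<charT, Traits>&
operator<<(std::basic_ostream<charT, Traits>& os, const date& dat)
{
    os << dat.tm_date.tm_mon << '-';
    os << dat.tm_date.tm_mday << '-';
    os << dat.tm_date.tm_year;
    return os;
}

// Parses "day month year" and prints the date back as "month-day-year".
std::string roundTrip(const std::string& input)
{
    std::istringstream in(input);
    std::ostringstream out;
    date d;
    if (in >> d)
        out << d;
    return out.str();
}
```

These simple versions store and print the numbers exactly as entered; they do not convert to the offsets that the C library expects in tm_mon and tm_year.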
//1
//2
//3
return is;
}
//1 Use the time_get facet of the input stream's locale to handle parsing of
arguments, including:
A range of input iterators. For the sake of performance and efficiency,
facets directly operate on a stream's buffer. They access the stream
buffer through stream buffer iterators. (See the section on stream buffer
iterators in the Standard C++ Library User's Guide.) Following the
//1
//2
//1 Here we use the time_put facet of the stream's locale to handle
formatting of dates.
//2 The facet's put() function takes the following arguments:
The fill character. We would use the stream's fill character here.
Naturally, we could use any other fill character; however, the stream's
settings are normally preferred.
A pointer to a time structure. This structure will be filled with the result
of the parsing.
A format specifier. This can be a character, like 'x' in our example here,
or alternatively, a character sequence containing format specifiers, each
consisting of a % followed by a character. An example of such a format
specifier string is "%A, %B %d, %Y". It has the same effect as the format
specifiers for the strftime() function in the C library; it produces a
date like: Tuesday, June 11, 1996. We don't use a format specifier
string here, but simply the character 'x', which specifies that the
locale's appropriate date representation shall be used.
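The call to the facet's put() function with a format specifier string can be sketched as follows. The sketch assumes the classic "C" locale and a narrow character string stream, and it uses a numeric pattern so that the result does not depend on locale-specific names; formatDate and makeDate are hypothetical helpers.

```cpp
#include <ctime>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

// Builds a time structure from a calendar date (month is 1-based here).
std::tm makeDate(int year, int month, int day)
{
    std::tm t = std::tm();
    t.tm_year = year - 1900;    // tm_year counts years since 1900
    t.tm_mon  = month - 1;      // tm_mon counts months since January
    t.tm_mday = day;
    return t;
}

// Formats the date through the time_put facet of the stream's locale.
std::string formatDate(const std::tm& t, const std::string& pattern)
{
    std::ostringstream os;
    os.imbue(std::locale::classic());
    const std::time_put<char>& tp =
        std::use_facet<std::time_put<char> >(os.getloc());
    tp.put(std::ostreambuf_iterator<char>(os),   // where to write
           os,                                   // formatting information
           os.fill(),                            // the stream's fill character
           &t,                                   // the time structure
           pattern.data(), pattern.data() + pattern.size());
    return os.str();
}
```

Passing the single character 'x' instead of a pattern would select the locale's own date representation, as described above.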
Note how these versions of the inserter and extractor differ from previous
simple versions: we no longer rely on existing inserters and extractors for
built-in types, as we did when we used operator<<(int) to insert the date
object's data members individually. Instead, we use a low-level service like
the time facet's get_date() service. The consequence is that we give away
all the functionality that high-level services like the inserters and extractors
already provide, such as format control, error handling, etc.
The same happens if you decide to access the stream's buffer directly,
perhaps for optimizing your program's runtime efficiency. The stream
buffer's services, too, are low-level services that leave to you the tasks of
format control, error handling, etc.
In the following sections, we will explain how you can improve and
complete your inserter or extractor if it directly uses low-level components
like locales or stream buffers.
Use the setstate() function for setting the stream's error state. It
automatically throws the ios_base::failure exception according to
the exceptions switch in the stream's exception mask.
Do not call any functions from the formatting layer. This would cause a
deadlock in a multithreading situation, since the sentry object locks
the stream through the stream's mutex (a mutually exclusive lock). A
nested call to one of the stream's member functions would again
create a sentry object, which would wait for the same mutually
exclusive lock and, voilà, you have deadlock. Use the stream buffer's
functions instead. They do not use the stream's mutex, and are more
efficient anyway.
Please note: Do not call the stream's input or output functions after
creating a sentry object in your inserter or extractor. Use the stream
buffer's functions instead.
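The following sketch illustrates the rule for a deliberately simple case. The type label and the helper render are hypothetical names introduced only for this illustration; the point is that, once the sentry exists, the inserter talks to the stream buffer alone and reports errors through setstate() at the very end:

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical user-defined type, used only for this illustration.
struct label {
    explicit label(const std::string& t) : text(t) {}
    std::string text;
};

std::ostream& operator<<(std::ostream& os, const label& l)
{
    std::ios_base::iostate err = std::ios_base::goodbit;
    std::ostream::sentry guard(os);      // may lock the stream
    if (guard) {
        // From here on, use the stream buffer only; no os << ... calls.
        std::streambuf* sb = os.rdbuf();
        const std::streamsize n = static_cast<std::streamsize>(l.text.size());
        const std::streambuf::int_type eof = std::streambuf::traits_type::eof();
        if (sb->sputc('[') == eof
            || sb->sputn(l.text.data(), n) != n
            || sb->sputc(']') == eof)
            err |= std::ios_base::badbit; // collect errors, report them later
        os.width(0);                      // the field width is reset after use
    }
    if (err != std::ios_base::goodbit)
        os.setstate(err);                 // may throw ios_base::failure
    return os;
}

// Convenience wrapper for trying the inserter on a string stream.
std::string render(const label& l)
{
    std::ostringstream os;
    os << l;
    return os.str();
}
```

This sketch ignores the field width and adjustment settings, which a complete inserter would have to honor itself, because the stream buffer knows nothing about formatting.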
//1
try {
//2
//3
if(ipfx)
{
use_facet<time_get<charT,Traits> >(is.getloc())
.get_date(is, istreambuf_iterator<charT,Traits>()
,is, err, &dat.tm_date);
if (!dat) err |= ios_base::failbit;
}
} // try
catch(...)
{
bool flag = false;
try { is.setstate(ios_base::failbit); }
catch( const ios_base::failure& ) { flag = true; }
if ( flag ) throw;
}
//4
//5
//6
//7
//8
//9
//10
if ( err ) is.setstate(err);
//11
return is;
}
//1 The variable err will keep track of errors as they occur. In this
example, it is handed over to the time_get facet, which will set the
block, so that the respective error states could be set correctly before the
exception is actually thrown.
//3 Here we define the sentry object that does all the preliminary work, like
semantically valid, e.g., it would detect wrong dates like February 30.
Extracting an invalid date should be treated as a failure, so we set the
failbit.
Note that in this case it is not advisable to set the failbit through the
stream's setstate() function, because setstate() also raises
exceptions if they are switched on in the stream's exception mask. We
don't want to throw an exception at this point, so we add the failbit to
the state variable err.
//7 Here we catch all exceptions that might have been thrown so far. The
intent is to set the stream's error state before the exception terminates
the extractor, and to rethrow the original exception.
//8 Now we finally set the stream's error state through its setstate()
function. This call might throw an ios_base::failure exception
//1
> >
//2
//3
//4
//5
if ( err ) os.setstate(err);
return os;
}
The inserter and the extractor have only a few minor differences:
//1 We prefer to use the other put() function of the locale's time_put facet.
function.
//3 The put() function returns an iterator pointing immediately after the
the stream's format settings and adjusts the output according to the
respective field width. The rule is that the field width shall be reset
after each usage.
2.7.4.2 An Afterthought
Why is it seemingly so complicated to implement an inserter or extractor?
Why doesn't the first simple approach suffice?
First, it is not really as complicated as it seems if you stick to the patterns: we
give these patterns in the next section. Second, the simple extractors and
inserters in our first approach do suffice in many cases, when the user-defined type consists mostly of data members of built-in types, and runtime
efficiency is not a great concern.
However, whenever you care about the runtime efficiency of your input and
output operations, it is advisable to access the stream buffer directly. In such
cases, you will be using fast low-level services and hence will have to add
format control, error handling, etc., because low-level services do not handle
this for you. In our example, we aimed at optimal performance; the extractor
and inserter for locale-dependent parsing and formatting of dates are very
efficient because the facets directly access the stream buffer. In all these
cases, you should follow the patterns we are about to give.
//7
//8
//9
//10
//11
return is;
}
catch(...)
{
bool flag = false;
try { os.setstate(ios_base::failbit); }
catch( const ios_base::failure& ) { flag = true; }
if ( flag ) throw;
}
if ( err ) os.setstate(err);
return os;
}
2.8 Manipulators
We have seen examples of manipulators in Section 2.3.3.2. There we learned
that:
The inserted objects setw(10) and endl are the manipulators. As a side
effect, the manipulator setw(10) sets the stream's field width to 10.
Similarly, the manipulator endl inserts the end of line character and flushes
the output.
As we have mentioned previously, extensibility is a major advantage of
iostreams. We've seen in the previous section how you can implement
inserters and extractors for user-defined types that behave like the built-in
input and output operations. Additionally, you can add user-defined
manipulators that fit seamlessly into the iostreams framework. In this
section, we will see how to do this.
First of all, to be extracted or inserted, a manipulator must be an object of a
type that we call manipT, for which overloaded versions of the shift operators
exist. (Associated with the manipulator type manipT, there is usually a
function that we will call fmanipT() and explain in detail later.) Here's
the pattern for the manipulator extractor:
template <class charT, class Traits>
basic_istream<charT,Traits>&
operator>> (basic_istream<charT,Traits>& istr
,const manipT& manip)
{
return fmanipT(istr, ...);
}
With this extractor defined, you can extract a manipulator Manip, which is an
object of type manipT, by simply saying:
cin >> Manip;
(1) ios_base& (*pf)(ios_base&)
(2) basic_ios<charT,Traits>& (*pf)(basic_ios<charT,Traits>&)
(3) basic_istream<charT,Traits>& (*pf)(basic_istream<charT,Traits>&)
(4) basic_ostream<charT,Traits>& (*pf)(basic_ostream<charT,Traits>&)
where output_stream_type is one of the function pointer types (1), (2), or (4).
with endl as the actual argument for pf. In other words, cout << endl; is
equal to cout.operator<<(endl);
Here is another manipulator, boolalpha, that can be applied to input and
output streams. The manipulator boolalpha is a pointer to a function of type
(1):
ios_base& boolalpha(ios_base& strm)
{
strm.setf(ios_base::boolalpha);
return strm;
}
flushing is not necessary because the standard output stream cout is tied to
the standard input stream cin, so input and output to the standard streams
are synchronized anyway. Since no flush is required, the intent is probably
to insert the end-of-line character. If you consider typing '\n' more trouble
than typing endl, you can easily add a simple manipulator nl that inserts the
end-of-line character, but refrains from flushing the stream.
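A minimal sketch of such a manipulator might look as follows. It has the same shape as endl, i.e., a function of type (4), but leaves out the flush. The helper two_lines is a hypothetical name that exists only to try the manipulator on a string stream:

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// A parameterless manipulator: inserts the end-of-line character
// without flushing the stream.
template <class charT, class Traits>
std::basic_ostream<charT, Traits>&
nl(std::basic_ostream<charT, Traits>& os)
{
    return os.put(os.widen('\n'));   // like endl, but no os.flush()
}

// Small helper for trying nl on a string stream.
inline std::string two_lines(const std::string& a, const std::string& b)
{
    std::ostringstream os;
    os << a << nl << b;
    return os.str();
}
```

Because nl is a function template, the compiler deduces charT and Traits from the stream it is inserted into, so the same manipulator works for tiny and wide character streams.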
With this inserter defined, the expression cout << Manip(x); is equal to a
call to the shift operator sketched above; i.e., operator<<(cout, Manip(x));
Assuming that a side effect is created by an associated function fmanipT, the
manipulator must call the associated function with its respective
argument(s). Hence it must store the associated function together with its
argument(s) for a call from inside the shift operator.
The associated function fmanipT can be a static or a global function, or a
member function of type manipT, for example.
In the inserter above, we've assumed that the associated function fmanipT is a
static or a global function, and that it takes exactly one argument. Generally,
the manipulator type manipT might look like this:
template <class FctPtr, class Arg1, class Arg2, ...>
class manipT
{
public:
manipT(FctPtr, Arg1, Arg2, ...);
private:
FctPtr fp_;
Arg1 arg1_;
Arg2 arg2_;
};
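As a concrete instance of this general shape, here is a sketch of a manipulator type with exactly one argument that stores a pointer to a global associated function. All names in it (one_arg_manip, write_spaces, spaces, spaced) are hypothetical and chosen only for this illustration; for simplicity, the sketch is restricted to tiny character output streams:

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Stores the associated function together with its single argument.
template <class Arg>
class one_arg_manip {
public:
    typedef std::ostream& (*function_type)(std::ostream&, Arg);
    one_arg_manip(function_type fp, Arg arg) : fp_(fp), arg_(arg) {}

    // The inserter calls the stored function with the stored argument.
    friend std::ostream& operator<<(std::ostream& os, const one_arg_manip& m)
    {
        return (*m.fp_)(os, m.arg_);
    }
private:
    function_type fp_;
    Arg arg_;
};

// The associated function: a global function taking the stream and one argument.
inline std::ostream& write_spaces(std::ostream& os, int n)
{
    for (int i = 0; i < n; ++i)
        os.put(' ');
    return os;
}

// The function the user calls; it returns the manipulator object.
inline one_arg_manip<int> spaces(int n)
{
    return one_arg_manip<int>(write_spaces, n);
}

// Small helper for trying the manipulator on a string stream.
inline std::string spaced(char left, int n, char right)
{
    std::ostringstream os;
    os << left << spaces(n) << right;
    return os.str();
}
```

Here the expression os << spaces(3) first creates a temporary manipulator object and then passes it to the inserter, which performs the side effect.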
(2) Manip(x) is a constructor call. In this case, Manip would be the name of a
class with a constructor that takes an argument of type X and constructs a
manipulator object of type Manip; i.e., Manip and manipT would be
identical:
class Manip {
public:
Manip(X x);
};
Solutions (1) and (2) are semantically different from solution (3). In solution
(1), Manip is a function and therefore need not be created by the user. In
solution (2), Manip is a class name and an unnamed temporary object serves
as manipulator object. In solution (3), however, the manipulator object Manip
must be explicitly created by the user. Hence the user has to write:
manipT Manip;
cout << Manip(x);
For any of the three solutions just discussed, there is also a choice of
associated functions. The associated function fmanipT can be either:
a) A static or a global function;
b) A static member function;
c) A virtual member function.
Among these choices, (b), i.e., use of a static member function, is
preferable in an object-oriented program because it permits encapsulation of
the manipulator together with its associated function. This is particularly
recommended if the manipulator has state, as in solution (3), where the
manipulator is a function object, and the associated function has to access the
manipulator's state.
Using (c), i.e., a virtual member function, introduces the overhead of a virtual
function call each time the manipulator is inserted or extracted. It is useful if
the manipulator has state, and the state needs to be modified by the
associated manipulator function. A static member function would only be
able to access the manipulator's static data; a non-static member function,
however, can access the object-specific data.
The manipulator type manipT can be derived from the manipulator type
smanip defined by iostreams. Here is an alternative implementation of a
manipulator like setprecision():
class setprecision : public smanip<int> {
public:
setprecision(int n) : smanip<int>(sprec_, n) { }
private:
static ios_base& sprec_(ios_base& str, int n)
{ str.precision(n);
return str;
}
};
The idea here is that the associated function fmanipT is a non-static member
function of the manipulator type manipT. In such a model, the manipulator
does not store a pointer to the associated function fmanipT , but defines the
associated function as a pure virtual member function. Consequently, the
manipulator type manipT will be an abstract class, and concrete manipulator
types will be derived from this abstract manipulator type. They will have to
implement the virtual member function that represents the associated
function.
Clearly, we need a new manipulator type because the standard manipulator
type smanip is implementation-defined. In Rogue Wave's Standard C++
Library, it has no virtual member functions, but stores a pointer to the
associated function. Here is the abstract manipulator type we need:
template <class Arg, class Ostream>
class virtsmanip
{
public:
typedef Arg argument_type;
typedef Ostream ostream_type;
virtsmanip (Arg a) : arg_(a) { }
protected:
virtual Ostream& fct_(Ostream&,Arg) const = 0;
Arg arg_;
friend Ostream&
operator<< (Ostream& ostr
,const virtsmanip<Arg,Ostream>& manip);
};
This type virtsmanip differs from the standard type smanip in several ways:
The argument arg_ and the virtual function fct_() are protected
members, and consequently the respective shift operator for the
manipulator type has to be a friend function.
We would like to derive the Tag manipulator here from the standard
manipulator smanip. Unfortunately, smanip is restricted to associated
(*pf_)(Ostream&, Arg);
arg_;
friend Ostream&
operator<<
(Ostream& ostr, const osmanip<Ostream,Arg>& manip);
};
Then we need to define the inserter for the new manipulator type osmanip:
template <class Ostream, class Arg>
Ostream&
operator<< (Ostream& ostr,const osmanip<Ostream,Arg>& manip)
{
(*manip.pf_)(ostr,manip.arg_);
return ostr;
}
Note that the semantics of this type of manipulator differ from the previous
ones, and from the standard manipulator setprecision. The manipulator
This kind of manipulator is more flexible. In the example above, you can see
that the default text is set to "v1.2 >> " when the manipulator is created.
Thereafter you can use the manipulator as a parameterless manipulator and
it will remember this text. You can also use it as a manipulator taking an
argument, and provide it with a different argument each time you insert it.
Example 5: Function Object and Virtual Member Function. In the
previous example, a static member function is used as the associated
function. This has the slight disadvantage that the associated function
cannot modify the manipulator's state. Should modification be necessary,
you might consider using a virtual member function instead.
Our final example here is a manipulator that stores additional data, the
previously mentioned lineno manipulator. It adds the next line number
each time it is inserted:
LineNo lineno;
while (!cout)
{
cout << lineno << ...;
}
The manipulator is implemented following the (3) and (b) pattern, i.e.:
The manipulator object contains a line number that is initialized when the
manipulator object is constructed. Each time the lineno manipulator is
inserted, the line number is incremented.
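To illustrate just the stateful behavior, here is a deliberately simplified sketch. It does not use a manipulator base type or a virtual member function; the class name LineNoSketch and the helper number_lines are hypothetical. The inserter takes the manipulator by non-constant reference, because inserting it changes its state:

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// A manipulator object that carries state: the next line number.
class LineNoSketch {
public:
    explicit LineNoSketch(int start = 1) : next_(start) {}

    // Inserting the manipulator prints the number and increments it.
    friend std::ostream& operator<<(std::ostream& os, LineNoSketch& ln)
    {
        return os << ln.next_++ << ": ";
    }
private:
    int next_;
};

// Small helper for trying the manipulator on a string stream.
inline std::string number_lines(const std::string& a, const std::string& b)
{
    LineNoSketch lineno;   // must be a named object, not a temporary
    std::ostringstream os;
    os << lineno << a << '\n' << lineno << b << '\n';
    return os.str();
}
```

Note that this inserter is an ordinary high-level inserter that relies on operator<<(int); it is meant to show the state handling only.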
For the manipulator base type, we use a slightly modified version of the
manipulator type osmanip from Example 3. The changes are necessary
because the associated function in this case may not be a constant member
function:
template <class Arg, class Ostream>
class virtsmanip
{
public:
typedef Arg argument_type;
This solution is bad for at least two reasons. First, assignments to any of the
predefined streams should be avoided. The predefined streams cin, cout,
cerr, and clog have special properties and are treated differently from other
streams. If you reassign them, as done with cout in the example above, you
lose their special properties. Second, assignment and copying of streams is
hazardous. The assignment of the output stream fil will compile and might
even work; however, your program is likely to crash afterwards.32
[Figure 31: a stream object holds a locale, a format state, an error state, an exception mask, and a stream buffer, which in turn manages the character buffer.]
Figure 31. Data held by a stream object. Please note that some data members are
omitted for the sake of simplicity.
The stream buffer contains several pointers to the actual character buffer it
maintains. The default copy constructor and assignment operator will not
correctly handle these pointers.
To achieve the equivalent effect of a copy, you might consider copying each
data member individually. This can be done as in the following example:
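#include <fstream>
#include <iostream>
using namespace std;

int main(int argc, char* argv[])
{
  ofstream out;
  if (argc > 1)
    out.open(argv[1]);
  else
  { out.copyfmt(cout);                               //1
    out.clear(cout.rdstate());                       //2
    // ofstream::rdbuf() hides the one-argument overload,
    // so the base class version is called explicitly.
    out.basic_ios<char>::rdbuf(cout.rdbuf());        //3
  }
  // output to out, e.g.
  out << "Hello world!" << endl;
}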
//1 The copyfmt() function copies all data from the standard output stream
cout to the output file stream out, except the error state and the stream
buffer. (There is a function exceptions() that allows you to copy the
exception mask separately; i.e., cout.exceptions(fil.exceptions());.
However, you need not do this explicitly, since copyfmt() already
Please note the little snag here. After the call to rdbuf(), the buffer is shared
between the two streams, as shown in Figure 32:
[Figure 32: copying a stream's data from a source stream to a target stream. clear() copies the error state, exceptions() the exception mask, copyfmt() the locale, the format state, and other stream data, and rdbuf() the stream buffer pointer; afterwards both streams refer to the same stream buffer and its character buffer.]
//1
//2
out.rdbuf(cout.rdbuf());
}
// output to out, e.g.
out << "Hello world!" << endl;
//3
As we copy the standard output stream's entire internal data, we also copy
its special behavior. For instance, the standard output stream is
synchronized with the standard input stream. (See Section 2.10.4 for further
details.) If our output file stream out is a copy of cout, it is forced to
synchronize its output operations with all input operations from cin. This
might not be desired, especially since synchronization is a time-consuming
activity. Here is a more efficient approach using only the stream buffer of
the standard output stream:
#include <fstream>
#include <iostream>
using namespace std;

int main(int argc, char* argv[])
{
  filebuf* fb = new filebuf;                               //1
  ostream out((argc>1)?                                    //2
    fb->open(argv[1],ios_base::out|ios_base::trunc):       //3
    cout.rdbuf());                                         //4
  if (out.rdbuf() != fb)
    delete fb;
  out << "Hello world!" << endl;
}
//1 Instead of creating a file stream object, which already contains a file
buffer object, we construct a separate file buffer object on the heap that
we can hand over to an output stream object if needed. This way we
can delete the file buffer object if not needed. In the original example,
we constructed a file stream object with no chance of eliminating the file
buffer object if not used.
//2 An output stream is constructed. The stream has either the standard
connected to the file buffer object. (Note that you must ensure that the
lifetime of this stream buffer object exceeds the lifetime of the output
stream that uses it.) The open() function returns a pointer to the file
buffer object. This pointer is used to construct the output stream object.
//4 If no file name is provided, the standard output stream's buffer is used.
As in the original example, out inserts through the standard output stream's
buffer, but lacks the special properties of a standard stream.
Here is an alternative solution that uses file descriptors, a non-standard
feature of Rogue Wave's implementation of the standard iostreams34:
//1
//2
//1 If the program is provided with a file name, the file is opened and
The effect is the same as in the previous solution, because the standard
output stream cout is connected to the C standard output file stdout. This is
the simplest of all solutions, because it doesn't involve reassigning or sharing
stream buffers. The output file stream's buffer is simply connected to the
right file. However, this is a non-standard solution, and may decrease
portability.
//1
//2
//3
//4
}
//1 A pointer to an ostream is used. (Note that it cannot be a pointer to an
ofstream, because the standard output stream cout is not a file stream,
but a plain stream of type ostream.)
//2 A file stream for the named output file is created on the heap and
output file.
Working with pointers and references has a drawback: you have to create an
output file stream object on the heap and, in principle, you have to worry
about deleting the object again, which might lead you into other dire straits.
In summary, creating a copy of a stream is not trivial and should only be
done if you really need a copy of a stream object. In many cases, it is more
appropriate to use references or pointers to stream objects instead, or to
share a stream buffer between two streams.
Keep in mind: Never create a copy of a stream object when a reference or a
pointer to the stream object would suffice, or when a shared stream buffer
would solve the problem.
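As an illustration of the reference approach, consider the following sketch. The function name select_output is hypothetical, and falling back to cout when the file cannot be opened is our own design choice for this example. The file stream lives in the caller's scope, so nothing is created on the heap and nothing is copied:

```cpp
#include <cassert>
#include <fstream>
#include <iostream>

// Returns a reference to the stream to write to: the file stream if a
// file name was given and the file could be opened, otherwise cout.
inline std::ostream& select_output(std::ofstream& file, const char* name)
{
    if (name != 0) {
        file.open(name);
        if (file.is_open())
            return file;
    }
    return std::cout;
}
```

A caller would declare an ofstream object, bind the result to a reference, e.g. ostream& out = select_output(file, name);, and insert into out from then on. The standard output stream keeps all its special properties, because it is used directly rather than through a copy.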
//1
file1.setf(ios_base::fixed, ios_base::floatfield);
file1.precision(5);
file2.setf(ios_base::scientific, ios_base::floatfield);
file2.precision(3);
//2
//3
//4
//1 The stream buffer of file1 is replaced by the stream buffer of file2.
Note that file2 in the example above has to be an output stream rather than
an output file stream. This is because file streams do not allow you to switch
the file stream buffer.
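The effect of sharing a stream buffer between two streams with separate format settings can be tried without any files. In the sketch below, which is an illustration under our own assumptions rather than the example discussed above, two plain output streams are constructed on one string buffer; the function name shared_buffer_demo is hypothetical:

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Two output streams share one stream buffer, but each keeps its own
// format state.
inline std::string shared_buffer_demo(double value)
{
    std::stringbuf buf;
    std::ostream fixed_out(&buf);   // first stream on the shared buffer
    std::ostream sci_out(&buf);     // second stream on the same buffer

    fixed_out.setf(std::ios_base::fixed, std::ios_base::floatfield);
    fixed_out.precision(5);
    sci_out.setf(std::ios_base::scientific, std::ios_base::floatfield);
    sci_out.precision(3);

    fixed_out << value << ' ';      // formatted with fixed_out's settings
    sci_out << value;               // formatted with sci_out's settings
    return buf.str();               // both outputs ended up in one buffer
}
```

Both insertions land in the same character sequence, in the order in which they were performed, while the formatting of each follows the settings of the stream that was used.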
47,11
//1
47.11
Again, there is a little snag. In Figure 33, note that a stream buffer has a
locale object of its own, in addition to the stream's locale object.
[Figure 33: two streams, each with its own locale, sharing one stream buffer that has a locale of its own.]
Usually the stream's locale and the stream buffer's locale are identical.
However, when you share a stream buffer between two streams with
different locales, you must decide which locale the stream buffer will use.35
You can set the stream buffer's locale by calling the pubimbue() function as
follows:
file1.imbue(locale("De_DE"));
file2.imbue(locale("En_US"));
file1.rdbuf()->pubimbue(locale("De_DE"));
//1
//2
//3
//4
//5
//6
35 Whether and how the user can influence the setting of the
stream buffer's locale is still open. At the time of this
writing, a call to the stream's imbue() function changes
the stream buffer's locale object as well. In the example
above, the shared stream buffer will have the locale object
of file2. This is not a problem here because the stream
buffer only uses the locale's conversion facet, and both
locales will probably have the same void conversion facet.
However, this would cause a problem in the case where
the streams' locales have different code conversion facets.
Notice that there is a difference between the solutions that you can see by
comparing Figure 34 and Figure 35. An input and an output stream that
share a stream buffer, as shown in Figure 34, can still have separate format
settings, different locales, different exception masks, and so on.
[Figure 34: an input stream and an output stream sharing a stream buffer. Each stream has its own exception mask, locale, error state, format state, other stream data, and stream buffer pointer; both pointers refer to the same stream buffer and its character buffer.]
In contrast, the bidirectional stream shown in Figure 35 can have only one
format setting, one locale, and so on:
[Figure 35: a bidirectional stream. It has a single exception mask, locale, error state, format state, set of other stream data, and stream buffer pointer, which refers to one stream buffer and its character buffer.]
//1
//2
//3
//1 The easiest way to put the entire file content into a memory location for
and so on.
In cases where this procedure is insufficient, you should create a string that
contains the header information and process the header by means of the
string operations find(), compare(), etc.
fstream fil("/tmp/inout");
header_stream << fil.rdbuf();
string header_string = header_stream.str();
// process the header, e.g.
string::size_type pos = header_string.rfind('.');
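The technique of inserting a stream buffer into a string stream works for any input stream, not only for file streams. The following self-contained sketch wraps it into a function; the name read_all is hypothetical:

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Copies everything that is left in the input stream's buffer into a string.
inline std::string read_all(std::istream& in)
{
    std::ostringstream collected;
    collected << in.rdbuf();    // transfers characters until end of input
    return collected.str();
}
```

Once the content is available as a string, operations like find(), rfind(), and compare() can be applied to it. Keep in mind that the entire remaining content is held in memory, so this is only appropriate for reasonably small inputs such as a header.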
If the header contains binary data instead of text, even a string will probably
not suffice. Here you would want to see the header as a plain byte sequence,
i.e., an ordinary char* buffer. But note that a code conversion might already
have been performed, depending on the locale attached to the file stream. In
cases where you want to process binary data, you have to make sure that the
attached locale has a non-converting code conversion facet:
fstream fil("/tmp/inout");
header_stream << fil.rdbuf();
string header_string = header_stream.str();
const char* header_char_ptr = header_string.data();
// process the header, e.g.
int idx;
memcpy((char*) &idx,header_char_ptr,sizeof(int));
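[Figure: the path of the header data. The content of the external file is transferred through the file stream's rdbuf() into a string stream, from there through str() into a string, and finally through data() to a char pointer.]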
[Figure: two streams, each with its own file buffer, connected to the same external file.]
//1
//2
//1 The attempt to extract anything from the file /tmp/fil after this
insertion will probably fail, because the string "Hello " is buffered and
//1
ofstr.seekp(p);
ofstr << "Peter!" << flush;
ifstr >> s;
//2
//3
//4
//5
ifstr.sync();
ifstr >> s;
//6
//1 Here the input stream extracts the first string from the shared file. In
doing so, the input stream fills its buffer. It reads as many characters
from the external file as needed to fill the internal buffer. For this
reason, the number of characters to be extracted from the file is
implementation-specific; it depends on the size of the internal stream
buffer.
//2 The output stream overwrites part of the file content. Now the file
content and the content of the input stream's buffer are inconsistent.
The file contains "Hello Peter!"; the input stream's buffer still contains
"Hello World!".
//3 This extraction takes the string "World!" from the buffer instead of
yielding "Peter!", which is the current file content.
//4 More characters are appended to the external file. The file now contains
"Hello Peter! Happy Birthday!", whereas the input stream's buffer is
still unchanged.
//5 This extraction yields nothing. The input stream filled its buffer with
the entire content of the file because the file is so small in our toy
example. Subsequent extractions made the input stream hit the end of
its buffer, which is regarded as the end of the file as well. The
extraction results in eofbit set, and nothing will be extracted. There is
no reason to ever access the external file again.
//6 A call to sync() eventually forces the input stream to refill the buffer
from the external device, beginning with the current file position. After
the synchronization, the input stream's buffer will contain "Happy
Birthday!\n". The next extraction will yield "Happy".
As the draft specifies the behavior of sync() as implementation-defined, you can alternatively try repositioning the input stream to the
current position instead; i.e., istr.seekg(0, ios_base::cur);
Please note: If you have to synchronize several streams that share a file, it
is advisable to call the sync() function after each output operation and
before each input operation.
//1
while (some_condition)
//2
//1
//2
//1 Switch off the unitbuf flag. Alternatively, using manipulators, you can
say ostr << nounitbuf;
//2 Flush the buffer and switch on the unitbuf flag again. Alternatively,
you can say ostr << flush << unitbuf;
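The two steps can be combined into a small function that writes a batch of output with unit buffering switched off and restores the previous setting afterwards. This is a sketch under our own assumptions; the function name write_batch is hypothetical:

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Writes all items with unit buffering switched off, then flushes once
// and restores the unitbuf flag if it was set before.
inline void write_batch(std::ostream& os, const std::vector<std::string>& items)
{
    const bool was_unit_buffered =
        (os.flags() & std::ios_base::unitbuf) != 0;

    os << std::nounitbuf;                  // no flush after each insertion
    for (std::size_t i = 0; i < items.size(); ++i)
        os << items[i] << '\n';
    os << std::flush;                      // one flush for the whole batch

    if (was_unit_buffered)
        os << std::unitbuf;                // switch unit buffering on again
}
```

Remembering the previous setting makes the function usable with streams that were never unit-buffered in the first place.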
//1
while (some_condition)
{ ostr << " some output ";
string s;
while (istr >> s)
// process input ;
}
//2
istr.tie(old_tie);
//3
//1 The input stream istr is tied to the output stream ostr. The tie()
cin is tied to cout; i.e., before each input operation on cin, the output
stream cout is forced to flush its buffer.
cerr is synchronized using the unitbuf format flag; i.e., after each output
to cerr, its buffer is flushed.
clog is connected to the same output channel and thus behaves like cerr,
except that it is not synchronized with any of the other standard streams;
i.e., it does not have the unitbuf flag set.
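These properties can be checked from within a program. The helper names below are hypothetical; the sketch merely queries the tie of cin and the unitbuf flag of a stream:

```cpp
#include <cassert>
#include <iostream>

// True if the stream flushes its buffer after each output operation.
inline bool is_unit_buffered(const std::ios_base& str)
{
    return (str.flags() & std::ios_base::unitbuf) != 0;
}

// True if cin is tied to cout, i.e., cout is flushed before input from cin.
inline bool cin_is_tied_to_cout()
{
    return std::cin.tie() == &std::cout;
}
```

On a conforming implementation we would expect cin_is_tied_to_cout() and is_unit_buffered(cerr) to be true, and is_unit_buffered(clog) to be false, unless the program has changed these settings.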
that a date must consist of the name of the weekday, the name of the month,
the day of the month, and the year, as in Friday, July 12, 1996.
Now imagine you want to improve the input and output operations for the
date class by allowing specification of such format strings. How can you do
this? Other format information is stored in the stream's format state;
consequently, you may want to store the format string for dates somewhere
in the stream as well. And indeed, you can.
Streams have an array for private use. An array element is of a union type
that allows access as a long or as a pointer to void.39 The array is of
unspecified size, and new memory is allocated as needed. In principle, you
can think of it as infinitely long.
You can use this array to store in a stream whatever additional information
you might need. In our example, we would want to store the format string.
The array can be accessed by two functions: iword() and pword(). Both
functions take an index to an array element and return a reference to the
respective element. The function iword() returns a reference to long; the
function pword() allows access to the array element as a pointer to void.
Indices into the array are maintained by the xalloc() function, a static
function in class ios_base that returns the next free index into the array.
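The interplay of xalloc() and pword() can be sketched as follows. The function names are hypothetical, and we assume that the caller owns the format string and keeps it alive for as long as the stream may use it; the sketch stores only the pointer:

```cpp
#include <cassert>
#include <ios>
#include <sstream>
#include <string>

// Reserves one index into the stream's additional storage, once per program.
inline int date_format_index()
{
    static const int index = std::ios_base::xalloc();
    return index;
}

// Stores a pointer to the format string in the stream. The string itself
// is NOT copied; the caller remains responsible for its lifetime.
inline void set_date_format(std::ios_base& str, const char* fmt)
{
    str.pword(date_format_index()) = const_cast<char*>(fmt);
}

// Reads the format string back; falls back to "%x" if none was stored.
inline std::string get_date_format(std::ios_base& str)
{
    const void* p = str.pword(date_format_index());
    return p ? std::string(static_cast<const char*>(p)) : std::string("%x");
}
```

Each stream has its own array, so setting a format string in one stream does not affect any other stream. Note also that the reference returned by pword() is used immediately and never kept.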
//1
//2
//3
//4
reference returned by pword() is only used for storing the pointer to the
date format string. Generally, one should never store a reference
returned by iword() or pword() in order to access the stored data
through this reference later on. This is because these references can
become invalid once the array is reallocated or copied. (See the Class
Reference for more details.)
//3 The inserter for date objects needs to access the index into the array of
pointers, so that it can read the format string and use it. Therefore, the
inserter has to be declared as a friend. In principle, the extractor would
have to be a friend, too; however, the standard C++ locale falls short of
supporting the use of format strings like the ones used by the standard
C function strptime(). Hence, the implementation of a date extractor
that supports date format strings would be a lot more complicated than
the implementation for the inserter, which can use the stream's locale.
We have omitted the extractor for the sake of brevity.
The inserter for date objects given below is almost identical to the one we
described in Section 2.7.3:
template<class charT, class Traits>
basic_ostream<charT, Traits> &
operator << (basic_ostream<charT, Traits >& os, const date& dat)
{
  ios_base::iostate err = ios_base::goodbit;
  char* patt = 0;
  int len = 0;
  charT* fmt = 0;
  try {
    typename basic_ostream<charT, Traits>::sentry opfx(os);
    if(opfx)
    {
      patt = (char*) os.pword(setfmt.datfmtIdx);          //1
      len = strlen(patt);
      fmt = new charT[len];
      use_facet<ctype<charT> >(os.getloc()).
          widen(patt, patt+len, fmt);
      if (use_facet<time_put<charT
                   ,ostreambuf_iterator<charT,Traits> > >
          (os.getloc())
          .put(os,os,os.fill(),&dat.tm_date,fmt,fmt+len)  //2
          .failed()
         )
        err = ios_base::badbit;
      os.width(0);
    }
  } //try
  catch(...)
  {
    delete [] fmt;
    fmt = 0;  // prevents a second delete below if the exception is not rethrown
    bool flag = false;
    try {
      os.setstate(ios_base::failbit);
    }
    catch( const ios_base::failure& ) { flag = true; }
    if ( flag ) throw;
  }
  delete [] fmt;
  if ( err ) os.setstate(err);
  return os;
}
The only change from the previous inserter is that the format string here is
read from the iostream storage (in statement //1) instead of being the fixed
string "%x". The format string is then provided to the locale's time
formatting facet (in statement //2).
2.11.3 Caveat
Note that the solution suggested here has a pitfall.
The manipulator takes the format specification and stores it. The inserter
retrieves it and uses it. In such a situation, the question arises: Who owns
the format string? In other words, who is responsible for creating and
deleting it and hence controlling its lifetime? Neither the manipulator nor
the inserter can own it because they share it.
the setfmt manipulator can be used safely, even with static streams like
cout.
odatstream ostr(cout);
//
ostr << setfmt("%D") << today;
In the next sections, we will explore how we can implement such a derived
stream type.
Choose ios_base if you add information and services that do not depend
on the stream's character type.
are equally rare, because they make sense only if file- or string-related data
or services must be added or modified.
Choose basic_istream <charT,Traits>, basic_ostream <charT,Traits>,
or basic_iostream <charT, Traits> as a base class when deriving new
stream classes, unless you have good reason not to do so.
There are several ways to provide the stream buffer required for constructing
such a stream:
Take the stream buffer from another stream. In the example below, the
stream buffer is borrowed from the standard error stream cerr:
MyOstream<char,char_traits<char> > mostr(cerr.rdbuf());
mostr << "Hello world\n";
Remember that the stream buffer is now shared between mostr and cerr
(see Section 2.9.2 for details).
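For readers who want to try this, here is one plausible minimal shape of a stream class derived from basic_ostream that is constructed from a stream buffer pointer. It is a sketch under our own assumptions, not necessarily the definition used elsewhere in this guide, and the helper write_through_shared_buffer is a hypothetical name. A string buffer stands in for the buffer of cerr, so that the result can be inspected:

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <streambuf>
#include <string>

// A derived output stream that adds nothing yet; it only forwards the
// stream buffer to its base class.
template <class charT, class Traits = std::char_traits<charT> >
class MyOstream : public std::basic_ostream<charT, Traits> {
public:
    explicit MyOstream(std::basic_streambuf<charT, Traits>* sb)
        : std::basic_ostream<charT, Traits>(sb) {}
};

// Small helper: writes through a MyOstream that borrows a string buffer.
inline std::string write_through_shared_buffer(const std::string& text)
{
    std::stringbuf buf;                                   // outlives mostr
    MyOstream<char, std::char_traits<char> > mostr(&buf);
    mostr << text;
    return buf.str();
}
```

The derived stream does not own the buffer. As with any borrowed stream buffer, the buffer must live at least as long as the stream that uses it.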
//3
//4
private:
charT* fmt_;
output stream, so that the two streams will share the stream buffer.
The constructor also takes an optional argument, the date format string.
This is always a sequence of tiny characters, that is, characters of type char.
//2 The format string is widened or translated into the stream's character
type charT. This is because the format string will be provided to the
use_facet(os.getloc(),
(ctype<charT>*)0).widen(patt,patt+3,buf);
}
if (use_facet<time_put<charT,ostreambuf_iterator<charT,Traits> > >
(os.getloc())
.put(os,os,os.fill(),&dat.tm_date,fmt,fmt+Traits::length(fmt))
.failed())
err = ios_base::badbit;
os.width(0);
}
} //try
catch(...)
{
bool flag = false;
try {
os.setstate(ios_base::failbit);
}
catch( const ios_base::failure& ) { flag = true; }
if ( flag ) throw;
}
if ( err ) os.setstate(err);
return os;
}
//1 We will perform a dynamic cast in statement //2. A dynamic cast
Ostream& (*pf_)(Ostream&, Arg);
Arg arg_;
friend Ostream&
operator<< (Ostream& ostr, const osmanip<Ostream,Arg>& manip);
};
template <class Ostream, class Arg>
Ostream& operator<< (Ostream& ostr, const osmanip<Ostream,Arg>& manip)
{
(*manip.pf_)(ostr,manip.arg_);
return ostr;
}
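To see the pieces working together, here is a self-contained sketch of a one-argument manipulator built on the same idea. It is an illustration under stated assumptions, not the guide's setfmt: the manipulator indent(), its function do_indent(), and the helper indent_demo() are hypothetical names. The inserter is defined inside the class as a friend, which is one way to keep the friend declaration and its definition in agreement.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// One-argument output manipulator: stores a function pointer and an
// argument, and applies the function when inserted into a stream.
template <class Ostream, class Arg>
class osmanip
{
public:
    osmanip(Ostream& (*pf)(Ostream&, Arg), Arg arg) : pf_(pf), arg_(arg) {}

    // Inserter defined in place, so it matches the friend declaration.
    friend Ostream& operator<<(Ostream& ostr, const osmanip& manip)
    {
        (*manip.pf_)(ostr, manip.arg_);
        return ostr;
    }

private:
    Ostream& (*pf_)(Ostream&, Arg);
    Arg arg_;
};

// Hypothetical manipulator function (not from the guide): writes n spaces.
inline std::ostream& do_indent(std::ostream& os, int n)
{
    for (int i = 0; i < n; ++i)
        os.put(' ');
    return os;
}

// The manipulator itself: indent(3) yields an object that the inserter
// above turns into the call do_indent(os, 3).
inline osmanip<std::ostream, int> indent(int n)
{
    return osmanip<std::ostream, int>(do_indent, n);
}

// Formats "a", three spaces, and "b" into a string.
inline std::string indent_demo()
{
    std::ostringstream out;
    out << "a" << indent(3) << "b";
    return out.str();
}
```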
//1 The function sfmt() is the function associated with the setfmt
object.
An index into the arrays for additional storage; in other words, "Where do
I find the RTTI?"; and
The content or type identification that all concerned parties expect to find
there; in other words, "What will I find?"
In the sketch below, the derived stream class reserves an index into the
additional storage. The index is a static data member of the derived stream
class, and identifies all objects of that stream class. The content of that
particular slot in the stream's additional storage, which is accessible through
pword(), is expected to be the respective stream object's this pointer.
Here are the modifications to the derived class odatstream:
template <class charT, class Traits=char_traits<charT> >
class odatstream : public basic_ostream <charT,Traits>
{
public:
static int xindex()
{
static int inited = 0;
static int value = 0;
if (!inited)
{
value = ios_base::xalloc();
inited++;
}
return value;
}
the arrays for additional storage. It also serves as the access function to
the index.
//2 The reserved slot in the arrays for additional storage is filled with the
slot in the stream's storage is the stream's address. If it is, the stream is
considered to be a date output stream.
Note that the technique described in this section is not safe. There is no way
to ensure that date output streams and their related functions and classes are
the only ones that access the reserved slot in a date output stream's
additional storage. In principle, every stream object of any type can access
all entries through iword() or pword(). It's up to your programming
discipline to restrict access to the desired functions. It is unlikely, however,
that all streams will make the same assumptions about the storage's content.
Instead of agreeing on each stream object's address as the run-time-type
identification, we also could have stored certain integers, pointers to certain
strings, etc. Remember, it's the combination of reserved index and assumed
content that represents the RTTI substitute.
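The combination of reserved index and assumed content can be sketched as follows. This is a reduced illustration, not the odatstream class itself: tagged_ostream and is_tagged() are hypothetical names, and only the identification logic is shown.

```cpp
#include <cassert>
#include <ios>
#include <ostream>
#include <sstream>
#include <streambuf>

// Hypothetical stand-in for odatstream, reduced to the identification logic.
class tagged_ostream : public std::ostream
{
public:
    explicit tagged_ostream(std::streambuf* sb) : std::ostream(sb)
    {
        // Fill the reserved slot with this stream's own address.
        pword(xindex()) = static_cast<std::ostream*>(this);
    }

    // Reserves the index on first use and returns the same index afterwards.
    static int xindex()
    {
        static const int value = std::ios_base::xalloc();
        return value;
    }
};

// True if the reserved slot of s holds the address of s itself.
inline bool is_tagged(std::ostream& s)
{
    return s.pword(tagged_ostream::xindex()) == static_cast<void*>(&s);
}
```

A stream that never filled the slot finds a null pointer there, so the check fails for it. Note that copyfmt() copies the additional storage as well, so after such a copy the slot of the target stream holds the source stream's address and the check fails for the target too.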
State-dependent conversions.
1. Derive a new facet type from the standard code conversion facet type
codecvt.
2. Specialize the new facet type for the character type char.
3. Implement the member functions that are used by the file buffer.
4.
It is empty because we will specialize the class template for the character
type char.
2.13.2.2 Specialize the New Facet Type and Implement the Member Functions
Each code conversion facet has two main member functions, in() and out():
Function in() is responsible for the conversion done on reading from the
external device; and
The other member functions of a code conversion facet used by a file stream
buffer are:
All public member functions of a facet call the respective, protected virtual
member function, named do_...(). Here is the declaration of the specialized
facet type:
template <>
class AsciiEbcdicConversion<char, char, mbstate_t>
: public codecvt<char, char, mbstate_t>
{
protected:
result do_in(mbstate_t& state,
             const char* from, const char* from_end, const char*& from_next,
             char* to, char* to_limit, char*& to_next) const;
result do_out(mbstate_t& state,
              const char* from, const char* from_end, const char*& from_next,
              char* to, char* to_limit, char*& to_next) const;
// A conversion is always necessary.
bool do_always_noconv() const throw()
{ return false; }
// Each external character corresponds to exactly one internal character.
int do_encoding() const throw()
{ return 1; }
};
For the sake of brevity, we implement only those functions used by Rogue
Wave's implementation of file stream buffers. If you want to provide a code
conversion facet that is more widely usable, you would also have to
implement the functions do_length() and do_max_length().
The implementation of the functions do_in() and do_out() is
straightforward. Each of the functions translates a sequence of characters in
the range [from,from_end) into the corresponding sequence [to,to_next),
which must fit into the available range [to,to_limit).
The pointers from_next and to_next point one beyond the last character
successfully converted. In principle, you can do whatever you want, or
whatever it takes, in these functions. However, for effective communication
with the file stream buffer, it is important to indicate success or failure
properly.
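The following sketch shows one way such a pair of functions might report success and failure. It is a toy under stated assumptions: the class CaseSwapConversion and the helper convert_in() are hypothetical, and swapping the case of ASCII letters merely stands in for real ASCII/EBCDIC translation tables. The sketch derives directly from codecvt<char,char,mbstate_t> and uses C++11 syntax (override, noexcept).

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <locale>
#include <string>

// Toy facet: swaps the case of ASCII letters in both directions. The case
// swap is only a hypothetical stand-in for real ASCII/EBCDIC tables.
class CaseSwapConversion : public std::codecvt<char, char, std::mbstate_t>
{
public:
    explicit CaseSwapConversion(std::size_t refs = 0)
        : std::codecvt<char, char, std::mbstate_t>(refs) {}

protected:
    result do_in(std::mbstate_t&,
                 const char* from, const char* from_end, const char*& from_next,
                 char* to, char* to_limit, char*& to_next) const override
    { return convert(from, from_end, from_next, to, to_limit, to_next); }

    result do_out(std::mbstate_t&,
                  const char* from, const char* from_end, const char*& from_next,
                  char* to, char* to_limit, char*& to_next) const override
    { return convert(from, from_end, from_next, to, to_limit, to_next); }

    bool do_always_noconv() const noexcept override { return false; }
    int do_encoding() const noexcept override { return 1; }

private:
    static char swap_case(char c)
    {
        if (c >= 'a' && c <= 'z') return static_cast<char>(c - 'a' + 'A');
        if (c >= 'A' && c <= 'Z') return static_cast<char>(c - 'A' + 'a');
        return c;
    }

    static result convert(const char* from, const char* from_end,
                          const char*& from_next,
                          char* to, char* to_limit, char*& to_next)
    {
        from_next = from;
        to_next = to;
        while (from_next != from_end && to_next != to_limit)
            *to_next++ = swap_case(*from_next++);
        // ok: all input was consumed; partial: the output range was too small.
        return from_next == from_end ? ok : partial;
    }
};

// Runs in() on text with room for `room` output characters and returns
// what was written; res receives the result code.
inline std::string convert_in(const std::string& text, std::size_t room,
                              std::codecvt_base::result& res)
{
    CaseSwapConversion cvt(1);  // refs = 1: lifetime is managed by this scope
    std::mbstate_t state = std::mbstate_t();
    std::string out(room, '\0');
    const char* from_next = 0;
    char* to_next = 0;
    res = cvt.in(state, text.data(), text.data() + text.size(), from_next,
                 &out[0], &out[0] + room, to_next);
    out.resize(static_cast<std::size_t>(to_next - &out[0]));
    return out;
}
```

With enough room the call reports ok; with a destination that is too small it converts what fits and reports partial, leaving from_next and to_next just beyond the last converted character.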
fstream inout("/tmp/fil");                              //1
AsciiEbcdicConversion<char,char,mbstate_t> cvtfac;
locale cvtloc(locale(),&cvtfac);
inout.rdbuf()->pubimbue(cvtloc);                        //2
cout << inout.rdbuf();                                  //3
//1 When a file is created, a snapshot of the current global locale is attached
as the default locale. Remember that a stream has two locale objects:
one used for formatting numeric items, and a second used by the
stream's buffer for code conversions.
//2 Here the stream buffer's locale is replaced by a copy of the global locale
ok, which should obviously be returned when the conversion went fine.
2.
3.
4. Instantiate new stream types using the new character traits type.
5. Imbue a file stream's buffer with a locale that carries the new code
conversion facet.
However, if you do not want to rely on a non-standard and thus nonportable feature of the library, you have to define a new character traits type
and redefine the necessary types:
In this case, the function do_encoding()has to return -1, which identifies the
code conversion as state-dependent. Again, the functions in() and out()
have to conform to the error indication policy explained under class
codecvt in the Class Reference.
The distinguishing characteristic of a state-dependent conversion is that
the conversion state argument to in() and out() is used for communication
between the file stream buffer and the code conversion facet. The file stream
buffer is responsible for creating, maintaining, and deleting the conversion
state. At the beginning, the file stream buffer creates a conversion state
object that represents the initial conversion state and hands it over to the
code conversion facet. The facet modifies it according to the conversion it
performs. The file stream buffer receives it and stores it between two
subsequent code conversions.
2.14.2 Internationalization
Another new feature of the standard iostreams is internationalization.
Traditional iostreams were incapable of adjusting to local conventions.
Output of numerical items was always done following the US English
conventions for number formatting. The new iostreams are internationalized
to allow for local conventions. They use the standard locales described in the
section on locales.
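As a small illustration of a stream adjusting to a local convention, the sketch below imbues a stream with a locale whose numeric punctuation facet uses a decimal comma. The facet comma_numpunct and the helper format_with_comma() are hypothetical names chosen for this example. A custom facet is used instead of a named locale so that the sketch does not depend on which locales are installed.

```cpp
#include <cassert>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet: a decimal comma instead of the decimal point.
class comma_numpunct : public std::numpunct<char>
{
protected:
    char do_decimal_point() const override { return ','; }
};

// Formats value with a stream whose locale carries the facet above.
inline std::string format_with_comma(double value)
{
    std::ostringstream out;
    // The locale takes ownership of the facet and deletes it when the
    // last locale referring to it is destroyed.
    out.imbue(std::locale(std::locale::classic(), new comma_numpunct));
    out << value;
    return out.str();
}
```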
2.15.1 Extensions
Rogue Wave's implementation of the standard iostreams has several
extensions that we will describe briefly in the sections below.
2.15.2 Restrictions
Rogue Wave's implementation of the standard iostreams has several
restrictions, most of which correspond to the limited capabilities of current
compilers in handling Standard C++. These restrictions include:
Member templates;
Appendix: Implementation Dependencies and Open Issues in the Standard
Implementation-Dependent Behavior
1.
2.
3.
4.
2.