Serial-Spec in Java
Revision 1.4.4
Copyright 1996 - 2001 Sun Microsystems, Inc.
901 San Antonio Road, Palo Alto, California 94303-4900 U.S.A.
All rights reserved. Copyright in this document is owned by Sun Microsystems, Inc.
Sun Microsystems, Inc. (SUN) hereby grants to you a fully-paid, nonexclusive, nontransferable, perpetual,
worldwide limited license (without the right to sublicense) under SUN's intellectual property rights that are
essential to practice this specification. This license allows and is limited to the creation and distribution of clean
room implementations of this specification that (i) include a complete implementation of the current version of
this specification without subsetting or supersetting, (ii) implement all the interfaces and functionality of the
standard java.* packages as defined by SUN, without subsetting or supersetting, (iii) do not add any additional
packages, classes or methods to the java.* packages (iv) pass all test suites relating to the most recent published
version of this specification that are available from SUN six (6) months prior to any beta release of the clean
room implementation or upgrade thereto, (v) do not derive from SUN source code or binary materials, and (vi)
do not include any SUN binary materials without an appropriate and separate license from SUN.
RESTRICTED RIGHTS LEGEND
Use, duplication, or disclosure by the U.S. Government is subject to restrictions of FAR 52.227-14(g)(2)(6/87)
and FAR 52.227-19(6/87), or DFAR 252.227-7015(b)(6/95) and DFAR 227.7202-1(a).
TRADEMARKS
Sun, the Sun logo, Sun Microsystems, JavaBeans, JDK, Java, HotJava, the Java Coffee Cup logo, Java WorkShop,
Visual Java, Solaris, NEO, Joe, Netra, NFS, ONC, ONC+, OpenWindows, PC-NFS, SNM, SunNet Manager,
Solaris sunburst design, Solstice, SunCore, SolarNet, SunWeb, Sun Workstation, The Network Is The
Computer, ToolTalk, Ultra, Ultracomputing, Ultraserver, Where The Network Is Going, Sun WorkShop,
XView, Java WorkShop, the Java Coffee Cup logo, and Visual Java are trademarks or registered trademarks of
Sun Microsystems, Inc. in the United States and other countries.
UNIX is a registered trademark in the United States and other countries, exclusively licensed through
X/Open Company, Ltd. OPEN LOOK® is a registered trademark of Novell, Inc.
All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC
International, Inc. in the United States and other countries. Products bearing SPARC trademarks are based upon an
architecture developed by Sun Microsystems, Inc.
THIS PUBLICATION IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER
EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT.
For further information on Intellectual Property matters contact Sun Legal Department.
Change History
Aug. 16, 2001 Updates for Java™ 2 SDK, Standard Edition, v1.4 Beta 2
• Added support for class-defined readObjectNoData methods, to be used for
initializing serializable class fields in cases not covered by class-defined readObject
methods. See Section 3.5, “The readObjectNoData Method”, as well as Appendix
A, “Security in Object Serialization”.
• New methods ObjectOutputStream.writeUnshared and
ObjectInputStream.readUnshared provide a mechanism for ensuring unique
references to deserialized objects. See Section 2.1, “The ObjectOutputStream
Class”, Section 3.1, “The ObjectInputStream Class”, as well as Appendix A,
“Security in Object Serialization”.
• Documented new security checks in the one-argument constructors for
ObjectOutputStream and ObjectInputStream. See Section 2.1, “The
ObjectOutputStream Class” and Section 3.1, “The ObjectInputStream Class”.
• Added caution against using inner classes for serialization in Section 1.10, “The
Serializable Interface”.
• Clarified requirement that class-defined writeObject methods invoke
ObjectOutputStream.defaultWriteObject or writeFields once before
writing optional data, and that class-defined readObject methods invoke
ObjectInputStream.defaultReadObject or readFields once before reading
optional data. See Section 2.3, “The writeObject Method” and Section 3.4, “The
readObject Method”.
• Clarified the behavior of ObjectInputStream when class-defined readObject or
readExternal methods attempt read operations which exceed the bounds of
available data; see Section 3.4, “The readObject Method” and Section 3.6, “The
readExternal Method”.
• Clarified the description of non-proxy class descriptor field type strings to require
that they be written in “field descriptor” format; see Section 6.2, “Stream
Elements”.
July 30, 1999 Updates for Java™ 2 SDK, Standard Edition, v1.3 Beta
• Added the ability to write String objects for which the UTF encoding is longer
than 65535 bytes in length. See Section 6.2, “Stream Elements”.
• New methods ObjectOutputStream.writeClassDescriptor and
ObjectInputStream.readClassDescriptor provide a means of
customizing the serialized representation of ObjectStreamClass class
descriptors. See Section 2.1, “The ObjectOutputStream Class” and Section 3.1,
“The ObjectInputStream Class”.
• Expanded Appendix A, “Security in Object Serialization”.
Feb. 6, 1998 Updates for JDK™ 1.2 Beta 3
• Introduced the concept of STREAM_PROTOCOL versions. Added the
STREAM_PROTOCOL_2 version to indicate a new format for Externalizable
objects that enables an Externalizable object within the stream to be skipped,
even when the object’s class is not available in the local Virtual Machine.
Compatibility issues are discussed in Section 6.3, “Stream Protocol Versions.”
• The ObjectInputStream.resolveClass method can return a local class in a
different package than the name of the class within the stream. This capability
enables renaming of packages between releases. The serialVersionUID and the
base class name must be the same in the stream and in the local version of the class.
See Section 3.1, “The ObjectInputStream Class.”
• Allow substitution of String or array objects when writing them to or reading
them from the stream. See Section 2.1, “The ObjectOutputStream Class” and
Section 3.1, “The ObjectInputStream Class.”
July 3, 1997 Updates for JDK™ 1.2 Alpha
• Documented the requirements for specifying the serialized state of classes. See
Section 1.5, “Defining Serializable Fields for a Class.”
• Added the Serializable Fields API to allow classes more flexibility in accessing the
serialized fields of a class. The stream protocol is unchanged. See Section 1.7,
“Accessing Serializable Fields of a Class,” Section 2.2, “The
ObjectOutputStream.PutField Class,” and Section 3.2, “The
ObjectInputStream.GetField Class.”
• Clarified that field descriptors and data are written to and read from the stream in
canonical order. See Section 4.1, “The ObjectStreamClass Class.”
Table of Contents
1 System Architecture
1.1 Overview
1.2 Writing to an Object Stream
1.3 Reading from an Object Stream
1.4 Object Streams as Containers
1.5 Defining Serializable Fields for a Class
1.6 Documenting Serializable Fields and Data for a Class
1.7 Accessing Serializable Fields of a Class
1.8 The ObjectOutput Interface
1.9 The ObjectInput Interface
1.10 The Serializable Interface
1.11 The Externalizable Interface
1.12 Protecting Sensitive Information
2 Object Output Classes
2.1 The ObjectOutputStream Class
2.2 The ObjectOutputStream.PutField Class
2.3 The writeObject Method
2.4 The writeExternal Method
2.5 The writeReplace Method
2.6 The useProtocolVersion Method
3 Object Input Classes
3.1 The ObjectInputStream Class
3.2 The ObjectInputStream.GetField Class
3.3 The ObjectInputValidation Interface
3.4 The readObject Method
3.5 The readObjectNoData Method
3.6 The readExternal Method
3.7 The readResolve Method
4 Class Descriptors
4.1 The ObjectStreamClass Class
4.2 Dynamic Proxy Class Descriptors
4.3 Serialized Form
4.4 The ObjectStreamField Class
4.5 Inspecting Serializable Classes
4.6 Stream Unique Identifiers
5 Versioning of Serializable Objects
5.1 Overview
5.2 Goals
5.3 Assumptions
Topics:
• Overview
• Writing to an Object Stream
• Reading from an Object Stream
• Object Streams as Containers
• Defining Serializable Fields for a Class
• Documenting Serializable Fields and Data for a Class
• Accessing Serializable Fields of a Class
• The ObjectOutput Interface
• The ObjectInput Interface
• The Serializable Interface
• The Externalizable Interface
• Protecting Sensitive Information
1.1 Overview
The ability to store and retrieve Java™ objects is essential to building all but the most
transient applications. The key to storing and retrieving objects in a serialized form is
representing the state of objects sufficient to reconstruct the object(s). Objects to be
saved in the stream may support either the Serializable or the Externalizable
interface. For Java™ objects, the serialized form must be able to identify and verify the
Java™ class from which the contents of the object were saved and to restore the
contents to a new instance. For serializable objects, the stream includes sufficient
information to restore the fields in the stream to a compatible version of the class. For
Externalizable objects, the class is solely responsible for the external format of its
contents.
Objects to be stored and retrieved frequently refer to other objects. Those other objects
must be stored and retrieved at the same time to maintain the relationships between the
objects. When an object is stored, all of the objects that are reachable from that object
are stored as well.
The writeObject method (see Section 2.3, “The writeObject Method”) serializes the
specified object and traverses its references to other objects in the object graph
recursively to create a complete serialized representation of the graph. Within a stream,
the first reference to any object results in the object being serialized or externalized
and the assignment of a handle for that object. Subsequent references to that object are
encoded as the handle.
Primitive data types are written to the stream with the methods in the DataOutput
interface, such as writeInt, writeFloat, or writeUTF. Individual bytes and arrays
of bytes are written with the methods of OutputStream. Except for serializable
fields, primitive data is written to the stream in block-data records, with each record
prefixed by a marker and an indication of the number of bytes in the record.
The readObject method deserializes the next object in the stream and traverses its
references to other objects recursively to create the complete graph of objects
serialized.
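The traversal described above can be sketched with a small round trip. The Node and GraphRoundTrip classes below are hypothetical and exist only for illustration; the sketch assumes an in-memory byte array as the stream target.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class used only for illustration.
class Node implements Serializable {
    private static final long serialVersionUID = 1L;
    final String name;
    Node next;                       // may point back to an earlier node

    Node(String name) { this.name = name; }
}

class GraphRoundTrip {
    /** Writes root and everything reachable from it, then reads the graph back. */
    static Node copy(Node root) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(root);   // traverses the references recursively
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Node) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Node a = new Node("a");
        Node b = new Node("b");
        a.next = b;
        b.next = a;                  // circular reference
        Node a2 = copy(a);
        // The copy consists of new instances, and the cycle is preserved.
        System.out.println(a2 != a && a2.next.next == a2);
    }
}
```

Because the second reference to a node is written as a handle rather than as a second copy, the cycle does not cause unbounded recursion and the restored graph has the same shape as the original.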
Primitive data types are read from the stream with the methods in the DataInput
interface, such as readInt, readFloat, or readUTF. Individual bytes and arrays of
bytes are read with the methods of InputStream. Except for serializable fields,
primitive data is read from block-data records.
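As a minimal sketch of primitive data in an object stream, the hypothetical PrimitiveRecords class below writes a few primitive values and raw bytes, then reads them back in the same order. The values chosen are arbitrary.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

// Hypothetical helper used only for illustration.
class PrimitiveRecords {
    static byte[] write() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeInt(42);                  // DataOutput methods
            out.writeFloat(1.5f);
            out.writeUTF("hello");
            out.write(new byte[] {1, 2, 3});   // OutputStream method
        }
        return bytes.toByteArray();
    }

    static String read(byte[] data) throws IOException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            int i = in.readInt();              // DataInput methods, same order as written
            float f = in.readFloat();
            String s = in.readUTF();
            byte[] raw = new byte[3];
            in.readFully(raw);
            return i + " " + f + " " + s + " " + raw[2];
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(read(write()));
    }
}
```

The reader must consume the values in the order and with the types used by the writer; the block-data records carry byte counts, not type information.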
Each object that acts as a container implements an interface which allows primitives
and objects to be stored in or retrieved from it. These interfaces are the
ObjectOutput and ObjectInput interfaces which:
• Provide a stream to write to and to read from
• Handle requests to write primitive types and objects to the stream
• Handle requests to read primitive types and objects from the stream
Each object which is to be stored in a stream must explicitly allow itself to be stored
and must implement the protocols needed to save and restore its state. Object
Serialization defines two such protocols. The protocols allow the container to ask the
object to write and read its state.
Note – There is, however, a limitation to the use of this mechanism to specify
serializable fields for inner classes. Inner classes can only contain final static fields that
are initialized to constants or expressions built up from constants. Consequently, it is
not possible to set serialPersistentFields for an inner class (though it is
possible to set it for static member classes). For other restrictions pertaining to
serialization of inner class instances, see Section 1.10, “The Serializable
Interface”.
@serial field-description
The optional field-description describes the meaning of the field and its acceptable
values. The field-description can span multiple lines. When a field is added after the
initial release, a @since tag indicates the version in which the field was added. The
field-description for @serial provides serialization-specific documentation and is
appended to the javadoc comment for the field within the serialized form
documentation.
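• The @serialField tag is used to document an ObjectStreamField component
of a serialPersistentFields array. One of these tags should be used for each
ObjectStreamField component. The syntax is as follows:
@serialField field-name field-type field-description
• The @serialData tag describes the sequences and types of data written or read.
The syntax is as follows:
@serialData data-description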
The javadoc application recognizes the serialization javadoc tags and generates a
specification for each Serializable and Externalizable class. See Section C.1, “Example
Alternate Implementation of java.io.File” for an example that uses these tags.
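A brief sketch of how the serialization javadoc tags might be placed in source follows. The Reading and Label classes, their fields, and the wording of the descriptions are all hypothetical; only the tag names come from this specification.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamField;
import java.io.Serializable;

/** Hypothetical class whose default serializable field is documented with @serial. */
class Reading implements Serializable {
    private static final long serialVersionUID = 1L;

    /**
     * Sensor value in millivolts.
     *
     * @serial Non-negative; 0 means "no signal".
     */
    private int millivolts;

    /** Rebuilt from the optional data, so it is not a serializable field. */
    private transient long[] timestamps;

    Reading(int millivolts, long... timestamps) {
        this.millivolts = millivolts;
        this.timestamps = timestamps.clone();
    }

    int millivolts() { return millivolts; }
    long[] timestamps() { return timestamps.clone(); }

    /**
     * @serialData The default fields are written first, followed by optional
     *             data: an int count, then that many long timestamps.
     */
    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        out.writeInt(timestamps.length);
        for (long t : timestamps) {
            out.writeLong(t);
        }
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        timestamps = new long[in.readInt()];
        for (int i = 0; i < timestamps.length; i++) {
            timestamps[i] = in.readLong();
        }
    }
}

/** Hypothetical class whose serializable fields are declared explicitly. */
class Label implements Serializable {
    private static final long serialVersionUID = 1L;

    /**
     * @serialField text String The label text; never null.
     * @serialField priority int Display priority; higher values are shown first.
     */
    private static final ObjectStreamField[] serialPersistentFields = {
        new ObjectStreamField("text", String.class),
        new ObjectStreamField("priority", int.class)
    };

    private String text = "";
    private int priority;
}
```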
When a class is declared Serializable, the serializable state of the object is defined by
serializable fields (by name and type) plus optional data. Optional data can only be
written explicitly by the writeObject method of a Serializable class. Optional
data can be read by the Serializable class’s readObject method; otherwise,
serialization skips any unread optional data.
When a class is declared Externalizable, the data that is written to the stream by the
class itself defines the serialized state. The class must specify the order, types, and
meaning of each datum that is written to the stream. The class must handle its own
evolution, so that it can continue to read data written by and write data that can be read
by previous versions. The class must coordinate with the superclass when saving and
restoring data. The location of the superclass’s data in the stream must be specified.
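One way an Externalizable class might manage its own format and evolution is sketched below. The Point class and its leading format-version integer are illustrative assumptions, not a convention required by this specification.

```java
import java.io.Externalizable;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInput;
import java.io.ObjectOutput;

// Hypothetical class; the version tag is an illustrative convention.
class Point implements Externalizable {
    private static final int FORMAT_VERSION = 2;
    private int x;
    private int y;
    private int z;                   // added in format version 2

    /** Externalizable classes need a public no-arg constructor. */
    public Point() { }

    Point(int x, int y, int z) {
        this.x = x;
        this.y = y;
        this.z = z;
    }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        // The class alone decides the order, types, and meaning of the data.
        out.writeInt(FORMAT_VERSION);
        out.writeInt(x);
        out.writeInt(y);
        out.writeInt(z);
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        int version = in.readInt();
        if (version < 1 || version > FORMAT_VERSION) {
            throw new InvalidObjectException("unsupported format version " + version);
        }
        x = in.readInt();
        y = in.readInt();
        z = (version >= 2) ? in.readInt() : 0;   // data from version 1 has no z
    }

    @Override
    public String toString() { return x + "," + y + "," + z; }
}
```

Writing the version first lets a later release of the class recognize and read data produced by an earlier one.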
The default mechanism is used automatically when reading or writing objects that
implement the Serializable interface and do no further customization. The
serializable fields are mapped to the corresponding fields of the class and values are
either written to the stream from those fields or are read in and assigned respectively.
If the class provides writeObject and readObject methods, the default
mechanism can be invoked by calling defaultWriteObject and
defaultReadObject. When the writeObject and readObject methods are
implemented, the class has an opportunity to modify the serializable field values before
they are written or after they are read.
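The following sketch shows class-defined writeObject and readObject methods that invoke the default mechanism and adjust a field around it. The Username class and its normalization rules are hypothetical.

```java
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Locale;

// Hypothetical class used only for illustration.
class Username implements Serializable {
    private static final long serialVersionUID = 1L;
    private String name;

    Username(String name) { this.name = name; }

    String name() { return name; }

    private void writeObject(ObjectOutputStream out) throws IOException {
        name = name.trim();           // adjust the field before it is written
        out.defaultWriteObject();     // default mechanism writes the serializable fields
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();       // default mechanism restores the serializable fields
        if (name == null || name.isEmpty()) {
            throw new InvalidObjectException("name is missing");
        }
        name = name.toLowerCase(Locale.ROOT);   // adjust the field after it is read
    }
}
```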
When the default mechanism cannot be used, the serializable class can use the
putFields method of ObjectOutputStream to put the values for the serializable
fields into the stream. The writeFields method of ObjectOutputStream puts the
values in the correct order, then writes them to the stream using the existing protocol
for serialization. Correspondingly, the readFields method of
ObjectInputStream reads the values from the stream and makes them available to
the class by name in any order. See Section 2.2, “The ObjectOutputStream.PutField
Class” and Section 3.2, “The ObjectInputStream.GetField Class,” for a detailed
description of the Serializable Fields API.
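A minimal sketch of the Serializable Fields API follows, assuming a hypothetical Money class whose serialized form (two fields named dollars and cents) differs from its in-memory form (a single total).

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamField;
import java.io.Serializable;

// Hypothetical class used only for illustration.
class Money implements Serializable {
    private static final long serialVersionUID = 1L;

    // The serialized form: two named fields that do not exist as Java fields.
    private static final ObjectStreamField[] serialPersistentFields = {
        new ObjectStreamField("dollars", long.class),
        new ObjectStreamField("cents", int.class)
    };

    private long totalCents;          // in-memory form only

    Money(long totalCents) { this.totalCents = totalCents; }

    long totalCents() { return totalCents; }

    private void writeObject(ObjectOutputStream out) throws IOException {
        ObjectOutputStream.PutField fields = out.putFields();
        fields.put("cents", (int) (totalCents % 100));   // fields may be set in any order
        fields.put("dollars", totalCents / 100);
        out.writeFields();                               // written in canonical order
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        ObjectInputStream.GetField fields = in.readFields();
        long dollars = fields.get("dollars", 0L);        // second argument is the default
        int cents = fields.get("cents", 0);
        totalCents = dollars * 100 + cents;
    }
}
```

Because the mapping between stream fields and Java fields is done by hand, the class can later change its in-memory representation without changing the stream format.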
The writeObject method is used to write an object. The exceptions thrown reflect
errors while accessing the object or its fields, or exceptions that occur in writing to
storage. If any exception is thrown, the underlying storage may be corrupted. If this
occurs, refer to the object that is implementing this interface for more information.
The readObject method is used to read and return an object. The exceptions thrown
reflect errors while accessing the object or its fields, or exceptions that occur in
reading from the storage. If any exception is thrown, the underlying storage may be
corrupted. If this occurs, refer to the object implementing this interface for additional
information.
Note – Serialization of inner classes (i.e., nested classes that are not static member
classes), including local and anonymous classes, is strongly discouraged for several
reasons. Because inner classes declared in non-static contexts contain implicit non-
transient references to enclosing class instances, serializing such an inner class
instance will result in serialization of its associated outer class instance as well.
Synthetic fields generated by javac (or other Java™ compilers) to implement inner
classes are implementation dependent and may vary between compilers; differences in
such fields can disrupt compatibility as well as result in conflicting default
serialVersionUID values. The names assigned to local and anonymous inner
classes are also implementation dependent and may differ between compilers. Since
inner classes cannot declare static members other than compile-time constant fields,
they cannot use the serialPersistentFields mechanism to designate serializable
fields. Finally, because inner classes associated with outer instances do not have zero-
argument constructors (constructors of such inner classes implicitly accept the
enclosing instance as a prepended parameter), they cannot implement
Externalizable. None of the issues listed above, however, apply to static member
classes.
Note – The writeExternal and readExternal methods are public and raise the
risk that a client may be able to write or read information in the object other than by
using its methods and fields. These methods must be used only when the information
held by the object is not sensitive or when exposing it does not present a security risk.
Note – Inner classes associated with enclosing instances cannot have no-arg
constructors, since constructors of such classes implicitly accept the enclosing instance
as a prepended parameter. Consequently the Externalizable interface mechanism
cannot be used for inner classes and they should implement the Serializable
interface, if they must be serialized. Several limitations exist for serializable inner
classes as well, however; see Section 1.10, “The Serializable Interface”, for a full
enumeration.
The easiest technique is to mark fields that contain sensitive data as private
transient. Transient fields are not persistent and will not be saved by any
persistence mechanism. Marking the field will prevent the state from appearing in the
stream and from being restored during deserialization. Since writing and reading (of
private fields) cannot be superseded outside of the class, the transient fields of the class
are safe.
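As a sketch of this technique, the hypothetical Credentials class below marks its password field private transient; after a round trip the field holds its default value.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class used only for illustration.
class Credentials implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String user;
    private transient char[] password;   // never written to the stream

    Credentials(String user, char[] password) {
        this.user = user;
        this.password = password;
    }

    String user() { return user; }
    boolean hasPassword() { return password != null; }

    /** Serializes and deserializes the given object in memory. */
    static Credentials copy(Credentials original) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Credentials) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Credentials restored = copy(new Credentials("alice", "s3cret".toCharArray()));
        // The user name survives; the transient password is reset to null.
        System.out.println(restored.user() + " " + restored.hasPassword());
    }
}
```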
Particularly sensitive classes should not be serialized at all. To accomplish this, the
object should not implement either the Serializable or the Externalizable
interface.
Some classes may find it beneficial to allow writing and reading but specifically handle
and revalidate the state as it is deserialized. The class should implement
writeObject and readObject methods to save and restore only the appropriate
state. If access should be denied, throwing a NotSerializableException will
prevent further access.
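Denying access in this way might look like the following sketch. Profile and AdminProfile are hypothetical; the subclass inherits serializability from its superclass and refuses it explicitly.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical serializable base class.
class Profile implements Serializable {
    private static final long serialVersionUID = 1L;
    String displayName = "guest";
}

// Hypothetical sensitive subclass that must never be written or read.
class AdminProfile extends Profile {
    private static final long serialVersionUID = 1L;

    private void writeObject(ObjectOutputStream out) throws IOException {
        throw new NotSerializableException(AdminProfile.class.getName());
    }

    private void readObject(ObjectInputStream in) throws IOException {
        throw new NotSerializableException(AdminProfile.class.getName());
    }
}

class DenyDemo {
    /** Returns false if the object refuses to be serialized. */
    static boolean canSerialize(Object obj) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return true;
        } catch (NotSerializableException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(canSerialize(new Profile()));
        System.out.println(canSerialize(new AdminProfile()));
    }
}
```

Throwing from readObject as well guards against a stream that was forged rather than produced by writeObject.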
Topics:
• The ObjectOutputStream Class
• The ObjectOutputStream.PutField Class
• The writeObject Method
• The writeExternal Method
• The writeReplace Method
• The useProtocolVersion Method
public void writeUnshared(Object obj)
throws IOException;
public void writeFields()
throws IOException;
public void write(byte b[], int off, int len) throws IOException;
protected ObjectOutputStream()
throws IOException;
protected void writeObjectOverride(Object obj)
throws NotActiveException, IOException;
}
2. If there is data in the block-data buffer, the data is written to the stream and
the buffer is reset.
3. If the object is null, null is put in the stream and writeObject returns.
4. If the object has been previously replaced, as described in Step 8, write the
handle of the replacement to the stream and writeObject returns.
5. If the object has already been written to the stream, its handle is written to the
stream and writeObject returns.
If the replacement object is not one of the types covered by Steps 3 through 7,
processing resumes using the replacement object at Step 10.
11. For regular objects, the ObjectStreamClass for the class of the object is
written by recursively calling writeObject. It will appear in the stream only
the first time it is referenced. A handle is assigned for this object. Starting in
Java™ 2 SDK, Standard Edition, v1.3, writeObject calls
writeClassDescriptor to output the ObjectStreamClass object.
Exceptions may occur during the traversal or may occur in the underlying stream. For
any subclass of IOException, the exception is written to the stream using the
exception protocol and the stream state is discarded. If a second IOException is
thrown while attempting to write the first exception into the stream, the stream is left
in an unknown state and StreamCorruptedException is thrown from
writeObject. For other exceptions, the stream is aborted and left in an unknown and
unusable state.
While writing an object via writeUnshared does not in itself guarantee a unique
reference to the object when it is deserialized, it allows a single object to be defined
multiple times in a stream, so that multiple calls to the
ObjectInputStream.readUnshared method (see Section 3.1, “The
ObjectInputStream Class”) by the receiver will not conflict. Note that the rules
described above only apply to the base-level object written with writeUnshared, and
not to any transitively referenced sub-objects in the object graph to be serialized.
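A small sketch of this behavior, using a hypothetical UnsharedDemo helper: the same list is written twice with writeUnshared, and the receiver obtains two distinct instances with readUnshared.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;

// Hypothetical helper used only for illustration.
class UnsharedDemo {
    /** Returns true if the two reads yield equal but distinct objects. */
    static boolean distinctCopies() throws IOException, ClassNotFoundException {
        ArrayList<String> list = new ArrayList<>();
        list.add("x");

        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeUnshared(list);
            out.writeUnshared(list);   // defined in full again, not as a back-reference
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            Object first = in.readUnshared();
            Object second = in.readUnshared();
            return first != second && first.equals(second);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(distinctCopies());
    }
}
```

Only the list itself is treated as unshared here; the string it contains is still written once and referenced by handle the second time.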
The putFields method returns a PutField object the caller uses to set the values of
the serializable fields in the stream. The fields may be set in any order. After all of the
fields have been set, writeFields must be called to write the field values in the
canonical order to the stream. If a field is not set, the default value appropriate for its
type will be written to the stream.
The reset method resets the stream state to be the same as if it had just been
constructed. Reset will discard the state of any objects already written to the stream.
The current point in the stream is marked as reset, so the corresponding
ObjectInputStream will reset at the same point. Objects previously written to the
stream will not be remembered as already having been written to the stream. They will
be written to the stream again. This is useful when the contents of an object or objects
must be sent again. Reset may not be called while objects are being serialized. If
called inappropriately, an IOException is thrown.
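The effect of reset can be sketched as follows. The hypothetical ResetDemo helper writes the same array three times, changing it after the first write and calling reset before the third.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

// Hypothetical helper used only for illustration.
class ResetDemo {
    /** Returns {first value, second value, third value, 1 if first and second are the same object}. */
    static int[] demo() throws IOException, ClassNotFoundException {
        int[] data = {1};
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(data);   // written in full, value 1
            data[0] = 2;
            out.writeObject(data);   // only a handle is written; the change is not sent
            out.reset();             // forget which objects were already written
            out.writeObject(data);   // written in full again, value 2
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            int[] first = (int[]) in.readObject();
            int[] second = (int[]) in.readObject();
            int[] third = (int[]) in.readObject();
            return new int[] {first[0], second[0], third[0], first == second ? 1 : 0};
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(java.util.Arrays.toString(demo()));
    }
}
```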
The annotateClass method is called while a Class is being serialized, and after
the class descriptor has been written to the stream. Subclasses may extend this method
and write other information to the stream about the class. This information must be
read by the resolveClass method in a corresponding ObjectInputStream
subclass.
When objects are being replaced, the subclass must ensure that the substituted object is
compatible with every field where the reference will be stored, or that a
complementary substitution will be made during deserialization. Objects whose type is
not a subclass of the type of the field or array element will later abort the
deserialization by raising a ClassCastException, and the reference will not be
stored.
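A sketch of object replacement in a stream subclass follows. RedactingOutputStream and the "secret:" prefix are hypothetical; a String is replaced only by another String, so the substitute remains compatible with any field or array element that held the original.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;

// Hypothetical subclass used only for illustration.
class RedactingOutputStream extends ObjectOutputStream {
    RedactingOutputStream(OutputStream out) throws IOException {
        super(out);
        enableReplaceObject(true);    // replaceObject is not called unless enabled
    }

    @Override
    protected Object replaceObject(Object obj) {
        if (obj instanceof String && ((String) obj).startsWith("secret:")) {
            return "secret:***";      // same type as the original, so type-compatible
        }
        return obj;
    }

    /** Writes the array through the redacting stream and reads it back normally. */
    static Object[] roundTrip(Object[] values) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new RedactingOutputStream(bytes)) {
            out.writeObject(values);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Object[]) in.readObject();
        }
    }
}
```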
The writeStreamHeader method writes the magic number and version to the
stream. This information must be read by the readStreamHeader method of
ObjectInputStream. Subclasses may need to implement this method to identify the
stream’s unique format.
The flush method is used to empty any buffers being held by the stream and to
forward the flush to the underlying stream. The drain method may be used by
subclassers to empty only the ObjectOutputStream’s buffers without forcing the
underlying stream to be flushed.
Each subclass of a serializable object may define its own writeObject method. If a
class does not implement the method, the default serialization provided by
defaultWriteObject will be used. When implemented, the class is only
responsible for writing its own fields, not those of its supertypes or subtypes.
The responsibility for the format, structure, and versioning of the optional data lies
completely with the class.
A new default format for writing Externalizable data was introduced in JDK™
1.2. The new format specifies that primitive data will be written in block data mode by
writeExternal methods. Additionally, a tag denoting the end of the Externalizable object
is appended to the stream after the writeExternal method returns. The benefits of
this format change are discussed in Section 3.6, “The readExternal Method.”
Compatibility issues caused by this change are discussed in Section 2.6, “The
useProtocolVersion Method.”
Stream protocol versions are discussed in Section 6.3, “Stream Protocol Versions.”
Topics:
• The ObjectInputStream Class
• The ObjectInputStream.GetField Class
• The ObjectInputValidation Interface
• The readObject Method
• The readObjectNoData Method
• The readExternal Method
• The readResolve Method
public final Object readObject()
throws OptionalDataException, ClassNotFoundException,
IOException;
protected ObjectInputStream()
throws StreamCorruptedException, IOException;
protected Object readObjectOverride()
throws OptionalDataException, ClassNotFoundException,
IOException;
}
4. If the object in the stream is a handle to a previous object, return the object.
7. If the object in the stream is a String, read its UTF encoding, add it and its
handle to the set of known objects, and proceed to Step 11.
8. If the object in the stream is an array, read its ObjectStreamClass and the
length of the array. Allocate the array, and add it and its handle in the set of
known objects. Read each element using the appropriate method for its type
and assign it to the array. Proceed to Step 11.
9. For all other objects, the ObjectStreamClass of the object is read from the
stream. The local class for that ObjectStreamClass is retrieved. The class
must be serializable or externalizable.
10. An instance of the class is allocated. The instance and its handle are added to
the set of known objects. The contents are restored appropriately:
a. For serializable objects, the no-arg constructor for the first non-serializable
supertype is run. For serializable classes, the fields are initialized to the
default value appropriate for their type. Then the fields of each class are
restored by calling class-specific readObject methods, or if these are not
defined, by calling the defaultReadObject method. Note that field
initializers and constructors are not executed for serializable classes during
deserialization.
11. Process potential substitutions by the class of the object and/or by a subclass of
ObjectInputStream:
a. If the class of the object defines the appropriate readResolve method, the
method is called to allow the object to replace itself.
b. Then if previously enabled by enableResolveObject, the
resolveObject method is called to allow subclasses of the stream to
examine and replace the object. If the previous step did replace the original
object, the resolveObject method is called with the replacement object.
If a replacement took place, the table of known objects is updated so the
replacement object is associated with the handle. The replacement object is then
returned from readObject.
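Substitution by the class of the object can be sketched with a hypothetical Registry singleton whose readResolve method returns the canonical instance in place of the freshly deserialized one.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamException;
import java.io.Serializable;

// Hypothetical singleton used only for illustration.
class Registry implements Serializable {
    private static final long serialVersionUID = 1L;
    static final Registry INSTANCE = new Registry();

    private Registry() { }

    /** Called after the object has been read; its result replaces the new instance. */
    private Object readResolve() throws ObjectStreamException {
        return INSTANCE;
    }

    static Object copy(Object obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(copy(INSTANCE) == INSTANCE);
    }
}
```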
All of the methods for reading primitive types only consume bytes from the block
data records in the stream. If a read for primitive data occurs when the next item in the
stream is an object, the read methods return -1 or throw an EOFException, as appropriate.
The value of a primitive type is read by a DataInputStream from the block data
record.
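The boundary between primitive data and objects can be probed as in the sketch below. PrimitiveBoundary is a hypothetical helper; it writes a single object and then attempts primitive reads before reading the object.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

// Hypothetical helper used only for illustration.
class PrimitiveBoundary {
    static String probe() throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject("payload");          // no primitive data precedes the object
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            int single = in.read();              // byte-oriented read reports end of data
            boolean eof = false;
            try {
                in.readInt();                    // typed read signals end of data by exception
            } catch (EOFException e) {
                eof = true;
            }
            Object obj = in.readObject();        // the object itself is still available
            return single + " " + eof + " " + obj;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(probe());
    }
}
```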
The exceptions thrown reflect errors during the traversal or exceptions that occur on
the underlying stream. If any exception is thrown, the underlying stream is left in an
unknown and unusable state.
When the reset token occurs in the stream, all of the state of the stream is discarded.
The set of known objects is cleared.
The readUnshared method is used to read “unshared” objects from the stream. This
method is identical to readObject, except that it prevents subsequent calls to
readObject and readUnshared from returning additional references to the
deserialized instance returned by the original call to readUnshared. Specifically:
• If readUnshared is called to deserialize a back-reference (the stream
representation of an object which has been written previously to the stream), an
ObjectStreamException will be thrown.
• If readUnshared returns successfully, then any subsequent attempts to deserialize
back-references to the stream handle deserialized by readUnshared will cause an
ObjectStreamException to be thrown.
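Both rules can be observed with the sketch below. BackReferenceDemo is a hypothetical helper; it writes the same list twice with writeObject, so that the second occurrence in the stream is a back-reference.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamException;
import java.util.ArrayList;

// Hypothetical helper used only for illustration.
class BackReferenceDemo {
    /**
     * If unsharedFirst is true, the object is read with readUnshared and the
     * back-reference with readObject; otherwise the order is reversed.
     * Returns true if an ObjectStreamException is thrown.
     */
    static boolean rejectsBackReference(boolean unsharedFirst)
            throws IOException, ClassNotFoundException {
        ArrayList<String> list = new ArrayList<>();
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(list);
            out.writeObject(list);               // written as a back-reference
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            if (unsharedFirst) {
                in.readUnshared();
                in.readObject();                 // back-reference to an unshared object
            } else {
                in.readObject();
                in.readUnshared();               // back-reference read as unshared
            }
            return false;
        } catch (ObjectStreamException e) {
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(rejectsBackReference(true));
        System.out.println(rejectsBackReference(false));
    }
}
```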
The defaultReadObject method is used to read the fields and object from the
stream. It uses the class descriptor in the stream to read the fields in the canonical
order by name and type from the stream. The values are assigned to the matching fields
by name in the current class. Details of the versioning mechanism can be found in
Section 5.5, “Compatible Java™ Type Evolution.” Any field of the object that does not
appear in the stream is set to its default value. Values that appear in the stream, but not
in the object, are discarded. This occurs primarily when a later version of a class has
written additional fields that do not occur in the earlier version. This method may only
be called from the readObject method while restoring the fields of a class. When
called at any other time, a NotActiveException is thrown.
Starting with the Java™ 2 SDK, Standard Edition, v1.3, the readClassDescriptor
method is used to read in all ObjectStreamClass objects.
readClassDescriptor is called when the ObjectInputStream expects a class
descriptor as the next item in the serialization stream. Subclasses of
ObjectInputStream may override this method to read in class descriptors that have
been written in non-standard formats (by subclasses of ObjectOutputStream which
have overridden the writeClassDescriptor method). By default, this method
reads class descriptors according to the format described in Section 6.4, “Grammar for
the Stream Format”.
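One way such a pair of overrides might look is sketched below. It assumes both ends already share the same class definitions, so the writer records only the class name and the reader rebuilds the descriptor locally with ObjectStreamClass.lookup. The names CompactOut and CompactIn are hypothetical, and this non-standard format gives up the versioning information that the default descriptor carries.

```java
import java.io.*;

class CompactDescriptorDemo {
    /** Writes only the class name in place of the full class descriptor. */
    static class CompactOut extends ObjectOutputStream {
        CompactOut(OutputStream out) throws IOException {
            super(out);
        }

        @Override
        protected void writeClassDescriptor(ObjectStreamClass desc)
                throws IOException {
            writeUTF(desc.getName());
        }
    }

    /** Rebuilds the descriptor from the local class with the given name. */
    static class CompactIn extends ObjectInputStream {
        CompactIn(InputStream in) throws IOException {
            super(in);
        }

        @Override
        protected ObjectStreamClass readClassDescriptor()
                throws IOException, ClassNotFoundException {
            Class<?> cl = Class.forName(readUTF());
            ObjectStreamClass desc = ObjectStreamClass.lookup(cl);
            if (desc == null) {
                throw new InvalidClassException(cl.getName(), "not serializable");
            }
            return desc;
        }
    }

    static Object roundTrip(Object obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new CompactOut(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in = new CompactIn(
                     new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```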
The resolveClass method is called while a class is being deserialized, and after the
class descriptor has been read. Subclasses may extend this method to read other
information about the class written by the corresponding subclass of
ObjectOutputStream. The method must find and return the class with the given
name and serialVersionUID. The default implementation locates the class by
calling the class loader of the closest caller of readObject that has a class loader. If
the class cannot be found ClassNotFoundException should be thrown. Prior to
JDK™ 1.1.6, the resolveClass method was required to return the same fully
qualified class name as the class name in the stream. In order to accommodate package
renaming across releases, method resolveClass only needs to return a class with
the same base class name and serialVersionUID in JDK™ 1.1.6 and later versions.
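A common reason to override resolveClass is to load classes through a specific class loader. The sketch below shows one possible shape; LoaderAwareIn is a hypothetical name, and falling back to the default implementation is a design choice made here so that primitive type names are still resolved.

```java
import java.io.*;

class ResolveClassDemo {
    /** Resolves stream classes against a caller-supplied class loader. */
    static class LoaderAwareIn extends ObjectInputStream {
        private final ClassLoader loader;

        LoaderAwareIn(InputStream in, ClassLoader loader) throws IOException {
            super(in);
            this.loader = loader;
        }

        @Override
        protected Class<?> resolveClass(ObjectStreamClass desc)
                throws IOException, ClassNotFoundException {
            try {
                return Class.forName(desc.getName(), false, loader);
            } catch (ClassNotFoundException e) {
                // Fall back to the default lookup (covers primitive types).
                return super.resolveClass(desc);
            }
        }
    }

    static Object roundTrip(Object obj, ClassLoader loader) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in = new LoaderAwareIn(
                     new ByteArrayInputStream(bytes.toByteArray()), loader)) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```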
The readStreamHeader method reads and verifies the magic number and version of
the stream. If they do not match, the StreamCorruptedException is thrown.
The defaulted method returns true if the field is not present in the stream. An
IllegalArgumentException is thrown if the requested field is not a serializable
field of the current class.
Each get method returns the specified serializable field from the stream. I/O
exceptions will be thrown if the underlying stream throws an exception. An
IllegalArgumentException is thrown if the name or type does not match the
name and type of a serializable field of the current class. The default value is
returned if the stream does not contain an explicit value for the field.
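The defaulted and get methods are typically used together inside a readObject method. The following sketch uses a hypothetical Account class; the fallback values "unknown" and 100 are arbitrary choices for the example.

```java
import java.io.*;

class GetFieldDemo {
    static class Account implements Serializable {
        private static final long serialVersionUID = 1L;
        String owner;
        int limit;
        transient boolean limitWasDefaulted;

        Account(String owner, int limit) {
            this.owner = owner;
            this.limit = limit;
        }

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            ObjectInputStream.GetField fields = in.readFields();
            // True only if the stream carried no value for "limit".
            limitWasDefaulted = fields.defaulted("limit");
            // The second argument is returned when the field is absent.
            owner = (String) fields.get("owner", "unknown");
            limit = fields.get("limit", 100);
        }
    }

    static Account roundTrip(Account account) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(account);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                     new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Account) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

When writer and reader use the same class version, as here, every field is present in the stream and defaulted returns false.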
Each subclass of a serializable object may define its own readObject method. If a
class does not implement the method, the default serialization provided by
defaultReadObject will be used. When implemented, the class is only responsible
for restoring its own fields, not those of its supertypes or subtypes.
The readObject method of the class, if implemented, is responsible for restoring the
state of the class. The values of every field of the object, whether transient or not, static
or not, are set to the default value for the field’s type. Either ObjectInputStream’s
defaultReadObject or readFields method must be called once (and only once)
before reading any optional data written by the corresponding writeObject method;
even if no optional data is read, defaultReadObject or readFields must still be
invoked once. If the readObject method of the class attempts to read more data than
is present in the optional part of the stream for this class, the stream will return -1 for
bytewise reads, throw an EOFException for primitive data reads (e.g., readInt,
readFloat), or throw an OptionalDataException with the eof field set to true
for object reads.
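The ordering rule and the end-of-data behavior for primitive reads can be sketched as follows. The Reading class and its checksum field are hypothetical; the extra readInt call is there only to show what happens when a class reads past its optional data.

```java
import java.io.*;

class OptionalDataDemo {
    static class Reading implements Serializable {
        private static final long serialVersionUID = 1L;
        String sensor;
        transient int checksum;
        transient boolean hitEndOfOptionalData;

        Reading(String sensor) {
            this.sensor = sensor;
            this.checksum = sensor.hashCode();
        }

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();   // required data first
            out.writeInt(checksum);     // then the optional data
        }

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();     // must come before optional data
            checksum = in.readInt();
            try {
                in.readInt();           // nothing left for this class
            } catch (EOFException e) {
                hitEndOfOptionalData = true;
            }
        }
    }

    static Reading roundTrip(Reading reading) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(reading);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                     new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Reading) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```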
The responsibility for the format, structure, and versioning of the optional data lies
completely with the class. The @serialData javadoc tag within the javadoc comment
for the readObject method should be used to document the format and structure of
the optional data.
If the class being restored is not present in the stream being read, then its
readObjectNoData method, if defined, is invoked (instead of readObject);
otherwise, its fields are initialized to the appropriate default values. For further detail,
see section 3.5.
One last similarity between a constructor and a readObject method is that both
provide the opportunity to invoke a method on an object that is not fully constructed.
Any overridable (neither private, static nor final) method called while an object is
being constructed can potentially be overridden by a subclass. Methods called during
the construction phase of an object are resolved by the actual type of the object, not the
type currently being initialized by either its constructor or
readObject/readObjectNoData method. Therefore, calling an overridable method
from within a readObject or readObjectNoData method may result in the
unintentional invocation of a subclass method before the superclass has been fully
initialized.
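The hazard can be made concrete with a two-class hierarchy. In this sketch (all names hypothetical), the superclass readObject calls an overridable method, and the subclass override observes its own field before that field has been restored from the stream.

```java
import java.io.*;

class OverridableCallDemo {
    static class Base implements Serializable {
        private static final long serialVersionUID = 1L;

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            onRestored(); // hazard: dispatches to the subclass override
        }

        void onRestored() {
        }
    }

    static class Labeled extends Base {
        private static final long serialVersionUID = 1L;
        String label;
        transient boolean labelMissingDuringBaseRead;

        Labeled(String label) {
            this.label = label;
        }

        @Override
        void onRestored() {
            // Runs while only Base's part of the object has been read.
            labelMissingDuringBaseRead = (label == null);
        }
    }

    static Labeled roundTrip(Labeled obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                     new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Labeled) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

After the round trip the label is restored correctly, yet the override saw it as null, which is why such calls are best made on private or final methods.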
Note – The readExternal method is public, and it raises the risk of a client being
able to overwrite an existing object from a stream. The class may add its own checks
to ensure that this is only called when appropriate.
A new stream protocol version has been introduced in JDK™ 1.2 to correct a problem
with Externalizable objects. The old definition of Externalizable objects
required the local virtual machine to find a readExternal method to be able to
properly read an Externalizable object from the stream. The new format adds
enough information to the stream protocol so serialization can skip an
Externalizable object when the local readExternal method is not available.
Due to class evolution rules, serialization must be able to skip an Externalizable
object in the input stream if there is not a mapping for the object using the local
classes.
For example, a Symbol class could be created for which only a single instance of each
symbol binding existed within a virtual machine. The readResolve method would
be implemented to determine if that symbol was already defined and substitute the
preexisting equivalent Symbol object to maintain the identity constraint. In this way
the uniqueness of Symbol objects can be maintained across serialization.
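A minimal sketch of such a Symbol class is given below. The intern table, the factory method of, and the use of ConcurrentHashMap are assumptions made for this example rather than anything the specification prescribes.

```java
import java.io.*;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class SymbolDemo {
    static final class Symbol implements Serializable {
        private static final long serialVersionUID = 1L;
        private static final Map<String, Symbol> TABLE = new ConcurrentHashMap<>();

        private final String name;

        private Symbol(String name) {
            this.name = name;
        }

        /** Returns the single Symbol instance bound to the given name. */
        static Symbol of(String name) {
            return TABLE.computeIfAbsent(name, Symbol::new);
        }

        /** Substitutes the preexisting instance for the deserialized copy. */
        private Object readResolve() throws ObjectStreamException {
            return of(name);
        }
    }

    static Object roundTrip(Object obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                     new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```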
Note – The readResolve method is not invoked on the object until the object is fully
constructed, so any references to this object in its object graph will not be updated to
the new object nominated by readResolve. However, during the serialization of an
object with the writeReplace method, all references to the original object in the
replacement object’s object graph are replaced with references to the replacement
object. Therefore in cases where an object being serialized nominates a replacement
object whose object graph has a reference to the original object, deserialization will
result in an incorrect graph of objects. Furthermore, if the reference types of the object
being read (nominated by writeReplace) and the original object are not compatible,
the construction of the object graph will raise a ClassCastException.
Topics:
• The ObjectStreamClass Class
• Dynamic Proxy Class Descriptors
• Serialized Form
• The ObjectStreamField Class
• Inspecting Serializable Classes
• Stream Unique Identifiers
public ObjectStreamField[] getFields();
The lookup method returns the ObjectStreamClass descriptor for the specified
class in the virtual machine. If the class has defined serialVersionUID it is
retrieved from the class. If the serialVersionUID is not defined by the class, it is
computed from the definition of the class in the virtual machine. If the specified class
is not serializable or externalizable, null is returned.
The getName method returns the fully-qualified name of the class. The class name is
saved in the stream and is used when the class must be loaded.
The forClass method returns the Class in the local virtual machine if one was
found by ObjectInputStream.resolveClass method. Otherwise, it returns
null.
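The following sketch exercises lookup, getName, getSerialVersionUID and forClass on two hypothetical classes, one serializable and one not.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

class LookupDemo {
    static class Versioned implements Serializable {
        private static final long serialVersionUID = 42L;
        int value;
    }

    static class Plain {
    }

    /** Summarizes the descriptor of a class, or reports that there is none. */
    static String describe(Class<?> cl) {
        ObjectStreamClass desc = ObjectStreamClass.lookup(cl);
        if (desc == null) {
            return cl.getName() + ": not serializable";
        }
        return desc.getName()
            + ": serialVersionUID=" + desc.getSerialVersionUID()
            + ", forClass=" + desc.forClass().getName()
            + ", fields=" + desc.getFields().length;
    }

    public static void main(String[] args) {
        System.out.println(describe(Versioned.class));
        System.out.println(describe(Plain.class));
    }
}
```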
The getTypeCode method returns a character encoding of the field type (‘B’ for
byte, ‘C’ for char, ‘D’ for double, ‘F’ for float, ‘I’ for int, ‘J’ for long,
‘L’ for non-array object types, ‘S’ for short, ‘Z’ for boolean, and ‘[‘ for
arrays).
The isPrimitive method returns true if the field is of primitive type, or false
otherwise.
The isUnshared method returns true if values of the field should be written as
“unshared” objects, or false otherwise.
The getOffset method returns the offset of the field’s value within instance data of
the class defining the field.
The toString method returns a printable representation with name and type.
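These accessors can be tried on ObjectStreamField instances constructed directly. In the sketch below the field names "count", "name" and "data" are arbitrary examples.

```java
import java.io.ObjectStreamField;

class FieldInfoDemo {
    /** Formats the name, type code and flags of a field descriptor. */
    static String summarize(ObjectStreamField f) {
        return f.getName() + " " + f.getTypeCode()
            + " primitive=" + f.isPrimitive()
            + " unshared=" + f.isUnshared();
    }

    public static void main(String[] args) {
        System.out.println(summarize(new ObjectStreamField("count", int.class)));
        System.out.println(summarize(new ObjectStreamField("name", String.class, true)));
        System.out.println(summarize(new ObjectStreamField("data", byte[].class)));
    }
}
```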
When invoked on the command line with one or more class names, serialver prints the
serialVersionUID for each class in a form suitable for copying into an evolving
class. When invoked with no arguments, it prints a usage line.
The stream-unique identifier is a 64-bit hash of the class name, interface class names,
methods, and fields. The value must be declared in all
versions of a class except the first. It may be declared in the original class but is not
required. The value is fixed for all compatible classes. If the SUID is not declared for
a class, the value defaults to the hash for that class. Serializable classes do not
need to anticipate versioning; however, Externalizable classes do.
Note – It is strongly recommended that serializable classes that are inner classes or
which contain inner classes declare the serialVersionUID data member. This is because
different implementations of compilers could use different names for synthetic
members that are generated for the implementation of inner classes, and these names
are used in the current computation of SUIDs.
The initial version of an Externalizable class must output a stream data format
that is extensible in the future. The initial version of the method readExternal has
to be able to read the output format of all future versions of the method
writeExternal.
3. The name of each interface sorted by name written using UTF encoding.
4. For each field of the class sorted by field name (except private static and private
transient fields):
a. The name of the field in UTF encoding.
b. The modifiers of the field written as a 32-bit integer.
c. The descriptor of the field in UTF encoding
Topics:
• Overview
• Goals
• Assumptions
• Who’s Responsible for Versioning of Streams
• Compatible Java™ Type Evolution
• Type Changes Affecting Serialization
5.1 Overview
When Java™ objects use serialization to save state in files, or as blobs in databases, the
potential arises that the version of a class reading the data is different than the version
that wrote the data.
Versioning raises some fundamental questions about the identity of a class, including
what constitutes a compatible change. A compatible change is a change that does not
affect the contract between the class and its callers.
This section describes the goals, assumptions, and a solution that attempts to address
this problem by restricting the kinds of changes allowed and by carefully choosing the
mechanisms.
The proposed solution provides a mechanism for “automatic” handling of classes that
evolve by adding fields and adding classes. Serialization will handle versioning
without class-specific methods to be implemented for each version. The stream format
can be traversed without invoking class-specific methods.
5.2 Goals
The goals are to:
• Support bidirectional communication between different versions of a class operating
in different virtual machines by:
• Defining a mechanism that allows Java™ classes to read streams written by older
versions of the same class.
• Defining a mechanism that allows Java™ classes to write streams intended to be
read by older versions of the same class.
• Provide default serialization for persistence and for RMI.
• Perform well and produce compact streams in simple cases, so that RMI can use
serialization.
• Be able to identify and load classes that match the exact class used to write the
stream.
• Keep the overhead low for nonversioned classes.
• Use a stream format that allows the traversal of the stream without having to invoke
methods specific to the objects saved in the stream.
5.3 Assumptions
The assumptions are that:
• Versioning will only apply to serializable classes since it must control the stream
format to achieve its goals. Externalizable classes will be responsible for their own
versioning which is tied to the external format.
• All data and objects must be read from, or skipped in, the stream in the same order
as they were written.
• Classes evolve individually as well as in concert with supertypes and subtypes.
• Classes are identified by name. Two classes with the same name may be different
versions or completely different classes that can be distinguished only by
comparing their interfaces or by comparing hashes of the interfaces.
• Default serialization will not perform any type conversions.
• The stream format only needs to support a linear sequence of type changes, not
arbitrary branching of a type.
[Figure: the class hierarchy java.lang.Object, foo, bar, shown beside its evolved
versions java.lang.Object’, foo’, bar’]
For the purposes of the discussion here, each class implements and extends the
interface or contract defined by its supertype. New versions of a class, for example
foo’, must continue to satisfy the contract for foo and may extend the interface or
modify its implementation.
The following are the principal aspects of the design for versioning of serialized object
streams.
• The default serialization mechanism will use a symbolic model for binding the
fields in the stream to the fields in the corresponding class in the virtual machine.
• Each class referenced in the stream will uniquely identify itself, its supertype, and
the types and names of each serializable field written to the stream. The fields are
ordered with the primitive types first sorted by field name, followed by the object
fields sorted by field name.
• Two types of data may occur in the stream for each class: required data
(corresponding directly to the serializable fields of the object); and optional data
(consisting of an arbitrary sequence of primitives and objects). The stream format
defines how the required and optional data occur in the stream so that the whole
class, the required, or the optional parts can be skipped if necessary.
• The required data consists of the fields of the object in the order defined by the
class descriptor.
• The optional data is written to the stream and does not correspond directly to
fields of the class. The class itself is responsible for the length, types, and
versioning of this optional information.
• If defined for a class, the writeObject/readObject methods supersede the
default mechanism to write/read the state of the class. These methods write and read
the optional data for a class. The required data is written by calling
defaultWriteObject and read by calling defaultReadObject.
• The stream format of each class is identified by the use of a Stream Unique
Identifier (SUID). By default, this is the hash of the class. All later versions of the
class must declare the Stream Unique Identifier (SUID) that they are compatible
with. This guards against classes with the same name that might inadvertently be
identified as being versions of a single class.
The descriptions are from the perspective of the stream being read in order to
reconstitute either an earlier or later version of the class. In the parlance of RPC
systems, this is a “receiver makes right” system. The writer writes its data in the most
suitable form and the receiver must interpret that information to extract the parts it
needs and to fill in the parts that are not available.
Topics:
• Overview
• Stream Elements
• Stream Protocol Versions
• Grammar for the Stream Format
• Example
6.1 Overview
The stream format satisfies the following design goals:
written to the stream is assigned a handle that is used to refer back to the object.
Handles are assigned sequentially starting from 0x7E0000. The handles restart at
0x7E0000 when the stream is reset.
The representation of String objects depends on the length of the UTF encoded
string. If the UTF encoding of the given String is less than 65536 bytes in length,
the String is written in the standard Java UTF-8 format. Starting with the Java™ 2
SDK, Standard Edition, v1.3, strings for which the UTF encoding length is greater than
or equal to 65536 bytes are written in a variant “long” UTF format. The “long” UTF
format is identical to the standard Java UTF-8 format, except that it uses 8 bytes to
write the length of the UTF string, instead of 2 bytes. The typecode preceding the
String in the serialization stream indicates which format was used to write the
String.
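The choice of format can be observed by inspecting the typecode that follows the four-byte stream header. The sketch below is illustrative only; it assumes the string is the first object written, and the helper names are hypothetical.

```java
import java.io.*;
import java.util.Arrays;

class LongStringDemo {
    /** Builds a string of the given length from a single character. */
    static String repeat(char c, int count) {
        char[] chars = new char[count];
        Arrays.fill(chars, c);
        return new String(chars);
    }

    /** Returns the typecode written for the string (byte 4 of the stream). */
    static byte typecodeOf(String s) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Bytes 0-3 hold the magic number and the stream version.
        return bytes.toByteArray()[4];
    }

    static boolean usesLongFormat(String s) {
        return typecodeOf(s) == ObjectStreamConstants.TC_LONGSTRING;
    }
}
```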
All primitive data written by classes is buffered and wrapped in block-data records,
regardless of whether the data is written to the stream within a writeObject method or
written directly to the stream from outside a writeObject method. This data can
only be read by the corresponding readObject methods or be read directly from the
stream. Objects written by the writeObject method terminate any previous block-
data record and are written either as regular objects or null or back references, as
appropriate. The block-data records allow error recovery to discard any optional data.
When called from within a class, the stream can discard any data or objects until the
endBlockData.
Block data boundaries have been standardized. Primitive data written in block data
mode is normalized to not exceed 1024 byte chunks. The benefit of this change was
to tighten the specification of serialized data format within the stream. This change
is fully backward and forward compatible.
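The chunking can be checked by writing primitive data directly to a stream and walking the resulting block-data records. This is a rough sketch that assumes the stream contains only block data after the header; blockLengths is a hypothetical helper.

```java
import java.io.*;
import java.util.ArrayList;
import java.util.List;

class BlockDataDemo {
    /** Writes payloadSize raw bytes and returns the length of each block. */
    static List<Integer> blockLengths(int payloadSize) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.write(new byte[payloadSize]);
            }
            DataInputStream in = new DataInputStream(
                new ByteArrayInputStream(bytes.toByteArray()));
            in.readShort(); // magic number
            in.readShort(); // stream version
            List<Integer> lengths = new ArrayList<>();
            int tc;
            while ((tc = in.read()) != -1) {
                int len;
                if (tc == ObjectStreamConstants.TC_BLOCKDATA) {
                    len = in.readUnsignedByte();   // short block: 1-byte length
                } else if (tc == ObjectStreamConstants.TC_BLOCKDATALONG) {
                    len = in.readInt();            // long block: 4-byte length
                } else {
                    throw new StreamCorruptedException("unexpected typecode " + tc);
                }
                in.readFully(new byte[len]);       // skip the block contents
                lengths.add(len);
            }
            return lengths;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```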
Notation      Meaning
(datatype)    This token has the data type specified, such as byte.
token[n]      A predefined number of occurrences of the token, that is, an array.
x0001         A literal value expressed in hexadecimal. The number of hex digits
              reflects the size of the value.
<xxx>         A value read from the stream used to indicate the length of an array.
Note that the symbol (long-utf) is used to designate a string written in “long” UTF
format. For details, refer to Section 6.2, “Stream Elements”.
The flag SC_BLOCKDATA is set if the Externalizable class is written into the
stream using STREAM_PROTOCOL_2. By default, this is the protocol used to write
Externalizable objects into the stream in JDK™ 1.2. JDK™ 1.1 writes
STREAM_PROTOCOL_1.
The flag SC_SERIALIZABLE is set if the class that wrote the stream extended
java.io.Serializable but not java.io.Externalizable. In that case, the class
reading the stream must also extend java.io.Serializable, and the default
serialization mechanism is to be used.
The flag SC_EXTERNALIZABLE is set if the class that wrote the stream extended
java.io.Externalizable. In that case, the class reading the data must also extend
Externalizable, and the data will be read using its writeExternal and
readExternal methods.
Example
Consider the case of an original class and two instances in a linked list:
class List implements java.io.Serializable {
    int value;
    List next;
    public static void main(String[] args) {
        try {
            List list1 = new List();
            List list2 = new List();
            list1.value = 17;
            list1.next = list2;
            list2.value = 19;
            list2.next = null;
Topics:
• Overview
• Design Goals
• Security Issues
• Preventing Serialization of Sensitive Data
• Writing Class-Specific Serializing Methods
• Guarding Unshared Deserialized Objects
• Preventing Overwriting of Externalizable Objects
• Encrypting a Bytestream
A.1 Overview
The object serialization system allows a bytestream to be produced from a graph of
objects, sent out of the Java™ environment (either saved to disk or transmitted over the
network) and then used to recreate an equivalent set of new objects with the same state.
What happens to the state of the objects outside of the environment is outside of the
control of the Java™ system (by definition), and therefore is outside the control of the
security provided by the system. The question then arises: once an object has been
serialized, can the resulting byte array be examined and changed in a way that
compromises the security of the Java program that deserializes it? The intent of this
section is to address these security concerns.
A.2 Design Goals
The goal for object serialization is to be as simple as possible and yet still be
consistent with known security restrictions; the simpler the system is, the more likely it
is to be secure. The following points summarize the security measures present in object
serialization:
• Only objects implementing the java.io.Serializable or
java.io.Externalizable interfaces can be serialized. Mechanisms are
provided which can be used to prevent the serialization of specific fields (typically,
those containing sensitive or unneeded data).
• The serialization package cannot be used to recreate or reinitialize objects.
Deserializing a byte stream may result in the creation of new objects, but will not
overwrite or modify the contents of existing objects.
• Although deserializing an object may trigger downloading of code from a remote
source, the downloaded code is restricted by all of the usual Java™ code
verification and security mechanisms. Classes loaded as a side-effect of
deserialization are no more or less secure than those loaded in any other fashion.
In version 1.4 of the Java™ 2 SDK, Standard Edition, support was added for class-
defined readObjectNoData methods (see Section 3.5, “The readObjectNoData
Method”). Non-final serializable classes which initialize fields to non-default values
should define a readObjectNoData method to ensure consistent state in the event
that a subclass instance is deserialized and the serialization stream does not list the
class in question as a superclass of the deserialized object. This may occur in cases
where the receiving party uses a different version of the deserialized instance’s class
than the sending party, and the receiver’s version extends classes that are not extended
by the sender’s version. This may also occur if the serialization stream has been
tampered with; hence, readObjectNoData is useful for initializing deserialized objects
properly despite a “hostile” or incomplete source stream.
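The shape of such a method is sketched below for a hypothetical Person class whose role field has a non-default initial value. The sketch only shows the declaration; in an ordinary round trip, where the stream does list the class, readObjectNoData is not invoked and the written value is restored.

```java
import java.io.*;

class NoDataDemo {
    static class Person implements Serializable {
        private static final long serialVersionUID = 1L;
        String role = "guest"; // non-default initial value

        /**
         * Invoked instead of readObject when the stream does not list
         * Person as a superclass of the object being deserialized.
         */
        private void readObjectNoData() throws ObjectStreamException {
            role = "guest";
        }
    }

    static Person roundTrip(Person person) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(person);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                     new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Person) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```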
In the copying approach, the sub-objects deserialized from the stream should be treated
as "untrusted input": newly created objects, initialized to have the same value as the
deserialized sub-objects, should be substituted for the sub-objects by the readObject
method. For example, suppose an object has a private byte array field, b, that must
remain private:
private void readObject(ObjectInputStream s)
    throws IOException, ClassNotFoundException
{
    s.defaultReadObject();
    b = (byte[])b.clone();
}
It is also important to note that calling clone may not always be the right way to
defensively copy a sub-object. If the clone method cannot be counted on to produce
an independent copy (and not to "steal" a reference to the copy), an alternative means
should be used to produce the copy. An alternative means of copying should always be
used if the class of the sub-object is not final, since the clone method or helper
methods that it calls may be overridden by subclasses.
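As one possible alternative to clone, a sub-object of a non-final class can be copied by constructing a fresh instance of the exact class that is wanted. The sketch below uses a hypothetical Period class with a java.util.Date field; Date is mutable and not final, so the copy is built with the Date(long) constructor, while the byte array is still cloned because array clone cannot be overridden.

```java
import java.io.*;
import java.util.Date;

class DefensiveCopyDemo {
    static final class Period implements Serializable {
        private static final long serialVersionUID = 1L;
        private Date start;
        private byte[] key;

        Period(Date start, byte[] key) {
            this.start = new Date(start.getTime());
            this.key = key.clone();
        }

        long startMillis() {
            return start.getTime();
        }

        byte[] key() {
            return key.clone();
        }

        private void readObject(ObjectInputStream s)
                throws IOException, ClassNotFoundException {
            s.defaultReadObject();
            if (start == null || key == null) {
                throw new InvalidObjectException("missing field");
            }
            // Build a plain Date rather than trusting start.clone().
            start = new Date(start.getTime());
            key = key.clone();
        }
    }

    static Period roundTrip(Period period) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(period);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                     new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Period) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```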
Starting in version 1.4 of the Java™ 2 SDK, Standard Edition, unique references to
deserialized objects can also be ensured by using the
ObjectOutputStream.writeUnshared and
ObjectInputStream.readUnshared methods, thus avoiding the complication,
performance costs and memory overhead of defensive copying. The readUnshared
and writeUnshared methods are further described in Section 3.1, “The
ObjectInputStream Class” and Section 2.1, “The ObjectOutputStream Class”.
Object serialization allows encryption, both by allowing classes to define their own
methods for serialization and deserialization (inside which encryption can be used),
and by adhering to the composable stream abstraction (allowing the output of a
serialization stream to be channelled into another filter stream which encrypts the
data).
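The second approach, composing streams, might be sketched as follows using the javax.crypto stream classes. The choice of AES in CBC mode, the 128-bit key and the layout that prefixes the IV to the ciphertext are assumptions made for this example; the sketch provides no integrity protection and is not a recommendation of a particular scheme.

```java
import java.io.*;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.*;
import javax.crypto.spec.IvParameterSpec;

class EncryptedStreamDemo {
    private static final String TRANSFORMATION = "AES/CBC/PKCS5Padding";

    static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Serializes obj through a cipher stream; the IV is stored first. */
    static byte[] encrypt(Serializable obj, SecretKey key) {
        try {
            byte[] iv = new byte[16];
            new SecureRandom().nextBytes(iv);
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            bytes.write(iv);
            try (ObjectOutputStream out = new ObjectOutputStream(
                     new CipherOutputStream(bytes, cipher))) {
                out.writeObject(obj);
            }
            return bytes.toByteArray();
        } catch (IOException | GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Reverses encrypt: reads the IV, then deserializes the decrypted data. */
    static Object decrypt(byte[] data, SecretKey key) {
        try {
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(data, 0, 16));
            InputStream body = new ByteArrayInputStream(data, 16, data.length - 16);
            try (ObjectInputStream in = new ObjectInputStream(
                     new CipherInputStream(body, cipher))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException | GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```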
Exception                Description
ObjectStreamException    Superclass of all serialization exceptions.
NotActiveException       Thrown if writeObject state is invalid within
                         the following ObjectOutputStream methods:
                         • defaultWriteObject
                         • putFields
                         • writeFields
                         Thrown if readObject state is invalid within the
                         following ObjectInputStream methods:
                         • defaultReadObject
                         • readFields
                         • registerValidation
InvalidObjectException   Thrown when a restored object cannot be made
                         valid.
OptionalDataException    Thrown by readObject when there is primitive
                         data in the stream and an object is expected. The
                         length field of the exception indicates the number of
                         bytes that are available in the current block.
WriteAbortedException    Thrown when reading a stream terminated by an
                         exception that occurred while the stream was being
                         written.
Topics:
• Example Alternate Implementation of java.io.File
The system class java.io.File represents a filename and has methods for parsing and
manipulating files and directories by name. It has a single private field that contains the
current file name. The semantics of the methods that parse paths depend on the current
path separator, which is held in a static field. This path separator is part of the serialized
state of a file so that the file name can be adjusted when read.
The serialized state of a File object is defined as the serializable fields and the
sequence of data values for the file. In this case, there is one of each.
Serializable Fields:
String path; // path name with embedded separators
Serializable Data:
char // path name separator for path name
An alternate implementation might be defined as follows:
class File implements java.io.Serializable {
    ...
    private String[] pathcomponents;
    // Define serializable fields with the ObjectStreamClass
    /**
     * @serialField path String
     *              Path components separated by separator.
     */
    private static final ObjectStreamField[] serialPersistentFields
        = {
            new ObjectStreamField("path", String.class)
        };
    ...
    /**
     * @serialData Default fields followed by separator character.
     */
    private void writeObject(ObjectOutputStream s)
        throws IOException
    {
        ObjectOutputStream.PutField fields = s.putFields();
        StringBuffer str = new StringBuffer();
        for (int i = 0; i < pathcomponents.length; i++) {
            str.append(separator);
            str.append(pathcomponents[i]);
        }
        fields.put("path", str.toString());
        s.writeFields();
        s.writeChar(separatorChar); // Add the separator character
    }
    ...
    // readFields can throw ClassNotFoundException, so it is declared here.
    private void readObject(ObjectInputStream s)
        throws IOException, ClassNotFoundException
    {
        ObjectInputStream.GetField fields = s.readFields();
        String path = (String)fields.get("path", null);
        ...
        char sep = s.readChar(); // read the previous separator char
        // parse path into components using the separator
        // and store into pathcomponents array.
    }
}