Saturday, 13 August 2011

Serialization in Java

In my previous post about cloning there was a mention of serialization. So,

What is serialization?
It is the process of reading or writing an object to a stream. Here the object's state is written to a sequence of bytes, this transformation is called serialization. The object state can then be retrieved from the stream of bytes into a live object for use later in the program. This process is called de-serialization.

What can or cannot be serialized?
The variables which are marked as static or transient are not serialized all others can be. Static variables are not serialized as the don't belong to any individual object of the class, where as transient construct lets the programmer control which variables need not be serialized. So what happens to these variables when the state is de-serialized; these variables will get the default values.

When and why would one need serialization?
Serialization has found its uses in many places
  1. As mentioned in the post for 'Deep cloning and Shallow cloning' we can achieve deep cloning via serialization.
  2. Whenever an object needs to be saved for future use its state can be serialized to a file or database or in-memory and then later de-serialized.(ex object stored in HTTP session should be serializable to support in-memory replication for scalability) NOTE: Though not mandatory but as a convention, when an objects state is written to a file, the file should have an extension of .ser 
  3. To send an object over the network i.e from one JVM to another (ex. objects passed in RMI needs to be serializable to support marshaling and un-marshaling of objects)

There are three primary reasons why objects are not serializable by default and must implement the Serializable interface to access Java's serialization mechanism.
  1. Not all objects capture useful semantics in a serialized state. For example, a thread object is tied to the state of the current JVM. There is no context in which a de-serialized Thread object would maintain useful semantics.
  2. The serialized state of an object forms part of its class's compatibility contract. Maintaining compatibility between versions of serializable classes requires additional effort and consideration. Therefore, making a class serializable needs to be a deliberate design decision and not a default condition.
  3. Serialization allows access to non-transient private members of a class that are not otherwise accessible. Classes containing sensitive information (for example, a password) should not be serializable nor externalizable. (If there is a need for more control over the process of reading /writing the object to a stream then the class can implement Externalizable)
Some of the classes which are Serializable in java are the Wrapper classes,String class, Date,DateTime, File etc.

How can we achieve/JVM helps in serialization?
An object can be marked as serializable by making the object implement the Serializable interface. It is just a marker interface and has no method which the class needs to implement. (markers help the JVM to understand that the object is of a specific type and needs to be treated accordingly i.e it helps to identify the semantics of being serializable). All subtypes of the Serializable class are themselves Serializable.

To allow subtypes of non-serializable classes to be serialized, the subtype may assume responsibility for saving and restoring the state of the supertype's public, protected, and (if accessible) package fields provided  the class it extends has an accessible no-arg constructor to initialize the class's state. It is an error to declare a class Serializable if this is not the case and will be detected at runtime.

During deserialization, the fields of non-serializable classes will be initialized using the public or protected no-arg constructor of the class. A no-arg constructor must be accessible to the subclass that is serializable. The fields of serializable subclasses will be restored from the stream.

When traversing a graph, an object may be encountered that does not support the Serializable interface. In this case the NotSerializableException will be thrown and will identify the class of the non-serializable object.

What do you understand by serial Version ID? 
The serialVersionUID is a universal version identifier for a Serializable class. Deserialization uses this number to ensure that a loaded class corresponds exactly to a serialized object. 

During the process of serialization all classes are given an unique serial Version ID if it is not explicitly provided in the class. [You can explicitly add a unique ID yourself pro-grammatically or by the use of SerialVer tool.].
If the identifier of the class and that of the flattened object is not the same then the de-serialization process will throw an InvalidClassException. This can happen if a new attribute is added to the modified class or the class no longer extends the same hierarchy tree , that is, the structure of the class undergoes modification. This  exception could also be thrown when the serialVersionID calculated by different JVM's vary.


No comments:

Post a Comment