Showing posts with label Java. Show all posts

Sunday, May 31, 2009

The Decorator Pattern

I volunteered to give a presentation to the Java User's Group at my work and adapted an old lecture to discuss the Decorator design pattern. I thought it would make a good post too. Source code is available from a link at the bottom of the post.

Patterns are techniques for solving common programming problems. They are object-oriented, but language independent. The Decorator pattern is very useful, but it doesn't get a lot of attention. The "Gang of Four" book (by Gamma, Helm, et al.; see below) defines the purpose of this pattern as:
Attach additional responsibilities to an object dynamically. Decorators provide a flexible alternative to subclassing for extending functionality.
A decorator class adds a new capability to an existing object by impersonating the original object and intercepting method calls to it. By intercepting these calls, it can modify the result that the original method call would have produced. To impersonate the original, both the original class and the decorator class must implement the same interface.


Typically, the decorator object's methods will perform some additional operation and also call the original object's method. This enhances the result of the original operation in some way.
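
Here is a minimal sketch of that idea, independent of the stream classes discussed below. The names Greeter, PlainGreeter and ShoutingGreeter are made up for illustration. The decorator implements the same interface as the original, holds a reference to it, calls the original method, and enhances the result.

```java
import java.util.Locale;

// The shared interface that lets the decorator impersonate the original.
interface Greeter {
    String greet(String who);
}

// The original object.
class PlainGreeter implements Greeter {
    @Override
    public String greet(String who) {
        return "hello, " + who;
    }
}

// The decorator: same interface, wraps another Greeter.
class ShoutingGreeter implements Greeter {
    private final Greeter inner;

    ShoutingGreeter(Greeter inner) {
        this.inner = inner;
    }

    @Override
    public String greet(String who) {
        // Call the original method, then enhance its result.
        return inner.greet(who).toUpperCase(Locale.ROOT) + "!";
    }
}

class DecoratorSketch {
    public static void main(String[] args) {
        Greeter greeter = new ShoutingGreeter(new PlainGreeter());
        System.out.println(greeter.greet("world")); // prints HELLO, WORLD!
    }
}
```

Because ShoutingGreeter accepts any Greeter, it can wrap another decorator just as easily as the original object.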



The advantages of the decorator pattern are:
  • Decorators can be chained to add multiple behaviors
  • Very flexible reuse of code
  • Can be runtime decision (unlike inheritance)
  • Avoids huge inheritance trees
As an example of using the decorator pattern consider Java's InputStream/OutputStream classes. Java defines subclasses of these (FilterInputStream/FilterOutputStream) specifically designed to create decorators.

Let's look at how to create a decorator for any OutputStream using FilterOutputStream.


The out attribute of the FilterOutputStream is a reference to the OutputStream that the class is decorating. This attribute is initialized by the constructor of FilterOutputStream. The default implementations of the FilterOutputStream methods simply forward their calls to the out attribute. (One caveat: the default write( byte [], int, int) forwards the array one byte at a time through write(int), so decorators usually override it.)
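
As a warm-up, here is a small sketch of a FilterOutputStream decorator that only counts bytes. CountingOutputStream is a hypothetical class, not part of the post's source code. It overrides write(int) and write( byte [], int, int); the inherited write( byte []) ends up in the three-argument version, so it needs no override.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical decorator: counts the bytes passed through to the wrapped stream.
class CountingOutputStream extends FilterOutputStream {
    private long count;

    CountingOutputStream(OutputStream out) {
        super(out); // stored in the protected 'out' field
    }

    @Override
    public void write(int b) throws IOException {
        out.write(b);
        count++;
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        // Forward the whole range in one call instead of byte by byte.
        out.write(b, off, len);
        count += len;
    }

    long getCount() {
        return count;
    }
}

class CountingDemo {
    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (CountingOutputStream counting = new CountingOutputStream(sink)) {
            counting.write("hello".getBytes(StandardCharsets.US_ASCII));
            counting.write('!');
            System.out.println(counting.getCount() + " bytes written"); // prints 6 bytes written
        }
    }
}
```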

We'll create a decorator named DebugOutputStream that traces the method calls on an OutputStream. First let's look at the beginning of its definition:

public class DebugOutputStream extends FilterOutputStream {

  private String name;

  public DebugOutputStream(String name, OutputStream out) {
    super(out);
    this.name = name;
  }

The name parameter is used to label the output of this particular DebugOutputStream (useful if there is more than one). The out parameter is passed up to the FilterOutputStream and is used to initialize its protected out attribute.

Let's look at the close method of DebugOutputStream:

public void close() throws IOException {

  StringBuilder msg = new StringBuilder(
    "DebugOutputStream(" + name + "): called void close()");
  try {

    out.close();   // Call to decorated object

  } catch( IOException e) {

    msg.append( " Exception: " + e + " thrown");
    throw e;

  } catch( RuntimeException e) {

    msg.append( " Exception: " + e + " thrown");
    throw e;

  } finally {

    System.out.println( msg);

  }
}
Most of the code is composing a message to print out information about the call to close on the decorated object (i.e., referenced in the out attribute).

Here's an example of using the DebugOutputStream to spy on how GZIPOutputStream works.


InputStream input = new BufferedInputStream( 
  new FileInputStream( args[0]));
OutputStream output = new DebugOutputStream( "GZIPOutputStream",
  new GZIPOutputStream(
  new DebugOutputStream( "FileOutputStream",
  new FileOutputStream( args[1]))));

byte [] buffer = new byte[4096];
int numRead;
while( (numRead = input.read(buffer)) > 0) {

  output.write( buffer, 0, numRead);

}
input.close();
output.close();

Note that the BufferedInputStream is acting as a decorator to the FileInputStream. Two DebugOutputStreams are used to trace the method calls at two places. The first one (given the name "GZIPOutputStream") will trace the calls made to the GZIPOutputStream. The second one will trace the calls made to the FileOutputStream.

The output of this program on a sample input file is:
DebugOutputStream(FileOutputStream): called void write( byte []) (10 bytes)
DebugOutputStream(GZIPOutputStream): called void write(byte [], 0, 4096)
DebugOutputStream(GZIPOutputStream): called void write(byte [], 0, 4096)
DebugOutputStream(GZIPOutputStream): called void write(byte [], 0, 4096)
DebugOutputStream(GZIPOutputStream): called void write(byte [], 0, 4096)
DebugOutputStream(GZIPOutputStream): called void write(byte [], 0, 4096)
DebugOutputStream(GZIPOutputStream): called void write(byte [], 0, 4096)
DebugOutputStream(GZIPOutputStream): called void write(byte [], 0, 4096)
DebugOutputStream(GZIPOutputStream): called void write(byte [], 0, 4096)
DebugOutputStream(GZIPOutputStream): called void write(byte [], 0, 3424)
DebugOutputStream(FileOutputStream): called void write(byte [], 0, 512)
DebugOutputStream(FileOutputStream): called void write(byte [], 0, 512)
DebugOutputStream(FileOutputStream): called void write(byte [], 0, 512)
DebugOutputStream(FileOutputStream): called void write(byte [], 0, 512)
DebugOutputStream(FileOutputStream): called void write(byte [], 0, 512)
DebugOutputStream(FileOutputStream): called void write(byte [], 0, 512)
DebugOutputStream(FileOutputStream): called void write(byte [], 0, 512)
DebugOutputStream(FileOutputStream): called void write(byte [], 0, 512)
DebugOutputStream(FileOutputStream): called void write(byte [], 0, 512)
DebugOutputStream(FileOutputStream): called void write(byte [], 0, 512)
DebugOutputStream(FileOutputStream): called void write(byte [], 0, 512)
DebugOutputStream(FileOutputStream): called void write(byte [], 0, 512)
DebugOutputStream(FileOutputStream): called void write(byte [], 0, 512)
DebugOutputStream(FileOutputStream): called void write(byte [], 0, 512)
DebugOutputStream(FileOutputStream): called void write(byte [], 0, 512)
DebugOutputStream(FileOutputStream): called void write(byte [], 0, 512)
DebugOutputStream(FileOutputStream): called void write(byte [], 0, 512)
DebugOutputStream(FileOutputStream): called void write(byte [], 0, 253)
DebugOutputStream(FileOutputStream): called void close()
DebugOutputStream(GZIPOutputStream): called void close()

From this we can see that a 10-byte header is written to the file first, and then the data is handed to the GZIPOutputStream in 4096-byte chunks. But the GZIPOutputStream does not write the data out immediately. Only after all the data from the input file has been read is the compressed data written to the file, in 512-byte chunks. For a bigger input file, compressed data would have been written out before the end. GZIPOutputStream buffers data internally and writes it out when the buffer fills up or the stream is closed. The latter is happening here.
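
The same observation can be reproduced without any files. The sketch below is an assumption-laden stand-in for the traced FileOutputStream: ChunkRecorder and GzipChunkDemo are made-up names, and the recorder simply remembers the size of each write it receives from a GZIPOutputStream. The sizes after the header depend on the data and the JDK, so no particular output is promised here.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPOutputStream;

// Hypothetical sink: discards the data but records the size of every write.
class ChunkRecorder extends OutputStream {
    final List<Integer> chunkSizes = new ArrayList<>();

    @Override
    public void write(int b) {
        chunkSizes.add(1);
    }

    @Override
    public void write(byte[] b, int off, int len) {
        chunkSizes.add(len);
    }
}

class GzipChunkDemo {
    // Compresses data through a GZIPOutputStream and returns the write sizes it produced.
    static List<Integer> compress(byte[] data) throws IOException {
        ChunkRecorder recorder = new ChunkRecorder();
        try (GZIPOutputStream gzip = new GZIPOutputStream(recorder)) {
            gzip.write(data, 0, data.length);
        }
        return recorder.chunkSizes;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[36192]; // arbitrary, easily compressed test data
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) (i % 64);
        }
        // The first entry is the gzip header written by the constructor.
        System.out.println(compress(data));
    }
}
```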

I also used this same technique to help with layout problems with Swing panels. One can create a JComponent or JPanel decorator that prints out the calls to the methods used by LayoutManagers (like getSize(), getPreferredSize(), etc.) to see how the LayoutManager is interrogating your class to lay it out.
Next, let's look at another use of a decorator to add a more practical enhancement to an input/output stream, encryption. I am not an expert on cryptology at all, so please don't use this code for any information you really need to secure! To encrypt data, I exclusive OR the data with a sequence of pseudo-random bytes as a mask. I then use the fact that:
(x ^ mask) ^ mask == x
to decrypt the data (^ is the exclusive OR operator in Java, C and C++). By exclusive OR'ing the encrypted data with the same sequence of pseudo-random bytes I will get the original data back.
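
A short sketch of the masking idea on plain byte arrays, before any streams are involved. XorMaskDemo and applyMask are made-up names, and java.util.Random stands in for whatever generator the original code uses; it is not a secure source of randomness.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Random;

class XorMaskDemo {
    // XORs each byte of data with the matching byte of a pseudo-random mask seeded by key.
    static byte[] applyMask(byte[] data, long key) {
        byte[] mask = new byte[data.length];
        new Random(key).nextBytes(mask); // same key, same mask sequence
        byte[] result = new byte[data.length];
        for (int i = 0; i < data.length; i++) {
            result[i] = (byte) (data[i] ^ mask[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] plain = "attack at dawn".getBytes(StandardCharsets.UTF_8);
        byte[] encrypted = applyMask(plain, 1234567890L);
        byte[] decrypted = applyMask(encrypted, 1234567890L); // (x ^ mask) ^ mask == x
        System.out.println(Arrays.equals(plain, decrypted)); // prints true
    }
}
```
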
The abstract EncryptingOutputStream class extends FilterOutputStream and overrides the write(int) and write( byte [], int, int) methods. The write( byte []) method calls write( byte[], int, int) so I don't need to override it. An abstract method void encrypt( byte []) is defined that must be overridden to do the actual encryption of the data. (This is an example of the Template Method pattern.)
Similarly, there is an abstract DecryptingInputStream class that extends FilterInputStream and overrides the int read() and int read( byte [], int, int) methods. It defines an abstract method void decrypt( byte [], int, int) that must decrypt the data in the specified section of the array passed to it.
Concrete classes named XOREncryptingOutputStream and XORDecryptingInputStream are defined that define the appropriate abstract methods. Both classes are passed a key to their constructors. This key is used as a seed to a random number generator. If the key is the same, the same pseudo-random numbers will be generated to encrypt/decrypt the data.
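
The downloadable source is the reference for these classes. What follows is only a rough reconstruction from the description above, under several assumptions: the key is a long, java.util.Random is the generator, and the constructors take the stream first and the key second. XorStreamDemo is a made-up driver.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Random;

// Template Method: the write methods are fixed here; encrypt() is left to subclasses.
abstract class EncryptingOutputStream extends FilterOutputStream {
    EncryptingOutputStream(OutputStream out) { super(out); }

    protected abstract void encrypt(byte[] data);

    @Override
    public void write(int b) throws IOException {
        write(new byte[] { (byte) b }, 0, 1);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        byte[] copy = Arrays.copyOfRange(b, off, off + len); // leave the caller's array untouched
        encrypt(copy);
        out.write(copy, 0, len);
    }
}

abstract class DecryptingInputStream extends FilterInputStream {
    DecryptingInputStream(InputStream in) { super(in); }

    protected abstract void decrypt(byte[] data, int off, int len);

    @Override
    public int read() throws IOException {
        int value = in.read();
        if (value == -1) return -1;
        byte[] one = { (byte) value };
        decrypt(one, 0, 1);
        return one[0] & 0xFF;
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        int n = in.read(b, off, len);
        if (n > 0) decrypt(b, off, n); // only the bytes actually read
        return n;
    }
}

class XOREncryptingOutputStream extends EncryptingOutputStream {
    private final Random random;

    XOREncryptingOutputStream(OutputStream out, long key) {
        super(out);
        random = new Random(key); // the key seeds the mask sequence
    }

    @Override
    protected void encrypt(byte[] data) {
        // One nextInt call per byte keeps the mask independent of chunk sizes.
        for (int i = 0; i < data.length; i++) data[i] ^= (byte) random.nextInt(256);
    }
}

class XORDecryptingInputStream extends DecryptingInputStream {
    private final Random random;

    XORDecryptingInputStream(InputStream in, long key) {
        super(in);
        random = new Random(key);
    }

    @Override
    protected void decrypt(byte[] data, int off, int len) {
        for (int i = off; i < off + len; i++) data[i] ^= (byte) random.nextInt(256);
    }
}

class XorStreamDemo {
    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream out = new XOREncryptingOutputStream(sink, 1234567890L)) {
            out.write("attack at dawn".getBytes(StandardCharsets.UTF_8));
        }
        try (InputStream in = new XORDecryptingInputStream(
                new ByteArrayInputStream(sink.toByteArray()), 1234567890L)) {
            // readAllBytes needs Java 9 or later.
            System.out.println(new String(in.readAllBytes(), StandardCharsets.UTF_8));
        }
    }
}
```
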
As an example, putting all this together I compressed then encrypted (with the key 1234567890) a web page and put it on my web site at http://www.drpaulcarter.com/mystery. Let's look at how to use the classes discussed here to read the web page. We need to create a chain of InputStreams that read the web page, decrypt it and then uncompress it. We can do that with the following code:

new GZIPInputStream(
        new XORDecryptingInputStream(
        new URL(url).openStream(), key))
The url variable is a string containing "http://www.drpaulcarter.com/mystery". Then we only need to read from the GZIPInputStream as we would any InputStream. Here's a sequence diagram showing what happens as data is read:

To download and display the web page type:
java test.DecryptingBrowser http://www.drpaulcarter.com/mystery 1234567890
An important point to realize is that the InputStream reading the data from the web server knows nothing about compression or encryption. However, we have dynamically added these capabilities by chaining it together with decorators that provide them. This is the power of the decorator pattern.
I originally developed this example back in 1998 when I was teaching Java. This was in the Java 1.2 days. In creating this presentation, I discovered that Java 1.4 added encryption streams, javax.crypto.CipherInputStream and javax.crypto.CipherOutputStream. They both use a javax.crypto.Cipher object to do the encryption and decryption. This is yet another pattern, the Strategy pattern.
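
For comparison, here is a small sketch of the standard cipher streams doing a round trip in memory. The class name CipherStreamDemo, the choice of AES in CBC mode, and the fixed key and IV are all assumptions made for illustration; a fixed key and IV like this should not be used for real data.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import javax.crypto.Cipher;
import javax.crypto.CipherInputStream;
import javax.crypto.CipherOutputStream;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

class CipherStreamDemo {
    // Builds the Strategy object: a Cipher configured for one direction.
    static Cipher cipher(int mode, byte[] key, byte[] iv) throws Exception {
        Cipher c = Cipher.getInstance("AES/CBC/PKCS5Padding");
        c.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        return c;
    }

    // Encrypts through a CipherOutputStream, then decrypts through a CipherInputStream.
    static byte[] roundTrip(byte[] plain, byte[] key, byte[] iv) throws Exception {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream out = new CipherOutputStream(sink, cipher(Cipher.ENCRYPT_MODE, key, iv))) {
            out.write(plain);
        }
        try (InputStream in = new CipherInputStream(
                new ByteArrayInputStream(sink.toByteArray()),
                cipher(Cipher.DECRYPT_MODE, key, iv))) {
            ByteArrayOutputStream result = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                result.write(buffer, 0, n);
            }
            return result.toByteArray();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] key = "0123456789abcdef".getBytes(StandardCharsets.US_ASCII); // 16 bytes, demo only
        byte[] iv = new byte[16];                                            // all zeros, demo only
        byte[] plain = "attack at dawn".getBytes(StandardCharsets.UTF_8);
        System.out.println(new String(roundTrip(plain, key, iv), StandardCharsets.UTF_8));
    }
}
```

The streams themselves contain no encryption logic; swapping the Cipher passed to them changes the algorithm without touching the decorators.
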
The full example code of the classes described here can be found here.

Recommended books:


Monday, December 22, 2008

Jar files and indexing

Recently at work, I ran into an interesting "feature" of how jar files and indexing work with Java and ant.

For background on indexing see:
http://closingbraces.net/2007/05/13/jarclasspathandindex/

On some other developers' machines, some Java applications were failing with NoClassDefFoundError exceptions. However, I personally never saw this error. The jars with the classes that were not being found were listed in the Class-Path entry of the manifest of the jar that used them. I was extremely puzzled by this. I could fix the problem by explicitly adding the jars to the classpath, but I didn't understand why I needed to do this when everything worked fine on my machine.

From the title of this post you can probably guess that the problem involved indexing. Our build process (using ant) creates a jar (let's call it myjar.jar) with the <jar> task, then runs "jar -i" on it as an external command to index it. If all this happens, everything works fine. However, there seems to be a race condition between when ant closes the jar file that it created with the <jar> task and when the "jar -i" command is run. When the race is lost, the "jar -i" command fails because the file is still locked, and the index is not created.

I will go into exactly why this caused the error in a minute, but first I want to look at how I first tried to fix this, because it also brings up an important point about the problem. The <jar> task has an optional attribute named "index" which can be used to have ant index the file. I tried using this instead of calling "jar -i" on it, but it did not fix the problem. Why didn't it? Because setting index="true" in the <jar> task does not do the same thing as running "jar -i" on the jar! This is very unintuitive! So what is the difference? Using the <jar> task, only the classes in the jar itself are added to the index; however, using "jar -i", the classes in the jar and all the classes used by the jar are added to the index.
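
One way to see the difference is to list which jars an index covers. The sketch below assumes the layout described in the JAR file specification: a version header, then blank-line-separated sections, each starting with a jar name followed by the packages it contains. JarIndexSections and the sample index text are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class JarIndexSections {
    // Returns the jar names that head each section of an INDEX.LIST file's text.
    static List<String> indexedJars(String indexText) {
        List<String> jars = new ArrayList<>();
        boolean sectionStart = false;
        for (String line : indexText.split("\\r?\\n", -1)) {
            if (line.isEmpty()) {
                sectionStart = true;   // a blank line ends the header or a section
            } else if (sectionStart) {
                jars.add(line);        // the first line of a section names the jar
                sectionStart = false;
            }
        }
        return jars;
    }

    public static void main(String[] args) {
        // Hypothetical index text with one section per jar.
        String index = "JarIndex-Version: 1.0\n\n"
                + "myjar.jar\nmyjar\n\n"
                + "dependjar.jar\ndependjar\n";
        System.out.println(indexedJars(index)); // prints [myjar.jar, dependjar.jar]
    }
}
```

Going by the behaviour described above, an index written by ant's index="true" would be expected to list only the jar itself, while one written by "jar -i" would also list the jars it depends on.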

Why does this make a difference? The key is in the second link above. Quoting:
The executable jar must either not have a META-INF/INDEX.LST file, or if it does have such an INDEX.LST file this needs to list the contents of both the executable jar and all of the other jars as well. Anything not in this list will not be found on the class path, regardless of the “Class-Path” entry in the manifest...

Two things to note: the filename I see is META-INF/INDEX.LIST (not LST), and the link goes on to say:
and regardless of any command-line “-classpath” (which is ignored anyway).
This did not apply in my case. We are using java -cp file.jar file.MainClass to run our apps, not java -jar file.jar. In a nutshell, if the jar has an index, its Class-Path entry is ignored.
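
When chasing this kind of problem, it helps to check quickly whether a given jar carries an index and what its manifest says. The following is a small diagnostic sketch using java.util.jar; JarIndexCheck and its output format are made up for illustration.

```java
import java.io.IOException;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

class JarIndexCheck {
    // Reports whether the jar has META-INF/INDEX.LIST and what its Class-Path entry is.
    static String describe(String jarPath) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            boolean indexed = jar.getEntry("META-INF/INDEX.LIST") != null;
            Manifest manifest = jar.getManifest(); // null if the jar has no manifest
            String classPath = manifest == null ? null
                    : manifest.getMainAttributes().getValue(Attributes.Name.CLASS_PATH);
            return "index=" + indexed + ", Class-Path=" + classPath;
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: java JarIndexCheck myjar.jar jacorb.jar ...
        for (String path : args) {
            System.out.println(path + ": " + describe(path));
        }
    }
}
```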

In my case, our application uses jacorb.jar (from JacORB). It uses two other jars: logkit-1.2.jar and avalon-framework-4.1.5.jar. These are the two jars I had to add to the classpath to fix the problem. My jar has jacorb.jar listed in its Class-Path entry.

If everything builds properly and myjar.jar is indexed using "jar -i", everything works because entries for all the classes used by myjar.jar are included in the index (including the ones in logkit-1.2.jar and avalon-framework-4.1.5.jar). The Class-Path entry is ignored since the index is present.

However, if the "jar -i" command fails, then the Class-Path entry is in effect and classes in jacorb.jar can be resolved, but not the classes that jacorb.jar uses. Why not? Because jacorb.jar was indexed using ant. So, it has an index, but it only contains its own classes, not the ones in the other two jars.

If you want to see this problem yourself, you can download the code I used to verify this from here. It has 3 jars: myjar.jar, dependjar.jar and depend2jar.jar. myjar uses dependjar, which uses depend2jar. This is equivalent to myjar.jar, which uses jacorb.jar, which uses logkit-1.2.jar.

The ant script by default will build the jars without indexing and use jar -i to create indexed versions of the jars named imyjar.jar, idependjar.jar and idepend2jar.jar.
If you run ant with -Dindex="yes" it will index the plain jars (myjar.jar, dependjar.jar and depend2jar.jar) using ant's index method.
To recreate the problem I saw, type:
ant
ant build.dependjar -Dindex="yes"
java -cp myjar.jar myjar.MyClass
The first line builds all the plain jars with no indexing. The second line replaces dependjar.jar with one built using ant's index function (just like jacorb.jar). The last line demonstrates the error that occurs. You can also verify that adding dependjar.jar to the classpath doesn't fix the error.

In general I see several ways to fix this problem.
  1. Avoid using indexing
  2. Put all the jars on the classpath
  3. Fully index all jars with "jar -i"
Hopefully this post will help anyone else running into this problem.