java – unitstep.net (http://unitstep.net)
the home of peter chng

Java PhantomReferences: A better choice than finalize()
http://unitstep.net/blog/2018/03/10/java-phantomreferences-a-better-choice-than-finalize/
Sun, 11 Mar 2018 01:34:40 +0000

We’ve talked about soft and weak references, and how they differ from strong references. But what about phantom references? What are they useful for?

Starting with Java 9, the intended usage of PhantomReference objects is to replace any usage of Object.finalize() (which was deprecated in Java 9), so that this sort of object clean-up code can run in a more predictable manner (as designated by the programmer), rather than being subject to the constraints and implementation details of the garbage collector.

How to use them

The basic usage is to create a PhantomReference that wraps some object reference. However, the get() method will always return null for a PhantomReference, so what can this object even be used for?

Firstly, creating a phantom reference does not make the object phantom-reachable. (The same applies to weak and soft references.) The object must first lose all (strong) references to it, and then sometime after that, the JVM will determine it’s phantom-reachable.

This is why you must register a phantom reference with a ReferenceQueue in order for it to be useful. This is indicated by the constructor signature, and the accompanying Javadoc:

“It is possible to create a phantom reference with a null queue, but such a reference is completely useless: Its get method will always return null and, since it does not have a queue, it will never be enqueued.”

The phantom reference will then be enqueued in the reference queue once the JVM determines that the referent is only phantom-reachable (that is, it has no strong, soft, or weak references), and this serves as a notification of the reachability change.

Alternative to finalize

Using a PhantomReference along with a ReferenceQueue can allow you to be notified when an object has been finalized by the GC, and thus allow you to perform any necessary clean-up action.

In fact, starting with Java 9, Object.finalize() has been deprecated, recognizing what many Java developers have known for some time: implementing finalize() can lead to error-prone code. finalize() is called on a JVM-managed thread rather than on one of your application’s threads, which introduces concurrency concerns, and an improper finalize() method could leak a reference to the object itself, preventing it from being GC’d.

Before Java 9, PhantomReference didn’t fully address all of these concerns, but there were changes made to bridge this gap. Check out the notable differences in the Java 8 vs Java 9 docs:

Java 8 PhantomReference:
– Phantom references are most often used for scheduling pre-mortem cleanup actions in a more flexible way than is possible with the Java finalization mechanism.
– The Javadoc states: “Unlike soft and weak references, phantom references are not automatically cleared by the garbage collector as they are enqueued. An object that is reachable via phantom references will remain so until all such references are cleared or themselves become unreachable.”

Java 9 PhantomReference:
– Phantom references are most often used to schedule post-mortem cleanup actions.
– There’s no mention of them not being automatically cleared by the garbage collector.
– The Javadoc states: “Suppose the garbage collector determines at a certain point in time that an object is phantom reachable. At that time it will atomically clear all phantom references to that object and all phantom references to any other phantom-reachable objects from which that object is reachable. At the same time or at some later time it will enqueue those newly-cleared phantom references that are registered with reference queues.”

This means that in Java 9, PhantomReference objects are enqueued at a later stage (a change from pre-mortem to post-mortem), and the PhantomReference itself no longer prevents garbage collection of the object, which could otherwise create a resource leak. Previously, this was not the case.

A simple example

Let’s take a look at a simple and contrived example, my favourite kind of example. (An example is also at: https://ideone.com/I8f4U3)

package net.unitstep.examples.references;

import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 * @author Peter Chng
 */
public class PhantomReferenceExample {
  private static final Logger LOGGER = LoggerFactory.getLogger(PhantomReferenceExample.class);

  // Just so we have a non-primitive, non-interned object that will be GC'd.
  public static class SampleObject<T> {
    private final T value;

    public SampleObject(final T value) {
      this.value = value;
    }

    @Override
    public String toString() {
      return String.valueOf(this.value);
    }
  }

  public static class PhantomReferenceMetadata<T, M> extends PhantomReference<T> {
    // Some metadata stored about the object that will be used during some cleanup actions.
    private final M metadata;

    public PhantomReferenceMetadata(final T referent, final ReferenceQueue<? super T> q,
        final M metadata) {
      super(referent, q);
      this.metadata = metadata;
    }

    public M getMetadata() {
      return this.metadata;
    }
  }

  public static void main(final String[] args) {
    // The object whose GC lifecycle we want to track.
    SampleObject<String> helloObject = new SampleObject<>("Hello");

    // Reference queue that the phantom references will be registered to.
    // They will be enqueued here when the appropriate reachability changes are detected by the JVM.
    final ReferenceQueue<SampleObject<String>> refQueue = new ReferenceQueue<>();

    // In this case, the metadata we associate with the object is some name.
    final PhantomReferenceMetadata<SampleObject<String>, String> helloPhantomReference =
        new PhantomReferenceMetadata<>(helloObject, refQueue, "helloObject");

    new Thread(() -> {
      LOGGER.info("Starting ReferenceQueue consumer thread.");
      final int numToDequeue = 1;
      int numDequed = 0;
      while (numDequed < numToDequeue) {
        // Unfortunately, need to downcast to the appropriate type.
        try {
          @SuppressWarnings("unchecked")
          final PhantomReferenceMetadata<SampleObject<String>, String> reference =
              (PhantomReferenceMetadata<SampleObject<String>, String>) refQueue.remove();
          // At this point, we know the object referred to by the PhantomReference has been finalized.
          // So, we can do any other clean-up that might be allowed, such as cleaning up some temporary files
          // associated with the object.
          // The metadata stored in PhantomReferenceMetadata could be used to determine which temporary files
          // should be cleaned up.
          // You probably shouldn't rely on this as the ONLY method to clean up those temporary files, however.
          LOGGER.info("{} has been finalized.", reference.getMetadata());
        } catch (final InterruptedException e) {
          // Just for the purpose of this example.
          break;
        }
        ++numDequed;
      }
      LOGGER.info("Finished ReferenceQueue consumer thread.");
    }).start();

    // Lose the strong reference to the object.
    helloObject = null;

    // Attempt to trigger a GC.
    System.gc();
  }
}

In this straightforward example, we:

1. Create an object.
2. Create a ReferenceQueue for the JVM to use.
3. Wrap that object in a PhantomReference-derived class, attach some metadata to it, and register it with the ReferenceQueue.
4. When the JVM detects the appropriate reachability changes (i.e., there’s no longer a strong reference to helloObject, and it’s only phantom-reachable), it will enqueue the phantom reference object into the reference queue.
5. We create a separate thread to monitor the reference queue, and could do some clean-up associated with the object here.

1. If you get a reference to the object via Reference.get(), you could leak it outside of the thread, causing a resource leak.
2. You might also be invoking a method on the object while it is being finalized, (since the GC runs in a separate thread), introducing concurrency issues.

Neither of these issues arises when using a PhantomReference, since its get() method always returns null (though you could use reflection to get access to the object reference, a horrible idea), and PhantomReferences are enqueued after the object has been finalized.

However, with soft and weak references, get() could still return a reference to the object. During clean-up, when dequeuing the reference from the ReferenceQueue, I would not use this approach for the reasons above.

Instead, extend the appropriate reference class and attach some additional metadata (in the form of an object that does not have a reference to the original object) that can be used to perform the clean up, as in the example above.

For example, you could extend PhantomReference to store the name of some temporary file that could be removed when the object is no longer reachable. Then, you could retrieve this file name from the PhantomReference child class and use that, rather than getting it directly from the original object.
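To make the temporary-file idea concrete, here is a minimal sketch. The names TempFileCleanup, TempFileReference and drain() are my own and purely hypothetical. To keep the demo deterministic, it calls Reference.enqueue() by hand instead of waiting for the GC, which in real code would enqueue the reference some time after the owning object becomes unreachable.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical names throughout; this is an illustration, not a library API.
class TempFileCleanup {

    // Remembers which temporary file belongs to the referent.
    // It stores only the path and never a reference to the referent itself.
    static final class TempFileReference extends PhantomReference<Object> {
        private final Path tempFile;

        TempFileReference(Object referent, ReferenceQueue<Object> queue, Path tempFile) {
            super(referent, queue);
            this.tempFile = tempFile;
        }

        Path getTempFile() {
            return tempFile;
        }
    }

    // Drains whatever is currently in the queue, deleting the associated files.
    // Returns the number of references processed.
    static int drain(ReferenceQueue<Object> queue) {
        int processed = 0;
        Reference<?> ref;
        while ((ref = queue.poll()) != null) {
            Path file = ((TempFileReference) ref).getTempFile();
            try {
                Files.deleteIfExists(file);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            processed++;
        }
        return processed;
    }

    public static void main(String[] args) throws IOException {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object owner = new Object();
        Path tempFile = Files.createTempFile("phantom-demo", ".tmp");
        TempFileReference ref = new TempFileReference(owner, queue, tempFile);

        // Normally the GC enqueues the reference after 'owner' becomes unreachable.
        // Assumption for this demo only: we enqueue it manually.
        owner = null;
        ref.enqueue();

        System.out.println("processed: " + drain(queue));
        System.out.println("file still exists: " + Files.exists(tempFile));
    }
}
```

Note that the clean-up code only ever touches the stored Path; it has no way to reach the original object, which is the point of the approach.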

Avoid complexity

In general, having your clean-up code rely on some finalization mechanism (whether it be the finalize() method, or using PhantomReferences) should not be your first option. This is because you’d be relying on certain JVM GC behaviour in order to enforce your application logic, which isn’t a good idea in my mind. Relying on clean-up to take place when some object goes out-of-scope is a fragile linkage.

Prefer using an explicit (i.e. within your own application’s logic) clean-up mechanism, if possible. This is likely to be simpler. This is reinforced by the deprecation note in Java 9 for Object.finalize(): “Classes whose instances hold non-heap resources should provide a method to enable explicit release of those resources, and they should also implement AutoCloseable if appropriate”.
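As a rough illustration of the explicit approach, here is a small sketch of a class that implements AutoCloseable and is released via try-with-resources. TempBuffer is a hypothetical class invented for this example, and the "resource" is only simulated with a boolean flag.

```java
// Hypothetical resource holder; the names are illustrative only.
class TempBuffer implements AutoCloseable {
    private boolean closed = false;

    void write(String data) {
        if (closed) {
            throw new IllegalStateException("already closed");
        }
        // A real implementation would write to some non-heap resource here.
    }

    boolean isClosed() {
        return closed;
    }

    @Override
    public void close() {
        // Idempotent: calling close() more than once is harmless.
        closed = true;
    }

    public static void main(String[] args) {
        TempBuffer buffer = new TempBuffer();
        // The resource is closed when the try block exits, normally or not.
        try (TempBuffer b = buffer) {
            b.write("hello");
        }
        System.out.println("closed after try: " + buffer.isClosed()); // prints "closed after try: true"
    }
}
```

The release happens at a point the application controls, with no dependence on when (or whether) the GC runs.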

I’d only use the above approach (PhantomReference/ReferenceQueue) as a fallback to complement an explicit clean-up mechanism, and would not rely on it unless absolutely necessary – as you can tell, it’s a little complicated. Prefer simplicity.

By contrast, a weak reference doesn’t prevent the garbage collector from clearing the referred object. That is, if the only references that remain to an object are weak references, that object will be treated as if there are no strong references to it, and thus it will be cleared by the GC on its next run. Weak references are implemented mainly by the WeakReference “wrapper” and WeakHashMap, the latter of which only maintains weak references to the keys, so that once the keys are inaccessible (i.e. no strong references exist), the WeakHashMap will automatically drop/remove the corresponding entries, which can also make the value objects eligible for GC as long as there are no other references to them.

But how do these references contrast with a soft reference?

Soft References vs. Weak References

The basic difference between a soft reference and a weak reference is how aggressively the garbage collector will attempt to clear them. An object that has only weak references is treated by the GC no differently than an object with no references at all; that is, the GC would clear these objects and reclaim the memory (through the object life cycle) as soon as it sees fit.

A soft reference is treated slightly differently. According to the Javadoc, a soft reference is only cleared by the garbage collector in response to memory demand or need. This would seem to imply some sort of algorithm is used to determine when soft references should be cleared. The Javadoc is a little vague here, with the only requirement that all soft references be cleared before an OutOfMemoryError is thrown, with only a suggestion that VMs first clear out older soft references before clearing out newer ones.

Caches

Both weak references and soft references can be used to build caches, although the utility of such a cache (as opposed to a more traditional size-based cache) is debatable.

A weak-reference based cache is seen in WeakHashMap. A typical use case would be:

1. You have a lot of large objects that you want to associate with a certain key, and hence store in a map.
2. You’d like these large objects to be automatically removed/GC’d when the key is no longer reachable, rather than having to make an explicit remove() call.

A WeakHashMap can accomplish this; when the keys are no longer reachable, the associated entry will be removed, and the value object will be GC’d, provided there are no other references to it.
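A small sketch of that behaviour follows. WeakHashMapDemo is a hypothetical name, and because System.gc() is only a request, the second size printed is not guaranteed; the loop simply gives the collector a few chances.

```java
import java.util.Map;
import java.util.WeakHashMap;

class WeakHashMapDemo {
    public static void main(String[] args) throws InterruptedException {
        Map<Object, byte[]> cache = new WeakHashMap<>();

        // The map holds the key weakly and the value strongly.
        Object key = new Object();
        cache.put(key, new byte[1024 * 1024]);
        System.out.println("size while key is held: " + cache.size());

        // Drop the only strong reference to the key.
        key = null;

        // Ask for a GC a few times; the JVM is free to ignore the request,
        // so the entry may or may not be gone afterwards.
        for (int i = 0; i < 10 && !cache.isEmpty(); i++) {
            System.gc();
            Thread.sleep(50);
        }
        System.out.println("size after dropping key: " + cache.size());
    }
}
```

On a typical JVM the entry disappears after the first collection, but the timing is entirely up to the GC.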

Soft References, on the other hand, won’t be garbage collected until there is memory demand. This can be useful to build a cache of objects that gets automatically expired in response to memory pressure. Guava’s CacheBuilder offers an option with this strategy.

However, both the weak and soft reference-based approach to caches are less predictable than a traditional size-based cache, as the expiration policy is now governed by GC behaviour. You will also have to make sure that the objects you wish to cache are being compared via object identity, and not object equality, since the garbage collector depends on object identity.

Conclusion

Soft and weak references are similar in that if an object only has a soft/weak reference pointing at it, this alone won’t prevent it from being garbage collected. However, objects with weak references will be GC’d in the same way as an object which has no references, while objects with soft references will be GC’d at the JVM’s discretion, usually in response to memory demand.

Because of the additional factor of GC behaviour, I would not recommend using a cache based on soft references unless you had a very specific use case for it. Instead, a typical maximum-size cache would probably be a more sensible default if one needed a cache. (See Guava’s CacheBuilder for examples of this.)

In a following article, we’ll look at PhantomReference and how it might be a better choice than overriding finalize() to schedule cleanup actions.

I didn’t like that and decided to code up a Java class generator (it’s written in Python) based on the Data Type Reference. You can use it to generate a Java class based on the SoftLayer Data Type you pass in. Enjoy!

Here’s how I achieved a similar result in Spring Web MVC. (Note: the following examples were done with Spring 3.2.1)

Built-in?

Spring does provide some built-in support for conversion to specific types. For example, you can convert to Date and various numeric types. But what if you want to convert a request parameter to some other custom type or a type from a third party? (Such as the date-time classes from Joda Time)

Minimize configuration

I searched for a bit and found various solutions, but they all seemed to require too much configuration: in addition to writing a converter class, you’d have to manually register the converter with a ConversionService (either in code or in XML configuration). I didn’t like the idea of having to register the converter class; instead, I wanted it to be registered automatically based on some annotations.

Modify controller method to use the proper type

Note that if conversion fails (via an uncaught exception from the convert() method) then the client will see a 400 Bad Request response.

@RequestMapping(value = "widget/{date}", method = RequestMethod.GET)
@ResponseBody
public String getWidgetDate(@PathVariable("date") final LocalDate date) {
  // We get auto-conversion to a LocalDate type...
  // Just spits back the date to the client.
  return date.toString();
}

Summary

I hope you found this useful. With just a little bit of work, we have a bean that will auto-register and make available any new converters you define, as long as you annotate the converter properly.

The final keyword – classes and methods

When used on a class, final prevents the class from being extended. Similarly, when final is used on a method, that method cannot be overridden in any subclass.

This is mainly done for reasons of security and consistency: If you have a class whose methods you count on to provide predictable results, you may want to prevent that class from being extended so that someone cannot substitute a subclass that behaves differently to a consumer that expects your original class. The same may go for certain methods on a class.

But generally, you should not be marking classes/methods as final unless you have a compelling argument for it. The reason is that using final may unnecessarily constrain the design and hamper future developers who may have a legitimate reason for wanting to extend your class. Though I prefer composition over inheritance, it’s not a choice I would want to force on everyone else, nor do I believe it’s always the right choice.

In particular, you should not be marking classes/methods as final just for some perceived performance benefit. In many cases you won’t gain a thing performance-wise, but will unnecessarily be constraining your design.

Mockability is not a reason to refrain from marking a method as final; mocking frameworks like JMockit allow you to easily mock out final methods, so don’t let that affect your design decisions.

The final keyword – variables

When used on variables, final essentially means that the variable can only be assigned once. For class/static members, this is when the class is loaded; for instance members, this is when an instance of the class is created; for local variables, it is usually when the variable is declared. Method parameters can also be declared as final.

As opposed to the use of final on classes/methods, I usually like using final for most variables, as it can decrease the complexity of code by decreasing the chances for state changes. Once you know that a local variable is final, if it’s primitive, you can be sure its value won’t change; if it’s an object, you can be sure you’ll always be dealing with the same object.

Furthermore, the use of final on member variables ensures that you know they’ll be assigned once and only once: At object creation.
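Here is a short sketch of these points; FinalVariableDemo is a hypothetical class made up for illustration. It also shows a caveat worth keeping in mind: final fixes which object a variable refers to, not the state of that object.

```java
import java.util.ArrayList;
import java.util.List;

class FinalVariableDemo {
    // A final instance member: assigned exactly once, at object creation.
    private final List<String> names;

    FinalVariableDemo() {
        this.names = new ArrayList<>();
    }

    List<String> getNames() {
        return names;
    }

    public static void main(final String[] args) {
        // A final local primitive: its value cannot change.
        final int count = 3;
        // count = 4; // would not compile: cannot assign a value to final variable

        // A final local reference: it will always point at the same object...
        final FinalVariableDemo demo = new FinalVariableDemo();
        // ...but the object itself can still be mutated.
        demo.getNames().add("first");

        System.out.println(count + " " + demo.getNames()); // prints "3 [first]"
    }
}
```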

Some people may think that the use of final variables is superfluous, but I think it’s a good defensive programming technique. This article sums up my viewpoints nicely.

finally

As opposed to the final keyword, which is completely optional and not strictly necessary to use, you should almost always be using the finally keyword in Java when dealing with closeable resources. (This is true before Java 7, as Java 7 introduces some nice syntactic sugar that can remove the need to use finally)
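A minimal sketch of the pre-Java 7 pattern is below. FinallyExample and readFirstByte() are hypothetical names of mine, and the temporary file in main() exists only so the example can run on its own.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class FinallyExample {
    // Returns the first byte of the file, or -1 if the file is empty.
    static int readFirstByte(final String path) throws IOException {
        InputStream inputStream = null;
        try {
            inputStream = new FileInputStream(path);
            return inputStream.read();
        } finally {
            // Runs whether read() returned normally or threw.
            // The null check covers the case where opening the stream failed.
            if (inputStream != null) {
                inputStream.close();
            }
        }
    }

    public static void main(final String[] args) throws IOException {
        // Set up a one-byte temporary file, again closing the stream in finally.
        final File file = File.createTempFile("finally-demo", ".bin");
        OutputStream out = null;
        try {
            out = new FileOutputStream(file);
            out.write(42);
        } finally {
            if (out != null) {
                out.close();
            }
        }
        System.out.println(readFirstByte(file.getPath())); // prints 42
        file.delete();
    }
}
```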

In this example, we initialize the inputStream within a try block. If anything goes wrong with reading from the stream, an exception will be thrown (which we don’t handle here, for brevity) but before it is allowed to propagate, the code within the finally block will be executed, ensuring that if the stream has been opened, it will be closed. This will prevent a resource leak from happening.
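With Java 7 and later, the same kind of task can be written with the syntactic sugar mentioned earlier. This is only a sketch, with TryWithResourcesExample and readFirstByte() as hypothetical names:

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

class TryWithResourcesExample {
    // Returns the first byte of the file, or -1 if the file is empty.
    static int readFirstByte(final String path) throws IOException {
        // The stream is declared in the try header and closed automatically
        // when the block exits, whether normally or via an exception.
        try (InputStream inputStream = new FileInputStream(path)) {
            return inputStream.read();
        }
    }

    public static void main(final String[] args) throws IOException {
        // Temporary one-byte file so the example is self-contained.
        final Path file = Files.createTempFile("twr-demo", ".bin");
        Files.write(file, new byte[] {42});
        System.out.println(readFirstByte(file.toString())); // prints 42
        Files.delete(file);
    }
}
```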

As you can see, the amount of “boilerplate” is reduced as we don’t need to have an explicit finally block anymore. This is known as a “try-with-resources” statement, where an object that implements Closeable/AutoCloseable is declared and initialized with the try statement and automatically closed no matter what the outcome of the statements within the try block. I would suggest using this unless your code needs to be JDK 6 compatible.

finalize()

As opposed to final, which you may use, and finally, which you will certainly have to use in JDK 6 and below, you will almost never need to implement or override Object.finalize().

The finalize() method is called when the garbage collector (GC) determines that there is no longer any way for the object to be accessed; this means the object is eligible for finalization, which means the GC can begin the process of making the object finalizable, finalized and then reclaimed. Generally, any object that does not have a strong reference is eligible.

There is generally no need to override the default finalize() method; if your class opens resources then it should be responsible for properly closing/releasing them by using try-catch-finally as outlined above, or you should make your class implement Closeable so that clients/callers of your class can ensure its proper closure.

You could, in theory, use finalize() as a “safety net” of sorts to mitigate the effects of bad clients/callers not calling close() on your class by ensuring that internal resources are closed/released in finalize() but you should never rely on this.

Additionally, if you are overriding finalize() you must take care to ensure that you don’t “leak” a reference to the object to the “outside world”, or else this may prevent the object from being garbage collected, resulting in a potential memory leak! (i.e. don’t pass a reference to this to any external objects during finalize())

In general, if you are going to implement/override finalize(), you should have a very specific reason and be very careful in how you do so.

The final word

I hope you enjoyed this summary of final, finally and finalize(). Learning these concepts is pretty fundamental to having a good understanding of Java as a whole.

This isn’t so much of an issue for @PathParam parameters, (since you won’t even get to the proper resource method without a matching URI) but it does affect @HeaderParam and @QueryParam (among others) since they aren’t needed for Jersey to determine which resource method to invoke. By that definition, they are implicitly optional. There should be a way to make them required.

The behaviour of such a required annotation might be as follows:

– If the request does not have the parameter, then by default a Response with Status.BAD_REQUEST (HTTP 400) would be returned to the client.
– Some way of customizing the HTTP response code and message should also be provided.

Right now, there’s not really an elegant way to make something like a @HeaderParam required. Here are some solutions I’ve tried.

Attempt #1: Parameter classes

Parameter classes can be useful for transforming the single input of a parameter into a single output, and also for verifying that the input parameter value is valid. This can be useful for ensuring that an input parameter can be converted into a specific object, or that it matches a specific format.

As an example, consider the CsvListParam class. This class takes a comma-separated list as a parameter, and returns a List<String> comprising each entry in the list. It does an additional check to ensure that the input is not blank, according to StringUtils.isBlank(). If it is blank, a 400 Bad Request response is returned to the client via the WebApplicationException thrown.

However, if the parameter is not present at all – for example, if this was obtained via the @HeaderParam annotation and the header was not present – then the code in the parameter class will not even be invoked by Jersey. Instead, the value will simply be null. If we require this parameter, it results in nasty if-else null-checking code in each of our resource methods that requires the parameter.

We need something that gets rid of the necessary null-checking in each resource method.

Attempt #2: Injection Providers

Coda Hale provides another brilliant example of how to use Injection Providers in Jersey. Basically, with Injection Providers, you can do everything that you could do with Parameter classes, and more.

While Parameter classes are useful only for single-input to single-output mapping, Injection Providers can map multiple inputs to single or multiple outputs. This could allow you to take multiple values in the HTTP request and use them to populate a single Bean, or use them to form some more complex single value. It also allows you to do validation, as throwing a WebApplicationException from an Injection Provider will cause the contained Response or status to be properly returned to the client.

So, we can achieve a similar effect using an Injection Provider. A first attempt at resolving the issue yields the CsvListProvider class:

However, while this works as expected, it has the unfortunate side effect that the source of the parameter (an HTTP header of “X-TEST”) has to be specified in the Provider class rather than on the annotation. This isn’t ideal since we have to create a new Injection Provider class for each HTTP header we want to make required.

Further attempts

I have been trying to figure out a solution to this. One possible way might be to change the AbstractInjectableProvider to the following declaration:

We could then define a custom annotation type to take the place of A instead of always using @Context. However, this doesn’t work, as we have no way of then obtaining any of the annotation’s values in the concrete Provider class. A solution like this would require changes in the core of Jersey to make it work, reducing it to essentially the same thing as having an @Required annotation as proposed above.

Conclusion

It seems like we need a proper solution to this via a change in Jersey. Evidently, others have come to the same conclusion, as there are at least two issues open for Jersey related to this.

Double trouble

The key point to understanding the tricky syntax is to realize that when you’re creating a String literal in Java, backslashes are used to form escape sequences as well. Most people are familiar with this concept, when, for example, constructing a String that spans multiple lines:

final String multiline = "A String...\nOn two lines";

When calling Pattern.compile, you pass in a String literal that is the regular expression. However, regular expressions also use the backslash character to begin escape sequences. So, to ensure that the regular expression engine in Pattern gets the correct syntax, you must replace every backslash in your regular expression with two backslashes. This is to prevent Java from interpreting the single backslash as just a String escape sequence.

Or, put another way, if you wanted a String with the contents "\n", that is a String with a backslash followed by the letter ‘n’, you’d have to define it as:

final String newLineEscapeSequence = "\\n";

This is the gist of it; we need to pass in the preserved backslashes into the Pattern regular expression engine, so you have to create a literal backslash by using a double-backslash in your String literal. This information is in the Pattern Javadoc, but it’s sort of buried beneath loads of regular expression syntax.

Keep this in mind when constructing your regular expressions outside of Java in a tool like RegExr. These principles also apply when using other classes/methods that use Pattern, such as String.split() or Scanner.useDelimiter().

An example

Here’s a simple example where we try to find the word “The” at the beginning of a String, delimited by a word boundary matcher.
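A minimal sketch of such a comparison, with WordBoundaryExample and its two constants as hypothetical names of my own:

```java
import java.util.regex.Pattern;

class WordBoundaryExample {
    // Correct: the source text "\\b" produces the two characters \b,
    // which the regex engine reads as a word boundary.
    static final Pattern CORRECT = Pattern.compile("^The\\b");

    // Incorrect: the source text "\b" is a Java escape for a single
    // backspace character (U+0008), so the regex looks for "The" + backspace.
    static final Pattern INCORRECT = Pattern.compile("^The\b");

    public static void main(final String[] args) {
        final String input = "The quick brown fox";
        System.out.println(CORRECT.matcher(input).find());   // prints true
        System.out.println(INCORRECT.matcher(input).find()); // prints false

        // The word boundary also rejects longer words that merely start with "The".
        System.out.println(CORRECT.matcher("Then what?").find()); // prints false
    }
}
```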

The key point here is that the word boundary matcher (\b) must be passed in as a String literal of "\\b" so that the backslash is properly interpreted. In the incorrect Pattern, "\b" maps to a backspace character literal.

I think the reason this concept is somewhat tricky is that you have to deal with two levels of escaping – the Java String literal syntax and the Regular Expression syntax.

This isn’t going to be an indictment of bad programming; in fact, I think it’s good if you can look back at your old code and see where it could be improved. Such a process suggests that you are continually self-improving, a skill crucial in software development. Besides, all of us have made a mistake or two at times when we were stressed, tired or just plain not thinking straight.

However, there’s one mistake that I’ve seen that I think warrants bringing to light, and that is the misuse of the Flyweight pattern.

Who wants to be a Flyweight?

Flyweight is typically used to describe one of the smaller weight classes in boxing or other fighting sports. This “minimal” aspect is what is shared with the design pattern of the same name. Simply put, a Flyweight object is one that reduces memory use by sharing common data with other objects. Despite this plain definition, implementing the Flyweight pattern can be tricky.

Perhaps this is why I have seen examples like this: (Java pseudo-code below; may not compile, but you shouldn’t use it anyways)

Now, obviously the memory footprint of WidgetWithManyFields can be quite large, and since not all aspects of an application will need access to all data fields, it was decided that a “Flyweight” was needed:

This isn’t really the Flyweight pattern at all. In fact, I don’t even know if it is a pattern at all. It might be considered something like the Proxy pattern, if the “Flyweight” class contained an instance of the regular class. But I don’t really know.

So what is a Flyweight?

Consider the example of a document that can have images embedded in it. There might be multiple copies of the same image present in the document, but each copy would be sized and positioned differently within the document.

In this case, you wouldn’t want to load and store the data in memory for multiple copies of the same image as that would be wasteful. However, each instance of the image displayed in the document might be formatted or positioned differently. How might this be done?

Firstly, some assumptions:

– An image is uniquely identified by some resource path.
– The underlying image data does not change during the lifetime of the application.

With these assumptions, we can define three classes that allow us to implement the Flyweight pattern.

Firstly, an ImageData class that encapsulates the actual image data. There should be only one canonical instance of this class for each unique resource path. Because of this, we can pool these objects for reuse.

However, the ImageData objects won’t be directly used by other parts of the application. Instead, we create an ImageFlyweight class that is manipulated. Each instance contains a reference to a canonical ImageData object and also stores information about how to format and position the image.

In this way, there can be multiple ImageFlyweight instances that reference the same image and hence the same ImageData instance, but each instance would define separate formatting and positioning details.

Tying everything together is a factory (ImageFlyweightFactory) that maintains the pool and is the access point for getting instances of ImageFlyweight.

Below is the code: (Sorry, it’s a lot of code to throw at you at once, but I didn’t feel like breaking it down into separate chunks, and you can just copy & paste it into your favourite IDE for inspection/compilation)

/**
 * Copyright (c) 2012 Peter Chng, http://unitstep.net/
 */
package net.unitstep.examples.flyweight;

import java.util.HashMap;
import java.util.Map;

/**
 * In order for the Flyweight Pattern to be effective, ImageFlyweight instances
 * should only be obtained via ImageFlyweightFactory.getImageFlyweight().
 *
 * This ensures that for each unique resource path, there is only one instance
 * of the backing ImageData existing in the application.
 *
 * @author Peter Chng
 */
public class ImageFlyweightFactory {

  private Map<String, ImageData> imageDataPool =
      new HashMap<String, ImageData>();

  public ImageFlyweight getImageFlyweight(final String resourcePath) {
    // This will return a new ImageFlyweight object each time; however, the
    // backing ImageData might be shared across multiple ImageFlyweight
    // instances.
    return new ImageFlyweight(this.getImageData(resourcePath));
  }

  private ImageData getImageData(final String resourcePath) {
    ImageData imageData = this.imageDataPool.get(resourcePath);
    if (null == imageData) {
      imageData = new ImageData(resourcePath);
      this.imageDataPool.put(resourcePath, imageData);
    }
    return imageData;
  }

  /**
   * @return the current count of ImageData instances in the pool; only for
   * testing purposes.
   */
  public int getImageDataPoolCount() {
    return this.imageDataPool.size();
  }

  /**
   * Will contain the data representing an image loaded from some resource, i.e.
   * the file system.
   *
   * This is a private inner class because it should never need to be used
   * externally by callers. It is considered an implementation detail.
   *
   * We assume that the resource path is the uniquely-identifying aspect of an
   * image and that the underlying image resource/data will not change over the
   * lifetime of the application.
   *
   * Thus, only one instance of the ImageData class is needed for each image
   * uniquely identified by its resource path.
   *
   * @author Peter Chng
   */
  private class ImageData {

    private final byte[] data;
    private final String resourcePath;

    public ImageData(final String resourcePath) {
      this.resourcePath = resourcePath;
      // Image data would be loaded here based on the resource path supplied.
      // For brevity, it's not really done.
      this.data = new byte[] {};
    }

    public byte[] getData() {
      // Note: If we really intend to make this class immutable, we should
      // return a defensive copy instead so that callers cannot modify the
      // data stored in this instance.
      return this.data;
    }

    public String getResourcePath() {
      return resourcePath;
    }

    // Note: Not strictly necessary to override equals() and hashCode() for this
    // example, but it's done to indicate we only consider the resource path
    // in determining equality.
    @Override
    public boolean equals(final Object object) {
      if (null == object) {
        return false;
      }
      if (object == this) {
        return true;
      }
      if (object.getClass() != this.getClass()) {
        return false;
      }
      return this.resourcePath.equals(((ImageData) object).getResourcePath());
    }

    @Override
    public int hashCode() {
      return this.resourcePath.hashCode();
    }
  }

  /**
   * The ImageFlyweight object contains a reference to a canonical ImageData
   * object containing the actual image data we wish to render.
   *
   * By making this a static inner class of {@link ImageFlyweightFactory} and
   * the constructor private, instantiation of this class can be controlled and
   * limited to only the {@link ImageFlyweightFactory}. Callers MUST obtain an
   * instance of the ImageFlyweight through the factory and not by direct
   * instantiation.
   *
   * It also contains other properties that will affect the rendering of the
   * image in the application, such as height, width and position.
   *
   * Reusing the same ImageData object across different ImageFlyweight instances
   * allows us to display the same image in different ways within the
   * application, without having to load (or store in memory) the image data
   * multiple times.
   *
   * @author Peter Chng
   */
  public static class ImageFlyweight {

    private final ImageData imageData;
    private int height;
    private int width;
    private int positionX;
    private int positionY;

    private ImageFlyweight(final ImageData imageData) {
      this.imageData = imageData;
    }

    public byte[] getData() {
      return this.imageData.getData();
    }

    // Getters/setters for height, width, positionX, positionY...
    public int getHeight() {
      return height;
    }

    public void setHeight(int height) {
      this.height = height;
    }

    public int getWidth() {
      return width;
    }

    public void setWidth(int width) {
      this.width = width;
    }

    public int getPositionX() {
      return positionX;
    }

    public void setPositionX(int positionX) {
      this.positionX = positionX;
    }

    public int getPositionY() {
      return positionY;
    }

    public void setPositionY(int positionY) {
      this.positionY = positionY;
    }
  }
}

Everything is contained within the ImageFlyweightFactory class, because the ImageData class does not need to be visible to outsiders and callers should not be able to instantiate ImageFlyweight instances on their own.

With this code, a simple test harness can verify whether it’s working. Its output looks like this:

DEBUG ImageFlyweightTest - Current number of ImageData instances: 1
DEBUG ImageFlyweightTest - Current number of ImageData instances: 2
DEBUG ImageFlyweightTest - Current number of ImageData instances: 2

The key point is that after the third ImageFlyweight object is created, the count in the ImageData pool does not increase since the same image has already been “loaded”.
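A harness along these lines would produce that kind of output. This is only a minimal sketch: the nested Factory, ImageFlyweight and ImageData classes are condensed stand-ins for the listing above (included so the sketch compiles by itself), the resource paths are hypothetical, and System.out is used in place of a logger.

```java
import java.util.HashMap;
import java.util.Map;

public class ImageFlyweightHarness {

  /** Condensed stand-in for ImageData: the shared, intrinsic state. */
  static final class ImageData {
    final String resourcePath;

    ImageData(final String resourcePath) {
      this.resourcePath = resourcePath;
    }
  }

  /** Condensed stand-in for ImageFlyweight: shared data plus per-instance state. */
  static final class ImageFlyweight {
    final ImageData imageData;
    int positionX;
    int positionY;

    ImageFlyweight(final ImageData imageData) {
      this.imageData = imageData;
    }
  }

  /** Condensed stand-in for ImageFlyweightFactory. */
  static final class Factory {
    private final Map<String, ImageData> pool = new HashMap<>();

    ImageFlyweight getImageFlyweight(final String resourcePath) {
      // A new flyweight on every call; ImageData is created once per path.
      return new ImageFlyweight(pool.computeIfAbsent(resourcePath, ImageData::new));
    }

    int getImageDataPoolCount() {
      return pool.size();
    }
  }

  /** Requests one flyweight per path and records the pool size after each. */
  static int[] poolCountsAfterEachRequest(final String... resourcePaths) {
    final Factory factory = new Factory();
    final int[] counts = new int[resourcePaths.length];
    for (int i = 0; i < resourcePaths.length; i++) {
      factory.getImageFlyweight(resourcePaths[i]);
      counts[i] = factory.getImageDataPoolCount();
    }
    return counts;
  }

  public static void main(final String[] args) {
    // Hypothetical paths; the third request repeats the first one.
    final int[] counts =
        poolCountsAfterEachRequest("/images/a.png", "/images/b.png", "/images/a.png");
    for (final int count : counts) {
      System.out.println("Current number of ImageData instances: " + count);
    }
  }
}
```

Each call still hands back a distinct ImageFlyweight, so callers can position the same image independently while the pool holds a single ImageData per path.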

Other examples

Note that Java itself implements something similar to the Flyweight pattern for Strings; this is known as string interning and many other languages support this feature as well.

Basically, because Strings are immutable, Java can store each distinct value in a pool and then reuse these instances when appropriate. As an example, the following code displays “EQUAL”:
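A minimal sketch of such a comparison is given here; the class and method names (InternDemo, compare) are made up for illustration. It also includes the case of a String created with the new keyword, which is a separate object and therefore not reference-equal to the pooled literal.

```java
public class InternDemo {

  /** Compares two Strings by reference (==), not by value. */
  static String compare(final String first, final String second) {
    return first == second ? "EQUAL" : "NOT EQUAL";
  }

  public static void main(final String[] args) {
    final String a = "flyweight";
    final String b = "flyweight";
    // Identical literals refer to the same pooled String instance.
    System.out.println(compare(a, b)); // EQUAL

    // 'new' always creates a distinct object, so the references differ.
    final String c = new String("flyweight");
    System.out.println(compare(a, c)); // NOT EQUAL

    // intern() returns the pooled instance for an equal value.
    System.out.println(compare(a, c.intern())); // EQUAL
  }
}
```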

Note that this doesn’t work if you directly create a String using the new keyword.

Conclusion

I know that this was a fairly contrived example (aren’t they all?), but I hope it provided the basics of the Flyweight pattern to readers. There are a lot of holes and I don’t suggest you directly copy this example for production code, but instead learn the skills to effectively develop the pattern on your own.

As always, I welcome questions or comments and especially corrections if I’ve made a mistake! Thanks for reading!

When calling methods, Java always passes arguments by value. For primitive data types, a copy of the value is put on the call stack before the method is invoked. For objects and arrays, the value that gets copied is a reference: the method receives a way to access/manipulate the same object the caller sees, and no copy of the object is made. What the method cannot do is make the caller’s variable refer to a different object.
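A small sketch of what this means in practice: a method can modify the object it is handed, but reassigning its own parameter has no effect on the caller's variable. The class and method names below are hypothetical.

```java
public class PassByValueDemo {

  /** Modifies the array the caller passed in; the caller sees the change. */
  static void mutate(final int[] values) {
    values[0] = 99;
  }

  /** Only rebinds the local parameter; the caller's variable is untouched. */
  static void reassign(int[] values) {
    values = new int[] {-1};
  }

  /** Increments a copy of the primitive; the caller's variable is untouched. */
  static void increment(int number) {
    number++;
  }

  public static void main(final String[] args) {
    final int[] data = {1};
    mutate(data);
    System.out.println(data[0]); // 99

    reassign(data);
    System.out.println(data[0]); // still 99

    final int number = 5;
    increment(number);
    System.out.println(number); // still 5
  }
}
```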

In that way, references are somewhat like pointers, though they obviously cannot be manipulated by pointer arithmetic. But what about weak references? What are they, and how do they contrast with strong references?

Weakly understood

Based on my experience, the concept of weak references, or more generally reachability, is not one that is well-understood in the Java world. At least I did not have a good grasp of them until stumbling upon some sample code one day. It may be that the need to utilize them is outside the confines of most day-to-day programming tasks, as the concept is fairly low-level. Nonetheless, it’s an important concept to understand.

Basically, Java specifies five levels of reachability for objects that reflect which state the object is in, in relation to being marked as finalizable, being finalized and being reclaimed. They are, in order of strongest-to-weakest:

Strongly Reachable

Softly Reachable

Weakly Reachable

Phantom Reachable

Unreachable

An object’s normal state, as soon as it has been instantiated and assigned to a variable/field, is strongly reachable. Chances are, these are the only types of objects you’ve worked with. We’ll first cover the concept of weakly reachable objects, as I believe it provides a good base for understanding the remainder.

Cleaning out the trash

Going by the API reference, a weakly reachable object is one that can be reached by traversing (i.e. going through) a weak reference. That’s a succinct definition to be sure, but it just raises the next question: What is a weak reference?

Simply put, if an object can only be reached by traversing a weak reference, the garbage collector will not attempt to keep the object in memory any more than it would an object with no references to it, i.e. an object that cannot be accessed. Thus, from the garbage collector’s point of view, a weakly-referenced object will eventually be cleaned from memory the same as an object with no references to it.
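As a rough illustration, here is a sketch using java.lang.ref.WeakReference directly (the class and method names are made up for this example). While a strong reference exists, get() returns the object; once the strong reference is gone, get() will return null at some point after a collection, though a single System.gc() hint does not guarantee when.

```java
import java.lang.ref.WeakReference;

public class WeakReferenceDemo {

  /**
   * Returns true if the referent is still retrievable after a GC hint while
   * a strong reference to it is kept alive.
   */
  static boolean survivesWhileStronglyHeld() {
    final Object strong = new Object();
    final WeakReference<Object> weak = new WeakReference<>(strong);
    System.gc();
    // 'strong' is still in use here, so the referent cannot have been cleared.
    return weak.get() == strong;
  }

  public static void main(final String[] args) {
    System.out.println("Strongly held: " + survivesWhileStronglyHeld());

    Object payload = new Object();
    final WeakReference<Object> weak = new WeakReference<>(payload);
    payload = null; // Only the weak reference remains.
    System.gc();    // A hint only; the JVM decides if and when to collect.
    // Typically prints null, but that is not guaranteed after one hint.
    System.out.println("After dropping the strong reference: " + weak.get());
  }
}
```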

So, if weakly-referenced objects are treated the same as completely non-referenced ones, what is the purpose of the weak reference? A good example is the WeakHashMap, a class provided by Java.

WeakHashMap

The best way to describe a WeakHashMap is as a map whose entries (key-to-value mappings) will be removed when it is no longer possible to retrieve them from the map. For example, say you’ve added an object to the WeakHashMap using a key k1. If you now set k1 to null, and nothing else holds a reference to the key object, there is no way to retrieve the object from the map, since you don’t have the key object around any more to call get() with. This behaviour is possible because WeakHashMap only has weak references to the keys, not strong references like the other Map classes.

This makes it ideal for use as a cache of sorts. A typical use case is to associate keys with large objects that take up a lot of memory. Because the map holds only weak references to its keys, once there is no longer any external reference to a key, its entry will be removed, which also removes the WeakHashMap’s reference to the value object. The value object then becomes eligible for garbage collection too, provided there are no other strong references to it outside of the map.

Note that for the WeakHashMap to work this way, as it was intended, the key objects must only be considered equal if they are actually the same object – i.e. object identity instead of mere equality. This is the default behaviour for Object.equals() and Object.hashCode(), so if these methods have not been overridden, the object is OK to be used as a key in WeakHashMap. Objects like Integer are not suitable for use in WeakHashMap, because it is possible to create two separate (non-identical) objects that are both equal:
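A sketch of the problem follows. The class and method names are hypothetical, and the deprecated Integer constructor is used on purpose because it guarantees two distinct objects (Integer.valueOf() may return cached instances).

```java
import java.util.Map;
import java.util.WeakHashMap;

public class EqualKeysDemo {

  /** Stores a value under one Integer and looks it up with a different, equal one. */
  @SuppressWarnings({"deprecation", "removal"})
  static String lookupWithEqualKey() {
    final Integer first = new Integer(1000);
    final Integer second = new Integer(1000);

    System.out.println("first == second: " + (first == second));          // false
    System.out.println("first.equals(second): " + first.equals(second));  // true

    final Map<Integer, String> map = new WeakHashMap<>();
    map.put(first, "Sample Value");
    // The entry is found through a key object that is not the one stored.
    return map.get(second);
  }

  public static void main(final String[] args) {
    System.out.println(lookupWithEqualKey()); // Sample Value
  }
}
```

Because an equal key can be recreated at any time, losing the original key object does not mean the entry has become unretrievable, which is the assumption WeakHashMap relies on.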

Another point of importance is that String is not a suitable key for a WeakHashMap either. In addition to overriding equals() and hashCode(), String literals (and any String on which intern() has been called) are interned, i.e. stored in a pool by the JVM. This means that they may remain strongly referenced even after you have apparently gotten rid of your reference to them. Because of this, entries that you add to a WeakHashMap using String keys may never get dropped, since the Strings may remain strongly referenced in the string intern pool.

An example of String interning:

final String s1 = "The only thing we have to fear is fear itself.";
final String s2 = "The only thing we have to fear is fear itself.";
LOGGER.debug("s1.equals(s2): " + s1.equals(s2)); // True.
LOGGER.debug("s1 == s2: " + (s1 == s2)); // Also true: identical literals share one interned instance.

String literals are interned for performance reasons: when a literal is loaded, Java first checks whether there is already a String in the pool that is “equal” to it. If such a String exists, the existing object is reused instead of a new object being instantiated. (Strings built at runtime are not added to the pool unless intern() is called on them.) This is possible because Strings in Java are immutable, i.e. operations that appear to modify a String (such as concatenation, toUpperCase(), etc.) really return a new String object while preserving the original.
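The immutability that makes pooling safe can be seen in a short sketch (the class and method names are made up for illustration): an operation such as toUpperCase() leaves the original untouched, and a String assembled at runtime is a separate object until intern() is called on it.

```java
import java.util.Locale;

public class ImmutableStringDemo {

  /** Builds "fear" at runtime so that it is not a compile-time constant. */
  static String buildAtRuntime() {
    return new StringBuilder("fe").append("ar").toString();
  }

  public static void main(final String[] args) {
    final String original = "fear";
    final String upper = original.toUpperCase(Locale.ROOT);
    System.out.println(original); // fear (unchanged)
    System.out.println(upper);    // FEAR (a new String)

    final String built = buildAtRuntime();
    System.out.println(built.equals(original));     // true: same characters
    System.out.println(built == original);          // false: different objects
    System.out.println(built.intern() == original); // true: pooled instance
  }
}
```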

The last usage note is that even though the keys are weakly referenced by WeakHashMap, the values remain strongly referenced. Thus, you must take care not to use value objects that strongly reference their own keys: if they do, the entries will no longer be dropped automatically, because a strong reference to each key will always exist through its value. (This can be avoided by wrapping the value object in a WeakReference, so that both keys and values are weakly referenced while in the WeakHashMap.)
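A sketch of this pitfall, with hypothetical Key and Value classes: the value keeps a strong reference to its key, so the key stays strongly reachable through the map itself and the entry survives garbage collection.

```java
import java.util.Map;
import java.util.WeakHashMap;

public class ValueHoldsKeyDemo {

  static final class Key { }

  /** A value that (unwisely) keeps a strong reference to its own key. */
  static final class Value {
    final Key owner;

    Value(final Key owner) {
      this.owner = owner;
    }
  }

  /** Returns the map size after the only external key reference is dropped. */
  static int sizeAfterDroppingKey() {
    final Map<Key, Value> map = new WeakHashMap<>();
    Key key = new Key();
    map.put(key, new Value(key));

    key = null;  // No external reference to the key remains...
    System.gc(); // ...but map -> entry -> value -> key is a strong chain.
    return map.size();
  }

  public static void main(final String[] args) {
    System.out.println("Entries remaining: " + sizeAfterDroppingKey()); // 1
  }
}
```

Declaring the map as WeakHashMap<Key, WeakReference<Value>> breaks that chain, at the cost of having to handle a null from get() on the wrapped value.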

Example use of WeakHashMap

Here is a brief, albeit contrived example of WeakHashMap at work:

// SampleKey is just an object that holds a single int. (Used instead of
// Integer, since Integer overrides equals() and hashCode().)
SampleKey key = new SampleKey(42);
SampleObject value = new SampleObject("Sample Value");
final WeakHashMap<SampleKey, SampleObject> weakHashMap = new WeakHashMap<SampleKey, SampleObject>();
weakHashMap.put(key, value);
// At this point, we still have a strong reference to the key. Thus, even
// though the key is weakly-referenced by the WeakHashMap, nothing will
// be automatically removed even if we give a hint to the GC.
System.gc();
LOGGER.debug(weakHashMap.size()); // Will still be '1'.
LOGGER.debug(weakHashMap.get(key)); // Will still be 'Sample Value'.
// Now, if we set the key to null, the entry in weakHashMap will eventually
// disappear. Note that the number of times we have to 'kick' the GC
// before the entry disappears may be different on each run depending
// on the JVM load, memory usage, etc.
// This could also allow the SampleObject value to be GC'd, provided there
// were no other references to it.
key = null;
value = null;
int count = 0;
while (0 != weakHashMap.size()) {
  ++count;
  System.gc();
}
LOGGER.debug("Took " + count + " calls to System.gc() to result in weakHashMap size of : " + weakHashMap.size());

Finishing up

In an upcoming article, I plan on covering the other types of references (soft and phantom) as well as the associated Reference classes in Java. I wanted to keep this post brief so that it provided a basic understanding of the situation.

Changes/Fixes

2011-04-10: Fixed numerous incorrect usages of the term “dereference”. Thanks to Ranjit for the explanation.

Mobile vs. Desktop

Following in the steps of RIM, Google’s Android, Palm and others, Sun hopes to follow the same pattern of success that Apple has enjoyed with their App Store. However, things are a bit different here. All the current App Store competitors have a separate mobile platform with which to compete against Apple. In general, this business model makes sense because there are few other easy ways to get software onto the devices, so a centralized app store of sorts makes sense.

In Apple’s case, they intended from the beginning to have the App Store be the only way to get software onto their devices. This closed-model and high level of control, which Apple is known for, is what helped make the App Store so popular. It was also very easy to use, and provided functionality not available elsewhere. Other mobile app stores aim to emulate this “app store tie-in”, hoping to make their respective app stores the primary place to get new software for your device, thus providing the companies with a steady source of revenue.

On the desktop, things aren’t so clear. For the most part, people are already able to freely and easily download/purchase and install software, either through their web browser or through content delivery systems like Steam. Sun will have some real competition on their hands because of this, and unless they can create the ecosystem to spawn “killer apps”, people won’t be flocking to it in droves. Currently, there are just too many options for getting new software onto your desktop, thanks to the openness of the system, and this will be a real problem for Sun when it comes to gaining any significant market share in this area.

Furthermore, as noted in the article, there haven’t been very many compelling Java apps, save for Eclipse and Azureus. (I only use the former) Java on the desktop just hasn’t been as much of a success as Sun would’ve hoped for, mainly because Java desktop applications haven’t had the same consistent look & feel that native OS applications have offered, with some notable exceptions like Eclipse. While Java has gained much acceptance on the server side, it may have to settle for this before looking to gain significant acceptance on the desktop anytime soon.

App Store Hype?

It should also be noted that while Apple’s App Store has been a roaring success for the company itself, it seems that it’s not as much of a success for the vast majority of developers out there. Like most markets with low barriers to entry (blogging for profit, startups, etc.), the distribution of revenue seems to follow a long-tail model, with very few developers making a lot of money and the rest making only a fraction of that.

In order to place #34 on the social networking charts, you need 30-35 downloads a day. At the standard app store pricing of $0.99, and after Apple takes its cut, that means your app needs to bring in a little over $20 a day to chart at that position. And social networking is a popular category.

Thus, it would appear that the App Store is not as profitable for developers as the hype or large success stories would suggest. You may have a compelling app that is nicely done, but it may only make a marginal amount instead of the six-figure amounts being seen by some of the most successful apps. If this is the case with even a successful implementation like Apple’s App Store, how does this bode for Sun, which hasn’t even proven that their Java app store can enjoy similar success? Indeed, it appears that they will have a hard time attracting developers to their app store platform.