On Weak References and WeakHashMap

Posted by epicdev Archive : 2011. 9. 8. 00:42

Most Java developers will be familiar with the concept of references, as in pass-by-reference vs. pass-by-value. (Pointers, now that’s another thing…)

Strictly speaking, Java always passes arguments by value. For primitive data types, a copy of the value itself is put on the call stack before the method is invoked. For objects and arrays, the value that gets copied is the reference, so the method receives a way to access and manipulate the same object; no copy of the object is made. The method can therefore change the object's state, but reassigning its parameter has no effect on the caller's variable.
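A small sketch of what this means in practice. The class and method names (PassingDemo, addItem, replaceList, increment) are made up for illustration. The point is that a method can mutate the object it was handed, while reassigning its own parameter (object or primitive) changes only a local copy.

```java
import java.util.ArrayList;
import java.util.List;

final class PassingDemo {
    // Mutates the object that the caller's reference points to; the caller sees the change.
    static void addItem(List<String> list) {
        list.add("added");
    }

    // Reassigns only the local copy of the reference; the caller's list is untouched.
    static void replaceList(List<String> list) {
        list = new ArrayList<>();
        list.add("replaced");
    }

    // Changes only the local copy of the primitive value.
    static void increment(int n) {
        n++;
    }

    public static void main(String[] args) {
        List<String> items = new ArrayList<>();
        int counter = 0;

        addItem(items);
        replaceList(items);
        increment(counter);

        System.out.println(items);   // [added]
        System.out.println(counter); // 0
    }
}
```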

In that way, references are somewhat like pointers, though they obviously cannot be manipulated by pointer arithmetic. But what about weak references? What are they, and how do they contrast with strong references?

Weakly understood

Based on my experience, the concept of weak references, or more generally reachability, is not one that is well-understood in the Java world. At least I did not have a good grasp of them until stumbling upon some sample code one day. It may be that the need to utilize them is outside the confines of most day-to-day programming tasks, as the concept is fairly low-level. Nonetheless, it’s an important concept to understand.

Basically, Java specifies five levels of reachability for objects that reflect which state the object is in, in relation to being marked as finalizable, being finalized and being reclaimed. They are, in order of strongest-to-weakest:

  1. Strongly Reachable
  2. Softly Reachable
  3. Weakly Reachable
  4. Phantom Reachable
  5. Unreachable

An object’s normal state, as soon as it has been instantiated and assigned to a variable/field, is strongly reachable. Chances are, these are the only types of objects you’ve worked with. We’ll first cover the concept of weakly reachable objects, as I believe it provides a good base for understanding the remainder.

Cleaning out the trash

Going by the API reference, a weakly reachable object is one that can be reached by traversing (i.e. going through) a weak reference. That’s a succinct definition to be sure, but it just raises the next question: What is a weak reference?

Simply put, if an object can only be reached by traversing a weak reference, the garbage collector will not attempt to keep the object in memory any more than it would an object with no references to it, i.e. an object that cannot be accessed. Thus, from the garbage collector’s point of view, a weakly referenced object will eventually be cleaned from memory the same as an object with no references to it.
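To make this concrete, here is a minimal sketch using java.lang.ref.WeakReference. The class name WeakReferenceDemo and the helper clearedAfterGc are my own, not part of any API. Keep in mind that System.gc() is only a hint, so the sketch caps the number of attempts and makes no promise about when (or whether) a referent is cleared.

```java
import java.lang.ref.WeakReference;

final class WeakReferenceDemo {
    // Requests a GC up to maxAttempts times and reports whether the referent was cleared.
    static boolean clearedAfterGc(WeakReference<?> ref, int maxAttempts) {
        for (int i = 0; i < maxAttempts && ref.get() != null; i++) {
            System.gc(); // only a hint; the JVM is free to ignore it
        }
        return ref.get() == null;
    }

    public static void main(String[] args) {
        Object strong = new Object();
        WeakReference<Object> held = new WeakReference<>(strong);
        WeakReference<Object> unheld = new WeakReference<>(new Object());

        // The second referent is only weakly reachable, so it is eligible for
        // collection. On most JVMs it is cleared quickly, but this is not guaranteed.
        System.out.println("unheld cleared: " + clearedAfterGc(unheld, 50));

        // The first referent is still strongly reachable through 'strong',
        // so the weak reference to it is not cleared.
        System.out.println("held cleared: " + clearedAfterGc(held, 5));
        System.out.println("still holding strong reference: " + (strong != null));
    }
}
```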

So, if weakly-referenced objects are treated the same as completely non-referenced ones, what is the purpose of the weak reference? A good example is the WeakHashMap, a class provided by Java.

WeakHashMap

Unfortunately, WeakHashMap may also be poorly understood, probably as a result of weak references not being well known. WeakHashMap may at times be described as a “cache” of sorts, where objects/entries that are not used will be removed to decrease memory usage. This is not how WeakHashMap works at all.

The best way to describe a WeakHashMap is one where the entries (key-to-value mappings) will be removed when it is no longer possible to retrieve them from the map. For example, say you’ve added an object to the WeakHashMap using a key k1. If you now set k1 to null, and no other strong reference to the key object exists, there should be no way to retrieve the object from the map, since you don’t have the key object around any more to call get() with. This behaviour is possible because WeakHashMap only has weak references to the keys, not strong references like the other Map classes.

Note that for the WeakHashMap to work this way, as it was intended, the key objects must only be considered equal if they are actually the same object (i.e. object identity instead of mere equality). This is the default behaviour for Object.equals() and Object.hashCode(), so if these methods have not been overridden, the object is OK to be used as a key in WeakHashMap. Objects like Integer are not suitable for use in WeakHashMap, because it is possible to create two separate (non-identical) objects that are both equal:

final Integer i1 = new Integer(4);
final Integer i2 = new Integer(4);
LOGGER.debug("i1.equals(i2): " + i1.equals(i2)); // True.
LOGGER.debug("i1 == i2: " + (i1 == i2)); // False.
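Here is a rough sketch of why that matters for a WeakHashMap. The names IntegerKeyDemo, lookupWithEqualKey and isCached are hypothetical. It shows two things: an entry can be retrieved with any equal Integer, not just the original key object, and Integer.valueOf() returns shared cached instances for small values (the Javadoc specifies at least -128 to 127). Since Integer's internal cache keeps those instances alive, an entry keyed by one of them would not be dropped.

```java
import java.util.Map;
import java.util.WeakHashMap;

final class IntegerKeyDemo {
    // Looks the entry up with a separately obtained Integer that is merely equal to the key.
    static String lookupWithEqualKey(Map<Integer, String> map, int n) {
        Integer other = Integer.valueOf(n);
        return map.get(other);
    }

    // True when valueOf hands back the same cached object on every call.
    static boolean isCached(int n) {
        return Integer.valueOf(n) == Integer.valueOf(n);
    }

    public static void main(String[] args) {
        Map<Integer, String> map = new WeakHashMap<>();
        Integer key = Integer.valueOf(1000);
        map.put(key, "one thousand");

        // Retrieval works through equals()/hashCode(), so losing the original
        // key object does not make the entry unreachable in the logical sense.
        System.out.println(key + " -> " + lookupWithEqualKey(map, 1000)); // 1000 -> one thousand

        // 42 lies in the range that Integer.valueOf always caches.
        System.out.println(isCached(42)); // true
    }
}
```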

Another point of importance is that String is often not a suitable key for a WeakHashMap either. In addition to overriding equals() and hashCode(), String literals (and any Strings on which intern() has been called) are interned, i.e. stored in a pool by the JVM. This means that they may remain strongly referenced even after you have apparently gotten rid of your reference to them. Because of this, entries that you add to a WeakHashMap using String keys may never get dropped, even after you have apparently lost reference to the keys, since the Strings may remain strongly referenced in the string intern pool.

An example of String interning:

final String s1 = "The only thing we have to fear is fear itself.";
final String s2 = "The only thing we have to fear is fear itself.";
LOGGER.debug("s1.equals(s2): " + s1.equals(s2)); // True.
LOGGER.debug("s1 == s2: " + (s1 == s2)); // Also true: identical literals are interned.

String literals are interned for performance reasons: for each literal, Java checks whether the pool already holds a String that is “equal” to it. If such a String exists, the existing object is reused instead of instantiating a new one. Calling intern() on any String performs the same lookup explicitly, whereas Strings created at run time (for example by concatenating variables or calling new String(...)) are not interned automatically. Sharing is safe because Strings in Java are immutable, i.e. operations that appear to modify a String (such as concatenation, toUpperCase(), etc.) really return a new String object while preserving the original.
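The difference between a literal, an explicitly constructed copy and an interned copy can be sketched as follows. InternDemo and sameObject are names I made up for the illustration.

```java
final class InternDemo {
    // Identity comparison, spelled out so the intent is obvious.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "fear";
        String copy = new String("fear"); // explicitly creates a distinct object

        System.out.println(sameObject(literal, "fear"));        // true: identical literals share one interned object
        System.out.println(sameObject(literal, copy));          // false: 'copy' is a separate object
        System.out.println(literal.equals(copy));               // true: same characters
        System.out.println(sameObject(literal, copy.intern())); // true: intern() returns the pooled object
    }
}
```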

The last usage note is that even though the keys are weakly referenced by WeakHashMap, the values remain strongly referenced. Thus, you must take care not to use value objects that strongly reference their own keys: if this happens, the entries will no longer be dropped automatically, because a strong reference to the key always exists through the value. (This can be avoided by wrapping the value object in a WeakReference, so that both keys and values are only weakly referenced by the WeakHashMap.)
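A minimal sketch of that wrapping technique, assuming a value type that points back at its key. Session, SessionInfo, register and lookup are all hypothetical names chosen for the example.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

final class WeakValueDemo {
    // Hypothetical key type; relies on identity-based equals()/hashCode().
    static final class Session { }

    // Hypothetical value type that holds a strong reference back to its key.
    static final class SessionInfo {
        final Session owner;
        SessionInfo(Session owner) { this.owner = owner; }
    }

    private static final Map<Session, WeakReference<SessionInfo>> INFO = new WeakHashMap<>();

    // Stores the value behind a WeakReference so the map does not keep the key alive through it.
    static void register(Session session, SessionInfo info) {
        INFO.put(session, new WeakReference<>(info));
    }

    // Returns the value, or null if there is no entry or the value has been collected.
    static SessionInfo lookup(Session session) {
        WeakReference<SessionInfo> ref = INFO.get(session);
        return ref == null ? null : ref.get();
    }

    public static void main(String[] args) {
        Session session = new Session();
        SessionInfo info = new SessionInfo(session);
        register(session, info);
        System.out.println(lookup(session) == info); // true while 'session' and 'info' are strongly held here
    }
}
```

The trade-off is that the map no longer keeps the value alive either: unless something else holds a strong reference to it, the value may be collected while its key is still in use, so callers have to handle a null result.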

Example use of WeakHashMap

Here is a brief, albeit contrived example of WeakHashMap at work:

// SampleKey is just an object that holds a single int. (Use instead of
// Integer, since Integer overrides equals() and hashCode())
SampleKey key = new SampleKey(42);
SampleObject value = new SampleObject("Sample Value");

final WeakHashMap<SampleKey, SampleObject> weakHashMap = new WeakHashMap<SampleKey, SampleObject>();
weakHashMap.put(key, value);

// At this point, we still have a strong reference to the key. Thus, even
// though the key is weakly-referenced by the WeakHashMap, nothing will
// be automatically removed even if we give a hint to the GC.
System.gc();

LOGGER.debug(weakHashMap.size()); // Will still be '1'.
LOGGER.debug(weakHashMap.get(key)); // Will still be 'Sample Value'.

// Now, if we set the key to null, the entry in weakHashMap will eventually
// disappear. Note that the number of times we have to 'kick' the GC
// before the entry disappears may be different on each run depending
// on the JVM load, memory usage, etc.
key = null;
int count = 0;
while(0 != weakHashMap.size())
{
  ++count;
  System.gc();
}
LOGGER.debug("Took " + count + " calls to System.gc() to result in weakHashMap size of : " + weakHashMap.size());
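The snippet above relies on two helper classes, SampleKey and SampleObject, whose definitions are not shown. The following is only my guess at what minimal versions could look like, together with a self-contained variant of the example that prints to System.out instead of a logger and caps the number of GC requests so it cannot loop forever. WeakHashMapExample and gcUntilEmpty are hypothetical names.

```java
import java.util.Map;
import java.util.WeakHashMap;

final class WeakHashMapExample {
    // Requests a GC until the map is empty or maxAttempts is reached;
    // returns the number of requests made.
    static int gcUntilEmpty(Map<?, ?> map, int maxAttempts) {
        int count = 0;
        while (!map.isEmpty() && count < maxAttempts) {
            count++;
            System.gc(); // only a hint to the JVM
        }
        return count;
    }

    public static void main(String[] args) {
        SampleKey key = new SampleKey(42);
        Map<SampleKey, SampleObject> map = new WeakHashMap<>();
        map.put(key, new SampleObject("Sample Value"));
        System.out.println(map.get(key)); // Sample Value

        key = null; // drop the only strong reference to the key
        int count = gcUntilEmpty(map, 100);
        System.out.println("size after " + count + " GC requests: " + map.size());
    }
}

// Assumed key type: holds a single int and deliberately does NOT override
// equals() or hashCode(), so two keys are equal only if they are the same object.
final class SampleKey {
    private final int id;
    SampleKey(int id) { this.id = id; }
    int getId() { return id; }
}

// Assumed value type: wraps a String and returns it from toString().
final class SampleObject {
    private final String text;
    SampleObject(String text) { this.text = text; }
    @Override public String toString() { return text; }
}
```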

Finishing up

In an upcoming article, I plan on covering the other types of references (soft and phantom) as well as the associated Reference classes in Java. I wanted to keep this post brief so that it provided a basic understanding of the situation.

Source: http://unitstep.net/blog/2011/03/26/java-weak-references-and-weakhashmap/