I'm working on an application that stores a large number of String objects in multiple HashMaps. Since the Strings placed in the HashMaps may have duplicate values, we call String::intern() to take advantage of Java's String Constant Pool. My concern is: when will an interned String instance be removed from the pool? Does it follow the usual behaviour, i.e., does it become eligible for GC once no references to it remain (for example, once it has been removed from all the HashMaps)?

kaja.mohideen wrote:
... we are storing numerous string objects into multiple HashMaps. Since the string objects created/placed in HashMap may have duplicate values;
we're calling String::intern() method to make use of String Constant Pool feature

Why?
What do you think that will accomplish?

When will the Constant String instance be removed from the Pool?

If you don't invoke intern() you don't have to worry about that.

Does it follow the same behaviour, that if no references are present it is eligible for GC;

That is back-to-front. That is not how GC works.
GC traces references from root objects, all objects that are reached survive the GC cycle.

i.e., removal of an instance from all HashMaps?

You would have to use something from java.lang.ref (http://docs.oracle.com/javase/6/docs/api/java/lang/ref/package-summary.html) or your HashMap will have a reference that prevents your String from being eligible for GC.
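To make the reachability point concrete, here is a minimal sketch. The class name ReachabilityDemo and the helper weakly() are made up for illustration. It shows that a value held by an ordinary HashMap stays strongly reachable, while a WeakReference on its own does not keep it alive. What happens after the entry is removed depends on the collector, so that line is printed without any promise about its value.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; not taken from the thread.
final class ReachabilityDemo {
    // Wraps a String in a WeakReference, which does not keep it alive by itself.
    static WeakReference<String> weakly(String s) {
        return new WeakReference<>(s);
    }

    public static void main(String[] args) {
        Map<String, String> strongMap = new HashMap<>();
        String value = new String("payload"); // run-time String, not a literal
        WeakReference<String> ref = weakly(value);

        strongMap.put("k", value);
        value = null;   // the map still holds a strong reference
        System.gc();    // only a request to the JVM
        // Still reachable through strongMap, so the weak reference is not cleared.
        System.out.println("while in map: " + (ref.get() != null)); // prints true

        strongMap.remove("k"); // no strong references remain
        System.gc();
        // May print true or false: collection is not guaranteed to have happened.
        System.out.println("after removal: " + (ref.get() != null));
    }
}
```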

kaja.mohideen wrote:
... we are storing numerous string objects into multiple HashMaps. Since the string objects created/placed in HashMap may have duplicate values;
we're calling String::intern() method to make use of String Constant Pool feature

Why?
What do you think that will accomplish?

As I mentioned, we are storing a huge number of String objects in HashMaps. The String objects may have duplicate values. Calling String::intern before inserting into the Map will make the Map refer to the same String instance if one is already available in the Constant Pool. Correct me if I'm wrong.
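A small sketch of that claim, with made-up names (InternDemo, fresh()): two equal Strings built at run time are distinct objects, but intern() returns one canonical instance for both, so two map entries end up sharing it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; not taken from the thread.
final class InternDemo {
    // Creates a new String object, distinct from any literal with the same content.
    static String fresh(String s) {
        return new String(s);
    }

    public static void main(String[] args) {
        String a = fresh("duplicate-value");
        String b = fresh("duplicate-value");
        System.out.println(a == b);                   // false: two separate objects
        System.out.println(a.intern() == b.intern()); // true: one canonical instance

        Map<String, String> map = new HashMap<>();
        map.put("key1", a.intern());
        map.put("key2", b.intern());
        // Both entries refer to the same String object.
        System.out.println(map.get("key1") == map.get("key2")); // true
    }
}
```

Whether this saves enough memory to justify the cost of the intern() calls is a separate question that only measurement can answer.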

When will the Constant String instance be removed from the Pool?

If you don't invoke intern() you don't have to worry about that.

For the above reason, we have to use intern method.

Does it follow the same behaviour, that if no references are present it is eligible for GC;

That is back-to-front. That is not how GC works.
GC traces references from root objects, all objects that are reached survive the GC cycle.

OK. That's my understanding, but I didn't express it correctly. Rephrasing my question: will a String instance in the Constant Pool be eligible for GC once all the HashMaps referring to it have removed the corresponding entry?

An interned String can be garbage collected. It seems to me that you should be using a WeakHashMap. That doesn't hold strong references to the key Strings at all, regardless of whether they have been interned or not.
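A minimal WeakHashMap sketch follows; the names WeakMapDemo and newCache() are invented for illustration. Note that WeakHashMap holds only its keys weakly. Also, as far as I know, a key that is a string literal stays strongly reachable for as long as the class that contains it is loaded, so the example uses a String built at run time. The size printed after System.gc() is not predictable, since the call is only a request.

```java
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical demo class; not taken from the thread.
final class WeakMapDemo {
    // A map that references its keys only weakly.
    static Map<String, Integer> newCache() {
        return new WeakHashMap<>();
    }

    public static void main(String[] args) {
        Map<String, Integer> cache = newCache();
        String key = new String("session-42"); // run-time String, not a literal
        cache.put(key, 1);

        // Lookup uses equals()/hashCode(), so an equal String finds the entry.
        System.out.println(cache.containsKey("session-42")); // true while key is held

        key = null;  // drop the only strong reference to the key object
        System.gc(); // only a request; the entry may or may not be gone afterwards
        System.out.println("entries after GC request: " + cache.size());
    }
}
```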

kaja.mohideen wrote:
... we are storing numerous string objects into multiple HashMaps. Since the string objects created/placed in HashMap may have duplicate values;
we're calling String::intern() method to make use of String Constant Pool feature

Why?
What do you think that will accomplish?

As I mentioned, we are storing a huge number of String objects in HashMaps. The String objects may have duplicate values. Calling String::intern before inserting into the Map will make the Map refer to the same String instance if one is already available in the Constant Pool. Correct me if I'm wrong.

I take it you are talking about HashMap<String,String> and you are not talking about keys
as there can be only one HashMap entry for a given key (in one HashMap).
When storing a large number of HashMap entries with necessarily different keys, but storing many duplicate values
(logically there must be significantly less different values than keys - or you would not have many duplicates)
you may be able to trade processor time for some memory saving
but unless these duplicate HashMap values are particularly long
you might want to make the invocation of intern() configurable and profile your application's performance (processor & memory)
with and without the intern() call.
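One way to make the intern() call switchable is sketched below. The class name ValueStore and the system property store.intern are assumptions made up for this example, and the timing and memory figures it prints are only rough indicators, not a proper benchmark.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical wrapper; the names are illustrative only.
final class ValueStore {
    private final boolean internValues;
    private final Map<String, String> map = new HashMap<>();

    ValueStore(boolean internValues) {
        this.internValues = internValues;
    }

    // Stores the value, canonicalising it via intern() only when enabled.
    void put(String key, String value) {
        map.put(key, internValues && value != null ? value.intern() : value);
    }

    String get(String key) {
        return map.get(key);
    }

    public static void main(String[] args) {
        // "store.intern" is an assumed property name: run with -Dstore.intern=true
        boolean intern = Boolean.getBoolean("store.intern");
        ValueStore store = new ValueStore(intern);

        long start = System.nanoTime();
        for (int i = 0; i < 1_000_000; i++) {
            // Many distinct keys, only 100 distinct values.
            store.put("key" + i, "value" + (i % 100));
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;

        Runtime rt = Runtime.getRuntime();
        long usedMb = (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024);
        System.out.println("intern=" + intern + " time=" + elapsedMs
                + " ms, used heap ~" + usedMb + " MB");
    }
}
```

Running it once with the property set and once without gives a first impression of the processor-versus-memory trade-off; a real profiler will give more trustworthy numbers.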

Multiple references to interned String objects are being stored in the HashMaps, which means the same object can appear under different keys in a HashMap.
If you are concerned about memory usage, a WeakHashMap can be used (as mentioned by EJP), and when the map is cleared System.gc() can be
invoked to make a request to the GC. Note that this is only a request.

EJP wrote:
2. The correct answer to the title of this thread is 'never'.

Actually, according to the JVM spec, the constant pool is part of the method area, and, "Although the method area is logically part of the heap, simple implementations may choose not to either garbage collect or compact it."

So it can be GCed, though I doubt it ever is, except possibly in niche JVM implementations.

You're taking "A pool of strings, initially empty, is maintained privately by the class String." as meaning it's not "the constant pool"? I wouldn't interpret it that way. And based on the JVM spec, I don't see anything saying the intern()ed Strings go to a different pool than "the constant pool". And if it is a separate pool, then what does go into "the constant pool"?

http://java.sun.com/docs/books/jvms/second_edition/html/Overview.doc.html#22972
"A runtime constant pool is a per-class or per-interface runtime representation of the constant_pool table in a class file (§4.4). It contains several kinds of constants, ranging from numeric literals known at compile time to method and field references that must be resolved at run time. "
[...]
"Each runtime constant pool is allocated from the Java virtual machine's method area (§3.5.4)."

http://java.sun.com/docs/books/jvms/second_edition/html/Overview.doc.html#6656
"The Java virtual machine has a method area that is shared among all Java virtual machine threads. [...] It stores per-class structures such as the runtime constant pool, field and method data, and the code for methods and constructors [...]"

http://java.sun.com/docs/books/jvms/second_edition/html/ConstantPool.doc.html#73272
"The Java virtual machine maintains a per-type constant pool (§3.5.5),"
[...]
"A string literal (§2.3) is derived from a CONSTANT_String_info structure (§4.4.3) in the binary representation of a class or interface. The CONSTANT_String_info structure gives the sequence of Unicode characters constituting the string literal.

The Java programming language requires that identical string literals (that is, literals that contain the same sequence of characters) must refer to the same instance of class String. In addition, if the method String.intern is called on any string, the result is a reference to the same class instance that would be returned if that string appeared as a literal. Thus,

("a" + "b" + "c").intern() == "abc"

must have the value true.

To derive a string literal, the Java virtual machine examines the sequence of characters given by the CONSTANT_String_info structure.

If the method String.intern has previously been called on an instance of class String containing a sequence of Unicode characters identical to that given by the CONSTANT_String_info structure, then the result of string literal derivation is a reference to that same instance of class String.

Otherwise, a new instance of class String is created containing the sequence of Unicode characters given by the CONSTANT_String_info structure; that class instance is the result of string literal derivation. Finally, the intern method of the new String instance is invoked."
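The quoted rule can be checked directly. In the sketch below (the class name LiteralInternDemo and the helper build() are invented), the spec's own expression is a compile-time constant, so the comparison is between two references to the same literal. The second half builds the same characters at run time, which gives a distinct object until intern() is called.

```java
// Hypothetical demo class; not taken from the spec or the thread.
final class LiteralInternDemo {
    // Concatenates the parts at run time, producing a new String object.
    static String build(String... parts) {
        StringBuilder sb = new StringBuilder();
        for (String p : parts) {
            sb.append(p);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The spec's example: a constant expression, folded by the compiler.
        System.out.println(("a" + "b" + "c").intern() == "abc"); // true

        String runtime = build("a", "b", "c");
        System.out.println(runtime == "abc");          // false: a separate object
        System.out.println(runtime.intern() == "abc"); // true: same instance as the literal
    }
}
```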

You cannot possibly equate a pool specifically described as 'maintained privately by the String class' with the Constant Pool which is (a) created by the compiler per class (b) merged by the class loader and (c) accessible to the entire JVM via any class whatsoever.

I also suggest you check the source code of String. There is a pool data structure there corresponding precisely to the description 'privately maintained by the String class'.

The process you quote above is what happens when interning strings that equate to literals already in the constant pool. Nowhere there does it describe adding to the constant pool at runtime, which would be required if both pools were the same.

For literals that are already in it. That doesn't prove that the private String pool and the constant pool are the same, i.e. that String maintains the constant pool. It doesn't. It also doesn't prove that the constant pool is subject to GC, which is the topic of this thread. It isn't.

EJP wrote:
You cannot possibly equate a pool specifically described as 'maintained privately by the String class' with the Constant Pool which is (a) created by the compiler per class (b) merged by the class loader and (c) accessible to the entire JVM via any class whatsoever.

My understanding from reading the JLS was that the per-class pool references the JVM-wide pool.

I'm not sure I understand the details of how all the references into that pool are managed, and if you're talking about classes having their own references to it, then fine, but there can be no question that intern() puts String objects into the same pool of objects where String literals live.