By "startup" I mean the computation performed before the user's code
(the main method) starts executing (see [1][2]), while Aleksey's
definition is the startup benchmark in SPECjvm2008.
[1] http://www.oracle.com/technology/pub/articles/dev2arch/2004/01/jrockit.html
[2] http://www.ibm.com/developerworks/java/library/os-ecspy1/
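
For the first definition, here is a rough way to observe the time spent
before main is entered. This is only a minimal sketch: it assumes the VM
provides java.lang.management (RuntimeMXBean.getStartTime()), and the class
name StartupProbe is made up for illustration. It is not the
JNI_CreateJavaVM instrumentation mentioned further down the thread.

```java
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of Harmony or of the patch under discussion.
class StartupProbe {
    // Milliseconds between the JVM start time reported by the runtime and now.
    // Called as the first statement of main, this approximates the pre-main work.
    static long millisSinceJvmStart() {
        long jvmStart = ManagementFactory.getRuntimeMXBean().getStartTime();
        return System.currentTimeMillis() - jvmStart;
    }

    public static void main(String[] args) {
        long beforeUserCode = millisSinceJvmStart();
        System.out.println("Hello, world!");
        System.out.println("startup before main: ~" + beforeUserCode + " ms");
    }
}
```

Note that loading the management classes is itself included in the measured
interval and the clock has millisecond resolution, so this is only good for
coarse comparisons such as cold versus warm runs.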
On Mon, Dec 22, 2008 at 8:38 AM, Nathan Beyer <ndbeyer@apache.org> wrote:
> Can someone give a quick summary of the two different definitions of
> "startup" being discussed?
>
> -Nathan
>
> On Sun, Dec 21, 2008 at 6:22 PM, Wenlong Li <wenlong@gmail.com> wrote:
>> Aleksey,
>>
>> Thx for testing this patch, and sharing your experimental result.
>> Yes, I think your result would be reasonable. The performance gain of
>> this patch varies with different systems.
>>
>> Again, I would like to say we have different definitions for "startup".
>> Maybe I should move the change in classlib module to vm module, so
>> that the dependency can be minimized.
>>
>> thx again for discussion. :)
>> wenlong
>>
>> On Mon, Dec 22, 2008 at 4:04 AM, Aleksey Shipilev
>> <aleksey.shipilev@gmail.com> wrote:
>>> Hi Wenlong,
>>>
>>> I had some performance experiments with your patch. The test system is:
>>> - Pentium D 820 2.8 GHz / 2 GB DDR2-667
>>> - WD 3200KS, 320 GB, 16 MB cache
>>> - Gentoo Linux x86, 2.6.23
>>> - Harmony r728459
>>> - SPECjvm2008
>>>
>>> To recreate the stressful conditions repeatably, a simple script
>>> was written [1]. The script invalidates the caches before actually
>>> starting the workload: it first drops the VFS block caches to make
>>> sure the data is really requested from the disk, then re-reads the
>>> same 64 MB file several times to fill the on-HDD cache.
>>>
>>> On HWA [2] these performance results were produced:
>>>
>>> "cold-start" (invalidate caches):
>>> clean: (5.24 +- 0.28) secs
>>> ondemand: (4.49 +- 0.17) secs
>>>
>>> "warm-start" (don't invalidate caches):
>>> clean: (2.82 +- 0.01) secs
>>> ondemand: (2.80 +- 0.02) secs
>>>
>>> That is, the on-demand patch brings a +17% (+-9%) improvement on HWA
>>> when running with flushed caches, and brings no performance
>>> improvement in warm mode.
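
The percentages above can be re-derived from the quoted timings. The sketch
below is only a check of the arithmetic: the class name SpeedupCheck is
hypothetical, and adding the two relative uncertainties linearly is my
assumption about how the +-9% figure was obtained.

```java
// Hypothetical helper for re-checking the figures quoted above.
class SpeedupCheck {
    // Relative improvement of `after` over `before`, e.g. 0.17 for +17%.
    static double improvement(double before, double after) {
        return before / after - 1.0;
    }

    // Assumed error estimate: relative uncertainties of both timings, added linearly.
    static double relativeError(double before, double dBefore,
                                double after, double dAfter) {
        return dBefore / before + dAfter / after;
    }

    public static void main(String[] args) {
        // cold start: clean (5.24 +- 0.28) s, ondemand (4.49 +- 0.17) s
        System.out.println("cold improvement: " + improvement(5.24, 4.49)
                + ", error: " + relativeError(5.24, 0.28, 4.49, 0.17));
        // warm start: clean (2.82 +- 0.01) s, ondemand (2.80 +- 0.02) s
        System.out.println("warm improvement: " + improvement(2.82, 2.80)
                + ", error: " + relativeError(2.82, 0.01, 2.80, 0.02));
    }
}
```

With these inputs the cold-start improvement rounds to 17% with an error of
about 9%, while the warm-start difference is smaller than its own error.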
>>>
>>> As I mentioned several times, this test does not reflect the real
>>> performance an end user would perceive, so I took two SPECjvm2008:startup
>>> benchmarks and ran each of them 10x10 times.
>>>
>>> SPECjvm2008:startup.helloworld, "cold start":
>>> clean: (8.93 +- 0.21) ops/min
>>> ondemand: (9.04 +- 0.03) ops/min
>>>
>>> SPECjvm2008:startup.compiler.compiler, "cold start":
>>> clean: (1.44 +- 0.05) ops/min
>>> ondemand: (1.42 +- 0.04) ops/min
>>>
>>> As you can see, even in a very stressful situation there's no boost. I
>>> find these performance results too unconvincing to justify changing the
>>> infrastructure of bootclasspath resolution. Am I missing something
>>> important?
>>>
>>> Thanks,
>>> Aleksey.
>>>
>>> [1] run.sh
>>> #!/bin/bash
>>>
>>> R=`pwd`
>>>
>>> JAVA=$R/platforms/builds/harmony-release-clean/jdk/jre/bin/java
>>> #JAVA=$R/platforms/builds/harmony-release-ondemand/jdk/jre/bin/java
>>> JAVA_OPTS="-Xmx1024M -Xms1024M"
>>>
>>> for T in `seq 1 10`; do
>>>
>>> echo "*************** EXECUTING ITERATION $T ****************"
>>>
>>> # invalidate HDD caches
>>> # - need to replace all entries in LRU HDD cache
>>> # - flush the kernel VFS cache first to ensure the data
>>> #   would be read from disk
>>>
>>> echo "Flushing caches"
>>> for I in `seq 1 5`; do
>>> sync
>>> echo 3 > /proc/sys/vm/drop_caches
>>>
>>> dd if=cachekiller.file of=/dev/null > /dev/null 2>&1
>>> done
>>>
>>> echo "Executing."
>>>
>>> # HelloWorld
>>> /usr/bin/time $JAVA $JAVA_OPTS -cp benchmarks/ HelloWorld 2>&1
>>>
>>> # SPECjvm2008
>>> #cd $R/benchmarks/storage/SPECjvm2008
>>> #/usr/bin/time $JAVA $JAVA_OPTS -Djava.awt.headless=true -jar SPECjvm2008.jar -ikv -i 10 startup.compiler.compiler 2>&1
>>>
>>> echo ""
>>> done
>>>
>>> [2] HelloWorld.java
>>> public class HelloWorld {
>>>     public static void main(String[] args) {
>>>         System.out.println("Hello, world!");
>>>     }
>>> }
>>>
>>>
>>> On Sun, Dec 21, 2008 at 6:02 AM, Wenlong Li <wenlong@gmail.com> wrote:
>>>> On Sat, Dec 20, 2008 at 7:10 PM, Alexei Fedotov
>>>> <alexei.fedotov@gmail.com> wrote:
>>>>> Wenlong,
>>>>> Thanks for removing the commented code.
>>>>>
>>>>> There are several VMs which make use of the Harmony class library,
>>>>> e.g. Harmony VM, J9, Android Dalvik, etc. Your change is Harmony VM
>>>>> specific, isn't it? If it is, then it's better to keep related changes
>>>>> in the VM module. If it is not, then it might be a good idea to keep
>>>>> the changes in the class library module unless other VMs already have
>>>>> such an optimization in their code.
>>>> [Wenlong] Although at the moment on-demand class parsing may look like
>>>> a specific optimization from your point of view, I believe it could
>>>> be a general technique, e.g., it can be easily deployed in other
>>>> runtime systems. The current VM also depends on luniglob.c in
>>>> working_classlib to get all class libraries/modules, i.e., there is
>>>> already a cross-module dependency between classlib and VM. Today, when
>>>> users want to add a new module, they must manually change
>>>> bootclasspath.properties; with this patch applied, they would
>>>> revise my added property file instead of bootclasspath.properties.
>>>> I understand that modifying the bootclasspath file may be part of a
>>>> specification.
>>>>>
>>>>> In any case crossing module boundary would make class library users
>>>>> think more than once or even write some code. Is it technically
>>>>> possible to prepare a patch which does not change module boundaries?
>>>>> What do you think?
>>>> [Wenlong] Yes, it is possible from technical perspective, but a little
>>>> complicated. I can think about it. :)
>>>>
>>>>>
>>>>> As for your performance experiments, which particular test are you
>>>>> measuring? It is bootclasspath-unpretentious "Hello, world", isn't it?
>>>> [Wenlong] By startup I mean the work executed before running the user's
>>>> computation, that is, the VM creation time. I manually added
>>>> instrumentation code to measure execution time in JNI_CreateJavaVM in
>>>> JNI.cpp. This startup work is common to all benchmarks. My experiments
>>>> were conducted on both Windows and Linux systems. Please see my previous
>>>> message about the performance gain from this optimization.
>>>>
>>>> Thx,
>>>> Wenlong
>>>>>
>>>>> Thanks!
>>>>>
>>>>> On Sat, Dec 20, 2008 at 2:19 AM, Wenlong Li <wenlong@gmail.com> wrote:
>>>>>> On Sat, Dec 20, 2008 at 12:42 AM, Alexei Fedotov
>>>>>> <alexei.fedotov@gmail.com> wrote:
>>>>>>> Wenlong,
>>>>>>> Have I missed a discussion of the proposed design? I see that you
>>>>>>> expose a new public interface:
>>>>>>> /**
>>>>>>> * @map the jar with exported package in the pending jar list for
>>>>>>> * on-demand jar parsing
>>>>>>> * Key is the jar, and value is the package exported by this jar
>>>>>>> */
>>>>>>> DECLARE_OPEN(void, vm_properties_set_pending_jar, (const char* key,
>>>>>>> const char* value));
>>>>>>>
>>>>>>> Did you mean "Maps" instead of "@map"? Strangely the word "pending"
>>>>>>> disappeared from the name of the wrapping VMI interface
>>>>>>> SetJarPackageMapping . Why should we extend both OPEN and VMI
>>>>>>> interfaces with the same function? Why did you put your code
into
>>>>>>> working_classlib/modules/luni/src/main/native/luni/shared/luniglob.c,
>>>>>>> thus introducing another dependency between VM and class library?
>>>>>> [Wenlong] The boot class path is defined in luniglob.c in Harmony,
>>>>>> and it also has a dependency on the VM. In my understanding, my patch
>>>>>> is related to boot class path determination, so I also put my code in
>>>>>> luniglob.c, and use the VMI interface to communicate with the VM.
>>>>>>
>>>>>>>
>>>>>>> + //rcSetProperty = (*vmInterface)->SetJarPackageMapping(vmInterface, jarName, jarValue);
>>>>>>> + /*
>>>>>>> + hymem_free_memory(jarName);
>>>>>>> + hymem_free_memory(jarValue);
>>>>>>> + */
>>>>>>> Should we really commit the commented code?
>>>>>>> Thanks.
>>>>>>
>>>>>> [Wenlong] Please see my latest version of patch in the list. Such
>>>>>> commented code has been removed.
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Dec 19, 2008 at 6:59 PM, Tim Ellison <t.p.ellison@gmail.com> wrote:
>>>>>>>> I was hoping that somebody else would comment first, so I don't have to
>>>>>>>> be the grumpy one all the time :-)
>>>>>>>>
>>>>>>>> As I said before, this is good prototyping work...
>>>>>>>>
>>>>>>>> Wenlong Li wrote:
>>>>>>>>> I did the pre-commit test on the patch of on-demand class library
>>>>>>>>> parsing (https://issues.apache.org/jira/browse/HARMONY-6039), and it
>>>>>>>>> works well now.
>>>>>>>>> Can Harmony incorporate this feature?
>>>>>>>>
>>>>>>>> I'm not sure it is ready for committing to the head stream yet.
>>>>>>>>
>>>>>>>>> Via on-demand class parsing, we can reduce startup time from 20+
>>>>>>>>> seconds to 3 seconds for cold running, and 170 ms to 140 ms for warm-up
>>>>>>>>> running on Core 2 Duo with Windows.
>>>>>>>>
>>>>>>>> Can you tell me how to reproduce 20+sec cold start-up? I haven't seen
>>>>>>>> anything like that in my simple tests.
>>>>>>>>
>>>>>>>>> After applying the patch, please note that the way to add new modules
>>>>>>>>> changes.
>>>>>>>>> (1) If you want to add new modules/libraries, please don't put them in
>>>>>>>>> the bootclasspath.properties file. This file now only lists modules
>>>>>>>>> needed during startup (the VM startup only accesses class libraries in
>>>>>>>>> eight modules)
>>>>>>>>
>>>>>>>> That would break too much. How about creating a new file rather than
>>>>>>>> re-purposing an existing file with different semantics? This file is
>>>>>>>> used by Jikes, IBM VME, the Eclipse plug-in, at least.
>>>>>>>>
>>>>>>>>> (2) For new modules/libraries, please put them in the
>>>>>>>>> modulelibrarymapping.properties file. You should specify the module
>>>>>>>>> name and its exported class library. Here is one example:
>>>>>>>>> math.jar=java.math, where "math.jar" means the module name, and
>>>>>>>>> "java.math" means the class libraries this module exports.
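
To make the jar-to-package mapping concrete, here is a minimal sketch of how
such a file could drive an on-demand lookup. It is not the code from
HARMONY-6039: the class PendingJarIndex and its methods are hypothetical, and
it assumes exactly one exported package per jar entry, as in the
math.jar=java.math example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical illustration; not the implementation from the patch.
class PendingJarIndex {
    // Parses "jar=package" lines and returns a package -> jar lookup table.
    static Map<String, String> fromProperties(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            // Not expected when reading from an in-memory string.
            throw new IllegalStateException(e);
        }
        Map<String, String> packageToJar = new HashMap<String, String>();
        for (String jar : props.stringPropertyNames()) {
            packageToJar.put(props.getProperty(jar).trim(), jar);
        }
        return packageToJar;
    }

    // Returns the jar to parse for a class, or null if no pending jar exports its package.
    static String jarForClass(Map<String, String> packageToJar, String className) {
        int lastDot = className.lastIndexOf('.');
        if (lastDot < 0) {
            return null; // class in the default package
        }
        return packageToJar.get(className.substring(0, lastDot));
    }

    public static void main(String[] args) {
        Map<String, String> index = fromProperties("math.jar=java.math\n");
        System.out.println(jarForClass(index, "java.math.BigInteger"));
    }
}
```

The point of the table is that a jar is opened only when a class from its
package is first requested, instead of parsing every jar at VM creation.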
>>>>>>>>
>>>>>>>> As we discussed on another thread, it's unclear if the time is spent in
>>>>>>>> following the slow indexing through the classpath/JAR directories, or
>>>>>>>> whether it is the speed of loading bytes once we know what we need. I think
>>>>>>>> that it is premature to abandon the JAR manifest data as the principal
>>>>>>>> source of metadata until we understand the problem this solves.
>>>>>>>>
>>>>>>>> Can we measure where the time is spent in the current implementation?
>>>>>>>> I think it will help guide this approach to a better solution.
>>>>>>>> What tools do you recommend for profiling start-up?
>>>>>>>>
>>>>>>>> Regards
>>>>>>>> Tim
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> With best regards,
>>>>>>> Alexei Fedotov,
>>>>>>> ZAO Telecom Express
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> With best regards,
>>>>> Alexei Fedotov,
>>>>> ZAO Telecom Express
>>>>>
>>>>
>>>
>>
>