Monday, May 19, 2008

GC: JAVA 1.4

Several GC algorithms come up in connection with JVM 1.4, namely:
reference counting
tracing
stop n copy
generational
mark n sweep
Does the JVM spec say whether any of these algorithms has an advantage over the others, and in which situations each one should be used? And how do we select one of them? Do we specify it somewhere, such as in JAVA_OPTIONS?

--This is the first post. I'll go into more detail on this later. This post will serve as a starting point for our blog.

5 comments:

Tapan said...

There is no perfect algorithm that performs well in every real-world situation. In fact, generational collectors make use of tracing (which is 'mark and sweep'), copying, and compacting algorithms. I think all JVMs use generational collectors by default.
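To make the tracing idea concrete, here is a minimal toy sketch of mark and sweep over a hand-built object graph. It is not how any JVM is implemented; the names ToyHeap and Node are made up for illustration, and it assumes a modern JDK (Java 8 or later) rather than 1.4.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a tracing (mark-and-sweep) collector.
// ToyHeap and Node are hypothetical names, not JVM internals.
class ToyHeap {
    static class Node {
        final String name;
        final List<Node> refs = new ArrayList<>(); // outgoing references
        boolean marked;
        Node(String name) { this.name = name; }
    }

    final List<Node> objects = new ArrayList<>(); // everything allocated
    final List<Node> roots = new ArrayList<>();   // stand-in for stack/static references

    Node allocate(String name) {
        Node n = new Node(name);
        objects.add(n);
        return n;
    }

    // Mark phase: flag everything reachable from a given node.
    private void mark(Node n) {
        if (n.marked) return;
        n.marked = true;
        for (Node ref : n.refs) mark(ref);
    }

    // Marks from the roots, then sweeps unmarked objects and clears the
    // marks for the next cycle. Returns the number of objects reclaimed.
    int collect() {
        for (Node root : roots) mark(root);
        int before = objects.size();
        objects.removeIf(n -> !n.marked);
        for (Node n : objects) n.marked = false;
        return before - objects.size();
    }

    public static void main(String[] args) {
        ToyHeap heap = new ToyHeap();
        Node a = heap.allocate("a");
        Node b = heap.allocate("b");
        Node c = heap.allocate("c");
        Node d = heap.allocate("d");
        heap.roots.add(a);
        a.refs.add(b); // a -> b, reachable from a root
        c.refs.add(d); // c and d reference each other
        d.refs.add(c); // but nothing reachable points at them
        System.out.println("reclaimed " + heap.collect() + " objects"); // reclaimed 2 objects
    }
}
```

The c/d pair is a cycle, which plain reference counting would never free, while tracing from the roots reclaims it.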

For generational collectors, the heap is broadly divided into a young and a tenured generation. For the young generation, a copying collector such as the parallel garbage collector can be used, and for the tenured generation, a mark-and-sweep collector such as the concurrent low-pause collector (or a mark-and-compact collector) can be used.

There are other options available too, and you can always tune the JVM parameters for your collectors.

I think that through JAVA_OPTIONS you can specify your choice of garbage collector and its properties, like this:
SET JAVA_OPTIONS=%JAVA_OPTIONS% -XX:+UseParallelGC

and then..

java %JAVA_OPTIONS% ClassName
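One way to check which collectors the JVM actually picked up from such a flag is to ask the running JVM. The sketch below is only an illustration: it relies on the java.lang.management API, which arrived in Java 5 and is not available on 1.4, and the class name ShowCollectors is made up. The names printed depend on the JVM vendor and version.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Lists the garbage collectors active in the running JVM.
// Assumes Java 5+; ShowCollectors is a hypothetical class name.
class ShowCollectors {
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Typically prints one name for the young and one for the old generation.
        System.out.println(collectorNames());
    }
}
```

Running it once with and once without the flag should show different collector names if the option was honoured.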

Saurabh said...

We can specify the garbage collection algorithm to be used in the Java options, for example:
-XX:+UseParallelGC
-XX:+UseConcMarkSweepGC

The performance of the various garbage collector algorithms depends on the number of processors being used.
For example, the throughput (parallel) garbage collector uses multiple threads for minor collections. If there are 4 CPUs, it uses 4 threads. On a 1-CPU machine, however, it would not perform as well as the serial collector.
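To see how many processors the JVM believes it has (the number the point above depends on), a small sketch like this can help. The class name CpuCount is made up, and the number of GC threads a given JVM actually starts may differ from the CPU count.

```java
// Prints the processor count visible to the JVM.
// CpuCount is a hypothetical class name used only for this sketch.
class CpuCount {
    static int cpus() {
        // Runtime.availableProcessors() has been available since Java 1.4.
        return Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args) {
        int n = cpus();
        System.out.println("Processors visible to the JVM: " + n);
        if (n == 1) {
            System.out.println("Single CPU: a parallel collector is unlikely to help here.");
        }
    }
}
```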

That is one of the factors. There can be others, for example, if we need short pauses in the application during GC, we use the concurrent collector.

I am not sure about one thing: while the GC thread is executing, do the running application threads get stopped for that period?

Tapan said...

Yes, Saurabh. The concurrent collector does most of its work, the marking and sweeping of the tenured generation, while the application is running (the other application threads continue to execute). But the application still comes to a halt for short marking steps and for the copy/compact phases, and that is called a 'pause'. If you want to minimize the 'pause', collections should happen frequently, but that comes at the cost of 'throughput'. So there is a trade-off, but you can always choose between pause and throughput. If you know that your objects are short-lived, frequent collection will be useful (it gives you low pauses); otherwise, you will go for throughput.
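The throughput side of this trade-off can be estimated as the share of elapsed time not spent in GC. Below is a rough sketch under some assumptions: it needs Java 7 or later, the class and method names are made up, and getCollectionTime() is only an approximate accumulated time (for concurrent collectors it is not the same thing as pause time).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Rough estimate of GC throughput for the running JVM.
// GcThroughput and its methods are hypothetical names for this sketch.
class GcThroughput {
    // Total milliseconds attributed to GC so far, summed over all collectors.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) total += t;
        }
        return total;
    }

    // Fraction of elapsed time NOT spent in GC (1.0 means no GC time at all).
    static double throughput(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) return 1.0;
        return 1.0 - (double) gcMillis / uptimeMillis;
    }

    public static void main(String[] args) {
        // Create some short-lived garbage so that minor collections are likely.
        for (int i = 0; i < 200_000; i++) {
            byte[] junk = new byte[1024];
        }
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        long gcTime = totalGcMillis();
        System.out.println("GC time: " + gcTime + " ms of " + uptime + " ms");
        System.out.println("Approximate throughput: " + throughput(gcTime, uptime));
    }
}
```

For instance, 50 ms of GC in 1000 ms of uptime gives a throughput of about 0.95.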

Can anyone explain what 'footprint' is? Or what is meant by 'memory pages'? And what are the downsides of giving a high value for the heap size (-Xmx or -Xms)?

Saurabh said...

Footprint is the total space required by the JVM, that is, its working space (heap, stack, method area, its own cache).

We can specify the heap size depending on the requirement.
For example, in production we can set the max and min to the same value so that heap resizing does not happen, since we know that no other process is running on that machine and all the memory can be used for this application.
But if there are many processes running, then we should probably limit the max size of the heap, as the other processes also need memory. Otherwise "paging of virtual memory to disk" (as I understand it, the copying of data from RAM to hard disk) would degrade performance.
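To check what the heap settings amount to inside a running JVM, the Runtime class can report them. This is a minimal sketch; the class name HeapInfo is made up, and the exact figures will differ a little from the -Xms/-Xmx values because of how each JVM accounts for its generations.

```java
// Prints the heap limits the running JVM is working with.
// HeapInfo is a hypothetical class name used only for this sketch.
class HeapInfo {
    static final long MB = 1024L * 1024L;

    // Upper bound the heap may grow to (roughly the -Xmx value), in megabytes.
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / MB;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        System.out.println("Max heap (roughly -Xmx): " + maxHeapMb() + " MB");
        System.out.println("Currently committed:     " + rt.totalMemory() / MB + " MB");
        System.out.println("Free within committed:   " + rt.freeMemory() / MB + " MB");
    }
}
```

Starting it with, say, -Xms256m -Xmx256m should show the committed size already close to the maximum, which is the "no resizing" setup described above.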

I read somewhere that the working set size is measured in pages.
I am not sure about this. We can discuss it further.

Please comment :)

RamrajChauhan said...

I would say the heap size and the throughput or pause values should be configured after running the application server through several load tests with several trial values.
Values that are valid today for the load the server currently handles may not be practical a year down the line, when the traffic volume has increased. At that point the monitoring has to be done again. This can be done quite easily with profilers.

--I'm ignorant about the footprint concept, but I would believe what Saurabh says.