SUMMARY: efficient use of physical memory

We have a 4/280 with 128 Meg of physical memory and about 700 Meg
of swap. We often run large processes of around 512 Meg virtual
size. We are running SunOS 4.1.1 and DBE 1.1. The problem I have
noted recently is that while running the process and watching
ps -avx, the maximum reported RSS is only about 55% of the
virtual size. Running vmstat shows
considerable paging in, no paging out, and about the same number
of pages being freed. There is only one user on the system and
only a few processes. The large process essentially processes
a query, then returns the result to a relatively small process for
display. The display process is about 40 Meg virtual size.

My question or concern is that the big process does not seem to
be using all available physical memory; nor is any other process.
Also, when the query results are returned to the display process,
the large process quickly goes to reporting zero Resident Set Size.
Is there some hard limit that is restricting one process to about
half of the physical memory? Or is the clock algorithm aging these
pages too quickly, so that I am getting a large number of "soft"
page faults? That is, pages are marked for freeing but are actually
still in memory, so reclaiming them is cheaper than going to disk
but still more expensive than if they had remained part of my
resident set.
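
For what it's worth, the soft-fault theory can be checked against
the cumulative counters printed by vmstat -s. The little wrapper
below is only a sketch: the counter wording ("reclaims", "pages
paged in") varies between releases, so the patterns and the
fault_summary name are my own, not anything official.

```shell
# Sketch only: summarize soft vs. hard fault counters from the
# cumulative `vmstat -s` output.  The line wording differs by
# release, so the /reclaim/ and /pages paged in/ patterns here
# are assumptions -- check them against your own vmstat -s.
fault_summary() {
    awk '
        /reclaim/        { reclaims += $1 }
        /pages paged in/ { pageins  += $1 }
        END {
            printf "reclaims=%d pageins=%d\n", reclaims, pageins
            if (pageins + reclaims > 0)
                printf "soft-fault share: %d%%\n",
                    100 * reclaims / (pageins + reclaims)
        }
    '
}
```

Usage would be `vmstat -s | fault_summary`; a high soft-fault
share would fit the pages-aged-too-quickly theory.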

Let's see, other information: some disk activity, but not too
heavy. No swap outs reported by vmstat, but moderately heavy
swap ins. pmegstat shows no pmegs stolen, but lots are allocated
and in the MMU. pmegstat also reports lots of stolen contexts.

The most helpful reply is once again due to Hal Stern, who
suggested turning off the swapper. This may be done with
an adb patch:
adb -k -w /vmunix /dev/mem
nosched?W 1
$q

Don't forget to save a copy of the kernel first. Also, the system
must be rebooted after the patch. If you did it right,
vmstat -S 5 will show zeros for both the si and so columns.
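
If you want something more than eyeballing those columns, a sketch
like the one below could watch the vmstat -S output for you. The
check_swap name is made up; I am assuming the header line labels
the columns si and so, and the script finds their positions from
the header rather than hard-coding them, since layouts differ a
little between vmstat versions.

```shell
# Hypothetical helper: pipe `vmstat -S 5` through this to flag any
# sample with a nonzero si (swap-in) or so (swap-out) count.  The
# si/so column positions are read from the header line, so minor
# layout differences between vmstat versions should not matter.
check_swap() {
    awk '
        # locate the si and so columns from the header line
        /si/ && /so/ {
            for (i = 1; i <= NF; i++) {
                if ($i == "si") si_col = i
                if ($i == "so") so_col = i
            }
            next
        }
        # data lines start with a number; skip everything else
        si_col && $1 ~ /^[0-9]+$/ {
            if ($si_col != 0 || $so_col != 0)
                print "swap activity: si=" $si_col " so=" $so_col
        }
    '
}
```

With the swapper off, `vmstat -S 5 | check_swap` should print
nothing at all.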

Other helpful utilities available from your Sun engineer are
pmegstat, which shows the pmeg allocations, and vmpage (evidently
written by Hal Stern?), which shows the allocation of virtual
memory pages.
After turning off swapping, I experienced an immediate improvement
in performance. I also tried changing the handspread but could
not tell much of a difference. I think I need to experiment
further, however.

Here is a summary from Hal on where memory goes and comments
on handspread:

i've enclosed a copy of "vmpage", which tells you where
the various pages are going -- kernel, file cache, process
address space, etc.

1/4 of 128M is 32M unaccounted for. figure with maxusers=64
and nbuf=x70, you lose about 4M in the kernel. various
processes (sendmail, shells, etc) probably use up another 4-8M,
although a lot of them may be swapped out. if you're doing
a lot of file i/o, the rest could be file cache and the
minimum amount of free space left by the VM system.

handspread = 0x7ffffff is probably too big; you should choose
a value between 32M and 64M for it. if it's too big, you'll
spend a lot of time finding a page to kick out; if it's too
small you'll always toss out the first-scanned rather than
least-referenced pages.
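
Hal's handspread advice can be applied the same way as the
nosched patch earlier in this summary. The lines below are a
sketch only: the 0x3000000 (48M) value is just an illustrative
pick from the suggested 32M-64M range, and I have not verified
that handspread can be patched by name on every kernel.

```shell
# Sketch, by analogy with the nosched patch above: set handspread
# to 48M (0x3000000), inside the suggested 32M-64M range.  The
# 48M value is illustrative, not a recommendation.  Patching
# /vmunix (not just /dev/mem) makes the change survive the reboot.
cp /vmunix /vmunix.orig                          # keep a fallback kernel
echo "handspread?W 0x3000000" | adb -k -w /vmunix /dev/mem
reboot
```

As noted above, I could not tell much of a difference from my own
handspread changes, so treat this strictly as an experiment.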

you are getting stung by the swapper.
what is probably happening is that when the large process
is running, the small process is (possibly) swapped out, or
at least marked for swap out. then it becomes runnable,
and it creates a huge demand for memory -- so the swapper
swaps out your big process.

when the RSS value goes to zero, you've been marked for
swapping. doesn't mean all of your pages are on disk,
but you've been put in the queue. that's probably where
the high number of pageins come from -- you're just
reclaiming pages that were put on the free list when
the huge process got swapped out. this is confirmed by
the large number of soft faults (reclaims/reattaches).

something to try: disable the swapper using
# echo "nosched?W 1" | adb -k -w /vmunix /dev/mem
# reboot
(you have to reboot for this to work)
as for the RSS only getting to 55% of the total process size,
that may be a function of it getting swapped in/out; only
the part you're using is paged back into the resident set. so
you may only touch part of the data/stack segments to handle
a query before you get swapped out again.