Missed that, sorry.
[color=blue]
> Ignoring the specifics of the OP running AIX vs Linux for a moment,
> would the congestion control selection matter much if there were no
> packet losses? We are still awaiting the OP's "netstat report" :)[/color]

LOL - I'll suspend the pseudo-reality in my brain for a minute and
pretend the universe consists of zero data loss and no impedance
mismatches between any piece of copper anywhere as well as zero fiber
impurities nor defects and 100% TIR. =) OHMMM..

so latest news.. seems there are still some bottlenecks,
but so far it is more and more interesting..

- have changed tunables as described in previous post, and
- currently I have:
-- scp - up to 5 megabytes/s
-- nfs - up to 11 megabytes/s
-- ftp - up to 90 megabytes/s (!)
- question is - what are the reasons for these differences?
and obviously, do we have any way to line up to ftp performance?
I am aware of scp limitations, will install and patch version
with HPN, but what about nfs, any ideas how to improve?

No worries.
[color=blue][color=green]
> > Ignoring the specifics of the OP running AIX vs Linux for a
> > moment, would the congestion control selection matter much if
> > there were no packet losses? We are still awaiting the OP's
> > "netstat report" :)[/color][/color]
[color=blue]
> LOL - I'll suspend the pseudo-reality in my brain for a minute and
> pretend the universe consists of zero data loss and no impedance
> mismatches between any piece of copper anywhere as well as zero
> fiber impurities nor defects and 100% TIR. =) OHMMM..[/color]

Yep :) It is an easy morph for me to go from Rick Jones to Don Jones
to Don Quixote. :)

Still want to see the netstat statistics though...

rick jones

--
oxymoron n, commuter in a gas-guzzling luxury SUV with an American flag
these opinions are mine, all mine; HP might not want them anyway... :)
feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...

09-18-2008, 08:07 PM

unix

Re: 10GB link vs 3Mb/s transfer

[email]serwan@gdziestamna.il.pw.edu.pl[/email] wrote:[color=blue]
> so latest news.. seems there are still some bottlenecks,
> but so far it is more and more interesting..[/color]

BTW, the WAN link is 10Gbit - I have been ass-u-me-ing that the
end-hosts also have 10Gbit interfaces. Can you confirm that?
[color=blue]
> - have changed tunables as described in previous post, and
> - currently I have:
> - scp - up to 5 megabytes/s
> - nfs - up to 11 megabytes/s
> - ftp - up to 90 megabytes/s (!)
> - question is - what are the reasons for these differences?[/color]

So, what is your "networking person" saying now?-)

FTP - open the file, open the socket, dump contents of first into
second, perform the inverse at the other side. Pure unidirectional
transfer, no waiting for application level replies from the receiver.
Flow control is up to TCP. Apart from the file system interaction
(which at 90 MB/s you may be hitting against) it looks just like a
netperf TCP_STREAM test.

scp - has its own "window" as it were and as such ends-up waiting for
application layer (relative to TCP) replies from the receiver. So, it
will be constrained by the slower of the two flow-control mechanisms -
that of SSL/TLS and that of TCP. If scp cannot have enough
outstanding on the connection at one time it won't matter that TCP
would allow more.

The encryption/decryption has significantly higher CPU overheads and
as you ramp-up the throughput you have to start looking for one or
more of the CPUs on either end saturating. Do look at _individual_
CPU utilization not overall - in broad terms a single TCP connection
will not make use of the cycles of more than one or two
cores/threads/whatnot.

NFS - it is a request/reply application. There can be only so many
NFS requests outstanding at one time, which again is a form of flow
control sitting above that of TCP. If NFS cannot have enough requests
outstanding at one time, it won't matter that TCP would allow more.
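
Those application-level ceilings can be put to numbers. A minimal sketch of the window/RTT limit, using the 25 ms RTT mentioned elsewhere in the thread; the 16 outstanding 32 KB requests are a hypothetical NFS configuration, not measured values:

```shell
# Throughput ceiling imposed by an application-level window (applies
# to scp's internal window and to NFS outstanding requests alike):
#   max_throughput = bytes_outstanding / RTT, capped by the link rate.
awk 'BEGIN {
  rtt = 0.025                 # seconds, RTT from the thread
  outstanding = 16 * 32768    # hypothetical: 16 requests of 32 KB each
  printf "app-window ceiling: %.1f MB/s\n", outstanding / rtt / 1048576
}'
```

With those assumed numbers the ceiling comes out around 20 MB/s no matter how large the TCP window is - which is why raising only the TCP tunables may not move scp or NFS.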
[color=blue]
> and obviously, do we have any way to line up to ftp performance?
> I am aware of scp limitations, will install and patch version
> with HPN, but what about nfs, any ideas how to improve?[/color]

You need to find-out how to get AIX to allow either greater
read-ahead, or greater write-behind depending on whether this is
reading a file over NFS or writing it.

You can "simulate" the NFS over TCP behaviour if you:

./configure --enable-burst

for netperf and then to simulate reading a file from an NFS server you
would say something like:

netperf -H <server> -t TCP_RR -f M -v 2 -- -s <val> -S <val> -r 256,<mountsize> -b <max outstanding NFS requests>

Where <val> needs to be at least as large as <mountsize>*<maxrequests>
because the netperf TCP_RR test is being (ab)used and doesn't use
select/poll, so we want to make sure that it can _always_ do that many
writes to the socket without blocking or it may deadlock. The -v 2
option is to get it to report a bit more about the throughput in each
direction.

And as before, you need to be checking netstat statistics for TCP for
before and after each of these transfers to be certain there are no
packet losses...
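
The before/after comparison can be as simple as saving the statistics and diffing them. A sketch - the snapshot contents below are made-up placeholders standing in for live `netstat -s -p tcp` output, and the counter names vary by OS:

```shell
# Snapshot TCP statistics before and after a transfer, then diff to
# spot any counter that moved (retransmits, drops, bad checksums).
# Placeholder data simulates the two snapshots:
cat > /tmp/tcp_before.txt <<'EOF'
        1200 data packets retransmitted
        3 discarded for bad checksums
EOF
cat > /tmp/tcp_after.txt <<'EOF'
        1450 data packets retransmitted
        3 discarded for bad checksums
EOF
# In real use: netstat -s -p tcp > /tmp/tcp_before.txt  (and _after)
diff /tmp/tcp_before.txt /tmp/tcp_after.txt | grep -E '^[<>]'
```

Any line that shows up in the diff is a counter that changed during the transfer; a growing retransmit count means the loss-free assumption above doesn't hold.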

rick jones
--
portable adj, code that compiles under more than one compiler
these opinions are mine, all mine; HP might not want them anyway... :)
feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...

09-22-2008, 03:49 PM

unix

Re: 10GB link vs 3Mb/s transfer

Enable TCP window scaling/large windows (RFC 1323) and SACK.

These options need to be enabled on both TCP stacks.

It would help a lot if you used something like MRTG or PRTG to monitor the
10gig port in your environment.... You may indeed be getting the
throughput of the link, but there are other factors involved like
protocol overhead and so on.
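
A quick back-of-the-envelope shows why window scaling matters on this path (10 Gbit/s link, and the 25 ms RTT mentioned elsewhere in the thread):

```shell
# Bandwidth-delay product: the window needed to fill the pipe is
# bandwidth * RTT. Without RFC 1323 the window tops out at 64 KB,
# which imposes its own throughput ceiling of 64KB / RTT.
awk 'BEGIN {
  bw  = 10e9     # bits/s
  rtt = 0.025    # seconds
  printf "BDP window needed:   %.1f MB\n",   bw * rtt / 8 / 1048576
  printf "64KB-window ceiling: %.1f MB/s\n", 65536 / rtt / 1048576
}'
```

So a ~30 MB window is needed to fill the link, while an unscaled 64 KB window caps a single connection at roughly 2.5 MB/s - close to the numbers the OP started with.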

Rick Jones wrote:[color=blue]
> [email]serwan@gdziestamna.il.pw.edu.pl[/email] wrote:[color=green][color=darkred]
>>> 3 MB/s using _what_ *exactly* for the transfer?[/color][/color]
>[color=green]
>> not sure if understood your question - just to try to answer:
>> - ftp - up to 3MB/s
>> - scp - up to 3MB/s
>> - nfs - once achieved up to 7 MB/s, usually ~4,5MB/s[/color]
>
> That is what I was looking for.
>[color=green][color=darkred]
>>> _Which_ tunable parameters. You really need to be much more specific.
>>> Names and values.[/color][/color]
>[color=green]
>> And this is interesting; in IBM's manuals there are several parameters
>> mentioned; I have tested, as described here:
>> [url]http://publib.boulder.ibm.com/infocenter/systems/index.jsp?topic=[/url]
>> /com.ibm.aix.prftungd/doc/prftungd/tcp_udp_perf_tuning.htm
>> - tcp_sendspace - values from 16384 up to 655360 (no results)
>> - tcp_recvspace - values from 16384 up to 655360 (no results)
>> - sb_max - values from 1048576 up to 1310720 (no results)[/color]
>
> For a 10Gbit link with 25ms RTT you will need to be going much larger
> than those values I suspect.
>[color=green]
>> and, finally, yesterday evening I have found also:
>> - tcp_nodelayack - have changed it from 0 to 1, and some improvement
>> was observed - average transfer was:
>> -- ftp - up to 5 MB/s
>> -- scp - up to 5 MB/s
>> -- nfs - up to 10 MB/s
>> little bit better, but still - can I have more?[/color]
>
> That tcp_nodelayack affected ftp or scp is surprising to me. A bulk
> transfer such as that performed by FTP should not need immediate ACKs.
> As for the rest, those settings you changed are likely defaults. If
> FTP or scp are making their own setsockopt() calls to set socket
> buffer sizes, it stands to reason that a change in the defaults would
> not result in a change in performance. Ostensibly, if there is a way
> to get FTP to use a different socket buffer size (on _both_ ends) it
> would appear in the manpages for ftp and ftpd. Taking a system call
> trace of the ftp client (or ftpd) would show if it is making
> setsockopt() calls.
>
> NFS may be using the defaults.
>
> rick jones[/color]

09-24-2008, 05:05 PM

unix

Re: 10GB link vs 3Mb/s transfer

So, any word on where things stand now?

rick jones

Rick Jones <rick.jones2@hp.com> wrote:[color=blue]
> [email]serwan@gdziestamna.il.pw.edu.pl[/email] wrote:[color=green]
> > so latest news.. seems there are still some bottlenecks,
> > but so far it is more and more interesting..[/color][/color]
[color=blue]
> BTW, the WAN link is 10Gbit - I have been ass-u-me-ing that the
> end-hosts also have 10Gbit interfaces. Can you confirm that?[/color]
[color=blue][color=green]
> > - have changed tunables as described in previous post, and
> > - currently I have:
> > - scp - up to 5 megabytes/s
> > - nfs - up to 11 megabytes/s
> > - ftp - up to 90 megabytes/s (!)
> > - question is - what are the reasons for these differencies?[/color][/color]
[color=blue]
> So, what is your "networking person" saying now?-)[/color]
[color=blue]
> FTP - open the file, open the socket, dump contents of first into
> second, perform the inverse at the other side. Pure unidirectional
> transfer, no waiting for application level replies from the receiver.
> Flow control is up to TCP. Apart from the file system interaction
> (which at 90 MB/s you may be hitting against) it looks just like a
> netperf TCP_STREAM test.[/color]
[color=blue]
> scp - has its own "window" as it were and as such ends-up waiting for
> application layer (relative to TCP) replies from the receiver. So, it
> will be constrained by the slower of the two flow-control mechanisms -
> that of SSL/TLS and that of TCP. If scp cannot have enough
> outstanding on the connection at one time it won't matter that TCP
> would allow more.[/color]
[color=blue]
> The encryption/decryption has significantly higher CPU overheads and
> as you ramp-up the throughput you have to start looking for one or
> more of the CPUs on either end saturating. Do look at _individual_
> CPU utilization not overall - in broad terms a single TCP connection
> will not make use of the cycles of more than one or two
> cores/threads/whatnot.[/color]
[color=blue]
> NFS - it is a request/reply application. There can be only so many
> NFS requests outstanding at one time, which again is a form of flow
> control sitting above that of TCP. If NFS cannot have enough requests
> outstanding at one time, it won't matter that TCP would allow more.[/color]
[color=blue][color=green]
> > and obviously, do we have any way to line up to ftp performance?
> > I am aware of scp limitations, will install and patch version
> > with HPN, but what about nfs, any ideas how to improve?[/color][/color]
[color=blue]
> You need to find-out how to get AIX to allow either greater
> read-ahead, or greater write-behind depending on whether this is
> reading a file over NFS or writing it.[/color]
[color=blue]
> You can "simulate" the NFS over TCP behaviour if you:[/color]
[color=blue]
> ./configure --enable-burst[/color]
[color=blue]
> for netperf and then to simulate reading a file from an NFS server you
> would say something like:[/color]
[color=blue]
> netperf -H <server> -t TCP_RR -f M -v 2 -- -s <val> -S <val> -r 256,<mountsize> -b <max outstanding NFS requests>[/color]
[color=blue]
> Where <val> needs to be at least as large as <mountsize>*<maxrequests>
> because the netperf TCP_RR test is being (ab)used and doesn't use
> select/poll, so we want to make sure that it can _always_ do that many
> writes to the socket without blocking or it may deadlock. The -v 2
> option is to get it to report a bit more about the throughput in each
> direction.[/color]
[color=blue]
> And as before, you need to be checking netstat statistics for TCP for
> before and after each of these transfers to be certain there are no
> packet losses...[/color]
[color=blue]
> rick jones
> --
> portable adj, code that compiles under more than one compiler
> these opinions are mine, all mine; HP might not want them anyway... :)
> feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...[/color]

--
Process shall set you free from the need for rational thought.
these opinions are mine, all mine; HP might not want them anyway... :)
feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...