How to know how many sockets need to load balance the high network traffic - TCP-IP

How to know how many sockets need to load balance the high network traffic

Hi, I am designing a network that has the traffic about 300MB per
second. I am worried that the amount of the traffic will overwhelm the
socket if I only create one socket to handle the data.
Does anyone know how to estimate how many sockets I need to create to
load balance the traffic? Are there any tools for the estimation?

thanks very much,

Re: How to know how many sockets need to load balance the high network traffic

In article <1186264971.672383.157830@w3g2000hsg.googlegroups.com>, zho_zhang11790@yahoo.com wrote:
> Hi, I am designing a network that has the traffic about 300MB per
> second. I am worried that the amount of the traffic will overwhelm the
> socket if I only create one socket to handle the data.
> Does anyone know how to estimate how many sockets that I need to
> create to load balance
> the traffic? Are there some tools to for the estimation?
>
> thanks very much,

Your question doesn't make much sense. Sockets are abstractions in
networking APIs, they aren't part of the network itself.

What exactly is it you're trying to optimize, the network
infrastructure, the design of a server, etc.?

Re: How to know how many sockets need to load balance the high network traffic

On Aug 4, 10:04 pm, Barry Margolin wrote:
> In article <1186264971.672383.157...@w3g2000hsg.googlegroups.com>,
>
> zho_zhang11...@yahoo.com wrote:
> > Hi, I am designing a network that has the traffic about 300MB per
> > second. I am worried that the amount of the traffic will overwhelm the
> > socket if I only create one socket to handle the data.
> > Does anyone know how to estimate how many sockets that I need to
> > create to load balance
> > the traffic? Are there some tools to for the estimation?
>
> > thanks very much,
>
> Your question doesn't make much sense. Sockets are abstractions in
> networking APIs, they aren't part of the network itself.
>
> What exactly is it you're trying to optimize, the network
> infrastructure, the design of a server, etc.?
>
> --
> Barry Margolin, bar...@alum.mit.edu
> Arlington, MA
> *** PLEASE post questions in newsgroups, not directly to me ***
> *** PLEASE don't copy me on replies, I'll read them in the group ***

Many thanks for your response. This is my first independent project:
I am trying to implement an optimized client-server computing system
for our experiment using the POSIX socket API. The client keeps sending
a huge amount of data over a high-speed network to the server. The
server needs to load the data into the database in time, as well as
preprocess it and forward it to other applications for further
processing. The data could be sent in parallel.
I have two ways to implement this. One is to create one connection
between the client and server in one thread. The other is to create
multiple connections in multiple threads. I am not sure which way is
better in this case.

I am not a CS major. I have just studied Richard Stevens' book and have
a little simple client-server programming experience, so it is a real
challenge for me to make this decision at the start. Any suggestions
will be extremely helpful.

thanks

Re: How to know how many sockets need to load balance the high network traffic

zho_zhang11790@yahoo.com wrote:
> The
> client keeps sending huge amount of data through the high speed
> network to the server. The server needs to load the data into the
> database in time as well as preprocess and forward the data to other
> applications for further processing. The data could be sent in
> parallel.

First, I would start by figuring out where the bottlenecks are and attack
them first. You've got several things going on here:

Until you have some clue which of those are the rate limiting step(s),
there's no point in trying to optimize any of them.
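One way to get that clue is to time each stage separately before touching the socket code. Here is a minimal sketch in Python, assuming the stages you described (receive, preprocess, database load). The stage functions and `profile_stages` are hypothetical stand-ins for illustration, not your real code:

```python
import time
from collections import defaultdict


def profile_stages(stages, items):
    """Run each item through the stages in order and total the time spent per stage."""
    totals = defaultdict(float)
    for item in items:
        for name, func in stages:
            start = time.perf_counter()
            item = func(item)
            totals[name] += time.perf_counter() - start
    return dict(totals)


# Hypothetical stand-ins for the real work; replace with the actual code paths.
def receive(block):
    return bytes(block)            # pretend to copy a block out of the socket


def preprocess(block):
    return block[::-1]             # pretend transformation


def load_into_database(block):
    time.sleep(0.001)              # pretend the database insert is the slow part
    return block


STAGES = [
    ("receive", receive),
    ("preprocess", preprocess),
    ("database", load_into_database),
]

if __name__ == "__main__":
    blocks = [bytearray(64 * 1024) for _ in range(50)]
    totals = profile_stages(STAGES, blocks)
    # Print the slowest stage first.
    for name, seconds in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{name:10s} {seconds * 1000:8.2f} ms")
```

Whichever stage dominates the totals on the real system is the one worth attacking first.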

Re: How to know how many sockets need to load balance the high network traffic

zho_zhang11790 wrote:
> Hi, I am designing a network that has the traffic about 300MB per
> second. I am worried that the amount of the traffic will overwhelm the
> socket if I only create one socket to handle the data.

IIUC, you need to transfer 2.4 Gbit/s between two computers?

How are these two computers connected to each other?

Re: How to know how many sockets need to load balance the high network traffic

On Aug 6, 6:27 am, Spoon wrote:
> zho_zhang11790 wrote:
> > Hi, I am designing a network that has the traffic about 300MB per
> > second. I am worried that the amount of the traffic will overwhelm the
> > socket if I only create one socket to handle the data.
>
> IIUC, you need to transfer 2.4 Gbit/s between two computers?
>
> How are these two computers connected to each other?

Sorry. 300MB means 300Mega Bits.

Re: How to know how many sockets need to load balance the high network traffic

In article <1186448021.353006.42290@d55g2000hsg.googlegroups.com>, wrote:
>> > Hi, I am designing a network that has the traffic about 300MB per
>> > second. I am worried that the amount of the traffic will overwhelm the
>> IIUC, you need to transfer 2.4 Gbit/s between two computers?
>>
>> How are these two computers connected to each other?
>
>Sorry. 300MB means 300Mega Bits.

Sorry, but the convention of more than 10 years' standing (and probably
much longer) among knowledgeable network professionals differs. 'B' is
bigger than 'b', so 'B' stands for Bytes while 'b' stands for bits.
"300 MB/s" should mean "300 megaBytes per second." I long ago gave up
arguing 'b' versus 'B' and always spell out MBytes and Mbits. The few
extra keystrokes save a lot of confusion.

Some say that in telecommunications circles K, M, and G stand for
10^3, 10^6, and 10^9, but mean 2^10, 2^20, and 2^30 for bits or bytes of
RAM and disk space. I figure the difference of less than 10% (for
T=10^12 versus 2^40) is usually too small to worry about, and when it
does matter it requires explicit words.
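The arithmetic behind both points can be checked in a few lines. This is only an illustrative sketch; the function names are made up for the example:

```python
def mbytes_to_mbits(mbytes_per_s):
    """Convert megabytes/s to megabits/s (8 bits per byte, same decimal prefix)."""
    return mbytes_per_s * 8


def binary_excess_percent(power_of_ten, power_of_two):
    """Percent by which 2**power_of_two exceeds 10**power_of_ten."""
    return (2 ** power_of_two / 10 ** power_of_ten - 1) * 100


if __name__ == "__main__":
    # "300 MB/s" read as megabytes is eight times "300 Mb/s" read as megabits.
    print(mbytes_to_mbits(300) / 1000, "Gbit/s if 300 MB/s means megabytes")
    for name, p10, p2 in [("K", 3, 10), ("M", 6, 20), ("G", 9, 30), ("T", 12, 40)]:
        pct = binary_excess_percent(p10, p2)
        print(f"{name}: 2^{p2} is {pct:.2f}% larger than 10^{p10}")
```

The factor of 8 between bits and bytes matters far more than the decimal versus binary prefix gap, which stays under 10% even at the T prefix.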

As for the original question of whether to use one or more than one
socket, remember that any sort of multi-threading or multi-processing
costs additional CPU cycles and memory bandwidth. If the processing
of the data by the sending or receiving computer cannot be spread among
multiple CPUs (and often even if it can be spread), you will probably
achieve the greatest throughput with a single network connecting one
process with a single socket on each computer.

Multiple sockets, multiple networks, and so forth are powerful
tools, but only for applications that can be divided into largely
independent, parallel tasks.

As Roy Smith wrote nearby, before optimizing pushing bits through
sockets, it would be wise to locate and understand the existing
bottlenecks. Doing otherwise is likely to cause costly and wrong
premature optimizations. There are zillions of painful reasons why the
following Google search has so many hits:
http://www.google.com/search?q=%22pr...ptimization%22

Re: How to know how many sockets need to load balance the high network traffic

[I'm just assuming that TCP is to be used because the experiment
doesn't like to miss data. The original requestor didn't make that
clear though.]

Vernon Schryver writes:
> As for the original question of whether to use one or more than one
> socket, remember that any sort of multi-threading or
> multi-processing costs additional CPU cycles and memory bandwidth.
> If the processing of the data by the sending or receiving computer
> cannot be spread among multiple CPUs (and often even if it can be
> spread), you will probably achieve the greatest throughput with a
> single network connecting one process with single socket on both
> computers.

On the other hand, multiple sockets can help with multicore CPUs and
network adapters that are able to demultiplex packets from different
TCP sockets to different cores, e.g.
http://www.sun.com/products/networki...rnet/index.xml

Even when the application itself cannot use multiple threads, this
could be used to distribute (TCP) packet processing overhead over
multiple cores.
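To illustrate the demultiplexing idea only: adapters of this kind typically hash the connection 4-tuple to pick a receive queue (and so a core). The sketch below uses CRC32 purely as a stand-in for whatever hash a real adapter implements; the addresses, ports, and function names are hypothetical:

```python
import zlib


def queue_for_flow(src_ip, src_port, dst_ip, dst_port, n_queues):
    """Map a TCP 4-tuple to a receive queue index.

    CRC32 is a stand-in here; it is not the hash any particular adapter uses.
    """
    key = f"{src_ip}:{src_port}-{dst_ip}:{dst_port}".encode()
    return zlib.crc32(key) % n_queues


def queues_used(flows, n_queues):
    """Return the set of queue indexes hit by the given flows."""
    return {queue_for_flow(*flow, n_queues) for flow in flows}


if __name__ == "__main__":
    one = [("10.0.0.1", 40000, "10.0.0.2", 5001)]
    many = [("10.0.0.1", 40000 + i, "10.0.0.2", 5001) for i in range(8)]
    print("1 connection  ->", sorted(queues_used(one, 4)))
    print("8 connections ->", sorted(queues_used(many, 4)))
```

A single connection always lands on one queue no matter how many cores exist; with several connections, how evenly they spread depends on the hash.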

There is another impact of using multiple parallel TCP connections,
and that is the impact of loss, both congestion-induced and other. An
aggregate of multiple connections will be more "robust" to loss, and
compete more aggressively with other users of the network (if any).

> Multiple sockets, multiple networks, and so forth are powerful
> tools, but only for applications that can be divided into largely
> independent, parallel tasks.

If the main purpose of the application (or its bottleneck) is the
transfer of large amounts of data, then that seems like a naturally
parallelizable task - just split/stripe the data.
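A rough sketch of what split/stripe could mean in practice, assuming fixed-size chunks dealt round-robin to the connections. The names `stripe` and `unstripe` are invented for the example, and real code would also have to carry sequence information on the wire:

```python
def stripe(data, n_streams, chunk_size):
    """Split data into chunk_size pieces dealt round-robin to n_streams lists."""
    streams = [[] for _ in range(n_streams)]
    for index, offset in enumerate(range(0, len(data), chunk_size)):
        streams[index % n_streams].append(data[offset:offset + chunk_size])
    return streams


def unstripe(streams):
    """Reassemble the original bytes by taking one chunk from each stream in turn."""
    out = bytearray()
    longest = max((len(s) for s in streams), default=0)
    for i in range(longest):
        for s in streams:
            if i < len(s):
                out += s[i]
    return bytes(out)


if __name__ == "__main__":
    payload = bytes(range(256)) * 4
    parts = stripe(payload, n_streams=3, chunk_size=100)
    print([len(p) for p in parts], unstripe(parts) == payload)
```

Each list in `parts` would be written to its own socket, and the receiver interleaves them back in the same order.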
> As Roy Smith wrote nearby, before optimizing pushing bits through
> sockets, it would be wise to locate and understand the existing
> bottlenecks. Doing otherwise is likely to cause costly and wrong
> premature optimizations. There are zillions of painful reasons why the
> following Google search has so many hits:
> http://www.google.com/search?q=%22pr...ptimization%22

Very true!
--
Simon.

Re: How to know how many sockets need to load balance the high network traffic

Simon Leinen wrote:
>Even when the application itself cannot use multiple threads, this
>could be used to distribute (TCP) packet processing overhead over
>multiple cores.

I know that sentiment is popular. However, based on extended experience
with multi-processors on the vendor side, I think it is more often false
than true. The trouble is that TCP segment and IP packet processing
requires a trivial number of CPU cycles that is generally less than the
cost of the locking required to protect common data structures. IP is
practically free in CPU cycles (assuming no fragment reassembly), except
for that multi-CPU locking. Almost all of the cost of TCP is in waiting
for main memory while moving the data among buffers or computing the
checksum. Recall Van Jacobson's number of something like 120 CPU cycles
per TCP segment, exclusive of DMA copyin/out and checksumming. It's
not the checksum itself, which can be done with fewer than 1 CPU
cycle/byte, but waiting for the data to get in or out of slow RAM and
through caches.

If your CPU runs at 3 GHz and so can checksum at more than 3
GByte/sec, what will it be doing most of the time if your main
memory runs at 1 GByte/sec? There are multiprocessors with main
memories that run at more than 1 GByte/sec, but somehow I don't
think they are the topic of this thread at this time.
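As a back-of-the-envelope check of those numbers, here is a small sketch. It assumes pure streaming with no help from caches and takes 1 cycle/byte as the upper bound on checksum cost; the function name is made up for the example:

```python
def cpu_busy_fraction(cpu_hz, cycles_per_byte, memory_bytes_per_s):
    """Fraction of time the CPU spends checksumming when memory limits the data rate."""
    compute_rate = cpu_hz / cycles_per_byte            # bytes/s the CPU could checksum
    achieved_rate = min(compute_rate, memory_bytes_per_s)
    return achieved_rate / compute_rate


if __name__ == "__main__":
    busy = cpu_busy_fraction(cpu_hz=3e9, cycles_per_byte=1.0, memory_bytes_per_s=1e9)
    print(f"busy {busy:.0%}, waiting on memory {1 - busy:.0%}")
```

Under these assumptions the CPU is busy at most about a third of the time and spends the rest waiting on memory. A cheaper checksum (fewer cycles per byte) only makes the waiting share larger.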

It's handy to just assume that no waiting on inter-CPU locks is needed
to send or receive a TCP segment, but most people who do that have never
looked at the kernel source from start to end of the system call looking
for locks. I don't recall ever seeing Mentat STREAMS source, but I
have seen other multi-CPU TCP implementations. They all have at least
some locking. One memorable example that was ostensibly designed for
"light weight threads" had lock calls every few dozen lines. The locks
are not just in the TCP protocol code but in buffer management and other
stuff such as STREAM head lists. Merely checking a lock and winning
without any contention (e.g. thanks to CPU affinity) burns a lot of CPU
cycles compared to total TCP protocol processing.

I'm sure that's a very nice card, but except for being 10 GE, I don't
see anything revolutionary. For example, CPU affinity to minimize
cache thrashing and overheated kernel locks has been important for
good TCP benchmark numbers for more than 10 years.

>There is another impact of using multiple parallel TCP connections,
>and that is the impact of loss, both congestion-induced and other. An
>aggregate of multiple connections will be more "robust" to loss, and
>compete more aggressively with other users of the network (if any).

Except for competing more aggressively with other users of the network,
I think that is also more wrong than right. If you have enough
buffering (i.e. large enough windows), fast retransmit deals with low
rates of loss invisibly. If you don't have large enough windows or
if you have high loss rates, you'll merely have more TCP state machines
stuck waiting for retransmissions.

For example, consider the extreme case of quite low error rates losing
at most single segments per several RTTs. In this case spreading the
data among several TCP connections will delay the duplicate ACKs which
are needed to trigger fast retransmission.
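A toy model of that delay, under loud assumptions: the sender emits one segment per time slot at a fixed aggregate rate, deals segments round-robin across the connections, and propagation delay is ignored. The threshold of three duplicate ACKs is the standard fast retransmit trigger (RFC 5681); the function name is invented for the example:

```python
DUP_ACK_THRESHOLD = 3  # duplicate ACKs that trigger fast retransmit (RFC 5681)


def slots_until_fast_retransmit(n_connections, threshold=DUP_ACK_THRESHOLD):
    """Count send slots after a loss until the lossy connection sees `threshold` duplicate ACKs.

    Each later segment that arrives on the lossy connection produces one duplicate ACK,
    and with round-robin striping that connection only gets every n-th slot.
    """
    dup_acks = 0
    slot = 0
    while dup_acks < threshold:
        slot += 1
        if slot % n_connections == 0:   # this slot carries a segment on the lossy connection
            dup_acks += 1
    return slot


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(f"{n} connection(s): {slots_until_fast_retransmit(n)} segment slots")
```

In this model the wait for the third duplicate ACK grows linearly with the number of connections, since the total send rate is the same but each connection sees only its share of it.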

>> Multiple sockets, multiple networks, and so forth are powerful
>> tools, but only for applications that can be divided into largely
>> independent, parallel tasks.
>
>If the main purpose of the application (or its bottleneck) is the
>transfer of large amounts of data, then that seems like a naturally
>parallelizable task - just split/stripe the data.

If you do much benchmarking with raw throughput tests such as ttcp,
I think you'll find that parallel TCP connections are no faster and
often slower.
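For anyone who wants to try the comparison, here is a crude loopback sketch in the spirit of ttcp, not a replacement for it. It assumes Python 3.8+ (for `socket.create_server`), and `run_transfer` is a hypothetical helper. Over loopback and through an interpreter the numbers mostly reflect local overhead, so a real test has to run between the actual machines:

```python
import socket
import threading
import time


def _drain(conn, counts, index):
    """Read until the peer closes and record how many bytes arrived."""
    total = 0
    with conn:
        while True:
            chunk = conn.recv(65536)
            if not chunk:
                break
            total += len(chunk)
    counts[index] = total


def run_transfer(total_bytes, n_connections, host="127.0.0.1"):
    """Send total_bytes split evenly over n_connections; return (bytes_received, seconds)."""
    per_conn = total_bytes // n_connections
    payload = b"\0" * 65536
    received = [0] * n_connections
    server = socket.create_server((host, 0))    # port 0: let the OS pick a free port
    port = server.getsockname()[1]

    def send(count):
        with socket.create_connection((host, port)) as s:
            while count > 0:
                n = min(count, len(payload))
                s.sendall(payload[:n])
                count -= n

    start = time.perf_counter()
    senders = [threading.Thread(target=send, args=(per_conn,)) for _ in range(n_connections)]
    for t in senders:
        t.start()
    readers = []
    for i in range(n_connections):
        conn, _ = server.accept()
        r = threading.Thread(target=_drain, args=(conn, received, i))
        r.start()
        readers.append(r)
    for t in senders + readers:
        t.join()
    server.close()
    return sum(received), time.perf_counter() - start


if __name__ == "__main__":
    for n in (1, 2, 4):
        got, secs = run_transfer(64 * 1024 * 1024, n)
        print(f"{n} connection(s): {got / secs / 1e6:.0f} MBytes/s")
```

Run it several times before drawing conclusions; a single run is noisy.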

Re: How to know how many sockets need to load balance the high network traffic

Vernon Schryver wrote:
> In article ,
> Simon Leinen wrote:
> > http://www.sun.com/products/networki...rnet/index.xml
> I'm sure that's a very nice card, but except for being 10 GE, I don't
> see anything revolutionary. For example, CPU affinity to minimize
> cache thrashing and overheated kernel locks has been important for
> good TCP benchmark numbers for more than 10 years.

It isn't. There were at least one or two very similar cards from
other sources shipping well before that one.

rick jones
--
denial, anger, bargaining, depression, acceptance, rebirth...
where do you want to be today?
these opinions are mine, all mine; HP might not want them anyway...
feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...

Re: How to know how many sockets need to load balance the high network traffic

Rick Jones wrote:
>> > http://www.sun.com/products/networki...rnet/index.xml
>
>> I'm sure that's a very nice card, but except for being 10 GE, I don't
>> see anything revolutionary. For example, CPU affinity to minimize
>> cache thrashing and overheated kernel locks has been important for
>> good TCP benchmark numbers for more than 10 years.
>
>It isn't. There were at least one or two very similar cards from
>other source shipping well before that one.

Re: How to know how many sockets need to load balance the high network traffic

Vernon Schryver wrote:
> In article ,
> Rick Jones wrote:
> >> > http://www.sun.com/products/networki...rnet/index.xml
> >
> >> I'm sure that's a very nice card, but except for being 10 GE, I don't
> >> see anything revolutionary. For example, CPU affinity to minimize
> >> cache thrashing and overheated kernel locks has been important for
> >> good TCP benchmark numbers for more than 10 years.
> >
> >It isn't. There were at least one or two very similar cards from
> >other source shipping well before that one.
> It isn't which, "very nice," "revolutionary," or something else?
> I suspect revolutionary.

rick jones
--
firebug n, the idiot who tosses a lit cigarette out his car window
these opinions are mine, all mine; HP might not want them anyway...
feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...