Fastest way for multithreaded for each loop?

I already tried Parallel.ForEach, but it wasn't fast enough. Maybe something with background threads? I couldn't get that to work, because every background thread I start runs the whole For Each loop itself.
What I'm trying to do is a "for each line in list".

Re: Fastest way for multithreaded for each loop?

How fast do you need it and why? How fast was Parallel.ForEach and why do you think that that isn't already something with background threads? Maybe you should actually show us the single-threaded code and your attempt at using Parallel.ForEach because there's every chance that you just did it wrong. In fact, if you are correct that each thread does a For Each loop then you definitely did it wrong.
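For comparison, here is a minimal sketch of what a working Parallel.ForEach call tends to look like. It assumes the input is a List(Of String) and uses a hypothetical CheckLine function as a stand-in for the real per-line work; neither comes from the original post. The point is that there is exactly one Parallel.ForEach call and it hands each line to whichever thread is free. You don't start threads yourself and loop inside each of them.

```vbnet
Imports System
Imports System.Collections.Concurrent
Imports System.Collections.Generic
Imports System.Threading.Tasks

Module ParallelSketch
    ' Hypothetical stand-in for the real per-line work (e.g. testing one proxy).
    Function CheckLine(line As String) As Boolean
        Return line.Length > 0
    End Function

    ' Runs CheckLine once per line, spreading the lines across thread pool threads.
    Function CheckAll(lines As IEnumerable(Of String)) As List(Of String)
        ' ConcurrentBag is safe to add to from several threads at once.
        Dim good As New ConcurrentBag(Of String)()
        Parallel.ForEach(lines,
                         Sub(line)
                             If CheckLine(line) Then good.Add(line)
                         End Sub)
        Return New List(Of String)(good)
    End Function

    Sub Main()
        Dim lines As New List(Of String) From {"a", "", "b"}
        Console.WriteLine(CheckAll(lines).Count)
    End Sub
End Module
```

Note that the results come back in no particular order, since the lines are processed in parallel.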

Re: Fastest way for multithreaded for each loop?

I am only guessing here, so I could be wrong, but the limiting factor is going to be how long each attempt takes. If each attempt takes 8 seconds to time out and you have 15,000 things to check, you could be suffering a lot of time penalties there.

Anything you do that uses the thread pool is going to hit the limits of the thread pool (IIRC the default maximum is about 25 threads per core, often less than that). If you managed the threads yourself you could create more, but even then there will be limits to just how much you can speed this up by using multiple threads.
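The actual limits vary with the .NET version and the machine, so rather than rely on a remembered number you can ask the runtime. A minimal sketch using ThreadPool.GetMinThreads and ThreadPool.GetMaxThreads; the module and function names are made up for illustration.

```vbnet
Imports System
Imports System.Threading

Module PoolLimits
    ' Returns the thread pool's current worker-thread limits as (min, max).
    Function WorkerLimits() As Tuple(Of Integer, Integer)
        Dim minWorkers, minIo, maxWorkers, maxIo As Integer
        ThreadPool.GetMinThreads(minWorkers, minIo)
        ThreadPool.GetMaxThreads(maxWorkers, maxIo)
        Return Tuple.Create(minWorkers, maxWorkers)
    End Function

    Sub Main()
        Dim limits = WorkerLimits()
        Console.WriteLine("Cores: " & Environment.ProcessorCount)
        Console.WriteLine("Worker threads: min " & limits.Item1 & ", max " & limits.Item2)
    End Sub
End Module
```

The minimum is worth looking at as well as the maximum. As far as I know the pool only adds threads gradually once it is above the minimum, which could be part of why blocking network calls feel slow under Parallel.ForEach.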

Roughly how long is it taking to process all of the entries? What is the ratio of working to non-working proxies? How long does it take to process a failing / working proxy?

Re: Fastest way for multithreaded for each loop?

Also understand a computer has limits.

A single-core CPU can only really execute one thing at a time. Threads are sort of simulated in that environment by using "time slicing" and noting that if it can execute billions of instructions per second, then it can divide each second into several "pieces" and devote millions of instructions to different things. To humans, that looks enough like "doing multiple things at once" we're happy. Multiple-core CPUs, same thing, just more cores.

There are costs to all of this. The act of switching between tasks is called a "context switch" and involves a little bit of work. If multi-core CPUs don't share cache memory, time can be spent waiting for caches to update. All of this limits exactly how many things a CPU can do in some unit of time.

Now, you're also trying to make network requests. Your hardware has buffers and can only maintain so many open connections. It's also being shared with everything else on your machine that wants to make connections. When receiving data, that data has to go into buffers, then your program has to read it. This can take some time. So, conceptually, if you try to open thousands of connections at once, you'll go far slower than if you opened fewer connections at a time.

So for both threads and connections, if we were to graph performance vs. number of threads/connections, we'd see that we get faster up to some point, then suddenly much, much slower as we overwhelm the system.

So up-front, if you're expecting to be able to do 15,000 tests simultaneously, this is a bad expectation and won't work.
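One way to stay on the fast side of that curve is to cap how many items run at once and let the rest wait their turn. A minimal sketch, assuming Parallel.ForEach is still the tool in use: ParallelOptions.MaxDegreeOfParallelism sets the cap, and Thread.Sleep is only a placeholder for the real network test. The function name and the counters are illustrative, there just to show that the cap holds.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports System.Threading
Imports System.Threading.Tasks

Module BoundedSketch
    ' Processes every item with at most maxParallel running at the same time.
    ' Returns the highest number of simultaneous calls actually observed.
    Function RunBounded(items As IEnumerable(Of Integer), maxParallel As Integer) As Integer
        Dim options As New ParallelOptions() With {.MaxDegreeOfParallelism = maxParallel}
        Dim running As Integer = 0
        Dim peak As Integer = 0
        Dim gate As New Object()

        Parallel.ForEach(items, options,
                         Sub(item)
                             Dim current As Integer = Interlocked.Increment(running)
                             SyncLock gate
                                 If current > peak Then peak = current
                             End SyncLock
                             Thread.Sleep(10)   ' placeholder for the real network test
                             Interlocked.Decrement(running)
                         End Sub)
        Return peak
    End Function

    Sub Main()
        ' 200 dummy items, never more than 16 in flight.
        Console.WriteLine(RunBounded(Enumerable.Range(1, 200), 16))
    End Sub
End Module
```

The right cap is something to find by measuring: try a few values and keep the one where the total time stops improving.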

It's far more likely the best scheduling algorithm will plan at most 2 threads per CPU core you have, which likely means 8-16 total threads. If we assume the maximum of 16, then you're going to have to make roughly 1,000 8-second cycles to finish, so 2 hours is a reasonable expectation. PlausiblyDamp mentioned maybe 25 threads per core. This may or may not work; it all depends on how much I/O contention that introduces. I think he's right that lowering the timeout is sensible: if a proxy has more than 500ms latency I'm not very interested in it. Going from 8 seconds to 500ms is roughly a 16x reduction in your worst-case runtime for doing practically nothing.
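The back-of-the-envelope numbers can be checked with a few lines. This is a rough sketch of the worst case only: it assumes every check runs to the full timeout and that checks run in batches of a fixed size. The function name is made up for illustration.

```vbnet
Imports System

Module EstimateSketch
    ' Worst case: every check runs to the full timeout, and checks run
    ' in batches of "threads" at a time.
    Function WorstCaseSeconds(itemCount As Integer, threads As Integer, timeoutSeconds As Double) As Double
        Dim batches As Double = Math.Ceiling(itemCount / threads)
        Return batches * timeoutSeconds
    End Function

    Sub Main()
        Console.WriteLine(WorstCaseSeconds(15000, 16, 8.0))   ' 7504 seconds, a little over 2 hours
        Console.WriteLine(WorstCaseSeconds(15000, 16, 0.5))   ' 469 seconds, under 8 minutes
    End Sub
End Module
```

Real runs should come in under these figures, since working proxies answer before the timeout.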

But all said and done, what you're writing is more or less a war dialer/port scanner, and these are slow tools.

Re: Fastest way for multithreaded for each loop?

Originally Posted by SoldierCrimes

I have no idea how to keep track of that, but I will lower the timeouts and see how it goes!

I suppose you could use something like https://msdn.microsoft.com/en-us/lib...v=vs.110).aspx to record the times for each request; it would be accurate enough to give you a good idea of where the time is being spent. Ultimately, though, if you are looking at performance you will need some way of timing things, either through code or by using a profiler. If you don't know where the delays are then it is very hard to remove them.
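For timing in code, one option in the framework is System.Diagnostics.Stopwatch. That is an assumption about the kind of thing meant here, since the link is truncated. A minimal sketch, with a made-up TimeIt helper and Thread.Sleep standing in for one proxy test:

```vbnet
Imports System
Imports System.Diagnostics
Imports System.Threading

Module TimingSketch
    ' Runs the given action and returns how long it took in milliseconds.
    Function TimeIt(work As Action) As Long
        Dim sw As Stopwatch = Stopwatch.StartNew()
        work.Invoke()
        sw.Stop()
        Return sw.ElapsedMilliseconds
    End Function

    Sub Main()
        ' Thread.Sleep is a placeholder for one proxy test.
        Dim ms As Long = TimeIt(Sub() Thread.Sleep(50))
        Console.WriteLine("Request took " & ms & " ms")
    End Sub
End Module
```

Logging one of these figures per proxy, along with whether it passed or failed, should show quickly whether the time is going to timeouts or to something else.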