Answers

Invoking asynchronous functionality does not necessarily create a new thread, nor does it necessarily grab a thread from a thread pool to do the work. The documentation used to say that it creates a new thread. Fortunately that has now been corrected. Unfortunately, it still strongly suggests that it'll use extra threads to do the work. This is unhelpful.

It's easy to demonstrate that asynchronous operations don't necessarily consume threads. I showed an example of this here:

The code shown there performs 100 asynchronous networking operations simultaneously. The number of threads in the process is around 10. (The exact number will probably vary a little from one run to another, depending on various things like OS version, system load, number of CPUs on your system, and so on.)

So it's clear that an async operation does not necessarily use a thread - if each of those 100 operations required a thread, you'd see the process using at least 100 threads. In fact it's possible to have hundreds or even thousands of asynchronous operations all running with just a handful of threads.

This is one of the biggest differences between creating your own threads to manage multiple operations, and using async operations. You can support a much greater number of concurrent requests using async programming because an async operation is much lighter weight than a thread.

The only point at which asynchronous operations will create threads for you is if you provide a completion handler - an AsyncCallback to be called when the work is complete. If you do this, a thread pool thread will be used to call your handler. But this only happens once the operation has completed. (So it's not the async operation that creates the thread. It's that threads may be created to notify your program that the operation is complete.) If you use this technique, then you absolutely must care about all the usual synchronization headaches of a multi-threaded program, despite what ArtySaravana appears to be suggesting.

If you don't supply a completion handler, then multiple threads need not be involved. So just pass null as the AsyncCallback parameter, and make sure you call EndXxx (e.g. EndSend, EndReceive, EndRead, or whatever) at some point in the future. The question is: how do you know when to call it? There isn't a completely straightforward answer to that - indeed, the main reason for supplying a completion callback is so that you know when the operation is complete.
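
A minimal sketch of that pattern might look like this (hypothetical: the connected `socket` and the `buffer` are assumed to exist already):

```csharp
using System;
using System.Net.Sockets;

class NullCallbackSketch
{
    // Start a receive with a null AsyncCallback, do other work on this same
    // thread, then collect the result yourself - no extra threads involved.
    static int ReceiveWithoutCallback(Socket socket, byte[] buffer)
    {
        IAsyncResult ar = socket.BeginReceive(
            buffer, 0, buffer.Length, SocketFlags.None,
            null,   // no completion handler - no thread pool callback
            null);  // no state object

        // ...this thread is free to do other work here...

        // EndReceive blocks if the data hasn't arrived yet.
        return socket.EndReceive(ar);
    }
}
```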

It is possible to work out when the operation is done. The IAsyncResult returned by the BeginXxx method has a flag you can read to see if it has completed. It can also give you a WaitHandle that you can block on to wait until the method is done. Of course, at that point you've just turned your async operation back into a synchronous one... However, you might be able to use the 'wait for multiple' WaitHandle technique so that you can still reap the benefits of async operation in a single-threaded world. And in that case you would be able to do as ArtySaravana says, and ignore the usual multithreading headaches.
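
A hedged sketch of that 'wait for multiple' idea, using `WaitHandle.WaitAny` over the `AsyncWaitHandle`s of several pending operations (the `completeOperation` callback is an assumption standing in for the matching EndXxx call):

```csharp
using System;
using System.Threading;

class WaitAnySketch
{
    // A single thread driving several pending async operations: block until
    // any one of them finishes, then process just that one. Note that
    // WaitHandle.WaitAny accepts at most 64 handles per call.
    static void Pump(IAsyncResult[] pending, Action<IAsyncResult> completeOperation)
    {
        WaitHandle[] handles = new WaitHandle[pending.Length];
        for (int i = 0; i < pending.Length; i++)
            handles[i] = pending[i].AsyncWaitHandle;

        int index = WaitHandle.WaitAny(handles);
        completeOperation(pending[index]);  // call the matching EndXxx here
    }
}
```

Because everything happens on the one thread that calls Pump, no locking of shared state is required.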

But be aware that the overwhelming majority of examples out there use completion callbacks. (In fact I don't think I've ever seen an example of how to write a single-threaded async application. It's perfectly possible, it's just not what any of the examples show.) So all the examples you'll find will therefore end up being multi-threaded. (You'll typically see a lot fewer threads than async operations, but it's still somewhat multithreaded.) This is presumably why the myth that async operations cause threads to be created persists.

Asynchronous delegate invocation always runs the work on a thread pool thread. However, it's a bit of a special case, because that's exactly what the purpose of async delegate invocation is: to run work on a thread pool thread.
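
For instance, a delegate's BeginInvoke is, in essence, just a way of queuing work onto the pool - a rough illustration:

```csharp
using System;

class DelegateInvokeSketch
{
    delegate int Worker(int n);

    static void Main()
    {
        Worker work = delegate(int n) { return n * n; };

        // BeginInvoke queues the delegate onto a thread pool thread - here
        // running the work on another thread really is the whole point.
        IAsyncResult ar = work.BeginInvoke(7, null, null);

        int result = work.EndInvoke(ar);  // blocks until the pool thread is done
        Console.WriteLine(result);
    }
}
```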

So the behaviour exhibited by async delegate invocation isn't a reliable indicator of how asynchronous work proceeds in general.

In fact there are many different ways in which asynchronous operations are implemented under the covers.

Networking is the one I know best, mainly because I spent 3 years writing network device drivers for a living. Network send and receive operations quite simply don't need threads in order to proceed most of the time. You need a thread at the point at which you start an operation, but starting an operation takes next to no time. You also need a thread to process the results at some point after the operation completes. (Assuming you actually care about the results...) But for all the time in between, the networking stack has no use for a thread - the operations are all just represented by data structures in memory. On the occasions where the driver system needs to use the CPU, it'll run in kernel mode, stealing CPU time from whatever user-mode thread happened to be executing when the relevant event occurs - often a thread entirely unrelated to the one that launched the operation; it could even be in a completely different process. It usually only borrows the CPU for a very short duration before allowing the computer to continue with whatever it was doing at the time.

(I've left out a few details here, but those are the basics.)

File system access goes through a different path in the OS but the same style of intrinsically threadless operation applies under the covers.

Not everything that implements the async pattern will be layered on top of this kind of mechanism. As you've illustrated, asynchronous delegate invocation doesn't, for example. But async networking and async file access (amongst other things) are intrinsically threadless down in the OS level, and you're able to exploit that in .NET if you use the async pattern to use them.

The example program I linked to really does have 100 asynchronous operations on the go simultaneously: all 100 of those receive operations are live and ready to accept data instantly. The reason so few threads are needed is twofold: (1) the operations complete fairly infrequently and (2) the completions are handled pretty quickly. If you tried to use asynchronous delegate invocation to do this you'd see something completely different happen. If you tried to fire up 100 asynchronous delegate invocations, then unless you're running on a machine with 4 or more processors, you won't actually manage to get all 100 running by default - the thread pool won't spin up more than 25 ordinary worker threads per processor. Even on a quad-CPU box you'll be waiting a while - it ramps the number of threads up pretty slowly.

So I think that illustrates fairly clearly that the mechanisms involved in async networking operations are quite different from those of the CLR thread pool. (Although completion handlers are run on the CLR thread pool. So in practice, thread pool behaviour tends to be commingled with the async networking behaviour.)

So to return to the original poster's question: we were asked about "the differences between Asynchronous invocation and threading". If you wanted to listen for incoming data on 1000 different sockets simultaneously, the difference is that with threading that would require 1000 threads, whereas with asynchronous APIs it can be managed with far fewer.
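
A rough sketch of the asynchronous side of that comparison (the socket array, the buffers, and the `onData` callback are all assumptions):

```csharp
using System;
using System.Net.Sockets;

class ManySocketsSketch
{
    // Listening on many sockets without a thread apiece: each BeginReceive
    // registers interest and returns immediately. The OS keeps the pending
    // receives as data structures, not threads.
    static void ListenAll(Socket[] sockets, byte[][] buffers, AsyncCallback onData)
    {
        for (int i = 0; i < sockets.Length; i++)
        {
            sockets[i].BeginReceive(
                buffers[i], 0, buffers[i].Length, SocketFlags.None,
                onData,       // runs on a pool thread only when data arrives
                sockets[i]);  // state object: which socket completed
        }
    }
}
```

A thread-per-socket design would instead need one blocking Receive call, and hence one dedicated thread, per socket.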

From the program-flow point of view, an asynchronous invocation does not use the same flow as your program; the operation needs to be triggered in an independent flow. If on some occasions the async call does not use a new thread (or a thread from the pool), it needs to use a fiber - maybe that's why you see a 10-to-1 ratio in your example. (What's more, sometimes the CLR runs your "new" thread as a fiber in order to save OS resources.)

For sockets or any other I/O-bound operation, the CLR (and even Windows itself) does not use a thread; a Microsoft Windows device driver handles it. (A topic to discuss in another thread.)

Fibers, as you may know, are independent flows with their own stacks that run on your same thread. This means you can execute several tasks sharing the processor cycles of your threads (of course, in this case parallelism does not apply). .NET does not provide an interface for fibers, but you can easily invoke them using P/Invoke. The only challenge with fibers is that you have to schedule the tasks yourself (switch between fibers).
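
A sketch of what those P/Invoke declarations might look like - these map onto the documented Win32 fiber functions, but the marshalling choices here (e.g. `uint` for the SIZE_T stack size) are simplifications:

```csharp
using System;
using System.Runtime.InteropServices;

class FiberInterop
{
    // .NET offers no managed wrapper for fibers, so the Win32 API is
    // surfaced via P/Invoke. Scheduling (calling SwitchToFiber at the
    // right moments) is entirely up to you.
    delegate void FiberProc(IntPtr parameter);

    [DllImport("kernel32.dll")]
    static extern IntPtr ConvertThreadToFiber(IntPtr parameter);

    [DllImport("kernel32.dll")]
    static extern IntPtr CreateFiber(uint stackSize, FiberProc startAddress, IntPtr parameter);

    [DllImport("kernel32.dll")]
    static extern void SwitchToFiber(IntPtr fiber);

    [DllImport("kernel32.dll")]
    static extern void DeleteFiber(IntPtr fiber);
}
```

A thread must first call ConvertThreadToFiber before it can create and switch to other fibers.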

Important note: the CLR team is working hard on the idea of sharing threads in the near future, so it is recommended to stick to the CLR System.Threading namespace instead of using P/Invoke. They are experimenting with implementations that run your new thread on another thread that is in a "wait" state - something very cool that should enhance performance.

The answer depends on what you want to do. Generally, you will create and manage your own threads when you do not want to use the ones from the thread pool, when you are not doing a lot of "waits", and so on. However, remember that each thread takes up resources.

You still need to worry about synchronization and all the other headaches of multi-threaded programming. The beauty of asynchronous calls is evident from your question. Unless you are using Windows Forms (and the thread-owning-the-control issue), you can program quite unaware of the threads involved in async programming.

Here is a great article about Async programming in .NET that will get you going.

There are many differences. Async delegate invocation actually uses a thread from the pool to execute (much like queuing a task on the ThreadPool object), so the two cannot really be compared: one uses the other.

Threads are independent execution flows, each with its own stack. Imagine them as a different program running "at the same time" as your main program. When you want a specific task to be executed, but want your program to continue as soon as you call it, that is what asynchronous invocation does: it executes your function at the same time.

As you may deduce, you need to keep an eye on shared resources (heap objects), as many of these calls can be in flight at the same time. If they need to access any shared resource, use proper locking to control the access. If the function does not use any shared resource, you will not need to lock anything, as it has its own stack.
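
A minimal illustration of that locking advice, assuming a shared list populated from asynchronous callbacks (the names here are hypothetical):

```csharp
using System.Collections.Generic;

class SharedStateSketch
{
    // Completion callbacks may run on pool threads, so any shared heap
    // state they touch needs to be protected by a lock.
    static readonly object gate = new object();
    static readonly List<byte[]> received = new List<byte[]>();

    // Called from async callbacks, potentially on several threads at once.
    static void OnDataReceived(byte[] chunk)
    {
        lock (gate)   // serialise access to the shared list
        {
            received.Add(chunk);
        }
    }
}
```

Purely local variables need no such protection, since each thread has its own stack.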

Both of the Console.WriteLines always say that the thread being used to execute DoWork, as well as DoWorkCallback, is from the thread pool.

Since the async code needs to be executed on a thread distinct from the user-created threads, I would assume that such a thread needs to come from the thread pool. Isn't this the reason why your program does not consume 1000 threads for 1000 connections - because .NET re-uses the thread pool threads?

Also, I did mean to suggest that you will still need to care about synchronization issues, but I understand how that line could be misleading. I have since changed it.

I'm familiar with fibers. But I'm not aware of any asynchronous mechanisms in .NET that use them. (Although you could write a custom host using the host APIs that would make use of them.)

Which mechanisms use fibers? Async delegate invocation certainly doesn't - it just uses threads. And as you and I have both pointed out, sockets and other I/O uses devices drivers in a way that does not require either threads or fibers. Given that my example used sockets, that's why it used so few threads - fibers don't enter into the picture.

I'm a little confused. You are suggesting that my example might be using fibers. But then you go on to say that to use fibers in .NET you need to use P/Invoke...

Is there any example of code that does not use P/Invoke but which does use fibers in .NET? (And did you look at my example? It's just using sockets, so surely that won't use fibers?)

The idea of borrowing a waiting thread to run other work is cool but also scary - it opens up opportunities for re-entrancy. Under certain circumstances COM would enable this to happen - an STA that is blocked on an outbound COM call can handle incoming calls on that blocked thread. I've seen processes get into a real mess when this happened: if the secondary call does something that blocks, the operation whose thread it 'borrowed' cannot proceed even once its blocking operation is complete until the secondary call also unblocks. Are the CLR team thinking of exposing this as a user-level feature, or just using it for internal work?

The CLR uses fibers internally in order to consume fewer kernel threads. For example, if you create a new thread in your application, the CLR may decide to use a fiber instead, and will do the scheduling for you (as the OS knows nothing about fibers). What I was suggesting is that when you run your async invocations (not using delegates), the internal APM (Asynchronous Programming Model) machinery of the CLR uses a combination of threads and fibers in order to execute your code without using too many threads. The fiber functions are available to native code (the CLR is native) and it uses them (as explained in CLR via C#, where I am double-checking now).

You can still use fibers in your own application via P/Invoke (as this is not implemented in System.Threading). What you actually do is switch your thread into a fiber, which allows you to run several of them.

Going back to your example, you are using the async function from the CLR, which may use a fiber to assign to your socket connection (your socket will then use the Windows drivers to perform async operations). You cannot control how the CLR decides to schedule your task. For example, if you use a timer for sending: all the CLR timers are managed by a single thread, which schedules the execution by queuing the request on the ThreadPool.

I think the confusion (my own included) is that the CLR changes the execution model (from what we were used to in native code), and that's why you find statements in the help saying it "may use a thread".

I agree with you about the scary idea of sharing threads; it will need much more testing, and that's why it couldn't be released in .NET 2 (nor in .NET 3 - as they call the WinFX release). The original idea is not to expose this functionality to the user; it will be a CLR performance enhancement.

I know it's possible to use the CLR hosting APIs to provide a host that uses fibers instead of threads. And that's why the documentation hedges its bets.

But I was under the impression that the default host never used fibers.

So are you basing your statements on the fact that it's technically possible that fibers could be used in a non-standard host? Or are you actually saying that the default host will sometimes use fibers?

Or are you referring to the SQLCLR host? (I'm not too familiar with that, but I know it definitely does break the mapping between CLR threads and Win32 threads. I don't know if it uses fibers, but I know that it very definitely uses a different implementation for CLR threads than the standard host.)

If there's a situation where the standard host for the CLR behaves as you've described, could you give a specific example? (More specific than "if you create a new thread on your application the CLR may decide to use a fiber instead" - in my experience the standard host has never done this, so I'd like to know how I can get it to do that.) Since this runs counter to my current understanding of how the default CLR host behaves, I'd really like to observe this in action to improve my understanding. So I'd like to know how you managed to get this behaviour to occur.

As I have always understood the workings exactly as IanG has, I am also interested in how (and when?) the CLR may choose to use fibers - not so much to allow me to write code to take advantage (ooo, scary), but more to better my understanding of the CLR.

I know it has been a long time since there was activity in this thread, but something you said hit directly on a problem I am having. I am using the asynchronous socket examples from MSDN as my base code. In the Receive_Callback function I am raising an event that passes the received data up to another object to be parsed. Once parsed, it uses a TableAdapter to insert the data into a SQL database. Once this is done, it cycles back, calls the BeginReceive routine again, and waits for further data.

The problem I am having is that once I insert using the TableAdapter object, the socket crashes when I call the BeginReceive method. The error I get is the following:

{"The Undo operation encountered a context that is different from what was applied in the corresponding Set operation. The possible cause is that a context was Set on the thread and not reverted(undone)."}

This is obviously a threading problem. I have done a lot of work to isolate the issue, and by removing the TableAdapter.Insert command everything works just fine. When you mentioned that the SQLCLR threads break the mapping between the CLR threads and the Win32 threads, I saw this as a possible explanation for my problem.

You have a very clear understanding of the threading process and I was hoping you might be able to give me some advice or direct me someplace that would shed some light on my problem.

The problem could be that you don't call EndInvoke before the next call to BeginInvoke. Most documents say that you should always call EndInvoke before you start another operation (sync or async) on the object.

It is absolutely a requirement that you call the corresponding End method. (That would be EndReceive in this case, rather than EndInvoke.)

But I'm guessing that is exactly what the previous poster is doing. The socket doesn't tell you how much data it has received until you call EndReceive, so it's difficult to see how you'd be able to get anywhere without calling it. Unless your code presumes that the receive operation will always receive a particular number of bytes. But if you've assumed that, then there's a bug in your code - socket receive operations make no guarantees about how much data they return in one lump. They can and sometimes do arbitrarily split the data into seemingly random-sized chunks.
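
To make that concrete, here is a hedged sketch of a receive loop that uses the byte count EndReceive returns rather than assuming a fixed-size read (for simplicity it blocks on EndReceive instead of using a callback):

```csharp
using System;
using System.Net.Sockets;

class PartialReceiveSketch
{
    // EndReceive reports how many bytes actually arrived - possibly fewer
    // than requested - so keep receiving until the buffer is full (or the
    // peer closes the connection).
    static void ReceiveExactly(Socket socket, byte[] buffer)
    {
        int offset = 0;
        while (offset < buffer.Length)
        {
            IAsyncResult ar = socket.BeginReceive(
                buffer, offset, buffer.Length - offset, SocketFlags.None,
                null, null);
            int got = socket.EndReceive(ar);  // blocks; returns bytes received
            if (got == 0)
                throw new InvalidOperationException("Connection closed");
            offset += got;
        }
    }
}
```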

That's a surprising error message - I've never seen any mention of context or undo from a socket error before. Are you using any kind of interception-based framework? Could you show us the full stack trace for the exception, because the message on its own doesn't make much sense?

The error has originated from inside the .NET Framework class library code, which is why it doesn't make a lot of sense by the time it has bubbled back up the stack to you. The code in question is the related to the ExecutionContext, which is the mechanism .NET uses to ensure that whatever security context was in place when you initiated an asynchronous operation continues to be in place when callbacks relating to that operation occur. (Without this, you might be able to elevate your permissions by performing asynchronous work.)

From the looks of things, I'd say that an exception got thrown by the asynchronous callback. (Could the event handler you're calling in your Catch block be throwing an exception? You really should be wrapping a second Try ... Catch block around that RaiseEvent to guard against this sort of thing.) As a result, the ExecutionContext attempted to perform its cleanup, but something went wrong during cleanup, causing a second exception to be thrown. This unfortunately means that you've lost the original exception. This makes it hard to diagnose. You really want to know what that original exception was. The only way you can do this is to run under the debugger, configuring it to stop on all exceptions the moment they are thrown. (Unfortunately, that can sometimes drown you in exceptions - some apps happen to throw a lot of recoverable exceptions... So this can be a tedious process.)
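
The suggested guard might look something like this sketch (in C# rather than VB; the `DataArrived` event is a hypothetical stand-in for your RaiseEvent):

```csharp
using System;

class CallbackGuardSketch
{
    public static event EventHandler DataArrived;

    // Guard the event raise inside the async callback so a throwing handler
    // can't escape into the ExecutionContext machinery and obscure the
    // original exception.
    static void OnReceiveCompleted(IAsyncResult ar)
    {
        try
        {
            // ...call EndReceive, parse the data, etc...
            EventHandler handler = DataArrived;
            if (handler != null)
                handler(null, EventArgs.Empty);
        }
        catch (Exception ex)
        {
            // Log and contain it here, on the thread where it happened.
            Console.Error.WriteLine(ex);
        }
    }
}
```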

Even taking into account the fact that you're probably seeing a secondary exception rather than the one you want to see, it looks like you've only shown part of the exception information. The stack trace you have shown there appears not to have anything to do with the code snippet you have shown. The stack trace looks to be that for the IO thread that handled the completion. But you said in an earlier message that this was thrown when you call BeginRecieve. Normally when an exception occurs on a worker thread, and is reported later on some other thread, you get two stack traces: the original worker thread one nested inside the trace for the call that ultimately receives the error.

So it looks like I'm only seeing half of the error here - one stack trace instead of two.

Three other things strike me about this:

You said you got the error when you call BeginReceive, but I don't actually see any calls to that method. You're calling BeginReceiveFrom.

Normally .NET reports this sort of asynchronous error on the 'End' call, and not the 'Begin' call. This suggests that something strange is probably happening, and the error you're seeing might well be a side-effect of something else. (Indeed, clearly something strange is going on, because the ExecutionContext's cleanup shouldn't have fallen over and thrown a new exception obscuring the original one.) But can we confirm: exactly where is that exception being thrown from? The stack trace you've shown doesn't tell us this, and the information you've provided before doesn't seem to be consistent with this code sample (see point 1).

You're calling BeginReceiveFrom, but EndReceive. That's a mismatch, and I would expect it to fail or at least cause strange behaviour. Calls to BeginReceiveFrom should be matched with calls to EndReceiveFrom, rather than EndReceive. If you fix this error, does the behaviour change?
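
For reference, a correctly matched pair might look like this sketch (the callback shape and endpoint handling are assumptions):

```csharp
using System;
using System.Net;
using System.Net.Sockets;

class MatchedPairSketch
{
    // BeginReceiveFrom must be completed with EndReceiveFrom - not
    // EndReceive - which also hands back the remote endpoint.
    static void OnReceiveFrom(IAsyncResult ar)
    {
        Socket s = (Socket)ar.AsyncState;
        EndPoint remote = new IPEndPoint(IPAddress.Any, 0);
        int bytes = s.EndReceiveFrom(ar, ref remote);
        Console.WriteLine("Got {0} bytes from {1}", bytes, remote);
    }
}
```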

Can you shed any light on those points? I think we need to address all of these in order to make progress.