Created attachment 8467969
worker_spawn_lock.tar.gz
This bug is related to the work-in-progress experimental SharedArrayBuffer+Atomics+Futex work by Lars Hansen, Sean Stangl and me.
When JS code calls 'new Worker(url);', the worker does not actually start up until the calling JS code yields back to the browser and waits for an indeterminate period (~100 msecs), which probably depends on the size and complexity of the JS file at url. This is problematic because it prevents Emscripten-based code from micro-parallelizing computation. E.g. a game might spawn a new thread to run skinning, a particle system, or physics while doing other per-frame computation, and then join the thread, like this:
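#include <pthread.h>

void *thread_entrypoint(void *arg); // e.g. skinning, particles, or physics

void requestAnimationFrameHandler()
{
    pthread_t thr;
    // Spawn a helper thread for this frame's parallel work.
    pthread_create(&thr, NULL, thread_entrypoint, NULL);
    // Perform some other work..
    // Need results, join.
    pthread_join(thr, NULL);
}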
Even worse, it looks like spawning a web worker happens asynchronously and takes on the order of hundreds of msecs, so an application cannot spawn a thread in one requestAnimationFrame() update and expect to be able to pthread_join() it on the next frame, or on any subsequent frame, since there is no guarantee that the thread has started up yet.
Attached a test case. Running it requires building trunk with Lars's patch from bug #979594.
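The attached test case is not reproduced here. As a rough, hypothetical sketch of the pattern described above: the caller creates a worker and then spins on a shared flag without yielding. The names spawnAndSpin, WorkerCtor and maxSpins are made up for illustration, and the sketch assumes the standard SharedArrayBuffer/Atomics API rather than the experimental one from the patch. The spin is bounded so the sketch itself cannot hang.

```javascript
// Hypothetical sketch: spawn a worker, then busy-wait for it to set a shared flag.
// WorkerCtor is injected so the function works with the browser's Worker or a stub.
function spawnAndSpin(WorkerCtor, url, maxSpins) {
  // flag[0] === 0 means "worker has not run yet".
  const flag = new Int32Array(new SharedArrayBuffer(4));
  const worker = new WorkerCtor(url);
  // The worker script is expected to do:
  //   Atomics.store(new Int32Array(msg.buffer), 0, 1)
  worker.postMessage({ buffer: flag.buffer });

  // Spin without yielding to the event loop. Per the report, the worker cannot
  // start while the caller stays in this loop, so the flag is never seen as set.
  for (let i = 0; i < maxSpins; i++) {
    if (Atomics.load(flag, 0) === 1) {
      return { worker, started: true };
    }
  }
  return { worker, started: false };
}
```

Under the behavior described in this report, calling spawnAndSpin(Worker, url, n) from the main thread would be expected to return started: false no matter how large n is.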
As a workaround, the application can precreate a pool of workers at startup, so that when it needs a thread, a worker to host that thread is immediately available. However, this has several disadvantages:
- The application needs to know in advance how many threads it needs at startup, which might not always be feasible.
- Workers require a lot of memory, since they need to duplicate the whole JS page that is being executed. In Emscripten applications this can amount to ~50MB of JS code, or more in development builds. If the application e.g. maintains a worker pool of 8 workers to be able to host a peak of 8 threads, this would mean preallocating 400MB of memory.
- Creating the worker pool slows down page startup times.
- In some applications threads spawn other threads. The current experimental implementation in Emscripten has the worker pool specific to each thread. This means that each spawned thread needs its own worker pool for threads it might spawn in turn. This amounts to a tree of workers, which of course can't be infinite, so the application would need to know in advance which of its threads might spawn new threads, and somehow communicate that to the Emscripten compiler. This adds porting complexity. It is uncertain at this point whether it would be possible to share a global per-page worker pool between all created workers in a way that supports synchronous execution without having to yield to the browser, like in the repro.
Due to this issue, a current limitation of the Emscripten pthreads implementation is that only the main thread is able to spawn new threads.
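For illustration, the pre-created pool workaround can be sketched roughly as follows. This is a minimal sketch, not the Emscripten implementation; WorkerPool, acquire, release and the injected createWorker factory are hypothetical names.

```javascript
// Hypothetical sketch of a pre-warmed worker pool (not Emscripten's actual code).
class WorkerPool {
  // createWorker: function returning a new worker, e.g. () => new Worker(url).
  // size: number of workers to create up front; it is fixed at startup.
  constructor(createWorker, size) {
    this.idle = [];
    for (let i = 0; i < size; i++) {
      this.idle.push(createWorker()); // each worker loads its own copy of the JS
    }
  }

  // Hand out an already-created worker, or null if the pool is exhausted.
  // The null case is the first drawback listed above: the size must be guessed.
  acquire() {
    return this.idle.length > 0 ? this.idle.pop() : null;
  }

  // Return a worker to the pool once the thread it hosted has finished.
  release(worker) {
    this.idle.push(worker);
  }
}
```

The pool would have to be constructed at page startup, with at least one yield back to the browser before first use, so that the workers have actually started by the time acquire() is called.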

Thanks, Luke, for moving this to the proper spot.
Does the Web Worker specification acknowledge that this kind of asynchronous startup is allowed? I guess pooling up workers for "warm-up" might be enough, if we can make the workers not require duplicate copies of the same JS code. I'll experiment with the single global worker pool approach, using some kind of postMessage-based way of launching new threads, and see how that works out.

Yeah Alon, that's how I currently implemented it: a preRunDependency in Emscripten that warms up a pool. However, it can require guidance from the user and has the limitations I mention in the first comment. :/ I hope it's possible to make the pool global and shared by all threads rather than per-thread as it currently is, but it's still a bit uncertain whether that would run into other issues.

> We need the main thread to continue running to receive network data.
The main thread also needs to asynchronously allow extensions to prevent the load and whatnot.
> Does the Web Worker specification acknowledge that this kind of asynchronous startup
> requirement is allowed?
You can't actually observably tell it apart from the worker thread not getting a timeslice for a while, as far as I can tell.

Created attachment 8651737
worker_spawn_worker_lock.zip
Testing the case where the main thread spawns worker 1, which then spawns worker 2. The main thread yields back to the browser, but worker 1 immediately blocks synchronously until worker 2 runs and flips a bit to signal back. This scenario works without hanging.

Created attachment 8819972
worker_spawn_lock_via_blob.zip
Exploratory test: given that the naive "new Worker(url)" has the fundamental issue that one needs to wait for the XHR to finish, here's another test case that attempts to avoid that by first manually XHRing the url into a Blob, and then creating the worker via new Worker(URL.createObjectURL(xhr.response));
Unfortunately that does not work either, so there is more asynchronicity involved besides just the XHR.
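The attached archive is not reproduced here; the Blob-based variant amounts to roughly the following sketch. The function name spawnWorkerViaBlob and the onCreated callback are hypothetical. The code relies on the browser's XMLHttpRequest, URL.createObjectURL and Worker.

```javascript
// Hypothetical sketch of the Blob-based variant described above.
// Browser-only: relies on XMLHttpRequest, URL.createObjectURL and Worker.
function spawnWorkerViaBlob(url, onCreated) {
  const xhr = new XMLHttpRequest();
  xhr.open('GET', url);
  xhr.responseType = 'blob'; // xhr.response will be a Blob holding the script
  xhr.onload = () => {
    // The script bytes are already local here, so no network fetch remains...
    const worker = new Worker(URL.createObjectURL(xhr.response));
    // ...yet, per the result above, the worker still does not start running
    // until the caller yields back to the browser.
    onCreated(worker);
  };
  xhr.send();
}
```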