I'm looking for some insight into how other people would handle this or any pointers about Java's threading that I probably missed.

I have entities in the scene who update part of their state using the results of an external script. There's no upper limit (yet) on how many of these entities there can be, so in theory a scene could have five or five hundred. I say external because there is no interface for Java (or C or anything else) to the script environment. It's another process that takes in a script file and an inputs file from the terminal and creates an outputs file with the results of its processing. I can handle running the scripts and translating data back and forth just fine. However, the time it takes for each execution instance is significant enough (around 10 ms on my dev machine) that I can't iterate through every entity in the update loop without causing significant delays for a scene with more than a couple of them running. I'm working on optimizing everything I'm doing to communicate between the game and the script process, but (for this machine) there is a hard minimum of around 3 ms per execution. Hence I'm looking into how best to divvy the work.

My current ideas are something like this:

1. Handle as many objects in the update loop as is possible within X ms.

2. Use a cached thread pool ExecutorService.

3. Use a fixed thread pool ExecutorService.

I'm not nuts about the first one because the scripts should run as close as possible to their given update rate (each script has a configurable clock rate) or things could, in theory, go awry. To explain a little, the script process is actually a Verilog synthesizer and simulator. Each "script" is a Verilog file describing some hardware component that is simulated by the process. Each component might have a clock and it helps if they run as close as possible to their expected clock rates or timing can get fussy. Obviously the scheduler doesn't allow things to happen in real time nor anywhere near the level of accuracy one would hope for from real hardware, but it's a plus if they're delayed as little as possible. Pushing script updates to the next game update is something I'm hoping to avoid. Though I will point out that I haven't tested this yet to see how much of a problem it is. Thorough testing of this will be a bit difficult, though, since this game is meant to be used by others and I have no way of knowing what sort of hardware they might be trying to simulate.

I'm going to try implementing the third idea first because it feels the most "right" to me, though I have no real justification for it at the moment. My only concern with the second one is the possibility of the game having been idle for a bit, the threads being released, and suddenly a wave of 150 update requests pops up. The cached pool says it will create threads as needed. Does this mean it would try to create 150 threads at once?
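To make the third idea concrete, here's a minimal sketch of a fixed-pool worker along the lines described. Everything here is hypothetical scaffolding: `runScript` stands in for launching the external simulator process and translating the output file, and the pool is sized to the core count (the `#cores` sizing mentioned later in the thread).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;
import java.util.stream.Collectors;

public class ScriptUpdatePool {
    // Fixed pool sized to the core count; a bounded pool means 150 pending
    // updates queue up rather than spawning 150 threads at once.
    private final ExecutorService pool =
            Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());

    // Submits one simulator execution per entity and blocks until all finish.
    public List<String> updateAll(List<String> scriptPaths)
            throws InterruptedException, ExecutionException {
        List<Callable<String>> tasks = scriptPaths.stream()
                .map(p -> (Callable<String>) () -> runScript(p))
                .collect(Collectors.toList());
        List<String> results = new ArrayList<>();
        for (Future<String> f : pool.invokeAll(tasks)) {
            results.add(f.get());
        }
        return results;
    }

    private String runScript(String path) {
        // Placeholder for the real process launch + input/output file handling.
        return "output-for-" + path;
    }

    public void shutdown() {
        pool.shutdown();
    }
}
```

For the cached-pool concern: `Executors.newCachedThreadPool()` does create a new thread whenever no idle thread is available, so a burst of 150 submissions after an idle period would indeed spin up 150 threads.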

Does anyone have some experience with a situation like this, and if so, what worked for you?

Does the process responsible for executing those Verilog scripts stay resident, or must it be launched anew on each iteration?

Do you have access to the Verilog simulator's source code? Is it possible to change the simulator to take its input from a socket, shared memory, or some other IPC mechanism? Or is the simulator capable of taking input via stdin and writing its results to stdout?

If the simulator could stay resident, accepting input and producing output over an IPC mechanism (or stdin/stdout), you could create a pool of those simulators and use it from within your application. If the simulator is CPU bound, you should resist the urge to create more threads while the existing ones are busy. The optimal number of threads may vary but, in theory, for a single-threaded, primarily CPU-bound process, it should be the number of CPU cores (not sure about this last one; people with more experience in multicore programming could give you better insight).
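A check-out/check-in pool of resident simulators, as suggested above, could be sketched like this. The `Simulator` interface is an assumption; a real implementation might wrap a `Process` whose stdin/stdout were redirected with `ProcessBuilder`.

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical interface over one resident simulator instance; a real one
// would write the input to the process's stdin and read stdout for results.
interface Simulator {
    String run(String input) throws Exception;
}

class SimulatorPool {
    private final BlockingQueue<Simulator> idle;

    SimulatorPool(List<Simulator> simulators) {
        this.idle = new ArrayBlockingQueue<>(simulators.size(), false, simulators);
    }

    // Blocks when every simulator is busy, so concurrency never exceeds the
    // pool size -- keeping a CPU-bound process count at roughly #cores.
    String simulate(String input) throws Exception {
        Simulator sim = idle.take();
        try {
            return sim.run(input);
        } finally {
            idle.put(sim); // return the simulator for reuse
        }
    }
}
```

The blocking checkout is the design point: callers naturally throttle themselves instead of oversubscribing the CPU.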

Unfortunately, the process wasn't at all built with this sort of use in mind, so it isn't resident and it only takes in file paths for inputs and creates a new file for each output. I do have access to the source; good point that I can just rework it, though I think the way it's set up may take a little doing to make it more... general use. Not sure why I didn't just consider that first, in retrospect.

I had been working with the idea of #cores (or #cores - 1) workers, hence my uncertainty about using Java's cached thread pool.

Anyway, thanks for the idea. I'll flip through their code and see if it's not too much of a headache.

Does the update loop have to finish its processing within a fixed amount of time? If it doesn't, perhaps you could go without the thread pool.

You could create a façade which takes a batch of data that must be processed by the simulator. This façade divides the original batch into sub-batches, each one no larger than the number of available simulators (which can be pooled). After dispatching each sub-batch, the façade waits for every simulation in it to finish. Once all sub-batches are processed, the façade returns the results to the update loop.
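A rough sketch of that façade, under assumed names: `Function<String, String>` stands in for one simulator execution, and the batch is walked in chunks of the simulator count, waiting on each chunk via `invokeAll` before dispatching the next.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;
import java.util.function.Function;

// Splits an incoming batch into sub-batches no larger than the number of
// available simulators, runs each sub-batch in parallel, and waits for it
// to finish before dispatching the next.
class SimulationFacade {
    private final int simulators;
    private final ExecutorService pool;
    private final Function<String, String> simulate; // one simulator run

    SimulationFacade(int simulators, Function<String, String> simulate) {
        this.simulators = simulators;
        this.pool = Executors.newFixedThreadPool(simulators);
        this.simulate = simulate;
    }

    List<String> process(List<String> batch)
            throws InterruptedException, ExecutionException {
        List<String> results = new ArrayList<>();
        for (int start = 0; start < batch.size(); start += simulators) {
            List<String> sub =
                    batch.subList(start, Math.min(start + simulators, batch.size()));
            List<Callable<String>> tasks = new ArrayList<>();
            for (String item : sub) {
                tasks.add(() -> simulate.apply(item));
            }
            // invokeAll blocks until the whole sub-batch has completed.
            for (Future<String> f : pool.invokeAll(tasks)) {
                results.add(f.get());
            }
        }
        return results;
    }

    void shutdown() {
        pool.shutdown();
    }
}
```

Since `invokeAll` returns futures in task order, the results come back in the same order as the input batch.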