[en] Lattice Boltzmann (LB) simulations are well suited to simulating fluid flows in the complex structures encountered in chemical engineering, such as porous media or the structured packings used in distillation and reactive distillation columns. These simulations require large amounts of memory (around 10 gigabytes) and would take a very long time (around 2 years) to execute on a single powerful desktop computer.
Executing LB simulations in a distributed way (for example, using cluster computing) can decrease the execution time and reduce the memory required on each computer. Dynamic Heterogeneous Clusters (DHCs) are a class of clusters made of computers interconnected by a local area network; these computers are potentially unreliable and do not share the same architecture, operating system, computational power, etc. However, DHCs are easy to set up and extend, and are made of affordable computers.
Designing and developing a software system that organizes large-scale DHCs in an efficient, scalable and robust way in order to run very large-scale LB simulations is challenging. To prevent some computers from being overloaded and slowing down the overall execution, the heterogeneity of computational power should be taken into account. In addition, the failure of one or several computers during the execution of a simulation should not prevent its completion.
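As an illustration of why heterogeneous computational power matters, the following minimal sketch splits a number of lattice blocks among computers in proportion to their relative speeds. It is not the method used by LaBoGrid: the function name `partition_blocks`, the notion of a "block" and the speed values are assumptions made for this example only.

```python
def partition_blocks(total_blocks, speeds):
    """Split total_blocks among computers in proportion to their speeds.

    Hypothetical helper: `speeds` holds one positive relative speed per
    computer (for example, measured LB site updates per second).
    """
    if total_blocks < 0 or not speeds or any(s <= 0 for s in speeds):
        raise ValueError("need a non-negative block count and positive speeds")

    total_speed = sum(speeds)
    # Ideal (fractional) share of each computer.
    ideal = [total_blocks * s / total_speed for s in speeds]
    # Start from the integer part of each ideal share.
    shares = [int(x) for x in ideal]
    leftover = total_blocks - sum(shares)
    # Hand the remaining blocks to the largest fractional remainders.
    order = sorted(range(len(speeds)),
                   key=lambda i: ideal[i] - shares[i],
                   reverse=True)
    for i in order[:leftover]:
        shares[i] += 1
    return shares


if __name__ == "__main__":
    # One computer is assumed to be twice as fast as the two others.
    print(partition_blocks(100, [1.0, 1.0, 2.0]))
```

With equal shares, the two slower computers in this example would each take about a third of the work and the fast one would sit idle while waiting for them; the proportional split gives the fast computer half of the blocks instead.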
In the context of this thesis, a simulation tool called LaBoGrid was designed. It uses existing static load balancing tools and implements an original dynamic load balancing method in order to distribute the simulation in a way that minimizes its execution time. In addition, a distributed and scalable fault-tolerance mechanism based on regularly saving the state of the simulation is proposed. Finally, LaBoGrid is based on a distributed master-slave model that is robust and potentially scalable.
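The idea of fault tolerance through regular state saving can be sketched as a simple checkpoint and restart loop. This single-process example only illustrates the principle and is not LaBoGrid's distributed mechanism; the names `save_checkpoint`, `load_checkpoint` and `run`, the `fail_at` parameter and the file layout are all assumptions made for this sketch.

```python
import os
import pickle


def save_checkpoint(path, step, state):
    """Write the state to a temporary file, then rename it atomically."""
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        pickle.dump({"step": step, "state": state}, f)
    os.replace(tmp, path)  # a crash never leaves a half-written checkpoint


def load_checkpoint(path):
    """Return (step, state) from the last checkpoint, or (0, None) if none."""
    if not os.path.exists(path):
        return 0, None
    with open(path, "rb") as f:
        data = pickle.load(f)
    return data["step"], data["state"]


def run(path, total_steps, interval, advance, initial_state, fail_at=None):
    """Advance the simulation, saving its state every `interval` steps.

    `advance` is a hypothetical function computing one time step;
    `fail_at` simulates a crash at a given step for demonstration.
    """
    step, state = load_checkpoint(path)
    if state is None:
        state = initial_state
    while step < total_steps:
        if fail_at is not None and step == fail_at:
            raise RuntimeError("simulated failure")
        state = advance(state)
        step += 1
        if step % interval == 0:
            save_checkpoint(path, step, state)
    return state
```

After a failure, calling `run` again with the same path resumes from the last saved step, so at most `interval` steps are recomputed. A shorter interval loses less work on failure but spends more time saving state.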