One of the most striking features of quantum mechanics is that the resources required to describe the states of a composite system grow exponentially with the size of the system. This growth is the origin of the two main bottlenecks in numerical studies of complex quantum systems: (i) diagonalization of large matrices and (ii) propagation of large systems of linear differential equations with a global symplectic structure. Operations of the first type are poorly scalable, while most propagation algorithms allow for a high degree of parallelism. Here we show how the workload of finding Floquet eigenstates of an ac-driven nonintegrable quantum system can be shared between a general-purpose central processing unit (CPU) and a graphics processing unit (GPU) when both work within one computing platform. Namely, the diagonalization steps are delegated to the CPU, while the time propagation is performed on the GPU. This strategy led to a speed-up in computational time of several orders of magnitude compared to the performance of the CPU alone.
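
To make the two stages concrete, the following is a minimal sketch of the generic textbook route to Floquet states. It is not the implementation described here, and it runs entirely on the CPU with NumPy. It assumes units with ħ = 1 and a hypothetical driving of the form H(t) = H0 + cos(ωt) V. The function names `propagate_one_period` and `floquet_states`, the fixed-step fourth-order Runge-Kutta integrator, and the default step count are illustrative choices. The columns of the identity matrix are propagated over one driving period (each column evolves independently, which is the stage that lends itself to parallel hardware), and the resulting one-period propagator is then diagonalized.

```python
import numpy as np


def propagate_one_period(h0, v, omega, steps=2000):
    """Integrate dU/dt = -i H(t) U over one period T = 2*pi/omega.

    Assumes H(t) = h0 + cos(omega * t) * v and hbar = 1 (illustrative choices).
    Every column of U is an independent initial-value problem, so this
    stage is the natural candidate for parallel execution.
    """
    dim = h0.shape[0]
    period = 2.0 * np.pi / omega
    dt = period / steps

    def rhs(t, u):
        # Right-hand side of the Schroedinger equation for the propagator.
        return -1j * (h0 + np.cos(omega * t) * v) @ u

    u = np.eye(dim, dtype=complex)
    for k in range(steps):
        t = k * dt
        # Classical fourth-order Runge-Kutta step (not exactly unitary;
        # the unitarity error shrinks as the step count grows).
        k1 = rhs(t, u)
        k2 = rhs(t + 0.5 * dt, u + 0.5 * dt * k1)
        k3 = rhs(t + 0.5 * dt, u + 0.5 * dt * k2)
        k4 = rhs(t + dt, u + dt * k3)
        u = u + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return u, period


def floquet_states(h0, v, omega, steps=2000):
    """Diagonalize the one-period propagator U(T).

    Its eigenvalues are exp(-i * eps * T); the quasienergies eps are
    returned in ascending order within [-omega/2, omega/2].
    """
    u, period = propagate_one_period(h0, v, omega, steps)
    eigvals, eigvecs = np.linalg.eig(u)
    quasienergies = -np.angle(eigvals) / period
    order = np.argsort(quasienergies)
    return quasienergies[order], eigvecs[:, order], u
```

As a sanity check, setting `v` to zero should reduce the quasienergies to the eigenvalues of `h0` folded into the interval of width ω. In a split of the kind described above, `propagate_one_period` would be the part moved to the GPU, while the call to `np.linalg.eig` would remain on the CPU.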