Parallel Programming in Cyber-Physical Systems

Abstract

The growing diffusion of heterogeneous Cyber-Physical Systems (CPSs) poses a security problem. The employment of cryptographic strategies and techniques is a fundamental part of the attempt to solve it. Cryptographic algorithms, however, need to increase their security level due to the growing computational power in the hands of potential attackers. To avoid a consequent worsening of performance and keep CPSs functioning and secure, these cryptographic techniques must be implemented so as to exploit the aggregate computational power that modern parallel architectures provide. In this chapter we investigate the possibility of parallelizing two very common basic operations in cryptography: modular exponentiation and Karatsuba multiplication. For the former, we propose two different techniques (m-ary and exponent slicing) that reduce calculation time by 30-40%. For the latter, we show various implementations of a three-thread parallelization scheme that provides up to 60% better performance with respect to a sequential implementation.

Appendix: Sequential and Basic Parallel Code for Karatsuba

In the following we show the basic sequential Karatsuba code (Kara_seq) and the parallel version, based on C++11 std::async asynchronous thread invocation, that we used in our experiments. The purpose is to highlight the changes needed to enable a multi-threaded computation.

In Fig. 11 we show the base code that implements a Karatsuba step on the multi-precision numbers and then delegates the multiplication of the k∕2-bit split numbers to the native GMP multiplication algorithm, here accessed through the overloaded * operator. The two operands are split, and the three multiplications are performed in sequence by the same main thread, which eventually calculates the final result from the three partial ones. The splitBigNum_limb procedure, not shown, splits a multi-precision number into its most and least significant parts at limb (i.e., processor word) granularity.

Basic sequential code implementing the Karatsuba step, used as a reference for the discussed parallelized versions. We use multi-precision numbers from the GMPXX (GMP for C++) library (mpz_class), and the three multiplications on k∕2-bit numbers are performed by the main thread using GMP standard multiplication. The code for splitting the numbers into their most and least significant parts is omitted

Then, Fig. 12 shows the version (Kara_thrAs) in which two additional threads are spawned through the C++ std::async construct to calculate the first two multiplications concurrently, in parallel with the third one, which is computed in the main thread. The main thread then retrieves the partial results calculated by the helper threads, blocking until they are ready (e.g., the retLL.get( ) call).

Parallel code implementing the Karatsuba step through the C++ std::async concurrency construct. Each of the three multiplications on k∕2-bit numbers is performed by a different thread: the first two by additional threads, each operating inside a std::async, and the third performed directly in the main thread. The main thread then waits for the completion of the other ones (e.g., retLL.get( )) before calculating the final result

Threads are spawned in the points indicated by the green arrows and are joined to the main execution in the points indicated by the red arrows.

The Kara_thr version is very similar to Kara_thrAs, with only two differences: std::thread is used in place of std::async, and the retrieval of partial results requires first joining the threads via the thread::join( ) method, i.e., possibly waiting for their completion.

The other, more advanced versions, kara_infThr and kara_infThrLF, are not shown, as a detailed analysis of their code would require too many advanced concepts of parallel programming for the scope of this chapter. However, their main features and operating principles are accurately summarized in the overall discussion of Sect. 2.3.