Most computers nowadays support 32-bit or 64-bit data types across various programming languages, which are sufficient for most use cases. In cryptography, however, the required range and precision exceed 64 bits, making such arithmetic computationally expensive on CPUs. In this report, we present the design and implementation of a multiple-precision integer library for GPUs, including basic arithmetic, Montgomery multiplication, and modular exponentiation with parallel techniques, implemented using CUDA, a parallel computing platform and application programming interface created by NVIDIA. Experimental results show that a significant speedup can be achieved compared with the implementation of N. Emmart and C. Weems, "Pushing the Performance Envelope of Modular Exponentiation Across Multiple Generations of GPUs".
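To make the core technique concrete, the following is a minimal Python sketch of the Montgomery multiplication algorithm mentioned above, using arbitrary-precision integers. This is an illustration of the algorithm only, not the report's CUDA implementation; the function name and parameters are chosen for this sketch.

```python
def montgomery_multiply(a, b, n, r_bits):
    """Compute a * b * R^-1 mod n, where R = 2**r_bits and n is odd (gcd(n, R) = 1).

    Inputs a and b are assumed to already be in Montgomery form (x * R mod n).
    """
    R = 1 << r_bits
    # n' satisfies n * n' ≡ -1 (mod R); precomputed once per modulus in practice.
    n_prime = (-pow(n, -1, R)) % R
    t = a * b
    m = (t * n_prime) % R          # reduction factor, taken mod R
    u = (t + m * n) >> r_bits      # (t + m*n) is divisible by R by construction
    return u - n if u >= n else u  # final conditional subtraction
```

For example, with n = 97 and R = 2^8, multiplying x = 5 and y = 7 means first converting to Montgomery form (5*256 mod 97 = 19, 7*256 mod 97 = 46); the result 36 equals 35*256 mod 97, i.e. the product in Montgomery form. The appeal on GPUs is that the reduction replaces trial division with multiplications and shifts, which map well to word-parallel limb arithmetic.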