Due to the increasing availability of so-called "non-mydriatic" cameras, digital imaging has become an important part of the ophthalmologist’s work. This has created large databases of retinal images. It would be desirable to have a fast image processing tool that can analyse such databases in a short time and process images in situ while the patient is being examined. In this paper we contribute such a system for fast retinal image analysis. While it achieves quality comparable to that of state-of-the-art methods, it differs from most of them in being extremely fast. Retinal blood vessels are assumed to be line-like structures and can therefore be enhanced via convolution with suitable elongated kernels. For this task we use the second derivative of the local Radon kernel. It is rotated to different angles and adapts to the directions of the vessels via a maximisation procedure. We combine smoothing along vessel directions with contrast enhancement across them. Afterwards, our algorithm detects vessels as connected structures with very few interruptions. A subsequent skeletonisation allows a higher-level description of the vessel tree. To obtain a very fast system, we combine efficient algorithms for numerical integration, numerical differentiation and interpolation, and we propose an automatic parameter selection strategy. Our convolution kernels are precomputed and stored in cached constant memory. All essential components of our algorithms are intrinsically parallel, and the resulting system is implemented on GPUs using CUDA. Our qualitative evaluations with the publicly available DRIVE database and our own clinical database show that the system achieves competitive performance. We demonstrate that it is possible to process images of size 4288 × 2848 pixels in 1.2 seconds on an NVIDIA GeForce GTX 680, including the time for reading from and writing to disk.
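The idea of rotating an elongated kernel and keeping the maximal response per pixel can be illustrated with a minimal CPU sketch. It is not the implementation described here: as a stand-in for the second derivative of the local Radon kernel it assumes a Gaussian line profile (negated second derivative across the line, Gaussian smoothing along it), it assumes vessels appear bright (for example after inverting the green channel), and the names `oriented_kernel` and `line_response` as well as all parameter values are hypothetical.

```python
import math

def oriented_kernel(theta, half=4, sigma_across=1.5, sigma_along=3.0):
    """Elongated line kernel at angle theta (assumed Gaussian stand-in):
    smoothing along the line, negated second derivative across it."""
    c, s = math.cos(theta), math.sin(theta)
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            u = x * c + y * s    # coordinate along the line direction
            v = -x * s + y * c   # coordinate across the line direction
            along = math.exp(-u * u / (2.0 * sigma_along ** 2))
            across = (1.0 - v * v / sigma_across ** 2) * math.exp(
                -v * v / (2.0 * sigma_across ** 2))
            row.append(along * across)
        kernel.append(row)
    # Subtract the mean so that constant image regions give zero response.
    size = (2 * half + 1) ** 2
    mean = sum(sum(row) for row in kernel) / size
    return [[k - mean for k in row] for row in kernel]

def line_response(image, n_angles=8, half=4):
    """Return (response, best_angle): per-pixel maximum over all kernel
    orientations and the index of the maximising angle.
    Border pixels within `half` of the image edge are left at zero."""
    h, w = len(image), len(image[0])
    # Kernels are precomputed once, one per orientation in [0, pi).
    kernels = [oriented_kernel(math.pi * k / n_angles, half)
               for k in range(n_angles)]
    response = [[0.0] * w for _ in range(h)]
    best_angle = [[0] * w for _ in range(h)]
    for y in range(half, h - half):
        for x in range(half, w - half):
            best, arg = -math.inf, 0
            for k, ker in enumerate(kernels):
                acc = 0.0
                # The kernel is point-symmetric, so this correlation
                # equals a convolution.
                for dy in range(-half, half + 1):
                    img_row = image[y + dy]
                    ker_row = ker[dy + half]
                    for dx in range(-half, half + 1):
                        acc += ker_row[dx + half] * img_row[x + dx]
                if acc > best:
                    best, arg = acc, k
            response[y][x] = best
            best_angle[y][x] = arg
    return response, best_angle
```

Each output pixel depends only on its own neighbourhood and the precomputed kernels, which is the property that makes this kind of filter map naturally onto one GPU thread per pixel. Calling `line_response(image)` on a list-of-lists greyscale image returns the enhanced response together with the index `k` of the best angle `pi * k / n_angles` at each pixel.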