Typical VLSI implementations of discrete-time cellular neural networks (DTCNNs) use costly hardware to implement the basic DTCNN cell. The resulting grids are small, so many chips must be cascaded to process images of any practical size. This paper proposes a low-cost DTCNN cell that can be integrated in large numbers on a single chip. Memory bandwidth considerations show that 256 DTCNN cells can be incorporated into a single-chip DTCNN processor to process a 256×256 image at 30 frames per second. To satisfy the memory bandwidth requirements, techniques based on rectangular cell grids for use with video memory are proposed. The architecture of the proposed processor supports flexible grouping of the basic cells, and the processors can be cascaded in a highly scalable manner to process larger images at high speed.
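As a rough illustration of the workload behind the 256-cell figure, the Python sketch below computes the number of pixel updates per second for a 256×256 image at 30 frames per second and the share that falls to each of 256 cells. The function name `per_cell_workload` and the `iterations_per_frame` parameter are hypothetical. The number of DTCNN iterations per frame is not stated here, so one iteration is assumed. The sketch does not model the memory bandwidth analysis itself.

```python
def per_cell_workload(width, height, fps, num_cells, iterations_per_frame=1):
    """Return (total pixel updates per second, pixel updates per second per cell).

    iterations_per_frame is an assumption: the number of DTCNN iterations
    per frame is not given in the text, so it defaults to 1.
    """
    total = width * height * fps * iterations_per_frame
    per_cell = total / num_cells
    return total, per_cell


# Figures quoted in the text: 256x256 image, 30 frames per second, 256 cells.
total, per_cell = per_cell_workload(256, 256, 30, 256)
print(f"total updates/s: {total}, per cell: {per_cell}")
```

Under the one-iteration assumption, each cell handles one 256-pixel share of the image per frame. A larger iteration count scales both figures linearly.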