Traditional geometric scene analysis cannot attempt to address the understanding of human vision. Instead it adopts an algorithmic approach, concentrating on geometric model fitting. Human vision, however, is both quick and accurate but very little is known about how the recognition of objects is performed with such speed and efficiency. It is thought that there must be some process both for coding and storage which can account for these characteristics. In this thesis a more strict emulation of human vision, based on work derived from medical psychology and other fields, is proposed. Human beings must store perceptual information from which to make comparisons derive structures and classify objects. It is widely thought by cognitive psychologists that some form of symbolic representation is inherent in this storage. Here a mathematical syntax is defined to perform this kind of symbolic description. The symbolic structures must be capable of manipulation and a set of operators is defined for this purpose. The early visual cortex and geniculate body are both inherently parallel in operation and simple in structure. A broadly connectionist emulation of this kind of structure is described, using independent computing elements, which can perform segmentation, re-colouring and generation of the base elements of the description syntax. Primal colour information is then collected by a second network which forms the visual topology, colouring and position information of areas in the image as well as full description of the scene in terms of a more complex symbolic set. The idea of different visual contexts is introduced and a model is proposed for the accumulation of context rules. This model is then applied to a database of natural images.