A system and a method are disclosed for generating video. Object information is received. A path of motion of the object relative to a reference point is generated. A series of images and ground truth for a reference frame are generated from the object information and the generated path. A system and a method are also disclosed for generating an image. Object information is received. Image data and ground truth may be generated using a position, an image description, camera characteristics, and image distortion parameters. A positional relationship between a document and a reference point is determined. An image of the document and ground truth are generated from the object information and the positional relationship, in response to a user-specified environment of the document.
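The video-generation flow described above (receive object information, generate a path of motion relative to a reference point, then produce per-frame ground truth) might be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the linear motion model, the downward-looking pinhole camera, and every name (`Pose`, `generate_path`, `project_corners`, `generate_ground_truth`) are hypothetical. Image rendering and distortion are omitted, so only the ground truth for each frame is produced.

```python
from dataclasses import dataclass


@dataclass
class Pose:
    """Camera position relative to the reference point (document origin), in mm."""
    x: float
    y: float
    z: float  # height above the document plane; must be positive


def generate_path(start, end, n_frames):
    """Assumed motion model: linear interpolation between two camera poses."""
    if n_frames < 2:
        raise ValueError("n_frames must be at least 2")
    path = []
    for i in range(n_frames):
        t = i / (n_frames - 1)
        path.append(Pose(
            start.x + t * (end.x - start.x),
            start.y + t * (end.y - start.y),
            start.z + t * (end.z - start.z),
        ))
    return path


def project_corners(corners_mm, pose, focal_px, center_px):
    """Project document corners (lying on the z = 0 plane) into pixel
    coordinates for a pinhole camera looking straight down (assumed model)."""
    cx, cy = center_px
    return [
        (focal_px * (X - pose.x) / pose.z + cx,
         focal_px * (Y - pose.y) / pose.z + cy)
        for X, Y in corners_mm
    ]


def generate_ground_truth(corners_mm, path, focal_px, center_px):
    """Return one ground-truth record per frame along the generated path."""
    return [
        {
            "frame": i,
            "pose": pose,
            "corners_px": project_corners(corners_mm, pose, focal_px, center_px),
        }
        for i, pose in enumerate(path)
    ]
```

Calling `generate_path` with two poses and a frame count, then passing the result to `generate_ground_truth` along with the document's corner coordinates, yields a list with one record per frame. A renderer that draws the document at each pose would pair each generated image with the matching record.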