Off-topic question: If the video generation gets its information from graphic images (photos, video, etc.), where does it get the context for those images? Where does it get the background to the image that “completes” the picture?