The copy gives each image a meaning it would have lacked without the written line beneath it. It's not see-and-say or redundancy either. It's a common-sense way of finishing the joke. How would I have known the guy in the boat was a voyeur and not a coward, or that the guy in the whale's mouth was seen from the POV of a widow and not a narcissist or paranoid schizophrenic? Leaving interpretation to each reader could have worked too, but personally, I prefer this. Kudos to the team.