Greetings.
It depends on the camera model. My first point-and-shoot digital camera had that option. It was called "audio tag" and could record 30 seconds of audio after each shot. However, the voice recordings were stored separately from the photos on the memory card.

There's another way to do it. For example, the Pentax K-x has a photo tagging system that lets you attach a text file to a photo. But typing text isn't easy and will obviously take longer than recording an audio clip.

When I shoot and need to record something, I write it down and then take a photo of the note right after the shot. That way I won't lose it. I also take photos of my hand to mark a point in a shoot: for example, a thumbs up marks the beginning of a pano and a thumbs down marks the end of a sequence.