Is there a rule of thumb for when it is best to contact N workers with
MPI_Bcast vs. when it is best to use a loop which cycles N times and
moves the same information with MPI_Send to one worker at a time?

The rule of thumb is to use a collective whenever you can. The rationale is that the code should be easier to write and cleaner to read, and the underlying MPI implementation has the opportunity to do something clever, such as exploiting the network topology.

For that matter, other than the coding semantics, is there any real
difference between the two approaches? That is, does MPI_Bcast really
broadcast, daisy chain, or use other similar methods to reduce bandwidth
use when distributing its message, or does it just go ahead and run
MPI_Send in a loop anyway, but hide the details from the programmer?

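I believe most MPI implementations, including Open MPI (OMPI), make an attempt to "do the right thing". Multiple broadcast algorithms are available, and the implementation picks one based on run-time conditions, typically things like message size and the number of processes involved.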
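
As a rough illustration of why a smarter algorithm can pay off, here is a minimal sketch in plain C (no MPI calls) that counts communication rounds for two strategies: the root sending to each worker in turn, and a binomial tree, which is one of the classic broadcast algorithms. The function names linear_rounds and binomial_rounds are made up for this example, and the model assumes every message costs one round regardless of size. It is not a description of what any particular MPI implementation actually chooses.

```c
#include <assert.h>

/* Hypothetical helper: rounds needed when the root sends to each of
 * the other ranks one at a time (the MPI_Send-in-a-loop pattern).
 * One message per round, so nprocs - 1 sequential steps. */
long linear_rounds(long nprocs)
{
    assert(nprocs >= 1);
    return nprocs - 1;
}

/* Hypothetical helper: rounds needed by a binomial-tree broadcast.
 * In each round, every rank that already holds the data forwards it
 * to one rank that does not, so the number of holders doubles. */
long binomial_rounds(long nprocs)
{
    assert(nprocs >= 1);
    long rounds = 0;
    long holders = 1;   /* only the root has the data at the start */
    while (holders < nprocs) {
        holders *= 2;
        rounds++;
    }
    return rounds;
}
```

Under this simplified model, linear_rounds(8) is 7 while binomial_rounds(8) is 3, and the gap widens as the process count grows. Real networks, message sizes, and hardware-assisted broadcast change the picture, which is why the choice is best left to the library.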

With any luck, you're better off with collective calls. Of course, there are no guarantees.