Yuri, the backtrace you posted is a separate issue. I am building your wip-yuri-testing_2017_3_4 branch to see whether "ceph_test_msgr" still fails, and I am able to reproduce the failure. https://github.com/ceph/ceph/pull/13700 is the offending PR; I just updated it.

osd.4 sends the empty transaction because backfill has not yet started (the pg has only reached backfill_wait), the pg has not split, and thus hoid > last_backfill_started and hoid > peer_info[peer].last_backfill, so should_send_op() returns false.
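To make the condition concrete, here is a rough sketch of the check as I read it. It is not the actual OSD code: ObjectId, PeerInfo and PrimaryView are hypothetical stand-ins, objects are ordered by a plain integer instead of an hobject_t, and 0 plays the role of MIN.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical stand-in for hobject_t: objects ordered by a single integer.
using ObjectId = std::uint64_t;

// Hypothetical, trimmed-down view of what the primary knows about a peer.
struct PeerInfo {
  ObjectId last_backfill;  // the peer already holds every object <= this
};

// Hypothetical, trimmed-down view of the primary's backfill state.
struct PrimaryView {
  ObjectId last_backfill_started;      // how far the primary has started backfilling
  std::map<int, PeerInfo> peer_info;   // keyed by OSD id

  // The real op is sent only if the object is within the range that has been
  // (or is being) backfilled; otherwise the peer gets an empty transaction.
  bool should_send_op(int peer, ObjectId hoid) const {
    auto it = peer_info.find(peer);
    assert(it != peer_info.end());  // the peer must be known
    return hoid <= last_backfill_started ||
           hoid <= it->second.last_backfill;
  }
};
```

With both last_backfill_started and the peer's last_backfill still at the minimum (backfill_wait), every hoid above the minimum falls outside both ranges, so the check returns false and only the empty transaction goes out.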

The question is why osd.1's missing set is not reset, and the answer lies in the Stray state. When receiving an MOSDPGLog, if the message's info.last_backfill is not MIN, we merge the log (and with it the missing set). This lets backfill resume from the place where it left off.
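The branch described above looks roughly like this. It is only a sketch under assumptions: LogMessage, StrayPg, handle_log and kMin are hypothetical names, the missing set is modelled as a plain set of integers, and the "reset" branch for last_backfill == MIN is inferred from the discussion rather than copied from the code.

```cpp
#include <cstdint>
#include <set>

// Hypothetical stand-in for hobject_t, with 0 playing the role of MIN.
using ObjectId = std::uint64_t;
constexpr ObjectId kMin = 0;

// Hypothetical, trimmed-down model of an MOSDPGLog message.
struct LogMessage {
  ObjectId last_backfill;      // info.last_backfill carried by the message
  std::set<ObjectId> missing;  // objects the merged log would mark as missing
};

// Hypothetical, trimmed-down model of a PG in the Stray state.
struct StrayPg {
  std::set<ObjectId> missing;

  void handle_log(const LogMessage& msg) {
    if (msg.last_backfill == kMin) {
      // Assumed behavior: backfill restarts from scratch, so the old
      // missing set is dropped.
      missing.clear();
    } else {
      // Backfill resumes where it left off: the log is merged, and the
      // missing set is carried along instead of being reset.
      missing.insert(msg.missing.begin(), msg.missing.end());
    }
  }
};
```

In this model, a message with a non-MIN last_backfill can only grow the missing set, which matches the stale entries seen on osd.1.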

Do we actually want to clear the missing set here, or just filter it for the correct child PG? I presume clearing it works fine, since otherwise the child PG would presumably be buggy, and it's surely better not to have them behave divergently. Never mind that question.