Rapid ata hotplug on a libsas controller results in cases where libsas is waiting indefinitely on eh to perform an ata probe.

A race exists between scsi_schedule_eh() and scsi_restart_operations() in the case when scsi_restart_operations() issues i/o to other devices in the sas domain. When this happens the host state transitions from SHOST_RECOVERY (set by scsi_schedule_eh) back to SHOST_RUNNING and ->host_busy is non-zero so we put the eh thread to sleep even though ->host_eh_scheduled is active.

Before putting the error handler to sleep we need to check if the host_state needs to return to SHOST_RECOVERY for another trip through eh. Since i/o that is released by scsi_restart_operations has been blocked for at least one eh cycle, this implementation allows those i/o's to run before another eh cycle starts to discourage hung task timeouts.
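
As a rough illustration of the race and the fix, here is a userspace toy model, not kernel code. All toy_* names are hypothetical. The wakeup rule in toy_device_unbusy() is an assumption about how the completion path decides to wake eh, and the SHOST_CANCEL_RECOVERY fallback from the patch is not modeled.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the kernel's host states (hypothetical names). */
enum toy_state { TOY_RUNNING, TOY_RECOVERY };

struct toy_host {
	enum toy_state state;
	int host_busy;          /* commands currently in flight */
	int host_failed;        /* commands that failed and await eh */
	bool host_eh_scheduled; /* an eh pass was requested without a failed command */
};

/*
 * End of one eh pass: leave recovery and release blocked i/o.
 * With recheck set, re-enter recovery when a request is still pending,
 * mirroring the check this patch adds.
 */
static void toy_restart_operations(struct toy_host *h, int released, bool recheck)
{
	h->state = TOY_RUNNING;
	h->host_busy += released;
	if (recheck && h->host_eh_scheduled)
		h->state = TOY_RECOVERY;
}

/*
 * A command completes. Returns true when the eh thread would be woken.
 * Assumption: wakeups are only attempted while the host is in recovery,
 * and only fire once every busy command is a failed one.
 */
static bool toy_device_unbusy(struct toy_host *h)
{
	assert(h->host_busy > 0);
	h->host_busy--;
	return h->state == TOY_RECOVERY &&
	       (h->host_failed > 0 || h->host_eh_scheduled) &&
	       h->host_busy == h->host_failed;
}

/* Replays the hotplug scenario; returns true if eh is ever woken. */
static bool toy_eh_gets_woken(bool recheck)
{
	/* An eh request arrived while the previous pass was still running. */
	struct toy_host h = { TOY_RECOVERY, 0, 0, true };
	bool woken = false;

	toy_restart_operations(&h, 2, recheck);
	while (h.host_busy > 0)
		woken |= toy_device_unbusy(&h);
	return woken;
}
```

In this model toy_eh_gets_woken(false) leaves the host in the running state, so no completion ever wakes eh, while toy_eh_gets_woken(true) wakes eh once the last released command completes.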

--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -1687,6 +1687,20 @@ static void scsi_restart_operations(stru
 	 * requests are started.
 	 */
 	scsi_run_host_queues(shost);
+
+	/*
+	 * if eh is active and host_eh_scheduled is pending we need to re-run
+	 * recovery.  we do this check after scsi_run_host_queues() to allow
+	 * everything pent up since the last eh run a chance to make forward
+	 * progress before we sync again.  Either we'll immediately re-run
+	 * recovery or scsi_device_unbusy() will wake us again when these
+	 * pending commands complete.
+	 */
+	spin_lock_irqsave(shost->host_lock, flags);
+	if (shost->host_eh_scheduled)
+		if (scsi_host_set_state(shost, SHOST_RECOVERY))
+			WARN_ON(scsi_host_set_state(shost, SHOST_CANCEL_RECOVERY));
+	spin_unlock_irqrestore(shost->host_lock, flags);
 }