On Friday we had a SAN failure. We logged a job with EMC, but they had no clue, as nothing obvious was wrong.

They did notice that both storage processors panicked at the same time, and the EMC development engineers are looking into it. Needless to say, pretty much everything turned to custard. A lot of VMs were unhappy and pretty much needed a cold reboot.

On one of our hosts I have lost one of the LUNs. There are two LUNs available on the SAN, but ESX seems to detect the same LUN twice, so obviously I have issues with my paths.

If your VMs are safely on other hosts, you may want to evict this host, reinstall ESX, and add the host back to the cluster. Kind of a cop out, but it may be the best use of your time. Of course, these suggestions are predicated on your VMs running on another host that is not having an issue.

Note: On the Clariion, check whether the failure happened at the same time as the weekly battery test. This is a scheduled activity that could affect both SPs.

Any chance of some more detail on the fault and environment? I work with EMC Clariion constantly and have not yet seen a bug that causes this, so I would be interested to hear. What Flare code are you currently running, and what bug detail did EMC highlight? Did they reference any Primus articles, and do you have the bug check or panic ID?

Lots of questions, I know, but it may help more people avoid your pain in the future.

There are a couple of Primus articles that reference the ASC/ASCQ combination with a Sense Key of 6. These are said to be due to data changes within the LUN and do not indicate data corruption. I would expect to see them during an internal LUN migration on the Clariion. Did the last migration complete?
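
For anyone reading these codes out of the logs by hand: Sense Key 6 is UNIT ATTENTION in the standard SCSI sense key list, which generally means something about the device changed rather than that a command failed outright. Below is a minimal Python sketch for turning the numeric key into its name. The function name and table are my own illustration, not anything from EMC or VMware tooling, and it deliberately does not try to interpret ASC/ASCQ values, since the exact combination was not quoted in this thread.

```python
# Standard SCSI sense key names (the 4-bit sense key field in sense data).
# Keys not in this table (e.g. reserved or obsolete values) are reported as unlisted.
SENSE_KEYS = {
    0x0: "NO SENSE",
    0x1: "RECOVERED ERROR",
    0x2: "NOT READY",
    0x3: "MEDIUM ERROR",
    0x4: "HARDWARE ERROR",
    0x5: "ILLEGAL REQUEST",
    0x6: "UNIT ATTENTION",
    0x7: "DATA PROTECT",
    0x8: "BLANK CHECK",
    0x9: "VENDOR SPECIFIC",
    0xA: "COPY ABORTED",
    0xB: "ABORTED COMMAND",
    0xD: "VOLUME OVERFLOW",
    0xE: "MISCOMPARE",
}


def describe_sense_key(key: int) -> str:
    """Return the name of a SCSI sense key (hypothetical helper for illustration)."""
    if not 0x0 <= key <= 0xF:
        raise ValueError(f"sense key must fit in 4 bits, got {key:#x}")
    return SENSE_KEYS.get(key, "NOT IN THIS TABLE")
```

Calling `describe_sense_key(0x6)` gives the UNIT ATTENTION name, which fits the Primus description of a change within the LUN rather than corruption.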