Alternatively, the management utility "gpstate -e" can be used to identify the segments that are marked down.
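For example, run the following on the master host as the Greenplum administrative user (a sketch; adjust to your environment):

```shell
# List segments with error conditions, including segments marked down
# and primary/mirror pairs that are out of sync.
gpstate -e

# Show detailed status for every segment, including its host and role.
gpstate -s
```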

Note: If all the down segments are on a specific host, there could be a filesystem or server issue. In that scenario, verify the server in question (here we assume all the down segments are on sdw2):
- ping sdw2
- ssh sdw2
If ssh succeeds, check the filesystem using the commands below. Here we assume the data partitions are /data1 and /data2:
- du -sh /data1 /data2
If any issue is noticed, engage the EMC DCA team.
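The checks above can be sketched as two small shell functions; "sdw2", /data1, and /data2 are the example names from this runbook and should be replaced with your own host and data directories:

```shell
#!/bin/sh
# check_host: verify network and ssh reachability of a segment host.
check_host() {
    host="$1"
    ping -c 3 "$host" || return 1    # network reachability
    ssh "$host" hostname || return 1 # ssh login works and the OS responds
}

# check_fs: report usage of each data partition; a hang or an I/O error
# here points at a filesystem problem.
check_fs() {
    for dir in "$@"; do
        du -sh "$dir" || return 1
        df -h "$dir" || return 1
    done
}

# Example usage (host and paths are assumptions from this runbook):
#   check_host sdw2 && check_fs /data1 /data2
```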

Identify the event and the time of the failure.

Query the "gp_configuration_history" table by "dbid", looking for entries for both the primary and the mirror of the affected content.
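One way to run that search from the master (a sketch; the dbid values 5 and 11 are placeholders for the primary and mirror of the affected content, which you can look up in gp_segment_configuration):

```shell
# Find when the segment pair changed state; replace the dbid values
# with those of the affected primary and mirror.
psql -c "
SELECT time, dbid, \"desc\"
FROM gp_configuration_history
WHERE dbid IN (5, 11)   -- placeholder dbids
ORDER BY time;
"
```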

If the roles have switched, i.e. a segment's preferred_role was primary ('p') but it is now acting as a mirror ('m'), a rebalance operation using "gprecoverseg -r" is required to return the segments to their preferred roles; alternatively, you can restart the database.

Note: The rebalance operation restarts the affected segments and cancels any running queries, so plan it during a maintenance window.
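A typical sequence looks like the following sketch (run on the master during a maintenance window):

```shell
# Recover the down segments first (incremental recovery by default).
gprecoverseg

# Once all segments are up and synchronized, return them to their
# preferred roles. This restarts the affected segments.
gprecoverseg -r

# Alternatively, a full database restart also brings segments back up
# in their preferred roles.
gpstop -r
```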

If all the down segments are on a specific host, there could be a filesystem, switch, or server problem. Engage the EMC DCA team to fix any hardware issue found.

In certain cases (e.g., missing files on the mirror), incremental recovery will fail. The reason will be found in the primary/mirror log files. If the problem is on the mirror, a full recovery (gprecoverseg -F) can be used. Full recovery transfers the entire contents of the primary segment's data directory to the mirror.
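The two recovery modes side by side (a sketch):

```shell
# Incremental recovery (default): resynchronizes only the changes
# tracked since the mirror went down.
gprecoverseg

# Full recovery: copies the entire primary data directory to the
# mirror. Use this when incremental recovery fails, e.g. because
# files are missing on the mirror.
gprecoverseg -F
```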

Common reasons for a segment to fail:

- Mirror cannot keep up with the workload: a log entry is recorded before the transition, warning that the timeout is close to expiring (WARNING: "threshold '75' percent of 'gp_segment_connect_timeout=' is reached mirror may not be able to keep up with primary, primary may transition to change tracking", "increase guc 'gp_segment_connect_timeout' by 'gpconfig' and 'gpstop -u'").

- Segment crash: a log entry is written with the type of the crash and, usually, the stack of the crashed process.

- Segment out of memory / Postmaster Reset: if a segment runs out of memory in a critical part of the code, a Postmaster Reset is performed to protect the instance and the data.

- Filesystem problems on the primary or mirror segment: the error message returned by the kernel is recorded in the log file.
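For the first case, the timeout can be raised as the log message suggests; a sketch (the value 600 seconds is an example, not a recommendation):

```shell
# Raise the mirror connect timeout cluster-wide (value in seconds).
gpconfig -c gp_segment_connect_timeout -v 600

# Reload the configuration without restarting the cluster.
gpstop -u

# Verify the setting on the master and segments.
gpconfig -s gp_segment_connect_timeout
```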