Actually, the trouble didn't start immediately: both arrays were stable for a few weeks, lulling us into a false sense of security. Filesystem usage was initially light since, as I mentioned before, these were the early days of the organization and there were relatively few users of the cluster and storage. Further, the early users tended to be more sophisticated about their resource usage (I'll talk more about this in the future) and were therefore easier on the array. This light early usage masked some fundamental design flaws, but you can always count on Lustre to reveal any shortcomi