People seem to compute the ensemble statistics of objects and use this information to support the recall of individual objects in visual short-term memory. However, the appropriate grouping of objects into ensembles is not always obvious, and people may need to infer different hierarchical organizations of the objects. These different organizations should determine how ensemble information influences object recall. We tested whether objects' hierarchical structure influences visual short-term memory recall and assessed the encoding scheme people use to represent objects in a hierarchical structure. To address these questions, we asked subjects to recall the locations of objects arranged in different spatial clustering structures. Objects in the same cluster were recalled with similar displacement errors, suggesting that the hierarchical structure induced correlated errors. Furthermore, objects arranged into fewer clusters containing more objects were recalled more accurately. We considered three accounts of this improvement: (a) fewer misassociations of objects to locations, (b) more effective random guessing around cluster centers, and (c) more accurate encoding of object locations. Our analyses suggest that performance improved as objects were more densely clustered because guessing around the cluster centers decreased and object locations were recalled more accurately. One explanation for this pattern is that subjects represented the relative positions of objects using a logarithmic encoding. Such a scheme would allow more densely clustered objects to be recalled with greater fidelity. Consequently, we designed a model that represents the locations of objects relative to their clusters and recalls the relative positions with Weber noise on distance. We fit the model to subjects' responses and found that, for each clustering structure, the model accurately predicted objects' bias towards clusters and the noise in recalled object locations.
Together, these results suggest that denser clustering allows more parsimonious encoding of the object hierarchy, preserving resources for encoding individual objects.
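
To make the encoding idea concrete, the following is a minimal illustrative sketch, not the model fit in this work. It assumes that each object is stored as an offset from its cluster center, that the length of the offset is corrupted by Gaussian noise on log-distance (one way to obtain Weber-like noise), that the direction of the offset receives independent Gaussian noise, and that cluster centers are recalled exactly. The function names and the parameters `weber_fraction`, `angle_sd`, and `spread` are hypothetical and chosen only for illustration.

```python
import numpy as np


def recall_with_weber_noise(objects, centers, labels,
                            weber_fraction=0.2, angle_sd=0.1, rng=None):
    """Recall 2-D object locations as noisy offsets from their cluster centers.

    Assumption: noise is Gaussian on log-distance, so the absolute error in
    distance grows in proportion to the distance from the cluster center.
    """
    rng = np.random.default_rng() if rng is None else rng
    offsets = objects - centers[labels]              # relative positions
    dist = np.linalg.norm(offsets, axis=1)
    angle = np.arctan2(offsets[:, 1], offsets[:, 0])

    # Multiplicative noise on distance, additive noise on direction.
    noisy_dist = dist * np.exp(rng.normal(0.0, weber_fraction, size=dist.shape))
    noisy_angle = angle + rng.normal(0.0, angle_sd, size=angle.shape)

    recalled_offsets = np.column_stack([noisy_dist * np.cos(noisy_angle),
                                        noisy_dist * np.sin(noisy_angle)])
    return centers[labels] + recalled_offsets


def simulate_mean_error(spread, n_objects=2000, seed=0):
    """Mean Euclidean recall error for objects scattered around two centers.

    `spread` is the standard deviation of object positions around their
    cluster center; smaller values correspond to denser clusters.
    """
    rng = np.random.default_rng(seed)
    centers = np.array([[-10.0, 0.0], [10.0, 0.0]])  # hypothetical layout
    labels = rng.integers(0, len(centers), size=n_objects)
    objects = centers[labels] + spread * rng.normal(size=(n_objects, 2))
    recalled = recall_with_weber_noise(objects, centers, labels, rng=rng)
    return float(np.mean(np.linalg.norm(recalled - objects, axis=1)))


if __name__ == "__main__":
    for spread in (0.5, 2.0):
        print(f"spread={spread}: mean error={simulate_mean_error(spread):.3f}")
```

Under these assumptions, the recall error of each object scales with its distance from the cluster center, so densely clustered objects are recalled with smaller absolute error, in line with the pattern described above. The sketch leaves out misassociations, guessing, and any bias towards clusters.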