This study tackles fundamental archaeological questions using large, complex digital datasets, building on recent discussions about how to deal with archaeology’s emerging ‘data deluge’ (Bevan 2015). At a broad level, it draws on the unprecedented volume of legacy data gathered from many different sources – almost one million records in total – for the English Landscape and Identities project (Oxford, UK). More specifically, the paper focuses on artefact evidence – material derived primarily from surface surveys, stray finds and metal detecting. Novel computational models are developed that extend and connect ideas from usually distinct research realms (different arenas of artefact research, digital archaeology, etc.). Major interpretative issues are addressed, including how to approach background factors that shape the archaeological record and how to understand spatial and temporal patterning at various scales. Overall, we suggest that interpreting large, complex datasets sparks different ways of working and raises new theoretical concerns.