The badger (Meles meles) is an important wildlife host for bovine tuberculosis (bTB) and acts as a reservoir of infection for cattle. Reliable indicators of badger abundance at large spatial scales are important for informing epidemiological investigation. Thus, we aimed to estimate badger social group abundance from a large-scale dataset to provide useful information for the management of bTB in the Republic of Ireland (ROI). Robust estimates of species abundance require planned systematic surveying. This is often unfeasible at large spatial scales, resulting in inadequate (biased) data collection. We employed species distribution modelling (SDM) using 7724 badger main-sett (burrow) locations across the ROI at a 1 ha scale. This dataset was potentially biased, as surveying was directed towards areas with cattle bTB breakdowns. To manage sampling bias, we developed a model in which the environment was sampled using pseudoabsences geographically constrained to the potential survey area only (constrained model), in addition to a model in which all of the ROI was sampled (non-constrained model). The models' predictive performance was assessed using internal validation (splitting the national-scale dataset) and external validation on independent datasets; the latter included 278 main setts from a local-scale, unbiased, intensive survey (755 km²). Finally, the relationship between predicted probability and observed abundance at the local scale was used to infer the number of social groups at the national level. The geographically constrained model showed moderate discriminatory power, but good calibration in both the internal and external validations. The non-constrained model resulted in higher discrimination but poorer calibration in the internal validation, indicating a limitation for national-scale predictions. 
Interestingly, there was a strong cubic relationship between predicted probability classes and observed sett density in the local area (R² = 0.85 and 0.96 for the non-constrained and the constrained models, respectively). At the national scale, the preferred model predicted a total of 19,200 (95% Confidence Interval: 12,200–27,900) social groups. Our analyses demonstrated that, when interpreted critically, large-scale potentially biased datasets can be used to estimate variations in species abundance. The abundance predictions are in keeping with recent independent estimations of the badger population, and will be a valuable index of species abundance for epidemiology (e.g. risk mapping), species management (e.g. informing vaccine strategies) and conservation planning (e.g. assessing population viability).
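
As an illustration of the difference between the constrained and non-constrained pseudoabsence schemes described above, the following is a minimal sketch only, not the procedure used in this study. It assumes, purely for illustration, that the potential survey area can be approximated by circles of a fixed radius around a set of survey centres; the function name `sample_pseudoabsences`, the coordinates and the radius are all hypothetical.

```python
import math
import random


def sample_pseudoabsences(n, bounds, survey_centres=None, radius=None, seed=0):
    """Draw n random background points inside bounds = (xmin, ymin, xmax, ymax).

    If survey_centres and radius are given, only points within `radius` of at
    least one centre are kept (a hypothetical "constrained" scheme). Otherwise
    the whole extent is sampled (the "non-constrained" scheme).
    """
    rng = random.Random(seed)  # seeded for reproducibility
    xmin, ymin, xmax, ymax = bounds
    points = []
    # Simple rejection sampling: assumes the survey area overlaps the bounds,
    # otherwise this loop would never terminate.
    while len(points) < n:
        x = rng.uniform(xmin, xmax)
        y = rng.uniform(ymin, ymax)
        if survey_centres is None or any(
            math.hypot(x - cx, y - cy) <= radius for cx, cy in survey_centres
        ):
            points.append((x, y))
    return points


# Illustrative values only (arbitrary units, not taken from the study).
bounds = (0.0, 0.0, 100.0, 100.0)
centres = [(20.0, 20.0), (70.0, 60.0)]
constrained = sample_pseudoabsences(50, bounds, centres, radius=10.0)
non_constrained = sample_pseudoabsences(50, bounds)
```

In this sketch the constrained sample characterises only the environment that surveyors could plausibly have visited, whereas the non-constrained sample characterises the whole extent; in practice the survey area would be defined from the actual survey records rather than from circular buffers.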