Abstract

Here we present a portable X-ray fluorescence (pXRF) dataset collected in situ (n = 1591) and a laboratory dataset (n = 226) from a soil sampling campaign in Marirongoe, Mozambique, to document the strength of rapid geochemical data collection in the field during mineral exploration. Real-time mapping of the geochemistry of the underlying granite using pXRF analysis of soil samples identified variation in granitic composition, thus allowing exploration to rapidly focus on the most prospective areas for Ta-Nb-U-REE mineralization. Principal components analysis and clustering protocols are applied to the centred log-ratio transform of a selection of eight elements (Ca, Fe, K, Rb, Sr, Ti, Mn, and Zr) to identify rocks of the same geochemical affinity a posteriori. Maps of these clusters reveal a spatial pattern that provides an interpretation of the underlying geology. One of the limitations of pXRF is that false elemental concentrations can be reported because of spectral overlaps between elements. We provide a possible solution to this problem through statistical data analysis using a probabilistic modelling approach. We propose a binary approach whereby the pXRF data for affected elements, such as Sn, are considered in terms of presence (detected; >150 ppm) or absence (not detected; <150 ppm), because comparison to laboratory data shows that Sn is reliably detected at concentrations >150 ppm. A kernel density estimator and Bayes conditional probability can provide an effective method for calculating the probability of a sample having elevated content of elements, such as Sn, which may be variably detected by pXRF (depending on matrix and concentration). Utilising statistical approaches to treat large geochemical datasets, such as those that can be generated by pXRF, as they are collected, can provide timely and significant insights that might otherwise not have been apparent in elemental concentration maps alone.

Portable X-ray fluorescence (pXRF) technology has become widely accepted as a routine tool for data collection in mine and exploration settings. The importance of applying routine quality assurance and quality control (QA/QC) protocols to pXRF workflows to produce reliable and robust datasets has been rigorously documented (e.g. Durance et al. 2014; Fisher et al. 2014; Gazley & Fisher 2014; Simandl et al. 2014). With the use of time-efficient data collection procedures and simple QA/QC protocols, coupled with robust workflows, dynamic exploration strategies can be developed rapidly and at low cost, whereby data available while teams are still in the field can inform the focus of exploration activities. Early-stage exploration projects are often very budget-sensitive and being able to efficiently collect data and draw geologically significant conclusions from them is important. In this paper, we present a case study from Marirongoe, Mozambique, that documents the value of rapidly collected field data to an exploration campaign, allowing real-time mapping of the underlying granite chemistry using soil samples. There has been significant historic artisanal work in the Marirongoe area, and previous exploration work in the area has identified Ta-Nb-U-REE xenotimite pebbles in alluvial sand deposits. Thus, the aim of this field campaign was to try to identify the source of the pebbles. This was aided by an airborne radiometric survey that identified areas of significant radioactivity, which were to be followed up by the soil sampling campaign we document here. An initial grid of soil samples analysed in situ by pXRF (n = 1591) was followed up by a second, broader-spaced, soil sample campaign in which samples were collected for laboratory analysis (n = 226).

Utilising the in situ pXRF dataset, we map variations in granitic compositions, allowing identification of the most fractionated areas, and thus those most prospective for Ta-Nb-U-REE mineralization, whilst exploration teams are still out in the field and without having to wait for laboratory results to be returned. The in situ pXRF data also highlighted anomalous concentrations in Sn and Hf, and it is the former that we examine in detail here. We also highlight the limitations of pXRF with respect to the appearance of false anomalies due to peak overlap. We provide a possible solution to this problem through statistical data analysis and the probabilistic modelling approach of Hill et al. (2014). For rare earth elements (REE) and elements such as W, Sn and Ag, pXRF is a suboptimal technique due to elemental interferences. Many REE have interferences with Ca and Fe, Ag with K at c. 3.3 keV, Sn with Ca at c. 3.7 keV, and Ag and Sn interfere with each other at c. 25 keV. Furthermore, the use of W and Ag inside pXRF instruments can result in poor detection limits for these elements and, in some cases, remove the ability to detect that element entirely. Thus, pXRF may only be reliable at determining that these elements are present above a given threshold, but not their absolute abundance. We propose a binary approach whereby the pXRF data for these elements, in this case Sn, can be considered in the context of presence (detected; >150 ppm) or absence (not detected; <150 ppm) and the actual concentration reported is not considered. By utilising other elements that are robust, reliable and detected in every sample of a dataset (e.g. Ti, Zr, Fe, and Mn), it is possible to predict the probability that any given sample also contains Sn at concentrations >150 ppm.

Geological setting

Regional

The crystalline basement in the northern part of Tete Province, NW Mozambique, is largely composed of various Mesoproterozoic granitoids, with Neoproterozoic to Ordovician granitoids occurring to a smaller areal extent. These granitic rocks are divided into several types on the basis of petrography, texture, geophysical signature, chemical composition, and age. Most granitoids belong to the 1200–1000 Ma age group and are related to the Grenvillian Orogeny. The largest massifs belong to the Furancungo Suite, Cassacatiza Suite, Serra Danvura granitoids, Castanho Granite and Rio Capoche Granite, which are all composed of different magmatic phases. Intrusions dated at 500–470 Ma belong to post-Pan-African magmatism. Some intrusions have undergone Pan-African (c. 530 Ma) metamorphism (Mäkitie et al. 2008).

The Marirongoe study area is located within the c. 1300 Ma Fíngoè Terrane. This Terrane is interpreted as a 1330 Ma active margin or island arc volcano-sedimentary belt overlain by a 1200 Ma magmatic arc (Westerhof et al. 2008). It consists of SW- to NE-trending volcano-sedimentary Fíngoè Supergroup with vast granitoid domains to the NW and SE. The southwestern granitoid is assigned to the 1077 ± 2 Ma Cassacatiza Suite, and contains Fíngoè windows and intrusions of younger granitoids including the 1050 Ma Monte Sanja Suite and undated Marirongoe Granitoids (Westerhof et al. 2008).

Local

The Marirongoe granite pluton is a roughly circular intrusion with a diameter of c. 9 km, which has been emplaced into Cassacatiza Suite rocks (Fig. 1). The Marirongoe granite is a weakly foliated, predominantly equigranular quartz-feldspar-biotite granite with minor magnetite in places (Siegfried 2008). Texturally, it is mostly even-grained, but some porphyritic varieties also exist, occasionally with rapakivi textures. Some of the even-grained types resemble granitised quartz-feldspar gneisses (Mäkitie et al. 2008).

A significant portion of the eastern pluton is characterized by a more mafic intrusive phase (Siegfried 2008). This pluton contains numerous pegmatite dykes up to 8 m thick. There is a particular concentration of dykes in a pegmatite swarm on the eastern and southeastern margins where historical gemstone mining has been focussed. High quality gemstones, including topaz and aquamarine beryl, have been mined here since the early twentieth century. Artisanal workings have exploited in situ pegmatites as well as the regolith overlying pegmatites. The mined pegmatites typically comprise a white quartz nucleus with surrounding white/pink K-feldspar and quartz, biotite, black tourmaline (schorl) and sometimes muscovite, and minor gemstones. Pegmatites in the eastern pegmatite field are highly evolved, structurally complex and intrude into the genetically related alkaline Marirongoe granite (Siegfried 2008). Mafic dolerite dykes are found both around and within the pluton and are locally the youngest igneous phase (Ualadze Suite) (Siegfried 2008). The eastern margin of the study area is dominated by an extensive mantle of Quaternary colluvium. A relatively shallow regolith is characteristic of the study area. In the north of the study area lie meta-arkose quartz-feldspar gneisses of the <1300 Ma Fíngoè Supergroup. To the SW are megacrystic deformed granites and granodiorites with small blocks of older orthoquartzite (Sale-Sale Formation), and coarse-grained mesocratic deformed granites of the Cassacatiza Suite (Siegfried 2008).

Portable XRF acquisition, correction and processing

All of the pXRF elemental concentrations presented here were obtained using a Niton XL3t GOLDD pXRF unit with an 8–50 kV Ag X-ray tube using methods that are consistent with the approaches outlined in Gazley & Fisher (2014) with some exceptions as outlined below. The analytical uncertainty on each analysis is a function of run-time; a total analysis time was typically 60 seconds (20 seconds on each filter). This analysis time was selected to optimise the number of samples that could be analysed in a reasonable time while maintaining analytical precision. Analytical errors are on the order of ≤3% as reported by the pXRF unit based on counting statistics. To check for contamination, SiO2 blanks were analysed periodically (at least once daily) and showed no addition of elements throughout the study period. The pXRF unit was used in the field by digging a c. 20 cm pit and using the ‘NITON soil extension pole’ to stabilise the unit in the pit on a nominal 40 × 80 m grid covering an area of c. 4 × 2 km for 1591 samples. This method is considered appropriate for this sample medium as the soil in the study area is typically fine-grained and well-homogenised; indeed, some workers have argued in situ analysis of samples is more representative than sub-sampling for subsequent analysis (e.g. Ramsey & Boon 2012). Samples need to be dry prior to pXRF analysis, otherwise concentrations will be underreported (e.g. Parsons et al. 2012). The soils analysed in this study were readily sieved in the field, confirming moisture contents were low.

Four reference materials with bulk compositions that are reasonably consistent with the weathered granite analysed here were routinely included at a frequency of c. 1 reference material in 20 samples: NIST2709a (soil), NIST2780 (mine waste), RCRA (spiked soil sample), and TILL-4 (till); details of these reference materials can be found in Table 1. Note the poor performance for As, Hf and W in NIST2780 compared to their expected values. Examination of the spectra for analyses of NIST2780 does not provide any insights into this, and accordingly the determinations of As, Hf and W in this reference material are excluded from the correction factor calculations outlined below. An examination of the performance of all of these reference materials shows that the pXRF unit was very stable throughout (Fig. 2). Accordingly, an approach has been adopted whereby the median value for these reference materials has been used to calculate a correction factor. For most elements, this took the form of y = mx + c; however, for some trace elements that have values that converge on 0, a y = mx correction may be more appropriate (Fisher et al. 2014). These reference materials were also used to calculate a lower limit of quantification (LOQ) following the approach of MacDougall & Crummett (1980), in which repeated analyses are utilised to calculate an LOQ. These values are presented in Table 2 along with summary statistics for each element in the dataset. Based on the QA/QC protocols outlined above, we consider that the data are fit-for-purpose (e.g. Bédard & Barnes 2010): namely, to characterise variation in granite geochemistry within the Marirongoe area to aid exploration for Li, Ta and Sn.
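
The correction described above can be sketched as a least-squares line fitted between the median pXRF reading of each reference material and its certified value. This is a minimal illustration only: the function names, the reference-material labels and all numbers below are hypothetical and are not taken from Table 1, and the fitting procedure actually used in this study may differ.

```python
from statistics import median

def fit_correction(measured, certified, through_origin=False):
    """Least-squares fit of certified = m * measured + c.

    With through_origin=True the intercept is forced to zero (y = mx),
    as may suit trace elements whose values converge on 0.
    """
    if through_origin:
        m = sum(x * y for x, y in zip(measured, certified)) / sum(x * x for x in measured)
        return m, 0.0
    n = len(measured)
    mean_x = sum(measured) / n
    mean_y = sum(certified) / n
    sxx = sum((x - mean_x) ** 2 for x in measured)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(measured, certified))
    m = sxy / sxx
    return m, mean_y - m * mean_x

# Hypothetical repeat pXRF readings (ppm) of four reference materials.
readings = {
    "RM_A": [95.0, 100.0, 105.0],
    "RM_B": [190.0, 200.0, 205.0],
    "RM_C": [390.0, 400.0, 410.0],
    "RM_D": [790.0, 800.0, 810.0],
}
# Hypothetical certified values (ppm) for the same materials.
certified = {"RM_A": 115.0, "RM_B": 225.0, "RM_C": 445.0, "RM_D": 885.0}

names = sorted(readings)
medians = [median(readings[k]) for k in names]
slope, intercept = fit_correction(medians, [certified[k] for k in names])

def correct(raw_ppm):
    """Apply the fitted correction to a raw pXRF value."""
    return slope * raw_ppm + intercept
```

Using the median of the repeat readings, rather than the mean, keeps a single aberrant analysis of a reference material from shifting the fitted line.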

Performance of standards NIST2709a, NIST2780, RCRA, and CCRMP over the period for which the data used in this study were collected. Error bars are presented for analytical error as reported by the pXRF unit, but are insignificant.

Lithium, Ta and Sn were identified as the elements of economic interest in the Marirongoe area; accordingly, the reliability of these elements by pXRF must be verified if they are to be utilised. Lithium is too light to be detected by pXRF, as it does not produce X-ray peaks at energies that are detectable. Accordingly, it is not discussed any further in this contribution. The performance of many elements by pXRF has been well documented in the literature (e.g. Durance et al. 2014; Fisher et al. 2014; Hall et al. 2014; Simandl et al. 2014). However, there is only limited peer-reviewed literature to verify the determination of Ta and Sn by pXRF (e.g. Knésl et al. 2015). Unfortunately, in this study, the original dataset was collected in situ on the soil (on a nominal 40 × 80 m grid) and at that time there were no samples collected for subsequent laboratory analysis. However, following the detection of anomalies in Ta and Sn in the pXRF dataset, a second dataset was collected over the same area for laboratory analysis. This second sample set (n = 226) was collected across a larger grid spacing of 100 × 300 m, and each sample was taken from the bottom of a c. 20 cm deep pit and sieved in the field with the <1 mm fraction retained. These samples were sent for laboratory analysis at Intertek Genalysis Laboratory, Johannesburg, South Africa. Three assay techniques were utilised following two sample digestion methods: aqua-regia digestion followed by flame atomic absorption spectrometry (Cu, Ni and Zn) and inductively coupled plasma mass spectrometry (ICP-MS; Au, Bi, Co, Mo, Pb, Pd, Pr and Pt); and sodium peroxide fusion (Ni crucibles) and hydrochloric acid to dissolve the melt followed by ICP-MS (Ce, Dy, Er, Eu, Gd, Ho, La, Li, Lu, Nb, Nd, Sm, Sn, Ta, Tb, Th, Tm, U, W, Y and Yb).
Since the original in situ pXRF dataset (40 × 80 m grid) and subsequent laboratory dataset (100 × 300 m) were not collected on the same samples, and did not have a coincident grid, both datasets were estimated to 100 × 100 m cells. This was done using an inverse distance squared weighted interpolation in 3DS Surpac Software, requiring a minimum of four points and a maximum of 20 points to estimate each block, and using a 250 m search radius. The interpolated values allow a direct comparison between the two datasets.
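
The cell estimation can be illustrated with a short inverse distance squared sketch that uses the parameters quoted above (250 m search radius, minimum of four and maximum of 20 points). This is not the Surpac implementation: the function name, the choice of the nearest points when more than 20 fall inside the radius, and the handling of samples that coincide with the cell centre are all assumptions made for illustration.

```python
import math

def idw_estimate(cell_xy, samples, power=2.0, radius=250.0, min_pts=4, max_pts=20):
    """Estimate a cell value from (x, y, value) samples by inverse distance weighting.

    Returns None when fewer than min_pts samples lie within the search radius.
    """
    cx, cy = cell_xy
    near = []
    for x, y, value in samples:
        d = math.hypot(x - cx, y - cy)
        if d <= radius:
            near.append((d, value))
    if len(near) < min_pts:
        return None
    # Assumption: keep the closest max_pts samples.
    near.sort(key=lambda pair: pair[0])
    near = near[:max_pts]
    # Assumption: samples exactly at the cell centre are averaged directly,
    # which avoids division by zero.
    coincident = [v for d, v in near if d == 0.0]
    if coincident:
        return sum(coincident) / len(coincident)
    weights = [1.0 / d ** power for d, _ in near]
    return sum(w * v for w, (_, v) in zip(weights, near)) / sum(weights)

# Hypothetical samples (easting, northing, ppm) around a cell centred at (0, 0).
demo_samples = [
    (100.0, 0.0, 10.0),
    (-100.0, 0.0, 20.0),
    (0.0, 100.0, 30.0),
    (0.0, -100.0, 40.0),
    (400.0, 400.0, 999.0),  # outside the 250 m search radius, so ignored
]
demo_value = idw_estimate((0.0, 0.0), demo_samples)
```

Because the four usable samples in this example are equidistant from the cell centre, the estimate reduces to their arithmetic mean; unequal distances would weight the nearer samples more heavily.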

Results

pXRF geochemistry

Calcium, Fe, K, Rb, Sr, Ti, and Zr were detected in all samples analysed, Mn was detected in 89.0% of the samples and As in 94.5% of the samples analysed (Table 1). All of these elements are known to perform reliably and robustly by pXRF (e.g. Durance et al. 2014; Fisher et al. 2014; Hall et al. 2014; Simandl et al. 2014), and these results were corrected using the set of reference materials that were analysed at the same time as the samples (e.g. Fisher et al. 2014; Gazley & Fisher 2014). Aluminium, Si and Mg were detected in <6.2% of samples due to poor sensitivity of the instrument to these atomically-light elements and the data are accordingly not presented here. However, Ta and Sn are elements of interest and were not well constrained in the available reference materials. Accordingly, a comparison to laboratory data is required to assess the quality of these elements. A scatter plot of Ta and Sn data based on the cell values of laboratory and pXRF data estimated by inverse distance squared weighted interpolation is presented in Figure 3. Since these elements were not sufficiently present in the reference materials to be corrected, they are left raw (i.e. as reported by the pXRF unit). Thus, one would not expect to have a slope that was close to 1; however, if the correlation was good, the R2 would tend towards 1. Tantalum does not correlate well with the laboratory data; a linear fit results in an R2 = 0.36; Sn performs slightly better with an R2 = 0.44. To demonstrate that the interpolation method used to generate a common 100 × 100 m grid, which allows comparison between the in situ soil dataset (40 × 80 m grid) and the subsequent dataset sent for laboratory analysis (100 × 300 m grid), does not itself degrade the correlation, we have also included raw Th data in Figure 3c. The laboratory v. raw pXRF plot for Th has an R2 = 0.89.
The high R2 value for Th suggests that the interpolation method did not contribute to the inferior R2 values for Sn and Ta, and this poor correlation is an analytical artefact, not a sampling/interpolation one.

Comparison of laboratory data against pXRF data from cell values of data estimated by inverse distance squared weighted interpolation. A 1:1 line is the dark dashed line, while the solid light line is the line of best fit (a) Ta; (b) Sn; and (c) Th. Note that Th has an R2 value of 0.89 which shows that the interpolation method was not responsible for the poor correlation in Ta and Sn data.

Initial examination of the pXRF data for the soil samples was conducted by plotting elemental concentrations in map view for all elements. Corrected pXRF data are presented for Ca, Fe, K, Rb, Sr, Ti, Mn, and Zr in Figure 4, while Figure 5 presents uncorrected pXRF data for Ta and Sn, and corrected pXRF data for As. These two figures reveal a cohesive high in Rb and As through the middle of the study area, consistent with a region of elevated K. The northern half of this geochemical high is enriched in Sn, whereas Ta increases to the south. Elevated As, Sn, and Ta are surrounded by lower-Rb granites that are variably enriched and depleted in other elements: Ca, Fe, Zr, Ti, Mn and Sr form cohesive highs and lows, while K exhibits an opposite map pattern to these elements.

Uncorrected pXRF data for Ta and Sn and corrected pXRF data for As based on gridding function in ioGAS™; search radius of ten cells and smoothing radius of five cells. Outline of the As (and Rb) anomaly is plotted on the Ta and Sn maps as a dashed line.

Multivariate ordinations

To better differentiate lithological variations in the geochemical data, an eight-element subset (Ca, Fe, K, Rb, Sr, Ti, Mn, and Zr) was selected. These elements are likely to represent lithological variation in the underlying granites and, significantly, were detected in all samples analysed (except for Mn, detected in only 89.0% of samples). Missing Mn values were imputed using the impKNNa function from the robCompositions package (Hron et al. 2010; Templ et al. 2011).

The data output by pXRF are compositional (i.e. each sample sums to 100%); this is known as the ‘closure issue’, as the data are not independent. Compositional data, because of this ‘closure issue’, do not inhabit a Euclidean sample space, and thus cannot be subjected to standard statistical methods (e.g. parametric techniques) or even plotted on an element-element plot. When data are closed, the value of each component (i.e. element) depends on the value of every other component (i.e. if the amount of one element increases, one or more of the others must decrease to accommodate it). A centred log-ratio transform (CLR) was applied to this dataset as it ‘opens’ the dataset by transforming it such that the components are no longer dependent; in a geometrical sense it transforms the dataset from a simplex space to a Euclidean space (e.g. Aitchison 1982, 1986; Aitchison et al. 2000). Aitchison's CLR was applied using the clr function in the R package Hotelling (Curran 2013; R Development Core Team 2016). Sparse robust principal components analysis (PCA) was performed on the log-ratio transformed data using the PCAgrid function in the R package pcaPP (Croux et al. 2007; Filzmoser et al. 2014; R Development Core Team 2016). We used a broken stick distribution (Jackson 1993) to screen for statistically-significant principal components (PCs). In a broken stick test, a randomized dataset of the same structure as the real data but with uncorrelated variables is generated and eigenvalues are calculated from it. Both the eigenvalues of the real data and of the randomized simulated data are plotted as a scree plot in rank order. Any components from the real dataset whose eigenvalues are higher than those predicted by the randomized dataset are considered statistically significant (Fig. 6). Principal component 1 and PC2 were found to be significant, and together account for 75% of the variance in the dataset.
The point-scatter of samples along PC1 and PC2, and associated elemental loadings (i.e. the input variables driving each PC axis) are shown in Figure 7. Principal component 1 is dominated by variation in Zr and Ti v. Mn, whereas PC2 is dominated by K and Rb v. Ca and Fe. Thus, PC1 and PC2 most likely track lithogeochemical variation in the granitic system.
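
The two numerical steps described above, the centred log-ratio transform and the broken stick screen, can be sketched compactly. The sketch below is not the R code used in this study: it uses the closed-form broken-stick expectation rather than a simulated randomized dataset, the function names are hypothetical, and the eigenvalues are invented purely to show the comparison.

```python
import math

def clr(composition):
    """Centred log-ratio: log of each part minus the mean of the logs."""
    logs = [math.log(v) for v in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]

def broken_stick(p):
    """Expected proportion of variance for each of p components
    when a unit stick is broken at random into p pieces."""
    return [sum(1.0 / i for i in range(k, p + 1)) / p for k in range(1, p + 1)]

def significant_components(eigenvalues):
    """Count leading components whose share of variance exceeds
    the broken-stick expectation for the same rank."""
    total = sum(eigenvalues)
    keep = 0
    for observed, expected in zip(eigenvalues, broken_stick(len(eigenvalues))):
        if observed / total > expected:
            keep += 1
        else:
            break
    return keep

# Hypothetical eight-element composition (any consistent unit; all parts > 0).
demo_sample = [1.2, 3.5, 4.1, 0.02, 0.015, 0.4, 0.05, 0.03]
demo_clr = clr(demo_sample)

# Hypothetical eigenvalues for an eight-variable PCA (not the values in Fig. 6).
demo_eigenvalues = [0.45, 0.30, 0.08, 0.06, 0.04, 0.03, 0.02, 0.02]
demo_n_significant = significant_components(demo_eigenvalues)
```

Two properties are worth noting: the CLR values of a sample always sum to zero, and rescaling the whole composition (for example, from per cent to ppm) leaves them unchanged, which is why the transform is insensitive to closure. The CLR requires strictly positive parts, so missing or zero values must be imputed beforehand.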

Principal component scores for each sample were plotted in map pattern as interpolated using the gridding function in ioGAS™ using a search radius of ten cells and smoothing radius of five cells (Fig. 8). Although Sn was not included in the PCA, it is apparent from a comparison of Figures 5 and 8 that Sn (along with Ta and As) is associated with high PC2 values, which, as discussed above, are dominated by high K and Rb concentrations and depletions in Ca and Fe. These areas can now be identified using other elements that can be robustly and routinely detected by pXRF. To classify suites of chemically-similar samples, a Bayesian mixture-modelling cluster analysis was run using the package Mclust (Fraley et al. 2012) in R. Mixture modelling is a probabilistic modelling approach to identifying subpopulations (subgroups) within a larger population (dataset), and uses the Bayesian Information Criterion to decide how many subpopulations best describe the dataset. Results are shown by colours on Figure 7 and as a colour map in Figure 9. Groupings make geological sense as they are coherent groups on the PCA biplot (Fig. 7) and correspond to eigenvectors for elements that are known to covary or have antithetic relationships in granitic systems.

The difficulties associated with compositional data analysis in geochemistry have recently been summarized in Buccianti & Grunsky (2014) and references therein, and the approaches that we have adopted here are consistent with their recommendations to ensure that the variations in our geochemical dataset are appropriately documented. Furthermore, as discussed in Grunsky et al. (2014) an advantage of using PCs over groups of elements is that they represent linear combinations of elements that are likely controlled by mineral stoichiometry.

A probabilistic approach

The probabilistic approach of Hill et al. (2014) was utilised to identify Sn anomalies. Tin is not reliably detected by pXRF using its low energy peaks because of overlaps between the Sn Lα1 = 3.444 keV and Lβ1 = 3.663 keV peaks and the Kα1 = 3.314 keV and Kβ1 = 3.590 keV peaks for K and the Kα1 = 3.692 keV peak for Ca. Since both K and Ca are typically major rock-forming elements, it is not possible to utilise these peaks. However, Sn also has some higher energy peaks (Kα1 = 25.271 keV, Kα2 = 25.044 keV, Kβ1 = 28.486 keV, Kβ2 = 29.109 keV, and Kβ3 = 28.444 keV; Fig. 10); unfortunately, the pXRF unit that was deployed to the field had an Ag tube, resulting in interference between Ag and Sn that was not recognized at the outset of the field season. Silver has some peaks that coincide with the Sn Kα peaks (namely Kβ1 = 24.942 keV, Kβ2 = 25.456 keV, and Kβ3 = 24.911 keV), which means that an absolute determination of Sn concentration by pXRF is problematic at low concentrations (i.e. the difference in peak height between the samples presented in Figure 10b, c is negligible); but once higher concentrations are reached, the Sn Kα peaks become sufficiently well formed to be reliably detected over the Ag Kβ peaks (Fig. 10d). On the basis of inspecting the shape and height of the peaks in the spectra (Fig. 10) and the plot of pXRF v. laboratory Sn concentrations (Fig. 3b), we consider that pXRF determinations of Sn at concentrations >150 ppm can be both reliable and robust.
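
The overlaps listed above can be tabulated by comparing each Sn line energy with the K, Ca and Ag line energies quoted in the text. In the sketch below the 0.2 keV tolerance is an illustrative assumption and not the measured resolution of the detector used in this study, and the function name is hypothetical; a different tolerance would flag a different set of pairs.

```python
# Line energies (keV) as quoted in the text above.
SN_LINES = [3.444, 3.663, 25.044, 25.271, 28.486, 29.109, 28.444]
INTERFERING_LINES = [
    ("K", 3.314), ("K", 3.590),
    ("Ca", 3.692),
    ("Ag", 24.942), ("Ag", 25.456), ("Ag", 24.911),
]

def find_overlaps(target_lines, other_lines, tolerance_kev=0.2):
    """Return (target_energy, element, energy) for every pair of lines
    closer together than tolerance_kev (an assumed, illustrative value)."""
    overlaps = []
    for target in target_lines:
        for element, energy in other_lines:
            if abs(target - energy) < tolerance_kev:
                overlaps.append((target, element, energy))
    return overlaps

sn_overlaps = find_overlaps(SN_LINES, INTERFERING_LINES)
for sn_energy, element, energy in sn_overlaps:
    print(f"Sn {sn_energy:.3f} keV overlaps {element} {energy:.3f} keV")
```

Under this assumed tolerance, the Sn lines near 3.4–3.7 keV and 25.0–25.3 keV are flagged, whereas the Sn lines above 28 keV have no neighbour among the K, Ca and Ag lines listed.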

(a) Energy spectra from the pXRF unit from 0–50 keV for a sample reported by the unit to have 39.8 ppm Sn; (b) as for (a) but 18–30 keV; (c) as for (b) but for a sample with <34.9 ppm Sn; and (d) as for (b) but for a sample with 551.6 ppm Sn. Relevant energy peaks for Sn and Ag are labelled.

To utilise the large pXRF soil dataset we have adopted the approach of Hill et al. (2014) who utilised Rb, Sb, and Cr concentrations to determine the probability that a sample contained Au above a given threshold in an Au deposit with high short-range grade variance. Hill et al. (2014) showed that a kernel density estimator and Bayes conditional probability can provide an effective method for calculating the probability of a sample having elevated Au content and that this measure will be more spatially continuous than Au assay values if the appropriate geochemical proxies are selected. We have applied the same code utilised by Hill et al. (2014) to predict the probability of Sn concentration in all samples in the pXRF dataset. We utilised Fe, Ti, Zr and Mn concentrations as these elements are reliably detected by pXRF and should be reasonably immobile during weathering. Tin concentrations that were >150 ppm (8% of the samples, n = 128) were used as the training dataset and the results are presented in Figure 11.
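
The idea of combining a kernel density estimator with Bayes conditional probability can be sketched as follows. This is not the code of Hill et al. (2014): the Gaussian product kernel, the fixed bandwidth, the use of all remaining samples as the comparison class, and the two-variable toy data are assumptions made for illustration, and in practice the predictor variables (here standing in for elements such as Fe, Ti, Zr and Mn) would need to be suitably transformed and scaled first.

```python
import math

def gaussian_kde(points, bandwidth):
    """Return a density function built from a Gaussian product kernel."""
    dim = len(points[0])
    norm = len(points) * (bandwidth * math.sqrt(2.0 * math.pi)) ** dim

    def density(x):
        total = 0.0
        for p in points:
            sq_dist = sum((a - b) ** 2 for a, b in zip(x, p))
            total += math.exp(-0.5 * sq_dist / bandwidth ** 2)
        return total / norm

    return density

def probability_high(x, high_points, low_points, bandwidth=0.5):
    """Bayes rule: P(high | x) from class-conditional densities and priors.

    high_points: predictor vectors of samples above the threshold (training set)
    low_points:  predictor vectors of the remaining samples (an assumption)
    """
    prior_high = len(high_points) / (len(high_points) + len(low_points))
    like_high = gaussian_kde(high_points, bandwidth)(x)
    like_low = gaussian_kde(low_points, bandwidth)(x)
    numerator = like_high * prior_high
    denominator = numerator + like_low * (1.0 - prior_high)
    # Fall back to the prior where neither class has any density.
    return numerator / denominator if denominator > 0.0 else prior_high

# Hypothetical, already-scaled two-variable predictors.
demo_high = [(2.0, 2.0), (2.2, 1.9), (1.8, 2.1)]
demo_low = [(0.0, 0.0), (0.1, -0.1), (-0.2, 0.1), (0.0, 0.2), (0.1, 0.1)]

p_near_high = probability_high((2.0, 2.0), demo_high, demo_low)
p_near_low = probability_high((0.0, 0.0), demo_high, demo_low)
```

A sample whose proxy chemistry resembles that of the training set receives a probability close to 1 even if the element of interest was not itself detected, which is the behaviour exploited in the probability map.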

Probability map for Sn based on Fe, Ti, Zr, and Mn concentrations interpolated using gridding function in ioGAS™; search radius of ten cells and smoothing radius of five cells.

The two key outcomes from applying a conditional probability approach are: (1) that two areas in the NW area of the soil grid appear as having a high probability for having anomalous Sn concentrations (Fig. 11), despite no Sn detected by pXRF (Fig. 5); and (2) the area of high Sn to the southeast of the main Sn anomaly (8 339 400 mN, 345 600 mE, Fig. 5) is not associated with a high probability for Sn but rather associated with a watercourse, and most probably represents alluvial transport of Sn-bearing minerals. The first outcome provides rapid targets for follow-up exploration, while the latter can be excluded from follow-up exploration programmes.

Adopting the approach of Hill et al. (2014), using conditional probability to estimate the likelihood that a given sample contained >150 ppm Sn, provides an approach to reliably determine the potential distribution of an element that is not well-determined at low concentrations. This technique provides an opportunity for dynamic exploration campaigns, where the results of a small number of samples with concentrations considered to be reliable and robust can be used to inform the distribution of a given element. A similar approach could be adopted whereby laboratory analyses are used to determine the relationship between elements that are reliably detected by pXRF (e.g. Fe, Zr, Ti and Mn) with elements that are unreliably detected, in this case Sn, or alternatively for elements that there is no way of determining by pXRF, e.g. Li.

Acknowledgements

Permission to publish these data comes from Great Western Exploration Ltd. We are grateful for the comments of Angus McFarlane, June Hill and Shawn Hood on an early draft of this paper, and the journal reviews of Shaun Barker and Gwendy Hall.