This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).

Air pollution exposure has been shown to be associated with an increased risk of specific cancers. This study investigated whether the number and incidence of the most common cancers in Saudi Arabia were associated with urban air pollution exposure, specifically NO2. Overall, high model goodness of fit (GOF) was observed in the Eastern, Riyadh and Makkah regions. The significant coefficients of determination (r2) were higher at the regional level (r2 = 0.32–0.71), weaker at the governorate level (r2 = 0.03–0.43), and declined slightly at the city level (r2 = 0.17–0.33), suggesting that an increased aggregated spatial level increased the explained variability and the model GOF. However, the low GOF at the lowest spatial level suggests that additional variation remains unexplained. At different spatial levels, associations between NO2 concentration and the most common cancers were marginally improved in geographically weighted regression (GWR) analysis, which explained both global and local heterogeneity and variations in cancer incidence. High coefficients of determination were observed between NO2 concentration and lung and breast cancer incidences, followed by prostate, bladder, cervical and ovarian cancers, confirming results from other studies. These results could be improved using individual explanatory variables such as environmental, demographic, behavioral, socio-economic, and genetic risk factors.

A thorough understanding of the effects of air pollutants on public health is essential for developing effective policies to reduce the negative impact of ambient air pollution [1]. Mounting evidence indicates that exposure to air pollution might be associated with an increased risk of adverse health effects. An association has been reported between exposure to pollutants, such as particulate matter (PM), nitrogen dioxide (NO2) and ozone (O3), and increases in hospital admissions for cardiovascular and respiratory disease and mortality in Europe and the United States [2].

Several studies have found a relationship between the risk of developing cancer and exposure to air pollution [3,4,5], and many have concluded that long-term exposure to PM air pollution is positively associated with increased lung cancer mortality [6,7,8]. Nyberg et al. [4] used nitrogen oxide (NOx)/NO2 and SO2 as air pollution indicators from road traffic and heating and found that urban air pollution increased lung cancer risk. Based on the well-documented urban/rural difference in lung cancer incidence in Oslo, Nafstad et al. [5] found that the adjusted risk ratio for developing lung cancer was associated with NOx exposure between 1974 and 1978. Vineis et al. [9] assessed the relationship between air pollution (NO2, PM10, and SO2) and lung cancer in Europe. They found an association between lung cancer and NO2, while no obvious association was observed for other pollutants. In another recent study, an estimated 5%–7% of lung cancers in European non-smokers and ex-smokers could be attributed to exposure to high levels of air pollution, including NO2, or vicinity to heavy-traffic roads [10]. Evidence for an association between long-term exposure to air pollution and lung cancer is not limited to populations in Western countries. A study conducted by Katanoda et al. [11] demonstrated that long-term exposure to air pollution (PM2.5, SO2 and NO2) was related to the development of lung cancer and respiratory diseases in Japan. Raaschou-Nielsen et al. [12] found a relationship between NOx concentration and lung cancer risk and living within 50 m of a major road.

Although most studies have focused on the association between air pollution and lung cancer, there is evidence that air pollution is associated with an increased risk for other cancers. Castano-Vinyals et al. [13] reported small-to-moderate positive relationships between bladder cancer and a number of air pollution indicators. A trend analysis conducted in Taiwan demonstrated a significant relationship between increases in air pollution and risk of death from bladder cancer [14]. Crouse et al. [15] examined whether postmenopausal breast cancer was related to urban air pollution using NO2 as an indicator of air pollution. They found an approximately 25% increased risk of postmenopausal breast cancer for every 5 ppb increase in exposure to the ambient NO2 concentration. Raaschou-Nielsen et al. [16] investigated the association between traffic-related air pollution and risk for cancers other than lung cancer; they modeled the NOx concentration and traffic at the residence level as air pollution indicators from traffic. NOx at the residence level was considerably related to brain and cervical cancer risk. Rosenlund et al. [17] temporally analyzed all of the cancer cases that occurred in Stockholm County between 1985 and 1996 and suggested that long-term exposure to traffic-generated air pollutants such as NO2 increases the risk of cancer. Based on a follow-up evaluation that was conducted in 1999 and 2000 using annual average air pollution exposure data from 1991 to 2000, Kan and Gu [18] found significant associations between air pollutants (TSP, SO2 and NOx) and mortality from lung cancer in China. Using time-varying Cox proportional hazards models, Yorifuji et al. [19] provided support for the prevailing evidence that long-term exposure to traffic-related NO2 air pollution increases the risk of cardiopulmonary mortality as well as lung cancer mortality. In a Canadian study, Hystad et al. [20] developed spatiotemporal models to investigate lung cancer incidence in relation to long-term exposure to ambient air pollutants and found that lung cancer incidence increased most with NO2 and PM2.5 exposure.

However, other studies have reported moderate, weak or no evidence of an association between the risk of adverse health effects and air pollution. Beelen et al. [21], for example, investigated the association between lung cancer incidence and air pollution using exposure to black smoke, NO2, SO2 and PM as well as traffic intensity variables as air pollution indicators. The relative risks were slightly below unity for the overall air pollution concentrations, while they were slightly elevated for the traffic variables.

Exposure to air pollution such as NO2 might be considered to be one environmental risk factor for cancer; however, cancer incidence rates are influenced by a combination of genetic, demographic, socio-economic and environmental risk factors [22,23,24,25,26,27,28,29,30,31,32]. Regrettably, there seems to be a lack of data on these covariates in Saudi Arabia, and thus these covariates could not be analyzed in the present study.

Most of the abovementioned studies used either logistic regression or classical global regression techniques, such as ordinary least squares (OLS) regression, which presuppose that the relationship between cancer and air pollution is spatially invariant, homogeneous and stationary, i.e., there are no local variations in the associations between the dependent and explanatory variables. The concept of stationarity is central in the analysis of spatial and temporal variations. A stationary process is a process that has similar properties at all locations in the area of interest. A stationary model has the same parameters at all locations, whereas a non-stationary model allows the parameters to vary locally [33]. Geographically weighted regression (GWR) is a local spatial statistical method used to examine spatial non-stationarity by allowing the associations between variables to vary from location to location [34]. GWR is a simple but powerful method for exploring non-stationary spatial relationships. It is a useful exploratory analytical tool that generates a set of location-specific parameter estimates that can be mapped and analyzed to provide information about spatial non-stationarity in the relationships between the predictors and the outcome variable [35]. The core principle underlying several local methods is the notion of spatial dependency: features close together in space tend to be more similar than features that are farther apart. This principle was termed the “First Law of Geography” by Tobler [36]. GWR extends the same principle to regression analysis [33].

The applications of GWR have grown rapidly in various fields, including sociology, health and demography [35]. GWR studies in health fields include the analysis of health and disease [37,38,39,40], health care delivery [41], the spatially varying relationships between immature mosquitoes and human population density [42] and gastric cancer in Taiwanese ethnic communities [43]. Mandal et al. [44] used OLS and GWR to examine whether breast cancer in females and prostate cancer in males were correlated at the county level in the United States using age-adjusted county-level average annual incidence rates for Caucasians. GWR revealed a more pronounced association than did OLS, and the parameter estimates computed for each county in the GWR model helped to determine that over 76% of the counties had a significant positive association between breast and prostate cancer. A more relevant study to the present research was conducted by Gilbert and Chakraborty [45], who stated that the spatial association between the cumulative cancer risk from exposure to hazardous air pollutants and explanatory variables such as race, ethnicity and socioeconomic status is not stationary throughout Florida’s census tracts. They found that conventional multivariate regression techniques such as OLS cannot reveal the local variations in these associations, whereas GWR allowed them to examine the spatial variation within the study area for each individual model coefficient.

Cancer incidence and mortality demonstrate non-stationary processes with regional variation and spatial drift, as they occur at different rates in different places. However, few studies in the literature have reported the use of GWR to assess the relationship between cancer incidence and tropospheric NO2. The present study aimed to investigate whether the number and incidence of the most common cancers in Saudi Arabia were significantly associated with exposure to urban air pollution (using NO2 as an indicator) using OLS and GWR in a Geographical Information System (GIS).

2. Materials and Methods

2.1. Cancer Data

Cancer incidence data were obtained from the Saudi Arabian Cancer Registry (SCR) [46]. The dataset covered cancers diagnosed in Saudi nationals from January 1998 to December 2004. A total of 45,532 cancer patients were diagnosed during this period. Many cancer indices have been devised to express the occurrence of cancer and other diseases in each zone. Three of the most common indices for cancer research are the crude incidence rate (CIR), the age-specific incidence rate (AIR) and the age-standardized incidence rate (ASR) [47,48]. In this study, the CIR for a particular cancer site in the human body is the total number of cases registered as a proportion of the total population. All rates were expressed per 100,000 population: the CIR is calculated by dividing the total number of cases of a particular cancer type by the population and multiplying the result by 100,000. Although age is a well-known covariate for cancer incidence, the present study used the CIR rather than the ASR. This decision was made because population data by age group are available only at the regional level, not at the governorate and city levels, and thus ASRs could not be computed for governorates and cities. Furthermore, to ensure that cancer rates were comparable across the three spatial levels, the CIR was the most appropriate measure.
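The CIR calculation described above can be illustrated with a minimal Python sketch. The function name and the example figures are hypothetical and are not taken from the SCR data.

```python
def crude_incidence_rate(cases: int, population: int, per: int = 100_000) -> float:
    """Crude incidence rate: registered cases divided by population, scaled to `per` people."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases / population * per


# Hypothetical zone: 150 registered cases in a population of 250,000.
rate = crude_incidence_rate(150, 250_000)
print(round(rate, 2))
```

Because the rate uses only a case count and a total population, it can be computed at any spatial level for which those two figures exist, which is the property the study relies on.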

The spatial cancer incidence database in Saudi Arabia was designed and developed in the form of an ESRI File Geodatabase on three spatial levels: regional, governorates and cities. Saudi Arabia is divided into thirteen regions; each region is divided into governorates, and each governorate includes a number of cities. The cancer database we obtained had records for individual cancer cases. However, the location of each cancer case was not included. To develop a spatial cancer database, the individual cancer cases were aggregated into city, governorate, regional and national levels. Starting from the city level, all of the cancer cases located in the same city were grouped and aggregated to be represented by that city. Next, all of the cancer cases located in certain cities belonging to a specific governorate were grouped together and represented by that governorate. At the regional level, all of the cancer cases in certain governorates belonging to a specific region were grouped together and represented by that region.
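As a rough illustration of the bottom-up aggregation described above, the following Python sketch counts cases per city, governorate and region. The record layout and the place names are hypothetical; the actual database was built as an ESRI File Geodatabase.

```python
from collections import Counter

# Hypothetical case records: one (city, governorate, region) tuple per diagnosed case.
cases = [
    ("City A", "Gov 1", "Region X"),
    ("City A", "Gov 1", "Region X"),
    ("City B", "Gov 1", "Region X"),
    ("City C", "Gov 2", "Region X"),
    ("City D", "Gov 3", "Region Y"),
]


def aggregate(records):
    """Count cases at the city, governorate and regional levels in one pass."""
    by_city, by_governorate, by_region = Counter(), Counter(), Counter()
    for city, governorate, region in records:
        by_city[(city, governorate, region)] += 1
        by_governorate[(governorate, region)] += 1
        by_region[region] += 1
    return by_city, by_governorate, by_region


by_city, by_governorate, by_region = aggregate(cases)
national_total = sum(by_region.values())
```

Keying each level by its parent units keeps same-named cities or governorates in different regions apart, and the regional counts sum to the national total.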

2.2. NO2 Data

NO2 is an omnipresent atmospheric pollutant owing to the prevalence of both natural and anthropogenic sources, although it is primarily man-made. NO2 is produced in the environment from emissions of nitrogen oxides (NOx). The NOx that yield NO2 are emitted naturally by biomass burning (e.g., forest fires), lightning and microbial activity in the soil, and anthropogenically by fossil fuel and biofuel combustion, power plants, heavy industry and vehicular traffic, which makes NO2 a strong indicator of vehicle emissions. NO2 (and other NOx) is a precursor of a number of harmful secondary air pollutants, including nitric acid and photo-oxidants (including ozone) [1,49,50].

Figure 1. Distribution of mean tropospheric NO2 column density, 2003–2010.

The mean tropospheric NO2 column density data for cities in Saudi Arabia (Figure 1) were extracted from a global NO2 pollution map produced by the Satellite Group in the Max-Planck-Institute for Chemistry in Mainz, Germany [51]. The image shows the global mean tropospheric NO2 column density between 2003 and 2010 using Envisat observations as measured by the SCIAMACHY instrument on ESA’s Envisat, the world’s largest satellite for environmental monitoring. “SCIAMACHY is an imaging spectrometer whose primary mission objective is to perform global measurements of trace gases in the troposphere and in the stratosphere. The solar radiation transmitted, backscattered and reflected from the atmosphere is recorded at relatively high resolution (0.2 µm to 0.5 µm) over the range 240 nm to 1700 nm, and in selected regions between 2.0 µm and 2.4 µm. SCIAMACHY has three different viewing geometries: nadir, limb, and sun/moon occultations which yield total column values as well as distribution profiles in the stratosphere and (in some cases) the troposphere for trace gases and aerosols. The nadir and limb viewing strategy of SCIAMACHY yields total column values as well as profiles for trace gases and aerosols in the stratosphere. Additionally, this enables estimates of global trace gas and aerosol content and distribution in the lower stratosphere and troposphere. The measurements obtained from SCIAMACHY enable the investigation of a wide range of phenomena which influence atmospheric chemistry such as measurement in the troposphere: biomass burning, pollution, arctic haze, forest fires, dust storms, industrial plumes; and measurement in the stratosphere: ozone chemistry, volcanic events and solar proton events. The spatial resolution of SCIAMACHY depends on the wavelength region and also on the solar zenith angle. For most NO2 measurements, the area is 60 × 30 km2. 
Currently, the analysis is based on a rather limited set of both uncalibrated and calibrated data that have been released by ESA, and therefore has to be considered as preliminary” [50]. A description of the retrieval algorithm used and an application to long-term changes of tropospheric NO2 can be found in Richter et al. [52].

Using the global mean tropospheric NO2 column density map, we first isolated the area of Saudi Arabia from the global map and then georeferenced the clipped map (Figure 1). The NO2 values were first extracted for Saudi cities using the Sample function with the nearest resampling algorithm, and then we aggregated the NO2 values at the governorate and regional levels using the Zonal Statistics function in ESRI ArcGIS. An issue associated with the aggregated NO2 is the method by which the geographic boundaries of regions and governorates are defined; this difficulty is known as the modifiable areal unit problem (MAUP) [53].
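The two raster operations mentioned above can be sketched in plain Python. This is a simplified stand-in for the ArcGIS Sample and Zonal Statistics tools, not their implementation: the raster is assumed to be a small list of rows, zones are assumed to be already rasterized onto the same grid, and all values are hypothetical.

```python
from collections import defaultdict


def sample_nearest(grid, x_min, y_max, cell, x, y):
    """Return the value of the cell containing point (x, y); row 0 is the top row."""
    col = int((x - x_min) // cell)
    row = int((y_max - y) // cell)
    if not (0 <= row < len(grid) and 0 <= col < len(grid[0])):
        raise ValueError("point lies outside the raster")
    return grid[row][col]


def zonal_mean(values, zones):
    """Mean raster value per zone label, given a zone grid of the same shape."""
    sums, counts = defaultdict(float), defaultdict(int)
    for value_row, zone_row in zip(values, zones):
        for value, zone in zip(value_row, zone_row):
            sums[zone] += value
            counts[zone] += 1
    return {zone: sums[zone] / counts[zone] for zone in sums}


# Hypothetical 2 x 3 NO2 raster and a matching grid of governorate labels.
no2 = [[1.0, 2.0, 3.0],
       [4.0, 5.0, 6.0]]
zones = [["A", "A", "B"],
         ["A", "B", "B"]]

city_value = sample_nearest(no2, x_min=0.0, y_max=2.0, cell=1.0, x=1.5, y=1.5)
means = zonal_mean(no2, zones)
```

Changing the zone grid changes the zonal means even though the underlying raster is identical, which is a small-scale view of the MAUP noted above.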

2.3. Spatial Statistical Analysis

GWR is a reasonably recent contribution to modeling spatially heterogeneous processes. Using GWR, parameters can be estimated anywhere in the study area given a dependent variable and a set of one or more independent variables measured at areas whose location is known [34,54,55,56,57]. In contrast to the global regression model OLS, GWR can estimate discrete coefficients for each observation, i.e., geographic features. GWR extends the conventional OLS linear regression models that mask significant local variation. The key difference between global and local analyses is that global estimation uses one model for all observations, while GWR estimates a particular local model for each location in space. GWR is capable of generating parameter estimates for every regression point using observations in a given neighborhood. The parameter estimates are characteristically mapped to highlight spatial variation [58]. GWR is an extension from global regression to local regression, with the critical idea that for each regression point i, there is a bump of influence around i described by the weight function such that sampled observations near i have more influence in the estimation of the parameters than observations sampled further away [34]. The GWR model can be expressed as follows:
$$y_i = \beta_0(u_i, v_i) + \sum_{k=1}^{p} \beta_k(u_i, v_i)\, x_{ik} + \varepsilon_i \tag{1}$$
where the dependent variable y is regressed on a set of independent variables, each denoted by xk, and the parameters are allowed to vary over space. Here, (ui, vi) denotes the coordinates of the i-th point in space, and βk(ui, vi) is a realization of the continuous function βk(u, v) at point i; xi1, xi2, . . ., xip are the explanatory variables at point i; and εi are error terms [34,56]. For a given data set, the local parameters βk(ui, vi) are estimated using the weighted least squares procedure. The weights wij for j = 1, . . ., n at each location (ui, vi) are obtained as a continuous function of the distance between point i and the other data points.

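Let:
$$B = \begin{bmatrix} \beta_0(u_1, v_1) & \beta_1(u_1, v_1) & \cdots & \beta_p(u_1, v_1) \\ \beta_0(u_2, v_2) & \beta_1(u_2, v_2) & \cdots & \beta_p(u_2, v_2) \\ \vdots & \vdots & & \vdots \\ \beta_0(u_n, v_n) & \beta_1(u_n, v_n) & \cdots & \beta_p(u_n, v_n) \end{bmatrix} \tag{2}$$
be the matrix of the local parameters. Each row is estimated by
$$\hat{\beta}(i) = \left( X^{T} W(i)\, X \right)^{-1} X^{T} W(i)\, y \tag{3}$$
where i = 1, . . ., n indexes the rows of the matrix, X is the matrix of explanatory variables, y is the dependent variable, and W(i) is an n by n spatial weighting matrix of the form:
$$W(i) = \begin{bmatrix} w_{i1} & 0 & \cdots & 0 \\ 0 & w_{i2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & w_{in} \end{bmatrix} \tag{4}$$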
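For a single explanatory variable such as NO2, the weighted least squares estimate at one regression point has a simple closed form. The sketch below assumes the standard estimator (XᵀW(i)X)⁻¹XᵀW(i)y with X containing an intercept column and one predictor; the function name and data are hypothetical, and this is not the ArcGIS implementation.

```python
def local_wls(x, y, w):
    """Weighted least squares fit of y = b0 + b1 * x using the weights w.

    With an intercept and one predictor this equals (X'WX)^-1 X'Wy.
    """
    total_weight = sum(w)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total_weight  # weighted mean of x
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total_weight  # weighted mean of y
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    if sxx == 0:
        raise ValueError("x has no weighted variation; the slope is undefined")
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Hypothetical data: the last observation is far from the regression point,
# so it receives zero weight and does not affect the local estimate.
x = [0.0, 1.0, 2.0, 3.0]
y = [2.0, 5.0, 8.0, 100.0]
w = [1.0, 1.0, 1.0, 0.0]
b0, b1 = local_wls(x, y, w)
```

Repeating this fit with a different weight vector for each regression point yields one row of the local parameter matrix per location.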

In global regression models such as OLS, every point has the same weight, whereas in local regression models such as the GWR model, the spatial weight of these points decreases with the distance from the regression point. The weights are computed using a weighting scheme that is known as a kernel. Following the suggestions of Fotheringham et al. [34], in this study, the spatial adaptive kernel was applied rather than the fixed kernel because cities are not positioned regularly in the study area, i.e., they are heterogeneous and clustered in some areas. The spatial context is a function of a specified number of neighbors. Where the distribution of cities (in this study) is dense, the spatial context is smaller; where the distribution of cities is sparse, the spatial context is larger. A spatially adaptive kernel is usually formed by sorting the distances of the sample points from the desired regression point i and setting the bandwidth so that it includes only the first N observations, where the optimal value of N is determined by the data. The weight can be computed by using the specified kernel, setting the value of any observation whose distance is greater than the bandwidth to zero and excluding them from the local calibration [57]. Although a number of kernels are possible, the bi-square weighting function is usually used to create adaptive kernels [34] and can be implemented in ESRI ArcGIS [38]. Gilbert and Chakraborty [45] used the bi-square weighting function to produce adaptive kernels for the GWR model that examined the spatial association between cumulative cancer risk from exposure to hazardous air pollutants and explanatory variables such as race, ethnicity and socioeconomic status. 
Charlton and Fotheringham [57] stated that the bi-square weighting function is a near-Gaussian function with the useful property that the weight is zero at a finite distance. It can be expressed as $w_{ij} = [1 - (d_{ij}/b)^2]^2$ for $d_{ij} < b$ and $w_{ij} = 0$ otherwise, where dij is the distance between a calibration point i and a sample data point j and b is the distance to the Nth nearest neighbor, also known as the bandwidth.
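The adaptive bi-square kernel can be sketched as follows, using wij = [1 − (dij/b)²]² inside the bandwidth and zero beyond it. The sketch assumes planar coordinates and counts the Nth nearest neighbor excluding the regression point itself; software packages may use a different convention, and the function name and coordinates are hypothetical.

```python
import math


def bisquare_adaptive_weights(points, i, n_neighbors):
    """Bi-square weights of all points relative to regression point i.

    The bandwidth b is the distance from point i to its n_neighbors-th nearest
    neighbor (the point itself is not counted as a neighbor).
    """
    if not 1 <= n_neighbors < len(points):
        raise ValueError("n_neighbors must be between 1 and len(points) - 1")
    xi, yi = points[i]
    distances = [math.hypot(x - xi, y - yi) for x, y in points]
    # Index 0 of the sorted list is the regression point itself (distance 0).
    b = sorted(distances)[n_neighbors]
    if b == 0:
        raise ValueError("bandwidth is zero; too many coincident points")
    return [(1 - (d / b) ** 2) ** 2 if d < b else 0.0 for d in distances]


# Hypothetical city coordinates along a line; regression point is the first city.
cities = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (4.0, 0.0)]
weights = bisquare_adaptive_weights(cities, i=0, n_neighbors=2)
```

Because b is set by neighbor count rather than by a fixed distance, the kernel shrinks where cities are dense and widens where they are sparse.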

In GWR, the regression model is calibrated using the data that are geographically close to a specific location. In other words, GWR estimates parameters from the observations lying within a specified distance (the bandwidth) of a regression point and weights these observations by their distance from that point using a spatial weight function. The optimal bandwidth distance or the optimal number of neighboring units in the GWR can be specified using either cross-validation or Akaike information criterion (AIC) tests. The AIC is considered the most fitting method for applying the adaptive kernel technique because it considers both goodness-of-fit and degrees of freedom [34,58]. In the present study, the optimal bandwidth size was found by minimizing the AIC value, following previous examples of GWR application [58,59,60,61,62]. The AIC criterion in GWR is computed as in Hurvich et al. [63]:
$$\mathrm{AIC}_c = 2n \ln(\hat{\sigma}) + n \ln(2\pi) + n \left\{ \frac{n + \mathrm{tr}(S)}{n - 2 - \mathrm{tr}(S)} \right\} \tag{5}$$
where n is the number of observations in the dataset, $\hat{\sigma}$ is the estimate of the standard deviation of the residuals, and tr(S) is the trace of the hat matrix. The AIC can be used to compare models of the same independent variable and compare the global OLS model with a local GWR model [57]. The OLS and GWR models were fitted and mapped using ESRI ArcGIS 10.1.
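A minimal sketch of the criterion follows, assuming the corrected form commonly given in the GWR literature, 2n ln(σ̂) + n ln(2π) + n(n + tr(S))/(n − 2 − tr(S)). The function name and input values are hypothetical.

```python
import math


def gwr_aicc(n, sigma_hat, trace_s):
    """Corrected AIC for a GWR fit (assumed form, see the lead-in above).

    n         -- number of observations
    sigma_hat -- estimated standard deviation of the residuals
    trace_s   -- trace of the hat matrix S (effective number of parameters)
    """
    if sigma_hat <= 0:
        raise ValueError("sigma_hat must be positive")
    denominator = n - 2 - trace_s
    if denominator <= 0:
        raise ValueError("n - 2 - tr(S) must be positive")
    return (2 * n * math.log(sigma_hat)
            + n * math.log(2 * math.pi)
            + n * (n + trace_s) / denominator)


# Hypothetical comparison: a smaller bandwidth lowers the residual spread
# but raises tr(S); the criterion trades the two off.
global_like = gwr_aicc(n=100, sigma_hat=1.0, trace_s=2.0)
local_like = gwr_aicc(n=100, sigma_hat=0.9, trace_s=12.0)
```

In a bandwidth search, the candidate number of neighbors with the lowest value would be retained.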

3. Results

Our analysis of the mean tropospheric NO2 data for cities in Saudi Arabia (Figure 1) indicates that the high vertical column distributions of NO2 were associated with major cities across Saudi Arabia, including Riyadh (central) and Jeddah (western coast), and cities in the Eastern Province, including Dammam, Khobar, Jubail, and Ras Tanura.

A total of 45,532 cancer cases (22,930 males and 22,602 females) were diagnosed among Saudi Nationals between January 1998 and December 2004. In Saudi Arabia, the overall CIR between 1998 and 2004 was 42.41 per 100,000 people in the population (42.61 among males and 42.22 among females) (Figure 2), which indicates that cancer incidence is low among Saudi nationals. In a comparison of the CIRs of overall cancers in the Gulf Cooperation Council (GCC) countries [25], the rate observed among Saudis was lower than that observed in Bahrain, Qatar, Kuwait and Oman (51–93 among males and 47–98 among females) between 1998 and 2001 and lower than the worldwide rate (188 per 100,000) in 2008 [23]. The overall ASR of cancer at all sites in Saudi Arabia during the period between 1998 and 2004 ranged between 70 and 80 per 100,000 people (74–80 among males and 68–80 among females). Therefore, Saudi Arabia exhibited a lower ASR than did other GCC countries, such as Qatar (male: 165.5; female: 172.4) and Bahrain (male: 157.7; female: 144.6) between 1998 and 2001 [25]; the ASR of Saudi Arabia was also lower than the worldwide ASRs of 204 and 165 per 100,000 for males and females, respectively, in 2008 [23]. Liver cancer was the most common, accounting for 8.84% of all cancers in males, followed closely by non-Hodgkin’s lymphoma (NHL) with 8.80% and leukemia with 8.19%; colorectal cancer ranked 4th, followed by lung and prostate cancers. In females, breast cancer was the most common, accounting for 20.2% of all cancers in females, followed by thyroid cancer with 9.3%. Colorectal cancer ranked 3rd and was closely followed by NHL and leukemia. Riyadh region reported 13,063 cancer cases, accounting for 28.69% of all cancer diagnoses between 1998 and 2004, followed by Makkah region, which reported 10,479 cases, accounting for 23.01%, and Eastern province, which reported 7,698 cases, accounting for 16.91%. 
These three regions showed a significantly increasing trend in the overall number of cancer cases diagnosed between 1998 and 2004. Alahsa governorate (located in the Eastern region) reported the highest CIR, with 284.71 cases per 100,000 population. Ras Tanura governorate (Eastern region) was second, with a CIR of 113.82 cases per 100,000 population, and Shagra governorate ranked third, with a CIR of 110.96 cases per 100,000 population. Baha, Jeddah, Riyadh, Jazan, Dammam and Al-Khobar were among the governorates with the highest rates of all cancers, with CIRs ranging from 53.98 to 69.15 per 100,000 population. Samtah city (Jazan region) reported the highest CIR, with 177.13 per 100,000 population, and Al Qatif city (Eastern region) was second, with a CIR of 173.1 per 100,000 population. Al-Khobar, Shagra, Jazan, Alqunfidhah and Sarat Abidah were among the cities with the highest CIRs of all cancers, which ranged between 135.55 and 171.1 per 100,000 population (Table 1, Figure 2 and Figure 3).

The associations between the mean tropospheric NO2 and the number and incidence rates of the most common cancers in Saudi Arabia at the region, governorate and city levels were examined using OLS and GWR. Significant associations were found for some cancers, while substantially weaker and less robust associations were observed for others. The number of cancer cases was strongly associated with both the CIR (r2 = 0.80, 0.73 and 0.84 for all cases, males and females, respectively) and the ASR (r2 = 0.76, 0.64 and 0.80 for all cases, males and females, respectively). This justifies the use of the number of cancer cases in the analysis to detect associations between cancer and NO2. Table 2, Table 3 and Table 4 show the associations between NO2 and the most common cancers at the region, governorate and city levels.

At the regional level (Table 2), the OLS method indicated that the numbers of lung, prostate, Hodgkin’s disease, bladder and breast cancers (r2 = 0.62, 0.56, 0.55, 0.55 and 0.50, respectively, p < 0.05) were significantly positively associated with NO2. While using the CIR, the main significant associations were positive associations between NO2 and breast, prostate and lung cancers (r2 = 0.71, 0.61 and 0.59, respectively, p < 0.05). It was found that ASR at the regional level has a stronger association with NO2 (r2 = 0.51, 0.49 and 0.52 for all cases, males and females respectively) than does CIR (r2 = 0.43, 0.37 and 0.47 for all cases, males and females respectively). This implies that if we have data for ASR at finer geographic levels, they might have stronger associations with NO2 as well.
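The coefficient of determination reported for the global models can be illustrated with a single-predictor OLS fit in plain Python. The function name and data are hypothetical and do not reproduce any value in Table 2.

```python
def ols_r2(x, y):
    """Fit y = b0 + b1 * x by ordinary least squares and return (b0, b1, r2)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must both vary")
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    r2 = 1 - ss_res / syy  # share of the variance in y explained by x
    return b0, b1, r2


# Hypothetical NO2 values and case counts for three zones.
b0, b1, r2 = ols_r2([1.0, 2.0, 3.0], [1.0, 3.0, 2.0])
```

An r2 of 0.25, as in this toy example, would mean that three quarters of the variation in the outcome is left unexplained by the predictor.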

At the governorate level (Table 3), the overall values of the coefficient of determination r2 were generally less than those found at the regional level. At the spatial governorate level, the OLS method indicated that the numbers of diagnosed breast, lung, bladder, cervical and ovarian cancers (r2 = 0.33, 0.32, 0.31, 0.33 and 0.33, p < 0.05) were the highest in terms of a significant association with NO2. However, the spatial autocorrelation of breast cancer violated the assumption of independence based on the clustered standardized residual error.
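The text does not name the statistic used to test the residuals for spatial autocorrelation. One common choice is global Moran's I, sketched below under that assumption with a hypothetical binary neighbor matrix.

```python
def morans_i(values, weights):
    """Global Moran's I for values given a square spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all spatial weights
    if s0 == 0:
        raise ValueError("weights matrix has no neighbors")
    cross = sum(weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    variance_term = sum(zi * zi for zi in z)
    if variance_term == 0:
        raise ValueError("values are constant")
    return (n / s0) * cross / variance_term


# Hypothetical chain of four zones; each zone neighbors the next one.
chain = [[0, 1, 0, 0],
         [1, 0, 1, 0],
         [0, 1, 0, 1],
         [0, 0, 1, 0]]
clustered = morans_i([1.0, 1.0, -1.0, -1.0], chain)   # like residuals sit together
alternating = morans_i([1.0, -1.0, 1.0, -1.0], chain)  # neighbors differ in sign
```

Positive values indicate clustered residuals, which is the pattern that signals a violation of the independence assumption; values near zero are consistent with spatially random residuals.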

The significant associations between NO2 and the CIRs of the most common cancers were weak; the highest was found for colorectal cancer and for all cancers combined (r2 = 0.06, p < 0.05). Using the GWR method, the highest significant correlation was found between NO2 and the number of diagnosed lung cancers (r2 = 0.43). The CIR of lung cancer showed the highest correlation (r2 = 0.44), but the spatial autocorrelation violated the assumption of independence based on the clustered standardized residual error. A significant correlation between NO2 and the CIR of the most common cancers was also found for breast and prostate cancers (r2 = 0.39 and 0.31, respectively).

At the city level (Table 4), the overall values of the coefficient of determination r2 were generally lower than those found at the regional and governorate levels. At this spatial level, OLS and GWR were applied for the number of diagnosed cancers and the CIR. Using OLS, the highest significant association with NO2 was found for the number of lung cancer diagnoses (r2 = 0.23, p < 0.05), while for the CIR, there was no significant association (r2 ≤ 0.0003, p > 0.05). Using GWR, the highest significant correlation was found between NO2 and the number of lung cancer diagnoses (r2 = 0.33) followed by cervical, ovarian and breast cancers (r2 = 0.30, 0.29 and 0.29, respectively). Regarding the CIR, the highest significant correlation was found for Hodgkin’s disease (r2 = 0.22), whereas the models for the other most common cancers violated the assumption of independence because their standardized residuals were spatially autocorrelated (clustered). Overall, high coefficients of determination (r2) were observed in the Eastern, Riyadh and Makkah regions and in their governorates and cities.

4. Discussion

This study aimed to investigate whether the number of cases and incidence of the most common cancers in Saudi Arabia between 1998 and 2004 were significantly associated with exposure to NO2 urban air pollution using the OLS and GWR models in GIS. This study is the first in Saudi Arabia and the region to combine spatial and non-spatial cancer data, a spatial confounding factor (i.e., the distribution surface of NO2), and the methods applied here.

The high NO2 concentrations in the major cities across Saudi Arabia could be attributed to vehicle emissions and the chemical industries. Additionally, the Eastern region contains Saudi Arabia’s massive petroleum resources, as it is home to most of Saudi Arabia’s oil production. The province is also home of the City of Jubail, which hosts the Jubail Industrial City, a global hub for chemical industries and the largest industrial city in the Middle East. It also holds the Middle East’s largest and the world’s fourth largest petrochemical company. The Eastern region also encompasses Ras Tanura city, which is a major oil port and oil operations center for Saudi Aramco, the largest oil company in the world. The NO2 concentrations in Riyadh and Jeddah, the two largest cities in Saudi Arabia, could be attributed to the large number of cars and urban activities.

There were statistically significant associations between the concentration of NO2 air pollution and the most common cancers diagnosed between 1998 and 2004 in Saudi Arabia. This result can be explained by the fact that NO2 is much more concentrated in urban areas, where more cancer cases occur because of the size of the population. However, the coefficient of determination of these associations varied between the spatial levels of analysis (regions, governorates and cities), the methods used (OLS and GWR), the measurement of cancer data employed (diagnosed number or CIR) and the diagnosed cancer sites. Notably, the only results considered in this study were those significant at p < 0.05 and the standardized residual errors that were not spatially autocorrelated.

Regarding the spatial level of analysis, the significant coefficients of determination (r2) were higher at the regional level (r2 = 0.32–0.71), weaker at the governorate level (r2 = 0.03–0.43) and declined slightly at the city level (r2 = 0.17–0.33). The finding that the association was higher at the regional level may be attributable to the rural/urban variability in NO2, which is fairly visible in Figure 1. However, the low values of the coefficients of determination at the lowest spatial level (i.e., cities) suggest that additional variation remains unexplained. Thus, factors other than NO2 may be associated with the risk of cancer.

Robinson [64] coined the terms “ecological fallacy” and “ecological correlation”, which refer to the inappropriate use of an aggregated statistic to make inferences about an individual. This study is an ecological correlation study because its units of analysis were populations within cities, governorates and regions rather than individual people; any inference about individuals is therefore drawn from aggregate data. This is a common concern in ecological studies, in which exposure and response are quantified only for aggregates and not for individuals [65].
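
The gap between aggregate and individual associations can be illustrated with a small simulation. The sketch below is not a reanalysis of the study data: the group sizes, noise levels and variable names are all hypothetical, and the data are synthetic. It assumes that exposure and outcome share only a group-level component, and then compares the coefficient of determination computed on individuals with the one computed on group means.

```python
import numpy as np

# Synthetic data only; every number here is an illustrative assumption.
rng = np.random.default_rng(1)
n_groups, n_per_group = 50, 200

# A shared group-level component (e.g., an area-wide characteristic).
group_mean = rng.normal(size=n_groups)
g = np.repeat(np.arange(n_groups), n_per_group)  # group index of each individual

# Individual exposure and outcome: shared group component plus large,
# independent individual-level noise.
x = group_mean[g] + rng.normal(scale=3.0, size=g.size)
y = group_mean[g] + rng.normal(scale=3.0, size=g.size)


def r2(a, b):
    """Squared Pearson correlation (r2 of a simple linear regression)."""
    return np.corrcoef(a, b)[0, 1] ** 2


# Association measured on individuals.
r2_individual = r2(x, y)

# Association measured on group means (the ecological level).
x_bar = np.bincount(g, weights=x) / n_per_group
y_bar = np.bincount(g, weights=y) / n_per_group
r2_aggregate = r2(x_bar, y_bar)

print(f"individual-level r2: {r2_individual:.3f}")
print(f"aggregate-level r2:  {r2_aggregate:.3f}")
```

Under these assumptions, averaging removes most of the individual-level noise, so the aggregate r2 is far larger than the individual r2. This is why a strong association between areas does not by itself indicate a strong association between individuals.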

Regarding the methods used, only the OLS method was applied at the regional level because there are only thirteen administrative regions in Saudi Arabia, whereas the minimum recommended number of features for applying GWR is 100. Using OLS, the significant coefficients of determination at the regional level were high (r2 = 0.32–0.71). At the governorate and city levels, GWR marginally improved the associations between the concentration of NO2 air pollution and the most common cancers (r2 = 0.03–0.33 using OLS and r2 = 0.03–0.43 using GWR for governorates; r2 = 0.17–0.23 using OLS and r2 = 0.17–0.33 using GWR for cities). Therefore, a non-stationary local model (i.e., GWR) gave a somewhat better account than a global model (i.e., OLS) for spatial estimation and prediction. Whereas global models can mask local variation, local models can increase prediction accuracy by offering the opportunity to explore and understand local variations and by allowing the spatial drift of regression parameters to be identified, estimated and mapped.
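
The difference between a global and a local model can be sketched in a few lines of NumPy. The example below is a minimal illustration on synthetic data, not the software or settings used in this study: the function names, the fixed Gaussian kernel and the bandwidth of 2.0 are assumptions chosen for clarity. It fits one OLS line to all locations, then fits a separate distance-weighted regression at each location, as GWR does, and compares the in-sample r2 values.

```python
import numpy as np


def ols_r2(x, y):
    """r2 of a single global regression y = b0 + b1 * x."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)


def gwr_r2(coords, x, y, bandwidth):
    """In-sample r2 of a basic GWR with a fixed Gaussian kernel (assumed)."""
    X = np.column_stack([np.ones_like(x), x])
    fitted = np.empty_like(y)
    for i in range(len(y)):
        # Weight every observation by its distance to location i.
        d = np.linalg.norm(coords - coords[i], axis=1)
        w = np.exp(-0.5 * (d / bandwidth) ** 2)
        # Weighted least squares: solve (X' W X) beta = X' W y.
        XtW = X.T * w
        beta_i = np.linalg.solve(XtW @ X, XtW @ y)
        fitted[i] = X[i] @ beta_i
    resid = y - fitted
    return 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)


# Synthetic example: the slope drifts from west to east (non-stationarity).
rng = np.random.default_rng(0)
n = 200
coords = rng.uniform(0, 10, size=(n, 2))
x = rng.normal(size=n)
slope = 0.5 + 0.3 * coords[:, 0]
y = 1.0 + slope * x + rng.normal(scale=0.5, size=n)

r2_ols = ols_r2(x, y)
r2_gwr = gwr_r2(coords, x, y, bandwidth=2.0)
print(f"OLS r2: {r2_ols:.3f}")
print(f"GWR r2: {r2_gwr:.3f}")
```

Because the simulated slope varies across space, the local fits track it and the single global slope cannot. In practice the bandwidth is usually selected by a criterion such as cross-validation or a corrected AIC rather than fixed by hand, and an in-sample r2 favors the more flexible local model, so small gains should be read with caution.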

Regarding the measure of cancer data employed, regardless of the spatial level of analysis or the method used, the significant coefficients of determination were r2 = 0.17–0.62 using the number of diagnosed cancer cases and r2 = 0.05–0.71 using the CIRs. This finding suggests that a correlation exists between NO2 and cancer development. A high association between cancers and NO2 exposure for both the number of cases and the incidence rate might imply that the relationship is concentrated in urban areas with large populations and high NO2 concentrations caused by urban and industrial activities. This interpretation is supported by the locations of the areas with a high association between the two variables, which were clustered in the Eastern and Riyadh regions. The industrial and petrochemical activities in Saudi Arabia are largely located in the Eastern Province, which is the largest producer of oil and related petrochemical products worldwide as well as a densely populated area. The Riyadh region, in turn, includes the capital city and is the most populated area in the country.
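
The distinction between case counts and incidence rates matters because counts scale with population size. The following sketch uses synthetic data and hypothetical parameters (the number of areas, a constant underlying rate of 50 cases per 100,000 and a pollution proxy that rises with the logarithm of population); it assumes the rate is expressed per 100,000 population and does not reproduce the study data. It shows that when the true rate is identical everywhere, counts can still correlate with the pollution proxy while rates do not.

```python
import numpy as np

# Synthetic data only; all parameters are illustrative assumptions.
rng = np.random.default_rng(2)
n_areas = 120

# Area populations (skewed, as city sizes usually are).
population = rng.lognormal(mean=11.0, sigma=1.0, size=n_areas)

# Pollution proxy that increases with population size, plus noise.
no2 = np.log(population) + rng.normal(scale=0.5, size=n_areas)

# Same underlying incidence everywhere: 50 cases per 100,000 people,
# independent of the pollution proxy.
true_rate = 50.0 / 100_000
cases = rng.poisson(population * true_rate)

# Incidence rate per 100,000 population.
rate_per_100k = cases / population * 100_000


def r2(a, b):
    """Squared Pearson correlation (r2 of a simple linear regression)."""
    return np.corrcoef(a, b)[0, 1] ** 2


r2_counts = r2(no2, cases)
r2_rates = r2(no2, rate_per_100k)
print(f"r2 using case counts:     {r2_counts:.3f}")
print(f"r2 using incidence rates: {r2_rates:.3f}")
```

In this constructed scenario the association with counts is driven entirely by population size. An association that persists when rates are used, as reported above for the CIRs, is therefore less likely to be an artifact of population size alone, although it remains subject to the other limitations of aggregate data.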

In terms of tumor location, a high association was observed between the concentration of NO2 air pollution and the risk of developing lung and breast cancers, followed by prostate, bladder, cervical and ovarian cancers. This finding corroborates results from other studies. For example, associations have been reported between NO2 and lung cancer [4,6,7,8,11], breast cancer [14], bladder cancer [13] and cervical and brain cancers [16].

However, this study is limited because the cancer incidence data cover the period between 1998 and 2004, whereas the NO2 concentrations cover the period between 2003 and 2010. Exposure must precede the outcome, and a decade may be required for cancer to develop. It would have been preferable to use NO2 data for previous decades; unfortunately, such data were not available. One could argue that the overall pattern and trend in the NO2 concentration may not have changed substantially. Outdoor NO2 air pollution can mainly be attributed to power plants, heavy industrial activities and vehicular traffic. Al-Jeelani [66] stated that there is a lack of data about air pollution generated by power plants in Saudi Arabia and that the most significant source of air pollutants such as NO2 is automobiles. The number of automobiles in most Saudi cities increases in tandem with population growth. Heavy industrial activities in Saudi Arabia were established a few decades ago and are concentrated in certain major regions: Eastern, Riyadh and Makkah. Therefore, it can be argued that the overall pattern and trends of NO2 concentration between 1998 and 2004 may not have differed significantly from those between 2003 and 2010.

Moreover, exposure to air pollutants such as NO2 is only one environmental risk factor for cancer. Cancer incidence is explained by a combination of genetic, demographic, socio-economic, environmental, behavioral and cultural risk factors. In particular, the variations in cancer incidence are probably associated with many variables, including population aging and growth, tobacco smoking status (intensity and duration), occupational and environmental exposures, unhealthy dietary habits, physical inactivity, the prevalence of obesity, genetic factors, the lack of screening programs and the accessibility of specialized cancer centers [22,23,24,25,26,27,28,29,30,31,32]. Regrettably, there appears to be a lack of data on these covariates in Saudi Arabia, and thus, they could not be analyzed in the present study.

5. Conclusions

This study is the first of its kind in Saudi Arabia. It relied on cancer data acquired from the Saudi Cancer Registry, the spatial database of cancer incidence rates developed by the authors and the global NO2 map derived from measurements by the SCIAMACHY instrument on ESA’s Envisat satellite. Additionally, the statistical methodology employed in this study combined global models, such as OLS, with local spatial statistical models, such as GWR, which captured and explained both the global and local heterogeneity and variations in the number of cancer cases and incidence rates. However, there is a lack of information on other contributing (confounding) factors. Although an association was found between exposure to NO2 air pollution and the development of some cancers, these inferences are uncertain because they rest on aggregate data. If NO2 exposure were measured at the individual level, the inferences would be more reliable and could be used strategically to create health policies, health planning services and preventive policies and to control emissions. Environmental, demographic, behavioral, socio-economic, genetic and other risk factors are of great importance in spatial epidemiological studies of cancer. Countries with noticeable industrial expansion and an increasing burden of cancer, such as Saudi Arabia, should establish a nationwide spatial database of risk factors at the individual level. Such data will be vital for spatial epidemiological studies and for studies related to more general health concerns.

Acknowledgments

The authors acknowledge with gratitude King Abdulaziz City for Science and Technology, the Saudi Cancer Registry and its board members and King Faisal Specialist Hospital & Research Centre.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. WHO (World Health Organization), Regional Office for Europe. Health Aspects of Air Pollution with Particulate Matter, Ozone and Nitrogen Dioxide; WHO Regional Office for Europe: Bonn, Germany, 2003.
2. Barillaro, G.; Diligenti, A.; Strambini, L.M.; Comini, E.; Faglia, G. NO2 adsorption effects on p+-n silicon junctions surrounded by a porous layer. Sensors Actuat. B Chem. 2008, 134, 922–927. doi:10.1016/j.snb.2008.06.048
3. Beeson, W.L.; Abbey, D.E.; Knutsen, S.F. Long-term concentrations of ambient air pollutants and incident lung cancer in California adults: Results from the AHSMOG study. Adventist Health Study on Smog. Environ. Health Perspect. 1998, 106, 813–823. doi:10.2307/3434125
4. Nyberg, F.; Gustavsson, P.; Järup, L.; Bellander, T.; Berglind, N.; Jakobsson, R.; Pershagen, G. Urban air pollution and lung cancer in Stockholm. Epidemiology 2000, 11, 487–495. doi:10.1097/00001648-200009000-00002
5. Nafstad, P.; Haheim, L.L.; Oftedal, B.; Gram, F.; Holme, I.; Hjermann, I.; Leren, P. Lung cancer and air pollution: A 27 year follow up of 16 209 Norwegian men. Thorax 2003, 58, 1071–1076. doi:10.1136/thorax.58.12.1071
6. Dockery, D.W.; Pope, C.A.; Xu, X.; Spengler, J.D.; Ware, J.H.; Fay, M.E.; Ferris, B.G., Jr.; Speizer, F.E. An association between air pollution and mortality in six US cities. N. Engl. J. Med. 1993, 329, 1753–1759. doi:10.1056/NEJM199312093292401
7. Pope, C.A.; Thun, M.J.; Namboodiri, M.M.; Dockery, D.W.; Evans, J.S.; Speizer, F.E.; Heath, C.W. Particulate air pollution as a predictor of mortality in a prospective study of US adults. Am. J. Respir. Crit. Care Med. 1995, 151, 669–674. doi:10.1164/ajrccm/151.3_Pt_1.669
8. Pope, C.A., III; Burnett, R.T.; Thun, M.J.; Calle, E.E.; Krewski, D.; Ito, K.; Thurston, G.D. Lung cancer, cardiopulmonary mortality, and long-term exposure to fine particulate air pollution. JAMA 2002, 287, 1132–1141. doi:10.1001/jama.287.9.1132
9. Vineis, P.; Hoek, G.; Krzyzanowski, M.; Vigna-Taglianti, F.; Veglia, F.; Airoldi, L.; Autrup, H.; Dunning, A.; Garte, S.; Hainaut, P. Air pollution and risk of lung cancer in a prospective study in Europe. Int. J. Cancer 2006, 119, 169–174. doi:10.1002/ijc.21801
10. Vineis, P.; Hoek, G.; Krzyzanowski, M.; Vigna-Taglianti, F.; Veglia, F.; Airoldi, L.; Overvad, K.; Raaschou-Nielsen, O.; Clavel-Chapelon, F.; Linseisen, J. Lung cancers attributable to environmental tobacco smoke and air pollution in non-smokers in different European countries: A prospective study. Environ. Health 2007, 6. doi:10.1186/1476-069X-6-7
11. Katanoda, K.; Sobue, T.; Satoh, H.; Tajima, K.; Suzuki, T.; Nakatsuka, H.; Takezaki, T.; Nakayama, T.; Nitta, H.; Tanabe, K. An association between long-term exposure to ambient air pollution and mortality from lung cancer and respiratory diseases in Japan. J. Epidemiol. 2011, 21, 132–143. doi:10.2188/jea.JE20100098
12. Raaschou-Nielsen, O.; Andersen, Z.J.; Hvidberg, M.; Jensen, S.S.; Ketzel, M.; Sørensen, M.; Loft, S.; Overvad, K.; Tjønneland, A. Lung cancer incidence and long-term exposure to air pollution from traffic. Environ. Health Perspect. 2011, 119, 860–865. doi:10.1289/ehp.1002353
13. Castaño-Vinyals, G.; Cantor, K.P.; Malats, N.; Tardon, A.; Garcia-Closas, R.; Serra, C.; Carrato, A.; Rothman, N.; Vermeulen, R.; Silverman, D. Air pollution and risk of urinary bladder cancer in a case-control study in Spain. Occup. Environ. Med. 2008, 65, 56–60. doi:10.1136/oem.2007.034348
14. Liu, C.C.; Tsai, S.S.; Chiu, H.F.; Wu, T.N.; Chen, C.C.; Yang, C.Y. Ambient exposure to criteria air pollutants and risk of death from bladder cancer in Taiwan. Inhal. Toxicol. 2010, 21, 48–54.
15. Crouse, D.L.; Goldberg, M.S.; Ross, N.A.; Chen, H.; Labreche, F. Postmenopausal breast cancer is associated with exposure to traffic-related air pollution in Montreal, Canada: A case-control study. Environ. Health Perspect. 2010, 118, 1578–1583. doi:10.1289/ehp.1002221
16. Raaschou-Nielsen, O.; Andersen, Z.J.; Hvidberg, M.; Jensen, S.S.; Ketzel, M.; Sørensen, M.; Hansen, J.; Loft, S.; Overvad, K.; Tjønneland, A. Air pollution from traffic and cancer incidence: A Danish cohort study. Environ. Health 2011, 10, 67. doi:10.1186/1476-069X-10-67
17. Rosenlund, M.; Bellander, T.; Nordquist, T.; Alfredsson, L. Long-term exposure to air pollution and cancer. Epidemiology 2007, 18, S66.
18. Kan, H.; Gu, D. Association between Long-Term Exposure to Outdoor Air Pollution and Mortality in China: A Cohort Study. In Proceedings of the ISEE 22nd Annual Conference, Seoul, Korea, 28 August–1 September 2010.
19. Yorifuji, T.; Kashima, S.; Tsuda, T.; Ishikawa-Takata, K.; Ohta, T.; Tsuruta, K.; Doi, H. Long-term exposure to traffic-related air pollution and the risk of death from hemorrhagic stroke and lung cancer in Shizuoka. Sci. Total Environ. 2013, 443, 397–402. doi:10.1016/j.scitotenv.2012.10.088
20. Hystad, P.; Demers, P.A.; Johnson, K.C.; Carpiano, R.M.; Brauer, M. Long-term residential exposure to air pollution and lung cancer risk. Epidemiology 2013, 24, 762–772. doi:10.1097/EDE.0b013e3182949ae7
21. Beelen, R.; Hoek, G.; van den Brandt, P.A.; Goldbohm, R.A.; Fischer, P.; Schouten, L.J.; Armstrong, B.; Brunekreef, B. Long-term exposure to traffic-related air pollution and lung cancer risk. Epidemiology 2008, 19, 702–710. doi:10.1097/EDE.0b013e318181b3ca
22. American Cancer Society. Global Cancer Facts & Figures, 2nd ed.; American Cancer Society: Atlanta, GA, USA, 2011.
23. Ferlay, J.; Shin, H.R.; Bray, F.; Forman, D.; Mathers, C.; Parkin, D.M. Estimates of worldwide burden of cancer in 2008: GLOBOCAN 2008. Int. J. Cancer 2010, 127, 2893–2917. doi:10.1002/ijc.25516
24. Clapp, R.W.; Jacobs, M.M.; Loechler, E.L. Environmental and occupational causes of cancer: New evidence 2005–2007. Rev. Environ. Health 2008, 23, 1–37. doi:10.1515/REVEH.2008.23.1.1
25. Al-Hamdan, N.; Ravichandran, K.; Al-Sayyad, J.; Al-Lawati, J.; Khazal, Z.; Al-Khateeb, F.; Abdulwahab, A.; Al-Asfour, A. Incidence of cancer in Gulf Cooperation Council countries, 1998–2001. East. Mediterr. Health J. 2009, 15, 600–611.
26. Kirkeleit, J.; Riise, T.; Bratveit, M.; Moen, B.E. Increased risk of acute myelogenous leukemia and multiple myeloma in a historical cohort of upstream petroleum workers exposed to crude oil. Cancer Causes Control 2008, 19, 13–23. doi:10.1007/s10552-007-9065-x
27. Nyberg, F.; Gustavsson, P.; Järup, L.; Bellander, T.; Berglind, N.; Jakobsson, R.; Pershagen, G. Urban air pollution and lung cancer in Stockholm. Epidemiology 2000, 11, 487–495. doi:10.1097/00001648-200009000-00002
28. Memon, A.; Darif, M.; Al-Saleh, K.; Suresh, A. Epidemiology of reproductive and hormonal factors in thyroid cancer: Evidence from a case-control study in the Middle East. Int. J. Cancer 2002, 97, 82–89. doi:10.1002/ijc.1573
29. Sakoda, L.C.; Horn-Ross, P.L. Reproductive and menstrual history and papillary thyroid cancer risk: The San Francisco Bay Area thyroid cancer study. Cancer Epidemiol. Biomark. 2002, 11, 51–57.
30. Santamaría-Ulloa, C. The impact of pesticide exposure on breast cancer incidence. Evidence from Costa Rica. Población y Salud en Mesoamerica 2009, 7. Available online: http://ccp.ucr.ac.cr/revista/volumenes/7/7-1/7-1-1/7-1-1-ing.pdf (accessed on 13 December 2012).
31. Blot, W.J.; Fraumeni, J.F., Jr. Cancers of the Lung and Pleura. In Cancer Epidemiology and Prevention, 2nd ed.; Schottenfeld, D., Fraumeni, J.F., Jr., Eds.; Oxford University Press: New York, NY, USA, 1996; pp. 637–665.
32. Twigg, L.; Moon, G.; Walker, S. The Smoking Epidemic in England; Health Development Agency: London, UK, 2004.
33. Lloyd, C.D. Local Models for Spatial Analysis; CRC Press: Boca Raton, FL, USA, 2011.
34. Fotheringham, A.S.; Brunsdon, C.; Charlton, M. Geographically Weighted Regression: The Analysis of Spatially Varying Relationships; Wiley: Chichester, UK, 2002.
35. Matthews, S.A.; Yang, T.C. Mapping the results of local statistics: Using geographically weighted regression. Demogr. Res. 2012, 26, 151–166. doi:10.4054/DemRes.2012.26.6
36. Tobler, W.R. A computer movie simulating urban growth in the Detroit region. Econ. Geogr. 1970, 46, 234–240. doi:10.2307/143141
37. Goovaerts, P. Analysis and Detection of Health Disparities Using Geostatistics and a Space-Time Information System: The Case of Prostate Cancer Mortality in the United States, 1970–1994. In Proceedings of GIS Planet 2005, Estoril, Portugal, 30 May–2 June 2005.
38. Nakaya, T.; Fotheringham, A.S.; Brunsdon, C.; Charlton, M. Geographically weighted Poisson regression for disease association mapping. Stat. Med. 2005, 24, 2695–2717. doi:10.1002/sim.2129
39. Yang, T.C.; Teng, H.W.; Haran, M. The impacts of social capital on infant mortality in the U.S.: A spatial investigation. Appl. Spat. Anal. 2009, 2, 211–227. doi:10.1007/s12061-009-9025-9
40. Chen, V.Y.J.; Wu, P.C.; Yang, T.C.; Su, H.J. Examining non-stationary effects of social determinants on cardiovascular mortality after cold surges in Taiwan. Sci. Total Environ. 2010, 408, 2042–2049. doi:10.1016/j.scitotenv.2009.11.044
41. Shoff, C.; Yang, T.C.; Matthews, S.A. What has geography got to do with it? Using GWR to explore place-specific associations with prenatal care utilization. GeoJournal 2012, 77, 331–341. doi:10.1007/s10708-010-9405-3
42. Lin, C.H.; Wen, T.H. Using geographically weighted regression (GWR) to explore spatial varying relationships of immature mosquito and human densities with the incidence of dengue. Int. J. Environ. Res. Public Health 2011, 8, 2798–2815. doi:10.3390/ijerph8072798
43. Tsai, P.J. The analysis of geographically weighted regression pertaining to gastric cancer and Taiwanese ethnic communities. IPCBEE 2011, 16. Available online: http://www.ipcbee.com/vol16/1-E011.pdf (accessed on 6 September 2012).
44. Mandal, R.; St-Hilaire, S.; Kie, J.G.; Derryberry, D. Spatial trends of breast and prostate cancers in the United States between 2000 and 2005. Int. J. Health Geogr. 2009, 8. doi:10.1186/1476-072X-8-53
45. Gilbert, A.; Chakraborty, J. Using geographically weighted regression for environmental justice analysis: Cumulative cancer risks from air toxics in Florida. Soc. Sci. Res. 2011, 40, 273–286. doi:10.1016/j.ssresearch.2010.08.006
46. SCR (Saudi Cancer Registry). Available online: http://www.oncology.org.sa/portal/index.php?option=com_content&view=article&id=145&Itemid=130&lang=en (accessed on 25 October 2013).
47. Boyle, P.; Parkin, D.M. Statistical Methods for Registries. In Cancer Registration: Principles and Methods; Jensen, O.M., Parkin, D.M., MacLennan, R., Muir, C.S., Skeet, R.G., Eds.; IARC Scientific Publication No. 95; IARC: Lyon, France, 1991; pp. 126–158.
48. Nandakumar, A.; Gupta, P.C.; Gangadharan, P.; Visweswara, R.N.; Parkin, D.M. Geographic pathology revisited: Development of an atlas of cancer in India. Int. J. Cancer 2005, 116, 740–754. doi:10.1002/ijc.21109
49. ESA (European Space Agency). Global Air Pollution Map Produced by Envisat’s SCIAMACHY. Available online: http://www.esa.int/Our_Activities/Observing_the_Earth/Global_air_pollution_map_produced_by_Envisat_s_SCIAMACHY (accessed on 27 December 2012).
50. Tropospheric NO2 from SCIAMACHY Measurements. Available online: http://www.iup.uni-bremen.de/doas/no2_tropos_from_scia.htm#Introduction (accessed on 29 December 2012).
51. Beirle, S.; Wagner, T. TRACE GASES (NO2), Satellite Group, Max-Planck-Institute for Chemistry in Mainz, Germany. Available online: http://joseba.mpch-mainz.mpg.de/no2_nad.htm (accessed on 27 December 2012).
52. Richter, A.; Burrows, J.P.; Nüss, H.; Granier, C.; Niemeier, U. Increase in tropospheric nitrogen dioxide over China observed from space. Nature 2005, 437, 129–132. doi:10.1038/nature04092
53. Openshaw, S. The Modifiable Areal Unit Problem; Geo Books: Norwich, UK, 1984.
54. Brunsdon, C.; Fotheringham, A.S.; Charlton, M.E. Geographically weighted regression: A method for exploring spatial nonstationarity. Geogr. Anal. 1996, 28, 281–298.
55. Fotheringham, A.S.; Charlton, M.; Brunsdon, C. The geography of parameter space: An investigation into spatial non-stationarity. Int. J. Geogr. Inf. Syst. 1996, 10, 605–627.
56. Fotheringham, A.S.; Brunsdon, C.; Charlton, M.E. Two techniques for exploring non-stationarity in geographical data. Geogr. Syst. 1997, 4, 59–82.
57. Charlton, M.; Fotheringham, A.S. Geographically Weighted Regression; National Centre for Geocomputation, National University of Ireland Maynooth: Maynooth, Ireland, 2009.
58. Mennis, J. Mapping the results of geographically weighted regression. Cartogr. J. 2006, 43, 171–179. doi:10.1179/000870406X114658
59. Longley, P.A.; Tobon, C. Spatial dependence and heterogeneity in patterns of hardship: An intra-urban analysis. Ann. Assoc. Am. Geogr. 2004, 94, 503–519. doi:10.1111/j.1467-8306.2004.00411.x
60. Ali, K.; Partridge, M.D.; Olfert, M.R. Can geographically weighted regressions improve regional analysis and policy making? Int. Reg. Sci. Rev. 2007, 30, 300–329. doi:10.1177/0160017607301609
61. Cahill, M.; Mulligan, G. Using geographically weighted regression to explore local crime patterns. Soc. Sci. Comput. Rev. 2007, 25, 174–193. doi:10.1177/0894439307298925
62. Graif, C.; Sampson, R.J. Spatial heterogeneity in the effects of immigration and diversity on neighborhood homicide rates. Homicide Stud. 2009, 13, 242–260. doi:10.1177/1088767909336728
63. Hurvich, C.M.; Simonoff, J.S.; Tsai, C.L. Smoothing parameter selection in nonparametric regression using an improved Akaike information criterion. J. R. Stat. Soc. B 1998, 60, 271–293.
64. Robinson, W.S. Ecological correlations and the behavior of individuals. Am. Sociol. Rev. 1950, 15, 351–357. doi:10.2307/2087176
65. Freedman, D.A. Ecological inference and the ecological fallacy. Int. Encycl. Soc. Behav. Sci. 1999, 6, 4027–4030.
66. Al-Jeelani, H.A. Air quality assessment at Al-Taneem area in the Holy Makkah City, Saudi Arabia. Environ. Monit. Assess. 2009, 156, 211–222. doi:10.1007/s10661-008-0475-3