Malaria is endemic in most parts of Tanzania and remains a major cause of morbidity and mortality in both rural and urban areas. Ecological niche modelling (ENM) is considered a useful tool for assessing the potential geographical distribution of species. However, its application to predicting the potential distribution of diseases has been very limited, especially when using occurrence (presence) data. In this study, an ensemble modelling approach was employed to predict the current and future (2050) potential distribution of malaria in Tanzania. The ensemble approach gave improved predictions compared with the individual model outputs.

Background

Malaria is a leading cause of morbidity and mortality, accounting for over 30% of the disease burden in Tanzania. Over 95% of the 37.4 million people in the country are at risk of malaria infection. Various factors contribute to malaria in Tanzania, including demographic factors, socioeconomic factors, weak health systems, a limited budget, poor governance and accountability, antimalarial drug and insecticide resistance, environmental and climate change, vector migration, and land use patterns. Efforts to reduce malaria in Tanzania include insecticide-treated mosquito nets, indoor residual spraying, improved diagnosis by microscopy and rapid diagnostic tests, effective treatment of cases, and intermittent presumptive treatment of pregnant women. In spite of these efforts, the disease remains a leading public health problem in most parts of the country. Climate conditions such as precipitation, temperature, and relative humidity have a substantial impact on malaria. Despite the importance of these factors to the distribution of malaria, few studies have addressed the association between climatic conditions and malaria epidemics.

Objectives

Previous attempts to map the geographical distribution of malaria have relied on a theoretical model based on available long-term climate data, and on empirical models that fit malaria data to environmental factors to predict the number of months during which transmission is possible. These studies have not demonstrated predictive ability beyond the area covered by the input data. Ecological niche modelling (ENM) is considered a useful tool for assessing the potential geographical distribution of species, and it has been applied to diseases to assess the potential distribution of vectors. Applications of ENM to the distribution of malaria using occurrence cases are limited in Tanzania. Here, we adapt these modelling techniques to predict the current and future potential distribution of malaria. The goals of the study were to (i) identify possible distribution areas of malaria using an ensemble approach that integrates multiple individual models to generate a better and more conservative overall solution, (ii) identify the environmental and climate conditions correlated with malaria occurrences and estimate the population at risk, and (iii) determine how future climate change may affect the distribution of malaria in Tanzania.

Methodology

Data: Malaria occurrence point data were obtained from the Ministry of Health and Social Welfare. These are reported cases from various health facilities in the country. Current and future (2050) environmental data were obtained from the CliMond gridded climate data, which represent an improvement on the existing global climate data available for bioclimatic modelling. Thirteen environmental variables were used: eight bioclimatic variables, monthly minimum and maximum temperatures, monthly precipitation, altitude and relative humidity. The eight bioclimatic variables were mean temperature of wettest quarter, mean temperature of driest quarter, mean temperature of warmest quarter, mean temperature of coldest quarter, precipitation of wettest quarter, precipitation of driest quarter, precipitation of warmest quarter, and precipitation of coldest quarter. The study also included other variables, namely human population density and the normalised difference vegetation index (NDVI). To avoid fitting the models to too many environmental variables, we extracted the environmental values at each presence point and performed Pearson correlation tests to check whether any of the layers were too similar to include in a model together.
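
The correlation screening described above can be illustrated with a minimal sketch. It assumes a greedy rule that keeps a layer only if its absolute Pearson correlation with every layer already kept is at or below a cutoff. The cutoff of 0.8, the function names and the sample values are illustrative assumptions, not figures reported in the study.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def drop_correlated(layers, cutoff=0.8):
    """Greedily keep layers whose |r| with every kept layer is <= cutoff.

    `layers` maps a variable name to the values extracted at the presence
    points. The cutoff of 0.8 is an assumed value, not one from the study.
    """
    kept = []
    for name in layers:
        if all(abs(pearson(layers[name], layers[k])) <= cutoff for k in kept):
            kept.append(name)
    return kept

# Hypothetical values extracted at five presence points (illustration only).
layers = {
    "temp_warmest_quarter": [24.0, 26.0, 28.0, 30.0, 32.0],
    "temp_max": [29.0, 31.0, 33.0, 35.0, 37.0],  # tracks the layer above exactly
    "precip_driest_quarter": [12.0, 3.0, 9.0, 1.0, 7.0],
}
selected = drop_correlated(layers)
print(selected)
```

The greedy rule depends on the order of the layers, so in practice the variables judged most relevant would be listed first.
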
Data Processing: The environmental data used for model development were imported into ArcGIS 10.1, where they were re-projected to the same coordinate system, clipped to an area encompassing the administrative boundaries of Tanzania, resampled to a common pixel resolution of 5 km, extracted to the same dimensions, and converted to ASCII format.
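
For readers unfamiliar with the ASCII raster format mentioned above, the following sketch shows the layout of an Esri ASCII grid: a six-line header followed by rows of cell values. It is not the ArcGIS workflow used in the study. The function name, the corner coordinates and the cell values are hypothetical, and the 5000 cell size assumes a projected coordinate system in metres.

```python
def to_ascii_grid(rows, xll, yll, cellsize, nodata=-9999):
    """Serialise a 2-D list of values as Esri ASCII grid text.

    rows[0] is the northernmost row; None marks cells without data.
    """
    header = [
        f"ncols {len(rows[0])}",
        f"nrows {len(rows)}",
        f"xllcorner {xll}",
        f"yllcorner {yll}",
        f"cellsize {cellsize}",
        f"NODATA_value {nodata}",
    ]
    body = [
        " ".join(str(nodata if value is None else value) for value in row)
        for row in rows
    ]
    return "\n".join(header + body) + "\n"

# Hypothetical 2 x 3 layer with one empty cell and 5 km (5000 m) pixels.
grid_text = to_ascii_grid(
    [[21.5, 22.0, None],
     [23.1, 24.4, 25.0]],
    xll=200000, yll=8700000, cellsize=5000,
)
print(grid_text)
```

All layers fed to the models need identical `ncols`, `nrows`, corner coordinates and `cellsize`, which is what the resampling and extraction steps ensure.
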
Model development: We considered eight modelling algorithms for ENM development. GAM, GLM, GBM, MAXENT, MARS and RF were implemented with the biomod2 package in Revolution R software, SVM with the dismo package, and GARP with Desktop GARP.
Ensemble Model Prediction: An ensemble approach was adopted in our study by combining the eight model outputs through a weighted average, using two thresholds: (i) the 5th percentile of the training presence (5% TP) and (ii) the least training presence (LTP).
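
A minimal sketch of the weighted average and the two thresholds may help make this step concrete. The study does not state how the weights were chosen, so the weights below are placeholders, as are the model scores and presence scores. The percentile uses a simple nearest-rank rule, which is an assumption; modelling software may interpolate differently.

```python
def weighted_ensemble(predictions, weights):
    """Cell-by-cell weighted average of several models' suitability scores."""
    total = sum(weights.values())
    n_cells = len(next(iter(predictions.values())))
    return [
        sum(weights[m] * predictions[m][i] for m in predictions) / total
        for i in range(n_cells)
    ]

def least_training_presence(presence_scores):
    """LTP: the lowest ensemble score found at any training presence point."""
    return min(presence_scores)

def percentile_training_presence(presence_scores, pct=5):
    """pct% TP: score that leaves about pct% of training presences below it."""
    ranked = sorted(presence_scores)
    omitted = int(len(ranked) * pct / 100)  # presences falling below the threshold
    return ranked[omitted]

# Hypothetical scores from three of the models over four grid cells.
predictions = {
    "GLM": [0.2, 0.6, 0.8, 0.4],
    "RF": [0.4, 0.8, 1.0, 0.2],
    "MAXENT": [0.3, 0.7, 0.9, 0.3],
}
weights = {"GLM": 1.0, "RF": 2.0, "MAXENT": 1.0}  # placeholder weights
ensemble = weighted_ensemble(predictions, weights)

# Hypothetical ensemble scores at 20 training presence points.
presence_scores = [round(0.30 + 0.03 * i, 2) for i in range(20)]
ltp = least_training_presence(presence_scores)
tp5 = percentile_training_presence(presence_scores)
print(ensemble, ltp, tp5)
```

Because LTP is the minimum score at any presence point, it can never exceed the 5% TP value, so it classifies at least as much area as present.
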
To estimate the populations at risk of malaria, we reclassified the ensemble model outputs to binary maps (pixel values of 0 for no malaria and 1 for malaria present) using the two thresholds, 5% TP and LTP. ArcGIS tools were used to compute the population and the districts predicted to be at risk of malaria.
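
The reclassification and population count can be sketched as follows. This is an illustration in plain Python, not the ArcGIS procedure used in the study. The ensemble scores, the per-pixel population counts and the two threshold values are hypothetical.

```python
def reclassify(scores, threshold):
    """Binary map: 1 where the ensemble score reaches the threshold, else 0."""
    return [1 if score >= threshold else 0 for score in scores]

def population_at_risk(binary_map, population):
    """Sum the population of every pixel classified as malaria present."""
    return sum(people for flag, people in zip(binary_map, population) if flag == 1)

# Hypothetical ensemble scores and population counts for four pixels.
scores = [0.325, 0.725, 0.925, 0.275]
population = [1000, 5000, 12000, 800]

# Hypothetical threshold values standing in for LTP and 5% TP.
at_risk_ltp = population_at_risk(reclassify(scores, 0.30), population)
at_risk_tp5 = population_at_risk(reclassify(scores, 0.33), population)
print(at_risk_ltp, at_risk_tp5)
```

Each threshold yields its own binary map, and therefore its own population-at-risk estimate.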

Results

The overall contribution of each environmental variable to all the models ranged from 2% to 62%. Population density was the main variable influencing the potential distribution of malaria in all the models. Relative humidity contributed 10.5% to the model, followed by altitude (10%) and precipitation of driest quarter (5.4%). The other variables had less influence. The prediction maps revealed that almost the whole country is endemic for malaria, although the probability of malaria presence varies spatially. All the models depicted a high probability (0.5 or greater) of malaria occurrence in the east and on the south coast of the Indian Ocean, in the northern regions and along Lake Victoria. The models depicted a medium probability of malaria occurrence in the central and western regions. The ensemble model at the 5% TP threshold showed high occurrence of malaria in the east, along the coast of the Indian Ocean, in the northern regions and along Lake Victoria, with a pattern extending from the east to the centre, and low occurrence from the centre to the west and in the southern parts of the country.
The ensemble model's future (2050) prediction at the 5% TP threshold showed an expected increase or shift of malaria occurrence in the northern part and towards the central part of the country. A high percentage of malaria occurrence is predicted in the southern highlands and southern regions of the country. Some areas in the central regions and in the west of the country are predicted to have a low percentage occurrence. Areas in the north, around Lake Victoria and along the coast of the Indian Ocean are predicted to maintain the highest percentage of malaria occurrence.
The current population at risk of malaria is estimated at 29 and 34 million, and this could rise in the future to 81.58 and 93.7 million. About 79% of the districts are at high risk of malaria, and this is predicted to increase to 84% in the future.

Conclusion

A link between climate change and malaria has been described previously; temperature and rainfall in particular are cited as the major variables contributing to malaria distribution. The present study, however, shows a smaller contribution of temperature and rainfall to the models than of population density, which made the largest contribution. This suggests that (i) population density is the key variable in malaria distribution and (ii) malaria is not necessarily driven by climate variables, as they may play a smaller role in determining the ecological niche and hence the potential distribution of malaria. Nevertheless, despite the strong influence of population density in our model outputs, it is clear that population density, environmental variables and other factors (beyond those we used) will need to be included in studies attempting to model malaria endemicity.
Our findings showed a high percentage of area predicted by the ensemble for both the current and future (2050) periods, whereas the individual models predicted smaller areas. The results suggest that ensemble model predictions are more robust than the predictions from individual models.
An important implication of our model is that the predicted distribution of malaria in the various districts in Tanzania can inform the selection of locally appropriate control interventions. The malaria control program can plan better for the distribution of resources by specifically focusing on the areas predicted to be at high risk.