Following NASA's prediction of sunspot numbers for the current sunspot cycle, Cycle 24, we now include sunspot numbers as an explanatory variable in a statistical model. This model is based on fitting monthly rainfall values with factors and covariates obtained from solar-lunar geometry values and sunspot numbers. The model demonstrates high predictive skill in estimating monthly values by achieving a correlation coefficient of 0.9 between model estimates and the measurements. Estimates for monthly total rainfall for the period from 1901 to 2020 for Kenya indicate that the model can be used not only to estimate historical values of rainfall, but also to predict monthly total rainfall. We have found that the 11-year solar sunspot cycle has an influence on the frequency and timing of extreme hydrology events in Kenya, with these events occurring every 5±2 years after the turning points of sunspot cycles. While solar declination is the major driver of monthly variability, sunspots and the lunar declinations play a role in the annual variability and may have influenced the occurrence of the Sahelian drought of the mid-1980s that affected the Sahel region including the Greater Horn of Africa. Judging from the reflection symmetry, the trend of the current maximum and the turning point of the sunspot minimum at the end of the Modern Maximum, with a 95% level of confidence, drought conditions similar to those of the early 1920s may reoccur in the year 2020±2.

We begin by describing briefly the three predictors we have used in this study. These are monthly values of the sunspot numbers, maximum lunar declination and solar declination.

Sunspot numbers

The number of sunspots appearing on the solar surface has been recorded each month through observations and calculation for a long time. Currently, sunspot numbers are clearly headed towards a minimum given the trends and the near symmetry of the current maximum, typically referred to as Modern Maximum, which comprises Cycles 17 to 23 (Figure 1). The current cycle, Cycle 24, will probably mark the end of the Modern Maximum, with the sun switching to a state of less strong activity. While there are three main groups of prediction methods according to Kristof1 - precursor methods, extrapolation methods and model-based predictions - the National Aeronautics and Space Administration (NASA) and the Solar Influences Data Analysis Center (SIDC)2 have finally used the precursor method and made their predictions for Cycle 24. The smoothed sunspot numbers and predictions can be seen in Figure 1.

Sunspot numbers have been associated with a change in climate, including severe climatic conditions during the Maunder Minimum - the period 1640-1705 which was characterised by a conspicuous lack of sunspots.3 Total solar irradiance increases when the number of sunspots increases. Total solar irradiance is higher at solar maximum, even though sunspots are darker (cooler) than the average photosphere. Meehl and Arblaster4 analysed sea surface temperatures from 1890 to 2006. They then used two computer models from the US National Center for Atmospheric Research to simulate the response of the oceans to changes in solar output. They found that as the sun's output reaches a peak, the small amount of extra sunshine over several years causes a slight increase in local atmospheric heating, especially across parts of the tropical and subtropical Pacific where sun-blocking clouds are normally scarce. The small amount of extra heat leads to more evaporation, producing extra water vapour. In turn, moisture is carried by trade winds to the normally rainy areas of the western tropical Pacific, fuelling heavier rains.

In 2008, White and Liu5 provided evidence that the 11-year solar cycle may be the trigger for El Nino and La Nina events by using harmonic analysis on observed and model data. A model such as the one developed in this study captures interannual rainfall variability by involving sunspot numbers as predictors. The dotted line in Cycle 24 represents NASA's predicted sunspot numbers for 2012-2020.

Sunspot Cycle 24 is the last cycle of the current maximum while the dotted line shows the sunspot numbers that NASA have predicted for 2013-2020. The current prediction for Sunspot Cycle 24 gives a smoothed sunspot number maximum of about 69 in 2013. Hathaway et al.'s6 method of predicting the behaviour of a sunspot cycle is fairly reliable once a cycle has reached about 3 years after the minimum sunspot number occurs.

Maximum lunar declination

Varying angular lunar velocity caused by the lunar node cycle is considered likely to influence natural forcing of the El Nino Southern Oscillation (ENSO) by lunar tidal forces. Cerveny and Shaffer7 examined a possibility that lunar tidal forces act as an external forcing mechanism in regulating sea surface temperatures tied to ENSO events. They obtained a statistically significant correlation between maximum lunar declination (MLD) and both equatorial Pacific sea surface temperatures and South Pacific atmospheric pressure (the Southern Oscillation Index) for the period 1854-1999. High MLDs were associated with La Nina conditions, while low MLDs were associated with El Nino conditions. Under high MLD, circulation of the Pacific gyre is enhanced by tidal forces, inducing cold-water advection into the equatorial region that is characteristic of La Nina conditions. Under low MLD, tidal forcing is weakened, cold-water advection is reduced, and warmer sea surface conditions characteristic of El Nino prevail. Together with the solar cycle, MLDs are used to capture interannual rainfall variability.

Solar declination

Earth's axis of rotation is tilted 23.5° away from the plane perpendicular to Earth's orbit while its axis points in the same direction as Earth orbits the sun. Therefore, the solar declination angle determines the seasons, which are characterised by varying solar irradiance, varying length of day and an annual rainfall pattern. Solar declination was used to generate seasonal variation of rainfall.

Materials and methods

Data

The Kenya Meteorological Service8 (KenMet) supplied the rainfall data (rain gauge measurements) taken at Dagoretti and Jomo Kenyatta Airport from 1959 to 2005. The Climate Research Unit of the University of East Anglia (UK) provided research data sets for the Kenyan region.9 We extracted monthly and annual rainfall totals from 1901 to 2000. The country aggregation is based on the TYN_CY_1_1 data set. This data set is referred to here as CRU.K.

NASA provided solar and lunar declination values obtained from their ephemeris software.10 The National Oceanic and Atmospheric Administration's National Geophysical Data Center provided sunspot numbers, including NASA's prediction. The international sunspot number is produced by SIDc2 at the World Data Center for the Sunspot Index at the Royal Observatory of Belgium.

The model

Model SMS12.12 is of the form:

Response variable ~ predictor(s) Equation 1 in which the response variable is the monthly total rainfall and the three predictors are solar declination, maximum lunar declination and sunspot numbers. This model design is based on fitting a generalised linear model (GLM) of the Tweedie11 family to Kenya's monthly total rainfall distribution values from 1951 to 1980. A fitting procedure involved obtaining beta values which satisfy the linear equation:- where yi is the estimated monthly total rainfall values for each of the three predicting factors, x, and ej is the fit error in the estimate. In this study, we fitted first-order factors and therefore T=1. Statistical software was used to fit a GLM and obtain the initial estimate of beta values for the fit. The fitting procedure comprised two main steps:

Step 1: Fitting a GLM

We computed an initial estimate set of beta values using the GLM of the Tweedie family. This family of exponential dispersion models is characterised by the power mean-variance relationship:

Thus variance V is a function of the mean (µ) and p is the power variance of the Tweedie distribution calculated by means of a routine in an R-program called tweedie.profile. To specify the Tweedie, the mean (µ), the dispersion parameter (Φ) and the variance power (p) are required. Standard algorithms in R-software calculate µ and the maximum likelihood estimate is used to work out Φ and p. A GLM fit on the rainfall distribution obtains initial estimates for a fit parameter, a, and a dispersion parameter, Φ. At this point it is possible to use these beta values to calculate rainfall estimates for the GLM fit. However, rainfall data is correlated and therefore it is necessary to fit a generalised estimating equation to account for the correlation within the variable being fitted.

Step 2: Fitting a generalised estimating equation

To fit a generalised estimating equation, it is necessary to use a correlation matrix which best describes the manner of correlation to calculate new beta values. In this case we used estimates obtained in Step 1 for the fit parameter a, and the correlation matrix AR(1) which is defined as α|u-v|

In matrix notation this becomes

New beta estimates are thus obtained which are then used to estimate monthly totals by use of Equation 1.

Results and discussion

Model results

Model SMS12.12 was trained on a 30-year CRU.K data set (1951-1980) and tested on two segments of data: 1901-1950 and 1981-2000. Predictors, solar declination, maximum lunar declinations and sunspot numbers used were mean values for each month. Figure 2 shows how SMS12.12 demonstrates prediction stability with time.

Methods used for avoiding artificial prediction skill included using independent training and test data sets, cross-validation and hindcasting. Forecast skill depends on the amount of lead time, the number of forecast months and the strength of the relationships between estimates and rainfall records. Each value plotted in Figure 2 represents a Pearson product-moment correlation coefficient between estimate and CRU.K value for corresponding months in the year. Correlation values remained above 0.5 throughout the 100-year period, except for 1925 (Figure 2). The probability of obtaining a correlation value above 0.5 when the model is used in estimating projected values is therefore 99 out of 100 years (0.99). An adjusted R2-value of 0.62 was obtained between CRU.K and model estimates during the training period and reduced values of 0.52 and 0.56 were obtained from the test data sets. SMS12.12 shows stability in estimating monthly total rainfall when model monthly estimates are compared with corresponding CRU.K values. The average correlation for the period 1901-2000 is 0.8. The contributions of the individual predictors to the variability of the predictand were 59.7% for solar declination, 9.4% for sunspot numbers and 8.9% for maximum lunar declination. Thus solar declination played the dominant role in monthly rainfall variability. The remaining 22% of the variability remains unexplained.

Monthly rainfall projection

SMS12.12 was then used to estimate monthly rainfall totals for the period 1901-2020. Figure 3 shows monthly estimates so obtained.

Monthly totals were then aggregated into annual values and results standardised by mean and standard deviation. The results are shown in Figure 4, in which model results have been plotted together with CRU.K values for comparison. Model SMS12.12 estimates indicate elevated monthly totals (>+1 standard deviation) in the periods 1912-1913, 1931-1932, 1951-1952, 1987-1988, 1993-1994, 1997-1998 and 2005-2006, and depressed monthly rainfall (<-1 standard deviation) in 1917, 1937-1938, 1947-1948, 1982-1985,1992-1993, 2002-2004, 2010-2011 and 2019-2020.

Model results were then validated with records from the United Nations Development Programme.12 Below average annual total rainfall was reported to have occurred in 1928, 1933-1934, 1937, 1939, 1942-1944, 1947, 1952-1953, 1955-1957, 1975-1977, 1980-1985, 1991-1992, 1999-2000 and 2004. Other below average values were recorded by KenMet in 1965, 1973-1974, 1976 and 1992-1993. Floods recorded by KenMet occurred in 1961, 1963, 1977-1978 and 1997-1998. Projected model estimates indicate below normal rainfall in 2009-2011, 2015 and 2019-2020, with values within one standard deviation. Above normal rainfall may be expected in 2012-2014, 2016 and 2018, with values within two standard deviations. Estimates were calculated at a 95% confidence level.

Model diagnostics

Probabilities of rainfall volumes were calculated in order to judge the accuracy of the model estimates. The results are shown in Figure 5. Estimates are comparable at all stages of model development as shown by hindcasting, training (fitting) and forecast stages as well as with the 1901-2000 climatology. Correlation values between the model and CRU.K data are shown in Table 1 for the hindcast, training and forecast stages. Model estimates are therefore reliable.

Sunspot numbers and annual rainfall

Rainfall for Kenya in the Modern Maximum indicates a peak trend that corresponds to that of sunspots, as shown in Figure 6. Figure 6 shows standardised values of annual totals of rainfall and smoothed sunspot numbers. A best fit trend line of peak annual rainfall is also shown. The trend has a peak in the 19th sunspot cycle centred around 1961. Variability in annual rainfall shows reflection symmetry in the year 1961, such that Cycles 18 and 20 are object and image, respectively, in Figure 6.

We refer to events occurring prior to 1961 as objects of corresponding events after 1961, which are images. Object and image pairs labelled c, d, e and f are cycle pairs: 17 and 21, 16 and 22, 15 and 23, and 14 and 24. Object and image cycle pairs have similar rainfall peak amplitudes. Reflection symmetry demands that if sunspot turning points lead rainfall events in the objects side, the reverse will happen in the image side. While the cause of the distribution symmetry is still under investigation, it is what is observed from Figure 6. At least three sunspot turning points are outstanding. The first one is the heavy rainfall of the early 1960s corresponding to the maximum in Cycle 19, the second is the drought of the mid-1970s and the minimum between Cycles 20 and 21, and the third corresponds to the great Sahelian drought after the passing of the minimum between Cycles 21 and 22. From Figure 6 one can identify a turning point for each event of severe hydrology in Kenya, suggesting that sunspot numbers had an influence on rainfall as was found by Meehl and Arblaster4. Now that sunspots are headed for a minimum at the end of the Modern Maximum, one may expect fewer events of high rainfall- and perhaps a prolonged drought of the Sahelian type. Judging from the symmetry of the Modern Maximum, a drought of the type experienced in the early 1930s will most likely occur in 2020±2 after the passage of the current Cycle 24. This observation is also consistent with model SMS12.12 results as shown in Figure 4. Because Kenya's rainfall is influenced by the Sahel climate, it is likely that the decline in rainfall volumes may be experienced in the Eastern Africa region and perhaps the Sahel region, including the Greater Horn of Africa.

Summary and conclusions

This study was motivated by the desire to find out the physical causes of the Kenyan droughts of the early 1980s and at the turn of this century. The temporal distribution of sunspot numbers indicates that each turning point corresponds to events of severe hydrology in Kenya with a time lag of 5±2 years. Therefore, such events are predictable so long as sunspots can be predicted in advance. However, the prediction of sunspots has not been easy and the current prediction of Cycle 24 appears to be at the end of the Modern Maximum, therefore breaking the continuity. The current maximum is fairly symmetrical, increasing the confidence that sunspot activity is headed for an all time low, perhaps similar to the one at the beginning of the last century with a corresponding reduction in annual rainfall volumes. It is therefore likely that Kenya will experience reduced rainfall at the turn of Cycle 24 and around the year 2020±2.

Model estimates indicate that before 2020, above average rainfall may be expected in the period 2013-2018 and below normal rainfall in 2019-2020. No sunspot numbers are available to enable estimation beyond 2020 using the model; in addition, the behaviour of sunspots is uncertain beyond Cycle 24. However, as we head towards 2020, it is likely that the evolution of sunspots will occur in a predictable pattern so that sunspot prediction will be possible. However, this observation cannot yet be assumed for global data sets. Furthermore, we recommend that future studies be done on rainfall residuals so that the seasonality factor is eliminated and a better indication of the influence of sunspot numbers can be obtained. A comparison of the results with those obtained through statistical downscaling methods is also recommended.

Acknowledgements

We thank the following for providing data: the Kenya Meteorological Department (Nairobi, Kenya), the National Oceanic and Atmospheric Administration and the National Aeronautics and Space Administration (USA), the Climate Research Unit of the University of East Anglia (UK), and the Solar Influences Data Analysis Center of the Royal Observatory of Belgium. We also thank the following for providing software: R Foundation for Statistical Computing at the National Centre for Atmospheric Research and Unidata for the Integrated Data Viewer at the Program Center, University Corporation for Atmospheric

Research (Boulder, CO, USA), ConvexDNA for the Excel Mixer (Geneva, Switzerland), and Fourmilab for HomePlanet (Switzerland).