Abstract

Timings of human activities are marked by circadian clocks, which in turn are entrained to different environmental signals. In an urban environment the presence of artificial lighting and various social cues tends to disrupt the natural entrainment with sunlight. However, it is not completely understood to what extent this is the case. Here we exploit large-scale data analysis techniques to study the mobile phone calling activity of people in large cities and infer the dynamics of urban daily rhythms. From the calling patterns of about 1,000,000 users spread over different cities but lying inside the same time zone, we show that the onset and termination of the calling activity synchronize with the east-west progression of the sun. We also find that the onset and termination of the calling activity of users follow yearly dynamics, varying across seasons, and that their timings are entrained to solar midnight. Furthermore, we show that the average mid-sleep time of people living in urban areas depends on the age and gender of each cohort as a result of biological and social factors.

Author summary

For humans living in urban areas, modern daily life is very different from that of people who lived in ancient times, from which today’s societies evolved. Mainly due to the availability of artificial lighting, modern humans have been able to modify their natural daily cycles. In addition, social rules, like those related to work and schooling, tend to require specific schedules for daily activities. However, it is not fully understood to what extent the seasonal changes in sunrise and sunset times and the length of daylight influence the timings of these activities. In this study, we use a new approach to describe the dynamics of human resting periods in terms of mobile phone calling activity, showing that the onset and termination of the resting pattern of urban humans follow the east-west sun progression inside the same time zone. We also find that the onset of the low calling activity period, as well as its mid-time, is subject to seasonal changes, following the same dynamics as solar midnight. Moreover, with resting time measured as the low activity periods of people in cities, we discover significant behavioural differences between different age and gender cohorts. These findings suggest that the length and timings of human daily rhythms still have a sensitive dependence on the seasonal changes of sunlight.

Data Availability: The original dataset comprises call detail records of individuals whose actual identities are not known to us; the service provider replaced them with uniquely hashed identifiers, so the dataset was fully anonymized prior to being given to us. Each record in this dataset pertains to a call in which two individuals participated, and contains the precise time and duration of the call. The metadata also includes age, gender and postal code information of each anonymized subscriber. In addition, we had to sign a Non-Disclosure Agreement not to share the original data. A processed, aggregated dataset, comprising the probability distributions of finding a call at each time snap, for every day and every city included in this study, and sufficient to fully reproduce the results reported here, has been deposited in a public data repository: https://doi.org/10.5281/zenodo.1020613.

Funding: This work was supported by Project COSDYN, Academy of Finland (Project No. 276439); EU HORIZON 2020 FET Open RIA project (IBSEN) No. 662725; CONACYT, Mexico, Grant No. 383907; and European Research Council for the Advanced Investigator Grant No. 295663. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

The daily activity of people varies across space and time, from place to place, date to date, and hour to hour, as a result of biological, societal, economic, and environmental factors, shaping the society where they live. Roughly speaking, each day humans do certain activities at specific times. There are many environmental factors (cues or ‘zeitgebers’) involved in the entrainment of the circadian clock that governs these timings, but as pointed out by Roenneberg et al. [1], the most dominant is light, associated with the light-darkness cycle determined by the daily rhythm of daylight. However, mainly in places not close to the equator, the timing and duration of daylight are subject to noticeable seasonal variation due to the yearly movement of the Earth around the Sun, and these changes have a direct influence on the kind and timing of different human activities. On the other hand, humans living in urban areas are also immersed in an environment full of cues that could influence the entrainment of the circadian clock. Artificial lighting, social practices and schedules (work and school hours, workdays vs weekends), particularly for those living in big urban areas, could have a noticeable influence on the entrainment process. Social conventions impose characteristic schedules on individuals, and at the population level we can expect people in urban areas to have periods of high activity between morning and evening, and periods of low activity (resting) during the night. The length and timings of human activity periods, specifically in urban areas, have important consequences for human health [2–5], economy and power consumption [6], and public transportation efficiency [7].

The human sleep-wake cycle (SWC), and its dynamics in particular, has been studied in recent years to understand the processes and cues that govern it [8]. Generally speaking, most research on the human SWC has focused on experiments with small groups under controlled conditions [9, 10], or questionnaire studies [1, 11–14] (mainly using the Morningness-Eveningness Questionnaire (MEQ) [15] and the Munich Chronotype Questionnaire (MCTQ) [16]). The use of these tools for studying the SWC has proved to be very fruitful and effective, though with some limits on the domain of applicability [14]. In contrast, the ever-increasing availability of information communication technologies (ICT) combined with researchers’ ability to access large-scale ICT-generated datasets (‘Big data’) has made possible the study of human behaviour using a variety of reality (data) mining techniques. In particular, there are a number of examples where mobile phone datasets have been analyzed to study social networks [17–20], sociobiology [21, 22], mental health [23], mobility [24–27], as well as social behaviour of cities [28, 29]. Over the past decade or so, the existence and accessibility of these large population-level datasets have allowed scientists to study intrinsic human behavioural and socio-evolutionary patterns in unprecedented ways that complement other research approaches.

Recently, datasets of mobile phone usage have also been used to study circadian rhythms, by analyzing individuals’ mobile phone usage from the data captured by sensors [26, 30–35], or people’s communication patterns from their call detail records (CDRs) [31–33]. For example, one study used the mobile phone screen on-off sensor data to examine the sleep-wake cycle of nine individuals, finding that most of the individuals varied their sleep time patterns between weekdays and weekends, as well as showing seasonal changes in their mid-sleep time [30]. In another study using mobile phone calls and text messages of a small number of individuals, it was shown that individuals can be classified as having morning type or evening type activity levels [31]. In our previous related work [33], we quantified the resting periods of people from their mobile phone calling activity, showing that there is a counterbalancing effect between the afternoon and night time resting periods, due to an interplay between ambient temperature and sunlight. The use of CDRs as a tool for investigating the sleep/wake circadian rhythm is, in our view, a promising new line of research that complements the other research approaches, especially the large-scale survey-based studies pioneered by Roenneberg et al. [11–13].

In this study, we apply reality mining techniques to users’ call records in a mobile phone communication network to study the dynamics of the users’ calling patterns by focusing on the periods of low activity, i.e. when almost no calls are made. Users of the mobile phone network typically have specific time periods during which their calling activity ceases, and we may assume that the SWC is bounded inside this period of inactivity. We observe that the daily calling activity time displays interesting dynamics across the year, through seasons and along different geographical zones. By studying these patterns we can gain insights into human activity patterns, and the SWC in particular. Interestingly, the calling activity pattern changes with the day of the year and is also found to depend on the geographical location (latitude and longitude) of the mobile phone user. Of the circadian clocks involved in the daily rhythms of human societies, only those entrained to solar-based events depend on both the geographical location and the day of the year.

In this work, we use mobile phone calling activity at the population level to study how the onset and termination of the urban human activity in different cities is synchronized with the East-West progression of the Sun. Also, we analyzed the annual progression of the onset and termination of the calling activity, finding that they show a strong seasonal variation. We note that this behavior is similar to the annual dynamics of solar midnight, inferring that solar midnight is an important cue entraining the human circadian clock. Finally, we determine the mid-time of the period of low calling activity, which is bounded between the termination of calling activity each day and its onset on the next day. We interpret this mid-time to correspond to the mid-sleep time, and show that it is strongly dependent on the age and gender of the individuals in the population.

Results

Using an anonymized dataset containing details of mobile phone communication of subscribers of a particular operator in a European country described in detail in the Methods section, we investigate the calling activity of the urban population living in cities as a function of time of the day for all the dates during the year. This we do by calculating for each city the probability distribution Pall(t, d) for finding an outgoing call at time t of a day d = (1,…,365) of the year. For all the studied cities, a region of almost null activity can be found around 4:00 am. Using this natural bound to split the calling activity from one day to another, we define a ‘day’ starting from 4:00am of a calendar day and running to 3:59am of the next calendar day.
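
As an illustration of the day definition above, the following minimal Python sketch bins call timestamps into an empirical distribution over a 'day' that starts at 4:00 am. The function names and the 10-minute bin width are assumptions made for this example; the bin width actually used is not stated in this section.

```python
from collections import Counter
from datetime import datetime, timedelta

DAY_START_HOUR = 4   # a 'day' runs from 4:00 am to 3:59 am, as defined in the text
BIN_MINUTES = 10     # assumed bin width, chosen only for illustration

def shifted_day_and_bin(ts):
    """Map a call timestamp to (day of year d, time bin t) for days starting at 4:00 am."""
    shifted = ts - timedelta(hours=DAY_START_HOUR)
    minutes_since_4am = shifted.hour * 60 + shifted.minute
    return shifted.timetuple().tm_yday, minutes_since_4am // BIN_MINUTES

def p_all(call_times, day):
    """Empirical probability of finding an outgoing call in each time bin of day d."""
    counts = Counter(b for d, b in map(shifted_day_and_bin, call_times) if d == day)
    total = sum(counts.values())
    return {b: n / total for b, n in sorted(counts.items())} if total else {}
```

With this convention a call made at 2:00 am on 3 August is counted in day d = 214 (2 August), while a call at 3:00 am on 2 August still belongs to day d = 213.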

In Fig 1 we show Pall(t, d) (green line) during days d = 214−215 (marking early August) for a city with over 500,000 inhabitants. The distribution Pall(t, d) has two high calling activity periods, with the first one corresponding to the morning calls, peaking around noon, and the second related to the evening calls, peaking around 8:00 pm. This bimodal pattern is present every day across the year and in all the cities included in this study. The high calling activity periods are delimited by two periods of low activity, one centered around 4:00 pm related to the time after lunch, and the second one in the middle of the night, around 4:00 am, within the sleeping period. The pattern present in Pall(t, d) is similar to that reported in other studies using different CDRs [29, 36], mainly at the times when the calling activity starts and ceases. In ref. [29], where the calling activity of some Spanish cities was studied, the histogram of the number of active users at each time has a similar bimodal shape, with similar times for the onset and termination, as well as a dip in the middle located around the same time period (i.e. between 3:00 pm and 4:00 pm).

Fig 1. Probability distribution for finding a call at time t, for a particular city in 2007.

(green) Distribution when all the calls are included. (red) Distribution when only the last call at night is included (between 5:00 pm and 4:00 am next day). (blue) Distribution when only the first call of the day is included (between 5:00 am and 4:00 pm). The distributions of the last and first calls are sharper and have well-defined maxima.

To study the specific times when the calling activity rises and falls, we analyze the ‘morning’ and ‘night’ periods separately, defining the former between 5:00 am and 3:59 pm, and the latter between 5:00 pm and 3:59 am on the following calendar day, in such a way that each period is 11 hours long. During each ‘morning’, we select only the first call made by each user inside that period and construct the associated probability distribution for the time of the first call PF(t, d), directly related to the rise of calling activity. Similarly, during the ‘night’ we define the corresponding probability distribution for the time of the last call PL(t, d) by taking into account only the last call made by each user within that period. Fig 1 shows the three probability distributions Pall(t, d) (green), PL(t, d) (red), and PF(t, d) (blue) for two consecutive days in early August (d = 214−215), for a particular city with a population over 500,000. The shape of the distributions Pall(t, d), PL(t, d), and PF(t, d) depicted in Fig 1 for a specific day appears to be preserved for all the days and cities we have studied.
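
The selection of first and last calls can be sketched as follows. This is a minimal Python illustration and not the processing pipeline used for the results; the function names and the representation of calls as (user, timestamp) pairs are assumptions. Times are expressed in hours elapsed since 4:00 am, so the 'morning' window is [1, 12) and the 'night' window is [13, 24).

```python
from datetime import datetime

def hours_since_4am(ts):
    """Hours elapsed since 4:00 am of the 'day' the call belongs to (0 <= h < 24)."""
    return ((ts.hour - 4) % 24) + ts.minute / 60

def first_and_last_calls(calls):
    """calls: iterable of (user_id, datetime) pairs, all from one 4am-to-4am 'day'.

    Returns two dicts mapping user_id to hours since 4:00 am: the first call
    in the 'morning' (5:00 am to 3:59 pm) and the last call in the 'night'
    (5:00 pm to 3:59 am of the following calendar day).
    """
    first, last = {}, {}
    for user, ts in calls:
        h = hours_since_4am(ts)
        if 1 <= h < 12:        # morning window
            first[user] = min(first.get(user, h), h)
        elif 13 <= h < 24:     # night window
            last[user] = max(last.get(user, h), h)
    return first, last
```

Histogramming the values of `first` and `last` over all users of a city would then give estimates of PF(t, d) and PL(t, d); calls made between 4:00 pm and 4:59 pm, or between 4:00 am and 4:59 am, fall in neither window.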

Urban activity synchronization with East-West sun progression

The mean time of the first call tF and of the last call tL of people in a city can be influenced by environmental, social, and economic factors, and their possible daily value could be distributed completely at random. However, we find that during the year and at different latitudes, despite the different factors influencing the shape of the distribution Pall, the onset and termination of calling activity follows a consistent pattern, and this characteristic behaviour allows us to compare the calling activity pattern of cities lying at different latitudes. If the onset or termination of the urban calling activity is socially driven, with fixed times for specific activities (like office working hours from 9:00am to 6:00pm), one could expect that cities lying in the same time zone and at the same latitudes have similar calling activity timings (onset and termination). However, we find that the onset and termination of calling activity synchronizes with the East-West sun progression, in such a way that cities lying in western locations start (and terminate) their calling activity after cities at eastern locations, with a delay difference corresponding to the time difference between their local meridians. In Fig 2A and 2B we show tL and tF for 5 different cities lying inside a latitudinal band centered at 42°N±40′. The region including the 5 cities spans a longitudinal angle of 10.8°, and by taking one of the cities as a reference, other cities are located at −7.8°, −4.7°, −3.7°, and +3.0° from the reference city marked here with 0.0°. Then we compare the actual distributions PL and PF of the time of the last call and of the first call, respectively, for the 5 cities in the same latitudinal band, and find that PL and PF for western cities seem shifted to later times. 
However, when the distributions are shifted by an amount of time corresponding exactly to the time difference between the local meridian of the corresponding city and that of the reference city, the distributions visibly collapse onto each other, as can be seen in Fig 2C and 2D. In this case, the time shifts are +31.2, +18.8, +14.8, and −12 minutes for the cities located at −7.8°, −4.7°, −3.7°, and +3.0° from the reference city at 0.0°, respectively.
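
The quoted shifts follow from the rotation of the Earth, 360° in 24 hours, i.e. 4 minutes of sun transit delay per degree of longitude. A short Python check is given below; the function name is hypothetical, and the sign convention (negative relative longitude means west of the reference city) is inferred from the values quoted above.

```python
MINUTES_PER_DEGREE = 24 * 60 / 360  # 4 minutes of time per degree of longitude

def sun_transit_shift_minutes(relative_longitude_deg):
    """Delay of the local sun transit relative to the reference city.

    Positive values mean a later transit (city west of the reference).
    """
    return -relative_longitude_deg * MINUTES_PER_DEGREE

for lon in (-7.8, -4.7, -3.7, 3.0):
    print(f"{lon:+.1f} deg -> {sun_transit_shift_minutes(lon):+.1f} min")
```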

Fig 2. Temporal shift of the onset and termination times of the calling activity along geographical longitude.

Probability distributions of the time of the last call PL(t, d) and that of the first call PF(t, d) for 5 different cities lying at the same latitude but at different relative longitudes from a reference point located at the second city from east to west within the band for two consecutive days during the year. The relative longitudes of the cities are -7.8°, -4.7°, -3.7°, 0°, and +3°. (Upper panel) Probability distributions for (A) the time of the last call, and (B) the time of first call. (Lower panel) Probability distributions for (C) the time of the last call, and (D) the time of first call, shifted by a time corresponding to the difference between their local sun transit times (31.2, 18.8, 14.8, and -12 minutes for the cities located at -7.8°, -4.7°, -3.7°, and +3° from the reference city, respectively). The collapse of the distributions onto the reference city’s distribution is evident when the longitudinal time shift is added. This collapse implies that the 5 cities begin (or cease) their calling activity in a way that is synchronized with a temporal phase corresponding to the difference between their sun transit times.

The distribution collapse shown in Fig 2 is obtained by introducing a time shift corresponding to the sun transit differences between cities. In order to quantify the exact delay between the distributions, we calculate the required time shift that should be introduced between the calling distributions to minimize the Kullback-Leibler divergence DKL between them (see the Methods section). This measure is indicative of the similarity between the distributions, and is minimized when they are identical. We extend this analysis to include data from 30 cities, each one lying in one of the four latitudinal bands centered at 37°N (10 cities), 39.5°N (5 cities), 41.5°N (7 cities), and 42.5°N (8 cities). For each band, we choose one city lying near the mid point of the band as the reference, and calculate for all the cities in the band the average time shift between them and the reference city. This is done for each day of the week, averaging over 52 weeks of the year 2007. The results are shown in Fig 3, and it can be seen that the time shift that minimizes the divergence between the distributions corresponds to the delay between their local sun transit times. This synchronization appears stronger for the termination of the calling activity (represented by the distributions PL). As this pattern is consistently present in all of the four analyzed latitudinal bands, we conclude that it is a general behaviour of the population living in the cities. This result is consistent with those reported by Roenneberg et al. [12], obtained from MCTQ studies of people in Germany, distributed over a region that is 9° wide longitudinally. In their work, they take into account the population of the city by defining three population size categories, i.e. less than 300,000 inhabitants, between 300,000 and 500,000 inhabitants, and more than 500,000 inhabitants, while we classify each city of more than 100,000 inhabitants according to its latitudinal coordinate. 
Grouping the cities into latitudinal bands, we found a consistent entrainment to the East-West progression of the Sun, regardless of the population size of each city.
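
A minimal sketch of this kind of shift estimation is given below, assuming that both distributions are sampled on the same equally spaced time bins and that the shift is an integer number of bins. The helper names, the small regularization constant `eps`, and the circular shift are choices made for this illustration; the exact procedure is the one described in the Methods section.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """D_KL(p || q) for two discrete distributions defined on the same bins."""
    p = [x + eps for x in p]   # eps avoids log(0) in empty bins (assumption)
    q = [x + eps for x in q]
    sp, sq = sum(p), sum(q)
    return sum((a / sp) * math.log((a / sp) / (b / sq)) for a, b in zip(p, q))

def roll(values, n):
    """Circularly shift a list by n bins towards later times."""
    n %= len(values)
    return values[-n:] + values[:-n] if n else list(values)

def best_shift(p_ref, p_city, max_shift_bins):
    """Delay n (in bins) of p_city relative to p_ref that minimizes D_KL.

    The city distribution is moved n bins earlier before comparison, so a
    positive result means the city is delayed with respect to the reference.
    """
    shifts = range(-max_shift_bins, max_shift_bins + 1)
    return min(shifts, key=lambda n: kl_divergence(p_ref, roll(p_city, -n)))
```

The circular shift is harmless as long as both distributions are close to zero at the edges of the analysis window, which is the case for distributions as sharply peaked as PL and PF in Fig 1.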

Fig 3. Time progression of the onset and termination of the calling activity along the geographical longitude.

Temporal progression of the onset and termination of the calling activity for cities lying at different geographical longitudes. The time shift n*Δ shown is the one that minimizes the divergence between the probability distribution of the first call PF in a reference city and the corresponding distributions of the other cities lying at the same latitude. Four different bands are analyzed, centred at 37.5°N, 39.5°N, 41.5°N, and 43.1°N. For each city inside each band, the time shifts n*Δ for the 7 days of the week are shown as the set of 7 points with the same color, located at the corresponding time difference between the local meridian of each city and that of the reference. The dashed line represents the time shift between the sun transit time at the reference city and a hypothetical point located at each corresponding longitude. The error bars represent the standard deviation from the average value for each day of the week. From the plot it can be seen that, for cities lying further away from the reference city, a bigger time shift is required to collapse the distributions.

This result implies that the termination (last call of the day) and onset (first call of the next day) of calling activities in cities at similar latitudes follow an external cue driven by solar events, and the time difference in these solar events between two different cities is reflected in the timings of their calling activity.

Entrainment of urban calling activity with sun-based cues

We have shown that the cities located at the same latitude but at different longitudes have periods of low calling activity with different onset and termination times (Figs 2 and 3). This shift coincides with the difference between their local sun transit times, i.e. when the sun crosses the meridian of the city. This observation raises the question as to what external daily event induces such synchronization. As the delays correspond to the time period between the local sun transit times of the cities, it seems plausible to think that the sun functions as a cue for this entrainment.

At the latitudes where the studied cities are located, the time difference between the sunset in the summer and in the winter is around 3 hours, if daylight saving is not taken into account, and the same holds for the time difference between sunrises. In contrast, the difference in the mean time of the last calls between summer and winter is at most one hour [33]. However, there is a clear synchronization between the sun transit time and the timings of calling activity. This means that there should be an external clock functioning as a cue. On the other hand, from a biological perspective, the time when the secretion of melatonin reaches its maximum [37] lies close to the midpoint between sunset and sunrise (i.e. solar midnight), when the night is as dark as possible. It has been proposed that the mid-sleep time coincides with the time corresponding to maximum melatonin secretion [38, 39], and if the solar midnight shifts through the year, the time for the maximum melatonin secretion should follow a similar pattern, as well as the entrained mid-sleep time.

In their study, Allebrandt et al. [40], using the MCTQ approach, have reported that mid-sleep time (on free days) changes from one season to another. In some of the studied populations, they found that there is a small but significant difference in the average mid-sleep time between the days when Daylight Saving Time is applied and other days. This lends support to our assumption that if the mid-sleep time shifts in response to seasons, the timings of the calling activity should be influenced by its variation. In such a case, when the human mid-sleep time occurs at later hours, the timings of the calling activity for the following days should also occur at later hours. In other seasons, when the mid-sleep time occurs earlier, the activity timings should also be shifted towards earlier hours. If this is the case, then solar midnight should be functioning as the cue to which the calling activity timings are entrained. The activity pattern is a consequence of the interplay between seasonal and geographical factors, as well as social and societal activities like work and/or school, transportation, eating and leisure activities. However, the latter require specific timings during the day, not necessarily controlled by the sleep/wake cycle. We have shown elsewhere [33] that the total period of low calling activity (that is, the period between the termination and the onset of the calling activity) is strongly correlated with the duration of daylight, showing seasonal changes similar to the mid-sleep time.

In order to find any possible synchronization between the onset (and termination) of calling activity and solar midnight, we calculate the average of the mean times of the last call, ⟨tL⟩, and that of the first call, ⟨tF⟩, for three sets of cities located at the latitudinal bands ϕ = 37°30′N (seven cities), 40°20′N (six cities), and 43°0′N (eight cities). We compare ⟨tL⟩ and ⟨tF⟩ with the yearly evolution of the solar midnight in a reference city within a given latitudinal band (see Fig 4). A detailed description of how ⟨tL⟩ and ⟨tF⟩ are calculated can be found in the Methods section. It can be seen that only ⟨tL⟩ resembles to some extent the dynamics of the solar midnight, with its two minima and at least one of its maxima occurring around the same days as those of solar midnight, although the relative amplitudes are not in correspondence. In addition, the discontinuities introduced by daylight saving are visible in all the graphs, suggesting that the timings of the calling activity are not solely influenced by the socially-driven time, but instead are synchronized with an external (astronomical) clock.
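
For orientation, the yearly drift of solar midnight in clock time can be approximated with a standard textbook formula for the equation of time. The Python sketch below uses that approximation and treats solar midnight as solar noon plus 12 hours. It is an illustrative assumption and not necessarily how the solar midnight curve in Fig 4 was obtained; the function names and parameters are hypothetical.

```python
import math

def equation_of_time_minutes(day_of_year):
    """Common approximation of the equation of time (apparent minus mean solar time)."""
    b = math.radians(360.0 / 365.0 * (day_of_year - 81))
    return 9.87 * math.sin(2 * b) - 7.53 * math.cos(b) - 1.5 * math.sin(b)

def solar_midnight_hours(day_of_year, longitude_deg, utc_offset_hours):
    """Approximate clock time (in hours) of solar midnight.

    longitude_deg is positive towards the east; utc_offset_hours is the civil
    time offset in force, so it changes by one hour when daylight saving applies.
    """
    solar_noon = (12.0 + utc_offset_hours - longitude_deg / 15.0
                  - equation_of_time_minutes(day_of_year) / 60.0)
    return (solar_noon + 12.0) % 24.0
```

In this approximation the clock time of solar midnight varies by about half an hour over the year, and a change of `utc_offset_hours` by one hour reproduces the kind of discontinuity that daylight saving introduces.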

Fig 4. The yearly evolution of the time of the first call and that of the last call compared against the yearly shift of the solar midnight.

(Top sets) ⟨tF⟩, the average of the mean time of the first call of 3 sets of cities located at latitudinal bands centred at ϕ = 37°30′N (blue), 40°20′N (green), and 43°0′N (red). (Bottom sets) ⟨tL⟩, the average of the mean time of the last call for the same sets of cities. In the middle of the panels, the solar midnight time in one of the cities within the band is shown. The shape of ⟨tL⟩ resembles to some extent the graph of the solar midnight, coinciding with the two minima (for days 130 and 302) and one of the maxima (for day 210). For the case of ⟨tF⟩, the graph shows some correspondence with the sunrise, although to a lesser extent. The discontinuities introduced by daylight saving show in the graphs, suggesting that the period of low calling activity is not solely influenced by the socially-driven time, but is synchronized with an external (astronomical) event. The numbers of cities inside the bands ϕ = 37°30′N (blue), 40°20′N (green), and 43°0′N (red) are 7, 6, and 8, respectively.

Age and gender dependence of the mid-sleep times

The period of low calling activity is bounded by the mean times of the last call during the night and of the first call in the morning. The duration of this period changes across seasons [33] and is strongly influenced by the length of the day (or conversely by the length of the night). The mid-time of this low calling activity period should correspond to the average time of human low activity, i.e. when the majority of the urban population is sleeping. In chronobiology studies, the mid-sleep time, corresponding to the time when human sleep is in the middle of its cycle, has been found to vary with the age and gender of the individuals [11, 41]. Despite the fact that each individual has a distinctive sleep-wake cycle, with a chronotype ranging from advanced sleep period (morningness) to delayed sleep period (eveningness) [42], at the population level a characteristic mid-sleep time can be consistently calculated, taken simply as the average of individual mid-sleep times.

From the mean times of the last call of the day, tL and of the first call tF of the next day, we define the period of low calling activity TLCA as the elapsed time between tL and tF, as a measure of the time when cities cease their activity. In Fig 5a, the width of the low activity period TLCA of the most populated city in the dataset is shown, for 4 different days of the week (Tuesdays, Fridays, Saturdays and Sundays), as a function of the subscribers’ age and gender. There is a noticeable change of about 3 hours, moving from the age cohort of 20 to that of 40 year olds. After that rather abrupt increase, especially for Fridays and Saturdays, TLCA slightly decreases, reaching a local minimum value for the age cohort of 50 year olds, and then it increases again to reach the highest value at the age of 78 years. For the analyzed weekday (Tuesday) as well as for Sunday, TLCA increases almost monotonically with the cohort age, showing a small plateau for age cohorts between 45 and 58.

Fig 5. Period of low calling activity and mid-sleep times for different age and gender cohorts.

(a) Period of low calling activity TLCA. The TLCA is calculated as the elapsed time between the mean time of the last call and that of the first call, as a function of the age and gender of different cohorts, for the most populated city in the dataset in 2007. (b) Mid-sleep time tmid, calculated as the time in the middle of the interval between the mean time of the last call and that of the first call, as a function of the age and gender of different cohorts of the same city. For each age cohort, TLCA and tmid are calculated for females (circles) and males (triangles) separately. Both quantities are different for different days of the week, and the corresponding plots are shown for (green) Tuesdays, (red) Fridays, (blue) Saturdays, and (violet) Sundays. As Mondays to Thursdays have similar values, only the data for Tuesdays is shown.

We have also tracked the midpoint of the inactivity period, defined as the mid-time between tL and tF. Due to its similarity with the average time in the middle of the sleeping period [41], we interpret this minimum calling activity time as the mid-sleep time tmid, calculated simply as tmid = (tL + tF − 24)/2. Both quantities are found to depend on the age and gender of each cohort, as can be seen in Fig 5b. We find that, for certain age groups (from 18 to 32 years old, and from 43 to 80 years old), tmid occurs at a later time for women as compared to men, while in the age group of 33 to 42 years old, tmid for men occurs later. This finding differs somewhat from the reported mid-sleep times (on free days) in the chronotype questionnaire study based on the MCTQ [11, 13], where males show a later mid-sleep time for age cohorts younger than 38 years old. Also, there is a strong dependence on age, with younger age cohorts (20–30 years old) having later tmid, i.e. around 30 minutes after that of the oldest age cohort (70–80 years old). This observation is in accordance with the observed chronotypes [41], which are attributed to biological factors, or an internal clock regulated by neuronal and hormonal mechanisms. We also found an unexpected rise of tmid for the age cohort of 45–65 year old individuals, which we suspect is entirely of social origin. Hence it seems that both biological and social factors play a role in changing tmid, i.e. shifting the period of low activity to later hours.
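
As a worked example of the two definitions, the short Python sketch below computes TLCA and tmid from a pair of mean call times. It assumes that tL is given in hours counted from midnight of the calendar day on which the night period starts (so a mean last call at 0:30 am would be written as 24.5), and that tF is the clock hour on the following day; the function names are hypothetical.

```python
def low_activity_period(t_last, t_first):
    """T_LCA in hours: time elapsed from the last call to the first call of the next day."""
    return t_first + 24.0 - t_last

def mid_sleep_time(t_last, t_first):
    """t_mid = (t_L + t_F - 24) / 2, in clock hours after midnight."""
    return (t_last + t_first - 24.0) / 2.0

# Example: mean last call at 10:30 pm, mean first call at 10:00 am
print(low_activity_period(22.5, 10.0))  # 11.5 hours of low calling activity
print(mid_sleep_time(22.5, 10.0))       # 4.25, i.e. 4:15 am
```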

In addition, we find that tmid varies across days of the week. On Fridays and Saturdays tmid occurs at later hours compared with the other days. Similarly, the age cohort with the latest mid-sleep time tmid is different for different days of the week. On Saturdays, individuals in the age group 30 to 45 years old have the latest tmid, while for the other days of the week it is the 20–25 years old cohort which shows the latest mid-sleep time. The results for TLCA and tmid in the most populated city are also consistently found in the next 5 most populated cities, as shown in the Supplementary Material (S1 and S2 Figs, respectively).

Discussion

In this study, we have found that the onset and termination of the period of low calling activity for people in cities at about the same latitude but at different longitudes are shifted according to their relative longitudinal separation. Cities westward from the easternmost analyzed city stop their activity later in line with the time delay of the sun transit time. This result suggests that a solar event acts as a cue for the circadian rhythm of the period of low calling activity with the SWC bounded inside. This result is consistent with those reported by Roenneberg et al. [12], although strictly speaking the two studies cannot be compared directly as the focus of our study is on variation by latitude and theirs was on variation by population size of cities.

In addition, we found that the seasonal variation of the termination of calling activity resembles the annual variation in solar midnight (or solar noon). However, when the annual behaviour of activity termination is compared with other characteristic solar events like sunrise and sunset, it appears to have a different functional form, with a different number of maxima and minima occurring on different dates. Although it seems likely that solar midnight (or solar noon) acts as a cue in the synchronization of the termination of the calling activity, further research is needed to confirm this. At the individual level, knowledge of the mid-sleep time and sleep duration allows the determination of an individual’s chronotype [16]. At the population level, however, we could determine from the calling distributions the characteristic variation in the sleep duration and mid-sleep time as a function of age group. The observed overall trends are in line with earlier findings [41] and reveal an increase in the sleep duration and a decline in the mid-sleep time with age. Several other intricacies become evident on closer inspection. First, the aspect of ‘social jetlag’ [43], defined as the difference between the mid-sleep times on free days and on work days, becomes apparent across all age groups. Interestingly, although social jetlag is expected to give rise to extended sleep duration on free days as a compensatory effect, for young adults (20–25) we find that the sleeping periods are comparatively shorter on free days (Friday and Saturday nights and the following mornings). Therefore, sleep deprivation is likely to be at a maximum for this age range. Second, previous observations suggest a monotonic decrease in the mid-sleep time from around 20 years of age, which can be attributed to endocrine factors [41]. In contrast, we observe a reversal in the trend of the mid-sleep time, such that it starts rising at the age of 45 years and continues to rise until 55 years of age, after which it decreases again.

Materials and methods

In this study, we have analyzed a very large dataset of anonymized call detail records (CDRs) from a mobile phone service provider offering services in a country located in the Southern Europe subregion of the United Nations geoscheme [44]. Due to a Non Disclosure Agreement associated with the dataset we are bound to keep the identity of the country unknown, and thus we have partially masked the latitude and longitude coordinates of the cities to screen their actual location, such that each city is associated with a latitudinal band, and the latitude at the center of the band is assigned as the latitude of the city. In the analyses, depending on the measure we were focusing on, we chose the width and center of the latitudinal bands, in all cases specifying the corresponding values. The latitude coordinate associated with each band is described by ϕ ± dϕ, with ϕ the latitude in degrees at the center of the band, and dϕ the half-width of the band in degrees. Since the latitudinal region is given, the longitude coordinate is also screened, by providing instead its angular separation from an arbitrary point located in the same latitudinal band. Thus, for a given city, its longitude coordinate θ denotes the number of degrees from a reference point located in the same latitudinal band. The anonymization of the subscribers’ identities was performed by the service provider prior to the data being given to us. The dataset contains CDRs of around 10,000,000 subscribers during 2007, with more than 3 billion calls between 50,000,000 unique identifiers. Each record contains the date, time, duration, and anonymized caller and callee identifiers. The dataset also includes demographic information for the majority of the subscribers, and, for those cases, the age, gender, postal code, and location of the most accessed cell tower (MAC-tower) are known.
Thus, there are three possible locations associated with each user, namely the associated city center, the location of the MAC-tower and the center of the postal code region, and we use them to determine whether the subscriber “lives in a city”, defined by cases where the three associated locations are sufficiently close to each other. Taking as a reference point the geographical location of the associated city center, a subscriber lives there if the following three conditions are satisfied:

the distance between the location of the MAC-tower and the associated city’s center is less than 15 km;

the distance between the center of the associated postal code region and the associated city’s center is less than 15 km;

the distance between the location of the MAC-tower and the center of the associated postal code region is less than 30 km.
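The three conditions above can be sketched as a simple predicate. This is an illustrative sketch only: the text does not state how distances were computed, so the great-circle (haversine) distance on a spherical Earth is an assumption, and the function names and the (latitude, longitude) tuple layout are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical-Earth assumption


def haversine_km(a: tuple[float, float], b: tuple[float, float]) -> float:
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def lives_in_city(city_center, mac_tower, postal_center) -> bool:
    """Apply the three distance thresholds (15 km, 15 km, 30 km) from the text."""
    return (
        haversine_km(mac_tower, city_center) < 15.0
        and haversine_km(postal_center, city_center) < 15.0
        and haversine_km(mac_tower, postal_center) < 30.0
    )
```

Note that, for any distance obeying the triangle inequality, the third condition follows from the first two (15 km + 15 km = 30 km), so in this sketch it acts as a consistency check rather than an additional restriction.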

In this study, we chose 36 of the cities with more than 100,000 inhabitants in 2007 (see S3 Fig in the Supplementary Material), in such a way that our final analysis takes into account the calling patterns of around 1,000,000 subscribers in total. The locations of the subscribers are associated with the locations of the cities in which they reside. Each city is associated with the following two geographical coordinates: the latitudinal coordinate, fixed as the midpoint of a latitudinal band including the city, and the longitudinal coordinate, defined as the angular distance between the city and a reference point located in the studied region.

Quantifying delays between calling activity timings

Calling behavior varies seasonally. In particular, the mean value and the width of the distributions of the first and last call vary across the year, being pushed towards the afternoon during winter and towards midnight during the summer. In spite of this seasonal variation, for a given day the calling distributions of different cities have similar shapes, and we exploit this similarity to calculate the temporal shifts, or delays, between them. The Kullback-Leibler divergence [45] is a measure of the dissimilarity between two distributions, commonly used in statistical analysis, for example when comparing one distribution obtained from data with another generated by a model. It reaches zero, its minimum possible value, when the distributions are identical, and it increases in value as the distributions become more and more dissimilar. In the case of the calling activity of different cities, the distributions are not identical but have a very similar shape. When the Kullback-Leibler divergence is applied to a pair of these distributions, it reaches a minimum value when the distributions overlap most, falling on top of each other and collapsing into one. Thus, if we measure the amount of time by which one distribution should be shifted in order to minimize its divergence from the second distribution, this time shift corresponds to the actual time delay between them.

In order to quantify the actual time shift between the distributions PL of last calls for cities lying along different longitudes, we proceed as follows. First, for all the cities within the band, we calculate all the distributions PL(t, d) between January 2nd and December 31st. For each day d, we fix PL(t, d)0° of the city labeled ‘0°’ as the reference distribution, and for every other city c in the band, we compare the reference PL(t, d)0° with time-shifted versions PL(t + nΔ, d)c of the distribution PL(t, d)c, with −5 ≤ n ≤ 8 and Δ = 5 min, to find the time shift n*Δ that minimizes the divergence DKL between them. Here, DKL is the Kullback-Leibler divergence measure, defined as DKL(P, Q) = ∑i Pi log(Pi/Qi), with P and Q being the two discrete distributions. Once we have found for each city the set {n*Δ} of all the time shifts across the year, we calculate its average time shift 〈n*Δ〉, and plot it for all the cities in the band in the right column of Fig 3. As the mean time of the last call is different for different days of the week [33], the average is calculated separately for each day of the week. We apply the same procedure to the distributions PF of the time of the first call, and the results are shown in the left column of Fig 3.
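The shift search described above can be sketched as follows. This is a hedged illustration rather than the code used in the study: the function names are hypothetical, the distributions are assumed to be histograms with 5-minute bins, the shift is assumed to wrap around circularly at the ends of the histogram, a small constant `eps` is added to avoid division by zero in empty bins, and the reference distribution is taken as P in DKL(P, Q); none of these choices is specified in the text.

```python
from math import log


def kl_divergence(p, q, eps=1e-12):
    """D_KL(P, Q) = sum_i P_i log(P_i / Q_i) for two equally binned histograms."""
    ps = [x + eps for x in p]  # eps smoothing is an assumption of this sketch
    qs = [x + eps for x in q]
    sp, sq = sum(ps), sum(qs)  # normalise both histograms to unit mass
    return sum((a / sp) * log((a / sp) / (b / sq)) for a, b in zip(ps, qs))


def shifted(p, n):
    """Return the histogram P(t + n*Delta): element i holds p[i + n] (circular)."""
    n %= len(p)
    return p[n:] + p[:n]


def best_shift(p_ref, p_city, n_min=-5, n_max=8, delta_min=5):
    """Find n* in [n_min, n_max] minimising D_KL(p_ref, shifted p_city).

    Returns (n*, n* * Delta), the latter in minutes.
    """
    n_star = min(
        range(n_min, n_max + 1),
        key=lambda n: kl_divergence(p_ref, shifted(p_city, n)),
    )
    return n_star, n_star * delta_min
```

Applying `best_shift` to every day of the year for a given city, and then averaging the resulting shifts separately for each day of the week, would give the quantity 〈n*Δ〉 discussed in the text.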

Averaging the mean times of the calling activity inside a latitudinal band

In order to find whether there is any relation between tL and tF and the solar midnight, we have chosen 7, 6 and 8 cities, lying in the latitudinal bands centered at ϕ = 37°30′ N, 40°20′ N, and 43°0′ N, respectively. For each city, we shift its corresponding distributions in accordance with its longitudinal difference, so as to collapse them all into one. Then we calculate the average mean time of the last call, 〈tL〉(d) = 〈tL(c, d)〉, where tL(c, d) denotes the mean time of the last call for the shifted distribution of a city c belonging to the analyzed band during the day d, and 〈⋅〉 denotes the average over all cities lying within the band. Similarly, we calculate the average mean time of the first call, 〈tF〉(d), for the given latitudinal band. The quantities 〈tL〉(d) and 〈tF〉(d) are compared with the time at which the solar midnight occurs in the reference city of each band. It should be noted that in the original graphs there are days of national holidays and local festivities that introduce drastic pattern changes, which we filter out to construct the final graphs.
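The band-averaging step described above can be illustrated with a short sketch. Several points are assumptions made for illustration: the function name is hypothetical, the shift is applied directly to the mean times rather than to the full distributions, each city's longitudinal offset is taken in degrees west of the reference city, and the shift is set to the solar delay of 4 minutes per degree of longitude (24 h over 360°).

```python
MINUTES_PER_DEGREE = 4.0  # 24 h * 60 min / 360 degrees of longitude


def band_average_mean_time(mean_times_h, west_offsets_deg):
    """Average the mean call times of the cities in one latitudinal band.

    mean_times_h     : mean time (in hours) of the last or first call for
                       each city on a given day.
    west_offsets_deg : longitudinal separation of each city from the
                       reference city, in degrees, positive westward
                       (sign convention assumed for this sketch).
    """
    # Move each city's time earlier by its solar delay relative to the
    # reference city, so that all cities refer to the same solar clock.
    shifted = [
        t - off * MINUTES_PER_DEGREE / 60.0
        for t, off in zip(mean_times_h, west_offsets_deg)
    ]
    return sum(shifted) / len(shifted)
```

For instance, a city 3° west of the reference would have its mean time moved 12 minutes earlier before averaging.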

Supporting information

S1 Fig. Period of low calling activity TLCA for different age and gender cohorts.

The TLCA is calculated as the elapsed time between the mean time of the last call and of the first call, as a function of the age and gender of different cohorts, for the six most populated cities in the dataset in 2007. For each age cohort, TLCA is calculated for females (circles) and males (triangles) separately. TLCA is different for different days of the week, and the corresponding plots are shown for (green) Tuesdays, (red) Fridays, (blue) Saturdays, and (violet) Sundays. Mondays to Thursdays have similar values, therefore only the data for Tuesdays is shown.

S2 Fig. Mid-sleep time tmid for different age and gender cohorts.

tmid is calculated as the time at the middle of the interval between the mean time of the last call and of the first call, as a function of the age and gender of different cohorts, for six of the seven most populated cities in the dataset in 2007. For each age cohort, tmid is calculated for females (circles) and males (triangles) separately. tmid is different for different days of the week, and the corresponding plots are shown for (green) Tuesdays, (red) Fridays, (blue) Saturdays, and (violet) Sundays. Mondays to Thursdays have similar values, therefore only the data for Tuesdays is shown. During some Friday nights, the calling activity extended until very late in the night, and the distribution of the morning calling activity on the next day presents a small peak around 4:00 a.m. If present, we include this peak in the analysis when calculating the time of the first call, due to its small amplitude and width compared with the main part of the distribution for the time of the first call. This is also true for the results shown in Fig 5 in the main text.

14. Levandovski R., Sasso E., and Hidalgo M. P., “Chronotype: a review of the advances, limits and applicability of the main instruments used in the literature to assess human phenotype,” Trends in psychiatry and psychotherapy, vol. 35, no. 1, pp. 3–11, 2013.

30. Abdullah S., Matthews M., Murnane E. L., Gay G., and Choudhury T., “Towards circadian computing: early to bed and early to rise makes some of us unhealthy and sleep deprived,” in Proceedings of the 2014 ACM international joint conference on pervasive and ubiquitous computing, pp. 673–684, ACM, 2014.