Abstract

From economic inequality and species diversity to power laws and the analysis of multiple trends and trajectories, diversity within systems is a major issue for science. Part of the challenge is measuring it. Shannon entropy H has been used to rethink diversity within probability distributions, based on the notion of information. However, there are two major limitations to Shannon's approach. First, it cannot be used to compare diversity distributions that have different levels of scale. Second, it cannot be used to compare parts of diversity distributions to the whole. To address these limitations, we introduce a renormalization of probability distributions based on the notion of case-based entropy C_c as a function of the cumulative probability c. Given a probability density p(x), C_c measures the diversity of the distribution up to a cumulative probability of c, by computing the length or support of an equivalent uniform distribution that has the same Shannon information as the conditional distribution of p(x) up to cumulative probability c. We illustrate the utility of our approach by renormalizing and comparing three well-known energy distributions in physics, namely, the Maxwell-Boltzmann, Bose-Einstein, and Fermi-Dirac distributions for the energy of subatomic particles. The comparison shows that C_c is a vast improvement over H, as it provides a scale-free comparison of these diversity distributions and also allows for a comparison between parts of these diversity distributions.

1. Diversity in Systems

Statistical distributions play an important role in any branch of science that studies systems comprised of many similar or identical particles, objects, or actors, whether material or immaterial, human or nonhuman. One of the key features that determines the characteristics and range of potential behaviors of such systems is the degree and distribution of diversity, that is, the extent to which the components of the system occupy states with similar or different features.

As Page outlined in a series of inquiries [1, 2], including The Difference and Diversity and Complexity, diversity within systems is an important concern for science, be it making sense of economic inequality, expanding the trade portfolio of countries, measuring the collapse of species diversity in various ecosystems, or determining the optimal utility/robustness of a network. However, a major challenge in the literature on diversity and complexity, which Page also points out [1, 2], remains: the issue of measurement. Although statistical distributions that directly reflect the spread of key parameters (such as mass, age, wealth, or energy) provide descriptions of this diversity, it can be difficult to compare the diversity of different distributions or even the same distribution under different conditions, mostly because of differences in scales and parameters. Also, many of the measures currently available compress diversity into a single score or are not intuitive [1–4].

At the outset, motivated by examples of measuring diversity in ecology and evolutionary biology from [3, 4], we sought to address these challenges. We begin with some definitions and a review of our previous research.

First, in terms of definitions, we follow the ecological literature, defining diversity as the interplay of “richness” and “evenness” in a probability distribution. Richness refers to the number of different diversity types in a system. Examples include (a) the different levels of household income in a city, (b) the number of different species in an ecosystem, (c) the diversity of a country’s exports, (d) the distribution of different nodes in a complex network, (e) the various health trends for a particular disease across time/space, or (f) the cultural or ethnic diversity of an organization or company. In all such instances, the greater the number of diversity types (be these types discrete or continuous), the greater the degree of richness in a system. In the case of the current study, for example, richness was defined as the number of different energy states.

In turn, evenness refers to the uniformity or “equiprobability” of occurrence of such states. In terms of the above examples, evenness would be defined as (a) a city where household income was evenly distributed, (b) an ecosystem where the diversity of its species was equal in number, (c) a country with an even distribution of exports, (d) a complex network where all nodes had the same probability of occurrence, (e) a disease where all possible health trends were equiprobable, or (f) a company or organization where people of different cultural or ethnic backgrounds were evenly distributed. In the case of the current study, for example, evenness was defined as the uniformity or “equiprobability” of the occurrence of all possible energy states.

More specifically, as we will see later in the paper, we define the diversity D of a probability distribution as the number of equivalent equiprobable types required to maintain the same amount of Shannon entropy H (i.e., the number of Shannon-equivalent equiprobable states). Given such a definition, a system with a high degree of richness and evenness would have a high degree of D, whereas a system with a low degree of richness and evenness would have a low degree of D. In turn, a system with high richness but low evenness (as in the case of a skewed-right system with a long tail) would have a lower degree of D than a system with high richness and high evenness.

1.1. Purpose of the Current Study

Recently, we have introduced a novel approach to representing diversity within statistical distributions [5, 6], which overcomes such difficulties and allows the distribution of diversity in any given system (or cumulative portions thereof) to be directly compared to the distribution of diversity within any other system. In effect, it is a renormalization that can be applied to any probability distribution to produce a direct representation of the distribution of diversity within that distribution. Arising from our work in the area of complex systems, the approach is based on the notion of case-based entropy, C_c [5]. This approach has two major advantages over the Shannon entropy H, which, as we alluded to above, is one of the most commonly used measures of diversity within probability distributions and which calculates the average amount of uncertainty (or information, depending on one's perspective) present in a given probability distribution. First, C_c can be used to compare distributions that have different levels of scale; and, second, C_c can be used to compare parts of distributions to their whole.

After developing the concept and formalism for case-based entropy for discrete distributions [5], we first applied it to compare complexity across a range of complex systems [6]. In that work, we investigated a series of systems described by a variety of skewed-right probability distributions, choosing examples that are often suggested to exhibit behaviors indicative of complexity such as emergent collectivity, phase changes, or tipping points. What we found was that such systems obeyed an apparent "limiting law of restricted diversity" [6], which constrains the majority of cases in these complex systems to simpler types. In fact, for these types of distribution, the distributions of diversity were found to follow a scale-free rule, with 60% or more of cases belonging to the simplest 40% or less of equiprobable diversity types. This was found to be the case regardless of whether the original distribution fit a power law or was long-tailed, making it fundamentally distinct from the well-known (but often misunderstood) Pareto Principle [7].

In the following, we continue to explore the use of case-based entropy in comparing systems described by statistical distributions. However, we now go beyond our prior work in the following ways. First, we extend the formalism in order to compute case-based entropy for continuous as well as discrete distributions. Second, we broaden our focus from complexity/complex systems to diversity in any type of statistically distributed system. That is, we start to explore distributions of diversity for systems where richness is not a function of the degree of complexity types.

Third, the discrete indices we used had a degree of subjectivity to them, for example, how should household income be binned and what influence does that have on the distribution of diversity? As such, we wanted to see how well C_c worked for distributions where the unit of measurement was universally agreed upon.

Fourth, we had not emphasized how C_c was a major advance on Shannon entropy H. As noted, while H has proven useful, it compresses its measurement of diversity into a single number; it is also nonintuitive; and, as we stated above, it is not scale-free and therefore cannot be used to compare the diversity of different systems; neither can it be used to compare parts of the diversity within a system to the entire system.

Hence, the purpose of the current study, as a demonstration of the utility of C_c, is to renormalize and compare three physically significant energy distributions in statistical physics: the energy probability density functions for systems governed by Boltzmann, Bose-Einstein, and Fermi-Dirac statistics.

2. Renormalizing Probability: Case-Based Entropy and the Distribution of Diversity

The quantity case-based entropy C_c [5] renormalizes the diversity contribution of any probability distribution p(x) by computing the true diversity D of an equiprobable distribution (called the Shannon-equivalent uniform distribution) that has the same Shannon entropy H as p(x). C_c is precisely the number of equiprobable types in the case of a discrete distribution, or the length, support, or extent of the variable in the case of continuous distributions, which is required to keep the value of the Shannon entropy the same across the whole or any part of the distribution up to a cumulative probability c. We choose the Shannon-equivalent uniform distribution for two reasons: (i) first, it is well known that, on a finite measure space, the uniform distribution maximizes entropy: that is, the uniform distribution has the maximal entropy among all probability distributions on a set of finite Lebesgue measure [8]; (ii) second, a Shannon-equivalent uniform distribution will, by definition, count the number of values (or range of values) of x that are required to give the same information as the original distribution if we assume that all the values (or range of values) are equally probable.

Hence, the uniform distribution renormalizes the effect of varying relative frequencies (or probabilities) of occurrence of the values of x without losing information (or entropy). In other words, if all choices of the random variable are equally likely, the number of values (or the length, for a continuous random variable) needed for the random variable to carry the same amount of information as the given distribution is a measure of diversity. In a sense, each new value (or type) adds to the diversity only if the new value has the same probability of occurrence as the existing values. Diversity necessarily requires the values of the random variable to be equiprobable, since a lower probability, for example, means that such values occur rarely and hence cannot be treated as equally diverse as other values with higher probabilities. Hence, by choosing an equiprobable (or uniform) distribution for normalization, we are counting the true diversity, that is, the number of equiprobable types that are required to match the same amount of Shannon information as the given distribution.
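As a simple numerical illustration of this counting (our own sketch; the numbers are illustrative and not drawn from the sources cited), consider a three-type discrete distribution:

```python
import math

# True diversity D = e^H counts the Shannon-equivalent equiprobable types.
probs = [0.5, 0.25, 0.25]                  # three types, unevenly distributed
H = -sum(p * math.log(p) for p in probs)   # Shannon entropy (in nats)
D = math.exp(H)                            # number of equiprobable types

print(D)  # ≈ 2.83: unevenness makes three types count as fewer than 3
```

For a perfectly even three-type distribution (probabilities 1/3 each), D is exactly 3; any unevenness pulls D below the raw count of types.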

This calculation (as we have shown elsewhere [5]) can be done for parts of the distribution up to a cumulative probability of c. This means that a comparison of C_c for a variety of distributions is actually a comparison of the variation of the fraction of diversity contributed by values of the random variable up to c.

Since, regardless of the scale and units of the original distribution, c and C_c both vary from 0 to 1, one can plot a curve for C_c versus c for multiple distributions on the same axes. C_c thus provides us with a scale-free measure to compare distributions without omitting any of the entropy information, but by renormalizing the variable to one that has equiprobable values. What is more, it also allows us to compare different parts of the same distribution, or parts to wholes. That is, we can generate a C_c versus c curve for any part of a distribution (normalizing the probabilities to add up to 1 in that part) and compare the curve of the part to the curve of the whole or another part to see if the functional dependence of C_c on c is the same or different. In essence, C_c has the ability to compare distributions in a "fractal" or self-similar way.

In [5], we showed how to carry out the renormalization for discrete probability distributions, both mathematical and empirical. In this paper, as we stated in the Introduction, we make the case for how C_c constitutes an advance over H, in terms of providing a scale-free comparison of probability distributions and also comparisons between parts of distributions. More importantly, we demonstrate how C_c works for continuous distributions, by examining the Maxwell-Boltzmann, Bose-Einstein, and Fermi-Dirac distributions for the energy of subatomic particles. We begin with a more detailed review of C_c.

3. Case-Based Entropy of a Continuous Random Variable

Our impetus for making an advance over the Shannon entropy H comes from the study of diversity in evolutionary biology and ecology, where it is employed to measure the true diversity of species (types) in a given ecological system of study [3, 4, 9, 10]. As we show here, it can also be used to measure the diversity of an arbitrary probability distribution of a continuous random variable.

Given the probability density function p(x) of a random variable X on a measure space, the Shannon-Wiener entropy index H is given by

H = -∫ p(x) ln p(x) dx.

The problem, however, with the Shannon entropy index H, as we identified in our abstract and Introduction, is that while it is useful for studying the diversity of a single system, it cannot be used to compare the diversity across probability distributions. In other words, H is not multiplicative: a doubling of the value of H does not mean that the actual diversity has doubled. To address this problem, we turned to the true diversity measure D [3, 11, 12], which gives the range of equiprobable values of x that gives the same value of H:

D = e^H.

The utility of D for comparing the diversity across probability distributions is that a doubling of the value of D means that the number of equiprobable ranges of values of x has doubled as well. D calculates the range of such equiprobable values of x that will give the same value of Shannon entropy as observed in the distribution of x. We say that two probability densities p_1(x) and p_2(x) are Shannon-equivalent if they have the same value of Shannon entropy. Case-based entropy is then built from the range of values of x for the Shannon-equivalent uniform distribution for p(x). We also note that Shannon entropy can be recomputed from D by using H = ln D.
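As a numerical sketch of H and D for continuous densities (the helper names are ours; a simple trapezoidal quadrature stands in for the integral):

```python
import math

def shannon_entropy(p, a, b, n=20000):
    # Trapezoidal estimate of H = -integral of p(x) ln p(x) over [a, b].
    h = (b - a) / n
    total = 0.0
    for i in range(n + 1):
        px = p(a + i * h)
        term = -px * math.log(px) if px > 0 else 0.0
        total += (0.5 if i in (0, n) else 1.0) * term
    return total * h

def true_diversity(p, a, b):
    # D = e^H: the width of the Shannon-equivalent uniform distribution.
    return math.exp(shannon_entropy(p, a, b))

# Uniform density on [0, 2]: H = ln 2, so D recovers the width 2.
print(true_diversity(lambda x: 0.5, 0.0, 2.0))            # ≈ 2.0
# Exponential density e^{-x}: H = 1, so D = e ≈ 2.718.
print(true_diversity(lambda x: math.exp(-x), 0.0, 50.0))
```

The uniform check makes the "width of the Shannon-equivalent uniform distribution" interpretation concrete: for a uniform density the recovered D is simply the length of its support.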

In order to measure the distribution of diversity, we next need to determine the fractional contribution to overall diversity up to a cumulative probability c. In other words, we need to be able to compute the diversity contribution up to a certain cumulative probability c. To do so, we replace H with H_c, the conditional entropy, given that only the portion of the distribution up to a cumulative probability c (denoted by X_c) is observed, with conditional density of occurrence p(x)/c up to the given cumulative probability c. That is,

H_c = -∫_{X_c} (p(x)/c) ln(p(x)/c) dx.

The value of D_c = e^{H_c} for a given value of cumulative probability c is the number of Shannon-equivalent equiprobable energy states (or of values of the variable on the x-axis in general) that are required to explain the information up to a cumulative probability of c within the distribution. If c = 1, then D_1 is the number of such Shannon-equivalent equiprobable energy states for the entire distribution itself.

We can then simply calculate the fractional diversity contribution, or case-based entropy, as

C_c = D_c / D_1.

It is at this point that the renormalization (C_c as a function of c) becomes scale independent, as both axes range between values of 0 and 1, with the graph of C_c versus c passing through (0, 0) and (1, 1). Hence, irrespective of the range and scale of the original distributions, all distributions can be plotted on the same graph and their diversity contributions can be compared in a scale-free manner.

To check the validity of our formalism, we calculate C_c for the simple case of a uniform distribution given by p(x) = 1/L on the interval [0, L]. Intuitively, if we choose a cumulative probability c, then, owing to the uniformity of the distribution, we expect C_c = c itself. In other words, the diversity of the part is simply equal to cL, that is, the length of the interval [0, cL], and hence the C_c versus c curve will simply be the straight line with slope equal to 1. This can be shown as follows: the conditional density up to c is (1/L)/c = 1/(cL) on [0, cL], so H_c = ln(cL), D_c = cL, and therefore C_c = D_c/D_1 = cL/L = c.
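This uniform-distribution check can also be confirmed numerically. The sketch below is our own helper (not part of the original formalism); it uses the identity H_c = ln c - (1/c)∫ p ln p dx over the truncated region, which follows directly from substituting the conditional density p(x)/c:

```python
import math

def Cc_curve(p, a, b, n=5000):
    """Approximate (c, C_c) pairs for a density p on [a, b] (trapezoid rule)."""
    h = (b - a) / n
    ps = [p(a + i * h) for i in range(n + 1)]
    Z = (sum(ps) - 0.5 * (ps[0] + ps[-1])) * h      # numerical normalization
    ps = [v / Z for v in ps]
    f = [v * math.log(v) if v > 0 else 0.0 for v in ps]
    cum = ent = 0.0
    pts = []
    for i in range(1, n + 1):
        cum += 0.5 * (ps[i - 1] + ps[i]) * h        # cumulative probability c
        ent += 0.5 * (f[i - 1] + f[i]) * h          # integral of p ln p so far
        pts.append((cum, cum * math.exp(-ent / cum)))   # (c, D_c)
    D = pts[-1][1]                                  # full-range true diversity
    return [(c, Dc / D) for c, Dc in pts]

# Uniform density on [0, 3]: the C_c versus c curve is the straight line C_c = c.
curve = Cc_curve(lambda x: 1.0 / 3.0, 0.0, 3.0)
assert all(abs(Cc - c) < 1e-6 for c, Cc in curve)
```

By construction the last pair of the curve is (1, 1), matching the requirement that the plot pass through (1, 1).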

With our formulation of C_c complete, we turn to the energy distributions for particles governed by Boltzmann, Bose-Einstein, and Fermi-Dirac statistics.

4. Results

4.1. C_c for the Boltzmann Distribution in One Dimension

We first illustrate our renormalization by applying it to a relatively simple case: that of an ideal gas at temperature T. The kinetic energies of particles in such a gas are described by the Boltzmann distribution [8]. In one dimension, this is

p(E) = β e^{-βE},

where k_B is the Boltzmann constant and β = 1/(k_B T).

The entropy of p(E) can be shown to be H = 1 + ln(k_B T), and hence the true diversity of energy in the range [0, ∞) is given by

D = e^H = e k_B T.

The cumulative probability from 0 to E is then given by

c = 1 - e^{-βE}.

Hence, E can be computed in terms of c as

E = -(1/β) ln(1 - c).

Equation (9) is useful in the one-dimensional Boltzmann case for eliminating the parameter E altogether in (11), to obtain an explicit relationship between c and C_c. It is to be noted that, in most cases, c and C_c can only be parametrically related through E. The other quantities introduced in Section 3 can then be calculated as follows:

We note that, in (13), the temperature factor cancels out, indicating that the distribution of diversity for an ideal gas in one dimension is independent of temperature. The resulting graph of C_c as a function of c is shown in Figure 1. It is worth noting in passing that well over 60% of the molecules in the gas are contained within the lower 40% of the diversity of energy probability states at all temperatures (here, diversity is defined as the number of equivalent equiprobable energy states required to maintain the same amount of Shannon entropy H). Thus, the one-dimensional Boltzmann distribution obeys an interesting phenomenon that we have identified in a wide range of skewed-right complex systems, which (as we briefly discussed in the Introduction) we call restricted diversity and, more technically, the 60/40 rule [6]. The temperature independence of the C_c versus c curve for the Boltzmann distribution shows that the effect of increasing T is to shift the mean of the distribution to higher energies and to increase its standard deviation, but not to change its characteristic shape. Still, what is key to our results is that the temperature independence of the curve for the Boltzmann distribution in one dimension validates that our renormalization preserves the fundamental features of the original distribution.
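Assuming the exponential form p(E) = β e^{-βE} for the one-dimensional Boltzmann density, the parametric relations can be collapsed into a closed form; the expression below is our own derivation (worth checking against (13)): C_c = c · exp(g(c)/c - 1), where g(c) = 1 - (1 - c)(1 - ln(1 - c)). The inverse temperature β cancels entirely, which is precisely the temperature independence noted above.

```python
import math

def Cc_boltzmann_1d(c):
    # C_c for p(E) = beta * exp(-beta*E); beta cancels,
    # so no temperature argument is needed.
    g = 1.0 - (1.0 - c) * (1.0 - math.log(1.0 - c))
    return c * math.exp(g / c - 1.0)

# Restricted diversity: 60% of the molecules are contained in roughly
# the lowest third of the Shannon-equivalent energy states, at any temperature.
print(Cc_boltzmann_1d(0.6))   # ≈ 0.33
```

The function passes the sanity checks one would expect: C_c → 0 as c → 0 and C_c → 1 as c → 1.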

Figure 1: C_c as a function of c for the Boltzmann distribution in one dimension.

4.2. C_c for the Boltzmann Distribution in Three Dimensions

We now turn to the calculation of C_c for the physically more important case of the Boltzmann distribution in three dimensions [8]:

p(E) = (2/√π) (k_B T)^{-3/2} √E e^{-E/(k_B T)},

where the additional factor of √E accounts for the density of states.

The cumulative probability from 0 to E can be computed as follows:

c = erf(√(E/(k_B T))) - (2/√π) √(E/(k_B T)) e^{-E/(k_B T)}.

As we would hope, (15) has the property that, as E → ∞, the cumulative probability c → 1.

However, it is difficult to solve (15) for E directly in terms of c. We therefore compute C_c in parametric form, with E being the parameter. Closed analytical forms are not possible, so Matlab was used to compute H_c, D_c, and C_c, respectively:
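The same parametric computation can be sketched in Python rather than Matlab (our own code; the function names are illustrative, and the density p(E) ∝ √E e^{-E/k_BT} is assumed as above). Evaluating at two temperatures confirms numerically that the C_c versus c relation is temperature independent:

```python
import math

kB = 1.380649e-23   # Boltzmann constant in J/K

def Cc_at(p, a, b, c_target, n=20000):
    # C_c at cumulative probability c_target for density p on [a, b],
    # using trapezoidal quadrature and H_c = ln c - (1/c) * int(p ln p) dE.
    h = (b - a) / n
    ps = [p(a + i * h) for i in range(n + 1)]
    Z = (sum(ps) - 0.5 * (ps[0] + ps[-1])) * h      # numerical normalization
    ps = [v / Z for v in ps]
    f = [v * math.log(v) if v > 0 else 0.0 for v in ps]
    cum = ent = 0.0
    Dc = None
    for i in range(1, n + 1):
        cum += 0.5 * (ps[i - 1] + ps[i]) * h
        ent += 0.5 * (f[i - 1] + f[i]) * h
        if Dc is None and cum >= c_target:
            Dc = cum * math.exp(-ent / cum)
    return Dc / (cum * math.exp(-ent / cum))        # C_c = D_c / D

def boltz3d(T):
    # 3D Boltzmann energy density, up to normalization: sqrt(E) * exp(-E/kT).
    return lambda E: math.sqrt(E) * math.exp(-E / (kB * T))

for T in (300.0, 5000.0):
    print(T, Cc_at(boltz3d(T), 0.0, 50 * kB * T, 0.6))  # same C_c at both T
```

The integration window [0, 50 k_BT] truncates a negligible tail; the two printed C_c values agree to numerical precision.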

Thus, C_c can also only be computed in parametric form, with the parameter E varying from 0 to ∞. Figure 2 shows the curve thus calculated for the Boltzmann distribution in three dimensions.

Figure 2: C_c versus c for the Boltzmann distribution in three dimensions, superimposed at three different temperatures (the highest being 5000 K).

Although the temperature independence of this distribution is not immediately evident from Figure 2, one would, following the same logic as for the one-dimensional case, expect the distribution of diversity to be the same for all T. That is, as in the one-dimensional case, because changes in T do not affect the original distribution's characteristic shape, we expect the renormalized distribution to be independent of temperature. This does, indeed, turn out to be the case, as illustrated in Figure 2, which overlays the results of the calculations at three different temperatures up to 5000 K. It is also worth noting that, just like the one-dimensional case, the curve obeys the rule of restricted diversity [6]: regardless of temperature, over 60 percent of molecules are in the lower 40 percent of diversity of energy probability states (here again, diversity is defined as the number of equivalent equiprobable energy states required to maintain the same amount of Shannon entropy H).

In addition, it is worth noting that, as we might expect, adding more degrees of freedom increases the average energy by k_B T/2 per degree of freedom while maintaining the same shape for the distribution of energy. Hence, the current result still holds true for gas molecules with higher degrees of freedom; that is, the distribution of diversity is always exactly the same for an ideal gas, whether monoatomic or polyatomic.

4.3. The Bose-Einstein Distributions for Massive and Massless Bosons

We now move on to consider the second of our example distributions. The Bose-Einstein distribution gives the energy probability density function for massive bosons above the Bose temperature T_B as

p(E) = A √E / (e^{E/(k_B T)} - 1),

where A = 1/(Γ(3/2) ζ(3/2) (k_B T)^{3/2}) is a normalization constant and where ζ is the Riemann zeta function. In the following calculations, we use the Bose temperature for helium, T_B ≈ 3.1 K.

For massless bosons such as photons, the energy probability density function is [13]

p(E) = B E² / (e^{E/(k_B T)} - 1),

where B is the corresponding normalization constant. It is important to note that the "density of states" factors shown in (17) and (19) result in different energy distributions, despite the two types of boson obeying the same statistics.

The conditional probabilities, conditional entropies, true diversities, and case-based entropies for these distributions cannot be calculated analytically but can be calculated numerically. The results of such calculations, using the software Matlab, are shown in Figure 3.
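The numerical procedure can be sketched in Python as follows (our own code; the standard density-of-states forms √E and E² are assumed for massive and massless bosons, and energies are expressed in units of k_BT, which is also why the resulting curves are temperature independent). A geometric grid handles the integrable 1/√E divergence of the massive-boson density at E = 0:

```python
import math

def Cc_on_grid(p, xs):
    """(c, C_c) pairs for a density p evaluated on an increasing grid xs."""
    ps = [p(x) for x in xs]
    dx = [xs[i + 1] - xs[i] for i in range(len(xs) - 1)]
    Z = sum(0.5 * (ps[i] + ps[i + 1]) * dx[i] for i in range(len(dx)))
    ps = [v / Z for v in ps]                        # numerical normalization
    f = [v * math.log(v) if v > 0 else 0.0 for v in ps]
    cum = ent = 0.0
    pts = []
    for i in range(len(dx)):
        cum += 0.5 * (ps[i] + ps[i + 1]) * dx[i]
        ent += 0.5 * (f[i] + f[i + 1]) * dx[i]
        pts.append((cum, cum * math.exp(-ent / cum)))   # (c, D_c)
    D = pts[-1][1]
    return [(c, Dc / D) for c, Dc in pts]

# Geometric grid from 1e-8 to 50 in units of kT.
grid = [1e-8 * (50.0 / 1e-8) ** (i / 20000) for i in range(20001)]
helium = Cc_on_grid(lambda E: math.sqrt(E) / math.expm1(E), grid)  # massive
photon = Cc_on_grid(lambda E: E * E / math.expm1(E), grid)         # massless

def at(curve, c0):
    return min(curve, key=lambda t: abs(t[0] - c0))[1]

# Helium-4 packs more particles into low-diversity states than photons do.
print(at(helium, 0.6), at(photon, 0.6))
```

Note the use of expm1 to avoid loss of precision in e^E - 1 at small E; the comparison printed at c = 0.6 reflects the slightly more restricted diversity of the massive-boson gas.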

Figure 3: C_c versus c for helium-4 and for photons. Note: the results of calculations carried out at three different temperatures are overlaid.

As with the Boltzmann distributions, we find that the distributions of diversity for the two boson systems are independent of temperature. Although the curves for the two types of boson are very similar, it is evident that the distributions of diversity do differ to some extent: for helium-4 bosons, a slightly larger fraction of particles is contained in the lower diversity energy states than is the case for photons. In other words, using C_c, we are able to identify common patterns within and across these different energy systems, as well as their variations, even where intuition might suggest the systems to be effectively identical. With this point made, we move to our final energy distribution.

4.4. The Fermi-Dirac Distribution

The final distribution we use to illustrate our approach is the Fermi-Dirac distribution:

p(E) = A √E / (e^{(E - E_F)/(k_B T)} + 1),

where A is again a normalization constant and E_F is the Fermi energy [13]. In the following, we calculate distributions for sodium electrons, for which E_F ≈ 3.24 eV. Once again, H_c, D_c, and C_c cannot be calculated analytically, and so we rely on numerical calculations using Matlab.

The Fermi-Dirac distribution differs from the previous examples in that it is not simply scaled by changes in energy. Instead, its shape changes, transforming from a skewed-left distribution with a sharp cut-off at the Fermi energy at low temperatures to a smooth, skewed-right distribution at high temperatures. Thus, unlike the situation for the Boltzmann and Bose-Einstein distributions, one would expect the distributions of diversity for fermions such as electrons to be dependent on temperature. Figure 4 compares the results of calculating C_c as a function of c for electrons in sodium at four temperatures: roughly 3 K (the temperature of space), 300 K (representative of temperatures on earth), 6000 K (the temperature of the surface of the sun), and about 10^7 K (the temperature of the core of the sun).
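This temperature dependence can be sketched numerically as follows (our own code; the Fermi-Dirac energy density is taken as p(E) ∝ √E/(e^{(E-E_F)/k_BT} + 1) with the tabulated free-electron Fermi energy for sodium, E_F ≈ 3.24 eV, and the integration window [0, E_F + 40 k_BT] is chosen to truncate a negligible tail while avoiding overflow):

```python
import math

kB = 8.617333e-5    # Boltzmann constant in eV/K
EF = 3.24           # free-electron Fermi energy of sodium in eV

def Cc_fd(T, c_target=0.6, n=20000):
    # C_c at cumulative probability c_target for the Fermi-Dirac energy density.
    kT = kB * T
    b = EF + 40.0 * kT
    h = b / n
    ps = [math.sqrt(i * h) / (math.exp((i * h - EF) / kT) + 1.0)
          for i in range(n + 1)]
    Z = (sum(ps) - 0.5 * (ps[0] + ps[-1])) * h      # numerical normalization
    ps = [v / Z for v in ps]
    f = [v * math.log(v) if v > 0 else 0.0 for v in ps]
    cum = ent = 0.0
    Dc = None
    for i in range(1, n + 1):
        cum += 0.5 * (ps[i - 1] + ps[i]) * h
        ent += 0.5 * (f[i - 1] + f[i]) * h
        if Dc is None and cum >= c_target:
            Dc = cum * math.exp(-ent / cum)
    return Dc / (cum * math.exp(-ent / cum))        # C_c = D_c / D

# Near-degenerate regime versus solar-core temperatures: the curves differ.
print(Cc_fd(300.0), Cc_fd(1.0e7))
```

Unlike the Boltzmann and Bose-Einstein sketches, the two printed values differ markedly, with the low-temperature (near-degenerate) gas showing the higher C_c at a given c.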

Figure 4: Diversity curves for sodium electrons at a range of temperatures, with c on the x-axis and C_c on the y-axis.

This figure shows that the degree of diversity is highest for fermions at low temperatures: at the lowest temperature shown, a far larger fraction of the lowest equiprobable diversity states is needed to contain a given fraction of the particles than at the highest temperature. It also shows that, for sodium electrons, the diversity curve at normal temperatures on earth (around 300 K) is almost identical to that at very low temperatures. That is, a room-temperature Fermi gas of sodium electrons has a distribution of diversity very similar to that of a "Fermi condensate."

5. Using C_c to Compare and Contrast Systems

With our renormalization complete for all three distributions, we sought next to demonstrate, albeit somewhat superficially, the utility of C_c for comparing and contrasting systems, given how widely known the results are for these three classic energy distributions. To begin with, it is usual to assume that, in the limit of high T, both the Bose-Einstein and Fermi-Dirac distributions reduce to Boltzmann distributions, and so the physical properties of both bosons and fermions in this limit should be those of an ideal gas.

In Figures 5 and 6, we show a comparison of all three energy distributions at a temperature of 6000 K and at the much higher solar-core temperature (the Bose-Einstein distribution for massless bosons is included for comparison). In these figures, it appears that, by 6000 K, the Bose-Einstein distribution for helium-4 is indistinguishable from the 3D Boltzmann distribution. Also, while the Fermi-Dirac distribution has clearly not reduced to the Boltzmann distribution even at the higher temperature, it appears to be trending towards it.

Figure 5: Energy density curves.

Figure 6: C_c versus c curves.

However, comparison of the diversity distributions suggests that even when the energy probability density functions appear to coincide, significant physical differences remain between the systems. Figure 7 compares all the diversity curves calculated in the present work.

It is clear from Figure 7 that the distributions of diversity for a classical ideal gas and for both the Bose-Einstein and Fermi-Dirac distributions are significantly different. Because these renormalized distributions are independent of temperature, this suggests that there is no limit in which the Bose-Einstein distribution for the photon becomes completely indistinguishable from the Boltzmann distribution. Even more strikingly, the distribution of diversity in a system obeying Fermi-Dirac statistics only approaches that of bosonic systems at extremely high temperatures, similar to those at the core of the sun. At lower temperatures, the Fermi gas has a substantially higher degree of diversity than all the other systems. This is because, at lower temperatures, most of the fermions are yet to surpass the barrier created by the Fermi energy and hence are all restricted to the lower end of the energy range.

Thus, the transformation from the usual probability distribution to a distribution of case-based entropy (C_c versus c) has allowed us to make direct, scale-free comparisons of the ways in which the Maxwell-Boltzmann, Bose-Einstein, and Fermi-Dirac energy distributions are similar or differ, both internally (as a function of temperature T) and across distributions. It appears that, except at extremely high temperatures, the Fermi-Dirac distribution has a larger value of C_c than the others. This means that there is a larger number of Shannon-equivalent equiprobable states of energy for the Fermi-Dirac distribution as compared to the others. A speculative explanation could be that Pauli's exclusion principle does not allow more than one fermion to occupy the same quantum state, thereby restricting the accumulation of fermions in the same state (i.e., producing more diversity).

6. Conclusion

As we have hopefully shown in this paper, while Shannon entropy H has been used to rethink probability distributions in terms of diversity, it suffers from two major limitations. First, it cannot be used to compare distributions that have different levels of scale. Second, it cannot be used to compare parts of distributions to the whole.

To address these limitations, we introduced a renormalization of probability distributions based on the notion of case-based entropy C_c (as a function of the cumulative probability c). We began with an explanation of why we rethink probability distributions in terms of diversity, based on a Shannon-equivalent uniform distribution, which comes from the work of Jost and others on the notion of true diversity in ecology and evolutionary biology [4, 9, 10]. With this approach established, we then reviewed our construction of case-based entropy C_c. Given a probability density p(x), C_c measures the diversity of the distribution up to a cumulative probability of c, by computing the length or support of an equivalent uniform distribution that has the same Shannon information as the conditional distribution of p(x) up to a cumulative probability c.

With our conceptualization of C_c complete, we used it to renormalize and compare three physically significant energy distributions in physics, namely, the Maxwell-Boltzmann, Bose-Einstein, and Fermi-Dirac distributions for the energy of subatomic particles. We chose these three distributions for three key reasons: (1) we wanted to see if C_c works for continuous distributions; (2) the focus was on the diversity of types and not on their rank order in terms of complexity; and (3) the unit of measure was both objective and widely accepted. Based on our results, we concluded that C_c is a vast improvement over H, as it provides an intuitively useful, scale-free comparison of probability distributions and also allows for a comparison between parts of distributions as well.

The renormalization obtained will have a different shape for different distributions. In fact, bimodal, right-skewed, or other kinds of distributions will lead to different C_c versus c curves. There are two interesting points of inquiry for future papers, namely, (a) how the shape of the original distribution influences the C_c versus c curve and (b) whether we can reconstruct the original shape of the distribution given the C_c versus c curve. Because of the scale-free nature of C_c, all distributions can be compared in the same plot without reference to their original scales. In our future work, we will endeavor to connect the shape of the C_c versus c curve to the shape of the original distribution. This will allow us to locate portions of the original distribution (irrespective of their scale) where diversity is concentrated and portions where it is sparse, even though the original distributions cannot be plotted on the same graph due to huge variation in their scales.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

The authors would like to thank the following colleagues at Kent State University: (1) Dean Susan Stocker, (2) Kevin Acierno and Michael Ball (Computer Services), and (3) the Complexity in Health and Infrastructure Group for their support. They also wish to thank Emma Uprichard and David Byrne and the ESRC Seminar Series on Complexity and Method in the Social Sciences (Centre for Interdisciplinary Methodologies, University of Warwick, UK) for the chance to work through the initial framing of these ideas.