riskyr User Guide

Hansjörg Neth, SPDS, uni.kn

2018 02 01

“Solving a problem simply means representing it so as to make the solution transparent.” (H.A. Simon)1

What is the probability of a disease or clinical condition given a positive test result? This seems a simple and fairly common question, yet doctors, patients and medical students find it surprisingly difficult to answer.

Decades of research on probabilistic reasoning and risk literacy have shown that people are perplexed and struggle when information is expressed in terms of probabilities (e.g., see Mandel & Navarrete, 2015, and Trevethan, 2017, for overviews), but find it easier to understand and process the same information when it is expressed in terms of natural frequencies (see Gigerenzer and Hoffrage, 1995; Gigerenzer et al., 2007; Hoffrage et al., 2015).

riskyr is a toolbox for rendering risk literacy more transparent by facilitating such changes in representation and offering multiple perspectives on the dynamic interplay between probabilities and frequencies. The main goal of riskyr is to provide a long-term boost in risk literacy by fostering competence in understanding statistical information in domains such as health, weather, and finances (Hertwig & Grüne-Yanoff, 2017).

This guide first illustrates a typical problem and then helps you solve it by viewing risk-related information in a variety of ways. It proceeds in three steps:

1. We first present a typical problem in the probabilistic format that is commonly used in textbooks. This allows us to introduce some key probabilities, but also explains why both this problem and its traditional solution (via Bayes’ formula) remain opaque and are rightfully perceived as difficult.

2. We then translate the problem into natural frequencies and show how this facilitates its comprehension and solution.

3. Finally, we show how riskyr renders the problem more transparent by providing three sets of tools:

A. A fancy calculator that allows the computation of probabilities and frequencies;

B. A set of functions that translate between different representational formats;

C. A variety of visualizations that illustrate relationships between frequencies and probabilities.

Motivation: A problem of probabilities

A basic motivation for developing riskyr was to facilitate our understanding of problems like the following:

Mammography screening

The probability of breast cancer is 1% for a woman at age 40 who participates in routine screening. If a woman has breast cancer, the probability is 80% that she will get a positive mammography. If a woman does not have breast cancer, the probability is 9.6% that she will also get a positive mammography.

A woman in this age group had a positive mammography in a routine screening.
What is the probability that she actually has breast cancer?

(Hoffrage et al., 2015, p. 3)

Information provided and asked

Problems like this tend to appear in texts and tutorials on risk literacy and are ubiquitous in medical diagnostics. They typically provide some risk-related information (i.e., specific probabilities of some clinical condition and likelihoods of some decision or test of detecting its presence or absence) and ask for some other risk-related quantity. In the most basic type of scenario, we are given 3 essential probabilities:

The prevalence of some target population (here: women at age 40) for some condition (breast cancer):

prev = \(p(\mathrm{cancer}) = 1\%\)

The sensitivity of some decision or diagnostic procedure (here: a mammography screening test), which is the conditional probability:

sens = \(p(\mathrm{positive\ test}\ |\ \mathrm{cancer}) = 80\%\)

The false alarm rate of this decision, diagnostic procedure or test, which is the conditional probability:

fart = \(p(\mathrm{positive\ test}\ |\ \mathrm{no\ cancer}) = 9.6\%\)

The first challenge in solving this problem is to realize that the probability asked for is not the sensitivity sens (i.e., the probability of a positive test given cancer), but the reversed conditional probability (i.e., the probability of having cancer given a positive test). The clinical term for this quantity is the positive predictive value (PPV) or the test’s precision:

PPV = \(p( \mathrm{cancer}\ |\ \mathrm{positive\ test} )\) = ?

How can we compute the positive predictive value (PPV) from the information provided by the problem? In the following, we sketch three different paths to the solution.

Using Bayes’ formula

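One way to solve problems concerning conditional probabilities is to remember and apply Bayes’ formula (which is why such problems are often called problems of “Bayesian reasoning”):

\[ p(\mathrm{cancer}\ |\ \mathrm{positive\ test}) = \frac{p(\mathrm{cancer}) \cdot p(\mathrm{positive\ test}\ |\ \mathrm{cancer})}{p(\mathrm{cancer}) \cdot p(\mathrm{positive\ test}\ |\ \mathrm{cancer}) + p(\mathrm{no\ cancer}) \cdot p(\mathrm{positive\ test}\ |\ \mathrm{no\ cancer})} \]

By inserting the probabilities identified above and knowing that the probability for the absence of breast cancer in our target population is the complementary probability of its presence (i.e., \(p(\mathrm{no\ cancer}) = 1 -\) prev \(= 99\%\)), we obtain:

\[ p(\mathrm{cancer}\ |\ \mathrm{positive\ test}) = \frac{.01 \cdot .80}{.01 \cdot .80 + .99 \cdot .096} = \frac{.008}{.10304} \approx .0776 \]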

Thus, the information above and a few basic mathematical calculations tell us that the likelihood of a woman in our target population with a positive mammography screening test actually having breast cancer (i.e., the PPV of this mammography screening test) is slightly below 8%.

Using natural frequencies

If you fail to find the Bayesian solution easy and straightforward, you are in good company: Even people who have studied and taught statistics find it difficult to think in these terms. Fortunately, researchers have found that a simple change in representation renders the same information much more transparent.

Consider the following problem description:

Mammography screening (freq)

10 out of every 1000 women at age 40 who participate in routine screening have breast cancer.
8 out of every 10 women with breast cancer will get a positive mammography.
95 out of every 990 women without breast cancer will also get a positive mammography.

Here is a new representative sample of women at age 40 who got a positive mammography in a routine screening.
How many of these women do you expect to actually have breast cancer?

(Hoffrage et al., 2015, p. 4)

Importantly, this version (freq) of the problem refers to a group of \(1000\) individuals from our original target population. It still provides the same probabilities as above, but specifies them in terms of natural frequencies (see Gigerenzer & Hoffrage, 1999, and Hoffrage et al., 2002, for clarifications of this concept):

The prevalence of breast cancer in the target population:

prev = \(p(\mathrm{cancer}) = \frac{10}{1000} (= 1\%)\)

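The sensitivity of the mammography screening test, which is the conditional probability:

sens = \(p(\mathrm{positive\ test}\ |\ \mathrm{cancer}) = \frac{8}{10} (= 80\%)\)

The false alarm rate of the mammography screening test, which is the conditional probability:

fart = \(p(\mathrm{positive\ test}\ |\ \mathrm{no\ cancer}) = \frac{95}{990} (\approx 9.6\%)\)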

Rather than asking us to compute a conditional probability (i.e., the PPV), the task now prompts us to imagine a new representative sample of women from our target population and focuses on the women with a positive test result. It then asks for a frequency: “How many of these women” do we expect to have cancer?

To provide any answer in terms of frequencies, we need to imagine a specific sample size \(N\). As the problem referred to a population of \(1000\) women, we conveniently pick a sample size of \(N = 1000\) women with identical characteristics (which is suggested by mentioning a “representative” sample) and ask: How many women with a positive test result actually have cancer? [2]

In this new sample, the frequencies of women with cancer and with a positive test result should match the numbers of the original sample. Hence, we can assume that \(10\) out of \(1000\) women have cancer (prev) and \(8\) of the \(10\) women with cancer receive a positive test result (sens). Importantly, \(95\) out of the \(990\) women without cancer also receive a positive test result (fart). Thus, the number of women with a positive test result is \(8 + 95 = 103\), but only \(8\) of them actually have cancer. The ratio \(\frac{8}{103}\) (about 7.8%) agrees with our previous probability of slightly below 8%. Incidentally, the reformulation in terms of frequencies protected us from erroneously taking the sensitivity (of sens = \(\frac{8}{10} = 80\%\)) as an estimate of the desired quantity. Whereas it is easy to confuse the term \(p( \mathrm{positive\ test}\ |\ \mathrm{cancer} )\) with \(p( \mathrm{cancer}\ |\ \mathrm{positive\ test} )\) when the task is expressed in terms of probabilities, it is clearly unreasonable to assume that about 800 of 1000 women (i.e., 80%) actually have cancer (since the prevalence in the population was specified to be 10 in 1000, i.e., 1%). Thus, reframing the problem in terms of frequencies made us immune to a typical mistake.
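The counting argument above can be retraced in a few lines of base R. This is only an illustrative sketch: the variable names are chosen here for readability and are not riskyr objects, and the numbers are those of the frequency version of the problem.

```r
# Natural-frequency version of the mammography problem (numbers from the text).
N            <- 1000          # women in the imagined sample
n_cancer     <- 10            # 10 out of 1000 women have breast cancer (prev)
n_no_cancer  <- N - n_cancer  # 990 women without breast cancer
hits         <- 8             # 8 of the 10 women with cancer test positive (sens)
false_alarms <- 95            # 95 of the 990 women without cancer test positive (fart)

n_positive <- hits + false_alarms  # all women with a positive test
ppv_freq   <- hits / n_positive    # share of positive tests that indicate cancer

n_positive  # 103
ppv_freq    # 8/103, i.e., about 0.0777
```

Note that no formula was needed: the answer is a simple ratio of two counts.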

Using riskyr

Reframing the probabilistic problem in terms of frequencies made its solution easier. This is neat and probably one of the best tricks in risk literacy education (as advocated by Gigerenzer & Hoffrage, 1995; Gigerenzer, 2002, 2014). While it is good to have a way to cope with tricky problems, it would be even more desirable to actually understand the interplay between probabilities and frequencies in risk-related tasks and domains. This is where riskyr comes into play. [3]

riskyr provides a set of basic risk literacy tools in R. As we have seen, the problems humans face when dealing with risk-related information are less of a computational, and more of a representational nature. As a statistical programming language, R is a pretty powerful computational tool, but for our present purposes it is more important that R is also great for designing and displaying aesthetic and informative visualizations. By applying these qualities to the task of training and instruction in risk literacy, riskyr is a toolbox that renders risk literacy education more transparent.

Risk vs. uncertainty

To clarify the concept of “risk” used in this context: In both basic research on the psychology of judgment and decision making and more applied research on risk perception and risk communication, the term risk refers to information or decisions for which all options and their consequences are known and probabilities for the different outcomes can be provided. This notion of risk is typically contrasted with the wider notion of uncertainty in which options or probabilities are unknown or cannot be quantified. [4]

For our present purposes, the notion of risk-related information refers to any scenario in which some events are determined by probabilities. A benign example of a risk-related situation is the riskyr start-up message: Every time you load the package, the dice are cast and determine which particular message (out of a range of possible messages) is shown. Even if you notice this, determining the exact probability of a message would require extensive experience, explicit information (e.g., us telling you that we use R’s sample function to randomly select 1 of 5 possible messages), or cheating (by peeking at the source code). In real life, the events of interest are vastly more complex and numerous and both our experience and options for cheating are subject to hard constraints. Thus, provided we do not want to regress into superstition, we need science to figure out probabilities and transparent risk communication for understanding them.

Promoting risk perception and communication

riskyr facilitates risk perception and promotes a deeper understanding of risk-related information in three ways [5]:

by organizing data structures and computational functions in useful ways;

by providing translations between probabilities and frequencies;

by providing transparent visualizations that illustrate relationships between variables and representations.

If others find the ways in which riskyr computes, transforms, and represents risks helpful or useful, riskyr may facilitate teaching and training efforts in risk literacy and generally hopes to promote a more transparent communication of risk-related information. In the following, we show how we can address the above problem by using three types of tools provided by riskyr.

A. A fancy calculator

riskyr provides a set of functions that allows us to calculate various desired outputs (probabilities and frequencies) from given inputs (probabilities and frequencies). For instance, the following function computes the positive predictive value PPV from the 3 essential probabilities prev, sens, and spec (with spec = 1 – fart) that were provided in the original problem:
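As a minimal base-R sketch of this computation, the helper below spells out Bayes' formula for the three inputs named above. The name `ppv_from_probs` is hypothetical and is not part of riskyr; it only illustrates what such a function has to do.

```r
# Hypothetical helper (not a riskyr function): PPV via Bayes' formula.
ppv_from_probs <- function(prev, sens, spec) {
  fart     <- 1 - spec           # false alarm rate as complement of spec
  p_hit    <- prev * sens        # p(cancer and positive test)
  p_falarm <- (1 - prev) * fart  # p(no cancer and positive test)
  p_hit / (p_hit + p_falarm)     # p(cancer | positive test)
}

ppv <- ppv_from_probs(prev = .01, sens = .80, spec = 1 - .096)
round(ppv, 7)  # 0.0776398, i.e., slightly below 8%
```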

It’s good to know that riskyr can apply Bayes’ formula, but so can any other decent calculator — including my brain on a good day and some environmental support in the form of paper and pencil. The R in riskyr only begins to make sense when considering functions like the following: comp_prob_prob computes probabilities from probabilities (hence its name). More precisely, comp_prob_prob takes 3 essential probabilities as inputs and returns a list of 13 probabilities as its output:
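The logic of accepting either spec or fart can be sketched in base R as follows. The function `probs_from_essentials` is a hypothetical stand-in written for this illustration. It is not riskyr's comp_prob_prob and returns only a handful of probabilities rather than the full list. The names `p_pos` and `NPV` (negative predictive value) are likewise assumptions of this sketch.

```r
# Hypothetical stand-in (not riskyr's comp_prob_prob): derive a few
# probabilities from prev, sens, and either spec or fart.
probs_from_essentials <- function(prev, sens, spec = NA, fart = NA) {
  if (is.na(spec) && is.na(fart)) stop("Provide spec or fart.")
  if (is.na(spec)) spec <- 1 - fart  # complete the missing complement
  if (is.na(fart)) fart <- 1 - spec
  if (!isTRUE(all.equal(spec + fart, 1))) stop("spec and fart must sum to 1.")
  p_pos <- prev * sens + (1 - prev) * fart  # p(positive test)
  list(prev = prev, sens = sens, spec = spec, fart = fart,
       PPV = prev * sens / p_pos,               # p(cancer | positive test)
       NPV = (1 - prev) * spec / (1 - p_pos))   # p(no cancer | negative test)
}

# Three ways of providing the same information:
p1 <- probs_from_essentials(prev = .01, sens = .80, fart = .096)
p2 <- probs_from_essentials(prev = .01, sens = .80, spec = .904)
p3 <- probs_from_essentials(prev = .01, sens = .80, spec = .904, fart = .096)

all.equal(p1, p2)  # TRUE (up to floating-point tolerance)
all.equal(p2, p3)  # TRUE
p1$PPV             # about 0.0776
```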

The probabilities provided need to include a prevalence prev, a sensitivity sens, and either the specificity spec or the false alarm rate fart (with spec = 1 – fart). The code above illustrates 3 different ways in which 3 of these “essential” probabilities can be provided. Thus, the assigned objects p1, p2, and p3 are all equal to each other.

The probabilities computed from these “essential” probabilities include the PPV, which can be obtained by asking for p1$PPV = 0.0776398. But the object computed by comp_prob_prob is actually a list of 13 probabilities and can be inspected by printing p1:

The list of probabilities computed includes the 3 essential probabilities (prev, sens, and spec or fart) and the desired probability (p1$PPV = 0.0776398), but also many other probabilities that may have been asked for instead. (See the vignette on data formats for details on these probabilities.)

Incidentally, as R does not care whether probabilities are entered as decimal numbers or fractions, we can check whether the 2nd version of our problem — the version reframed in terms of frequencies — yields the same solution:

B. Translating between representational formats

By providing our original probabilities to the function comp_freq_prob, we can compute a list of frequencies from probabilities (hence the name). To compute frequencies for the specific sample size of 1000 individuals, we need to provide N = 1000 as an additional argument. As before, it does not matter whether the probabilities are supplied as decimal numbers or as ratios (as long as they actually are probabilities, i.e., numbers from 0 to 1).

As the ratio fart = 95/990 is not exactly equal to fart = .096 (but rather 95/990 ≈ 0.09596), the two versions of our problem actually differ slightly. Here, the results f1 and f2 are only identical because the function comp_freq_prob rounds to the nearest integers by default. To compute more precise frequencies (that are no longer rounded to integers), use the round = FALSE argument:
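The effect of rounding can be illustrated with a small base-R sketch. The function `freq_from_probs` is a hypothetical stand-in (it is not riskyr's comp_freq_prob and returns fewer frequencies), but it uses the argument name `round` and the element names cond_true, cond_false, hi, mi, fa, and cr that appear in this guide.

```r
# Hypothetical stand-in (not riskyr's comp_freq_prob): split N individuals
# into subgroups, optionally rounding frequencies to integers.
freq_from_probs <- function(prev, sens, fart, N, round = TRUE) {
  cond_true <- N * prev
  if (round) cond_true <- base::round(cond_true)
  cond_false <- N - cond_true
  hi <- sens * cond_true   # positive tests among cond_true cases
  fa <- fart * cond_false  # positive tests among cond_false cases
  if (round) {
    hi <- base::round(hi)
    fa <- base::round(fa)
  }
  list(N = N, cond_true = cond_true, cond_false = cond_false,
       hi = hi, mi = cond_true - hi, fa = fa, cr = cond_false - fa)
}

f1 <- freq_from_probs(prev = .01, sens = .80, fart = .096,   N = 1000)
f2 <- freq_from_probs(prev = .01, sens = .80, fart = 95/990, N = 1000)
all.equal(f1, f2)  # TRUE: rounding hides the small difference

f3 <- freq_from_probs(prev = .01, sens = .80, fart = .096, N = 1000, round = FALSE)
f3$fa              # 95.04 rather than 95
```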

In this list, the sample of N = \(1000\) women is split into subgroups in 3 different ways. For instance, the \(10\) women with cancer appear as cond_true cases, whereas the \(990\) without cancer are listed as cond_false cases. The \(8\) women with cancer and a positive test result appear as hits hi and the \(95\) women who receive a positive test result without having cancer are listed as false alarms fa. (See the vignette on data formats for details on all frequencies.)

Computing probabilities from frequencies

A translator between 2 representational formats should work in both directions. Consequently, riskyr also allows us to compute probabilities by providing frequencies:
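The reverse direction amounts to forming ratios of counts. The following base-R sketch uses a hypothetical helper `probs_from_freqs` (not riskyr's comp_prob_freq, and returning only a few probabilities) together with the four cell frequencies of the mammography example.

```r
# Hypothetical stand-in (not riskyr's comp_prob_freq): probabilities
# from the 4 frequencies hi, mi, fa, and cr.
probs_from_freqs <- function(hi, mi, fa, cr) {
  N <- hi + mi + fa + cr
  list(prev = (hi + mi) / N,   # share of cond_true cases
       sens = hi / (hi + mi),  # positive tests among cond_true cases
       spec = cr / (fa + cr),  # negative tests among cond_false cases
       fart = fa / (fa + cr),  # positive tests among cond_false cases
       PPV  = hi / (hi + fa))  # cond_true cases among positive tests
}

p_from_f <- probs_from_freqs(hi = 8, mi = 2, fa = 95, cr = 895)
p_from_f$prev  # 0.01
p_from_f$sens  # 0.8
p_from_f$PPV   # 8/103, i.e., about 0.0777
```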

Fortunately, comp_prob_freq does not require all 11 frequencies that were returned by comp_freq_prob and contained in the list of frequencies f1. Instead, we only need to provide comp_prob_freq with the 4 essential frequencies that were listed as hi, mi, fa, and cr in f1. The resulting probabilities (saved in p5) match our list of probabilities from above (saved in p4):

# Check equality of outputs:
all.equal(p5, p4)

Switching back and forth

More generally, when we translate between formats twice — first from probabilities to frequencies and then from the resulting frequencies to probabilities — the original probabilities appear again:

To obtain the same results when translating back and forth between probabilities and frequencies, it is important to switch off rounding when computing frequencies from probabilities with comp_freq_prob. Similarly, we need to scale the computed frequencies to the original population size N to arrive at the original frequencies.
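Why rounding matters for the round trip can be seen in a short base-R sketch. The function `round_trip` is a hypothetical helper written for this illustration and is not riskyr code: it converts probabilities into frequencies and immediately back into probabilities.

```r
# Base-R sketch (not riskyr code): probabilities -> frequencies -> probabilities.
prev <- .01; sens <- .80; fart <- .096; N <- 1000

round_trip <- function(prev, sens, fart, N, round) {
  cond_true  <- N * prev
  cond_false <- N - cond_true
  hi <- sens * cond_true
  fa <- fart * cond_false
  if (round) {
    hi <- base::round(hi)
    fa <- base::round(fa)
  }
  # Translate the frequencies back into probabilities:
  c(prev = cond_true / N, sens = hi / cond_true, fart = fa / cond_false)
}

exact   <- round_trip(prev, sens, fart, N, round = FALSE)
rounded <- round_trip(prev, sens, fart, N, round = TRUE)

all.equal(unname(exact), c(prev, sens, fart))  # TRUE: original values recovered
rounded[["fart"]]                              # 95/990, no longer .096
```

With rounding switched on, the false alarm rate comes back as 95/990 rather than .096, so the original probabilities are only recovered approximately.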

All at once: Defining a riskyr scenario

In the likely case that you are less interested in specific metrics or formats and only want to get a quick overview of the key variables of a risk-related situation, you can always define a riskyr scenario:

There are many other types of plots and customization options available. In the following, we explain different visualizations by introducing their corresponding plotting functions. For the impatient: The Quick start primer explains how to directly plot different visualizations of a given scenario.

C. Visualizing relationships between formats and variables

Inspecting the lists of probabilities and frequencies shows that the two problem formulations cited above are only two possible instances out of an array of many alternative formulations. Essentially, the same scenario can be described in a variety of variables and formats. Gaining deeper insights into the interplay between these variables requires a solid understanding of the underlying concepts and their mathematical definitions. To facilitate the development of such an understanding, riskyr recruits the power of visual representations and shows the same scenario from a variety of angles and perspectives. It is mostly this graphical functionality that supports riskyr’s claim on being a toolbox for rendering risk literacy more transparent. Thus, in addition to being a fancy calculator and a translator between formats, riskyr is mostly a machine that turns risk-related information into pretty pictures.

riskyr provides many alternative visualizations that depict the same risk-related scenario in the form of different representations. As each type of graphic has its own properties and perspective (strengths that emphasize or illuminate some particular aspect and weaknesses that hide or obscure others), the different visualizations are somewhat redundant, yet complement and support each other. [6]

Here are some examples that depict particular aspects of the scenario described above:

Icon array

A straightforward way of plotting an entire population of individuals is provided by an icon array that represents each individual as a symbol which is color-coded:

An icon array showing the mammography scenario for a population of 1000 individuals.

Tree diagram

Perhaps the most intuitive visualization of the relationships between probability and frequency information in our above scenario is provided by a tree diagram that shows the population and the frequency of subgroups as its nodes and the probabilities as its edges:

A tree diagram that applies the provided probabilities and frequencies to a population of 1000 individuals.

Importantly, the plot_prism function plots a simple frequency tree when provided with a single perspective argument by = "cd" (as opposed to its default by = "cddc"). Here, it is called with the same 3 essential probabilities (prev, sens, and spec) and 1 frequency (the number of individuals N of our sample or population). But in addition to computing risk-related information (e.g., the number of individuals in each of the 4 subgroups at the 2nd level of the tree), the tree diagram visualizes crucial dependencies and relationships between frequencies and probabilities. For instance, the diagram illustrates that the number of true positives (hi) depends on both the condition’s prevalence (prev) and the decision’s sensitivity (sens), or that the decision’s specificity spec can be expressed and computed as the ratio of the number of true negatives (cr) divided by the number of unaffected individuals (cond_false cases).

Area plot

An alternative way to split a group of individuals into subgroups depicts the population as a square and dissects it into various rectangles that represent parts of the population. In the following area plot (aka. mosaic plot), the relative proportions of rectangle sizes represent the relative frequencies of the corresponding subgroups:

An area plot in which area sizes represent the probabilities/relative frequencies of subgroups.

The vertical split dissects the population into two subgroups that correspond to the frequency of cond_true and cond_false cases in the tree diagram above. The prev value of 1% yields a slim vertical rectangle on the left. (For details and additional options of the plot_area function, see the documentation of ?plot_area.)

Bar plot

The tree diagrams from above can also be depicted as a vertical bar plot:

(The plot is not shown here, but please go ahead and create it for yourself.)

However, due to a number of categories with a very low number of members, many bars are barely visible (which is why they are not shown here).

Prism plot

The prism plot (called network diagram in version 0.1.0 of riskyr) is a generalization of the tree diagram (see Wassner et al., 2004). It plots 9 different frequencies (computed by comp_freq_prob and comp_freq_freq and contained in freq) as nodes of a single graph and depicts 12 probabilities (computed by comp_prob_prob and comp_prob_freq and contained in prob) as edges between these nodes. Thus, the network/prism diagram integrates 2 perspectives of the tree diagrams. By scaling nodes by either frequency or probability, riskyr visualizes the interplay of frequencies and probabilities in a variety of ways:

A prism plot that integrates 2 tree diagrams and represents relative frequency by area size.

Changing the by and area options provides a variety of perspectives on a scenario. Additionally, all plotting functions provide summary information on scenario parameters and current accuracy metrics on the lower margin. Setting mar_notes = FALSE gets rid of this display. Similarly, setting scen_lbl = "" removes the title and sub-title above the diagrams.

Alternative perspectives

The variety of different graphical options provided by riskyr can be overwhelming at first – but fortunately the default options work reasonably well in most cases. In the following, we illustrate some additional parameters and points, and trust that you can evaluate and explore the corresponding commands yourself.

Both the tree/prism diagrams and the mosaic plots shown above adopted a particular perspective by splitting the population into 2 subgroups by condition (via the option by = "cd"). Rather than emphasizing the difference between cond_true and cond_false cases, an alternative perspective could ask: How many people are detected as positive vs. negative by the test? By using the option by = "dc", the tree diagram splits the population by decisions into dec_pos and dec_neg cases:

The same perspectives can also be applied to other plots. For instance, the population of the area/mosaic plot can first be split by decisions (i.e., horizontally) by specifying the options by = "cddc" and p_split = "h":

riskyr uses a consistent color scheme to represent the same subgroups across different graphs. If this color coding is not sufficient, plotting the tree diagram with the option area = "hr" further highlights the correspondence by representing the relative frequencies of subgroups by the proportions of rectangles:

Incidentally, as both an icon array and a mosaic plot depict probability by area size, both representations can be translated into each other. Even when relaxing the positional constraint of icons in the icon array, the similarity is still visible:

[2] The actual sample size N chosen is irrelevant, but the numbers are easier to calculate when N is a round number and at least as large as the frequencies mentioned in the problem.

[3] Full disclosure: As enthusiastic students and colleagues of Gerd Gigerenzer, we think that his recommendations are insightful, convincing, and correct. However, while expressing probabilities in terms of natural frequencies promotes a better understanding of risks, it does not automatically lead to a better understanding of conditional probabilities per se. riskyr extends beyond translations between representational formats by visualizing the interplay between frequencies and probabilities in a variety of ways.

[4] See Gigerenzer and Gaissmaier (2011) and Neth and Gigerenzer (2015) for discussions of different types of decision strategies that correspond to the distinction between risk and uncertainty.

[5] The riskyr logo (showing 3 different dice facets) also represents its functionality in several ways:
1. First, each facet provides a frequency (e.g., the number \(3\)). Nevertheless, a die is a paradigmatic example of a device that generates probabilities. (See Strevens, 2013, for inferring probabilistic properties from physical devices.)
2. The 3 facets indicate the 3 perspectives that classify a population of \(N\) individuals into 3 binary categories of medical diagnostics: (1) by condition (true vs. false), (2) by decision (positive vs. negative), and (3) by accuracy/correspondence (correct vs. incorrect).
3. Any individual facet is informative by itself, and is often all that is of interest. However, to really understand the mechanism of the risk-generating device, it is crucial to view the dice from multiple angles. riskyr provides alternative perspectives that, when viewed together, render issues of risk literacy more transparent.
4. The 3 facets also hint at riskyr’s 3 key functions of (1) organizing information, (2) translating between representational formats, and (3) visualizing relationships between variables.

[6] Although we tend to be enthusiastic about the potential of visualizations, we should not expect graphs to provide a magic potion for solving all problems of understanding. (For instance, see Micallef et al. (2012) and Khan et al. (2015) for somewhat sceptical and sobering studies on the potential benefits of different static representations of Bayesian problems.)