Abstract

Food additives play important roles in modern food manufacturing, including adding color and flavor, extending shelf life, and improving the processing, storage and transportation of food. This review examines a number of issues that have recently been raised about food additive safety assessments, including the concepts of adverse and cumulative effects and whether current testing approaches adequately evaluate health outcomes of concern. We include recommendations for how food safety evaluations could be improved, including: 1) adopting methods to comply with the statutory requirement to include cumulative effects of chemicals in the diet in the safety assessment of additives, 2) including more sensitive endpoints in hazard assessments that better represent human health outcomes, 3) using systematic review methods to improve regulatory decision making, and 4) using improved screening methods to prioritize chemical evaluations.

Introduction

Food processing has been around for thousands of years [1]. It was not until the early 20th century, though, that some ready-to-eat foods became available thanks to new technologies. The second half of the century saw incredible growth in the food manufacturing industry [2]. This was made possible, in part, by the development and mass production of chemicals including food dyes, preservatives, flavouring substances and chemicals used to manufacture new materials for packaging, handling and manufacturing equipment. These chemicals are usually referred to as food additives¹ and quickly became a staple of the American diet. The rapid growth in the use of additives prompted Congress to investigate their potential impact on the public’s health and, as a result, the Food Additive Amendment (FAA) of 1958 to the Federal Food, Drug, and Cosmetic Act (FDCA) of 1938 was passed (Public Law 85-829, 72 Stat. 1784). It is clear from the Congressional investigation and hearings leading to the passage of the FAA [3] that both the US Food and Drug Administration (FDA) and the scientific community were concerned that many of the chemicals being added to the food supply were not adequately tested and might cause harmful effects after being regularly consumed for long periods of time. The FAA’s intent was to protect the public from harmful chemicals by requiring an affirmation of safety by the FDA, testing before chemicals are used in or on foods, and consideration of the chronic and cumulative exposures that could cause chronic health effects in consumers [2].

Despite the good intentions of the law, diseases associated with dietary intake have become increasingly prevalent [4,5] and some have been associated with exposure to additives. For example, some ortho-phthalates have been associated with altered male reproductive development [6] and delayed neurological development [7]. Ortho-phthalate contamination in food is widespread [8-10]; these food additives were approved decades ago for a variety of uses, the most common being as plasticizers to process and package food (see for example [11]). It is also worth noting the mounting scientific evidence showing that some chemicals, including endocrine-disrupting substances, can interact with biological systems at exceedingly low, chronic levels of exposure [12-14]. Further, concerns have been raised that some low doses may be capable of inducing harm in human populations, especially when exposures occur during gestation or early childhood. In fact, numerous epidemiology studies suggest associations between food additive exposures and adverse outcomes in some human subpopulations (reviewed in [15-18]).

If safety evaluations have been conducted on these chemicals, why is there a gap between the safety expectation (e.g., that human exposures should be reasonably certain not to lead to harm from expected uses) and the observed effects in human populations (e.g., increased diseases associated with low-level exposures)? This review will focus on this central question and will address how inadequacies in the evaluation of environmental chemicals, and food additives in particular, may have contributed to this regulatory gap. We will start by reviewing some of the central issues that have been raised in the past five years and describe examples that shed light on weaknesses of the food additive safety evaluation process, including issues related to the quantity and quality of safety data, the lack of independent evaluations of safety, and failures to enforce statutory requirements. We will then examine in more depth two specific issues that are relevant to the evaluation of food additives: adverse effects and cumulative effects. Finally, we will make recommendations to improve the evaluation of food additives and other environmental chemicals.

Prior concerns raised about food additives safety assessments

In 2011, participants in a multi-stakeholder meeting raised concerns about the FDA’s practice for allowing chemicals into the food supply [19]. An overarching theme was the agency’s failure to take into account best knowledge and methods for assessing safety. Specifically, some scientists argued that three issues were important roadblocks to adequately assessing safety: 1) the lack of a definition of adverse effects; 2) the use of test doses outside the range that would be considered low and intended for human consumption; and 3) the assumption that all chemicals have monotonic dose-responses (i.e., responses that obey the toxicological assumption that ‘the dose makes the poison’, where more of an exposure leads to more of an effect) and thresholds below which they are considered safe [20]. Regarding this latter issue, FDA was advised in 1982 that its long-held assumption that additives have thresholds below which no hazard exists was “scientifically untenable” [21]. Further, the Select Committee on GRAS Substances (SCOGS) [22] noted that assuming that no adverse effects occur after exposures to low doses ignores the possibility of accumulation in tissues of slowly excreted chemicals and slow irreversible functional alterations in vital organs. Contrary to SCOGS’s suggestion, FDA has continued to rely on the assumption that chemicals at low doses pose no risk to human health [23,24].

Other areas that have raised questions in the last several years are the quantity and quality of information available to make safety decisions. One study showed that there is very little publicly available data on food additives; fewer than 22% of almost 4000 chemicals directly added to food (e.g., preservatives, emulsifiers and sweeteners) have adequate data to estimate a safe level of exposure [25]. The paucity of data is worst for reproductive and developmental studies: fewer than 7% of chemicals were tested for these effects [26].

The lack of independent scientific judgment could also influence the outcome of a safety decision [25]. Scientists with real or perceived conflicts of interest could be less critical of the data, less inclined to request more information, and more prone to rely on professional judgment. FDA has recognized the issue and has been working on developing guidance on the qualifications of scientists making decisions and on limiting conflicts of interest [27].

Perhaps more worrisome than the concerns mentioned above is the failure to fully enforce the statutory mandate to consider the cumulative effect of chemicals present in the diet, both those that are structurally related and those that cause similar biological effects (i.e., are pharmacologically related) [28]. The lawmakers who passed the FAA, the law that gave FDA the authority to ensure the safety of chemicals added to the food supply, understood that multiple additives are present in the diet. The goal of protecting consumers’ health can best be accomplished by assessing the safety of chemicals cumulatively rather than individually, contrary to what has been done for almost 60 years [29].

Identifying ‘adverse effects’

Chemicals are tested with the expectation of preventing adverse health effects as a result of exposure. Toxicity tests are used to calculate ‘safe’ doses of exposure for the general public that, with reasonable certainty, will cause no harm to humans [30]. They are not designed to predict effects of chemicals on specific disease endpoints in humans. Typically, the effects of chemicals are evaluated using test guidelines, which are well-described protocols for evaluating chemical toxicity using internationally validated methods and endpoints [31]. Yet, endpoints measured in studies that do not follow regulatory testing guidelines may be more sensitive measures that map more appropriately to human diseases [32-34]. Unfortunately, most risk assessors consider non-guideline endpoints unreliable because they have not been sufficiently validated (e.g., demonstrated to be reproducible) to be included in guidelines. Although endpoints included in test guidelines are typically considered ‘overt’ signs of toxicity, with obvious relevance to adversity, determining whether non-guideline endpoints are ‘adverse’ is more challenging. One reason is that different agencies use different criteria to characterize the effects of chemicals as adverse (or not). The US Environmental Protection Agency’s (EPA) definition of an adverse effect is “a biochemical change, functional impairment, or pathologic lesion that affects the performance of the whole organism, or reduces an organism’s ability to respond to an additional environmental challenge” [35]. The World Health Organization’s International Programme on Chemical Safety (IPCS) defined an adverse effect as “a change in morphology, physiology, growth, development or lifespan of an organism which results in impairment of functional capacity or impairment of capacity to compensate for additional stress or increase in susceptibility to the harmful effects of other environmental influences” [36].

Even considering these definitions, there are concerns about how non-guideline endpoints are interpreted by regulators and risk assessors. For example, chemical-induced changes in sexual behaviors, memory and cognitive function, aggressive behaviors, increased body weight, responses to allergens and hormones, timing of puberty, altered serum concentrations of hormones, and many others are often determined to be ‘not adverse’. Moreover, stakeholders seldom have the opportunity to learn about and challenge the rationale behind decisions about what should be considered adverse, because decision-making processes are not transparent [37,38].

Other agencies, including the European Food Safety Authority (EFSA) and the FDA, have not made public their definitions of ‘adverse effects’, leading us to conclude that decisions about whether an observed effect is characterized as ‘adverse’ are left up to the judgement of individual risk assessors. This lack of clarity has been the source of many controversies about the safety of chemicals, including debates about chemicals with demonstrable endocrine-disrupting properties that may cause chronic health effects [39,40]. Ambiguities in the use of terms like ‘harm’ or ‘adverse’ add another opportunity for variability in regulatory action, depending on the experience and opinion of the risk assessor [38], in an already non-transparent decision-making process.

Endpoints evaluated in guideline tests are not comprehensive

It is an unreasonable expectation that a single study, even a guideline study, could evaluate all adverse outcomes. Test guidelines were developed to examine overt signs of toxicity (e.g., changes to number of live births, altered organ weights, abnormal histopathological evaluations) [31], but cannot evaluate all diseases of interest. For example, there is no standardized test guideline that can evaluate whether chemical exposures induce endometriosis, hypertension, preeclampsia, asthma, autism, autoimmune diseases, or any number of other diseases of public health concern in human populations [41-43]. Simply put, guideline tests evaluate a limited number of endpoints, many of which are unrelated, and perhaps irrelevant, to the complex health outcomes that must be identified to better protect consumers.

Within the context of test guidelines, some endpoints will be affected by a chemical at much lower doses than are needed to affect other endpoints. Those that are affected by the lowest dose are the most sensitive. The question is whether test guideline endpoints are as sensitive as they could be, or in other words, as sensitive as those measured in non-guideline studies. If the endpoints that are evaluated in standard test guidelines were truly sensitive it might not matter if some disease endpoints are missing. The concern is that endpoints that are absent from test guidelines might be more sensitive than the standardized endpoints that are included [32,44], and thus the doses that are used to calculate ‘safe’ levels of exposure could be significantly higher than true no-observed-adverse-effect-levels (NOAELs) [14,45].
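The dependence of a ‘safe’ dose on the most sensitive endpoint examined can be illustrated with a minimal sketch. All endpoint names and NOAEL values below are hypothetical, and the 100-fold uncertainty factor is the conventional default (10 for animal-to-human extrapolation multiplied by 10 for human variability), used here as an assumption rather than taken from any specific assessment discussed in this review.

```python
# Hypothetical NOAELs (mg/kg body weight/day) for one imaginary chemical.
# None of these numbers come from a real study.
GUIDELINE_NOAELS = {"organ_weight": 50.0, "live_births": 25.0}
NON_GUIDELINE_NOAELS = {"serum_hormone_level": 0.5}


def reference_dose(noaels, uncertainty_factor=100.0):
    """Return (critical endpoint, reference dose) based on the lowest NOAEL.

    The endpoint with the lowest NOAEL is the most sensitive one; dividing
    that NOAEL by the uncertainty factor gives the 'safe' dose.
    """
    if not noaels:
        raise ValueError("at least one endpoint is required")
    endpoint = min(noaels, key=noaels.get)
    return endpoint, noaels[endpoint] / uncertainty_factor


# Using guideline endpoints only.
guideline_only = reference_dose(GUIDELINE_NOAELS)

# Using every available endpoint, guideline and non-guideline.
all_endpoints = reference_dose({**GUIDELINE_NOAELS, **NON_GUIDELINE_NOAELS})

print("guideline only:", guideline_only)
print("all endpoints: ", all_endpoints)
```

In this invented case the reference dose derived from guideline endpoints alone is 50 times higher than the one obtained when the more sensitive non-guideline endpoint is included, which is the kind of gap the paragraph above describes.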

Recent collaborative efforts between scientists at the FDA and academic laboratories funded by the National Institutes of Health have sought to determine whether these differences exist [45]. Results from this collaborative study showed few effects of bisphenol-A (BPA), a known endocrine disruptor, on guideline endpoints [46], but effects at low doses on non-guideline endpoints (see for example [47,48]). Similar results have been shown for other guideline-based studies of BPA that have also included non-guideline endpoints [49-51]. Collectively, these inclusions of non-guideline endpoints within traditional test guidelines suggest that effects are likely to be missed when only guideline endpoints are examined, thus potentially skewing the derivation of reference doses that are considered ‘safe’ for the general population toward higher levels of exposure.

Further evaluations of guideline studies have revealed additional complications that may interfere with the identification of adverse effects, including problems with controlling chemical contamination [52], failure of the positive control to induce adverse effects [53], discrepancies between negative controls run concurrently with the test chemical and historical controls [54], and other technical problems [55,56]. For these reasons, guideline studies should be evaluated with caution, especially when their results are contradicted by non-guideline studies; hazard assessments should utilize all available data, including data collected from non-guideline studies, to derive NOAEL doses to be used in risk assessments [57,58].

Distinguishing adverse and adaptive effects

There are some concerns that non-guideline endpoints might represent adaptive, rather than adverse effects [59]. In 2002, Lewis and colleagues wrote, “Living organisms have a capacity to respond to environmental variations and stresses, whether physical or chemical, in order to maintain normal function and survival”. Physiological processes are regulated by hormonal and enzymatic control systems which operate at the level of the cell, organ or multiple organ systems. Certain effects may be adaptive responses to general chemical exposure and unrelated to inherent toxicity of the test substance itself [60]. A more recent report developed criteria to help distinguish adaptive and adverse effects and defined an adaptive response as “the process whereby a cell or organism responds to a xenobiotic so that the cell or organism will survive in the new environment that contains the xenobiotic without impairment of function” [61].

Of course, this definition requires that an exposed individual must survive without impaired function in a novel environment, but the problem is that the individual cannot predict what that environment might be. A historical example of this is the relationship between gestational growth and adult disease. During the Dutch Hunger Winter of World War II, babies born to mothers without sufficient caloric intake had low birth weight [62]; if the “novel environment” they were born into had limited calories available, they would have displayed ‘adaptive’ responses and survived without disease. However, the caloric restriction experienced by these individuals was short-lived, and those who grew up in an environment with sufficient calories developed metabolic diseases including diabetes, heart disease, liver disease, and stroke in adulthood [63-66]. Studies in rodents revealed that fetal malnutrition permanently alters the tissue organization of multiple organs, including the liver and kidney, due to diversion of blood flow away from these organs to protect brain development in the womb [67]; these developmental changes predisposed the individual to a number of adult diseases, particularly if caloric intake was high. Thus, the same event (fetal malnourishment) causes adaptive effects in those individuals who experience malnutrition in postnatal life, but adverse effects in those individuals with sufficient or plentiful caloric availability in postnatal development. These are the kinds of functional alterations in vital organs that SCOGS warned about in 1982; regardless of whether they are caused by dietary restrictions, chemical exposure or other stressors, they cannot be ruled out. For these reasons, suggestions that an endpoint is ‘adaptive’ should be accompanied by scientific evidence and evaluation of multiple post-exposure environments.

Vulnerable windows of susceptibility: distinguishing adaptive and adverse effects: A common assumption among food additive safety assessors is that individuals adapt to chemical exposures and therefore the effects observed during toxicity testing (e.g., changes in body weight, organ weight, hormone levels) are transitory and will revert to normal when the exposure stops, or the individual will adapt without evidence of harm. This assumption is not only short-sighted but scientifically flawed. First, the majority of animal toxicity testing is conducted on healthy, adult, non-pregnant animals [26]. There are very few studies that evaluate whether discontinuing treatment in the exposed animals leads to reversion of observed adverse effects or to further pathology.

Second, the assumption that effects caused by exposures during development are adaptive lacks scientific grounding. This is especially true for chemicals with endocrine-disrupting properties, because of the role that hormones play in the normal differentiation and development of many if not most of the body’s organs [68]. Development is described as a “one-way street” [41] where a series of events occur in a coordinated manner and there is no opportunity to “re-do” specific developmental events [69]. Both the type and severity of the observed effect and the latency between exposure and the manifestation of these effects will depend on the type of chemical, the duration of the exposure and the dose. Thus, based on the available scientific knowledge, it is appropriate to presume that effects induced by developmental chemical exposures are adverse, and not adaptive, unless clear evidence demonstrates otherwise. Adopting this as a default assumption would reduce the risk of error from the inductive inference.

In 2014, FDA recognized that there are differences in physiology and biological susceptibility between infants and adults [70]. Young children are more susceptible to exposures because their metabolic functions are not fully matured and many organs and systems are still developing and will continue to develop for years (e.g., neurological, immune and reproductive systems). Children also eat more food per unit of body weight and consequently experience greater chemical exposures compared to adults. Yet, the testing guidance for developmental exposures recommended by FDA is narrowly focused on embryo and fetal toxicity, using a study design in which evaluations of exposed offspring occur before the animals are born [71].

The common assumption among toxicologists and risk assessors that individuals ‘adapt’ to persistent exposures to low doses of chemicals stands in stark contrast to the statistics indicating that an estimated 10 million US children have a developmental disability that cannot be explained solely by genetic factors. This represents 15% of all children aged 3-17 years, and data suggest that the prevalence of these conditions is on the rise; between 1997 and 2008, autism increased by 290% and attention-deficit hyperactivity disorder (ADHD) increased by 33% [72]. Similarly, the prevalence of type 2 diabetes increased more than 30% in children aged 10 to 19 years in the span of only 8 years (between 2001 and 2009), while type 1 diabetes increased more than 20% over the same period [73]. Lastly, epidemiology data on specific food additives and their association with health effects ranging from lower IQ to behavioral changes and obesity have been increasing in the last several years [74-77].

Cumulative effects

The American diet has greatly changed in the last 60 years. According to recent data, almost 60% of the diet is composed of ultra-processed foods [78], in other words, foods containing substances such as flavors, colors, sweeteners, emulsifiers and other additives that imitate qualities of unprocessed or minimally processed foods. In 1958, Congress not only gave FDA authority to require testing of chemicals before they were used in food but also mandated that, when assessing safety, the cumulative effects of chemically and pharmacologically related substances in the diet must be considered. The reasoning behind assessing chemicals as a class is that structurally similar substances may have similar toxic effects, and that similar biological effects can be caused by chemicals that do not look alike. By assessing these chemicals cumulatively, as required by the FAA, the logic follows, a more accurate safe level for the class can be determined and the public can be better protected.

Regulatory agencies and scientists have proposed different methods to assess chemicals as a class, but one common feature is the grouping of chemicals that will be assessed together although the specific approaches vary. For instance, EFSA’s approach groups pesticides by similar toxicity effects on a biological system [79]. In a case study evaluating the potential for chemicals to have cumulative effects on the thyroid system, the effects included in the analysis were: (1) a decrease in circulating levels of the thyroid hormone; (2) a decrease in the thyroid hormone action in the body; (3) neurochemical and neuropathological effects; (4) effects on the motor, sensory and autonomic divisions; and (5) developmental neurotoxicity and cognitive end points when available [29]. The experts conducting this evaluation took this broad approach because alterations in these normal functions of the thyroid system could lead to impairment of brain development which was the health outcome of concern.

In contrast, EPA’s approach to grouping pesticides for evaluating cumulative effects is more narrowly focused. The agency states that “a cumulative risk assessment evaluate[s] the potential for people to be exposed to more than one pesticide at a time from a group [of compounds] that share an identified common mechanism of toxicity” (Emphasis added) [80]. EPA defines common mechanism of toxicity as “two or more chemicals or other substances that cause a common toxic effect(s) by the same, or essentially the same, sequence of major biochemical events (i.e., interpreted as mode of action)”. These strict requirements may pose two problems: 1) the mechanism(s) of action of most chemicals has not been identified [79], and 2) chemicals may induce similar adverse effects without sharing the same mode of action [81]. To date, the EPA has performed cumulative risk assessments for only five groups of pesticides (i.e., organophosphates, N-methyl carbamates, triazines, chloroacetanilides, and pyrethrins/pyrethroids). It also concluded that thiocarbamates and dithiocarbamates “do not share a common mechanism of toxicity” and therefore do not qualify to be assessed cumulatively [80].
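The practical difference between grouping by common effect and grouping by common mechanism can be sketched in a few lines. The chemicals, effects and mechanisms below are invented for illustration, and the two key functions are simplified stand-ins for the effect-based and mechanism-based approaches described above, not implementations of either agency's actual procedure.

```python
from collections import defaultdict

# Hypothetical chemicals. A mechanism of None means "not yet identified".
CHEMICALS = [
    {"name": "A", "effect": "reduced thyroid hormone", "mechanism": "m1"},
    {"name": "B", "effect": "reduced thyroid hormone", "mechanism": "m2"},
    {"name": "C", "effect": "reduced thyroid hormone", "mechanism": None},
    {"name": "D", "effect": "liver toxicity", "mechanism": "m3"},
    {"name": "E", "effect": "reduced thyroid hormone", "mechanism": "m1"},
]


def effect_key(chem):
    """Effect-based grouping: a shared toxic effect is enough."""
    return chem["effect"]


def mechanism_key(chem):
    """Mechanism-based grouping: same effect AND same known mechanism."""
    if chem["mechanism"] is None:
        return None  # cannot be grouped until a mechanism is identified
    return (chem["effect"], chem["mechanism"])


def cumulative_groups(chemicals, key_fn):
    """Return groups of two or more chemicals that share a grouping key."""
    groups = defaultdict(list)
    for chem in chemicals:
        key = key_fn(chem)
        if key is not None:
            groups[key].append(chem["name"])
    return {key: names for key, names in groups.items() if len(names) > 1}


by_effect = cumulative_groups(CHEMICALS, effect_key)
by_mechanism = cumulative_groups(CHEMICALS, mechanism_key)

print("effect-based groups:   ", by_effect)
print("mechanism-based groups:", by_mechanism)
```

With these made-up inputs, the effect-based rule assesses A, B, C and E together, whereas the mechanism-based rule assesses only A and E: B drops out because it reaches the same effect through a different mechanism, and C drops out because its mechanism is unknown. These correspond to the two problems noted in the paragraph above.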

Committees of the National Academy of Sciences have also presented their perspectives on best methods to assess cumulative effects of chemicals [81,82]. To our knowledge, the US FDA and the regulated industry continue to assess the safety of additives on an individual basis without considering the biological effects caused by other chemicals on the same organ or systems [29]. This means that for food additives, methods for evaluating cumulative effects, and guidance for how the FDA will define cumulative effects, remain to be developed. The agency provided some insight into its approach to grouping chemically related substances in its decision to ban the use of long-chain perfluoroalkyl chemicals (LC-PFC) in contact with food. In its explanation for the ruling [83], FDA stated that it used the Organisation for Economic Cooperation and Development (OECD) Guidance for Grouping of Chemicals to define the class of LC-PFCs [84].

Recommendations

As SCOGS said in its final 1982 report, “Failure to observe an adverse effect when a substance is widely used for a long time in uncontrolled, casual human applications is insufficient reason to pronounce it safe even at very low levels” [22]. Sir Austin Bradford Hill, a pioneer of epidemiology, noted that actions taken to protect public health should consider the consequences of not acting, stating, “All scientific work is liable to be upset or modified by advancing knowledge. That does not confer upon us a freedom to ignore the knowledge we already have or to postpone the action that it appears to demand at a given time” [85]. The philosopher of science Heather Douglas notes that scientists must consider the consequences of potential errors in scientific policy-making from the perspective of citizens, and that qualitative approaches that consider the need to communicate “weight of evidence” results to the general public can meet the needs of decision-makers [86]. More recently, EPA scientists have explored the concept of public health-focused cumulative risk assessment [87], expanding on discussions previously published by others [29,81]. The authors stated that “there is a need to reframe chemical risk assessment to be more clearly aligned with the public health goal of minimizing environmental exposures associated with disease.” Considering these perspectives, we propose recommendations to improve the evaluation of food additives.

Improvements are needed in risk assessment methods to account for cumulative effects

As discussed above, there are several approaches available to consider how chemicals might act cumulatively; any of them would be a significant improvement over the current situation, namely, FDA’s lack of cumulative assessment methods. However, a broad approach to grouping chemicals focused on health outcomes, similar to the methods used by EFSA, would be preferable. FDA does not need an additional statutory mandate to do this because the requirement has been in the law for almost 60 years.

More sensitive endpoints are needed in standard test guidelines

Although it is unreasonable to expect that guideline studies evaluate all health outcomes, the purpose of these studies is to identify true NOAEL doses. This requires that truly sensitive endpoints are examined and used for the determination of reference doses during the risk assessment process. Efforts have been made to include more sensitive endpoints in guideline studies, and several studies indicate that these non-traditional endpoints identify effects at low doses, often several orders of magnitude below the NOAEL [51]. Validating these endpoints is a long process, but efforts should be undertaken to identify the most sensitive.

One way to include non-guideline endpoints is to change the methods used in risk assessment to move away from decisions based on a single (or a few) key ‘study’ toward an integration of data from all available sources [58,87]. Systematic review methodologies would provide transparent, robust, reproducible means of evaluating study quality and therefore offer improved mechanisms to make regulatory decisions [88]. Several examples illustrate how the use of systematic review methodologies has allowed conclusions to be drawn that are unlikely to be reached by the examination of just one or a few studies [89-91]. Of course, systematic reviews are best performed for chemicals (or groups of chemicals, in the case of evaluations that consider cumulative effects) that already have available data; they are not useful for poorly studied chemicals, a category that includes many food additives [26].

Improved screening will benefit evaluations of cumulative effects and help chemical prioritization

High-throughput screening tools have become available in recent years, including a battery of in vitro tests offered in the US EPA’s ToxCast and US National Toxicology Program’s Tox21 platforms [92]. Employing these assays in the screening of food additives offers two important benefits: 1) chemicals can be prioritized based on the number of biological pathways they are likely to disrupt [93], and 2) chemicals can be grouped based on their mode of action. This latter feature offers improved ability to examine mixtures of compounds with cumulative effects, e.g., chemicals with a common mechanism of toxicity.
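As a rough illustration of the two uses of screening data described above, the sketch below ranks chemicals by the number of pathways in which they show activity and lists the chemicals that share each pathway. The additive names, pathway labels and assay hits are entirely hypothetical; the code does not call ToxCast or Tox21, whose actual data formats and scoring are not represented here.

```python
# Hypothetical screening results: chemical -> pathways with assay activity.
ASSAY_HITS = {
    "additive_1": {"estrogen_receptor", "thyroid_receptor", "oxidative_stress"},
    "additive_2": {"estrogen_receptor"},
    "additive_3": set(),
    "additive_4": {"thyroid_receptor", "oxidative_stress"},
}


def prioritize(hits):
    """Rank chemicals by number of active pathways, highest first.

    Ties are broken alphabetically so the ranking is deterministic.
    """
    return sorted(hits, key=lambda name: (-len(hits[name]), name))


def chemicals_by_pathway(hits):
    """Invert the mapping: pathway -> sorted list of active chemicals."""
    grouped = {}
    for name, pathways in hits.items():
        for pathway in pathways:
            grouped.setdefault(pathway, []).append(name)
    return {pathway: sorted(names) for pathway, names in grouped.items()}


ranking = prioritize(ASSAY_HITS)
shared = chemicals_by_pathway(ASSAY_HITS)

print("priority order:", ranking)
for pathway in sorted(shared):
    print(pathway, "->", shared[pathway])
```

Counting active pathways is only one possible prioritization rule, chosen here for simplicity; a real scheme would presumably also weigh potency, exposure and assay reliability.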

Conclusions

Chemical risk assessment needs profound modernization. Decades-old assumptions based on outdated, and sometimes flawed, science must be revisited. The evaluation of one food additive at a time does not fulfill the statutory mandate of assessing the cumulative effect of chemicals in the diet on the health of individuals. Furthermore, FDA’s lack of enforcement leaves consumers less protected from the hundreds of additives they are exposed to on a daily basis. For the purposes of protecting public health, developing approaches to evaluate the cumulative effects of chemicals is an urgent need. Although some regulatory agencies have begun to consider the importance of this topic, the US FDA, which regulates food additives, has made little progress. Failure to take action to close the regulatory gap between the safety expectation that additives do not cause harm and the observed effects in human populations has costs to both public health and the economy.

Acknowledgements

LNV acknowledges funding support from the National Institute of Environmental Health Sciences of the National Institutes of Health (Award Number K22ES025811). MVM did not receive funding to do this work. The content of this manuscript is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Disclosure Statement

LNV has received travel reimbursements from Universities, Governments, NGOs and Industry, to speak about endocrine-disrupting chemicals. MVM is an independent consultant working with public interest organizations and industry.

¹ In this review, we use the term food additive in a general manner to encompass chemicals directly and indirectly added to food. It does not represent the legal meaning of the term.

Select Committee on GRAS Substances. Insights on food safety evaluation. US Department of Commerce, National Technical Information Service. Life Sciences Research Office, Federation of American Societies for Experimental Biology, Springfield, VA. 1982.

U.S. Food and Drug Administration. Bisphenol A (BPA): Use in food contact application. 2014.

European Food Safety Authority Panel on Plant Protection Products and their Residues. Scientific opinion on the identification of pesticides to be included in cumulative assessment groups on the basis of their toxicological profile. EFSA J 2013;11:3293.