Reproducibility of findings is a core foundation of science. If scientific results only hold true in some labs but not in others, then how can researchers feel confident about their discoveries? How can society put evidence-based policies into place if the evidence is unreliable?

Recognition of this “crisis” has prompted calls for reform. Researchers are feeling their way, experimenting with different practices meant to help distinguish solid science from irreproducible results. Some people are even starting to reevaluate how choices are made about what research actually gets tackled. Breaking innovative new ground is flashier than revisiting already published research. Does prioritizing novelty naturally lead to this point?

Incentivizing the wrong thing?

One solution to the reproducibility crisis could be simply to conduct lots of replication studies. For instance, the scientific journal eLife is participating in an initiative to validate and reproduce important recent findings in the field of cancer research. The first set of these “rerun” studies was recently released and yielded mixed results. The results of two out of five studies were reproducible, one was not, and two additional studies did not provide definitive answers.

But there’s at least one major obstacle to investing time and effort in this endeavor: the quest for novelty. The prestige of an academic journal depends at least partly on how often the research articles it publishes are cited. Thus, research journals often want to publish novel scientific findings which are more likely to be cited, not necessarily the results of newly rerun older research.

Genetics researcher Barak Cohen at Washington University in St. Louis recently published a commentary analyzing this growing push for novelty. He suggests that progress in science depends on a delicate balance between novelty and checking the work of other scientists. When rewards such as funding of grants or publication in prestigious journals emphasize novelty at the expense of testing previously published results, science risks developing cracks in its foundation.

One of his main concerns is that scientific papers now inflate their claims in order to emphasize their novelty and the relevance of biomedical research for clinical applications. By exchanging depth of research for breadth of claims, researchers may be at risk of compromising the robustness of the work. By claiming excessive novelty and impact, researchers may undermine its actual significance because they may fail to provide solid evidence for each claim.

Prestigious journals often now demand complete scientific stories, from basic molecular mechanisms to proving their relevance in various animal models. Unexplained results or unanswered questions are seen as weaknesses. Instead of publishing one exciting novel finding that is robust, and which could spawn a new direction of research conducted by other groups, researchers now spend years gathering a whole string of findings with broad claims about novelty and impact.

Balancing fresh findings and robustness

A challenge for editors and reviewers of scientific manuscripts is judging the novelty and likely long-term impact of the work they’re assessing. The eventual importance of a new, unique scientific idea is sometimes difficult to recognize even by peers who are grounded in existing knowledge. Many basic research studies form the basis of future practical applications. One recent study found that of basic research articles that received at least one citation, 80 percent were eventually cited by a patent application. But it takes time for practical significance to come to light.

A collaborative team of economics researchers recently developed an unusual measure of scientific novelty by carefully studying the references of a paper. They ranked a scientific paper as more novel if it cited a diverse combination of journals. For example, a scientific article citing a botany journal, an economics journal and a physics journal would be considered very novel if no other article had cited this combination of varied references before.
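The idea behind such a measure can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the economists' actual method: it assumes papers are processed in publication order, it reduces a "combination" to pairs of journals cited together, and the function name `novelty_scores`, the paper IDs and the journal names are all hypothetical.

```python
from itertools import combinations


def novelty_scores(papers):
    """Count, for each paper, the journal pairs it is the first to cite together.

    `papers` is a list of (paper_id, cited_journals) tuples, assumed to be
    sorted by publication date (an assumption of this sketch).
    """
    seen_pairs = set()
    scores = {}
    for paper_id, cited_journals in papers:
        # Every unordered pair of distinct journals in this reference list.
        pairs = {
            frozenset(pair)
            for pair in combinations(sorted(set(cited_journals)), 2)
        }
        # Pairs that no earlier paper has cited together count as novel.
        scores[paper_id] = len(pairs - seen_pairs)
        seen_pairs |= pairs
    return scores


# Hypothetical toy data: paper C mixes three fields for the first time.
papers = [
    ("A", ["botany_journal", "plant_journal"]),
    ("B", ["botany_journal", "plant_journal"]),
    ("C", ["botany_journal", "economics_journal", "physics_journal"]),
]
print(novelty_scores(papers))
```

In this toy example paper B scores zero because it repeats a pairing that paper A already used, while paper C scores highest because all three of its journal pairings are new.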

This measure of novelty allowed them to identify papers which were more likely to be cited in the long run. But it took roughly four years for these novel papers to start showing their greater impact. One may disagree with this particular indicator of novelty, but the study makes an important point: It takes time to recognize the full impact of novel findings.

Realizing how difficult it is to assess novelty should give funding agencies, journal editors and scientists pause. Progress in science depends on new discoveries and following unexplored paths – but solid, reproducible research requires an equal emphasis on the robustness of the work. By restoring the balance between demands and rewards for novelty and robustness, science will achieve even greater progress.

Competition for government research grants to fund scientific research remains fierce in the United States. The budget of the National Institutes of Health (NIH), which constitutes the major source of funding for US biological and medical research, has been increased only modestly during the past decade and is not even keeping up with inflation. This problem is compounded by the fact that more scientists are applying for grants now than one or two decades ago, forcing the NIH to enforce strict cut-offs and only fund the top 10-20% of all submitted research proposals. Such competition ought to be good for the field because it could theoretically improve the quality of science. Unfortunately, it is nearly impossible to discern differences between excellent research grants. For example, if an institute of the NIH has a cut-off at the 13th percentile, then a grant proposal judged to be in the top 10% would receive funding but a proposal in the top 15% would end up not being funded. In an era where universities are also scaling back their financial support for research, an unfunded proposal could ultimately lead to the closure of a research laboratory and the dismissal of several members of a research team. Since the prospective assessment of a research proposal’s scientific merits is somewhat subjective, it is quite possible that the budget constraints are creating cemeteries of brilliant ideas and concepts, a world of scientific what-ifs that are forever lost.

How do we scientists deal with these scenarios? Some of us keep soldiering on, writing one grant after the other. Others change and broaden the direction of their research, hoping that perhaps research proposals in other areas are more likely to receive the elusive scores that will qualify for funding. Yet another approach is to submit research proposals to philanthropic foundations or non-profit organizations, but most of these organizations tend to focus on research which directly impacts human health. Receiving a foundation grant to study the fundamental mechanisms by which the internal clocks of plants coordinate external timing cues such as sunlight, food and temperature, for example, would be quite challenging. One alternate source of research funding that is now emerging is “scientific crowdfunding” in which scientists use web platforms to present their proposed research project to the public and thus attract donations from a large number of supporters. The basic underlying idea is that instead of receiving a $50,000 research grant from one foundation or government agency, researchers may receive smaller donations from 10, 50 or even 100 supporters and thus finance their project.

How can scientists get involved in scientific crowdfunding? Julien Vachelard and colleagues recently published an excellent overview of scientific crowdfunding. They analyzed the projects funded on experiment.com and found that projects which successfully achieved the funding goal tend to have 30-40 backers. The total amount of funds raised for most projects ranged from about $3,000 to $5,000. While these amounts are impressive, they are still far lower than a standard foundation or government agency grant in biomedical research. These smaller amounts could support limited materials to expand ongoing projects, but they are not sufficient to carry out standard biomedical research projects which cover salaries and stipends of the researchers. The annual stipends for postdoctoral research fellows alone run in the $40,000 – $55,000 range.

Vachelard and colleagues also provide great advice for how scientists can increase the likelihood of funding. Attention span is limited on the internet so researchers need to convey the key message of their research proposal in a clear, succinct and engaging manner. It is best to use powerful images and videos, set realistic goals (such as $3,000 to $5,000), articulate what the funds will be used for, participate in discussions to answer questions and also update backers with results as they emerge. Presenting research in a crowdfunding platform is an opportunity to educate the public and thus advance science, forcing scientists to develop better communication skills. These collateral benefits to the scientific enterprise extend beyond the actual amount of funding that is solicited.

One of the concerns that is voiced about scientific crowdfunding is that it may only work for “panda bear science”, i.e. scientific research involving popular themes such as cute and cuddly animals or studying life on other planets. However, a study of what actually gets funded in a scientific crowdfunding campaign revealed that the subject matter was not as important as how well the researchers communicated with their audience. A bigger challenge for the long-term success of scientific crowdfunding may be the limited amounts that are raised: they only cover the cost of small sub-projects and are sufficient neither to embark on exploring exciting new and independent ideas nor to offset salary and personnel costs. Donating $20 or $50 to a project is very different from donating amounts such as $1,000 because the latter not only requires the necessary financial resources but also represents a major personal investment in the success of the research project. To initiate an exciting new biomedical research project in the $50,000 or $100,000 range, one needs several backers who are willing to donate $1,000 or more.
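The arithmetic behind this point is simple enough to check in a few lines of Python. The sketch below assumes, purely for illustration, that every backer gives the same amount; the helper name `backers_needed` is hypothetical, and the dollar figures are the ones mentioned above.

```python
import math


def backers_needed(goal_usd, donation_usd):
    """Backers required to reach a goal if everyone gives the same amount."""
    return math.ceil(goal_usd / donation_usd)


# How many equal donations it takes to fund a $50,000 project.
for donation in (20, 50, 1_000):
    count = backers_needed(50_000, donation)
    print(f"${donation} donations: {count} backers")
```

At $20 or $50 per person, a $50,000 project needs backers in the thousands, far beyond the 30-40 backers typical of successfully funded projects, whereas $1,000 donations bring the number down to 50.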

Perhaps one solution could be to move from a crowdfunding towards a tribefunding model. Crowds consist of a mass of anonymous people, mostly strangers in a confined space who do not engage each other. Tribes, on the other hand, are characterized by individuals who experience a sense of belonging and fellowship; they share and take responsibility for each other. The “tribes” in scientific tribefunding would consist of science supporters or enthusiasts who recognize the importance of the scientific work and also actively participate in discussions not just with the scientists but also with each other. Members of a paleontology tribe could include specialists and non-specialists who are willing to put in the required time to study the scientific background of a proposed paleontology research project, understand how it would advance the field and how even negative results (which are quite common in science) could be meaningful.

Tribefunding in higher education and science may sound like a novel concept but certain aspects of tribefunding are already common practice in the United States, albeit under different names. When wealthy alumni establish endowments for student scholarships, fellowship programs or research centers at their alma mater, it is in part because they feel a tribe-like loyalty towards the institutions that laid the cornerstones of their future success. The students and scholars who will benefit from these endowments are members of the same academic institution or tribe. The difference between the currently practiced form of philanthropic funding and the proposed tribefunding model is that tribe identity would not be defined by where one graduated from but instead by scientific interests.

Tribefunding could also impact the review process of scientific proposals. Currently, peer reviewers who assess the quality of scientific proposals for government agencies spend a substantial amount of time assessing the strengths and limitations of each proposal, and then convene either in person or via conference calls to arrive at a consensus regarding the merits of a proposal. Researchers often invest months of effort when they prepare research proposals, which is why peer reviewers take their work very seriously and devote the required time to review each proposal carefully. Although the peer review system for grant proposals is often criticized because reviewers can make errors when they assess the quality of proposals, there are no established alternatives for how to assess research proposals. Most peer reviewers also realize that they are part of a “tribe”, with the common interest of selecting the best science. However, the definition of a “peer” is usually limited to other scientists, most of whom are tenured professors at academic institutions, and the process does not really solicit input from non-academic science supporters. In a tribefunding model, the definition of a “peer” would be expanded to professional scientists as well as science supporters for any given area of science. All members of the tribe could participate during the review and selection of the best projects as well as throughout the funding period of the research projects that receive the support.

Merging the grassroots character and public outreach of crowdfunding with the sense of fellowship and active dialogue in a “scientific tribe” could take scientific crowdfunding to the next level. A comment section on a webpage is not sufficient to develop such a “tribe” affiliation but regular face-to-face meetings or conventional telephone/Skype conference calls involving several backers (independent of whether they can donate $50 or $5,000) may be more suitable. Developing a sense of ownership through this kind of communication would mean that every member of the science “tribe” realizes that they are a stakeholder. This sense of project ownership may not only increase donations, but could also create a grassroots synergy between laboratory and tribe, allowing for meaningful education and intellectual exchange.

Lingulodinium polyedrum is a unicellular marine organism which belongs to the dinoflagellate group of algae. Its genome is among the largest found in any species on this planet, estimated to contain around 165 billion DNA base pairs – roughly fifty times larger than the size of the human genome. Encased in magnificent polyhedral shells, these bioluminescent algae became important organisms to study biological rhythms. Each Lingulodinium polyedrum cell contains not one but at least two internal clocks which keep track of time by oscillating at a frequency of approximately 24 hours. Algae maintained in continuous light for weeks continue to emit a bluish-green glow at what they perceive as night-time and swim up to the water surface during day-time hours – despite the absence of any external time cues. When I began studying how nutrients affect the circadian rhythms of these algae as a student at the University of Munich, I marveled at the intricacy and beauty of these complex time-keeping mechanisms that had evolved over hundreds of millions of years.

I was prompted to revisit the role of Beauty in biology while reading a masterpiece of scientific writing, “Dreams of a Final Theory” by the Nobel laureate Steven Weinberg in which he describes how the search for Beauty has guided him and many fellow theoretical physicists to search for an ultimate theory of the fundamental forces of nature. Weinberg explains that it is quite difficult to precisely define what constitutes Beauty in physics but a physicist would nevertheless recognize it when she sees it.

Over the course of a quarter of a century, I have worked in a variety of biological fields, from these initial experiments in marine algae to how stem cells help build human blood vessels and how mitochondria in a cell fragment and reconnect as cells divide. Each project required its own set of research methods and techniques, each project came with its own failures and successes. But with each project, my sense of awe for the beauty of nature has grown. Evolution has bestowed this planet with such an amazing diversity of life-forms and biological mechanisms, allowing organisms to cope with the unique challenges that they face in their respective habitats. But it is only recently that I have become aware of the fact that my sense of biological beauty was a post hoc phenomenon: Beauty was what I perceived after reviewing the experimental findings; I was not guided by a quest for beauty while designing experiments. In fact, I would have been worried that such an approach might bias the design and interpretation of experiments. Might a desire for seeing Beauty in cell biology lead one to consciously or subconsciously discard results that might seem too messy?

One such key characteristic of a beautiful scientific theory is the simplicity of the underlying concepts. According to Weinberg, Einstein’s theory of gravitation is described in fourteen equations whereas Newton’s theory can be expressed in three. Despite the appearance of greater complexity in Einstein’s theory, Weinberg finds it more beautiful than Newton’s theory because the Einsteinian approach rests on one elegant central principle – the equivalence of gravitation and inertia. Weinberg’s second characteristic for beautiful scientific theories is their inevitability. Every major aspect of the theory seems so perfect that it cannot be tweaked or improved on. Any attempt to significantly modify Einstein’s theory of general relativity would lead to undermining its fundamental concepts, just like any attempts to move around parts of Raphael’s Holy Family would weaken the whole painting.

Can similar principles be applied to biology? I realized that when I give examples of beauty in biology, I focus on the complexity and diversity of life, not its simplicity or inevitability. Perhaps this is due to the fact that Weinberg was describing the search for fundamental laws of physics, laws which would explain the basis of all matter and energy – our universe. As cell biologists, we work several orders of magnitude removed from these fundamental laws. Our building blocks are organic molecules such as proteins and sugars. We find little evidence of inevitability in the molecular pathways we study – cells have an extraordinary ability to adapt. Mutations in genes or derangement in molecular signaling can often be compensated by alternate cellular pathways.

This also points to a fundamental difference in our approaches to the world. Physicists searching for the fundamental laws of nature balance the development of fundamental theories with experimental tests, whereas biology in its current form has primarily become an experimental discipline. The latest technological developments in DNA and RNA sequencing, genome editing, optogenetics and high resolution imaging are allowing us to amass unimaginable quantities of experimental data. In fact, the development of technologies often drives the design of experiments. The availability of a genetically engineered mouse model that allows us to track the fate of individual cells that express fluorescent proteins, for example, will give rise to numerous experiments to study cell fate in various disease models and organs. Much of the current biomedical research funding focuses on studying organisms that provide technical convenience such as genetically engineered mice or fulfill a societal goal such as curing human disease.

Uncovering fundamental concepts in biology requires comparative studies across biology and substantial investments in research involving a plethora of other species. In 1990, the National Institutes of Health (NIH – the primary government funding source for biomedical research in the United States) designated a handful of species as model organisms to study human disease, including mice, rats, zebrafish and fruit flies. A recent analysis of the species studied in scientific publications showed that in 1960, roughly half the papers studied what would subsequently be classified as model organisms whereas the other half of papers studied additional species. By 2010, over 80% of the scientific papers were now being published on model organisms and only 20% were devoted to other species, thus marking a significant dwindling of broader research goals in biology. More importantly, even among the model organisms, there has been a clear culling of research priorities with a disproportionately large growth in funding and publications for studies using mice. Thousands of scientific papers are published every month on the cell signaling pathways and molecular biology in mouse and human cells whereas only a minuscule fraction of research resources are devoted to studying signaling pathways in algae.

The question of whether or not biologists should be guided by conceptual Beauty leads us to the even more pressing question of whether we need to broaden biological research. If we want to mirror the dizzying success of fundamental physics during the past century and similarly advance fundamental biology, then we need to substantially step up investments in fundamental biological research that is not constrained by medical goals.

Universities and the scientific infrastructures in Muslim-majority countries need to undergo radical reforms if they want to avoid falling by the wayside in a world characterized by major scientific and technological innovations. This is the conclusion reached by Nidhal Guessoum and Athar Osama in their recent commentary “Institutions: Revive universities of the Muslim world”, published in the scientific journal Nature. The physics and astronomy professor Guessoum (American University of Sharjah, United Arab Emirates) and Osama, who is the founder of the Muslim World Science Initiative, use the commentary to summarize the key findings of the report “Science at Universities of the Muslim World” (PDF), which was released in October 2015 by a task force of policymakers, academic vice-chancellors, deans, professors and science communicators. This report is one of the most comprehensive analyses of the state of scientific education and research in the 57 countries with a Muslim-majority population, which are members of the Organisation of Islamic Cooperation (OIC).

1. Lower scientific productivity in the Muslim world: The 57 Muslim-majority countries constitute 25% of the world’s population, yet they only generate 6% of the world’s scientific publications and 1.6% of the world’s patents.

2. Lower scientific impact of papers published in the OIC countries: Not only are Muslim-majority countries severely under-represented in terms of the numbers of publications, the papers which do get published are cited far less than the papers stemming from non-Muslim countries. One illustrative example is that of Iran and Switzerland. In the 2014 SCImago ranking of publications by country, Iran was the highest-ranked Muslim-majority country with nearly 40,000 publications, just slightly ahead of Switzerland with 38,000 publications – even though Iran’s population of 77 million is nearly ten times larger than that of Switzerland. However, the average Swiss publication was more than twice as likely to garner a citation by scientific colleagues as an Iranian publication, thus indicating that the actual scientific impact of research in Switzerland was far greater than that of Iran.

To correct for economic differences between countries that may account for the quality or impact of the scientific work, the analysis also compared selected OIC countries to matched non-Muslim countries with similar per capita Gross Domestic Product (GDP) values (PDF). The per capita GDP in 2010 was $10,136 for Turkey, $8,754 for Malaysia and only $7,390 for South Africa. However, South Africa still outperformed both Turkey and Malaysia in terms of average citations per scientific paper in the years 2006-2015 (Turkey: 5.6; Malaysia: 5.0; South Africa: 9.7).

3. Muslim-majority countries make minimal investments in research and development: The world average for investing in research and development is roughly 1.8% of the GDP. Advanced developed countries invest up to 2-3% of their GDP, whereas the average for the OIC countries is only 0.5%, less than a third of the world average! One could perhaps understand why poverty-stricken Muslim countries such as Pakistan do not have the funds to invest in research because their more immediate concerns are to provide basic necessities to the population. However, one of the most dismaying findings of the report is the dismally low rate of research investments made by the members of the Gulf Cooperation Council (GCC, the economic union of the six oil-rich Gulf countries Saudi Arabia, Kuwait, Bahrain, Oman, United Arab Emirates and Qatar, with a mean per capita GDP of over $30,000, which is comparable to that of the European Union). Saudi Arabia and Kuwait, for example, invest less than 0.1% of their GDP in research and development, far lower than the OIC average of 0.5%.
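The size of these gaps is easier to appreciate as ratios. The short Python sketch below uses only the percentages quoted in the findings above; the variable names are illustrative, and a value of 1.0 would mean that output or investment is exactly in proportion to the benchmark.

```python
# Figures as quoted from the task force report in the findings above.
population_share = 25.0   # % of world population living in OIC countries
publication_share = 6.0   # % of the world's scientific publications
patent_share = 1.6        # % of the world's patents
oic_rd_share = 0.5        # average % of GDP that OIC countries invest in R&D
world_rd_share = 1.8      # world average % of GDP invested in R&D

# Output relative to population: 1.0 would be proportional representation.
publication_ratio = publication_share / population_share
patent_ratio = patent_share / population_share

# R&D investment relative to the world average.
rd_ratio = oic_rd_share / world_rd_share

print(f"Publications relative to population share: {publication_ratio:.2f}")
print(f"Patents relative to population share: {patent_ratio:.3f}")
print(f"R&D investment relative to world average: {rd_ratio:.2f}")
```

In other words, the OIC countries produce roughly a quarter of the publications and well under a tenth of the patents that their share of the world's population would suggest, and their R&D investment rate is indeed less than a third of the world average.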

So how does one go about fixing this dire state of science in the Muslim world? Some fixes are rather obvious, such as increasing the investment in scientific research and education, especially in the OIC countries which have the financial means and are currently lagging far behind in terms of how much funding is made available to improve the scientific infrastructures. Guessoum and Osama also highlight the importance of introducing key metrics to assess scientific productivity and the quality of science education. It is not easy to objectively measure scientific and educational impact, and one can argue about the significance or reliability of any given metric. But without any metrics, it will become very difficult for OIC universities to identify problems and weaknesses, build new research and educational programs and reward excellence in research and teaching. There is also a need for reforming the curriculum so that it shifts its focus from lecture-based teaching, which is so prevalent in OIC universities, to inquiry-based teaching in which students learn science hands-on by experimentally testing hypotheses and are encouraged to ask questions.

In addition to these commonsense suggestions, the task force also put forward a rather intriguing proposition to strengthen scientific research and education: place a stronger emphasis on basic liberal arts in science education. I could not agree more because I strongly believe that exposing science students to the arts and humanities plays a key role in fostering the creativity and curiosity required for scientific excellence. Science is a multi-disciplinary enterprise, and scientists can benefit greatly from studying philosophy, history or literature. A course in philosophy, for example, can teach science students to question their basic assumptions about reality and objectivity, encourage them to examine their own biases, challenge authority and understand the importance of doubt and uncertainty, all of which will likely help them become critical thinkers and better scientists.

However, the specific examples provided by Guessoum and Osama do not necessarily indicate support for this kind of broad liberal arts education. They mention the example of the newly founded private Habib University in Karachi, which mandates that all science and engineering students also take classes in the humanities, including a two-semester course in “hikma” or “traditional wisdom”. The details of this philosophy course on the university’s website suggest that the course is a history of Islamic philosophy focused on antiquity and pre-modern texts which date back to the “Golden Age” of Islam. The task force also specifically applauds an online course developed by Ahmed Djebbar, an emeritus science historian at the University of Lille in France, which attempts to stimulate scientific curiosity in young pre-university students by relating scientific concepts to great discoveries from the Islamic “Golden Age”. My concern is that this is a rather Islamocentric form of liberal arts education. Do students who have spent all their lives growing up in a Muslim society really need to revel in the glories of a bygone era in order to get excited about science? Does the Habib University philosophy course focus on Islamic philosophy because the university feels that students should be more aware of their cultural heritage or are there concerns that exposing students to non-Islamic ideas could cause problems with students, parents, university administrators or other members of society who could perceive this as an attack on Islamic values? If the true purpose of liberal arts education is to expand the minds of students by exposing them to new ideas, wouldn’t it make more sense to focus on non-Islamic philosophy? It is definitely not a good idea to coddle Muslim students by adulating the “Golden Age” of Islam or using kid gloves when discussing philosophy in order to avoid offending them.

This leads us to a question that is not directly addressed by Guessoum and Osama: How “liberal” is a liberal arts education in countries with governments and societies that curtail the free expression of ideas? The Saudi blogger Raif Badawi was sentenced to 1,000 lashes and 10 years in prison because of his liberal views that were perceived as an attack on religion. Faculty members at universities in Saudi Arabia who teach liberal arts courses are probably very aware of these occupational hazards. At first glance, professors who teach in the sciences may not seem to be as susceptible to the wrath of religious zealots and authoritarian governments. However, the above-mentioned interdisciplinary nature of science could easily spell trouble for free-thinking professors or students. Comments about evolutionary biology, the ethics of genome editing or discussing research on sexuality could all be construed as a violation of societal and religious norms.

The 2010 study “Faculty perceptions of academic freedom at a GCC university” surveyed professors at an anonymous GCC university (most likely Qatar University since roughly 25% of the faculty members were Qatari nationals and the authors of the study were based in Qatar) regarding their views of academic freedom. The vast majority of faculty members (Arab and non-Arab) felt that academic freedom was important to them and that their university upheld academic freedom. However, in interviews with individual faculty members, the researchers found that the professors were engaging in self-censorship in order to avoid untoward repercussions. Here are some examples of the comments from the faculty at this GCC University:

“Yes, all the time. I avoid all references to Israel or the Jewish people despite their contributions to world culture. I also avoid any kind of questioning of their religious tradition. I do this out of respect.”

This latter comment is especially painful for me because one of my heroes who inspired me to become a cell biologist was the Italian Jewish scientist Rita Levi-Montalcini. She revolutionized our understanding of how cells communicate with each other using growth factors. She was also forced to secretly conduct her experiments in her bedroom because the Fascists banned all “non-Aryans” from going to the university laboratory. Would faculty members who teach the discovery of growth factors at this GCC University downplay the role of the Nobel laureate Levi-Montalcini because she was Jewish? We do not know how prevalent this form of self-censorship is in other OIC countries because the research on academic freedom in Muslim-majority countries is understandably scant. Few faculty members would be willing to voice their concerns about government or university censorship and admitting to self-censorship is also not easy.

The task force report on science in the universities of Muslim-majority countries is an important first step towards reforming scientific research and education in the Muslim world. Increasing investments in research and development, using and appropriately acting on carefully selected metrics as well as introducing a core liberal arts curriculum for science students will probably all significantly improve the dire state of science in the Muslim world. However, the reform of the research and education programs needs to also include discussions about the importance of academic freedom. If Muslim societies are serious about nurturing scientific innovation, then they will need to also ensure that scientists, educators and students will be provided with the intellectual freedom that is the cornerstone of scientific creativity.

Murder your darlings. The British writer Sir Arthur Quiller-Couch shared this piece of writerly wisdom when he gave his inaugural lecture series at Cambridge, asking writers to consider deleting words, phrases or even paragraphs that are especially dear to them. The minute writers fall in love with what they write, they are bound to lose their objectivity and may not be able to judge how their choice of words will be perceived by the reader. But writers aren’t the only ones who can fall prey to the Pygmalion syndrome. Scientists often find themselves in a similar situation when they develop “pet” or “darling” hypotheses.

How do scientists decide when it is time to murder their darling hypotheses? The simple answer is that scientists ought to give up scientific hypotheses once the experimental data is unable to support them, no matter how “darling” they are. However, the problem with scientific hypotheses is that they aren’t just generated based on subjective whims. A scientific hypothesis is usually put forward after analyzing substantial amounts of experimental data. The better a hypothesis is at explaining the existing data, the more “darling” it becomes. Therefore, scientists are reluctant to discard a hypothesis because of just one piece of experimental data that contradicts it.

In addition to experimental data, a number of other factors can play a major role in determining whether scientists will discard or uphold their darling scientific hypotheses. Some scientific careers are built on specific scientific hypotheses which set apart certain scientists from competing rival groups. Research grants, which are essential to the survival of a scientific laboratory by providing salary funds for the senior researchers as well as the junior trainees and research staff, are written in a hypothesis-focused manner, outlining experiments that will lead to the acceptance or rejection of selected scientific hypotheses. Well-written research grants always consider the possibility that the core hypothesis may be rejected based on the future experimental data. But if the hypothesis has to be rejected then the scientist has to explain the discrepancies between the preferred hypothesis that is now falling into disrepute and all the preliminary data that had led her to formulate the initial hypothesis. Such discrepancies could endanger the renewal of the grant funding and the future of the laboratory. Last but not least, it is very difficult to publish a scholarly paper describing a rejected scientific hypothesis without providing an in-depth mechanistic explanation for why the hypothesis was wrong and proposing alternate hypotheses.

For example, it is quite reasonable for a cell biologist to formulate the hypothesis that protein A improves the survival of neurons by activating pathway X based on prior scientific studies which have shown that protein A is an activator of pathway X in neurons and other studies which prove that pathway X improves cell survival in skin cells. If the data supports the hypothesis, publishing this result is fairly straightforward because it conforms to the general expectations. However, if the data does not support this hypothesis then the scientist has to explain why. Is it because protein A did not activate pathway X in her experiments? Is it because pathway X functions differently in neurons than in skin cells? Is it because neurons and skin cells have a different threshold for survival? Experimental results that do not conform to the predictions have the potential to uncover exciting new scientific mechanisms, but chasing down these alternate explanations requires a lot of time and resources, which are becoming increasingly scarce. Therefore, it shouldn’t come as a surprise that some scientists may consciously or subconsciously ignore selected pieces of experimental data which contradict their darling hypotheses.

Let us move from these hypothetical situations to the real world of laboratories. There is surprisingly little data on how and when scientists reject hypotheses, but John Fugelsang and Kevin Dunbar at Dartmouth conducted a rather unique study “Theory and data interactions of the scientific mind: Evidence from the molecular and the cognitive laboratory” in 2004 in which they researched researchers. They sat in on scientific laboratory meetings of three renowned molecular biology laboratories and carefully recorded how scientists presented their laboratory data and how they would handle results which contradicted their predictions based on their hypotheses and models.

In their final analysis, Fugelsang and Dunbar included 417 scientific results that were presented at the meetings, of which roughly half (223 out of 417) were not consistent with the predictions. Only 12% of these inconsistencies led to a change of the scientific model (and thus a revision of hypotheses). In the vast majority of the cases, the laboratories decided to follow up the studies by repeating and modifying the experimental protocols, thinking that the fault did not lie with the hypotheses but instead with the manner in which the experiment was conducted. In the follow-up experiments, 84 of the inconsistent findings could be replicated and this in turn resulted in a gradual modification of the underlying models and hypotheses in the majority of the cases. However, even when the inconsistent results were replicated, only 61% of the models were revised, which means that 39% of the cases did not lead to any significant changes.

The study did not provide much information on the long-term fate of the hypotheses and models and we obviously cannot generalize the results of three molecular biology laboratory meetings at one university to the whole scientific enterprise. Also, Fugelsang and Dunbar’s study did not have a large enough sample size to clearly identify the reasons why some scientists were willing to revise their models and others weren’t. Was it because of varying complexity of experiments and models? Was it because of the approach of the individuals who conducted the experiments or the laboratory heads? I wish there were more studies like this because it would help us understand the scientific process better and maybe improve the quality of scientific research if we learned how different scientists handle inconsistent results.

In my own experience, I have also struggled with results which defied my scientific hypotheses. In 2002, we found that stem cells in human fat tissue could help grow new blood vessels. Yes, you could obtain fat from a liposuction performed by a plastic surgeon and inject these fat-derived stem cells into animal models of low blood flow in the legs. Within a week or two, the injected cells helped restore the blood flow to near normal levels! The simplest hypothesis was that the stem cells converted into endothelial cells, the cell type which forms the lining of blood vessels. However, after several months of experiments, I found no consistent evidence of fat-derived stem cells transforming into endothelial cells. We ended up publishing a paper which proposed an alternative explanation that the stem cells were releasing growth factors that helped grow blood vessels. But this explanation was not as satisfying as I had hoped. It did not account for the fact that the stem cells had aligned themselves alongside blood vessel structures and behaved like blood vessel cells.

Even though I “murdered” my darling hypothesis of fat-derived stem cells converting into blood vessel endothelial cells at the time, I did not “bury” the hypothesis. It kept ruminating in the back of my mind until roughly one decade later when we were again studying how stem cells were improving blood vessel growth. The difference was that this time, I had access to a live-imaging confocal laser microscope which allowed us to take images of cells labeled with red and green fluorescent dyes over long periods of time. Below, you can see a video of human bone marrow mesenchymal stem cells (labeled green) and human endothelial cells (labeled red) observed with the microscope overnight. The short movie compresses images obtained throughout the night and shows that the stem cells indeed do not convert into endothelial cells. Instead, they form a scaffold and guide the endothelial cells (red) by allowing them to move alongside the green scaffold and thus construct their network. This work was published in 2013 in the Journal of Molecular and Cellular Cardiology, roughly a decade after I had been forced to give up on the initial hypothesis. Back in 2002, I had assumed that the stem cells were turning into blood vessel endothelial cells because they aligned themselves in blood-vessel-like structures. I had never considered the possibility that they were a scaffold for the endothelial cells.

This and other similar experiences have led me to reformulate the “murder your darlings” commandment to “murder your darling hypotheses but do not bury them”. Instead of repeatedly trying to defend scientific hypotheses that cannot be supported by emerging experimental data, it is better to give up on them. But this does not mean that we should forget and bury those initial hypotheses. With newer technologies, resources or collaborations that were not previously available to us, we may find ways to explain inconsistent results years later. This is why I regularly peruse my cemetery of dead hypotheses on my hard drive to see if there are ways of perhaps resurrecting them, not in their original form but in a modification that I am now able to test.

Fareed Zakaria recently wrote an article in the Washington Post lamenting the loss of liberal arts education in the United States. However, instead of making a case for balanced education, which integrates various forms of creativity and critical thinking promoted by STEM (science, technology, engineering and mathematics) and by a liberal arts education, Zakaria misrepresents STEM education as primarily teaching technical skills and also throws in a few cliches about Asians. You can read my response to his article at 3Quarksdaily.

Here is a graphic showing the usage of the words “scientists”, “researchers”, “soldiers” in English-language books published in 1900-2008. The graphic was generated using the Google N-gram Viewer which scours all digitized books in the Google database for selected words and assesses the relative word usage frequencies.

It is depressing that soldiers are mentioned more frequently than scientists or researchers (even when the word frequencies of “scientists” and “researchers” are combined) in English-language books even though the numbers of researchers in the countries which produce most English-language books are comparable to or higher than the numbers of soldiers.

Here are the numbers of researchers (data from the 2010 UNESCO Science report, numbers are reported for the year 2007, PDF) in selected English-language countries and the corresponding numbers of armed forces personnel (data from the World Bank, numbers reported for 2012):

I find it disturbing that our books – arguably one of our main cultural legacies – give a disproportionately greater space to discussing or describing the military than to our scientific and scholarly endeavors. But I am even more worried about the recent trends. The N-gram Viewer evaluates word usage up until 2008, and “soldiers” has been steadily increasing since the 1990s. The usage of “scientists” and “researchers” has reached a plateau and is now decreasing. I do not want to over-interpret the importance of relative word frequencies as indicators of society’s priorities, but the last two surges of “soldiers” usage occurred during the two World Wars and in 2008, “soldiers” was used as frequently as during the first years of World War II.

It is mind-boggling for us scientists that we have to struggle to get funding for research which has the potential to transform society by providing important new insights into the nature of our universe, life on this planet, our environment and health, whereas the military receives substantially higher amounts of government funding (at least in the USA) for its destructive goals. Perhaps one reason for this discrepancy is that voters hear, see and read much more about wars and soldiers than about science and research. Depictions of heroic soldiers fighting evil make it much easier for voters to go along with allocation of resources to the military. Most of my non-scientist friends can easily name books or movies about soldiers, but they would have a hard time coming up with books and movies about science and scientists. My take-home message from the N-gram Viewer results is that scientists have an obligation to reach out to the public and communicate the importance of science in an understandable manner if they want to avoid the marginalization of science.