Does science make you moral?

Several readers have called my attention to a new paper in PLoS ONE by Christine Ma-Kellams and Jim Blascovich, psychologists from the University of California at Santa Barbara, supposedly showing that reading about science in an experimental study makes one behave more morally and altruistically. They also claim to show that studying science improves your moral judgment (reference below; the download is free). I say “supposedly” and “claim” because I don’t find the paper terribly convincing.

The authors did four experiments, three of them using the same protocol to “prime” the subjects with science or with a “neutral” non-science task. The authors predicted that “the notion of science as part of a broader moral vision of society [i.e., Enlightenment values] facilitates moral and prosocial judgments and behaviors.” And that’s what they found, in all four studies. I’ll briefly describe the experiments and results.

Study 1. This used 48 undergraduates from UCSB. All of them first read a date-rape story in which a guy drives a woman home, the woman invites him in for a drink, and then he has “nonconsensual sex” (a euphemism for “rape”, I guess) with her. Afterwards, the participants answered questions about their field of study and how wrong they thought the man’s act was (on a scale from 1–completely right, to 100–completely wrong). They also answered the question “How much do you believe in science?” on a scale from 1 (not at all) to 7 (very much).

Results: Field of study was correlated with greater moral condemnation of rape, with science students being significantly (p = 0.01) more condemning than nonscience students. Belief in science was also positively correlated with moral condemnation (p < 0.001).

Problems here include use of a small pool of college undergraduates, correlation that doesn’t show causation (perhaps more ‘moral’ students tend to gravitate to or are more accepting of science), and lack of replication, as this was a one-time study. It’s also not clear whether the science-friendly students would actually behave more morally. Answering one question doesn’t show that you are generally “more moral” than others.

Studies 2, 3, and 4. These studies used 33 undergraduates, 32 volunteers from the area, and 43 participants between 18 and 22 from the university’s “research participation pool,” respectively.

In all three studies, the participants got a list of five scrambled words from which they had to choose four to make a complete sentence. The “science” condition contained words like “logical”, “hypothesis”, “theory”, “laboratory” and “scientists.” The controls had a list of five nonscience words; they give the example of “shoes give replace old the.” There were thus two primes: a science one and a “control” one. Then each of the three groups was subject to a different test, described under “results” below.

Results, study 2. After their prime, students read the same date-rape scenario and made their 1-100 moral judgment. The subjects primed with science words were more condemnatory of rape (p = 0.04).

Problems here include small sample size again, a probability that is barely significant (0.05 is the cut-off level), and a worry that this ranking (82 for control primes, 96 for science primes) isn’t a good indicator of moral behavior.
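One way to see why a p-value just under the 0.05 cutoff is so fragile: under the optimistic assumption that the true effect exactly equals the observed one, an exact replication of a p = 0.04 result has only about a 50% chance of reaching significance again. A quick sketch using the normal approximation (the function and its name are my illustration, not anything from the paper):

```python
from statistics import NormalDist

N = NormalDist()  # standard normal

def replication_power(p_observed, alpha=0.05):
    """Chance that an exact replication of a significant result is itself
    significant, assuming the true effect equals the observed one
    (normal approximation to the test statistic)."""
    z_obs = N.inv_cdf(1 - p_observed / 2)   # z implied by the two-sided p
    z_crit = N.inv_cdf(1 - alpha / 2)       # significance threshold
    # The replication's z-statistic is ~ Normal(z_obs, 1); it is
    # significant whenever |z| exceeds z_crit.
    return (1 - N.cdf(z_crit - z_obs)) + N.cdf(-z_crit - z_obs)

print(round(replication_power(0.04), 2))   # → 0.54
```

So even granting the observed effect at face value, a p = 0.04 finding is close to a coin flip to replicate; a p = 0.001 finding fares much better under the same assumption.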

Results, study 3. Students, after priming, completed a “prosocial intentions measure,” which included “the likelihood of engaging of each of several behaviors in the following month, including prosocial activities (donating to charity, giving blood, volunteering) and distractor activities (attending a party, going on vacation, seeing a movie).” The subjects primed with science reported greater prosocial intentions relative to controls (p = 0.024).

Problems are again a small sample unrepresentative of the general population, a probability that isn’t all that impressive, and the fact that reporting your intentions doesn’t mean you’ll actually fulfill them.

Results, study 4. After priming, students took a standard “economic exploitation” test; each was given five one-dollar bills and told to divide the money between themselves and another anonymous participant (there was no opportunity for the other person to reject the dosh). After the experiment was over, both participants got $5 anyway. In this case alone there was an effect of gender, with women keeping more money for themselves than did men (p = 0.03), but there was no gender X prime interaction, and those primed with science gave away significantly more money than those receiving the control prime (p = 0.046).

Problems are again small sample size, a nonrepresentative population, and a probability that is barely significant (again, 0.05 is the cutoff).

Note that in the last three studies, the possibility that science-y people were more moral a priori was not an issue, as the priming tests were allocated randomly among participants, regardless of their area of study.

The authors conclude that “Taken together the present results provide support for the idea that the study of science itself independent of the specific conclusions reached by scientific inquiries holds normative implications and leads to moral outcomes.” Well, only study #1 had anything to say about the “study of science,” as it was the only one that compared science students with non-science students. “Priming” with words in the other three studies has nothing to do with “the study of science.”

What this study shows is simply a need for further studies, as sample sizes were small and probabilities often marginal. But I am wary of assessing how moral or altruistic someone is from their response to a single test. That itself would need validation by correlating test performance with moral or altruistic behavior in the real world, something that has never been done, much less shown to be doable.

Further, tests like these are in severe need of replication. For example, the authors mention the paper of Vohs and Schooler (free download at link) showing that reading about humans’ lack of free will made them more likely to cheat on a subsequent task. That paper got a lot of attention. But, as I’ve written about before, the Vohs and Schooler paper was not replicated in a subsequent study by Rolf Zwaan at the University of Rotterdam. Zwaan found no difference in cheating behavior between participants who read a piece by Francis Crick on the illusory nature of free will and those who read a “control” piece by Crick. Ma-Kellams and Blascovich don’t mention the failure of replication. And of course their own tests need to be replicated on larger and more diverse populations.

I’ve been hard on this study precisely because it produced the results I’d like to see: studying science is good for your behavior. The connection between the two is not as obvious to me as to the authors, but maybe it’s true. But the differences were small, and I’m not sure what the implications would be even if the results were real. As Feynman said, the main purpose of science is to keep you from fooling yourself, so we must be extra cautious about accepting results that meet our preconceptions.

Note, too, that the paper was published in PLoS ONE, which is a journal that doesn’t review papers for novelty or generality. The journal will publish anything so long as the experiments seem to have been performed properly. There have been some good papers in that journal, but I regard it largely as a dumping ground for papers that can’t meet the more rigorous standards of other journals. With increasing pressure to publish, scientists can turn to journals like this to publish nearly anything, so long as the experiment was properly designed and properly analyzed. I’m not saying that this paper was not interesting, but if the results were so momentous why did the authors send them to PLoS ONE?

You aren’t required to “pay the journal” to make the paper open access, you are only required to make it open access (strictly, half of an institution’s papers need to be open-access within 12 months after publication).

I predict that within 10 years all papers will be open access on large, low-cost, online servers (no print editions) and that “journals” will have changed into refereeing quality-stamps that are one sort option on the server.

Some fields are already open-access compliant — e.g. in astrophysics the top journals, ApJ, MNRAS, A&A, even Nature, all accept you putting the paper on arXiv.

I suspect that money considerations will force all fields towards my above scenario. The transition, though, would be helped if people didn’t regard open-access journals as inferior to “high ranked” closed journals.

Agreed, but the distorting effect of the REF/RAE tends to work against that.

In Statistics we are doing well in open access to software, with R, but are trailing way behind Physics etc when it comes to journal publications. I suspect most subject areas are still closer to Statistics than Physics in this so far.

Well, yeah. I meant: what bullshit cover story, if any, is used to justify it? Not the actual unvarnished truth of the matter.

But, going by your subsequent discussion with Coel, it seems the intent may be honorable and the negative issues a result of a system whose various players have separate interests, some trying to maintain the status quo.

I would agree with that. We do need places to put results that, while not earth-shattering, should be in the public domain, and PLoS ONE will come up on a literature search.

Perhaps as valuable would be somewhere to put negative data of the “we made this mouse and it had no phenotype” kind. These are not easy to publish (except perhaps as a throwaway line in a related manuscript with some positive results). Publicizing the negative result would at least ensure that the experiment won’t be repeated and funding wasted.

Indeed, I have published in PLoS ONE exactly so that the paper would get as wide a readership as possible. The work could definitely have gone to a “more prestigious” journal but we felt not having it behind a paywall was more important than the journal.

Same here. I’m planning on submitting the two papers I’m currently writing to PLoS One.

I’ve really acquired a deep dislike of the practices that develop whilst trying to get a paper published in a “prestigious” journal.

Labs sit on data and keep it secret for years and years in order to build a “story” that will get them published in a good journal.

Other labs keep flipping what they’re working on in order to be able to use all of the latest buzzwords and over-hyped (and largely unproven) techniques.

I understand why the current system is the way it is, but I do think that it’s damaging to scientific progress in general.

Hopefully, journals like PLoS One will lead to an alternative system, where your success as an academic researcher is judged by a record of regular, timely, useful, well-conducted research rather than by a handful of papers in “prestigious” journals.

One thing I like about PLoS is the immediate metrics; e.g., I know that at least 980 people have looked at my paper and 215 have downloaded the pdf. It only came out in March, so there is only 1 citation (in a journal that would cost me $45 to see and, since it is from non-NIH-funded authors, is not and will not be available on PubMed Central).

With my other papers, I have no idea about the number of views, except the citations, which take years to accumulate. I know the journals have access to that data, but I can’t seem to find it.

Well, as far as I know ALL the PLoS journals are open access, but only PLoS ONE has a policy of not vetting papers for general interest or importance. If open access is what’s important, why not submit it to one of the other PLoS journals?

I agree that a paper should be judged on its merits, not on the journal, but one can’t read every journal, and so I go to those that have a higher fraction of good-quality papers.

Is it fair to question what the name of the journal really means any more? It’s been a while since I’ve picked up an actual copy. Pretty much everything I read now comes across a screen (including News and Views). I walk past the medical school library twice a day but like most of my colleagues I haven’t physically been inside in several years. As more of the back catalog has become available in pdf format these gaps have become longer. Finding things normally involves a pubmed search and a bunch of pdfs. At one level this can be frustrating because there is so much to choose from, but at another it does give one a reasonably unbiased view – in the sense that I can see the four papers from small groups that disagree with the big group who just published in Cancer Cell – or whatever. So to some extent the variety at least helps me to be skeptical, and to treat the data at face value.

I fully support the open access concept, but I have always worked in institutions with access to pretty much anything. As a result I have been less impacted (in an obvious way) by the exorbitant costs that journals put on libraries. In contrast, page charges do affect my budget and I’d be happier with a different system, if only study sections were not so hung up on where we publish.

I suspect you could get lots of positive results depending on what you were looking for. Prime people with religion, prime people with motivational stories, prime people with love songs… there is a danger of reading too much into small behaviors.

The two combine, they argue, to make “US researchers potentially more likely to express an underlying propensity to report strong and significant findings.”

There may be nothing more nefarious here than writing a paper as if you expected the results you got the whole time. However, the problem that’s been identified could potentially encompass things like choosing experimental procedures in order to make it more likely that you’ll get the results you desire. The study simply can’t discriminate between the two.

In any case, lest the rest of the world get smug about this, the authors note that other countries are moving toward a publication-based evaluation system for their researchers as well. So, it may be that a similar bias will crop up in other locations.

It is tempting to conclude that scientifically literate people are more moral than others, but one could also argue that these morality tests are a freebie, in the sense that there are no real consequences to the choice made.

It is easy to be moral on a piece of paper, but it is another thing to actually live by those ideals/morals in real life.

Does the question “How much do you believe in science?” have any meaning? I have heard far too many people of the anti-science mentality (creationists of all flavours, anti-vaxxers, fans of conspiracy theories, etc.) say they believe in science, that they like science.

Asking a direct question on something like “what does science mean to you?” would be more meaningful. But the answers would be hard to quantify. Maybe a multiple choice questionnaire.

I am always very skeptical about these kinds of studies that are dependent on self reporting about behavior in an environment that bears little resemblance to the real world the study is attempting to investigate.

I think anyone who is capable of being honest enough with themselves, and/or who has raised children, understands that actual behavior in spontaneous real-world situations does not necessarily coincide with the moral decisions about hypothetical situations we may make in the comfort of a consequence-free environment.

People studying these types of questions, human behavior, need to find ways to study their subjects in the wild with a minimum of interference.

Using “moral behavior” in a way that excludes any definition based on authority or revealed moral codes (perhaps “ethical behavior” is a better term), it seems to me that there are two main components to moral behavior: 1) empathy, and 2) critical thinking. If one cares about the consequences of one’s behavior on others and one is able to think through what those consequences might be, one is more likely to behave ethically (or morally).

Since those who “like” science tend to have better-developed critical thinking skills than those who “like” religion, it seems likely that the headline has some validity.

I find plausible the idea that those with scientific training are more moral: scientific protocols encourage honesty, self-criticism, accuracy, attention to detail, etc. But that assumes that people in question have had thorough training and practice in doing real science; those who merely have favorable attitudes toward science may not have any of the disciplined behavior that science represents.

What you didn’t mention seems to be the bigger problem for me. The scale, 1–100, seems to encompass ‘perfect action’ through ‘no moral weight’ to ‘absolutely condemnable’.

Apparently, the control primes were around 82 on the rape story, but the science primes were at 96. Now, I’d side more with 82, as while rape is pretty nasty, I wouldn’t say there aren’t actions that are 5% more evil. If you put rape at 96, that means you have only 4 more degrees of evil, out of 100, to fit massive property damage or theft, murder, and torture, and that’s not even counting the ‘large scale’ versions of these, such as serial killing or genocide.

In short, the science priming seems to cause a loss of a sense of scale, more than anything else.

It’s not at all clear that the 1-100 scale was meant to rank rape against other evils. It could also be interpreted as a measure of justification (“rape is wrong 96% of the time”) or of confidence (“I’m 96% sure that what John did was wrong in this instance”).

Based on the sketchy description in the paper, it’s hard to know which interpretation(s) the subjects adopted.

Even leaving that aside, there’s still the confidence interpretation (“I’m 96% sure that what John did was wrong.”).

And here’s another: something wrong happened, and John deserves 96% of the blame for it.

It seems very unlikely to me that anyone reading the John-and-Sally story would conclude that it was just a roundabout way of asking “How evil is rape on a scale of 1 to 100?” The whole point seems to be to assess the blameworthiness of John’s actions, which is quite a different question.

Suppose the story wasn’t about rape, but about stealing $20 from his roommate’s wallet. One can reasonably take the position that that’s just plain wrong, period, without any implication that it’s the worst crime imaginable.

I thank the authors for publishing their article in an open access journal like PLoS!

If they are going to conduct more research on this topic (perhaps using more participants this time), I hope they will again try to publish it in the same journal (even if they find non-significant results): that way related information will be stored in the same journal/on the same site (and can possibly be linked to each other?).

They weren’t being primed for science. They were being primed for reality. When induced to think about the world as it really is, you’re more likely to wish it were a better place, so you’re more vocally moral.

A load of rubbish which has exactly as much support as the study authors’ contentions.

I really dislike the “priming” concept in studies like these. I’ve yet to encounter a single type of “prime” that didn’t immediately suggest many different interpretations, some more plausible than the one simply assumed correct by the study authors.

Another problem with the interpretation is the same as studies that have shown that people primed with religious notions also self-report as more “moral”. Perhaps it is the act of priming people with notions of societal projects of improvement in general (enlightenment goals of science, religious goals etc), and their implicit membership in such projects, that elicits the self-reported improvement in morality?

Reminder: small sample size is *not* a bias — to the contrary: having a significant result on a small sample is a good indicator that the effect is strong (insofar as sample size is taken into account in the calculation of statistical significance).
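That point can be made concrete: with a t- or z-test, the smallest effect that can reach p < 0.05 shrinks as the sample grows, so a significant result from a small sample necessarily comes with a large *estimated* effect (which may, of course, still be a false positive). A rough sketch using the normal approximation (function name and numbers are illustrative, not from the paper):

```python
from math import sqrt
from statistics import NormalDist

def min_detectable_d(n_per_group, alpha=0.05):
    """Smallest standardized effect (Cohen's d) that reaches two-sided
    significance in a two-sample comparison with n_per_group subjects
    per arm.  Uses the normal approximation (z ≈ 1.96); the exact t
    critical value for small df is slightly larger, so this slightly
    understates the true threshold."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # For two independent groups, the standard error of the mean
    # difference in d units is sqrt(2 / n_per_group).
    return z * sqrt(2 / n_per_group)

# Study 2 had ~33 subjects, so roughly 16 per condition:
print(round(min_detectable_d(16), 2))   # → 0.69
```

So with ~16 subjects per condition, only estimated effects of roughly d ≥ 0.7 (conventionally “large”) can clear the significance bar at all, whereas 64 per group would detect effects half that size.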

We are currently halfway done with another replication attempt of the Vohs & Schooler study, this time in the lab. Data collection should be complete in October. We plan to submit a manuscript that includes both experiments in November.

I am very disappointed in the Ma-Kellams paper on two major grounds: 1. its failure to list and address the dozens of hidden assumptions that affect the study, and 2. its lack of depth regarding both the study approach and the authors’ whole understanding of the issue.

Depth of assumptions:

The study goal was stated as, “LINKS between EXPOSURE to SCIENCE and MORAL or prosocial BEHAVIORS.” (verbatim from the paper) The summary conclusion was, “Across four studies, both naturalistic measures of science exposure and experimental primes of science LED to increased adherence to moral norms and more morally normative behaviors across domains.”

Every word I emphasized in the goals should have been defined in substantially more detail just to clarify what was being discriminated. “LINKS” can mean a lot of things: correlation, cause, catalyst, interpretive element etc. In the summary conclusion, the word “led” was used to clarify this. I read “led” to mean “causal”. The study is far from proving that, and not just due to the small sample sizes.

EXPOSURE to SCIENCE can include reading single words or reading complete books, studying it in college, pursuing it as a career etc. It can include a hard science (physics), soft science (psychology), pseudo-science (astrology) etc. or sciences with very different study content: physics (matter), psychology (emotions and behaviors), chemistry (molecular interactions), geology (rocks), criminology (law) etc. The words used in the summary statement, “across four studies… naturalistic measures … and experimental primes … increased adherence … more.. normative behaviors ACROSS DOMAINS” suggest too much inclusivity without addressing any of these differences.

MORALITY is also a very large topic. What is sorely missing in the report is any discussion of the definition or context of morality. Sexual morality is very different from altruism (study 3 variable): social “altruism”, family “altruism”, national “altruism”, “altruism” toward clan or enemy; or financial greed (study 4 variable). The issue of “secular morality” vs. “religious morality” should have been discussed as well as there are wide variations. The discussion should also have addressed the fact that “cultural morality” is well known to be primarily driven by technology and the media, rather than religion – with religions playing catch-up rather than leading. “Technology” of course is driven by scientific discovery. So, a connection between “cultural morality” and the “sciences” which created its current values, creates a de-facto “link” between them, regardless of psychological interference.

The paper repeatedly states, “these studies are the first of their kind to systematically and empirically test the relationship between science and morality. No studies to date have directly investigated the link between beliefs in science and moral or prosocial outcomes.” This statement demonstrates a significant lack of understanding of the issues being used in the study. Morality has been studied in great depth since Socrates. The literature explodes around 1969 with structural-developmental psychology. (Kohlberg ’69, Loevinger ‘70, Rest ‘79) The fact that just a few words could affect “immediate attitudes” and social behavior, drastically in many cases, was well known and studied since antiquity. This is the whole basis of word selection in political speeches and advertising. (Discussed in new ways on the language tab at my website: A3society.org).

In summary, I would describe this study as a “single data point” describing some sparsely defined sexual and economic attitudes of a few people from the Santa Barbara community, which, due to its reputation as the American Riviera, may only be of interest to a few researchers.