Posted
by
Soulskill
on Tuesday November 26, 2013 @10:10PM
from the science-is-self-correcting dept.

ananyo writes "Science has a much publicized reproducibility problem. Many experiments seem to be failing a key test of science — that they can be independently verified by another lab. But now 36 research groups have struck a blow for reproducibility, by successfully reproducing the results of 10 out of 13 past experiments in psychology. Even so, the Many Labs Replication Project found that the outcome of one experiment was only weakly supported and they could not replicate two of the experiments at all."

But psychology isn't really able to hold on to much of the scientific method, because there's so much we don't know.

Alchemy isn't a science, but it led to chemistry, which is.

Astrology isn't a science, but it led to astrophysics, which is.

Because the method Alchemy and Astrology used to get BETTER predictions was the nascent scientific method: keeping what worked and ejecting what didn't was part of that method (which a lot of social science doesn't do, worse luck). And the remains were the germs of the sciences that followed.

The ironic thing about statements like these is that they usually come from people with no scientific training in any field, nor any meaningful training in statistics, but only a "sciency" inclination and questionable, popular distillation-derived knowledge of some principles from what they consider "the hard sciences".

Sadly, this irony will be lost on the people making such statements, who will, for some unfathomable reason, continue to disparage people doing meaningful work in the sciences, while never coming close to accomplishing anything of the sort themselves.

Actual academics have an idea of the hard work involved in contributing to the human knowledge base in all scientific disciplines, and thus, tend to respect each other's work (as long as others don't step on their own toes in their particular area of specialization, in which case, prepare for turbulence).

If about half of the studies showed a positive effect, that is hardly proof that there is no effect. It may not be sufficient to show that there is an effect, but it's a clear hint that there might be. To make a stronger statement one would have to take a closer look at the studies in question, because not every study has the same quality. In the extreme case, all of the studies showing a positive effect might be flawed while those showing no effect are sound (in which case the claim that there's no effect would stand).
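That intuition can be made concrete with a quick sign test. A minimal sketch, using hypothetical numbers (10 positive results out of 20 studies, the kind of "about half" split the post describes; the function name is mine):

```python
from math import comb

def binomial_two_sided_p(k, n, p=0.5):
    """Two-sided exact binomial test: probability, under chance level p,
    of an outcome at least as extreme as k successes out of n."""
    pmf = [comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1)]
    cutoff = pmf[k]
    return sum(q for q in pmf if q <= cutoff + 1e-12)

# 10 positives out of 20 is exactly the chance expectation, so the
# p-value is ~1.0: the data neither demonstrate an effect nor rule one out.
print(round(binomial_two_sided_p(10, 20), 3))  # 1.0
```

By contrast, 18 positives out of 20 would give p around 0.0004 -- the kind of lopsided split that would actually settle the question.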

Medication belongs to the field of psychiatry. And for the most part, medications do have an effect. But it's only temporary, and the human body gets used to them after a while. So in the long term, medication is largely useless, and in fact counterproductive, as it tends to cause other, worse effects ("side" effects). But in the short term, it helps.

All systems have a state of equilibrium, a state of stability. The same holds for the body and the mind, two different but related and dependent systems. They'll always tend towards the state of equilibrium because that's the path of least resistance.

Psychological ills are not the equilibrium being tipped, but the point of equilibrium itself changing. To truly "cure" someone of depression or OCD or bipolar disorder, you have to change the point of equilibrium itself. That's much, much harder than you can imagine, and a far greater challenge than any pill will ever resolve. Those whose equilibrium was changed by an event in their life are easier to change back than those who were born with a certain equilibrium. Some people call the former nature vs. nurture. I call it, again, the path of least resistance.

Psychology is not attempting to medicate everyone. It's attempting to explain humanity in terms familiar to the scientifically minded.

Evidence-based medicine is valuable to the extent that the evidence base is complete and unbiased. Selective publication of clinical trials — and the outcomes within those trials — can lead to unrealistic estimates of drug effectiveness and alter the apparent risk–benefit ratio.

(**) Also, I have no meaningful training in science or statistics. If you want, you can win the argument by pointing this out in your response.

Read this book: Why Zebras Don't Get Ulcers [amazon.com]. It explains how stress influences our lives, and how complex the system that tries to regulate it is. It shows that it all works beautifully for those living in the wild, but the system is not so well suited to us.

This book won't give you the solution to depression, but it will show that the body uses many methods to accomplish several things, like redirecting resources when in danger or at rest. There is not one solution - any solution will have side effects.

(**) Also, I have no meaningful training in science or statistics. If you want, you can win the argument by pointing this out in your response.

It's not my intention to get into any arguments or win anything. When I got snarky above, it was to get some people to consider whether they're qualified to disparage the work of others. Anyway, you raise a good question.

Like all sciences, psychology entails a set of beliefs/theories/ideas/models. These constructs should be informed by evidence gathered through a certain methodology. As new evidence is gathered, old models are continually revised or superseded by new ones in an iterative process that's integral to how science works.

Without a doubt, there is a LOT of unscientificness going on in the field of psychology. Look at the satanic ritual abuse situation of the 80s [wikipedia.org] for an obvious example, especially when you realize that even today there are still psychologists treating patients for this 'malady.'

Psychology is like quantum physics, except that not only does observing something change the behavior, but the degree and kind of change is likewise unpredictable. Isaac Asimov understood that decades ago. "Psychohistory" depended on A) large populations, to smooth out personal variances, and B) keeping most of the mechanism out of sight so that people wouldn't factor the predicted results into their behavior.

A lot of what's wrong with mental health in general is that we're still chipping flints. As new studies come in, that should slowly change.

The ironic thing about statements like these is that they usually come from people with no scientific training in any field, nor any meaningful training in statistics, but only a "sciency" inclination and questionable, popular distillation-derived knowledge of some principles from what they consider "the hard sciences".

Could you please show me a reasonable experiment with proper statistics supporting this claim?

Well, to be honest - we have to confess something to you. Slashdot was once set up as a psychological experiment.... Just take a look at the users and comments, no more proof needed!

Unfortunately there are 2 major hurdles that limit all but the most groundbreaking experiments:

1. Money
2. Glory

Money is probably the biggest factor; there just isn't enough money allotted to trying to reproduce experiments. Most budgets only exist for new/continuing research, not verifying experiments done by others. And as the cost of doing experiments rises (more sophisticated equipment necessary, lots of paid "volunteers", etc.), this is only going to get worse.

Notional amounts of derivative securities have nothing to do with money.

For example, I recently traded some stock options. The actual value was around $600 at the time of the sale. The notional amount, which happens to be the stock's price trigger at which the options activate times the number of options, was $13,000. You do the math.
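Doing the math the parent invites is a one-liner; the figures below are the ones from the post:

```python
# Figures from the post: the notional amount is the options' trigger
# price times the number of options; the actual value is what the
# position traded for at the time of sale.
notional_amount = 13_000
actual_value = 600

ratio = notional_amount / actual_value
print(f"notional is {ratio:.1f}x the actual value")  # notional is 21.7x the actual value
```

So headline notional figures can overstate the money actually at stake by an order of magnitude or more, which is the point being made here.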

There's no shortage of money. The only scarcity is the lack of political will to fund basic social services and science. There is no economic or physical necessity preventing us from funding food stamps, science research, health care, a basic income.

Even if you reduce it by an order of magnitude, that's still $60 trillion.

And if you reduce it by two orders of magnitude, that's still $6 trillion.

There's no shortage of money. The only scarcity is the lack of political will to fund basic social services and science. There is no economic or physical necessity preventing us from funding food stamps, science research, health care, a basic income.

I agree that there's no "shortage of money". But the things which you mention do not have that much value to them. For example, most people earn enough to pay for their own food and health care, and still have a basic income.

Also, efforts to ensure that these services are available can in turn harm the provision of those services. A lot of effort has been made to turn corn inefficiently into a gasoline additive. The motives are mostly political.

The sad thing is that it's on par with the level of TV Tropes. They first use their pattern-matching brains to notice some pattern, then go seek out quantifications for it. That's ridiculous. That's why literally everything is a trope -- even the tropeless story is a trope. The same goes for psychological classifications and categorizations of behaviours. Some psychologists claim to study cognitive bias -- yet their own confirmation bias has blinded them to the fact that their distinctions themselves were shaped by the biases they set out to study.

That is a very disingenuous statement. While I would agree that there are many aspects of Psychology/Psychiatry that are not very scientific, there are some areas that are rigorous. Neuropsychology can be a good example. Testing and measurement is another good example. There is a lot more to Psychology than the hokey therapy that we think of.

Psychology is a huge field. Perception, experimental analysis of animal behaviour, clinical psychology, cognitive biases etc. etc. (Note that only one of those involves psychiatrists.) Some bits allow for harder science than other bits.

I personally don't know enough about psychiatry to form a judgement on how scientific they are, but unlike you, at least I know what a psychologist is (or something of the range that they could be.) Your trite dismissal says much about your ignorance and nothing about psychology.

Psychology is a soft science because of the numerous variables that studies often simplify into constants, for simplicity's sake and nothing else. Economics and politics are the same, mostly because they're based on psychology.

It's an inexact science because the human condition is imperfect. As opposed to the hard sciences, which are exact, because the universe around us is "perfect". And then, there's computer science, which is a mathematical, computational science that's absolute. It's not even "perfect" anymore; it's exactly what the maths say it is, and any failure sits between keyboard and chair.

Anyway, psychology is important, because the only way to truly understand the imperfect conditions of humans is via an inexact science. And it's something only fully understood by humans (computers can simulate the hard sciences to a calculable degree of accuracy, but they'll never be able to simulate the soft sciences in the same way), and innately at that.

The way to think about psychology is using fractals. X% of the population (where X exceeds statistical significance) behaves in manner a. X·(100−X)% of the population behaves in manner b. X·(100 − X·(100−X))% of the population behaves in manner c. And so on. What a, b, c, etc. are is up to you to figure out. And when you change the test, an individual who falls into one category is not guaranteed to fall into the same category again.
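Taken at face value, the recurrence above (with X written as a fraction rather than a percentage) is share(k+1) = X·(1 − share(k)). A small sketch, purely to illustrate the commenter's construction -- the function name and the choice X = 0.6 are mine:

```python
def behaviour_shares(x, n=5):
    """Iterate the post's recurrence share_(k+1) = x * (1 - share_k),
    starting from share_1 = x, with X written as a fraction of 1
    rather than a percentage."""
    shares = [x]
    for _ in range(n - 1):
        shares.append(x * (1 - shares[-1]))
    return shares

# With x = 0.6 the shares bounce around and settle toward the
# fixed point x / (1 + x) = 0.375.
print([round(s, 3) for s in behaviour_shares(0.6)])  # [0.6, 0.24, 0.456, 0.326, 0.404]
```

Whatever one thinks of the fractal framing, the sequence converges to a fixed point rather than anything self-similar.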

Note that the human mind can comprehend infinity (poorly for most, but very possible for a few), both countable and uncountable variants, but a computer will never be able to calculate it. So the fractal analogy works really, really well.

The universe only appears to be perfect to macroscopic viewers, because the time dimensions are held comparatively constant by the interactions between small particles.

In other words, an electron -- which may well "see" one spatial dimension and three time dimensions -- appears to behave statistically, not predictably. Therefore, quantum mechanics behaves like an imperfect science, as you call it.

That said, psychology, economics, and sociology attempt to make their science more perfect, often, by binding large numbers of subjects into statistical aggregates.

Psychology is a soft science because of the numerous variables that in studies are often simplified into a constant often for simplicity's sake and nothing else. Economics and politics are the same, mostly because they're based on psychology.

Ironically, the same things that make psychology a soft science are the same things used by theoretical physicists. Both rely heavily on probability versus observable inputs (although for different reasons). So, I would posit that what makes psychology and the others you mentioned a soft science versus a hard science is not about exactitude, but semantics. That, and the fact that it was the hard sciences that created the definition in the first place.

No, the field you're thinking of is Neuroscience and Cybernetics -- These have evidence based on observation and models which have predictive power. Psychology is just confirmation bias. [wikipedia.org] You must prove the null hypothesis more implausible than the original hypothesis, yet Psychology does not do this. For every ridiculous Sexual Epistemology, there's an equally valid Scatological Epistemology.

The truth is that neurons fire in brains, and that complexity gives rise to emergent behaviours. Leaping the gulf in understanding to arrive at the explanations that Psychology and Philosophy give is akin to claiming a God in a Chariot pulls the Sun across the sky.

Your argument is only valid for recent times. For most of the history of modern psychology, neuroscience and psychology were not separate fields. Often in med school today, the psych departments and the neurology departments have been combined, because we have found that the two are interrelated.

Since most psychological research deals with evidence-based observation and models which have predictive power, what distinguishes it from the subclassification of neuroscience? It's a little bit like insisting that chemistry is just a subclassification of physics.

Good post, Woodhams. I'll use an analogy I formed when discussing psychology with my girlfriend, who's been in the field a while: psychology today is like studying chemistry in the Bronze Age. Back then, they didn't have the means to understand why this chemical worked with that chemical; they just knew it worked, and did chemistry via trial and error and guessing. Today, psychology is classifying things based on relations and forming best practices, but we don't understand why things are the way they are, because of our limited understanding of the brain.

Maybe things will change in 100 years, maybe not. I think the field is worth its weight in gold though, there's a lot of good that can be/is being done and a lot of progress still to be made.

That is an extremely narrow view of psychology today and pretty much views it in terms of therapy. Let me ask you this: when Warren Buffett invests in the market using a contrarian strategy, are you stating that there is no underlying science backing him up? I ask because he and many others seem to be quite successful at it.

Real psychology has a lot more depth than the therapist's couch. Should the determination of what is science be based on whether it can fulfill the requirements of the scientific method, versus whether it looks like what we picture as laboratory science?

I agree. Science goes through the progression from hypothesis, to tested results, to verified results, to working theories, and eventually laws (although the line between the last two is arbitrary in modern science, really). The more results that are tested again and again, the better science is as a whole.

If the scientific community valued reproducibility as much as original work, we would solve 2 problems:

1) Science without confirmation can lead us astray for years.
2) There are plenty of scientists who are great at experimentation but lousy at coming up with new ideas, and these scientists (or potential scientists) may not be finding their full potential.

And while we're at it, let's value failed experiments as much as successful experiments.

This is very important, as they did not randomly pick studies but rather chose the ones they "Deemed Worthy". As they did not want to be proven bad scientists (I assume), their conscious or unconscious bias will have been towards sound or easy studies.

Did you read TFA? Or did you choose sentences to read at random? Those were quoted as the results that worked. In fact, here is the original paragraph:

Ten of the effects were consistently replicated across different samples. These included classic results from economics Nobel laureate and psychologist Daniel Kahneman at Princeton University in New Jersey, such as gain-versus-loss framing, in which people are more prepared to take risks to avoid losses, rather than make gains [1]; and anchoring, an effect in which the first piece of information a person receives can introduce bias to later decisions [2]. The team even showed that anchoring is substantially more powerful than Kahneman's original study suggested.

Two that didn't were about social priming: one was currency priming, in which participants supported what I assume is the current state of capitalism after seeing money, and the other primed feelings of patriotism with a flag. Moreover, both original authors were positive about it:

Social psychologist Travis Carter of Colby College in Waterville, Maine, who led the original flag-priming study, says that he is disappointed but trusts Nosek’s team wholeheartedly, although he wants to review their data before commenting further. Behavioural scientist Eugene Caruso at the University of Chicago in Illinois, who led the original currency-priming study, says, “We should use this lack of replication to update our beliefs about the reliability and generalizability of this effect”, given the “vastly larger and more diverse sample” of the Many Labs project. Both researchers praised the initiative.

There you go, quoting the article directly since you can't be bothered to read it. It is true that they apparently chose what some consider to be important effects and the evidence against social priming is upsetting to some. Still, the fact that verification actually happened and people are happy about it shows science is alive and kicking.

Anyway, another cool thing about this study is that it uses the Open Science Framework [openscienceframework.org], which I hadn't heard about until today but seems pretty cool.

The "problem" with experiments that aren't reproducible may not be with the experiments as much as with the popular media that decides to make sweeping generalizations based on one result. Though I guess some blame definitely needs to be applied to the researcher who allows unverified results to be misrepresented to get that 15 minutes of fame in a quote in The Guardian or USA Today...

Something passed through our hands here at the office recently. A "scientific" study, in the very soft field of human behaviour, where the sample size was 27. And that set was split into 4 groups. Absolutely any result from that experiment was possible and could be explained as pure random chance, not deviating from the null hypothesis. Reading work like that, and other papers from the same authors and institutes, we got the feeling that it was mostly a delusion that they were doing science. Like how when a 4-year-old "helps" with the dishes, there's a delusion that it's actually doing the washing up, and you really don't want to spoil its fun by breaking that illusion - it's not doing any harm, is it?
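The n = 27 complaint is easy to demonstrate by simulation. A quick sketch, assuming (purely for illustration) a binary outcome with no true group differences at all:

```python
import random

def max_group_gap(rng, n=27, groups=4):
    """Give n subjects a 50/50 binary outcome, split them into
    `groups` groups, and return the largest difference in
    positive-rate between any two groups -- all under the null
    hypothesis, i.e. with no real effect whatsoever."""
    sizes = [n // groups + (1 if i < n % groups else 0) for i in range(groups)]
    rates = [sum(rng.random() < 0.5 for _ in range(s)) / s for s in sizes]
    return max(rates) - min(rates)

rng = random.Random(0)
gaps = [max_group_gap(rng) for _ in range(2000)]
share_big = sum(g >= 0.3 for g in gaps) / len(gaps)
print(f"{share_big:.0%} of pure-chance runs show a 30+ point gap between groups")
```

With groups of only 6-7 people, large apparent differences between the "best" and "worst" group arise from chance alone in a substantial fraction of runs, which is exactly why any result from such a study is compatible with the null.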

The harm comes when they "dry" the dishes with the dish towel, getting it all nasty and dirty, and then put the dishes away in the cupboard still dirty, contaminating the other dishes they come into contact with. There's a direct analogy to be made here.

That's not really a problem from the perspective of scientists - in the fields of psychology, cog sci, and neuroscience, I've never encountered an instance of a researcher using any popular media distillation of some study as a meaningful source of info on that study (aside from making them aware of the study's existence).

Also, you seem to be assigning some a priori status of reproducibility, or lack thereof, to some studies, which really confuses the issue. For example, what does it mean for an experiment to be "not reproducible"?

I guess I should clarify the larger epistemic point at which I was hinting. That others may not, in some reasonable number of attempts, reproduce an experiment does not mean that the experiment is categorically not reproducible. Any number of things, such as lab conditions (which are not, in practical, absolute terms, reproducible within a lab, much less between different labs), can influence the results of experiments, and while adhering to certain sound methodological principles abstracts away a lot of this variability, it can never remove it entirely.

Isn't the ability to reproduce results based on the "idea" in the methods section central to the concept of scientific "reproducibility"? If I claim I applied one set of methods and got a certain result --- but those methods differ from the physical reality of my setup such that reasonable adherence to the stated methods will produce a vastly different result --- then I have failed at publishing a "reproducible" experiment.

Example:"By the method of releasing lead spheres at rest into the air, I have obser

Sorry, I have no idea how Slashdot managed to completely mangle my post, cutting out a big chunk in the middle, after previewing and submitting. I don't have the patience to re-write it, so just ignore the garbled mess left after Slashdot's unexpected redaction of a whole middle paragraph.

Don't worry about the post getting cut off - the point you were trying to make is clear. To address the idea that "the methods are different from the physical reality of my setup": it's simply not possible to be comprehensive in the instructions you provide when you write up your methods, and furthermore, in reality, variables are often introduced in a lab which impact experimental results but which are not accounted for in the methods writeup, because the authors are not aware of these variables (this is a common source of failed replications).

No, I still think that, if the methods are insufficiently described (do not sufficiently capture the nuances of the physical experiment) to permit reliable replication of results, then this makes an experiment un-reproducible by definition.

Indeed, allowances must be made for the study of ephemeral phenomena --- people's perceptions in some particular time and place; a comet passing by once --- that preclude actual recreation of the experiment. So, I think there is a "methodological description" requirement at the heart of reproducibility.

For example, what does it mean for an experiment to be "not reproducible"? You can fail to reproduce the results of an experiment, but proving that a result is not reproducible, or somehow knowing that it isn't, is a different issue altogether.

You make absolutely no sense here. If I follow the original researcher's notes and experiment design correctly and cannot obtain his result, the experiment was not reproducible. That's what "not reproducible" *means* -- I couldn't reproduce his results.

Most people here are not looking it from the perspective of scientists, though, but from the (as the article states) "much publicized" perspective.

For example, what does it mean for an experiment to be "not reproducible"?

Exactly. You should be asking that question about the article, not me. :) In fact I do have a neuroscience background, even if that's not what I am doing these days... but that is pretty much irrelevant to this thread, which was meant to start a conversation about how the mainstream media tends to latch onto any unverified/unreproduced study and report it as the canonical truth.

If you're going to pick a paper, then The Guardian was not a good choice, given that Dr Ben Goldacre writes a regular column for them called "Bad Science", where he critiques terrible science reporting in the media (amongst other things).

But unfortunately not all of his colleagues share his standards for scientific reporting - there have been plenty of horribly reported scientific studies in The Guardian, as well as in almost all "popular media" sources that just can't help it...

Funny, because there's another study recently published that said technology is killing everyone's sex lives. It's probably old news here, as it's just a downward extrapolation of the extreme case found here.

I thought at first it was saying 36 groups each tried to reproduce the results of 13 experiments, and all 36 were successful with 10 of 13 (though not necessarily the same ten), successfully reproducing the results of a reproducibility meta-experiment.

Were they not able to reproduce the outcome of an experiment, or were they not able to reproduce the whole experiment (as in "We assume that during a total eclipse in the month of May..." or "...to reproduce this, take any old Large Hadron Collider lying around...")?

Apparently, there were no "experiments" in the lab/contrivance sense - it was just a questionnaire. Regarding the failures: "Of the 13 effects under scrutiny in the latest investigation, one was only weakly supported, and two were not replicated at all. Both irreproducible effects involved social priming. In one of these, people had increased their endorsement of a current social system after being exposed to money [3]. In the other, Americans had espoused more-conservative values after seeing a US flag [4]."

I have a particular experiment in mind which, to me, highlights an aspect of human nature we would all prefer to deny: we all have within us the capacity to ruthlessly abuse others, even and especially friends and family, when given the opportunity. This famous experiment [wikipedia.org] took a group of peers and assigned some the role of prison guard and others that of prisoner. It really didn't take long before things went really bad.

I have often heard that corruption is a problem of opportunity more than of character.

"Science has a much publicized reproducibility problem" links to an article that says "as many as 17–25% of such findings are probably false", but a group reproducing 10 out of 13 experiments (23% not replicable) is striking a blow for reproducibility?
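The juxtaposition is simple arithmetic; a quick check of the two figures quoted above:

```python
# Figures from the summary and the linked article.
replicated, total = 10, 13
failure_rate = (total - replicated) / total
# 3 of 13 effects failing to replicate is about 23%, which sits
# squarely inside the article's own 17-25% estimate of false findings.
print(f"{failure_rate:.0%} not replicated")  # 23% not replicated
```

So the replication project's failure rate matches, rather than contradicts, the estimate the summary links to.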

Because that's how "science" works in psychology.
You come up with some ridiculously simple experiment, like giving people the same amount of money for a simple boring task or for a complex creative task.
Turns out most people go for the simple boring one.
Study conclusion: people prefer simple boring tasks!
My conclusion: why take more risk of screwing up when you can make the same money with something easy?