Sunday, 17 May 2015

The Royal Society has been celebrating the 350th
anniversary of Philosophical Transactions, the world's first scientific journal,
by holding a series of meetings on the future
of scholarly scientific publishing. I followed the whole event on social
media, and was able to attend in person for one day. One of the sessions
followed a Dragons' Den format, with speakers having 100 seconds to convince
three dragons – Onora O'Neill, Ben Goldacre and Anita de Waard – of the
fund-worthiness of a new idea for science communication. Most were
light-hearted, and there was a general mood of merriment, but the session got
me thinking about what kind of future I would like to see. What I came up with
was radically different from our current publishing model.

Most of the components of my dream system are not new, but
I've combined them into a format that I think could work. The overall idea had its origins in a blogpost I wrote in 2011, and has
points in common with David Colquhoun's submission to the dragons, in that
it would adopt a web-based platform run by scientists themselves. This is what
already happens with the arXiv for the physical
sciences and bioRxiv for biological sciences.
However, my 'consensual communication' model has some important differences.
Here are the steps I envisage an author going through:

1. An initial protocol
is uploaded before a study is done, consisting only of an introduction, a
detailed methods section and an analysis plan, with the authors anonymised. An
editor then assigns reviewers to evaluate it. This aspect of the model draws on
features of registered
reports, as implemented in the neuroscience journal Cortex. There are two key scientific advantages to
this approach: first, reviewers are able to improve the research design, rather
than criticise studies after they have been done. Second, there is a record of
what the research plan was, which can then be compared with what was actually
done. This does not confine the researcher to the plan, but it does make
transparent the difference between planned and exploratory analyses.

2. The authors get a chance to revise the protocol in
response to the reviews, and the editor judges whether the study is of an adequate
standard, and if necessary solicits another round of review. When there is
agreement that the study is as good as it can get, the protocol is posted as a
preprint on the web, together with the non-anonymised peer reviews. At this
point the identity of authors is revealed.

3. There are then two optional extra stages that could be
incorporated:

a) The researcher can solicit collaborators for the study.
This addresses two issues raised at the Royal Society meeting. First, many
studies are underpowered: duplicating a study across several centres could help
in cases where there are logistical problems in getting adequate sample sizes to
give a clear answer to a research question. Second, collaborative working
generally enhances reproducibility of findings.

b) It would make
sense for funding, if required, to be solicited at this point – in contrast to
the current system where funders evaluate proposals that are often only
sketchily described. Although funders currently review grant proposals, there
is seldom any opportunity to incorporate their feedback – indeed, very often a
single critical comment can kill a proposal.

4. The study is then completed, written up in full, and reviewed
by the editor. Provided the authors have followed the protocol, no further
review is required. The final version is deposited with the original preprint,
together with the data, materials and analysis scripts.

5. Post-publication discussion of the study is then encouraged
by enabling comments.
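The five steps above amount to a small workflow with a fixed set of stages and allowed transitions. As an illustrative sketch only (the stage names and transition rules are my own invention, not part of any existing platform), the workflow could be modelled as a simple state machine:

```python
from enum import Enum, auto

class Stage(Enum):
    """Stages of the hypothetical 'consensual communication' workflow."""
    PROTOCOL_SUBMITTED = auto()  # step 1: anonymised protocol uploaded
    UNDER_REVIEW = auto()        # editor assigns reviewers; may loop for revisions
    PROTOCOL_POSTED = auto()     # step 2: agreed protocol posted as a preprint
    STUDY_COMPLETED = auto()     # step 4: full write-up, data and scripts deposited
    OPEN_DISCUSSION = auto()     # step 5: post-publication comments enabled

# Allowed transitions; UNDER_REVIEW -> UNDER_REVIEW models another round of review.
TRANSITIONS = {
    Stage.PROTOCOL_SUBMITTED: {Stage.UNDER_REVIEW},
    Stage.UNDER_REVIEW: {Stage.UNDER_REVIEW, Stage.PROTOCOL_POSTED},
    Stage.PROTOCOL_POSTED: {Stage.STUDY_COMPLETED},
    Stage.STUDY_COMPLETED: {Stage.OPEN_DISCUSSION},
}

def advance(current: Stage, target: Stage) -> Stage:
    """Move a submission to the next stage, rejecting invalid jumps
    (e.g. posting results before the protocol has been agreed)."""
    if target not in TRANSITIONS.get(current, set()):
        raise ValueError(f"cannot go from {current.name} to {target.name}")
    return target
```

The point of the enforced ordering is the model's central feature: review happens before the study is run, and the final report can only follow a publicly posted protocol.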

What might a panel of dragons make of this? I anticipate
several questions.

Who would pay for it?
Well, if arXiv is anything to go by, costs of this kind of operation are modest
compared with conventional publishing. They would consist of maintaining the web-based
platform, and covering the costs of editors. The open access journal PeerJ has developed an efficient e-publishing
operation and charges $99 per author per submission. I anticipate a similar
charge to authors would be sufficient to cover costs.

Wouldn't this give an
incentive to researchers to submit poorly thought-through studies? There
are two answers to that. First, half of the publication charge to authors would
be required at the point of initial submission. Although this would not be
large (e.g. £50) it should be high enough to deter frivolous or careless
submissions. Second, because the complete trail of a submission, from pre-print
to final report, would be public, there would be an incentive to preserve a
reputation for competence by not submitting sloppy work.

Who would agree to be
a reviewer under such a model? Why would anyone want to put their skills
into improving someone else's work for no reward? I propose there could be
several incentives for reviewers. First, it would be more rewarding to provide
comments that improve the science, rather than just criticising what has
already been done. Second, as a more concrete reward, reviewers could have
submission fees waived for their own papers. Third, reviews would be public and
non-anonymised, and so the reviewer's contribution to a study would be
apparent. Finally, and most radically, where the editor judges that a reviewer
had made a substantial intellectual contribution to a study, then they could
have the option of having this recognised in authorship.

Why would anyone who
wasn't a troll want to comment post-publication? We can get some insights
into how to optimise comments from the model of the NIH-funded platform PubMed Commons. They do
not allow anonymous comments, and require that commenters have themselves
authored a paper that is listed on PubMed. Commenters could also be offered incentives such as a reduction of
submission costs to the platform. To
this one could add ideas from commercial platforms such as eBay, where sellers
are rated by customers, so you can evaluate their reputation. It should be
possible to devise some kind of star rating – both for the paper being
commented on, and for the person making the comment. This could provide
motivation for good commenters and make it easier to identify the high quality
papers and comments.
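A star-rating scheme of the kind suggested could be as simple as averaging the ratings each commenter receives. The class below is a toy sketch under my own assumptions (names, the 1–5 scale, and the zero default for unrated commenters are all illustrative, not a real platform's design):

```python
from collections import defaultdict

class ReputationTracker:
    """Toy eBay-style reputation: mean star rating per commenter."""

    def __init__(self):
        # commenter name -> list of star ratings received
        self._ratings = defaultdict(list)

    def rate(self, commenter: str, stars: int) -> None:
        """Record a 1-5 star rating for a commenter."""
        if not 1 <= stars <= 5:
            raise ValueError("stars must be between 1 and 5")
        self._ratings[commenter].append(stars)

    def reputation(self, commenter: str) -> float:
        """Mean rating; 0.0 for commenters with no ratings yet."""
        votes = self._ratings[commenter]
        return sum(votes) / len(votes) if votes else 0.0
```

A real system would need safeguards a plain mean lacks (weighting by rater reputation, resistance to vote brigading), but even this minimal version shows how a visible reputation score could reward good commenters.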

I'm sure that any dragon from the publishing world would
swallow me up in flames for these suggestions, as I am in effect suggesting a
model that would take commercial publishers out of the loop. However, it seems
worth serious consideration, given the enormous
sums that could be saved by universities and funders by going it
alone. But the benefits would not just
be financial; I think we could greatly improve science by changing the point in
the research process when reviewer input occurs, and by fostering a more open
and collaborative style of publishing.

Monday, 4 May 2015

The Early
Years Foundation Stage Profile was developed by the government's Standards
and Testing Agency "to support
practitioners in making accurate judgements about each child's attainment".
More specifically:

The EYFS Profile
summarises and describes children’s attainment at the end of the EYFS. It is
based on ongoing observation and assessment in the three prime and four
specific areas of learning, and the three characteristics of effective
learning:

• Prime areas:
communication and language; physical development; personal, social and
emotional development

• Characteristics:
playing and exploring; active learning; creating and thinking critically

… for each ELG, practitioners must judge whether a child is meeting
the level of development expected at the end of the Reception year (expected),
exceeding this level (exceeding), or not yet reaching this level (emerging).

The manual gives concrete examples of the kinds of behaviour
that meet the expected level for a given Early Learning Goal. For instance:

Understanding:
Children follow instructions involving several ideas or actions. They answer
‘how’ and ‘why’ questions about their experiences and in response to stories or
events.

Speaking: Children
express themselves effectively, showing awareness of listeners’ needs. They use
past, present and future forms accurately when talking about events that have happened
or are to happen in the future. They develop their own narratives and
explanations by connecting ideas or events.

Strikingly absent from these descriptions is any allowance
for the child's age. The timing of the assessment is specified to occur when
children are aged from 4 years 10 months to 5 years 9 months.

Children's language skills (and indeed other skills) develop
rapidly in the preschool and early school years. I first became aware of this many years ago
when I was developing a children's comprehension assessment (TROG). The goal
was to establish the typical range of performance at different ages and
subsequently use TROG to identify cases of poor comprehension in clinical
settings. The assessment involved showing children sets of four pictures and
asking them to point to the one that matched a spoken phrase or sentence. I knew very little about developmental psychology
at the time, so I just decided to try the materials with children of different
ages to see how they reacted. It soon became apparent that there were
substantial age-related changes, and I realised that I would need to use
four age-bands for 4-year-olds and two age-bands for 5-year-olds. Some
illustrative data are shown in Figure 1.

Findings like this are not specific to this test. I've
developed several language assessments over the years and I've used those
developed by others: they all show rapid change from 4 to 6 years.

Concerned by this, I wrote for information to the
government's Children and Early Years Data Unit, who referred me to this report. This gives
percentages of children reaching a Good Level of Development, defined as
achieving "at least the expected
level in the early learning goals in the prime areas of learning (personal,
social and emotional development; physical development; and communication and
language) and in the specific areas of mathematics and literacy." A
Good Level of Development was obtained by 69% of autumn-born children, 59% of
spring-born children and 47% of summer-born children, confirming that the
standards used to evaluate children are sensitive to age.

This is seriously problematic for at least two reasons. First,
it means we are using flawed assessments that will over-identify problems in
younger children. It is already established that in the USA attentional
deficits are over-diagnosed in summer-born children (Elder, 2010) –
a problem that has long-term consequences when children are subsequently
prescribed medication for what may actually be normal behaviour in an immature
child. Making children feel that they are falling short of an expected standard
before they are 5 years old cannot be good for their development. In this
regard it is noteworthy that there is evidence that being summer-born continues
to be associated with educational disadvantage in English children through the
later school years (Crawford et
al, 2013).

A second problem is that use of inappropriate criteria for
'expected' levels of development will give a false impression of the numbers of
children with developmental difficulties. Consider this
article describing an 'early learning crisis' with '20 percent of children unable to communicate properly at age 5'. I
have a particular interest in children who have language difficulties, but
nobody is helped by over-identifying problems in children who are just the
youngest in their class. I've seen enough 4 and 5-year-olds to know that the
'early learning goals' for understanding and speaking are not realistic
'expectations' for 4-year-olds and for those who have only just turned 5 years.
Indeed, the fact that one third of the oldest children are not regarded as
having a good level of development suggests to me that the expectations are
inappropriately high even for the oldest 5-year-olds.

My colleague Courtenay Norbury, Professor in the Psychology
Dept at Royal Holloway, will shortly be publishing data from a large survey of
language development in reception class children in Surrey*. She tells me that
month of birth is once again emerging as an important factor.

I'm not someone who is opposed to assessment in principle,
but if you are going to do it, it's important to do it in an informed manner. Surely
it is time for the policy-makers in this area to recognise that their current
practices of early assessment are misleading, and have the potential to cause
damage when children are evaluated against standards that are overly stringent
and do not take age into account.

*Update 5th June 2015: This is now published as an open access 'early view' paper in Journal of Child Psychology and Psychiatry: http://onlinelibrary.wiley.com/doi/10.1111/jcpp.12431/abstract