This is an Open Access article distributed under the terms of the Creative Commons Attribution License (
http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Background

There seems to be a common belief that women are better at multi-tasking than men,
but there is practically no scientific research on this topic. Here, we tested whether
women have better multi-tasking skills than men.

Methods

In Experiment 1, we compared performance of 120 women and 120 men in a computer-based
task-switching paradigm. In Experiment 2, we compared a different group of 47 women
and 47 men on "paper-and-pencil" multi-tasking tests.

Results

In Experiment 1, both men and women performed more slowly when two tasks were rapidly
interleaved than when the two tasks were performed separately. Importantly, this slow
down was significantly larger in the male participants (Cohen’s d = 0.27). In an everyday multi-tasking scenario (Experiment 2), men and women did
not differ significantly at solving simple arithmetic problems, searching for restaurants
on a map, or answering general knowledge questions on the phone, but women were significantly
better at devising strategies for locating a lost key (Cohen’s d = 0.49).

Conclusions

Women outperform men in these multi-tasking paradigms, but the near lack of empirical
studies on gender differences in multitasking should caution against making strong
generalisations. Instead, we hope that other researchers will aim to replicate and
elaborate on our findings.

Background

In the current study, we address the question of whether women are better multi-taskers
than men. This idea is commonly held by lay people (for a review see Mäntylä 2013).
While the empirical evidence for women outperforming men in multi-tasking has been
sparse, researchers have shown that women are involved more in multi-tasking than men, for example in household tasks (Offer and Schneider 2011; Sayer 2007).
Here, we ask whether women actually outperform men when multi-tasking.

Multi-tasking is a relatively broad concept in psychology, developed over several
decades of research (for a review see Salvucci and Taatgen
2010); this research has enormous relevance for understanding the risk of multi-tasking
in real-life situations, such as driving while using a mobile phone (Watson and Strayer
2010).

There are at least two distinct types of multi-tasking abilities. The first type is
the skill of being able to deal with multiple task demands without the need to carry out the involved tasks simultaneously. A good example of this type
of multi-tasking is carried out by administrative assistants, who answer phone calls,
fill in paperwork, sort incoming faxes and mail, and typically do not carry out any
of these tasks simultaneously.

A second type of multi-tasking ability is required when two types of information must
be processed or carried out simultaneously. An example of the latter category is drawing a circle with one hand while drawing
a straight line with the other hand. While humans have no difficulty carrying out
each of these tasks individually, drawing a circle with one hand and drawing a straight line with the other simultaneously is nearly impossible (the circle becomes more
of an ellipse and the line more of a circle, Franz et al.
1991). Another example is the requirement to process different types of sensory information
at the same time (Pashler
1984), such as different auditory streams on different ears (Broadbent
1952). While humans frequently are asked to do such tasks in the psychological laboratory,
humans seem to try to avoid these situations in real life, unless they are highly
trained (e.g., playing piano, with the left and right hands playing different notes,
or having a conversation while driving a car). Arguably, we are not good at doing
multiple tasks simultaneously (except when well trained), and that probably explains
why this type of multi-tasking is less common than the type in which we serially alternate
between two tasks (Burgess
2000). It is because of this that we focus on the first type of multi-tasking in this
study. Also, it is important to note that the two types of multi-tasking described
above are two extreme examples on a continuum of multi-tasking scenarios.

Cognitive scientists and psychiatrists have postulated a special set of cognitive
functions that help with the coordination of multiple thought processes, which include
the skills necessary for multi-tasking, namely "executive functions" (Royall et al.
2002): task planning, postponing tasks depending on urgency and needs (i.e., scheduling),
and ignoring task-irrelevant information (also known as "inhibition"). Healthy adults
can reasonably well interleave two novel tasks rapidly (Vandierendonck et al.
2010). The involved (human) brain areas necessary for multi-tasking have been investigated
and we can at the very least make a reasonable estimate of which are involved (Burgess
et al.
2000). Among primates, humans seem to have a unique way of dealing with task switching
(Stoet and Snyder
2003), which we hypothesize reflects an evolutionary unique solution for dealing with
the advantages and disadvantages of multi-tasking (Stoet and Snyder
2012). The specific contributions of individual brain areas to executive control skills
in humans have been linked to a number of mental disorders, in particular schizophrenia
(Evans et al.
1997; Kravariti et al.
2005; Royall et al.
2002; Semkovska et al.
2004; Dibben et al.
2009; Hill et al.
2004; Laws
1999).

Currently, there are few studies on gender and multi-tasking, despite a seemingly
confident public opinion that women are better at multi-tasking than men (Ren et al.
2009). Ren and colleagues (
2009) extrapolated the hunter-gatherer hypothesis (Silverman and Eals
1992) to make predictions about male and female multi-tasking skills. The hunter-gatherer
hypothesis proposes that men and women have cognitively adapted to a division of labor
between the sexes (i.e., men are optimized for hunting, and women are optimized for
gathering). Ren and colleagues speculated that women’s gathering needed to be combined
with looking after children, which possibly requires more multi-tasking than doing
a task without having to look after your offspring. In their experiment, men and women
performed an Eriksen flanker task (Eriksen and Eriksen
1974) either on its own (i.e., single task condition) or preceded by an unrelated other
cognitive decision making task (i.e., multi-tasking condition). They found that in
the multi-tasking condition, women were less affected by the task-irrelevant flankers
than men. Thus, the latter study supports the hypothesis that women are better multi-taskers.

We tested whether women outperform men in the first type of multi-tasking. In Experiment
1, we tested whether women perform better than men in a computer-based task-switching
paradigm. In Experiment 2, we tested whether women outperform men in a task designed to test "planning" in
a "real-life" context that included standardized tests of executive control functions.
Our prediction was that women would outperform men.

Experiment 1

In this experiment, we used a task-switching paradigm to measure task-switching abilities.
Task-switching paradigms are designed to measure the difficulty of rapidly switching
attention between two (or more) tasks. Typically, in these types of studies, performing
a task consists of a simple response (e.g., button press with left or right hand)
to a simple stimulus (e.g., a digit) according to simple rules (e.g., odd digits require
left hand response, even digits a right hand response).

In task-switching paradigms, there are usually two different tasks (e.g., in task A
deciding whether digits are odd or even, and in task B deciding whether digits are
lower or higher than the value 5). An easy way to think of task-switching paradigms
is to call one task "A" and another task "B". A block of just ten trials of task A
can be written as "AAAAAAAAAA" and a block of just ten trials of task B can be written
as "BBBBBBBBBB". Most adults find carrying out sequences of one task type relatively
simple. In contrast, interleaving trials like "AABBAABBAABB" is difficult, as first demonstrated
by Jersild (1927). Today, the slowing down associated with carrying out a block of mixed trials compared
to a block of pure trials is known as "mixing cost". Further, within mixed blocks,
people slow down particularly on trials that immediately follow a task switch (in
AABBAA there are two such trials, namely the first B and the first A after the B trials); the latter effect is known
as "switch cost".
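The distinction between task-repeat and task-switch trials can be illustrated with a minimal Python sketch. The function name `label_trials` and the encoding of a block as a string of task letters are illustrative assumptions and are not part of the software used in the experiment.

```python
def label_trials(sequence):
    """Label each trial in a task sequence as 'first', 'repeat', or 'switch'.

    `sequence` is a string such as "AABBAA", one character per trial.
    (Hypothetical helper for illustration only.)
    """
    labels = []
    for i, task in enumerate(sequence):
        if i == 0:
            labels.append("first")   # no preceding trial to compare with
        elif task == sequence[i - 1]:
            labels.append("repeat")  # same task as the previous trial
        else:
            labels.append("switch")  # task differs from the previous trial
    return labels


labels = label_trials("AABBAA")
# 1-based positions of the task-switch trials in the block
switch_positions = [i + 1 for i, lab in enumerate(labels) if lab == "switch"]
print(switch_positions)
```

Under this labelling, switch cost would be the response-time difference between "switch" and "repeat" trials within a mixed block, whereas mixing cost compares "repeat" trials in a mixed block with trials in pure blocks.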

Researchers have given switch costs more attention than mixing costs, especially since
the mid-1990s (Vandierendonck et al. 2010). In the current experiment, we measured both types of costs.

Methods

Participants

We recruited participants via online advertisements and fliers in West Yorkshire (UK).
Our recruitment procedure excluded participants with health problems and disorders
that could potentially affect their performance, which included color-vision deficits,
as tested with the Ishihara color test (Ishihara
1998) before each experimental session. Altogether, we selected 240 participants stratified
by gender and age (Figure
1).

Figure 1. The distribution of participants by gender and age. The average age of women was 27.4 years (SD = 6.0); the average age of men was 27.8 years (SD = 6.4).

Research ethics

Research was in accordance with the declaration of Helsinki, and approval of ethical
standards for Experiment 1 was given by the ethics committee of the Institute of Psychological
Sciences, University of Leeds. All participants gave written or verbal consent to
participate.

Apparatus and stimuli

The experiment was controlled by a Linux operated PC using PsyToolkit software (Stoet
2010). A 17" color monitor and a Cedrus USB keyboard (model RB-834) were used for stimulus
presentation and response registration, respectively. Of the Cedrus keyboard, only
two buttons were used. These were the buttons closest to the participant (3.2 × 2.2
cm each, with 4.3 cm between the two buttons), which we will further refer to as the
left and right button, respectively.

A rectangular frame (7 × 8 cm) with an upper and lower section (Figure
2a) was displayed. The words "shape" and "filling" were presented above and below the
frame, respectively. Further, four imperative stimuli were used in different trials
(Figure
2b). These four were the combination of two shapes (diamond and rectangle) and a filling
of two or three circles. The frame and the imperative stimuli were yellow and were
presented on a black background. Feedback messages were presented following trials
that were not performed correctly ("Time is up" or "That was the wrong key").

Figure 2. Schematic representation of the task-switching paradigm. A: Example trial. During a block of trials, a rectangular frame with the labels "shape"
and "filling" was visible. On each trial, a different imperative stimulus (i.e., a
stimulus that requires an immediate response) was presented in the top or bottom part
of this frame. The location (i.e., in top or bottom part of frame) determined whether
the participant had to apply the shape or filling task rules to it. B: There were four different imperative stimuli, which needed to be responded to as
follows. In the shape task, a "diamond" required a left-button response, and a rectangle
a right-button response. In the filling task, a filling of two circles required a
left-button response, and a filling of three circles a right-button response. Congruent
stimuli are those that required the same response in both tasks, whereas incongruent
stimuli required opposite responses in the two tasks. Thus, the imperative stimulus
in panel A is incongruent: It appears in the top of the frame, thus it should be responded
to in accordance with the shape task, and because it is a diamond (the filling of three
circles is irrelevant in the shape task) it should be responded to with a left-button
response (see Additional file
1 for demonstration).

Procedure

Participants were seated in a quiet and dimly lit room, and received written and verbal
instructions from the experimenter. They were instructed to respond to stimuli on
the computer screen. There were two different tasks, namely a shape and a filling
task. In the shape task, participants had to respond to the shape of imperative stimuli
(diamonds and rectangles required a left and right response, respectively). In the
filling task, participants had to respond to the number of circles within the shape
(two and three circles required a left and right response, respectively). The essential
feature of this procedure was that both task dimensions (shape and filling) were always
present and that the two dimensions required opposite responses on half the trials
(incongruent stimuli). This meant that participants were forced to think of which
of the two tasks needed to be carried out and to attend to the relevant stimulus dimension.
Participants were informed which task to carry out based on the imperative stimulus
location: If the stimulus appeared in the upper half of the frame, labeled "shape",
they had to carry out the shape task, and when it appeared in the bottom half of the
frame, labeled "filling", they had to carry out the filling task.
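The task rules described above can be summarised in a short Python sketch. The names `SHAPE_RULE`, `FILLING_RULE`, `correct_response`, and `is_congruent` are hypothetical and chosen for illustration; the experiment itself was run with PsyToolkit, not with this code.

```python
# Response mappings as described in the procedure.
SHAPE_RULE = {"diamond": "left", "rectangle": "right"}
FILLING_RULE = {2: "left", 3: "right"}


def correct_response(shape, n_circles, location):
    """Return the required button for a stimulus at a given location.

    location is "top" (shape task) or "bottom" (filling task).
    """
    if location == "top":
        return SHAPE_RULE[shape]
    if location == "bottom":
        return FILLING_RULE[n_circles]
    raise ValueError("location must be 'top' or 'bottom'")


def is_congruent(shape, n_circles):
    """A stimulus is congruent when both tasks require the same button."""
    return SHAPE_RULE[shape] == FILLING_RULE[n_circles]


# A diamond filled with three circles, shown in the top half of the frame.
print(correct_response("diamond", 3, "top"), is_congruent("diamond", 3))
```

Of the four possible stimuli, two are congruent (diamond with two circles, rectangle with three circles) and two are incongruent, so a participant who ignores the location cue would answer incongruent trials correctly only about half the time.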

Participants first went through 3 training blocks (40 trials), and then performed
3 further blocks (192 trials total) that were used in the data analysis. The first
two blocks were blocks with just one of the two tasks (pure blocks), and in the third
block the two tasks were randomly interleaved (mixed block). In the mixed block, task-switch
trials were those following a trial of the alternative task, and task-repeat trials
were those following the same task. The order of blocks was identical for all participants.
The computer used a randomisation function to choose which task would occur on a given
trial. Further, it is important to note that participants had training in both tasks
before the blocks started that were used for data analysis; this means that even in
the first pure block of the analyzed data, participants were aware that incongruent
stimuli were associated with opposite responses in the alternative task.

In each trial, the frame and its labels (as displayed in Figure
2a) were visible throughout the blocks. When an imperative stimulus (one of the four
shown in Figure
2b) appeared (they were chosen at random by the software), participants had up to 4
seconds to respond. The imperative stimulus disappeared following a response or following
the 4 seconds if no response was given. Incorrect responses (or failures to respond)
were followed by a 5-second reminder of the stimulus-response mapping and
then by a 500 ms pause. The intertrial interval lasted 800 ms. A demonstration
of the task is available in the Additional file
1.

When we report response times in task switching trials or in pure blocks, we always
report the average of both tasks. For example, when reporting the response times in
the pure blocks, we will report the average of the pure block of the shape task and
pure block of the filling task.

Results

Response time analyses were based on response times in correct trials following at
least one other correct trial. Further, we excluded all participants whose performance
in at least one condition was not significantly better than chance level. This exclusion is
necessary, given that response time analyses in cognitive psychology are based on
the assumption that response times reflect decision time. When participants guess,
for example because they find the task difficult, the response times are no longer
informative of their decision time.

The procedure for testing if participants performed better than chance was carried
out as follows. Given that there were only two equally likely response alternatives
on each trial, participants had 50% chance to get a response correct. To determine
if a participant performed significantly better than chance level, we applied a binomial
test to the error rates in each condition. Based on this analysis, we concluded that
nine participants (5 men and 4 women, aged 18-36) did not perform better than chance
in at least one experimental condition. We found that each of these nine participants
worked at chance level in the incongruent task-switching condition (with error rates
ranging from 29% to 60%), and for five of them, this was the only condition they failed
in. None of these nine failed in the pure task blocks. We excluded these participants
from all reported analyses.
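A chance-level check of this kind can be sketched as a one-sided exact binomial test. The following Python sketch is an assumption about how such a test might be set up; the per-condition trial counts, the one-sided form, and the .05 threshold applied here are illustrative and are not reported details of our analysis.

```python
from math import comb


def binomial_p_better_than_chance(n_correct, n_trials, p_chance=0.5):
    """One-sided exact binomial p-value: P(X >= n_correct) under chance."""
    return sum(
        comb(n_trials, k) * p_chance**k * (1 - p_chance) ** (n_trials - k)
        for k in range(n_correct, n_trials + 1)
    )


# Hypothetical participant: 20 trials in one condition, 12 correct (40% errors).
p_value = binomial_p_better_than_chance(12, 20)
above_chance = p_value < 0.05
print(round(p_value, 3), above_chance)
```

With few trials in a condition, even moderate error rates cannot be distinguished from guessing, which is one way in which error rates well below 50% can still count as chance-level performance.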

The next set of analyses was carried out to confirm that the paradigm showed
the typical effects of task-switching and task-mixing paradigms as described in the
introduction (Figure 3). Throughout, we only report statistically significant effects (α criterion of .05).

Figure 3. The response times and error rates + 1 standard error of the mean in the pure, task-switching
and task-mixing conditions. Further, data is split up for congruent and incongruent stimuli, and for men and
women.

We repeated the same analysis on the error rates. Again, we found a significant effect
of switching, F(1,229) = 53.20, p < .001, with people making 1.97 ± 0.27 error percentage points (ppt) more in the
task-switch (4.62 ± 0.27%) than in the task-repeat (2.65 ± 0.18%) condition. Further,
people made 3.77 ± 0.31 ppt more errors in incongruent (5.52 ± 0.30%) than in congruent
(1.75 ± 0.18%) trials, F(1,229) = 143.90, p < .001. Finally, the interaction between switching and congruency was significant,
F(1,229) = 14.65, p < .001.

Next, we analyzed task-mixing costs using a similar approach as above. Now, we contrasted
trials in the pure blocks with task-repeat trials in the mixed block. We observed a slow
down of 319 ± 8 ms due to mixing, F(1,229) = 1555.34, p < .001, with an average response time in mixed trials of 763 ± 10 ms and in pure
trials of 444 ± 5 ms. This effect interacted significantly with the gender of participants.
The slow down due to mixing was 336 ± 11 ms in men and 302 ± 12 ms in women (the effect
size of this gender difference expressed as Cohen’s d = 0.27). We also found an effect of congruency, F(1,229) = 24.46, p < .001, with people responding 18 ± 4 ms slower in incongruent (613 ± 7 ms) than
congruent (594 ± 7 ms) trials. Finally, there was a significant interaction between
mixing and congruency, F(1,229) = 10.37, p = .001.
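For readers who wish to see how an effect size of this kind relates to the summary statistics, the following Python sketch recovers Cohen’s d from group means and standard errors. It assumes that the ± values are standard errors of the mean and that the analysed sample consisted of 115 men and 116 women (120 per gender minus the excluded participants); the function name is hypothetical.

```python
from math import sqrt


def cohens_d_from_sem(mean_1, sem_1, n_1, mean_2, sem_2, n_2):
    """Cohen's d using a pooled SD recovered from standard errors."""
    sd_1 = sem_1 * sqrt(n_1)  # SD = SEM * sqrt(n)
    sd_2 = sem_2 * sqrt(n_2)
    pooled_var = ((n_1 - 1) * sd_1**2 + (n_2 - 1) * sd_2**2) / (n_1 + n_2 - 2)
    return (mean_1 - mean_2) / sqrt(pooled_var)


# Mixing costs in ms (men first, then women); group sizes are an assumption
# based on 120 recruited per gender minus 5 excluded men and 4 excluded women.
d = cohens_d_from_sem(336, 11, 115, 302, 12, 116)
print(round(d, 2))
```

Under these assumptions the sketch yields a value of about 0.27, in line with the effect size reported above.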

We carried out the same analysis using error rate as dependent variable, and we found
a significant effect of task-mixing again. People made 0.55 ppt more errors in the
task mix condition (2.65 ± 0.18%) than in the pure condition (2.10 ± 0.13%), F(1,229) = 9.17, p = .003. People made 1.77 ± 0.20 ppt more mistakes in the incongruent (3.26 ± 0.19%)
than in the congruent (1.49 ± 0.13%) condition, F(1,229) = 80.86, p < .001. The factors switching and congruency interacted, F(1,229) = 26.94, p < .001. In the error rates, there were no effects of gender. Even so, it might be
of interest to report that women’s mixing cost in error rates was 0.50 ± 0.28 percentage
points and that of men 0.60 ± 0.23 percentage points.

Altogether, the ANOVAs of task-switching, task-mixing, and congruency confirmed the
well known picture of task-switching data. The novelty is the gender difference in
task-mixing costs. Although men and women did not show an overall speed difference,
we wanted to ensure that the gender difference was not simply related to overall speed
(e.g., people with larger switch costs might also have had a different baseline speed).
To do so, we analyzed relative mixing costs as well. Relative mixing costs is the
percentage slowing down in mixed compared to pure task blocks. For example, if a person
responds on average in 500 ms in mixing blocks and 400 ms in pure blocks the person
gets 25% slower due to mixing tasks.
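The worked example can be written as a one-line computation. The function name `relative_mixing_cost` is an illustrative assumption.

```python
def relative_mixing_cost(rt_mixed_ms, rt_pure_ms):
    """Percentage slowing in mixed blocks relative to pure blocks."""
    return 100.0 * (rt_mixed_ms - rt_pure_ms) / rt_pure_ms


# The example from the text: 500 ms in mixed blocks, 400 ms in pure blocks.
example = relative_mixing_cost(500, 400)
print(example)  # 25.0
```

Because the measure is a ratio computed per participant, the average of individual relative costs need not equal the relative cost computed from the group mean response times.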

We found that when analyzing the relative slow down due to mixing in relationship
to performance in pure blocks, there was a significant effect of gender. Women’s relative
slow down (69.1 ± 2.6%) was, in correspondence to the ANOVA of the absolute response
time, less than that of men (77.2 ± 2.6%), t(229) = 2.18, p = .030; in other words, both the analysis of absolute and relative mixing costs show
the same phenomenon.

Experiment 2

In Experiment 1, we found that men’s and women’s performance differed in a computer-based
task measuring the capacity to rapidly switch between different tasks. One of the
difficulties with computer-based laboratory tasks is their limited ecological validity.
Experiment 2 aimed to create a multi-tasking situation in a "real-life" context that
included standardized neurocognitive tests.

The approach of this experiment is based on tasks common in cognitive neuropsychology.
From a neuropsychological perspective, Burgess (Burgess et al.
2000) described multi-tasking as the ability to manage different tasks with different
(sometimes unpredictable) priorities that are initiated and monitored in parallel.
Furthermore, goals, time, and other task constraints are seen as self defined and
flexible. Shallice and Burgess (Shallice and Burgess
1991) devised the Six Elements Test to assess precisely these abilities (later modified
by others, Wilson et al.
1998). In this task, participants receive instructions to do three tasks (simple picture
naming, simple arithmetic and dictation), each of which has two sections, A and B.
The subject has 10 minutes to attempt at least part of each of the six sections, with
the proviso that they cannot do sections A and B of the same task after each other.

Burgess and colleagues (Burgess
2000; Burgess et al.
2000) have highlighted various features of multitasking behaviour, including: (1) several
discrete tasks to complete; (2) interleaving required for effective dovetailing of
task performance; (3) performing only one task at a particular time; (4) unforeseen
interruptions; (5) delayed intentions for the individual to return to a task which
is already running; (6) tasks that demand different task characteristics (7) self-determining
targets with which the individual decides for him/herself; and (8) no minute-by-minute
feedback on how well an individual performs. As Burgess and colleagues note, most
laboratory-based tasks do not include all of these features when assessing multi-tasking.
If this is indeed the case, there is a real advantage in studying multi-tasking using
this approach.

Methods

Participants

We recruited 47 male and 47 female participants, largely undergraduate students of
Hertfordshire University. The mean age was 24.2 years (SD = 8.1, range 18–60) for men, and 22.6 years (SD = 5.6, range 18–49) for women; there was no significant age difference between these
two groups, t(92) = 1.1, p = .28.

Research Ethics

Research was in accordance with the declaration of Helsinki, and approval of ethical
standards for Experiment 2 was given by the ethics committee of the School of Life
and Medical Sciences, University of Hertfordshire. All participants gave written or
verbal consent to participate.

Material

We used three different tasks. The "Key Search task" was taken from the Behavioral
Assessment for Dysexecutive Syndrome (BADS, Wilson et al.
1998). This is a specific test of planning and strategy, in which participants are required
to sketch out how they might route an attempt to search a "field" for a missing set
of keys. This task is normally used as a measure of problems in executive function,
and low scores are indicative of frontal lobe impairment. In the healthy population,
this task reveals no evidence of a gender difference according to test norms and personal
communication with Jon Evans (one of the test designers). The test designers reported
a high (r = .99) correlation between raters (Wilson et al.
1998).

The Map search task was taken from the "Tests of Everyday Attention" (Robertson et
al.
1994). The task requires individuals to find restaurant symbols on an unfamiliar color
map of Philadelphia (USA) and its surrounding areas. Again, this task reveals no evidence
of a gender difference according to the test norms and personal communication with
test designer Ian Robertson.

The third task was custom designed and involved solving simple arithmetical questions
presented on paper as shown in Figure 4. Unlike the first two tests, this test
is not standardised. We therefore piloted the mathematics questions and afterwards moderated them to make sure
they could be largely successfully attempted while doing the other tasks.

Although there are reports that men outperform women on more complex mathematics problems,
this is typically not the case for simple calculations like this (Halpern et al.
2007).

A scoring system established within the BADS marks the key search plans according to set rules
such as parallel patterns and corner entry. A panel of 3 scorers agreed on the scores
for each test to ensure reliable scoring. Examples of key search strategies are shown
in Figure
5.

Figure 5. Examples of the key search task. The example on the left is from a male participant, the example on the right from
a female participant.

Procedure

Each participant was given 8 minutes to attempt the three tasks described above (Arithmetic,
Map, Key Search). The layout of the position of the map task, maths task and key search
was counterbalanced to avoid any bias affecting which tasks participants chose to
do. They were instructed that each task held equal marks; it was left to participants
to decide how they would organize their time between each task. The participants were
also informed that they would receive a phone call at some unknown time point (always
after 4 minutes) asking them 8 simple general-knowledge questions (e.g., "What is
the capital of France"); it was again left to participants to decide whether or not
they answered the phone call. Whether or not they answered the phone call, they were
multi-tasking; answering the call just added to that multi-tasking "burden".
If they attempted to multi-task while answering the phone call, this was recorded.
We recorded time spent on each task as well as performance.

Results

We compared test scores (Table
1) and response times (Table
2) of men and women using t tests. We found that women (10.26 ± 0.58) scored significantly higher than men (8.13
± 0.68) on the key search task. Importantly, this finding cannot simply be explained
as a preference difference for the speed with which the task was carried out, as no
response time differences were found (Table
2).
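The relation between the key search summary statistics and a standardized effect size can be sketched as follows. The sketch assumes that the ± values are standard errors of the mean and uses the equal group sizes of 47; the function names are hypothetical, and the resulting t value is only an approximation derived from rounded summary statistics.

```python
from math import sqrt


def t_from_sem(mean_1, sem_1, mean_2, sem_2):
    """Two-sample t statistic from group means and standard errors."""
    return (mean_1 - mean_2) / sqrt(sem_1**2 + sem_2**2)


def d_from_t_equal_n(t_value, n_per_group):
    """Convert t to Cohen's d for two groups of equal size."""
    return t_value * sqrt(2.0 / n_per_group)


# Key search scores: women 10.26 ± 0.58, men 8.13 ± 0.68, 47 per group.
t_key = t_from_sem(10.26, 0.58, 8.13, 0.68)
d_key = d_from_t_equal_n(t_key, 47)
print(round(t_key, 2), round(d_key, 2))
```

Under these assumptions the effect size comes out at about 0.49, consistent with the Cohen’s d given in the abstract.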

No differences emerged in the numbers of men and women who answered the phone (79%
of men and 81% of women, χ2(1) = 0.06, p = .80). Those who answered the phone heard 8 simple general knowledge questions, and
the number of correct answers did not differ between men (3.35 ± 0.35) and women (3.84 ± 0.34), t(73) = 1.0, p = .32; nor did time spent on the phone differ between men (97.68 ± 3.13 seconds)
and women (106.87 ± 3.65 seconds), t(73) = 1.91, p = .06. For those who did answer the phone, we also recorded whether they actively
multi-tasked while on the phone or concentrated purely on the call; there was
no significant difference (73% of men and 84% of women multi-tasked, χ2(1) = 1.41, p = .24).
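The chi-square comparison of proportions can be illustrated with a minimal Python sketch for a 2 × 2 table. The counts below are not reported directly; they are inferred from the percentages and the group size of 47 (an assumption), and the function name is hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Counts inferred from the reported percentages (an assumption):
# 37 of 47 men answered the phone, of whom about 73% (27) multi-tasked;
# 38 of 47 women answered, of whom about 84% (32) multi-tasked.
chi2 = chi_square_2x2(27, 10, 32, 6)
print(round(chi2, 2))
```

With these inferred counts the statistic is about 1.41, matching the value reported for multi-tasking while on the phone.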

Discussion

Using two very different experimental paradigms, we found that women have an advantage
over men in specific aspects of multi-tasking situations. In Experiment 1, we measured
response speed of men and women carrying out two different tasks. We found that even
though men and women performed the individual tasks with the same speed and accuracy,
mixing the two tasks made men slow down more than women. From this, we conclude
that women have an advantage over men in multi-tasking (of about one third of a standard
deviation). In Experiment 2, we measured men and women’s multi-tasking performance
in a more ecologically valid setting. We found that women performed considerably better
in one of the tasks measuring high level cognitive control, in particular planning,
monitoring, and inhibition. In both experiments, the findings cannot be explained
as a gender difference in a speed-accuracy trade off. Altogether, we conclude that,
under certain conditions, women have an advantage over men in multi-tasking.

Relation to other work

As noted in the introduction, there is almost no empirical work addressing gender
differences in multi-tasking performance. For example, even though there are numerous
task-switching papers, none has focused on gender differences. In fact, most task-switching studies do not explore individual differences, and
accordingly are carried out with small samples.

Because they are typically carried out in psychology undergraduate programmes (with
fewer than 20% male students), there are few male participants. The novelty of our
study is not only the relatively large number of participants, but also the good gender
balance. Despite the few studies about gender differences in multi-tasking, there
has been an interesting discussion very recently about a study by Mäntylä (
2013) which received much attention. Probably the main reason for the attention in the
media for this study was the conclusion that men performed better than women in a multi-tasking paradigm. The finding of that study thus not only contrasts
with the widely held belief that women are better at task switching, it also contrasts
with our current data and the experiment by Ren and colleagues (
2009).

In the study by Mäntylä (
2013), men and women’s accuracy in a visual detection task was measured. Participants
had to detect specific numerical patterns in three different counters presented on
a computer screen. Simultaneously, participants had to carry out an N-back task (stimuli
appeared above the aforementioned counters). Men had a higher accuracy score of detecting
the correct numerical patterns than women. The latter study is of great interest,
because it addresses gender differences in multi-tasking of the second type, namely
when tasks need to be carried out simultaneously. Of interest is that for this specific
type of multi-tasking, men had an advantage over women, and the degree of the advantage
was directly related to men’s advantage in spatial skills. But as argued in the introduction,
this type of multi-tasking is potentially of less relevance to daily life contexts
in which people often carry out tasks sequentially. In a comment on the study by Mäntylä
(
2013), Strayer and colleagues (
2013) argue that gender is a poor predictor of multi-tasking. They present data to back
this up from their own work on multi-tasking when driving. Arguably, studies showing
no gender differences might simply have received less attention due to a publication
bias for positive effects. We think that Strayer et al.’s comments are valuable to
the discussion, although their findings seem to primarily apply to the concurrent
multi-tasking situations. That said, we found only one study that reported no gender
differences in a task-switching paradigm in which people switched between two tasks.
Buser and Peter (Buser and Peter
2012) had three groups of participants solving two different types of puzzles (sudoku
and word-search). The group that did the two puzzles without switching between them
solved the puzzles best, while switching between the puzzles while solving them impaired
performance. The degree of impairment was similar for men and women, irrespective
of whether the switching was voluntary or imposed. This situation is somewhat similar
to Experiment 2; gender differences in this type of task-switching therefore
need further study before strong conclusions can be drawn.

Finally, our finding that men and women did not differ in the effect of phone calls
might be linked to a study by Law and colleagues (2004). They stated that the effects of
interruptions are "quite subtle" and that more
research on their effect on multi-tasking is necessary.

Limitations

We would like to consider a number of limitations of our current study that have implications
for the interpretation of our results. First, as already mentioned above, there are
many different ways to test multi-tasking performance. Because this is an emerging
field with a small extant knowledge base, we cannot exclude the possibility that our
findings only hold true for the two specific paradigms we employed. Given the aforementioned
work by Mäntylä (2013) and others that did not find the effect, and the general sparsity of reports
on the effect, this is a possibility that must be seriously considered.

A second limitation is that we did not formally record levels of education or control
for general cognitive ability. Although we think it is not very likely, we appreciate
the comment of one of the reviewers that if there were different levels of education
this could potentially affect cognitive performance. The only way to exclude this
possibility is to formally record the highest level of education of all participants.

A third limitation is that the power of Experiment 2 may be low. This is
difficult to judge: the experiment was evidently powerful enough to detect a moderate difference
on the key search task, so the null results may be a task-related issue, and further work needs
to investigate task-based constraints in multi-tasking. For example, we did not conclude
that there was a gender difference in arithmetic performance or time spent on the
phone, but this could potentially be due to a lack of statistical power. In the case
of the arithmetic task, there are good reasons not to expect a gender difference on
simple arithmetic problems, even though we acknowledge the complexity of the study
of gender differences in mathematical ability (cf. Halpern et al. 2007).
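
To make the power concern concrete, the following is a minimal sketch of an approximate power calculation for a comparison of two independent groups of 47 participants each, as in Experiment 2. It is not the analysis used in this study. The function name `approx_power`, the two-sided alpha of 0.05, the equal group sizes, and the normal approximation to the t-test are all assumptions made for illustration. The effect sizes in the loop are Cohen's conventional small, medium, and large benchmarks.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(d: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided comparison of two independent groups.

    Hypothetical helper for illustration. It uses the normal approximation
    to the two-sample t-test, so values are slightly optimistic for small
    samples.
    """
    z = NormalDist()  # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha / 2)  # two-sided critical value
    ncp = d * sqrt(n_per_group / 2)  # expected test statistic under effect d
    # Probability of exceeding the critical value in either tail.
    return z.cdf(ncp - z_crit) + z.cdf(-ncp - z_crit)


if __name__ == "__main__":
    # Assumed design: 47 women and 47 men, two-sided alpha of 0.05.
    for d in (0.2, 0.5, 0.8):
        print(f"d = {d}: approximate power = {approx_power(d, 47):.2f}")
```

Under these assumptions, power for a small effect falls well below the conventional 0.8 target, which is consistent with the caution expressed above about interpreting the non-significant comparisons.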

A final limitation is that, although we confirmed with both the test authors and
the published norms that no gender differences are known for the Key Search task, we cannot
eliminate the possibility that a difference would have emerged had the task been tested alone. We could
have retested the individual tasks with another sample of participants. Alternatively, we could
have run a repeated measures design (same participants on the individual tasks), although
this would defeat the novelty aspect of the task. The best way to address this issue
is for another research group to replicate the finding.

Conclusions

Our findings support the notion that women are better than men in some types of multi-tasking
(namely when the tasks involved do not need to be carried out simultaneously). More
research on this question is urgently needed before we can draw stronger conclusions
and before we can differentiate between different explanations.

Endnotes

a The two experiments were carried out by independent groups of researchers. We only
realised the similarity between the two experiments and their findings afterwards.
We believe that the two experiments complement each other: whereas Experiment 1 uses
a laboratory-based reaction time experiment, Experiment 2 uses a much more ecologically
valid approach.

b This is likely because of the availability of computers to measure response times.
In the 1920s, it would have been hard, if not impossible, to accurately measure task-switching
costs, while measuring mixing costs could be done with the paper-and-pencil tests
used by Jersild (1927).

c Throughout the results section, we report means ±1 standard error of the mean.

d To the best of our knowledge.

Competing interests

The authors declare that they had no competing interests.

Authors’ contributions

GS, DO, and MC carried out Experiment 1. KL carried out Experiment 2. The four authors
wrote the article together. All authors read and approved the final manuscript.

Acknowledgements

Experiment 1 was made possible with a grant from the British Academy to Stoet, O’Connor,
and Conner and with the assistance of Weili Dai, Caroline Allen, and Tansi Warrilow.

Royall DR, Lauterbach EC, Cummings JL, Reeve A, Rummans TA, Kaufer DI, LaFrance WC, Coffey CE: Executive control function: a review of its promise and challenges for clinical research. A report from the Committee on Research of the American Neuropsychiatric Association.

Sayer LC: Gender differences in the relationship between long employment hours and multitasking. In Workplace Temporalities (Research in the Sociology of Work). Edited by Rubin BA. Amsterdam: Elsevier; 2007:403-435.