Abstract

How the human brain controls hand movements to carry out different tasks is still debated. The concept of synergy has been proposed to indicate functional modules that may simplify the control of hand postures by simultaneously recruiting sets of muscles and joints. However, whether and to what extent synergic hand postures are encoded as such at a cortical level remains unknown. Here, we combined kinematic, electromyography, and brain activity measures obtained by functional magnetic resonance imaging while subjects performed a variety of movements towards virtual objects. Hand postural information, encoded through kinematic synergies, was represented in cortical areas devoted to hand motor control and successfully discriminated individual grasping movements, significantly outperforming alternative somatotopic or muscle-based models. Importantly, hand postural synergies were predicted by neural activation patterns within primary motor cortex. These findings support a novel cortical organization for hand movement control and open potential applications for brain-computer interfaces and neuroprostheses.

eLife digest

The human hand can perform an enormous range of movements with great dexterity. Some common everyday actions, such as grasping a coffee cup, involve the coordinated movement of all four fingers and thumb. Others, such as typing, rely on the ability of individual fingers to move relatively independently of one another.

This flexibility is possible in part because of the complex anatomy of the hand, with its 27 bones and their connecting joints and muscles. But with this complexity comes a huge number of possibilities. Any movement-related task – such as picking up a cup – can be achieved via many different combinations of muscle contractions and joint positions. So how does the brain decide which muscles and joints to use?

One theory is that the brain simplifies this problem by encoding particularly useful patterns of joint movements as distinct units or “synergies”. A given task can then be performed by selecting from a small number of synergies, avoiding the need to choose between huge numbers of options every time movement is required.

Leo et al. now provide the first direct evidence for the encoding of synergies by the human brain. Volunteers lying inside a brain scanner reached towards virtual objects – from tennis rackets to toothpicks – while activity was recorded from the area of the brain that controls hand movements. As predicted, the scans showed specific and reproducible patterns of activity. Analysing these patterns revealed that each corresponded to a particular combination of joint positions. These activity patterns, or synergies, could even be ‘decoded’ to work out which type of movement a volunteer had just performed.

Future experiments should examine how the brain combines synergies with sensory feedback to allow movements to be adjusted as they occur. Such findings could help to develop brain-computer interfaces and systems for controlling the movement of artificial limbs.

Introduction

Unique among primates, the human hand is capable of performing a strikingly wide range of movements, characterized by a high degree of adaptability and dexterity that enables complex interactions with the environment. This is exemplified by the hand’s ability to mold to objects and tools by combining motion and force in the individual digits so as to reach a variety of hand postures. The multiple ways in which the hand can perform a given goal-directed movement arise from anatomical, functional, and kinematic redundancies, i.e., a large number of degrees of freedom (DoFs) (Bernstein, 1967). Such an organization is highly advantageous from an operational perspective, as redundant DoFs enable the hand to flexibly adapt to different task demands, or to switch among multiple postural configurations, while maintaining grasp stability (Bernstein, 1967; Santello et al., 2013). At the same time, this organization raises the question of how the central nervous system deals with these redundancies and selects a set of DoFs to accomplish a specific motor task (Latash et al., 2007). While some models propose the notion of “freezing” of redundant DoFs (Vereijken et al., 1992) or the implementation of optimization strategies (Flash and Hogan, 1985; Todorov and Jordan, 2002; Todorov, 2004), further studies have favored an alternative solution based on linear dimensionality reduction strategies or motor synergies (Latash, 2010).

The first quantitative description of kinematic hand synergies was obtained by analyzing hand postures used by subjects for grasping imagined objects that varied in size and shape (Santello et al., 1998). Three hand postural synergies were identified through a principal component analysis (PCA) that accounted for a high fraction (>84%) of variance in the kinematic data across all hand postures and characterized hand configurations as linear combinations of finger joints (Santello et al., 1998). Notably, other studies achieved similar results using kinematic data acquired during grasping of real, recalled and virtual objects (Santello et al., 2002), exploratory procedures (Thakur et al., 2008), or during different movements, such as typing (Soechting and Flanders, 1997), as well as with EMG signals from finger muscles during hand shaping for grasping or finger spelling (Weiss and Flanders, 2004).

Given that final hand postures can be described effectively as the linear combination of a small number of synergies, each one controlling a set of muscles and joints, the question arises whether kinematic or muscular hand synergies merely reflect a behavioral observation, or whether instead a synergy-based framework is grounded in the human brain as a code for the coordination of hand movements. According to the latter hypothesis, motor cortical areas and/or spinal modules may control the large number of DoFs of the hand through weighted combinations of synergies (Gentner and Classen, 2006; Santello et al., 2013; Santello and Lang, 2014), in a way similar to that demonstrated for other motor acts, such as gait, body posture, and arm movements (Cheung et al., 2009). Furthermore, the biomechanical constraints of the hand structure that group several joints in nature (e.g., multi-digit and multi-joint extrinsic finger muscles whose activity would generate coupled motion), are compatible with the synergistic control of hand movements.

Previous brain functional studies in humans are suggestive of a synergistic control of hand movements. For instance, in a functional magnetic resonance imaging (fMRI) study, synergistic/dexterous and non-synergistic hand movements elicited different neural responses in the premotor and parietal network that controls hand posture (Ehrsson et al., 2002). Equally, transcranial magnetic stimulation (TMS) induced hand movements encompassed within distinct postural synergies (Gentner and Classen, 2006). Despite all the above pieces of information, however, whether and to what extent the representation of hand movements is encoded at a cortical level in the human brain directly as postural synergies still remains an open question.

Alternative solutions to synergies for hand control have been proposed as well. Above all, classic somatotopic theories postulated that distinct clusters of neuronal populations are associated with specific hand muscles, fingers, or finger movements (Penfield and Boldrey, 1937; Penfield and Rasmussen, 1950; Woolsey et al., 1952). However, whereas a coarse arrangement of body regions (e.g., hand, mouth, or face) has been shown within primary motor areas, the intrinsic topographic organization within limb-specific clusters remains controversial. In the hand motor area, neurons controlling single fingers are organized in distributed, overlapping cortical patches without any detectable segregation (Penfield and Boldrey, 1937; Schieber, 1991; Schieber, 2001). In addition, it has been recently shown that fMRI neural activation patterns for individual digits in sensorimotor cortex are not somatotopically organized and their spatial arrangement is highly variable, while their representational structure (i.e., the pattern of distances between digit-specific activations) is invariant across individuals (Ejaz et al., 2015).

The present study was designed to determine whether and to what extent synergistic information for hand postural control is encoded as such at a neural level in the human brain cortical regions.

An identical experimental paradigm was performed in two distinct sessions to acquire kinematic and electromyographic (EMG) data while participants performed grasp-to-use movements towards virtual objects. Kinematic data were analyzed according to a kinematic synergy model and an individual-digit model, based on the independent representation of each digit (Kirsch et al., 2014), while EMG data were analyzed according to a muscle synergy model to obtain independent descriptions of each final hand posture. In a separate fMRI session, brain activity was measured in the same participants during an identical motor task.

Hence, encoding techniques (Mitchell et al., 2008) were applied to brain functional data to compare the synergy-based model with the alternative somatotopic and muscular models on the basis of their abilities to predict neural responses. Finally, to assess the specificity of the findings, we applied a decoding procedure to the fMRI data to predict hand postures based on patterns of fMRI activity.

Results

Motion capture and EMG sessions: discrimination accuracy of different models on behavioral data

The hand kinematic data, acquired from the motion capture experiment, provided a kinematic synergy description, created using PCA on digit joint angles, and an individual digit description, i.e., a somatotopic model based on the displacements of single digits, calculated as the average displacement of their joint angles. The EMG data provided a muscle synergy description. To obtain comparable descriptions of hand posture, three five-dimensional models were chosen. A validation procedure based on a rank-accuracy measure was performed to assess the extent to which static hand postures could be reliably discriminated by each behavioral model, regardless of its fraction of variance accounted for. All three models were able to significantly distinguish between individual hand postures (average accuracy ± standard deviation, SD; chance level: 50%; kinematic synergy: 91.1 ± 3.6%; individual digit: 85.9 ± 5%; muscle synergy: 72 ± 7.7%) (Supplementary file 1A). Specifically, the kinematic synergy model performed significantly better than both the individual digit and muscle synergy models, while the individual digit model was significantly more informative than the muscle synergy model (Wilcoxon signed-rank test, p<0.05, Bonferroni-Holm corrected).
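As a minimal sketch of how a five-dimensional kinematic synergy description can be derived with PCA, consider a postures-by-joint-angles matrix. The 20 × 24 shape follows the numbers of postures and joint angles reported in this paper, but the data below are random placeholders, and `kinematic_synergies` is a hypothetical helper rather than the analysis code used in the study.

```python
import numpy as np

def kinematic_synergies(joint_angles, n_components=5):
    """PCA via SVD on a (postures x joints) matrix.

    Returns (scores, loadings, explained_ratio):
      scores          -- (postures x n_components) synergy weights per posture
      loadings        -- (n_components x joints) joint pattern of each synergy
      explained_ratio -- fraction of total variance carried by each component
    """
    X = np.asarray(joint_angles, dtype=float)
    Xc = X - X.mean(axis=0)                  # center each joint angle
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    variance = s ** 2
    explained_ratio = variance / variance.sum()
    scores = U[:, :n_components] * s[:n_components]
    loadings = Vt[:n_components]
    return scores, loadings, explained_ratio[:n_components]

# Placeholder data only: 20 postures x 24 joint angles of random numbers.
rng = np.random.default_rng(0)
postures = rng.normal(size=(20, 24))
scores, loadings, ratio = kinematic_synergies(postures)
```

Each row of `scores` is then a five-number description of one posture, which is the kind of low-dimensional model that can be compared against the individual digit and muscle synergy descriptions.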

fMRI session: discrimination accuracy of different models in single-subject encoding of hand posture

Three independent encoding procedures (Mitchell et al., 2008) were performed on the fMRI data to assess to what extent each model (kinematic synergy, individual digit or muscle synergy) would predict brain activity. The discrimination accuracy was tested for significance against unique null distributions of accuracies for each participant and model obtained through permutation tests.
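The following sketch illustrates one common form of encoding analysis in the spirit of Mitchell et al. (2008): a linear model maps the behavioral description onto voxel responses, two postures are held out, and the prediction counts as correct when the matched pairing of predicted and observed patterns correlates better than the swapped pairing (hence a 50% chance level). The leave-two-out scheme, the use of Pearson correlation, the function names, and the synthetic data are all assumptions made for illustration; the exact procedure is described in the Materials and methods.

```python
import itertools
import numpy as np

def _corr(a, b):
    """Pearson correlation between two 1-D patterns."""
    return np.corrcoef(a, b)[0, 1]

def pairwise_encoding_accuracy(features, voxels):
    """Leave-two-out encoding accuracy (fraction of correctly matched pairs).

    features -- (items x dims) model description, e.g. synergy weights
    voxels   -- (items x voxels) response pattern for each item
    """
    X = np.column_stack([np.asarray(features, float), np.ones(len(features))])
    Y = np.asarray(voxels, float)
    n = X.shape[0]
    pairs = list(itertools.combinations(range(n), 2))
    correct = 0
    for i, j in pairs:
        train = [k for k in range(n) if k not in (i, j)]
        # Least-squares weights from model dimensions (plus intercept) to voxels.
        W, *_ = np.linalg.lstsq(X[train], Y[train], rcond=None)
        pred_i, pred_j = X[i] @ W, X[j] @ W
        matched = _corr(pred_i, Y[i]) + _corr(pred_j, Y[j])
        swapped = _corr(pred_i, Y[j]) + _corr(pred_j, Y[i])
        correct += int(matched > swapped)
    return correct / len(pairs)

# Synthetic example: voxel responses are a noisy linear function of the model.
rng = np.random.default_rng(0)
features = rng.normal(size=(20, 5))
true_weights = rng.normal(size=(5, 50))
voxels = features @ true_weights + 0.1 * rng.normal(size=(20, 50))
accuracy = pairwise_encoding_accuracy(features, voxels)
```

A permutation-based null distribution could be obtained by recomputing the accuracy after shuffling the rows of `features`, which is one way to test single-subject significance.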

Overall, the encoding procedure based on the kinematic synergy model was highly successful across all participants (average accuracy ± SD: 71.58 ± 5.52%) and always significantly above chance level (see Supplementary file 1B for single subject results). The encoding of the individual digit model was successful in only five of nine participants (63.89 ± 6.86%). Finally, the muscle synergy model successfully predicted brain activity in six out of eight participants, with an average accuracy that was comparable to the individual digit model (63.9 ± 6.5%).

The kinematic synergy model outperformed both the individual digit and the muscle synergy models (Wilcoxon signed-rank test, p<0.05, Bonferroni-Holm corrected), whereas no significant difference was found between the individual digit and muscle synergy models (p=0.95).
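For readers unfamiliar with the Bonferroni-Holm step-down correction applied to these pairwise comparisons, a minimal sketch follows. The p-values in the example are illustrative placeholders, not values from this study, and `holm_adjust` is a hypothetical helper; the signed-rank test itself is not shown.

```python
def holm_adjust(p_values):
    """Return Holm-adjusted p-values in the original order.

    The smallest p-value is multiplied by m, the next by m - 1, and so on;
    a running maximum keeps the adjusted values monotone, and each value is
    capped at 1.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda k: p_values[k])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted

# Illustrative placeholder p-values for three pairwise model comparisons.
adjusted = holm_adjust([0.01, 0.04, 0.03])  # approximately [0.03, 0.06, 0.06]
```

A comparison is declared significant at the 0.05 level when its adjusted p-value is below 0.05.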

To obtain a measure of the overall fit between neural responses and behavioral performance, we computed the R² coefficient between the fMRI data and each behavioral model across voxels, subjects, and acquisition modalities. The group averages were 0.41 ± 0.06 for the kinematic synergies, 0.37 ± 0.03 for the individual digits, and 0.37 ± 0.06 for the muscle synergies. Therefore, 40.8% of the variance in the BOLD signal was accounted for by the kinematic synergies, whereas the two other behavioral models explained a relatively smaller fraction of the total variance.

Functional neuroanatomy of kinematic hand synergies

The group analysis was performed only on the encoding results obtained from kinematic synergies, as this was the most successful model and the only one that performed above chance level across participants. The single-subject encoding results maps – containing only the voxels recruited during the procedure – were merged, with a threshold of p>0.33 to retain consistently informative voxels, overlapping in at least four participants.
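The merging step can be sketched as follows, assuming each participant contributes a binary mask of the voxels recruited by the encoding procedure, all in a common space. With nine participants, requiring at least four overlapping subjects corresponds to a proportion of 4/9 ≈ 0.44, which exceeds the 0.33 threshold. The function name and the toy masks are hypothetical.

```python
import numpy as np

def overlap_map(subject_masks, min_subjects=4):
    """Combine binary single-subject maps into a group probability map.

    subject_masks -- (subjects x voxels) boolean array
    Returns (probability, keep): the proportion of subjects recruiting each
    voxel, and a mask of voxels recruited by at least `min_subjects` subjects.
    """
    masks = np.asarray(subject_masks, dtype=bool)
    counts = masks.sum(axis=0)
    probability = counts / masks.shape[0]
    keep = counts >= min_subjects
    return probability, keep

# Toy example: 9 subjects, 4 voxels recruited by 9, 4, 3 and 0 subjects.
masks = np.zeros((9, 4), dtype=bool)
masks[:, 0] = True
masks[:4, 1] = True
masks[:3, 2] = True
probability, keep = overlap_map(masks)
```

Counting subjects directly avoids any ambiguity about whether three of nine participants (a proportion of 0.333...) would pass a threshold written as 0.33.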

This probability map shows the voxels that were consistently engaged by the encoding procedure across subjects, i.e., those voxels whose activity was predictable on the basis of the kinematic synergies.

A hand-posture-related network comprising the left primary and supplementary motor areas, the superior parietal lobe and the anterior part of intraparietal sulcus (bilaterally) was recruited with high overlap across subjects. Although additional regions (i.e., Brodmann Area 6) emerged from the encoding analyses, they are not evident in the map due to their deep location.

Behavioral and neurofunctional stability of kinematic synergies and synergy-topic mapping

Since postural synergies were obtained in each subject independently, a procedure to assess the stability of the principal components (PCs) across participants was performed (see Materials and methods section). For visualization purposes, we focused on the first three PCs, which could explain more than 80% of the variance across the entire hand kinematic dataset, and were also highly consistent across participants (Video 1).

This video shows the meaning of the kinematic synergies measured in this study, by presenting three movements from the minimum to the maximum values of kinematic synergies 1, 2, and 3, respectively, expressed as sets of twenty-four joint angles averaged across subjects.

It can be observed that the first synergy modulates abduction-adduction and flexion-extension of both the proximal and distal finger joints, while the second synergy reflects thumb opposition and flexion-extension of the distal joints only. Maximizing the first synergy leads, therefore, to a posture resembling a power grasp, while the second one is linked to pinch movements directed towards smaller objects, and the third one represents movements of flexion and thumb opposition (like in grasping a dish or a platter) (Santello et al., 1998; Gentner and Classen, 2006; Ingram et al., 2008; Thakur et al., 2008).

In accordance with the aforementioned results, the first three kinematic PCs were mapped onto a flattened mesh of the cortical surface. This map displayed the fitting of each synergy within the voxels that were recruited by the encoding procedure across participants. Figure 2 shows that the group kinematic synergies are represented in the precentral and postcentral cortex in distinct clusters that are arranged in a topographical continuum with smooth supero-inferior transitions. The procedure developed to assess the topographical arrangement of synergies (see Materials and methods) yielded a statistically significant result (C=0.19; p=0.038), indicating that anatomically close voxels exhibited similar synergy coefficients (see Figure 2—figure supplement 1).

Representational spaces, drawn separately for the three models and fMRI data (using the activity from a region consistently activated across all the grasping movements), were compared at a single subject and group level to assess the similarity between each behavioral model and the neural content represented at a cortical level. All group correlations, both between fMRI and behavioral data and between behavioral models, were highly significant (p<0.0001) (for details see Supplementary file 1D,E and Figure 3—figure supplement 1). Moreover, an MDS procedure was performed to represent data from kinematic synergies and fMRI BOLD activity. Figure 3 shows the high similarity between these two spaces.
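A representational space comparison of this kind can be sketched as below: each description (behavioral model or fMRI patterns) is turned into a posture-by-posture dissimilarity matrix, and two such matrices are compared by correlating their upper triangles. The choice of correlation distance and of Pearson correlation for the comparison are assumptions for illustration, as are the function names and the synthetic data.

```python
import numpy as np

def rdm(patterns):
    """Representational dissimilarity matrix: 1 - Pearson r between item rows."""
    return 1.0 - np.corrcoef(np.asarray(patterns, dtype=float))

def rdm_similarity(rdm_a, rdm_b):
    """Pearson correlation between the upper triangles of two RDMs."""
    iu = np.triu_indices_from(rdm_a, k=1)   # skip the zero diagonal
    return np.corrcoef(rdm_a[iu], rdm_b[iu])[0, 1]

# Synthetic example: 20 postures described by 5 synergy weights, and fMRI
# patterns generated as a noisy linear mixture of the same weights.
rng = np.random.default_rng(0)
synergy_scores = rng.normal(size=(20, 5))
fmri_patterns = synergy_scores @ rng.normal(size=(5, 200))
fmri_patterns += 0.5 * rng.normal(size=fmri_patterns.shape)

model_rdm = rdm(synergy_scores)
fmri_rdm = rdm(fmri_patterns)
similarity = rdm_similarity(model_rdm, fmri_rdm)
```

The same dissimilarity matrices can be passed to a multidimensional scaling routine to obtain two-dimensional layouts like those compared in Figure 3.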

With the exception of a few postures (e.g., dinner plate, frisbee and espresso cup) that were misplaced in the fMRI data with respect to the kinematic synergies representation, the other object-related postures largely preserved their relative distances.

To confirm the presence of a neural representation of hand synergies at a cortical level and that this information can be used to specifically control hand postures based on brain activity, we applied decoding methods as complementary approaches to encoding analyses (Naselaris et al., 2011). Hand posture (expressed as a matrix of 24 joint angles by 20 hand postures) was therefore predicted with a multiple linear regression procedure from fMRI data. Specifically, this procedure could reliably reconstruct the different hand postures across participants. The goodness-of-fit (R²) between the original and reconstructed joint angle patterns related to single movements, averaged across subjects, ranged between 0.51 and 0.90 (Supplementary file 1F). Three hand plots displaying original and reconstructed postures from a representative subject are shown in Figure 4. Notably, this decoding attempt reveals that brain activity elicited by our task can effectively be used to reconstruct the postural configuration of the hand. Moreover, the rank accuracy procedure specifically designed to test the extent to which each decoded posture could be discriminated from the original ones yielded significant results in six of nine participants (Supplementary file 1G).
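A decoding step of this type can be sketched as a leave-one-posture-out multiple linear regression from voxel patterns to joint angles, followed by a per-posture goodness-of-fit. Several choices below are assumptions made for illustration only: the cross-validation scheme, the small number of voxels (so that the regression is well determined), the use of squared Pearson correlation as the R² measure, the function name, and the synthetic data.

```python
import numpy as np

def decode_postures(voxels, joint_angles):
    """Leave-one-out linear decoding of joint angles from voxel patterns.

    voxels       -- (postures x voxels) activity patterns
    joint_angles -- (postures x joints) recorded hand postures
    Returns (reconstructed, r2), where r2[i] is the squared correlation
    between the recorded and reconstructed joint angles of posture i.
    """
    V = np.column_stack([np.asarray(voxels, float), np.ones(len(voxels))])
    Y = np.asarray(joint_angles, float)
    n = len(Y)
    reconstructed = np.empty_like(Y)
    for i in range(n):
        train = np.arange(n) != i            # hold out posture i
        W, *_ = np.linalg.lstsq(V[train], Y[train], rcond=None)
        reconstructed[i] = V[i] @ W
    r2 = np.array([np.corrcoef(Y[i], reconstructed[i])[0, 1] ** 2
                   for i in range(n)])
    return reconstructed, r2

# Synthetic example: joint angles and voxel patterns share 5 latent synergies.
rng = np.random.default_rng(0)
latent = rng.normal(size=(20, 5))
angles = latent @ rng.normal(size=(5, 24))
patterns = latent @ rng.normal(size=(5, 10)) + 0.01 * rng.normal(size=(20, 10))
reconstructed, r2 = decode_postures(patterns, angles)
```

With real data the number of voxels typically exceeds the number of postures, so some form of dimensionality reduction or regularization would be needed before a regression like this one.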

This picture represents the postures obtained from the fMRI data and those originally recorded through optical tracking.

The figure shows three pairs of hand plots corresponding to three postures from a representative subject, and the goodness-of-fit between the original and decoded sets of joint angles. In these plots, the two wrist angles are not rendered.

The possible role of visual object presentation: control analyses

Since motor and premotor regions supposedly contain neuronal populations that respond to visual stimuli (Kwan et al., 1985; Castiello, 2005; Klaes et al., 2015), one may argue that the visual presentation of objects in the current experiment contributes to the synergy-based encoding of BOLD activity in those regions. To exclude this possibility, an encoding procedure using the kinematic synergy model was performed within the region of interest (ROI) chosen for RSA and posture reconstruction, using exclusively the neural activity related to visual object presentation, measured five seconds after the stimulus onset. The procedure was unsuccessful in all participants, thus indicating that the kinematic synergy information in motor and premotor regions was purely related to motor activity (Supplementary file 1H).

The encoding maps of kinematic synergies never included visual areas. Nonetheless, visual areas are likely to participate in the early stages of action preparation (Gutteling et al., 2015), and motor imagery might have played a role during the task in the fMRI session. For this reason, we first defined a ROI by contrasting visual-related activity after stimulus presentation with rest (q<0.01, FDR corrected), so as to isolate regions of striate and extrastriate cortex within the occipital lobe. Subsequently, an encoding analysis was performed similarly to the above-mentioned procedures. The results were at the chance level in seven out of nine participants (see Supplementary file 1I), suggesting that visual imagery processes in the occipital cortex did not retain kinematic synergy information.

Discussion

Scientists have debated for a long time how the human hand can attain the variety of postural configurations required to perform all the complex tasks that we encounter in activities of daily living. The concept of synergy has been proposed to denote functional modules that may simplify the control of hand postures by simultaneously recruiting sets of muscles and joints. In the present study, by combining kinematic, EMG, and brain activity measures using fMRI, we provide the first demonstration that hand postural information encoded through kinematic synergies is represented within the cortical network controlling hand movements. Importantly, we demonstrate that kinematic synergies strongly correlate with the neural responses in primary and supplementary motor areas, as well as movement-related parietal and premotor regions. Furthermore, we show that kinematic synergies are topographically arranged in the precentral and postcentral cortex and represent meaningful primitives of grasping. Finally, the neural responses in sensorimotor cortex allow for a highly successful decoding of complex hand postures. Therefore, we conclude that the human motor cortical areas are likely to represent hand posture by combining a few elementary modules.

Validation of behavioral data was performed as the first stage of analysis to assess the information content and the discriminability of the postures from the kinematic or EMG data. This procedure showed that each posture could be successfully classified above chance level by kinematic synergy, individual digit, and muscle synergy models.

In addition, the encoding procedures on fMRI-based neural responses show that kinematic synergies are the best predictor of brain activity, with a significantly higher discrimination accuracy across participants, indicating that kinematic synergies are represented at a cortical level. Even if previous studies suggest that the brain might encode grasp movements as combinations of synergies in the monkey (Overduin et al., 2012), or indirectly in humans (Gentner and Classen, 2006; Gentner et al., 2010), to the best of our knowledge, no direct evidence has been presented to date for a functional validation and characterization of neural correlates of synergy-based models.

The results from RSA suggest that the three models used to predict brain activity may have similar, correlated spaces. However, each model provides a unique combination of weights for each posture across different dimensions (e.g., synergies or digits), thus resulting in distinct descriptions of the same hand postures. It should be noted that both the individual digit model and the muscle synergy model failed to predict brain activity in four and two participants, respectively. Thus, while they discriminated hand postures at a behavioral level, these models are clearly less efficient than the kinematic synergy model in predicting neural activity.

Finally, the descriptive procedures (RSA and MDS) were performed to assess the differences between the fMRI representational space and the single-model spaces. The results indicate a high similarity between fMRI and kinematic synergies, as reflected in the largely overlapping representations obtained from kinematic data and fMRI as depicted in Figure 3.

A recent study employed descriptive procedures (i.e., RSA) to demonstrate that similar movement patterns of individual fingers are reflected in highly correlated patterns of brain responses that, in turn, are more correlated with kinematic joint velocities than with muscle activity, as recorded through high-density EMG (Ejaz et al., 2015). Our paper introduces a methodological and conceptual advancement. While, in Ejaz et al., full matrices of postural, functional or muscle data have been considered in the RSA, here we focused on descriptions with lower dimensionality, which lose only minor portions of information. Consequently, by showing that brain activity in motor regions can be expressed as a function of a few meaningful motor primitives that group together multiple joints, rather than as combinations of individual digit positions, our results suggest that a modular organization represents the basis of hand posture control.

The functional neuroanatomy of kinematic synergies is embedded in motor cortical areas

The group probability maps of our study indicate that the regions consistently modulated by kinematic synergies, which include bilateral precentral, SMA and supramarginal area, ventral premotor, left inferior parietal and postcentral cortex, overlap with a network strongly associated with the control of hand posture (Castiello, 2005).

Specifically, we show that the combination of five synergies, expressed as PCs of hand joint angles, predicts neural activity of M1 and SMA, key areas for motor control. While previous studies in humans showed differential activations in M1 and SMA for power and precision grip tasks (Ehrsson et al., 2000) and for different complex movements (Bleichner et al., 2014), to date no brain imaging studies directly associated these regions with synergy-based hand control.

Beyond primary motor areas, regions within parietal cortex are involved in the control of motor acts (Grafton et al., 1996). Inferior parietal and postcentral areas are engaged in higher-level processing during object interaction (Culham et al., 2003). Since grasping, as opposed to reaching movements, requires integration of motor information with inputs related to the target object, these regions may integrate the sensorimotor features needed to preshape the hand correctly (Grefkes et al., 2002; Culham et al., 2003). Consistently, different tool-directed movements were decoded from brain activity in the intraparietal sulcus (Gallivan et al., 2013), and it has been reported that this region is sensitive to differences between precision and power grasps (Ehrsson et al., 2000; Gallivan et al., 2011). The current motor task, even if performed with the dominant right hand only, also recruited motor regions of the right hemisphere. Specifically, bilateral activations of SMA were often described during motor tasks (Ehrsson et al., 2001; Ehrsson et al., 2002) and a recent meta-analysis indicated a consistent recruitment of SMA in grasp type comparisons (King et al., 2014). Equally, a bilateral, but left dominant, involvement of intraparietal cortex for grasping has been reported (Culham et al., 2003).

Moreover, some authors have hypothesized recently that action recognition and mirror mechanisms may rely on the extraction of reduced representations of gestures, rather than on the observation of individual motor acts (D'Ausilio et al., 2015). The specific modulation of neural activity by kinematic synergies within the action recognition network seems in agreement with this proposition.

Finally, the width of our probability maps, measured on the cortical mesh, was ca. 1cm, which corresponds to the hand area, as defined by techniques with better spatial resolution, including ultra-high field fMRI or electrocorticography (ECoG) (Siero et al., 2014).

To exclude the possibility that the results from the encoding analysis are driven by differences between classes of acts, i.e., precision or power grasps, rather than reflecting the modulation of brain activity by kinematic synergies, the similarity between the 20 hand postures was evaluated in a pairwise manner. Specifically, the accuracy of the encoding model was estimated for each pair of distinct movements, unveiling the extent to which individual hand postures could be discriminated from each other based on their associated fMRI activity. In the resulting heat map (Figure 5), two clusters can be identified: one composed mainly of precision grasps directed towards small objects, and a second one composed mainly of power grasps towards heavy tools. The remaining postures did not cluster, forming instead a non-homogeneous group of grasps towards objects that could be either small (e.g., espresso cup) or large (e.g., jar lid, PC mouse).

Discrimination accuracies for single postures as represented by kinematic synergies.

Two clusters of similar postures are easily identifiable (i.e., precision grip and power grasps). However, other postures were recognized without showing an evident clustering, suggesting that the encoding procedure was not biased by a coarse discrimination of motor acts.

These results indicate that goal-directed hand movements are represented in the brain in a way that goes beyond the standard distinction between precision and power grasps (Napier, 1956; Ehrsson et al., 2000). Other authors have proposed a possible 'grasp taxonomy' in which multiple, different types of grasps are described according to hierarchical criteria rooted in three main classes: precision, power and intermediate (Feix et al., 2009). By combining these three elementary grasps, it is possible to generate a large number of postures. Notwithstanding the advancements of these taxonomies in describing hand posture, much less effort has been made to understand how the wide variety of human hand postures can be represented in the brain. Our results indicate that a synergy framework may predict brain activity patterns underlying the control of hand posture. Of note, the highest-ranked kinematic synergies can be clearly identified as grasping primitives: the first synergy modulates abduction-adduction and flexion-extension of both the proximal and distal finger joints, while the second synergy reflects thumb opposition and flexion-extension of the distal joints only. Maximizing the first synergy leads therefore to a posture resembling a power grasp, while the second one is linked to pinch movements directed towards smaller objects, and the third one represents movements of flexion and thumb opposition (like in grasping a dish or a platter) (Santello et al., 1998; Gentner and Classen, 2006; Ingram et al., 2008; Thakur et al., 2008) (Video 1). For this reason, the description of hand postures can benefit from reduction to combinations of a few meaningful synergies, which can provide more reliable results than clustering methods based on a small number of categories (Santello et al., 2002; Ingram et al., 2008; Thakur et al., 2008; Tessitore et al., 2013).

A challenge to individual digit cortical representations? The functional topography of hand synergies

The first three synergies are displayed on a flattened map of the cortical surface in Figure 2. The map suggests that the PCs are topographically arranged, forming clusters with a preference for each of the three synergies, separated by smooth transitions. This organization resembles that observed in the retinotopy of early visual areas (Sereno et al., 1995) or in auditory cortex as studied with tonotopic mapping (Formisano et al., 2003). This observation strongly suggests that primary motor and somatosensory brain regions may show specific, organized representations of synergies across the cortical surface. Such an observation is unprecedented, since the large number of previous studies adopted techniques, such as single cell recording (Riehle and Requin, 1989; Zhang et al., 1997) or intracortical microstimulation (ICMS) (Overduin et al., 2012), which can observe the activity of single neurons but do not capture the functional organization of motor cortex as a whole. Motor cortex has historically been hypothesized to be somatotopically organized in a set of sub-regions that control different segments of the body (Penfield and Boldrey, 1937). However, whereas subsequent work confirmed this organization (Penfield and Welch, 1951), a major critical point remains the internal organization of the single subregions (e.g., hand, leg or face areas). To date, a somatotopy of fingers within the hand area appears unlikely: as each digit is controlled by multiple muscles, individual digits may be mapped in a distributed rather than discrete fashion (Penfield and Boldrey, 1937; Schieber, 2001; Graziano et al., 2002; Aflalo and Graziano, 2006). An alternative view posits that movements are represented in M1 as clusters of neurons coding for different action types or goals (Graziano, 2016). In fact, mouse motor cortex is organized in clusters that encode different motor acts (Brown and Teskey, 2014). 
Similarly, stimulation of motor cortex in monkeys produces movements directed to stable spatial end-points (Graziano et al., 2002; Aflalo and Graziano, 2006) and may have a synergistic organization (Overduin et al., 2012). Recently, it has been demonstrated in both monkeys and humans that complex movements can be decoded from parietal as well as premotor and motor areas (Aflalo et al., 2015; Klaes et al., 2015; Schaffelhofer et al., 2015). Interestingly, successful decoding can be achieved in those regions during both motor planning and execution (Schaffelhofer et al., 2015). These observations about the internal organization of motor cortex were also demonstrated in humans, revealing that individual representations of digits within M1 show a high degree of overlap (Indovina and Sanes, 2001) and that, although digits may be arranged in a coarse ventro-dorsal order in somatosensory cortex, their representations are intermingled, so that the existence of digit-specific voxels is unlikely (Ejaz et al., 2015). In contrast, individual cortical voxels may contain enough information to encode specific gestures (Bleichner et al., 2014).

Measuring synergies: back from brain signal to motor actions

Finally, we asked whether the information encoded in M1 could be used to reconstruct hand postures. To this aim, each individual posture was expressed as a set of synergies that were derived from the fMRI activity on an independent cortical map. The results were reported as correlation values between the sets of joint angles originally tracked during kinematic recording and the joint angles derived from the reconstruction procedure. Overall, hand postures can be reconstructed with high accuracy based on the neural activity patterns. This result has potential applications for the development of novel brain-computer interfaces: for instance, previous studies demonstrated that neural spikes in primary motor cortex can be used to control robotic limbs that perform simple or complex movements (Schwartz et al., 2006; Schwartz, 2007; Velliste et al., 2008). Previous studies in monkeys suggest that neural activity patterns associated with grasp trajectories can be predicted from single neuron activity in M1 (Saleh et al., 2010; Saleh et al., 2012; Schaffelhofer et al., 2015) and recently neuronal spikes have been associated with principal components (Mollazadeh et al., 2014). In humans, cortical activity obtained through intracranial recordings can be used to decode postural information (Pistohl et al., 2012) and appropriate techniques can even allow EMG activity to be decoded from fMRI patterns (Ganesh et al., 2008) or from ECoG signals (Flint et al., 2014). So far, decoding of actual posture from fMRI activity in M1 was possible at individual voxel level, albeit with simplified paradigms and supervised classifiers that identified only four different movements (Bleichner et al., 2014). In contrast, by proving that posture-specific sets of joint angles – expressed by synergy loadings – can be decoded from the fMRI activity, we show that information about hand synergies is present in functional data and can even be used to identify complex gestures.
Other authors similarly demonstrated that a set of few synergies can describe hand posture in a reliable way, obtaining hand postures that correlated highly with those recorded with optical tracking (Thakur et al., 2008).

Limitations and methodological considerations

While nine subjects may appear to be a relatively limited sample for an fMRI study, our study sample is comparable to that of most reports on motor control and posture (e.g., Santello et al., 1998; Weiss and Flanders, 2004; Ingram et al., 2008; Thakur et al., 2008; Tessitore et al., 2013; Ejaz et al., 2015) as well as to the sample size of fMRI studies that use encoding techniques, rather than univariate analyses (Mitchell et al., 2008; Huth et al., 2012). In addition, the data of our multiple experimental procedures (i.e., kinematic tracking, EMG, and fMRI) were acquired within the same individuals, so as to minimize the impact of inter-subject variability and to facilitate the comparison between different models of hand posture. Finally, robust descriptive and cross-validation methods complemented single-subject multivariate approaches, which are less hampered by the number of participants than univariate fMRI procedures at group level.

A further potential criticism may involve the use of imagined objects – instead of real objects – as targets for grasping movements. The use of imagined objects avoids confounding variables, such as grasping forces and the difficulty of handling objects within a restrictive environment, that could play a role in modulating motor acts. In previous behavioral reports, synergies were evaluated using contact with real objects (Santello et al., 2002) and participants could also explore them in an unconstrained manner instead of concentrating on single actions (e.g., grasping) (Thakur et al., 2008). Another study tracked hand motion across many gestures performed in an everyday life setting (Ingram et al., 2008). Interestingly, the dimensionality reduction methods were adopted with high consistency in these reports, despite the wide variety of experimental settings, and the first few PCs could explain most of the variance across a very large number of motor acts. Moreover, when motor acts were performed toward both real and imagined objects, the results obtained from synergy evaluation were highly similar (Santello et al., 2002).

It can be argued that the better performance of kinematic synergies as compared to the two alternative models may be due to differences in the intrinsic signal and noise levels of the optical motion tracking and EMG acquisition techniques. Moreover, the muscle synergy model is inevitably simplified, since only a fraction of the intrinsic and extrinsic muscles of the hand can be recorded with surface EMG. Since all these factors may impact our ability to predict brain activity, we tested whether and to what extent different processing methods and EMG channel configurations could affect the performance of the muscle synergy model in discriminating single gestures and encoding brain activity. To this end, we performed an additional analysis on an independent group of subjects, testing different processing methods and EMG channel configurations (up to 16 channels). The results, reported in the Appendix, demonstrate that EMG recordings with a higher dimensionality (Gazzoni et al., 2014; Muceli et al., 2014) or a different signal processing (Ejaz et al., 2015) do not lead to better discrimination results. These findings are consistent with previous reports (Muceli et al., 2014), and indicate that, in the current study, the worse performance of the muscle model relates more to the signal-to-noise ratio of the EMG technique per se than to shortcomings of either the acquisition device or the signal processing methods adopted here.

While our data suggest that synergies may be arranged topographically on the cortical surface, the assessment of such a mapping is currently limited to the first three unrotated PCs. Additional studies are needed to investigate how topographical organization may be affected by the rotation of the principal components. Indeed, such an assessment requires the definition of stable population-level synergies to allow for the identification of optimally rotated components and to test their topographical arrangements across subjects; for this reason, it falls beyond the aims of the current study. Our work demonstrates that the topography of synergies, defined as a spatial map of the first three PCs, is resistant to different arrangements; however, alternative configurations (rotated solutions within the PCA) can be encoded as well in sensorimotor cortical areas. The relatively low C index obtained in the mapping procedure and the total variance explained by the kinematic synergy model during the encoding procedure leave the door open to better models and different topographical arrangements.

Beyond synergies: which pieces of information are also coded in the brain?

In summary, our results provide strong support for the representation of hand motor acts through postural synergies. However, this does not imply that synergies are the only way the brain encodes hand movements in primary motor cortex. In our data, only a portion (40%) of the total brain activity could be accounted for by kinematic synergies. Hand motor control results from complex interactions involving the integration of sensory feedback with the selection of motor commands to groups of hand muscles. Similarly, motor planning is also a complex process, which requires selecting the desired final posture based on the contact forces required to grasp or manipulate an object. These elements must be continuously monitored to allow for on-line adaptation and corrections (Castiello, 2005). Previous studies demonstrated that only a small fraction of variance in M1 is related to arm posture (Aflalo and Graziano, 2006) and that grasping force can be efficiently decoded from electrical activity, suggesting that at least a subset of M1 neurons processes force-related information (Flint et al., 2014). In addition, motor areas can combine individual digit patterns on the basis of alternative non-synergistic or nonlinear combinations and the correlated activity patterns for adjacent fingers may depend on alternative mechanisms such as finger enslaving (Ejaz et al., 2015). It is likely that sensorimotor areas also encode different combinations of synergies, based – for instance – on the rotated versions of kinematic PCs: the encoding of synergies and of their rotated counterparts may represent a wider repertoire of motor primitives which can improve the flexibility and adaptability of modular control. Moreover, the information encoded may be related to the grasping action as a whole, not only to its final posture.
Dimensionality reduction criteria can also be applied to hand posture over time, leading to time-varying synergies that encode complete preshaping gestures without being limited to their final position (Tessitore et al., 2013). This is consistent with EMG studies, which actually track muscle activity over the entire grasping trajectory (Weiss and Flanders, 2004; Cheung et al., 2009) and can add information about the adjustments performed during a motor act. Information about the temporal sequence of posture and movements may therefore be encoded in M1 and a different experimental setup is needed to test this hypothesis.

It should also be noted that studies in animal models provide strong evidence for a distributed coding of hand synergies beyond motor cortex, i.e., in the spinal cord (Overduin et al., 2012; Santello et al., 2013). The question about the role of M1 – i.e., whether it actually contains synergic information or simply acts as a selector of motor primitives that are encoded elsewhere – still remains open. Our study provides a relatively coarse description of the role of M1 neurons. According to the redundancy principle, only a part of M1 neurons may be directly involved in movement or posture control (Latash et al., 2007), whereas the remaining neurons may deal with force production or posture adjustments and control over time, allowing for the high flexibility and adaptability which are peculiar features of human hand movements.

Altogether, the coding of motor acts through postural synergies may shed new light on the representation of hand motor acts in the brain and pave the way for further studies of neural correlates of hand synergies. The possibility to use synergies to reconstruct hand posture from functional activity may lead to important outcomes and advancements in prosthetics and brain-machine interfaces. These applications could eventually use synergy-based information from motor cortical areas to perform movements in a smooth and natural way, using the same dimensionality reduction strategies that the brain applies during motor execution.

Materials and methods

Subjects

Nine healthy volunteers (5F, age 25 ± 3 yrs) participated in the study. The subjects were right-handed according to the Edinburgh Handedness Inventory (Oldfield, 1971). All participants had normal or corrected-to-normal visual acuity and received a medical examination, including a brain structural MRI scan, to exclude any disorder that could affect brain structure or function.

Experimental setup

The kinematic, EMG, and fMRI data were acquired during three separate sessions that were performed on different days, in a randomly alternated order across participants. Eight of nine subjects performed all three sessions, while EMG data from one participant were not recorded due to hardware failure. Across the three sessions, participants were requested to perform the same task of grasp-to-use gestures towards 20 different virtual objects. A training phase was performed prior to the sessions to familiarize participants with the experimental task.

The kinematic and EMG experiments were performed to obtain accurate descriptions of the final hand posture. Three models of equal dimensions (i.e., five dimensions for each of the twenty postures) were derived from these two sessions: a kinematic synergy model based on PCA on kinematic data, an additional kinematic description which considers separately the displacements of each individual digit for each posture, and an EMG-based muscle synergy model. The models were first assessed using a machine-learning approach to measure their ability to discriminate among individual postures. The models were then used in a comparable method (i.e., encoding procedure) aimed at predicting the fMRI activity while subjects performed the same hand grasping gestures. Finally, fMRI activity was used to reconstruct the hand postures (i.e., decoding procedure).

Kinematic experiment

The first experimental session consisted of kinematic recording of hand postures during the execution of motor acts with common objects. More specifically, we focused on the postural (static) component at the end of reach-to-grasp movements. Kinematic postural information was acquired with the model described in a previous study (Gabiccini et al., 2013), which is a fully parameterized model, reconstructed from a structural magnetic resonance imaging of the hand across a large number of postures (Stillfried et al., 2014). Such a model can be adapted to different subjects through a suitable calibration procedure. This model is amenable to in vivo joint recordings via optical tracking of markers attached to the skin and is endowed with a mechanism for compensating soft tissue artifacts caused by the skin and marker movements with respect to the bones (Gustus et al., 2012).

Kinematic data acquisition

During the recordings, participants were comfortably seated with their right hand in a resting position (semipronated) and were instructed to lift and shape their right hand as if to grasp a visually presented object. Stimulus presentation was organized into trials in which pictures of the target objects were shown on a computer screen for three seconds and were followed by an inter-stimulus pause (two seconds), followed by an auditory cue that prompted the grasping movements. The interval between two consecutive trials lasted seven seconds. In each trial, subjects were requested to grasp objects as if they were going to use them, and to place their hands in the resting position once the movement was over. Twenty different objects, chosen from our previous report (Santello et al., 1998), were used in the current study (see Supplementary file 1J for a list).

The experiment was organized in five runs, each composed of twenty trials, in randomized order across participants. Therefore, all the grasp-to-use movements were performed five times. The experiment was preceded by a training session that was performed after the positioning of the markers. Hand posture was measured by an optical motion capture system (Phase Space, San Leandro, CA, USA), composed of ten stereocameras with a sampling frequency of 480 Hz. The cameras recorded the Cartesian positions of the markers and expressed them with reference to a global inertial frame and to a local frame of reference obtained by adding a bracelet equipped with optical markers and fastened to the participants’ forearm. This allowed marker coordinates to be expressed with reference to this local frame. To derive the joint angles of the hand, other markers were placed on each bone (from metacarpal bones to distal phalanxes) and on a selected group of joints: thumb carpo-metacarpal (CMC), metacarpophalangeal (MCP) and interphalangeal (IP); index and middle MCPs; and all proximal interphalangeals (PIPs). This protocol is shown in Figure 5—figure supplement 1 and a full list of markerized joints and their locations can be found in Supplementary file 1K and in Gabiccini et al., 2013.

The placement of the markers was performed according to the model described in Gabiccini et al., 2013, which consists of 26 Degrees of Freedom (DoFs), 24 pertaining to the hand and 2 to the wrist. The wrist markers were not used in subsequent analyses. The marker configuration resembles a kinematic tree, with a root node corresponding to the Cartesian reference frame, rigidly fastened to the forearm, and the leaves matching the frames fixed to the distal phalanxes (PDs) of the five digits, as depicted in the first report of the protocol (Gabiccini et al., 2013).

Kinematic data preprocessing

First, the frame rate from the ten stereocameras was downsampled to 15 Hz. After a subject-specific calibration phase, which was performed to extract the geometric parameters of the model and the marker positions on the hand of each participant, movement reconstruction was performed by estimating all joint angles at each sample with an iterative extended Kalman filter (EKF) which takes into account both measurements explanation and closeness to the previous reconstructed pose (see Gabiccini et al., 2013 for further details).

Once all trials were reconstructed, the posture representing the final grasping configuration was selected through direct inspection. The final outcome of this procedure was a 24 x 100 matrix for each subject, containing 24 joint angles for 20 objects repeated five times.

Kinematic model

The kinematic data from each subject were analyzed independently. First, the hand postures were averaged across five repetitions for each object, after which the data matrix was centered by subtracting, from each of the 20 grasping movements, the mean posture calculated across all the motor acts. Two different models were obtained from the centered matrix. The first was a kinematic synergy model, obtained by reducing the dimensionality with a PCA on the 20 (postures) by 24 (joint angles) matrix and retaining only the first five principal components (PCs). In this way, the postures were projected onto the components space, hence obtaining linear combinations of synergies.
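The centering and projection steps described above can be sketched as follows. This is only an illustrative sketch in Python (the original analyses were carried out in MATLAB): the function names, the synthetic joint angles and the use of power iteration to approximate the principal axes are assumptions made here for a dependency-free example, not the authors' implementation.

```python
import math
import random

def center(postures):
    """Subtract the mean posture (column-wise mean) from every posture."""
    n = len(postures)
    means = [sum(col) / n for col in zip(*postures)]
    return [[x - m for x, m in zip(row, means)] for row in postures]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def principal_axes(centered, k, iters=200, seed=0):
    """Approximate the first k principal axes by power iteration with deflation."""
    rng = random.Random(seed)
    d = len(centered[0])
    # Scatter matrix X^T X; its eigenvectors are the principal axes.
    scatter = [[sum(r[i] * r[j] for r in centered) for j in range(d)]
               for i in range(d)]
    axes = []
    for _component in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for _step in range(iters):
            w = [dot(row, v) for row in scatter]
            # Remove the directions already found (Gram-Schmidt deflation).
            for a in axes:
                proj = dot(w, a)
                w = [x - proj * y for x, y in zip(w, a)]
            norm = math.sqrt(dot(w, w))
            if norm < 1e-12:
                raise ValueError("data have fewer than k non-trivial components")
            v = [x / norm for x in w]
        axes.append(v)
    return axes

def project(centered, axes):
    """Express each posture as a combination of the retained axes (synergies)."""
    return [[dot(row, a) for a in axes] for row in centered]

# Synthetic stand-in for one subject: 20 postures x 24 joint angles (degrees).
rng = random.Random(1)
postures = [[rng.uniform(0.0, 90.0) for _ in range(24)] for _ in range(20)]
centered = center(postures)
axes = principal_axes(centered, k=5)
scores = project(centered, axes)  # 20 x 5 matrix of synergy weights
```

Here `scores` plays the role of the 20 (postures) by 5 (synergies) matrix; with real data, a library PCA routine would normally replace the hand-written power iteration.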

To obtain an alternative individual digit model, defined on a somatotopic basis, the displacement of individual digits was also measured (Kirsch et al., 2014). Briefly, the displacement of each finger for the twenty single postures was obtained by calculating the sum of the single joint angles within each digit and gesture, again excluding wrist DoFs.
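As a minimal sketch of the individual digit model, the displacement of each digit can be computed by summing the joint angles that belong to it. The assignment of the 24 hand DoFs to the five digits used below (`DIGIT_JOINTS`) is a hypothetical layout chosen only for illustration; the actual assignment follows the kinematic model of Gabiccini et al., 2013.

```python
# Hypothetical column layout: which of the 24 joint angles belong to each digit.
DIGIT_JOINTS = {
    "thumb": [0, 1, 2, 3],
    "index": [4, 5, 6, 7, 8],
    "middle": [9, 10, 11, 12, 13],
    "ring": [14, 15, 16, 17, 18],
    "little": [19, 20, 21, 22, 23],
}

def digit_displacements(posture, digit_joints=DIGIT_JOINTS):
    """Sum the joint angles within each digit for one (centered) posture."""
    return [sum(posture[j] for j in joints) for joints in digit_joints.values()]

def individual_digit_model(postures, digit_joints=DIGIT_JOINTS):
    """Return a postures x 5 matrix with one displacement value per digit."""
    return [digit_displacements(p, digit_joints) for p in postures]
```

Applied to the 20 centered postures, this yields a 20 x 5 matrix with the same dimensionality as the synergy models.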

The analyses of all the sessions were carried out using MATLAB (MathWorks, Natick, MA, USA), unless stated otherwise.

EMG experiment

The second session consisted of a surface electromyography acquisition (EMG) during the execution of grasp-to-use acts performed towards the same imagined objects presented during the kinematic experiment.

EMG acquisition

EMG signals were acquired from five different muscles using self-adhesive surface electrodes. The muscles used for recording were: flexor digitorum superficialis (FDS), extensor digitorum communis (EDC), first dorsal interosseus (FDI), abductor pollicis brevis (APB), and abductor digiti minimi (ADM). The recording sites for each muscle were identified according to the standard procedures for EMG electrode placement (Hermens et al., 1999; Hermens et al., 2000). The skin was cleaned with alcohol before the placement of electrodes.

Participants performed the same tasks and protocol used in the kinematic experiment, i.e., visual presentation of the target object (three seconds), followed by an inter-stimulus interval (two seconds), an auditory cue to prompt movement, and an inter-trial interval (seven seconds). The experiment was divided into runs that comprised the execution of grasping actions towards all the 20 objects, in randomized order. Participants performed six runs. Each gesture was therefore repeated six times.

EMG signals were recorded using two devices (Biopac MP35 for four muscles; Biopac MP150 for the fifth muscle) and Kendall ARBO 24-mm surface electrodes, placed on the above-mentioned muscles of the participants’ right arm. EMG signals were sampled at 2 kHz.

EMG model

First, EMG signals were resampled to 1 kHz and filtered with a bandpass (30–1000 Hz) and a notch (50 Hz) filter. For each channel, each trial (defined as a time window of 2500 samples) underwent the extraction of 22 primary time-domain features, chosen from those that are most commonly used in EMG-based gesture recognition studies (Zecca et al., 2002; Mathiesen et al., 2010; Phinyomark et al., 2010; Tkach et al., 2010; see Chowdhury et al., 2013 for a review). Additional second-order features were obtained from the first features, computing their signal median, mean absolute deviation (MAD), skewness, and kurtosis. A complete list of the EMG features we used can be found in Supplementary file 1L.
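To give a concrete idea of what time-domain features look like, the sketch below computes four measures that are widely used in EMG gesture recognition. Whether these specific measures are among the 22 primary features used here is an assumption; the authoritative list is the one in Supplementary file 1L, and the function names are illustrative only.

```python
import math

def mean_absolute_value(x):
    """Average rectified amplitude of the signal window."""
    return sum(abs(v) for v in x) / len(x)

def root_mean_square(x):
    """Square root of the mean signal power."""
    return math.sqrt(sum(v * v for v in x) / len(x))

def waveform_length(x):
    """Cumulative absolute difference between consecutive samples."""
    return sum(abs(b - a) for a, b in zip(x, x[1:]))

def zero_crossings(x):
    """Number of sign changes between consecutive samples."""
    return sum(1 for a, b in zip(x, x[1:]) if a * b < 0)

def trial_features(window):
    """Feature vector for one channel and one trial (e.g., 2500 samples)."""
    return [
        mean_absolute_value(window),
        root_mean_square(window),
        waveform_length(window),
        zero_crossings(window),
    ]
```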

A muscle model was derived from the chosen features as follows: first, the pool of 410 features (82 for each of the 5 channels) was reduced to its five principal components. The 1 x 5 vectors describing each individual movement were averaged across the six repetitions. This 20 (movements) x 5 (synergies) matrix represented the muscle synergy model for the subsequent analyses.

Models validation

To verify that the three models (kinematic synergies, individual digit, and muscle synergies) were able to accurately describe hand posture, their capability to discriminate between individual gestures was tested. To this purpose, we developed a rank accuracy measure within a leave-one-out cross-validation procedure, as suggested by other authors to solve complex multiclass classification problems (Mitchell et al., 2004). For each iteration of the procedure, each repetition of each stimulus was left out (probe), whereas all other repetitions (test set) were averaged. Then, we computed PCA on the data from the test set. The PCA transformation parameters were applied to transform the probe data in a leave-one-repetition-out way. Subsequently, we computed the Euclidean distance between the probe element and each element from the test dataset. These distances were sorted, generating an ordered list of the potential gestures from the most to the least similar. The rank of the probe element in this sorted list was transformed into a percentage accuracy score. The procedure was iterated for each target gesture and repetition of the same grasping movement. The accuracy values were first averaged across repetitions and then across gestures, resulting in one averaged value for each subject. In this procedure, if an element is not discriminated above chance, it may fall in the middle of the ordered list (around position #10), which corresponds to an accuracy of 50%. For this reason, the chance level is always 50%, regardless of the number of gestures under consideration, while an accuracy of 100% indicates that the correct gesture in the sorted list retained the highest score (i.e., the lowest distance, first ranked) across repetitions and participants.
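The ranking step can be illustrated with a short sketch. It assumes that the probe and the averaged remaining repetitions have already been projected onto the PCA space (that step is omitted here), and it assumes a linear mapping from rank to percentage (first rank = 100%, last rank = 0%), which is one plausible reading of the description above rather than a formula stated in the text. All names are illustrative.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rank_accuracy(probe, probe_label, references):
    """Percentage accuracy from the rank of the correct gesture.

    references maps each gesture label to its averaged vector
    (all repetitions except the left-out probe).
    """
    # Order the candidate gestures from most to least similar to the probe.
    ordered = sorted(references, key=lambda lab: euclidean(probe, references[lab]))
    rank = ordered.index(probe_label) + 1  # 1 = closest
    n = len(ordered)
    # Assumed linear mapping: rank 1 -> 100%, rank n -> 0%, middle -> about 50%.
    return 100.0 * (1.0 - (rank - 1) / (n - 1))

# Toy example with three gestures in a two-dimensional space.
references = {"cup": [0.0, 0.0], "pen": [1.0, 0.0], "key": [5.0, 5.0]}
acc = rank_accuracy([0.1, 0.0], "cup", references)
```

In the toy example the probe is closest to "cup", so the correct gesture is ranked first and `acc` equals 100.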

The accuracy values were then tested for significance against the null distribution of ranks obtained from a permutation test. After averaging the four repetitions within the test set, the labels of the elements were shuffled; then, the ranking procedure described above was applied. The procedure was repeated 10,000 times, generating a null distribution of accuracies; the single-subject accuracy value was compared against this null distribution (one-sided rank test). This procedure was applied to the three models extracted from kinematic and EMG data, obtaining a measure of noise and stability across repetitions and postures, as described by the three different approaches. Such a validation procedure was therefore a necessary step to measure the information content of these three models before testing their ability to predict the fMRI signal.
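A minimal sketch of this kind of label-shuffling test is given below. The helper names are hypothetical, `statistic` stands for any function that recomputes the accuracy from a shuffled label list, and the "+1" correction in the p-value is a common convention assumed here, not a detail reported in the text.

```python
import random

def permutation_null(statistic, labels, n_perm, rng):
    """Recompute the statistic n_perm times on shuffled copies of the labels."""
    null = []
    for _ in range(n_perm):
        shuffled = labels[:]
        rng.shuffle(shuffled)
        null.append(statistic(shuffled))
    return null

def one_sided_p(observed, null_values):
    """Share of null values at least as large as the observed one."""
    exceed = sum(1 for v in null_values if v >= observed)
    # Assumed convention: count the observed value as one of the permutations.
    return (exceed + 1) / (len(null_values) + 1)

# Toy statistic: fraction of labels left in their original position.
labels = list(range(20))
match_rate = lambda labs: sum(a == b for a, b in zip(labs, labels)) / len(labels)
null = permutation_null(match_rate, labels, n_perm=1000, rng=random.Random(0))
p_value = one_sided_p(match_rate(labels), null)
```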

Individuation of the optimal number of components

The extraction of postural or muscle synergies from kinematic and EMG data was based on a PCA applied to the matrices of sensor measures or signal features, respectively. For the analyses performed here, we chose models based on the first five principal components that were shown to explain more than 90% of the variance in previous reports, even if those models were applied on data with lower dimensionality (Santello et al., 1998; Weiss and Flanders, 2004; Gentner and Classen, 2006). Moreover, an additional model was obtained from the postural data, thus leading to three different models with the same dimensionality (five dimensions): a kinematic synergy model (based on PCA applied to joint angles), an individual digit model (based on the average displacement of the digits), and a muscle synergy model (based on PCA applied to EMG features). However, to verify that the procedures applied here to reduce data dimensionality yielded the same results as those applied in previous works, we performed PCA by retaining variable numbers of components, from 1 to 10, and applied the above-described ranking procedure to test the accuracy of all data matrices. The plots of the accuracy values as a function of the number of PCs can be found in Supplementary file 1M and in Figure 6. The result of this analysis confirmed that the present data are consistent with the previous literature. The same testing procedure was also applied to the individual digit model by computing the rank accuracies for the full model (five components) and for the reduced models with 1 to 4 PCs.

The three graphs display the rank accuracy values as a function of the dimensionality (i.e., the number of retained PCs) of each behavioral model.

The two models derived from kinematic and EMG data (upper and middle graphs, respectively) have a number of synergies ranging from 1 to 10 while the individual digit model (lower) had 1 to 5 retained PCs. Darker bar colors indicate the dimensionality chosen for encoding brain functional data.

The task design was identical to that used in previous sessions. Specifically, participants had to shape their hand as if grasping one of the twenty visually-presented objects. In the current session, the subjects were asked to perform only the hand preshaping, limiting the execution of reaching acts with their arm or shoulder, since those movements could easily cause head motion. The day before MRI, all subjects practiced movements in a training session.

The paradigm was composed of five runs, each consisting of 20 randomized trials. Each trial consisted of a visual presentation of the target object (2.5 s), an inter-stimulus pause (5 s) followed by an auditory cue to prompt movements, and an inter-trial interval (12.5 s). The functional runs had two periods of rest (15 s) at their beginning and end to measure baseline activity. The total duration of each run was 7 min and 10 s (172 time points), i.e., 20 trials of 20 s plus the two 15 s rest periods. The total scanning time was about 40 min.

fMRI preprocessing

The initial steps of fMRI data analysis were performed with the AFNI software package (Cox, 1996). All volumes within each run were temporally aligned (3dTshift), corrected for head motion by registering to the fifth volume of the run that was closer in time to the anatomical image (3dvolreg) and underwent a spike removal procedure to correct for scanner-associated noise (3dDespike). A spatial smoothing with a Gaussian kernel (3dmerge, 4 mm, Full Width at Half Maximum) and a percentage normalization of each time point in the run (dividing the intensity of each voxel by its mean over the time series) were subsequently performed. Normalized runs were then concatenated and a multiple regression analysis was performed (3dDeconvolve). Each trial was modeled by nine tent functions that covered its entire duration from its onset up to 20 s (beginning of the subsequent trial) with an interval of 2.5 s. The responses associated with each movement were modeled with separate regressors and the five repetitions of the same trial were averaged. Movement parameters and polynomial signal trends were included in the analysis as regressors of no interest. The t-score response images at 2.5, 5, and 7.5 s after the auditory cue were averaged and used as an estimate of the BOLD responses to each grasping movement compared to rest.

The choice to average three different time points for the evaluation of BOLD response was justified by the fact that such a procedure leads to simpler encoding models for subsequent analyses and that the usage of tent functions is a more explorative procedure that is not linked to an exact time point. For this reason, we could obtain an estimation of brain activity that is more linked to the motor act than to the visual presentation of the target object by concentrating only on a restricted, late time interval. This approach – or similar ones – has also been used by other fMRI studies (Mitchell et al., 2008; Connolly et al., 2012).

The averaged coefficients related to the 20 stimuli of each subject were transformed to the standard MNI 152 space. First, the FMRIB Nonlinear Image Registration Tool (FNIRT) was applied to the anatomical images to register them to the standard space with a resolution of 1 mm³ (Andersson et al., 2007). The matrix of nonlinear coefficients was then applied to the BOLD responses, which were also resampled to a resolution of 2 x 2 x 2 mm.

fMRI single-subject encoding analysis

To identify the brain regions whose activity co-varied with the data obtained from the three models – kinematic, EMG synergies, and individual digits – a machine learning algorithm was developed, based on a modified version of the multiple linear regression encoding approach first proposed by Mitchell and colleagues (Mitchell et al., 2008). This procedure is aimed at predicting the activation pattern for a stimulus by computing a linear combination of synergy weights obtained from the behavioral models (i.e., Principal Components) with an algorithm previously trained on the activation images of a subset of stimuli (see Figure 5—figure supplement 1). The procedure consisted of 190 iterations of a leave-two-out cross-validation in which the stimuli were first partitioned into a training set (18 stimuli) and a test set with the two left-out examples. The sample for the analysis was then restricted to the 5000 voxels with the best average BOLD response across the 18 stimuli in the training set (expressed by the highest t-scores). For each iteration, the model was first trained with the vectorized patterns of fMRI coefficients of 18 stimuli associated with their known labels (i.e., the target objects). The training procedure employed a least-squares multiple linear regression to identify the set of parameters that, if applied to the five synergy weights, minimized the squared error in reconstructing the fMRI images from the training sample. After training the model, only the 1000 voxels that showed the highest R2 (a measure of fitting between the matrix of synergy weights and the training data) were retained. A cluster size correction (nearest neighbor, size = 50 voxels) was also applied to prune small, isolated clusters of voxels.
The performance of the trained model was then assessed in a subsequent decoding stage by providing it with the fMRI images related to the two unseen gestures and their synergy weights, and requiring it to associate an fMRI pattern with the label of one of the left-out stimuli. The procedure was performed within the previously chosen 1000 voxels and accuracy was assessed by considering the correlation distance between the predicted and real fMRI patterns for each of the two unseen stimuli. This pairwise procedure led therefore to a number of correctly predicted fMRI patterns ranging from 0 to 2 with a chance level of 50%. This cross-validation loop was repeated 190 times, leaving out all the possible pairs of stimuli. Therefore, the results consisted of an overall accuracy value – the percentage of fMRI patterns correctly attributed, which is an expression of the success of the model in predicting brain signals – and a map of the voxels that were used in the procedure – i.e., the voxels whose signal was predictable on the basis of the synergy coefficients. Every voxel had a score ranging from 0 (if the voxel was never used) to a possible maximum of 380 (if the voxel was among the 1000 with the highest R2 and the two left-out patterns could be predicted in all the 190 iterations). The encoding analysis was performed in separate procedures for each model – i.e., kinematic and muscle synergies and individual digit. We obtained, therefore, three sets of accuracy values and three maps of the most used voxels for each subject. These results, which displayed the brain regions whose activity was specifically modulated by the grasping action that was performed inside the scanner, were subsequently used for building the group-level probability maps (see below).
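The cross-validation loop described above can be illustrated with a minimal sketch. It is not the authors' code: the function name, the intercept term, and the rule used to score each left-out pattern (a predicted pattern counts as correct when its correlation distance to its own real pattern is smaller than to the other left-out pattern) are assumptions made for illustration. The initial selection of 5000 voxels by t-score, the cluster size correction, and the per-voxel usage map are omitted.

```python
import itertools
import numpy as np


def leave_two_out_accuracy(weights, bold, n_keep=1000):
    """Leave-two-out encoding sketch.

    weights: (n_stimuli, n_synergies) synergy weights.
    bold:    (n_stimuli, n_voxels) fMRI coefficients.
    Returns the fraction of left-out patterns matched to the right stimulus.
    """
    n_stim = weights.shape[0]
    n_keep = min(n_keep, bold.shape[1])
    pairs = list(itertools.combinations(range(n_stim), 2))  # 190 pairs for 20 stimuli
    n_correct = 0
    for i, j in pairs:
        test = [i, j]
        train = np.setdiff1d(np.arange(n_stim), test)
        # Least-squares regression from synergy weights to every voxel.
        # The intercept column is an assumption of this sketch.
        X = np.column_stack([weights[train], np.ones(train.size)])
        beta, *_ = np.linalg.lstsq(X, bold[train], rcond=None)
        # Retain the voxels with the highest training-set R^2.
        resid = bold[train] - X @ beta
        ss_res = (resid ** 2).sum(axis=0)
        ss_tot = ((bold[train] - bold[train].mean(axis=0)) ** 2).sum(axis=0)
        r2 = 1.0 - ss_res / ss_tot
        keep = np.argsort(r2)[-n_keep:]
        # Predict the two unseen patterns and compare them with the real ones.
        X_test = np.column_stack([weights[test], np.ones(2)])
        pred = (X_test @ beta)[:, keep]
        real = bold[test][:, keep]
        # Rows 0-1 of the stacked matrix are predictions, rows 2-3 are real patterns.
        dist = 1.0 - np.corrcoef(pred, real)[:2, 2:]
        n_correct += int(dist[0, 0] < dist[0, 1]) + int(dist[1, 1] < dist[1, 0])
    return n_correct / (2 * len(pairs))
```

With this scoring rule, each iteration contributes 0, 1, or 2 correct patterns, so the expected accuracy under the null hypothesis is 50%.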

Assessment of the accuracy of the encoding analysis

The single-subject accuracy was tested for significance against the distribution of accuracies generated with a permutation test within the above-defined encoding procedure. Permutation tests are the most reliable and correct method to assess statistical significance in multivariate fMRI studies (Schreiber and Krekelberg, 2013; Handjaras et al., 2015). The null distribution of accuracies was built with a loop in which the model was first trained with five randomly chosen synergy weights that were obtained by picking a random value out of the 18 (one for each gesture) in each column of the matrix of synergies. The trained model was subsequently tested on the two left-out images. The procedure was repeated 1000 times, leading to a null distribution of 1000 accuracy values against which we compared the value obtained from the above-described encoding method. Similarly to the encoding analysis, we did not use either the fMRI images or the synergy weights of the two test stimuli for training the model. The left-out examples were therefore tested by an algorithm that had been trained on a completely independent data sample. The weights were shuffled only within column: this procedure yielded vectors of shuffled weights with the same variance as the actual kinematic PCs, even though those vectors were no longer orthogonal. Permutation tests were performed separately for each subject with the three data matrices. Each single-subject accuracy was therefore tested against the null distribution of accuracy values obtained from the same subject data (one-sided rank test).
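The two ingredients of this test, within-column shuffling and the one-sided rank comparison, can be sketched as follows. The function names are hypothetical, and the "+1" correction in the p-value is a common convention assumed here rather than a detail given in the text.

```python
import numpy as np


def shuffle_within_columns(weights, rng):
    """Permute each column (synergy) independently across gestures.

    Every column keeps its values, and hence its variance, but the
    shuffled columns are in general no longer orthogonal.
    """
    return rng.permuted(weights, axis=0)


def one_sided_rank_p(observed, null_accuracies):
    """One-sided p-value: share of null accuracies at or above the observed one.

    The +1 in numerator and denominator (assumed convention) counts the
    observed value as part of the null distribution.
    """
    null_accuracies = np.asarray(null_accuracies)
    n_at_or_above = np.sum(null_accuracies >= observed)
    return (1 + n_at_or_above) / (1 + null_accuracies.size)
```

In a full permutation run, the shuffled weights would replace the real synergy weights in the training step of the encoding procedure, and the resulting accuracies would form the null distribution passed to the second function.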

Group-level probability maps

A group map displaying the voxels that were consistently recruited across subjects was obtained for the kinematic synergy model. The single-subject maps obtained from the encoding analysis, which display the voxels recruited by the encoding procedure in each subject, were first binarized by converting non-zero accuracy values to 1, then summed to obtain an across-subjects overlap image. A probability threshold (p>0.33) was then applied to this overlap image to retain voxels in which the encoding procedure was successful in at least four out of the nine subjects (Figure 1).
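The binarize, sum, and threshold steps can be sketched in a few lines. The function name and the `min_subjects` parameter are hypothetical; the sketch thresholds on the subject count stated in the text (four of nine) rather than on the proportion.

```python
import numpy as np


def group_overlap_map(subject_maps, min_subjects=4):
    """Binarize single-subject maps, sum them, and threshold the overlap.

    subject_maps: (n_subjects, ...) array of per-voxel scores, where 0
                  means the voxel was never used by the encoding procedure.
    Returns the overlap count per voxel and a boolean mask of voxels
    recruited in at least `min_subjects` subjects.
    """
    maps = np.asarray(subject_maps)
    binary = maps != 0            # non-zero scores become 1
    overlap = binary.sum(axis=0)  # number of subjects recruiting each voxel
    return overlap, overlap >= min_subjects
```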

Discrimination of single postures by fMRI data

The accuracies of pairwise discrimination of postures, achieved during the decoding stage of the encoding procedure, were combined across subjects, so as to identify the postures that could be discriminated with the highest accuracy based on their associated BOLD activity. The results were displayed as a heat map (Figure 5), with a threshold corresponding to the chance level of 50%.

Assessment of kinematic synergies across subjects

To evaluate whether the synergies computed on kinematic data from our sample would allow for a reliable reconstruction of hand posture, we needed to verify that these synergies are consistently ranked across individuals. Therefore, we used Metric Pairwise Constrained K-Means (Bilenko et al., 2004), a method for semi-supervised clustering that integrates a distance function and constrained classes. We used the weights of the first three kinematic synergies for the 20 gestures in each subject as input data and arranged the set of 27 20-item vectors into three classes, each containing the nine synergies that showed the highest similarity (see Supplementary file 1N). This analysis was limited to the first three PCs since previous reports (Santello et al., 1998; Gentner and Classen, 2006) suggest that they may constitute a group of “core synergies”, with a cumulative explained variance greater than 80%. This analysis was performed only on the synergies obtained from the kinematic synergy model, which was able to outperform both the individual digit and muscle synergy models in terms of encoding accuracy percentages on fMRI data.

To facilitate the interpretation of the first kinematic PCs as elementary grasps, we plotted the time course of the corresponding hand movements. The plots are 2-s-long videos showing three movements from the minimum to the maximum values of PCs 1, 2 and 3, respectively, expressed as sets of 24 joint angles averaged across subjects (Video 1).

Cortical mapping of the three group synergies

The three group synergies were studied separately, computing the single correlations between each PC and the fMRI activation coefficients. This correlation estimated the similarity between the activity of every voxel for the 20 grasping acts and the weights of each single synergy. The coefficient of determination (R2) for each synergy was averaged across participants to achieve a measurement of group-level goodness of fit. The overlap image between the group-level probability map and the goodness of fit for each synergy was then obtained and mapped onto a flattened mesh of the cortical surface (Figure 2). The AFNI SUMA program, the BrainVISA package and the ICBM MNI 152 brain template (Fonov et al., 2009) were used to render results on the cortical surface (Figures 1 and 2).

To provide a statistical assessment of the orderly mapping of synergies across the regions recruited by the encoding procedure, a comparison between the map space and the feature space was performed (Goodhill and Sejnowski, 1997; Yarrow et al., 2014). The correlation of the two spaces is expressed by an index (C parameter) that reflects the similarity between the arrangement of voxels in space and the arrangement of their information content: high values indicate that voxels which contain similar information are also spatially close, suggesting a topographical organization. The map space was derived by measuring the standardized Euclidean distance between each voxel position in the grid. The feature space was computed using the standardized Euclidean distance between the three synergy weights, as defined by their R2, for each voxel and averaged across subjects according to the classes described in the sections Assessment of kinematic synergies across subjects and Cortical mapping of the three group synergies. The C parameter was obtained by computing the Pearson correlation between the map space and the feature space (Yarrow et al., 2014). An ad-hoc statistical test was developed to assess the existence of the topography. A permutation test was performed, generating a null distribution of C values by correlating the map space with feature spaces obtained by averaging the three synergies across subjects with different random combinations (10,000 iterations). The p-value was calculated by comparing the null distribution with the C parameter obtained with the cortical mapping (one-sided rank test).
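The C parameter itself can be sketched as a correlation between two sets of pairwise distances. This is a minimal illustration: the function names are hypothetical, the standardization divides each dimension by its sample standard deviation (an assumption about how "standardized" was implemented), and the permutation scheme over subject combinations is not reproduced.

```python
import numpy as np


def pairwise_standardized_euclidean(x):
    """Condensed vector of standardized Euclidean distances between rows of x."""
    z = x / x.std(axis=0, ddof=1)         # scale each dimension by its sample SD
    diff = z[:, None, :] - z[None, :, :]  # (n, n, d); memory grows with n squared
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    upper = np.triu_indices(x.shape[0], k=1)
    return dist[upper]


def c_parameter(voxel_coords, voxel_features):
    """Pearson correlation between map-space and feature-space distances.

    voxel_coords:   (n_voxels, 3) voxel positions in the grid.
    voxel_features: (n_voxels, 3) group-averaged R^2 for the three synergies.
    """
    map_space = pairwise_standardized_euclidean(voxel_coords)
    feature_space = pairwise_standardized_euclidean(voxel_features)
    return np.corrcoef(map_space, feature_space)[0, 1]
```

Because the broadcasted difference array has one entry per voxel pair and dimension, a sketch of this kind is practical only for a few thousand voxels at a time.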

Representational content measures (Kriegeskorte et al., 2008a; Kriegeskorte and Kievit, 2013) were carried out to explore the information that is coded in the regions activated during the execution of goal-directed motor acts. Representational spaces (RSs) are matrices that display the distances between all the possible pairs of neurofunctional or behavioral measures, informing us about the internal similarities and differences that can be observed within a stimulus space. By computing a second-order correlation between single model RSs we can evaluate both the similarity between the information carried by the single behavioral models (kinematic, individual digits and EMG) and between behavioral data and brain activity as measured by fMRI.

Representational similarity analysis (RSA) was therefore performed within a subset of voxels that were consistently activated by the task. A Region of Interest (ROI) was derived from the fMRI data by performing a t-test (AFNI program 3dttest++) that compared the mean brain activity at 2.5, 5, and 7.5 s after the auditory cue and the activity at rest. The results were corrected for False Discovery Rate (Benjamini and Hochberg, 1995; p<0.05) (Figure 4—figure supplement 2). Afterwards, the t-scores relative to each voxel within the ROI were normalized by subtracting the mean across-stimulus activation of all the voxels in the ROI and dividing the value by the standard deviation (z-score normalization). PCA was performed to reduce the BOLD activity of the voxels in the ROI to the first five principal components. Activation pattern RSs were then obtained for each subject by calculating the Euclidean distance between the PCs of all the possible pairs of stimuli (Edelman et al., 1998; Kriegeskorte et al., 2008b; Haxby et al., 2014). Model RSs were similarly computed for the three types of postural data. This procedure led to a set of brain activity RSs and three sets of model RSs for kinematic synergy, individual digit, and muscle synergy models, respectively. The single subject RSs were averaged to obtain a unique group RS for each model.

Since we were interested in identifying the similarities and differences between the information expressed by the behavioral models and the information encoded in the brain, we estimated Pearson correlation separately between the fMRI-based RS and each model RS (Kriegeskorte et al., 2008a, 2008b; Devereux et al., 2013). Moreover, to study the possible specific relations between the behavioral models, additional pairwise correlations between the three model RSs were also performed.

These correlations were tested with the Mantel test by randomizing the twenty stimulus labels and computing the correlation. This step was repeated 10,000 times, yielding a null distribution of correlation coefficients. Subsequently, we derived the p-value as the percent rank of each correlation within this null distribution (Kriegeskorte et al., 2008a). The correlations were also estimated between single-subject RSs.
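A minimal sketch of the RS comparison and of the Mantel test is given below. The function names are hypothetical, and the "+1" correction in the p-value is an assumed convention; the text only specifies a percent rank within the null distribution.

```python
import numpy as np


def euclidean_rs(data):
    """Representational space: Euclidean distances between all pairs of rows.

    data: (n_stimuli, n_features), e.g. the first five PCs per stimulus.
    """
    diff = data[:, None, :] - data[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def mantel_test(rs_a, rs_b, n_perm=10000, seed=0):
    """Pearson correlation between two RSs with a label-permutation p-value."""
    rng = np.random.default_rng(seed)
    n = rs_a.shape[0]
    upper = np.triu_indices(n, k=1)  # each pair of stimuli counted once
    observed = np.corrcoef(rs_a[upper], rs_b[upper])[0, 1]
    null = np.empty(n_perm)
    for k in range(n_perm):
        # Shuffle stimulus labels: the same permutation on rows and columns.
        perm = rng.permutation(n)
        shuffled = rs_a[perm][:, perm]
        null[k] = np.corrcoef(shuffled[upper], rs_b[upper])[0, 1]
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, p_value
```

Permuting rows and columns together keeps each shuffled matrix a valid distance matrix, which is why the labels are randomized rather than the individual distances.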

In addition, a multidimensional scaling (MDS) procedure, using standardized Euclidean distance, metric stress criterion and Procrustes alignment (Kruskal and Wish, 1978), was performed to represent the kinematic synergies and the patterns of BOLD activity across subjects (Figure 3).

Decoding of hand posture from fMRI data

Additionally, the fMRI data were used to decode hand postures from stimulus-specific brain activity.

This procedure was performed using fMRI coefficients to obtain a set of 24 values, each representing the distances between adjacent hand joints, which could then be used to plot hand configuration. For this purpose, we first ran a PCA on the fMRI data, using the voxels within the mask obtained for the RSA and MDS (see above and Figure 4—figure supplement 2) to avoid any possible selection bias; with this procedure, the dimensionality of the data was reduced to the first five dimensions, as previously done for kinematic and EMG data.

Then, a multiple linear regression was performed within a leave-one-stimulus-out procedure by using the matrix of postural coefficients as predicted data and the reduced fMRI matrix as predictor. This allowed for the reconstruction of the coefficients of the left-out posture, yielding a matrix with 20 rows (postures) and 24 columns (joint angles). Finally, we estimated the goodness of fit (R2) between the reconstructed data and the original postural matrices recorded with the optical tracking system, both subject-wise (i.e., computing the correlation of the whole matrices) and posture-wise (i.e., computing the correlation of each posture vector). In addition, the decoding performance was assessed using a rank accuracy procedure (similar to those performed in the behavioral analyses) in which each reconstructed posture was classified against those originally recorded during the kinematic experiment. The accuracy values were tested against the null distribution generated by a permutation test (10,000 iterations). The reconstructed data were then plotted, using custom code written in MATLAB and Mathematica 9.0 (Wolfram Research, Inc., Champaign, IL, USA) (Figure 4).
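The leave-one-stimulus-out regression and a rank-based score can be sketched as follows. The rank accuracy used in the study is defined in the behavioral analyses and is not reproduced here; the version below (normalized rank of the true posture among all recorded postures, by Euclidean distance) is an assumption, as are the function names and the intercept term.

```python
import numpy as np


def decode_postures(fmri_pcs, postures):
    """Leave-one-stimulus-out reconstruction of posture vectors.

    fmri_pcs: (n_stimuli, n_components) reduced fMRI data (predictor).
    postures: (n_stimuli, n_joints) recorded postural coefficients (predicted).
    """
    n = postures.shape[0]
    recon = np.empty(postures.shape, dtype=float)
    for k in range(n):
        train = np.arange(n) != k
        # Intercept column added as an assumption of this sketch.
        X = np.column_stack([fmri_pcs[train], np.ones(train.sum())])
        beta, *_ = np.linalg.lstsq(X, postures[train], rcond=None)
        recon[k] = np.append(fmri_pcs[k], 1.0) @ beta
    return recon


def rank_accuracy(recon, postures):
    """Mean normalized rank of the true posture (1 = always closest, 0.5 = chance)."""
    n = postures.shape[0]
    scores = np.empty(n)
    for k in range(n):
        dist = np.linalg.norm(postures - recon[k], axis=1)
        n_closer = np.sum(dist < dist[k])  # postures closer than the true one
        scores[k] = 1.0 - n_closer / (n - 1)
    return scores.mean()
```

For 20 postures and 24 joint values, `decode_postures` returns a 20 x 24 matrix that can be compared with the recorded one either as a whole or row by row.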

Appendix: Impact of the number of channels on gesture discrimination from EMG data

It could be hypothesized that the worse performance of the muscle synergy model as compared to the alternative kinematic synergy or individual digit models could be related to its lower dimensionality (five muscles against 26 hand DoFs). Although previous reports indicate that reliable gesture discrimination can be achieved from seven (Weiss and Flanders, 2004) or fewer muscles (Ganesh et al., 2007; Ahsan et al., 2011), it is feasible to record a larger number of muscles using advanced EMG devices.

Hence, we verified the impact of the number of EMG channels on the muscle synergy model in an independent sample of four healthy young subjects (4 M, age 34 ± 6) using the same experimental paradigm described in the Methods.

EMG data were acquired using a 16-channel Bagnoli 16 EMG recording device (Delsys Inc, Natick, MA, USA). Sixteen electrodes were placed on the hand and forearm using the same placement adopted in our protocol (see Materials and methods and Figure 1 below) as well as in two distinct protocols with different spatial resolutions (Bitzer and van der Smagt, 2006; Ejaz et al., 2015). Six runs were acquired, each comprising twenty trials of delayed grasp-to-use motor acts towards visually-presented objects (see Materials and methods).

To estimate the impact of the number of EMG recording sites and the preprocessing methods, data were analyzed using two distinct procedures: a mean-based procedure (similarly to Ejaz et al., 2015), and a feature-based procedure.

In the mean-based procedure, data from the sixteen EMG channels (acquired at 1000 Hz) were de-trended, rectified, and low-pass filtered (fourth-order Butterworth filter, 40 Hz). The time series from each gesture and channel were later averaged over a 2.5 s time window (2500 time points). From this preprocessing we obtained twenty 16x1 vectors for each run.

In the feature-based procedure, EMG signals were preprocessed and eighty-two features from each channel were extracted as described in the Methods section.

Subsequently, two procedures were developed to uncover the impact of different processing methods and EMG channel configurations. First, we generated all the possible configurations that could be obtained by choosing the channels randomly. Second, we selected three fixed configurations as subsamples of electrodes (displayed in Figure 1), according to the Methods in this manuscript (electrodes 1-5) and previous reports that recorded ten (Bitzer and van der Smagt, 2006; electrodes 1-4, 6-8, 14-16), or fourteen channels (Ejaz et al., 2015; electrodes 1-14).
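To give a sense of the size of the first procedure, the possible channel subsets can be enumerated directly. This sketch assumes that "all the possible configurations" means every subset of a given size drawn from the 16 electrodes; the function names are hypothetical.

```python
import itertools
import math

N_CHANNELS = 16  # electrodes in the independent EMG sample


def channel_configurations(n_selected, n_channels=N_CHANNELS):
    """Iterate over every subset of `n_selected` electrodes (numbered from 1)."""
    return itertools.combinations(range(1, n_channels + 1), n_selected)


def n_configurations(n_selected, n_channels=N_CHANNELS):
    """Number of distinct subsets of `n_selected` electrodes."""
    return math.comb(n_channels, n_selected)
```

Under this assumption there are 4368 distinct five-channel configurations and a single 16-channel configuration, so the averages for small channel counts pool many more configurations than those for large counts.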

To allow comparisons across different channel configurations, the EMG matrix (i.e., the averaged EMG activity in the mean-based procedure and the extracted features in the feature-based procedure) was reduced to five dimensions using PCA. Then, both these procedures were assessed with a leave-one-out cross-validation algorithm based on the same rank accuracy measure described in the manuscript.

This additional experiment provides a measure of the quality of each channel configuration: the higher the accuracy, the more informative the configuration. The results are shown in Figure 2 as the average across combinations and subjects ± SEM. We tested all configurations that could be obtained by randomly selecting 5 to 16 electrodes (red and blue lines), as well as three fixed configurations according to the setups described above (orange and light blue dots). The red line represents the results using the mean-based procedure, while the blue line depicts the feature-based procedure. The orange and light blue dots represent the results of the three fixed configurations of channels in the two procedures.

Results of the rank accuracy procedure as a function of the number of EMG channels.

The red line shows the accuracy values for random configurations of 5 to 16 electrodes, using the mean-based preprocessing adopted by Ejaz et al. (2015). The orange dots represent the accuracy values for three fixed configurations. The blue line shows the accuracy values for 5 to 16 channels using the feature-based preprocessing (see Materials and methods); the light blue dots show the accuracy for three fixed configurations. Values are reported as mean across subjects ± SEM (error bars and bands).

The results show that, for the feature-based procedure, the accuracy increases as a function of the number of electrodes, reaching a peak with 16 channels (mean ± SEM: 81.6 ± 2%); the mean accuracy across all the possible configurations with five channels is 73.5 ± 2.5%. The accuracy obtained with the setup adopted in our current paper was 74.2 ± 6.4%. For the mean-based procedure described in Ejaz et al. (2015), eleven channels yielded the highest accuracies among all the possible random configurations (value: 72.2 ± 3.2%); accuracy decreased when lower or higher numbers of electrodes were recorded. In these data, the accuracy for the configuration of five channels adopted in our paper was 69.5 ± 1.6%.

Overall, these results indicate that the extraction of features from the EMG signal proves to be a reliable procedure to discriminate complex hand gestures. In addition, despite the fact that the feature-based approach seems to benefit from EMG recordings with more channels, the gain when raising the number of channels to 16 is low (5.5%). This result, along with the above-chance discrimination achieved when analyzing five channels, clearly suggests that the number of muscles recorded in our paper represents the muscle space with a reasonable accuracy. Moreover, feature-based approaches are likely to be better descriptors of more complex gestures (as the ones considered in our study) with respect to the mean signal over time, as hypothesized and discussed in previous reports (Hudgins et al., 1993; Zecca et al., 2002).

In conclusion, the muscle synergy model, even if based on many EMG channels, still underperforms relative to the models obtained from kinematic data in encoding fMRI responses. For this reason, the worse performance of the muscle synergy model is likely to represent an intrinsic limitation of surface EMG signals rather than a flaw of the recordings and analyses performed in our paper.

Decision letter

Jody C Culham

Reviewing Editor; University of Western Ontario, Canada

In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.

[Editors’ note: a previous version of this study was rejected after peer review, but the authors submitted for reconsideration. The first decision letter after peer review is shown below.]

Thank you for submitting your work entitled "A synergy-based control is encoded in human motor cortical areas" for consideration by eLife. Your article has been reviewed by three peer reviewers, and the evaluation has been overseen by Jody Culham as Reviewing Editor and Timothy Behrens as the Senior Editor. Our decision has been reached after consultation between the reviewers.

The following individuals involved in the review of your submission have agreed to reveal their identity: Joern Diedrichsen and Jason Gallivan (reviewers).

Based on these discussions and the individual reviews below, we regret to inform you that we are rejecting the manuscript. As it stands, the manuscript would require substantial revisions, including new data and analyses, to address reservations raised by the reviewers, particularly Reviewer #1. All three reviewers (like the editors in the initial evaluation) saw potential in the approach but their enthusiasm was tempered by a number of concerns. In post-review discussions, even Reviewers #2 and #3, who were largely positive, agreed that the manuscript should not be published unless the major concerns detailed below are addressed. Normally eLife tries to avoid making authors go through a gauntlet of revisions if a positive final outcome is uncertain. As such, we are rejecting the current manuscript.

That said, we would be willing to consider a new manuscript that unequivocally addresses the concerns raised. In this case, we would aim to recruit the same reviewers. We must emphasize that, as with any new manuscript, there is no guarantee of publication, especially if you do not make the required changes or the new analyses do not support the conclusions. As such, you may decide instead to submit your manuscript to another journal, in which case we hope the reviewers' comments are helpful.

To be considered for publication in eLife, the following changes would be essential (based on the specific reviewer comments detailed below):

1) The authors would need to test their synergy model against a more plausible "muscle model" based on better EMG recordings of more muscles (Reviewer #1, Point #1; though the other two reviewers were in close agreement during post-review discussion). Though the possibility of removing the muscle model from the paper was discussed, the consensus was that this would reduce the impact of the paper.

2) There was a clear consensus among all three reviewers and reviewing editor that the interpretation of principal components and their mapping needs to be unpacked better. One concern is that without any insight as to what the principal components represent, the demonstration that they show a topography is of limited value (Reviewer #2, first point). A second concern is whether the topography really reflects those components or would equally reflect components in other rotated versions of the space (Reviewer #1, Point #3). There may be some potential with the PCA data to ameliorate some concerns at the initial review stage and from Reviewer #1 about whether the paper provides a sufficient advance beyond Ejaz et al. (2015). While Ejaz et al. limited their analyses to M1 and S1, the present manuscript shows potentially interesting patterns in other parietal and frontal regions. However, as it stands, the patterns and interpretation are so vague that they do not provide any real insight into regional differences.

3) The authors need to clarify their discussion of their PCA-based methods against the RSA based method (as used both in their paper and in Ejaz et al. – see Reviewer #2's comment on RSA and Reviewer #1, Comment #2).

In addition, if the authors choose to resubmit the manuscript in a new form to eLife, the other comments of the reviewers should be addressed. In post review discussion, all reviewers agreed with the suggestion that noise ceilings should be reported.

Reviewer #1:

The study "A synergy-based control is encoded in human motor cortical areas" provides an investigation of the MRI patterns associated with the execution of grasp-like hand shapes. The main finding is that the activity patterns of two left-out postures can be better discriminated when using 5 regressors extracted from kinematic synergies than 5 regressors reflecting the unsigned displacement of individual fingers or 5 regressors picked from features of EMG recording of 5 different hand muscles. The conclusions of the study are largely overlapping with those of an earlier paper from our lab (Ejaz et al., 2015), but the study adds a number of interesting extra aspects to this line of work, including testing the generalisation of the model to new postures and the investigation of the spatial arrangement of these synergies onto the cortical sheet (but see point 3). The current version of the paper, however, has a number of weaknesses that certainly would need addressing.

1) The alternative models (individual fingers and muscle) give the appearance of straw men. The individual finger uses the L1 norm of movement of each finger – so in contrast to the kinematic synergy model, it does not distinguish between finger flexion and finger extension. This decision appears to be somewhat arbitrary. So, it leaves the reader with the question of whether there is something special about taking the absolute value, or about the specific rotation of these 5 factors in representational space. A more convincing line of investigation would be to try to use optimisation to rotate the 5 linear factors in the kinematic space so as to get the best possible decoding performance, and then test the closeness of this solution with the one provided by the kinematic synergy model.

Matters are worse with the muscle model. The authors recorded 5 muscles only, despite the fact that in our experience it is feasible to get 14 or more distinct signals from hand muscles from surface electrodes (Ejaz et al., 2015). These may not always reflect individual muscles, but that is hardly important if we only want to obtain a representative picture of the space of muscle activity. The extracted features from the EMG signals appear obscure to me; and ultimately, the ability to distinguish between postures based on this data is very bad, indicating that most of these numbers reflect noise. Given these large a-priori differences in the quality of the models, I think any subsequent difference in how well fMRI activity patterns can be predicted become utterly unconvincing. So I think the authors need to work harder on trying to equate the reliability of their models (for a possible method, see Ejaz et al. (2015), supplementary methods).

2) While I think that some of the techniques used in the paper are interesting and promising, I believe that they are currently not going beyond the RSA analysis presented in Ejaz et al. Unfortunately, in the discussion the authors misconstrue the previous evidence and perpetuate some misunderstandings that are all too common in the synergy field. For example the authors state that "these findings, however, provide no clue regarding the extent to which the brain may control the hand using functional modules" and "their model (hand usage model) was therefore similar to the individual digit model adopted in the present study", showing that they clearly do not appreciate the tight connection between RSA and the methods chosen here.

It is important to point out that the matrix of pairwise distances contains the same information as the covariance matrix between experimental conditions (and can easily be transformed into it). Extracting PCA factors from this covariance matrix provides a distribution of the same statistical quantity, only that it throws away a certain proportion of the information (and by just considering the principal vectors also disregards the relative importance of the factors). Therefore RSA and extraction of principal components from the covariance reflect highly related information – and I do not think that the authors have made any convincing case that anything can be learned from the PCA approach that cannot be learned from looking at the whole space. This boils down to the key question in the synergy field of whether there is something special about the principal vectors (or synergies) themselves, or just about the representational space they describe. With the current evidence, the difference between this paper and Ejaz et al. is purely superficial and methodological, but not conceptual.

Similarly, our hand use model is not equivalent to the individual finger model used in this paper, but is much closer to the kinematic synergy model (we present a single-finger model in the multi-finger experiment, which is much inferior to the natural statistics model). The authors choose to extract synergies by taking 20 postures that serve as ad-hoc samples of the natural statistics of movement. In our paper we chose to use data sets that are representative samples of the natural statistics of movement. Furthermore, we use the whole covariance matrix of the data to compare to the brain activity patterns, not just the first 5 factors.

Indeed, in the analysis of the multi-finger experiment in that paper, we started with very similar methods employed by the authors here, but ultimately decided to present only the RSA methods, as we believe that they show the main point of the similarity of representational space more concisely, than the extraction of some arbitrary number of main factors, which then serve as descriptors of the same space.

3) The mapping of synergies on the cortical sheet is an interesting addition and provides a real potential argument that the kinematic synergies are more than statistical descriptors of representation space, but that the factors themselves have special status. The problem, however, is that currently the one single mapping is not evaluated against many other possible mappings. Thus the authors have not shown that there is something special about the synergies extracted. For this, one would need to a) develop a measure of the "topological orderliness" of the mapping and b) compare the synergy map systematically against an exhaustive set of alternative rotations in the same space (again using optimisation). We actually attempted this analysis on our multi-finger experiment, but preliminary results were not terribly encouraging, as there seem to be rotations of these factors in the same rotational space which gave similarly orderly mappings. If the authors could show in a stringent and convincing fashion that the particular rotation chosen here is more orderly organised than any other possible rotation in the 3-dimensional space (or even conclude after careful evaluation that this is not the case), I think the paper would really increase in quality. Without such analysis Figure 2 remains merely suggestive and anecdotal and the claims not substantiated.

Reviewer #2:

This is excellent work. The study is well thought-out and executed, the paper is clearly written and the analyses are rigorous and appear to have been conducted with care. I suspect that the experimental question asked and results obtained will be of general interest to the sensorimotor research community and the authors do a good job of integrating and motivating their study based on what is currently known.

While I do not have any significant concerns about the work, there are a few points, summarized below, that I think should be considered in a revision.

1) Given the data-driven nature of the analyses used, I found some interpretation of the top 3 principal components, and their relation to the topography noted, lacking. Ultimately the insight provided by PCA in neurophysiology rests on being able to directly link the components to neural activity. While I agree that there is some general map of the components in sensorimotor cortex, their organization has no interpretation. Some interpretation of the components (PCs 1-3) and how they relate to cortex organization might be informative on this front. Otherwise, simply saying PCs are mapped onto cortex is fairly impenetrable for the reader.

2) I found the intermixing of results material into the Discussion section a bit disruptive (e.g., inclusion of Figure 5 and visual control analyses). I think that results material should be described and motivated in the Results section.

3) It was unclear to me why, after measuring from five muscles, and thus obtaining five measures (i.e., the same number of components in the synergy and individual digit models), the data was reduced through PCA and then up-sampled again (through cross-validation methods) to achieve 5 components. This should be fully explained, as no such manipulation was done to the individual digit data.

4) It might be interesting in the supplement to show the results of RSA/MDS for the other models (EMG and individual digit), allowing the reader to make comparisons between all 3 models.

5) I was initially confused in the Discussion why the authors referred to brain areas that were not apparent in the group maps shown in Figure 1 (e.g., ventral premotor cortex). This became apparent, however, after I viewed the actual source data on the MNI-152 brain, as 2-3 subjects show overlap in some of these areas. In any case, the authors should only use the text to refer to what is actually shown in the paper, to avoid such confusion.

6) In the Discussion, I was hoping for some discussion of the bilaterality of the effects observed, which are interesting. I would suggest adding this in a revision.

7) In addition to the visual control used in the paper (i.e., analysis of visual stimulation evoked time points in the RSA mask), I was thinking that an equally good control to show the selectivity of effects to sensorimotor cortex would be to localize much of visual cortex (e.g., based on visual stimulation response vs. rest) and then perform the exact same encoding analyses on those voxels. Visual cortex is well known to be involved in imagery (a key component of the experimental task), and to see how the kinematic model performs in that area would be of interest. If it does fairly well, it would have some significant bearing on what is actually being measured in sensorimotor cortex as well as its underlying organization.

Reviewer #3:

This study investigates whether and to what extent kinematic or muscle synergies are represented in the human motor system during grasping toward virtual objects. To this aim, the authors measured hand kinematics and electromyography (EMG) signals and used this information to create a kinematic synergy model, an individual-digit model and a muscle synergy model. By computing correlations between each of these models and brain activity during grasping of virtual objects, they found that the kinematic synergy model explained the fMRI data better than the other models in various motor areas. The authors concluded that the control of hand postures in the brain is based on kinematic synergies.

This study addressed a very interesting question in the motor control field by using state-of-the-art fMRI analyses and combining various measurements such as kinematics, EMG and fMRI data. I find the results and the conclusions of this study highly relevant for the understanding of the motor control system and for the advancement of neuroprosthetics.

[Editors’ note: what now follows is the decision letter after the authors submitted for further consideration.]

Thank you for resubmitting your work entitled "A synergy-based control is encoded in human motor cortical areas" for further consideration at eLife. Your revised article has been favorably evaluated by Timothy Behrens (Senior editor), Reviewing editor Jody Culham, and by Joern Diedrichsen, one of the original reviewers.

The external reviewer has devoted a considerable amount of time to re-examining the revision and has had detailed conversations with the editors to make the point clear. We are now at the following position: Two of the original reviewers thought the manuscript was strong and we believe all of their concerns have been addressed. We also think the new manuscript is improved and remains of substantial interest but have several remaining concerns (see list below). Out of these concerns there is one point that is really essential and has a major impact on the interpretation of the study. Let us be clear, we think the manuscript is of interest and we intend to publish it, but we do not think that your current analyses support the claim that the synergies reflect actual cortical codes. The figure where you try to make this point (Figure 2) is not subjected to the correct tests to make this point (as you concede in your response to Reviewer 1, point 3).

eLife tries not to subject authors to endless rounds of revision, but we would like to give you one more opportunity to revise the paper to take into account the new comments. Essentially we are asking you to either (a) perform a proper analysis which shows that the synergies do a better job of representing the cortical patterns than other rotations of the covariance space, or (b) be clear in the manuscript that this claim is not supported and that the claim that you can support is that the covariance space as a whole is well-represented.

There is also confusion as to how to interpret the text around this critical analysis, which you describe as "resistant" to rotations in the covariance space. It is clear that the analysis is not invariant to such rotations as described in point 11 below. When we read the text surrounding this description, we arrived at exactly opposite interpretations. One of us thought you were saying the R2 value was invariant to rotations and therefore resistant. Another thought you were saying that the R2 value was NOT invariant, and therefore the overall analysis was resistant. The R2 values change with rotations, and it is important that the manuscript is clear about this. However, it is also clear that this point is not sufficient to demonstrate that the particular eigenvectors that you arrive at via the PCA are the ones that are encoded topologically in cortex. To make this point, you would need to compare them to other rotations of covariance space as described above, and in the previous review.

For clarity for a broad readership, if indeed you can show that the three principal components are better than other rotations, the Reviewing Editor thought that some of your wording in the reply to reviewers would be helpful to include in the manuscript itself ("However, a remarkable body of literature indicate that the highest-ranked kinematic PCs correspond to strictly coded grasping primitives (Santello et al., 1998; Gentner and Classen, 2006; Ingram et al., 2008; Thakur et al., 2008; Gentner et al., 2010; Overduin et al., 2012), see Santello et al., 2013 for review). […] In our study, we examined the first two PCs, which were highly consistent with the literature, along with a third one representing a movement of flexion and thumb opposition (as to grasp a dish or a platter)."

Reviewer #1:

Upon reading the revision, I think the authors have addressed some points raised in the original critique, whereas in other areas I found the response and the changes to the manuscript not satisfactory. This may be partly due to a strong philosophical difference regarding what synergies are and how to interpret the evidence – where I seem to fundamentally disagree with the authors. But I agree that the paper provides an interesting additional and alternative viewpoint to Ejaz et al. (2015) – so I do not want to stand in the way and would recommend publication after a number of clarifications and corrections have been made.

Overall I found the methods and analysis presented in the paper still somewhat obscure and relatively hard to follow. In the interest of clarity and transparency, I would therefore urge the authors to clarify the remaining points in the manuscript. It is the policy of eLife to not restrict length or supplementary materials to allow the presentation of self-contained papers, and the authors should really try to be as clear as possible.

1) "Individual-digit model, based on a somatotopic criterion (Kirsch et al. 2014)" remains as obscure as it was before. I urge you to clarify here in the Introduction. There is no notion of somatotopy in the individual-digit model as far as I can see. Somatotopy implies that it matters that the middle finger is closer to the ring than to the pinkie finger. The Individual-finger model treats all finger movements equally and independently – so it is not "somatotopic".

2) Results section, first paragraph: I found the analysis provided on the additional 4 subjects interesting and thank you for the additional clarification, but would ask you for two things: a) When using temporal averaging of the EMG signal (mean-based EMG analysis), you should use your dimensionality reduction to 5 PCs, as you did for the feature-based analysis. This way we can clearly see that your temporal features, and not the dimensionality reduction, provide the critical difference between the red and blue curve. Note also that you did not replicate exactly the analysis performed in Ejaz et al. (2011), as you skipped the critical prewhitening step. It is not clear whether this analysis would be sensible here, as your gestures are an ad hoc sample from the natural statistics, not an equally-spaced sample of possible finger movements. b) This analysis should be included as supplementary material and cited from the main text.

3) Supplementary file 15: I think the table should be supplemented by a one-sentence description for each feature that is detailed enough to be able to calculate these features without going onto a wild-goose chase in the cited papers. I urge the authors to start with a clear definition of symbols and then give a concise and unambiguous formula.

4) Section “A challenge to individual digit correction representations? The functional topography of hand synergies”: I find this section on functional topography overstated and do believe it requires a major change in tone. Your data show that there is "some" topological organization of the first three synergies, not a "strict" one. Furthermore, some somatotopic clustering can also be shown for individual fingers or, most likely, for other rotations of the synergy vectors, and you have not provided a quantitative comparison with other possible organizations (see point 11).

5) Section “Limitations and methodological considerations”: The limitation section discusses relatively minor points. Two important weaknesses should be added: a) the point that while some clustered representation was shown in sensory-motor regions, you did not convincingly show that this specific set of synergies is more clustered than other rotations of the same vectors; b) that in comparing the different models, the EMG-model had much less ability to discriminate different gestures and that the disadvantage of the muscle model may simply reflect noise levels on your measurement. These are important limitations that should be pointed out.

6) Paragraph two, “Models validation”: Please clarify in the text how the labels of the test set were shuffled. Specifically, if your test set contained 4 repetitions of each of the 20 gestures, did you shuffle the labels of all 80 trials completely randomly, or did you keep the 4 trials for the same gesture together and just give them together a new label (or equivalently shuffle the labels after averaging over the 4 trials)? This difference has important consequences for the variance of your reshuffling statistics.

7) “Every voxel had a score ranging from 0 (if the voxel was never used) to a possible maximum of 380 (if the two left-out patterns could be predicted, for that voxel, in all the 190 iterations).” Please explain this statement better. Do you mean to say the score was the number of times the voxel was included in the 1000 voxels AND got a specific gesture correct?

8) Section “Assessment of the accuracy of the encoding analysis”: Please clearly point out in the text that the weights were randomly shuffled within each column. Please also point out explicitly (I assume that this is true) that the new "PCs" were no longer orthogonal to each other.

9) Now that I think that I understand what the single subject maps are, I think the group-level maps also need some more explanation. The score for each voxel varied between 0 and 380 (as stated above). For each subject, which value was then considered as "successful"? Why was it called a "probability map"? Probability of what?

10) Section “Cortical mapping of the three group synergies”: I disagree that using R2 as a goodness of fit for each individual synergy makes the results invariant to rotations in synergy space. It does not. Maybe we fundamentally misunderstand each other, so I will make my point more concrete. Say you have 2 "synergies" of 5 elements, x1 and x2, and a 5-element data series Y.

x1=[-2 -1 0 1 2]';x2=[1 0 -2 0 1]';

Y=[-2 -2 1 2 2]';

Then the R2-values of each of the columns of X can be calculated as

R2_1 = Y'*x1*inv(x1'*x1)*x1'*Y/(Y'*Y) = 0.847

R2_2 = Y'*x2*inv(x2'*x2)*x2'*Y/(Y'*Y) = 0.039

Now I rotate

R=[cos(0.9) sin(0.9);-sin(0.9) cos(0.9)];

Z=[x1 x2]*R;z1 = Z(:,1);z2 = Z(:,2);

Now the individual R2-values are changed, and hence a mapwise evaluation criterion would also be changed.

R2_1 = Y'*z1*inv(z1'*z1)*z1'*Y/(Y'*Y) = 0.6351

R2_2 = Y'*z2*inv(z2'*z2)*z2'*Y/(Y'*Y) = 0.4629

I hope that clarifies my point and why I think a) the sentence stating that individual R2 values are rotation invariant should be removed and b) any claims regarding a special organisation on the cortex should be made weaker – for stronger claims you would need to compare the C-metric (which I think would fit this purpose) across many different rotations of the same vectors or ways of picking encoding vectors from the 20-dimensional space.
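The worked example in point 10 can be reproduced with a short script. The following is a minimal sketch in plain Python rather than the MATLAB-style notation used above; the helper name `r_squared` is introduced here for illustration only, and it implements the same no-intercept R2 as the expressions in the text.

```python
import math

def r_squared(x, y):
    """R2 of a no-intercept regression of y on the single regressor x."""
    xy = sum(a * b for a, b in zip(x, y))
    xx = sum(a * a for a in x)
    yy = sum(b * b for b in y)
    return (xy * xy) / (xx * yy)

# Values taken from the example in the text
x1 = [-2, -1, 0, 1, 2]
x2 = [1, 0, -2, 0, 1]
y = [-2, -2, 1, 2, 2]

# Columns of [x1 x2] * R, with R = [[cos t, sin t], [-sin t, cos t]] and t = 0.9
c, s = math.cos(0.9), math.sin(0.9)
z1 = [c * a - s * b for a, b in zip(x1, x2)]
z2 = [s * a + c * b for a, b in zip(x1, x2)]

before = [round(r_squared(v, y), 4) for v in (x1, x2)]
after = [round(r_squared(v, y), 4) for v in (z1, z2)]
print(before)  # [0.8471, 0.0392]
print(after)   # [0.6351, 0.4629]
```

The per-regressor values change after the rotation, which is the point being made: an evaluation based on individual R2 values depends on the particular basis chosen within the same space.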

Author response

[Editors’ note: the author responses to the first round of peer review follow.]

In addition, if the authors choose to resubmit the manuscript in a new form to eLife, the other comments of the reviewers should be addressed. In post review discussion, all reviewers agreed with the suggestion that noise ceilings should be reported.

Reviewer #1:

The study "A synergy-based control is encoded in human motor cortical areas" provides an investigation of the MRI patterns associated with the execution of grasp-like hand shapes. The main finding is that the activity patterns of two left-out postures can be better discriminated when using 5 regressors extracted from kinematic synergies than 5 regressors reflecting the unsigned displacement of individual fingers or 5 regressors picked from features of EMG recording of 5 different hand muscles. The conclusions of the study largely overlap with those of an earlier paper from our lab (Ejaz et al., 2015), but the study adds a number of interesting extra aspects to this line of work, including testing the generalisation of the model to new postures and the investigation of the spatial arrangement of these synergies onto the cortical sheet (but see point 3). The current version of the paper, however, has a number of weaknesses that certainly would need addressing.

1) The alternative models (individual fingers and muscle) give the appearance of straw men. The individual finger model uses the L1 norm of movement of each finger – so in contrast to the kinematic synergy model, it does not distinguish between finger flexion and finger extension. This decision appears to be somewhat arbitrary. So, it leaves the reader with the question of whether there is something special about taking the absolute value, or about the specific rotation of these 5 factors in representational space. A more convincing line of investigation would be to try to use optimisation to rotate the 5 linear factors in the kinematic space so as to get the best possible decoding performance, and then test the closeness of this solution with the one provided by the kinematic synergy model.

The Reviewer is absolutely correct in stating that finger flexion and extension fully deserve to be kept as distinct as possible. As a matter of fact, our individual digit model was actually obtained by summing the individual joint angles for each digit, though in the description in the text we erroneously reported that it was the L1-norm of each digit. Consequently, our original individual digit model already distinguished positive and negative joint angles (with respect to the resting posture), thus preserving the difference between flexion and extension. We apologize for this erroneously reported piece of information.

Matters are worse with the muscle model. The authors recorded 5 muscles only, despite the fact that in our experience it is feasible to get 14 or more distinct signals from hand muscles from surface electrodes (Ejaz et al., 2015). These may not always reflect individual muscles, but that is hardly important if we only want to obtain a representative picture of the space of muscle activity.

Previous reports indicate that a reliable gesture discrimination can be achieved from seven (Weiss & Flanders, 2004; Shyu et al., 2002) or fewer muscles (Ganesh et al., 2007; Ahsan et al., 2011) and that, since EMG data are notoriously prone to artifacts such as cross-talk or amplitude cancellation (i.e., the masking of small or deep muscles by bigger or superficial ones), it is fairly important to place the electrodes as close as possible to individual muscles. In our study, both the number of recorded muscles and the analyses performed were able to guarantee a reliable discrimination between gestures, as indicated by the rank accuracy measure, which was well above the chance level for each subject (Supplementary file 1M and Figure 6).

In the most advanced EMG devices, the number of acquisition sites on the forearm can rise dramatically up to 128 or 192 (Gazzoni et al., 2014; Muceli et al., 2014). However, due to the above-mentioned cross-talk, the potential benefit that could derive from an increased number of EMG electrodes is still much debated: indeed multi-channel setups show a high collinearity, since many channels inevitably record the same muscles. As a matter of fact, a recent work that directly compared multiple channel setups – using either 6, 8, 16 or 192 channels – and extracted five synergies, found no significant differences among the 6, 16, 128 or 192 channel configurations (Muceli et al., 2014).

Nevertheless, in order to explore directly the Reviewer’s critique, we tested an independent sample of four healthy young subjects who performed the very same task used in our current paper to verify the impact of the number of EMG channels on the ‘muscle’ model. To this aim, for each subject we acquired six runs, each comprising twenty trials of delayed grasp-to-use motor acts towards visually-presented objects. Data were acquired using a Bagnoli 16 EMG recording device (Delsys Inc, Natick, MA, USA). Sixteen electrodes were placed on the hand and forearm using the same placement adopted in our protocol (see revised Methods and Appendix figure 1) as well as in three distinct protocols with different spatial resolutions (Bitzer and van der Smagt, 2006; Ganesh et al., 2007; Ejaz et al., 2015).

To obtain an overall estimation of the impact of the number of EMG recording sites and different preprocessing steps, data were analyzed using two distinct procedures: the first suggested by the Reviewer (Ejaz et al., 2015), and the other one based on feature extraction, as described in our Methods. Both preprocessing procedures were assessed with a leave-one-out cross-validation algorithm.

Using the first procedure (Ejaz et al., 2015), data from the sixteen EMG channels (acquired at 1,000 Hz) were de-trended, rectified, and low-pass filtered (fourth-order Butterworth filter, 40 Hz). The time series from each gesture and channel were later averaged over a 2.5-second time window (2,500 time points). We obtained twenty 16x1 vectors for each run. Then, we tested the discriminability of each individual movement (probe element) from each run against all the movement vectors averaged across the five remaining runs (rest dataset). This was performed with a leave-one-out rank accuracy procedure (Mitchell et al., 2004) in which the similarity between the probe element and each of the vectors in the rest dataset was assessed using the Mahalanobis distance. If the distance between the probe element and the vector from the rest dataset that represents the same gesture is lower than the distances between the probe element and the other elements of the rest dataset, one may state that the element can be discriminated. The accuracies were tested against null distributions of 10,000 values, generated by shuffling the labels in the rest dataset. This rank accuracy procedure provides a measure of the quality of each individual channel configuration: the higher the accuracy, the more informative the configuration. Since we wanted to estimate the discriminability as a function of the number of channels, we performed two tests, whose results are shown in Appendix figure 2.

First, we generated all the possible configurations that could be obtained by choosing the channels randomly. The rank accuracy procedure was performed for each of these channel configurations. The result of this procedure is shown in Appendix figure 2 (red line).
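The rank accuracy step and its label-shuffling null distribution can be illustrated with a minimal sketch. Several simplifications are assumptions made here and are not taken from the manuscript: squared Euclidean distance stands in for the Mahalanobis distance, the rank is normalised so that 1.0 means the matching gesture is the nearest and 0.0 the farthest, and the toy vectors and function names are hypothetical.

```python
import random

def sq_dist(u, v):
    # Squared Euclidean distance; a stand-in for the Mahalanobis distance.
    return sum((a - b) ** 2 for a, b in zip(u, v))

def rank_accuracy(probe, true_label, references):
    """Normalised rank of the matching reference vector.

    `references` maps gesture label -> vector (the "rest dataset").
    Returns 1.0 if the matching gesture is the nearest, 0.0 if it is the farthest.
    """
    ordered = sorted(references, key=lambda lab: sq_dist(probe, references[lab]))
    rank = ordered.index(true_label)  # 0 means nearest
    return 1.0 - rank / (len(references) - 1)

def permutation_null(probe, true_label, references, n_perm, rng):
    """Null distribution obtained by shuffling the labels of the rest dataset."""
    labels = list(references)
    vectors = [references[lab] for lab in labels]
    null = []
    for _ in range(n_perm):
        shuffled = labels[:]
        rng.shuffle(shuffled)
        null.append(rank_accuracy(probe, true_label, dict(zip(shuffled, vectors))))
    return null

# Toy rest dataset with four gestures described by two channels each
references = {"A": [0.0, 0.0], "B": [1.0, 0.0], "C": [0.0, 2.0], "D": [3.0, 3.0]}
probe = [0.9, 0.1]  # a noisy repetition of gesture "B"

acc = rank_accuracy(probe, "B", references)
null = permutation_null(probe, "B", references, 1000, random.Random(0))
p_value = (1 + sum(v >= acc for v in null)) / (1 + len(null))
print(acc)  # 1.0
```

In an actual analysis the probe would be a single-run vector, the rest dataset would hold the averages across the remaining runs, and the procedure would be repeated for every gesture, run, and channel configuration.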

In the feature-based procedure, we computed, for each trial, eighty-two features from each channel, as described in the Methods section of our manuscript. Each individual gesture was then described as a combination of five PCs (muscle synergies), extracted from the features pooled across channels. Subsequently, a machine learning procedure based on a rank accuracy measure was employed to test to what extent the gestures could be discriminated based on the five muscle synergies (see Methods). As done with the procedure previously described, we tested all configurations that could be obtained by randomly selecting 1 to 16 electrodes (Appendix figure 2, blue line), as well as four subsamples according to the setups described above (Appendix figure 2, light blue dots).

The results show that, for the feature-based procedure, the accuracy increases as a function of the number of electrodes, reaching a peak with 16 channels (mean ± SEM: 81.6 ± 2%); the mean accuracy across all the possible configurations with five channels is 73.5 ± 2.5%. The accuracy obtained with the setup adopted in our current paper was 74.2 ± 6.4%, while the electrode placement described in Ejaz et al. (2015) led to an accuracy value of 79.8 ± 2.4%. For the procedure described in Ejaz et al. (2015), five channels yielded the highest accuracies among all the possible random configurations (value: 65.9 ± 2.1%); accuracy decreased when lower or higher numbers of electrodes were recorded. In these data, the accuracy for the configuration of five channels adopted in our paper was 69 ± 1.6%, higher than the one adopted in Ejaz et al. (59.6 ± 2.4%).

In summary, these results indicate that:

1) The extraction of features from the EMG signal obtained using the methodological approach adopted in our paper leads to a better discrimination of complex hand gestures

2) While the feature-based approach seems to benefit from EMG recordings with more channels, the low gain (5.5%) when raising the number of channels to 16, as well as the above-chance discrimination achieved when analyzing five channels, clearly suggests that the number of muscles recorded in our paper represents the muscle space with a reasonable accuracy.

3) The results from the electrode placement adopted by Ejaz et al. (2015) show a lower accuracy, despite its higher dimensionality. A possible explanation for this finding may be related to the different experimental designs: when more complex gestures are considered (as in our design), data are probably more prone to electrode shifts or amplitude cancellation artifacts, whose impact increases with the number of electrodes in the EMG acquisition setup. Moreover, the better results achieved by using the feature-based approach, as compared to the procedure suggested by the Reviewer, may indicate that feature extraction represents a better technique for exploiting the advantages of EMG setups with higher spatial resolution, consistent with the data already present in the literature (Hudgins et al., 1993; Zecca et al., 2002).

Despite the better discrimination accuracy obtained with the feature-based procedure using a higher number of channels, we found that a ‘muscle’ model based on many EMG channels still underperforms relative to the models obtained from kinematic data in encoding fMRI responses. Therefore, we conclude that the worse performance of the ‘muscle’ model is likely to represent an intrinsic limitation of surface EMG signals rather than a flaw of the recordings and analyses performed in our paper. For the reasons detailed below (Point 1c), we would expect that even high-density EMG data may not guarantee a better fit to brain signals than electrode configurations with a lower dimensionality.

The extracted features from the EMG signals appear obscure to me; and ultimately, the ability to distinguish between postures based on this data is very bad, indicating that most of these numbers reflect noise. Given these large a-priori differences in the quality of the models, I think any subsequent difference in how well fMRI activity patterns can be predicted become utterly unconvincing. So I think the authors need to work harder on trying to equate the reliability of their models (for a possible method, see Ejaz et al. 2015, supplementary methods).

EMG data are complex signals, as they represent indirectly the activity of one or more underlying muscles (Farina et al., 2004; Reaz et al., 2006; Farina et al., 2014). However, a well-consolidated assumption is that the instantaneous value of the EMG signal contains little or no information (Zecca et al., 2002). In addition, during complex movements the temporal structure of the EMG signal varies a great deal. This makes the most elementary frequency or amplitude-based parameters (e.g. mean absolute value, median frequency) insufficient for the classification of complex movements: an accurate classification may instead benefit very much from a greater number of signal descriptors (Hudgins et al., 1993). For this reason, the extraction of time-domain or frequency-domain features is widely applied when EMG signals are used to classify distinct hand gestures.

Consequently, the most commonly adopted approach relies on the extraction of a large number of features, which are later analyzed through classification algorithms or neural networks for posture discrimination (Zardoshti-Kermani et al., 1995; Ahsan et al., 2011; Chowdhury et al., 2013). The features chosen in our study are commonly employed for classifying gestures from EMG recordings of hand muscles (for instance, see Zecca et al., 2002; Boostani and Moradi, 2003; Tkach et al., 2010; Kendell et al., 2012; Chowdhury et al., 2013). The matrix containing those features was subsequently reduced in dimensionality, retaining only the five most relevant components, which accounted for 72.64% of the variance (mean across subjects). In addition, the rank accuracy measure showed that the twenty gestures could be successfully discriminated by EMG data in all the subjects (average accuracy value across subjects: 72%; individual subject data are reported in Supplementary File 16).

Nonetheless, even if feature extraction is widely used, we followed the Reviewer’s suggestion and adopted the elegant RSA-based procedure performed by Ejaz and colleagues (Ejaz et al., 2015, Supplementary Methods). EMG signals from our five-channel setup were resampled at 1,000 Hz, de-trended, rectified and low-pass filtered with a fourth-order Butterworth filter at 40 Hz. The time series from each gesture and channel were later averaged over a 2.5-second time window (2,500 time points). As a result, we described each gesture using a 5x1 vector. A representational space (RS) was then obtained from the pairwise comparisons of all gestures, performed using the Mahalanobis distance. That RS turned out to be highly similar to our muscle model (Pearson’s r: 0.78, p<0.0001). This suggests that feature extraction is able to provide a reliable description of the muscle space and that a different data analysis would not have had a significant impact on the model we derived from EMG data.

We believe that two additional details should be pointed out. First, we described our EMG signals using a high number of features, subsequently reduced to the five most relevant dimensions with PCA. Ejaz et al. (2015) described the EMG signal for each movement exclusively through its average over time. While this is certainly suited for simpler single or multi-digit movements such as the ones examined in their study, many pieces of evidence in the literature indicate that complex hand actions, including grasping movements, require a more detailed description (Hudgins & Parker, 1993; see Zecca et al., 2002 for review). Second, an analysis limited to a descriptive RSA-based approach would not have allowed us to perform the rank accuracy procedures to evaluate the discrimination of individual postures from the EMG data, which was an important and innovative aim of our study.

2) While I think that some of the techniques used in the paper are interesting and promising, I believe that they are currently not going beyond the RSA analysis presented in Ejaz et al. Unfortunately, in the discussion the authors misconstrue the previous evidence and perpetuate some misunderstandings that are all too common in the synergy field. For example the authors state that "these findings, however, provide no clue regarding the extent to which the brain may control the hand using functional modules" and "their model (hand usage model) was therefore similar to the individual digit model adopted in the present study", showing that they clearly do not appreciate the tight connection between RSA and the methods chosen here.

It is important to point out that the matrix of pairwise distances contains the same information as the covariance matrix between experimental conditions (and can easily be transformed into it). Extracting PCA factors from this covariance matrix provides a description of the same statistical quantity, only that it throws away a certain proportion of the information (and by just considering the principal vectors also disregards the relative importance of the factors). Therefore RSA and extraction of principal components from the covariance reflect highly related information – and I do not think that the authors have made any convincing case that anything can be learned from the PCA approach that cannot be learned from looking at the whole space. This boils down to the key question in the synergy field of whether there is something special about the principal vectors (or synergies) themselves, or just about the representational space they describe. With the current evidence, the difference between this paper and Ejaz et al. is purely superficial and methodological, but not conceptual.
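The relation between pairwise distances and the second-moment matrix of the conditions can be made concrete with a small numerical sketch. It assumes squared Euclidean distances and uses the standard double-centering identity G = -1/2 * H * D2 * H, where H = I - (1/n) * ones; the toy patterns and function names are hypothetical and are not taken from either study.

```python
def centered_gram(points):
    """Inner-product matrix between conditions after removing the mean pattern."""
    n, d = len(points), len(points[0])
    mean = [sum(p[j] for p in points) / n for j in range(d)]
    c = [[p[j] - mean[j] for j in range(d)] for p in points]
    return [[sum(a * b for a, b in zip(c[i], c[k])) for k in range(n)] for i in range(n)]

def squared_distances(points):
    """Matrix of squared Euclidean distances between all pairs of conditions."""
    return [[sum((a - b) ** 2 for a, b in zip(p, q)) for q in points] for p in points]

def gram_from_distances(d2):
    """Double centering: subtract row and column means, add the grand mean, scale by -1/2."""
    n = len(d2)
    row = [sum(r) / n for r in d2]
    col = [sum(d2[i][k] for i in range(n)) / n for k in range(n)]
    grand = sum(row) / n
    return [[-0.5 * (d2[i][k] - row[i] - col[k] + grand) for k in range(n)]
            for i in range(n)]

# Four toy conditions measured on three channels
patterns = [[1.0, 2.0, 0.0], [0.0, 1.0, 3.0], [2.0, 2.0, 1.0], [4.0, 0.0, 1.0]]
g_direct = centered_gram(patterns)
g_from_d = gram_from_distances(squared_distances(patterns))
max_diff = max(abs(a - b) for ra, rb in zip(g_direct, g_from_d) for a, b in zip(ra, rb))
print(max_diff < 1e-9)  # True
```

Under these assumptions, principal components extracted from the recovered matrix are the same as those extracted from the original mean-centred patterns, which is why the two descriptions carry highly related information.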

The main finding of our paper is that synergies can be encoded as such in the human brain, and that they may represent functional modules through which the brain controls the hand to achieve a wide variety of flexible and stable postures. The extraction of Principal Components from the covariance matrix is therefore a cornerstone of our current work, since we were interested in demonstrating that a model based on synergies (or kinematic PCs) can predict brain activity in motor areas.

The Reviewer is right in pointing out that “the matrix of pairwise distances contains the same information as the covariance matrix between experimental conditions (and can easily be transformed into it)” and that “RSA and extraction of principal components from the covariance reflect highly related information”. The high fraction of variance explained by the kinematic synergy model reflects a high degree of similarity between the whole space and its five principal dimensions. However, we can hardly see this as a limitation of our approach. On the contrary, we observed that a small number of PCs can describe complex postural configurations while losing only a small portion of information, in line with a large body of literature (e.g., Santello et al., 1998, 2002; Gentner and Classen, 2006; Ingram et al., 2008; Thakur et al., 2008; Overduin et al., 2012). For this reason, the high similarity between the whole space and its PCs represents a success of the dimensionality reduction procedure and a full validation of the reliability of synergies as descriptions of complex hand postures.

According to the synergy hypothesis, the kinematic Principal Components reflect clearly defined motor primitives and they are not simply quantitative descriptors of hand posture (Santello et al., 1998; Gentner and Classen, 2006; Ingram et al., 2008; Thakur et al., 2008). Following the suggestions made by the other Reviewers as well (see Reviewer #2, Comment #1 and Reviewer #3, Comment #3), we have shown that the first PCs represent elementary grasps (see response below, Methods and Video 1). A direct link between kinematic PCs and motor primitives raises questions about whether hand postures are encoded in the brain as combinations of those elementary grasps, thus confirming modularity principles. To verify this hypothesis, our encoding analyses were purposely designed to predict brain activity as a function of kinematic synergies. In addition, the subsequent decoding procedure tested the reliability of synergies as models to decode hand postures from brain activity.

To conclude, we do appreciate the high quality of the analyses and the innovative results reported in the elegant study by Ejaz et al. (2015). As a matter of fact, we are convinced that our paper complements – rather than simply overlaps with – the findings by the Ejaz et al. (2015) study. Specifically, the complementary nature of our paper is due to significant differences between the objectives of the study by Ejaz et al. (2015) – aimed mainly at providing reliable descriptions of kinematic, muscle and brain activity spaces through RSA – and our work, aimed instead at assessing modularity in hand posture control using machine-learning procedures. Thus, our findings do possess a conceptual novelty as compared to the Ejaz et al. (2015) study.

Similarly, our hand use model is not equivalent to the individual finger model used in this paper, but is much closer to the kinematic synergy model (we present a single-finger model in the multi-finger experiment, which is much inferior to the natural statistics model). The authors choose to extract synergies by taking 20 postures that serve as ad-hoc samples of the natural statistics of movement. In our paper we chose to use data sets that are representative samples of the natural statistics of movement. Furthermore, we use the whole covariance matrix of the data to compare to the brain activity patterns, not just the first 5 factors.

Indeed, in the analysis of the multi-finger experiment in that paper, we started with methods very similar to those employed by the authors here, but ultimately decided to present only the RSA methods, as we believe that they show the main point of the similarity of representational space more concisely than the extraction of some arbitrary number of main factors, which then serve as descriptors of the same space.

We agree with the Reviewer on this point: for descriptive purposes, considering the whole space surely has many benefits. In addition, RSA methods are seemingly more robust, as they allow the postural or functional models to be compared while taking into account all the data, without performing any reduction.

Nonetheless, as already stated above, we should point out that our paper has a different objective and a different approach. Specifically, we aim at showing that the first PCs of the posture space, in addition to being valid descriptors of that space, do actually modulate brain activity within motor areas, and thus they represent useful modules that can be exploited for prosthetic control and Brain Computer Interface (BCI) applications.

All the Reviewers raised the issue regarding the interpretation of such PCs, as – in the absence of a more detailed description – they could, at first, appear arbitrary. However, a remarkable body of literature indicates that the highest-ranked kinematic PCs correspond to strictly coded grasping primitives (Santello et al., 1998; Gentner and Classen, 2006; Ingram et al., 2008; Thakur et al., 2008; Gentner et al., 2010; Overduin et al., 2012; see Santello et al., 2013 for review). Consistently, in the above reports, a first synergy for grasping was identified, which modulates abduction-adduction and flexion-extension of all the finger joints (both proximal and distal), while a second synergy reflects thumb opposition and flexion-extension of the distal joints only. Maximizing the first synergy therefore leads to a posture resembling a power grasp, while the second one is linked to pinch movements directed towards smaller objects. In our study, we examined the first two PCs, which were highly consistent with the literature, along with a third one representing a movement of flexion and thumb opposition (as if to grasp a dish or a platter). Since we thought that a graphical representation of the meaning of synergies could be a useful addition to the interpretation of such PCs, we plotted the time course of hand movements corresponding to these elementary grasps. The plots are 2s-long videos showing three movements from the minimum to the maximum values of PCs 1, 2 and 3, respectively, expressed as sets of twenty-four joint angles averaged across subjects (Video 1).
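As a rough illustration of how such a movement can be generated, the sketch below extracts PCs from a posture matrix and sweeps one PC from its minimum to its maximum score. The data are random placeholders, and the function name `sweep_synergy` and the number of frames are assumptions made for the example; the actual rendering pipeline behind Video 1 is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(1)
n_postures, n_joints = 20, 24                      # 20 grasps x 24 joint angles
angles = rng.normal(size=(n_postures, n_joints))   # placeholder, not real kinematics

mean_posture = angles.mean(axis=0)
centred = angles - mean_posture

# PCA via SVD: rows of vt are the synergies (unit-norm joint-angle patterns)
u, s, vt = np.linalg.svd(centred, full_matrices=False)
scores = u * s                                     # weight of each synergy in each posture
explained = s ** 2 / np.sum(s ** 2)                # fraction of variance per synergy

def sweep_synergy(k, n_frames=50):
    """Joint-angle frames moving from the minimum to the maximum score of synergy k."""
    weights = np.linspace(scores[:, k].min(), scores[:, k].max(), n_frames)
    return mean_posture + weights[:, None] * vt[k]

frames_pc1 = sweep_synergy(0)                      # (n_frames, n_joints) array
```

Each row of `frames_pc1` is a full set of joint angles, so the sequence could be passed to any hand model for display.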

As indicated above, synergies are meaningful combinations of digit movements: as elementary grasps, they can be considered higher-level representations with respect to single digit or multi-digit movements. For this reason, the kinematic synergy model was tested, in our paper, against an individual digit model in which the five digits are considered independently.

In addition, we would like to point out that the hand movement statistics used in the paper by Ejaz et al. (2015) were taken from a previous study (i.e., Ingram et al., 2008) in which the kinematic PCs were later extracted. Notably, the plots from that report show that the first PCs were highly similar to ours (Figure 3b in Ingram et al., 2008).

3) The mapping of synergies on the cortical sheet is an interesting addition and provides a real potential argument that the kinematic synergies are more than statistical descriptors of representation space, but that the factors themselves have special status. The problem, however, is that currently the one single mapping is not evaluated against many other possible mappings. Thus the authors have not shown that there is something special about the synergies extracted. For this, one would need to a) develop a measure of the "topological orderliness" of the mapping and b) compare the synergy map systematically against an exhaustive set of alternative rotations in the same space (again using optimisation). We actually attempted this analysis on our multi-finger experiment, but preliminary results were not terribly encouraging, as there seem to be rotations of these factors in the same rotational space which gave similarly orderly mappings. If the authors could show in a stringent and convincing fashion that the particular rotation chosen here is more orderly organised than any other possible rotation in the 3-dimensional space (or even conclude after careful evaluation that this is not the case), I think the paper would really increase in quality. Without such analysis Figure 2 remains merely suggestive and anecdotal and the claims not substantiated.

We completely agree with the Reviewer on this point: the assessment of the “topological orderliness” of hand synergies on the cortex is an important addition to our paper, as it provides persuasive data towards a real encoding of synergies as the building blocks of hand control in the human brain. Nonetheless, we should note that the demonstration of topological gradients is a major challenge for functional MRI. Since the very first studies which posited a topological organization of some brain regions, e.g., early visual or auditory areas (Sereno et al., 1995; DeYoe et al., 1996; Formisano et al., 2003), most reports have discussed this type of organization in a purely descriptive way, displaying the cortical activations in each experimental subject and describing each map individually (DeYoe et al., 1996; Formisano et al., 2003). This approach is typical of many studies in which portions of cortex are mapped, often using ultra-high-field fMRI (Formisano et al., 2003; Olman et al., 2010; Sanchez-Panchuelo et al., 2014).

Papers that provide methods to measure topography are scarce. For instance, Engel et al. (1997) related the visual field eccentricity mapped in V1 to the distance from the occipital pole, obtaining a value that represents the topographical organization of eccentricity maps.

In the present study, we used methods developed by other authors to analyze electrocortical recordings (Yarrow et al., 2014) to measure the orderly mapping of synergies across the regions recruited by the encoding procedure. The method is based on the comparison of the map space (the physical distance between voxels) with the feature space (the distance between voxels based on their information content). The two spaces can be correlated with each other, yielding an index (C parameter) which is conceptually similar to the correlation coefficient between two RSs (Yarrow et al., 2014). The C coefficient reflects the similarity between the arrangement of voxels in space and the arrangement of their information content: high values indicate that voxels that contain similar information are also spatially close. This closeness can be considered as a demonstration of a topographical organization, as the distance between the content of voxels is reflected by their physical distance.

Here, we measured the physical distance (voxel space) as the standardized Euclidean distance between each pair of voxels. The feature space was computed as the standardized Euclidean distance between the three synergy weights, as defined by their R2, for each pair of voxels. This analysis yielded a significant C coefficient (C=0.192; p-value=0.0383), denoting similarity between the feature space and the voxel space; this supports the existence of a topographical organization of synergies across the cortical surface.
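The logic of this index can be sketched as follows. This is an illustration under stated assumptions rather than the exact procedure of Yarrow et al. (2014): the index is computed here as a Pearson correlation between the two sets of pairwise distances, significance is assessed by randomly reassigning feature vectors to voxels, and the voxel coordinates and R2 features are simulated. The function names `pairwise_seuclidean` and `topography_index` are hypothetical.

```python
import numpy as np

def pairwise_seuclidean(x):
    """Standardised Euclidean distance between all pairs of rows (upper triangle)."""
    z = x / x.std(axis=0, ddof=1)             # scale each dimension by its standard deviation
    diff = z[:, None, :] - z[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return dist[np.triu_indices(len(x), k=1)]

def topography_index(coords, features, n_perm=500, seed=0):
    """Correlation between map-space and feature-space distances, with a permutation p-value."""
    rng = np.random.default_rng(seed)
    d_map = pairwise_seuclidean(coords)
    c_obs = np.corrcoef(d_map, pairwise_seuclidean(features))[0, 1]
    null = np.empty(n_perm)
    for i in range(n_perm):
        # Break the link between position and content by shuffling voxels
        shuffled = features[rng.permutation(len(features))]
        null[i] = np.corrcoef(d_map, pairwise_seuclidean(shuffled))[0, 1]
    p_value = (1 + np.sum(null >= c_obs)) / (n_perm + 1)
    return c_obs, p_value

rng = np.random.default_rng(2)
coords = rng.uniform(0, 10, size=(150, 3))    # simulated voxel positions
# Simulated synergy features that vary smoothly with position, plus noise
features = coords * np.array([1.0, 0.5, 2.0]) + rng.normal(scale=0.5, size=(150, 3))
c_coef, p_val = topography_index(coords, features)
```

In this simulation the features are built to depend on position, so the index is high by construction; in real data its magnitude depends on how orderly the map actually is.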

We thank again the Reviewer for this comment, which gave us the opportunity to assess the synergy-based organization using a procedure that had not yet been applied to fMRI data.

The synergy map displayed in Figure 2 was obtained by plotting the R2 coefficient between each synergy and the fMRI data. This is particularly important because R2, which represents the goodness-of-fit of the model, is resistant to the possible rotations of the three components in the same space, while the β (the angular coefficient of the regression) can be altered by the possible alternate configurations (e.g., rotations) of the principal components. The choice of the goodness-of-fit better suits our task, as it can be divided into two sub-tasks with opposite sign (i.e., performing the grasping movement and returning the hand to its original position), which cannot be disentangled due to the low temporal resolution of fMRI.

Overall, our results show that each synergy coefficient, being expressed as a measure of the goodness-of-fit instead of an individual β value, represents a unique measure which can be mapped onto the cortical surface without the need for testing it against alternate rotated configurations. In addition, the test performed shows that the feature space is strongly connected to the voxel space, providing the demonstration of the topological arrangement of hand synergies in the fronto-parietal cortical network specifically recruited by the encoding procedure.

Reviewer #2: 1) Given the data-driven nature of the analyses used, I found some interpretation of the top 3 principal components, and their relation to the topography noted, lacking. Ultimately the insight provided by PCA in neurophysiology rests on being able to directly link the components to neural activity. While I agree that there is some general map of the components in sensorimotor cortex, their organization has no interpretation. Some interpretation of the components (PCs 1-3) and how they relate to cortex organization might be informative on this front. Otherwise, simply saying PCs are mapped onto cortex is fairly impenetrable for the reader.

Since the main aim of our work was to demonstrate that PC-based representations of hand postures are actually encoded as such in human motor cortical areas, it is particularly important to unveil what those principal components may represent. The Reviewer’s request for an interpretation of the first PCs gives us an opportunity to clarify the meaning of synergies as the building blocks of hand postures.

Synergies are fundamental motor modules that are controlled as weighted combinations during hand movements. They are defined as Principal Components and usually ranked by the fraction of explained variance; the first paper on kinematic synergies (Santello et al. 1998, Figure 6) identified a first synergy which modulates abduction-adduction and flexion-extension of all the finger joints (both proximal and distal), while a second synergy reflects thumb opposition and flexion-extension of the distal joints only. Consequently, a modulation of the first synergy leads to a posture resembling a power grasp, while the second one is linked to pinch movements directed towards small objects. It is noteworthy that a similar organization was also identified by subsequent independent studies (Gentner and Classen, 2006; Ingram et al., 2008; Thakur et al., 2008; Gentner et al., 2010).

In our study, we showed that the first three kinematic PCs, which account for a major fraction of variance, are topographically arranged on the cortical surface. We added a more detailed interpretation of those PCs in the Discussion. The first two synergies are highly consistent with those reported in the literature, representing power and pinch grasps, respectively, while the third one reflects movements of flexion and thumb opposition (as to grasp a dish or a platter).

Following the Reviewer’s suggestion, we developed a method to represent the kinematic PCs clearly. The video represents the whole course of a hand movement from the minimum to the maximum values of PCs 1, 2 and 3, respectively. Please refer also to the response to issue #2 by Reviewer #1.

2) I found the intermixing of results material into the Discussion section a bit disruptive (e.g., inclusion of Figure 5 and visual control analyses). I think that results material should be described and motivated in the Results section.

We agree with this suggestion. The visual control analyses have been moved to a new paragraph, as described in the response to comment #7.

3) It was unclear to me why, after measuring from five muscles, and thus obtaining five measures (i.e., the same number of components in the synergy and individual digit models), the data was reduced through PCA and then up-sampled again (through cross-validation methods) to achieve 5 components. This should be fully explained, as no such manipulation was done to the individual digit data.

In accordance with the Reviewer’s suggestion and to strengthen our muscle model, we performed the EMG data analysis avoiding the reduction of the number of channels. Therefore, we were able to achieve a model with five dimensions, fully consistent with the alternative descriptions employed in the study. The features were extracted from the five channels separately; later, the five muscle synergies were obtained as linear combinations of features, by computing PCA on the overall set of features. In this way, we could avoid a great deal of data manipulation, achieving a muscle model built on simpler analyses. Changes have been described and reported in the revised version of the manuscript.

4) It might be interesting in the supplement to show the results of RSA/MDS for the other models (EMG and individual digit), allowing the reader to make comparisons between all 3 models.

The MDS procedure was performed on the representational spaces drawn during RSA. RSs from each of the three models (kinematic synergy, muscle synergy and individual digit) were significantly correlated (p<0.0001, corrected for Mantel test, 10,000 iterations). Hence, as their MDS would be very similar, they were omitted for brevity.
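For readers unfamiliar with the Mantel test, a minimal sketch is given below. It assumes Euclidean dissimilarity matrices and a Pearson correlation between their upper triangles, and it uses two simulated models that share a latent structure; the helper names `euclidean_rdm` and `mantel` are hypothetical, and the multiple-comparison correction mentioned above is not included.

```python
import numpy as np

def euclidean_rdm(x):
    """Square matrix of Euclidean distances between the rows of x."""
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def mantel(rdm_a, rdm_b, n_perm=10000, seed=0):
    """Correlation between two dissimilarity matrices with a permutation p-value."""
    rng = np.random.default_rng(seed)
    n = rdm_a.shape[0]
    iu = np.triu_indices(n, k=1)
    r_obs = np.corrcoef(rdm_a[iu], rdm_b[iu])[0, 1]
    count = 0
    for _ in range(n_perm):
        p = rng.permutation(n)
        # Permute rows and columns of one matrix together to keep it a valid RDM
        r_perm = np.corrcoef(rdm_a[np.ix_(p, p)][iu], rdm_b[iu])[0, 1]
        count += int(r_perm >= r_obs)
    return r_obs, (count + 1) / (n_perm + 1)

rng = np.random.default_rng(3)
latent = rng.normal(size=(20, 5))                       # shared structure over 20 gestures
model_a = latent + 0.3 * rng.normal(size=latent.shape)  # simulated model 1
model_b = latent + 0.3 * rng.normal(size=latent.shape)  # simulated model 2
r_obs, p_value = mantel(euclidean_rdm(model_a), euclidean_rdm(model_b))
```

Permuting rows and columns jointly is what distinguishes this test from a naive correlation test on the distance values, whose entries are not independent.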

5) I was initially confused in the Discussion why the authors referred to brain areas that were not apparent in the group maps shown in Figure 1 (e.g., ventral premotor cortex). This became apparent, however, after I viewed the actual source data on the MNI-152 brain, as 2-3 subjects show overlap in some of these areas. In any case, the authors should only use the text to refer to what is actually shown in the paper, to avoid such confusion.

We thank the Reviewer for bringing up this potentially confusing passage, which could lead to a misinterpretation of the findings. The analyses were conducted on 3D volume data, then the results were mapped onto the cortical surface using specific software (Brainvisa 4.4). Due to the interpolation which occurs during the mapping process, some clusters located in deeper brain regions did not appear on the surface view. However, as those regions (e.g., ventral premotor cortex) were actually recruited by the encoding procedure, they were referred to in the Discussion. We modified the figure legend accordingly.

6) In the Discussion, I was hoping for some discussion of the bilaterality of the effects observed, which are interesting. I would suggest adding this in a revision.

We are grateful to the Reviewer for this comment. Indeed, the bilateral recruitment of motor, supplementary motor and parietal areas is quite interesting and, previously, other authors have posited a bilateral engagement of those regions during complex hand movements. This aspect has now been discussed more extensively.

7) In addition to the visual control used in the paper (i.e., analysis of visual stimulation evoked time points in the RSA mask), I was thinking that an equally good control to show the selectivity of effects to sensorimotor cortex would be to localize much of visual cortex (e.g., based on visual stimulation response vs. rest) and then perform the exact same encoding analyses on those voxels. Visual cortex is well known to be involved in imagery (a key component of the experimental task), and to see how the kinematic model performs in that area would be of interest. If it does fairly well, it would have some significant bearing on what is actually being measured in sensorimotor cortex as well as its underlying organization.

We agree with the Reviewer that the role of visual cortex in motor imagery is significant and that early visual areas likely participate in action preparation, as suggested by a very recent study (Gutteling et al., 2015), in which different actions were decoded based on brain activity in visual regions. To rule out such potential confounds and to verify that synergies modulate brain activity exclusively in sensorimotor regions, we considered visual activations vs. rest (measured five seconds after visual stimulus presentation) and performed the encoding procedure on the resulting mask. This procedure, along with the encoding of visual-related responses in sensorimotor regions originally included in the paper, has been described in a new section (Control analyses) in the Results section. The results show that neither early activity in motor regions nor activity in extrastriate visual regions encodes kinematic synergies above chance, thereby excluding those imagery-related confounds.

[Editors' note: the author responses to the re-review follow.]

1) "Individual-digit model, based on a somatotopic criterion (Kirsch et al. 2014)" remains still as obscure as it was before. I urge you to clarify here in the Introduction. There is no notion of somatotopy in the individual-digit model as far as I can see. Somatotopy implies that it matters that the middle finger is closer to the ring than to the pinkie finger. The Individual-finger model treats all finger movements equally and independently – so it is not "somatotopic".

The Reviewer is correct in pointing out that our individual digit model considered only the independence of digit representations, without taking into account their orderly somatotopic arrangement on the cortex. The individual digit model was based on the paper by Kirsch and colleagues (Kirsch et al., 2014), which compared the firing rate of M1 neurons with individual finger kinematics and with their Principal Components. In our paper, we described the individual digit model as somatotopic because a somatotopic arrangement requires the presence of discrete, independent single digit representations at a neuronal level. If the discharge pattern of an M1 neuron is correlated with individual digit kinematics, that neuron can be labeled as digit specific. Thus, following the Reviewer’s comment, we removed the potentially misleading references to somatotopy in the description of the individual digit model.

2) Results section, first paragraph: I found the analysis provided on the additional 4 subjects interesting and thank you for the additional clarification, but would ask you for two things: a) When using temporal averaging of the EMG signal (mean-based EMG analysis), you should use your dimensionality reduction to 5 PCs, as you did for the feature-based analysis. This way we can clearly see that your temporal features, and not the dimensionality reduction, provide the critical difference between the red and blue curve. Note also that you did not replicate exactly the analysis performed in Ejaz et al. (2015), as you skipped the critical prewhitening step. It is not clear whether this analysis would be sensible here, as your gestures are an ad hoc sample from the natural statistics, not an equally-spaced sample of possible finger movements. b) This analysis should be included as supplementary material and cited from the main text.

We thank the Reviewer for his comment and suggestions, which we have integrated into the manuscript. In the mean-based procedure (as we are currently labeling this pipeline, alternative to the ‘feature-based’ approach), we reduced the dimensionality of each model with a PCA, retaining only the first five dimensions, independently of the number of EMG channels initially considered (from 5 to 16). Hence, for each number of channels the data have the same dimensionality, making the comparison more reliable.
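The dimensionality matching described here can be sketched as below, assuming that each gesture is summarized by the temporal mean of the rectified signal in each channel. The simulated recordings, the channel counts tried in the loop and the function name `reduce_to_five` are illustrative assumptions, not the actual EMG pipeline.

```python
import numpy as np

def reduce_to_five(emg_means, n_components=5):
    """Project a gesture-by-channel matrix of mean EMG onto its first principal components."""
    centred = emg_means - emg_means.mean(axis=0)
    u, s, _ = np.linalg.svd(centred, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

rng = np.random.default_rng(4)
n_gestures, n_samples = 20, 500
reduced = {}
for n_channels in (5, 8, 16):
    # Placeholder rectified EMG: gestures x channels x time samples
    raw = np.abs(rng.normal(size=(n_gestures, n_channels, n_samples)))
    emg_means = raw.mean(axis=2)               # temporal averaging (mean-based analysis)
    reduced[n_channels] = reduce_to_five(emg_means)
```

Whatever the number of channels, each entry of `reduced` has five columns, so models built from 5 or 16 channels enter the comparison with the same number of dimensions.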

Following the request by the Reviewer, we have described the EMG analysis in a self-standing way and we have added this procedure to the article. Also, we have discussed the impact of EMG dimensionality in the Limitations section of the main text of the article (as requested also in Comment #5). The ‘prewhitening’ of EMG signals was not included in our analysis, as we did not find any mention of it in the EMG methods of the paper by Ejaz et al. (2015).

3) Supplementary file 15: I think the table should be supplemented by a one-sentence description for each feature that is detailed enough to be able to calculate these features without going on a wild-goose chase through the cited papers. I urge the authors to start with a clear definition of symbols and then give a concise and unambiguous formula.

We agree with the Reviewer on the importance of such pieces of information to achieve the highest possible clarity and to provide all the necessary information for independent replication of the study. Hence, we have added a description of each feature to the list of features reported in Supplementary file 1L.

4) Section “A challenge to individual digit correction representations? The functional topography of hand synergies”: I find this section on functional topography overstated and do believe it requires a major change in tone. Your data shows that there is "some" topological organization of the first three synergies, not a "strict" one. Furthermore, some somatotopic clustering can also be shown for individual fingers or, most likely, for other rotations of the synergy vectors, and you have not provided a quantitative comparison with other possible organizations (see point 11).

The Reviewer is right that, though our results indicate a topographical organization of the first three synergies which has been tested against random arrangements of the same synergies across the cortical surface (see Cortical mapping of the three group synergies), this organization should be properly tested against the possible rotations of the synergy vectors. This assessment, though, falls far beyond the aim of the current study, as we discussed in the response to Point #11 and in the Limitations and methodological considerations section of the main text of the manuscript. We have modified the Results accordingly, and indicated the need of additional studies to ascertain the topographical arrangement of motor primitives as indicated by our data.

5) Section “Limitations and methodological considerations”: The limitation section discusses relatively minor points. Two important weaknesses should be added: a) the point that while some clustered representation was shown in sensory-motor regions, you did not convincingly show that this specific set of synergies is more clustered than other rotations of the same vectors; b) that in comparing the different models, the EMG-model had much less ability to discriminate different gestures and that the disadvantage of the muscle model may simply reflect noise levels on your measurement. These are important limitations that should be pointed out.

The Reviewer is right in pointing out that the impact of the rotation of synergy vectors deserves a deeper discussion. While the topographical arrangement of synergies across the cortical surface is suggested by our data, its demonstration against the possible rotated configurations of the synergy space needs novel and specific experiments, as we detailed in the Response to Comment #11. This has been discussed in a specific paragraph after the Limitations and Methodological considerations section.

The EMG analyses on the additional subjects suggest that the worse performance of the muscle model is likely to be related to the intrinsic signal and noise levels of surface EMG acquisition, as a much higher number of channels (up to 16) did not bring any significant benefit to gesture discrimination. This has been discussed in the “Limitations and methodological considerations” section.

6) Paragraph two, “Models validation”: Please clarify in the text how the labels of the test set were shuffled. Specifically, if your test set contained 4 repetitions of each of the 20 gestures, did you shuffle the labels of all 80 trials completely randomly, or did you keep the 4 trials for the same gesture together and just give them together a new label (or equivalently shuffle the labels after averaging over the 4 trials)? This difference has important consequences for the variance of your reshuffling statistics.

In the permutation test, we adopted the latter procedure: the labels were shuffled after averaging the four trials of the test dataset. For this reason, each permutation matrix still comprised twenty coherent postures. We clarified the description of the rank accuracy procedure in the Methods.
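The shuffling scheme in which labels are permuted after averaging can be sketched as follows, with a trial-wise shuffle of all 80 labels shown only for contrast. The test-set patterns are random placeholders and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(5)
n_gestures, n_reps, n_voxels = 20, 4, 1000
# Placeholder test-set patterns: gestures x repetitions x voxels
trials = rng.normal(size=(n_gestures, n_reps, n_voxels))

# Average the four repetitions first: one coherent pattern per gesture
averaged = trials.mean(axis=1)

# Shuffle gesture labels after averaging: rows are reassigned but never mixed
perm = rng.permutation(n_gestures)
permuted_patterns = averaged[perm]

# For contrast only: shuffling all 80 trial labels before averaging
# would blend repetitions of different gestures into each average
trial_labels = np.repeat(np.arange(n_gestures), n_reps)
trialwise_labels = rng.permutation(trial_labels)
```

Because whole averaged patterns are reassigned, every row of `permuted_patterns` is still the average of four trials of a single gesture.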

7) “Every voxel had a score ranging from 0 (if the voxel was never used) to a possible maximum of 380 (if the two left-out patterns could be predicted, for that voxel, in all the 190 iterations).” Please explain this statement better. Do you mean to say the score was the number of times the voxel was included in the 1000 voxels AND got a specific gesture correct?

Since the accuracy assessment (decoding stage) was performed only on the 1000 voxels with the highest R2, the accuracy score for each voxel corresponds to the number of times the voxel was among the 1000 with the highest R2 and was successful in the pairwise discrimination of gestures. This was clarified in the text.
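A minimal sketch of how such a score could accumulate is given below. It assumes a leave-two-out scheme over the 20 gestures (190 pairs) and a per-voxel success flag for each of the two left-out patterns; both the R2 values and the success flags are random placeholders, since the per-voxel decoding criterion itself is not reproduced here.

```python
import numpy as np
from itertools import combinations

n_gestures, n_voxels, top_k = 20, 5000, 1000
pairs = list(combinations(range(n_gestures), 2))   # 190 leave-two-out iterations

rng = np.random.default_rng(6)
scores = np.zeros(n_voxels, dtype=int)
for _ in pairs:
    r2 = rng.uniform(size=n_voxels)                    # placeholder goodness-of-fit on training data
    correct = rng.uniform(size=(2, n_voxels)) < 0.6    # placeholder success for each left-out pattern
    selected = np.argsort(r2)[-top_k:]                 # the 1000 voxels with the highest R2
    # A voxel gains points only if it was selected AND the prediction was successful
    scores[selected] += correct[:, selected].sum(axis=0)
```

A voxel that is never selected keeps a score of 0, and a voxel selected and successful for both patterns in every iteration would reach 2 x 190 = 380.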

8) Section “Assessment of the accuracy of the encoding analysis”: Please clearly point out in the text that the weights were randomly shuffled within each column. Please also point out explicitly (I assume that this is true) that the new "PCA"s were now not orthogonal to each other anymore.

Following the Reviewer’s indication, this aspect has been specified in the Methods.
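The effect of this shuffling on orthogonality can be checked with a short sketch. It uses random placeholder postures and a hypothetical helper `max_offdiag`; each column of the synergy weight matrix is permuted independently, which preserves the values in each column but generally destroys the orthogonality between columns.

```python
import numpy as np

rng = np.random.default_rng(9)
postures = rng.normal(size=(20, 24))                  # placeholder: 20 postures x 24 joint angles
centred = postures - postures.mean(axis=0)
u, s, _ = np.linalg.svd(centred, full_matrices=False)
weights = u[:, :5] * s[:5]                            # 20 postures x 5 synergies, orthogonal columns

# Shuffle the weights independently within each column
shuffled = np.column_stack([rng.permutation(weights[:, k]) for k in range(5)])

def max_offdiag(m):
    """Largest absolute inner product between two different columns of m."""
    g = m.T @ m
    return np.abs(g - np.diag(np.diag(g))).max()

before = max_offdiag(weights)     # numerically zero: columns are orthogonal
after = max_offdiag(shuffled)     # clearly non-zero: orthogonality is lost
```

The column norms are unchanged by the permutation, so the shuffled model keeps the same overall scale as the original one.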

9) Now that I think that I understand what the single subject maps are, I think the group-level maps also needs some more explanation. The score for each voxel varied between 0 and 380 (as stated above). For each subject, which value was then considered as "successful"? Why was it called a "probability map"? Probability of what?

The group-level map expresses the probability for each voxel to successfully encode synergy-based representations of hand postures. The single subject maps were first binarized, converting the non-zero scores to 1. The nine binary maps, one for each subject, were then summed, obtaining a group map with values ranging from 0 (for a voxel which was never used) to 9 (for a voxel which was recruited in all subjects). This procedure has been better explained in the Methods and Results.
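The binarize-and-sum step can be written compactly, as in the sketch below. The single-subject scores are simulated, and the proportion of zero-score voxels is an arbitrary choice made for the example.

```python
import numpy as np

rng = np.random.default_rng(7)
n_subjects, n_voxels = 9, 5000
# Placeholder single-subject accuracy scores in the range 0..380
subject_scores = rng.integers(0, 381, size=(n_subjects, n_voxels))
subject_scores[rng.uniform(size=subject_scores.shape) < 0.7] = 0   # most voxels never used

binary_maps = (subject_scores > 0).astype(int)   # non-zero scores become 1
group_map = binary_maps.sum(axis=0)              # 0 (never used) to 9 (recruited in all subjects)
overlap_fraction = group_map / n_subjects        # proportion of subjects recruiting each voxel
```

Dividing by the number of subjects, as in `overlap_fraction`, is one way to read the map as a between-subject overlap proportion.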

10) Section “Cortical mapping of the three group synergies”: I disagree that using R2 as a goodness of fit for each individual synergy makes the results invariant to rotations in synergy space. It does not. Maybe we fundamentally misunderstand each other, so I will make my point more concrete. Say, you have 2 "synergies" of 5 elements X1 and X2 and a 5-element data series Y.

Now the individual R2-values are changed, and hence a mapwise evaluation criterion would also be changed.

R2_1 = Y'*z1*inv(z1'*z1)*z1'*Y/(Y'*Y) = 0.6351
R2_2 = Y'*z2*inv(z2'*z2)*z2'*Y/(Y'*Y) = 0.4629

I hope that clarifies my point and why I think a) the sentence stating that individual R2 values are rotation invariant should be removed and b) any claims regarding a special organisation on the cortex should be made weaker – for stronger claims you would need to compare the C-metric (which I think would fit this purpose) across many different rotations of the same vectors or ways of picking encoding vectors from the 20-dimensional space.

In the encoding of fMRI data, the kinematic synergy model used in the multiple linear regression considered the first five synergies together, thereby fitting the combination of those synergies to brain activity patterns. For this reason, the results from the encoding analysis are resistant to the rotation of the synergy space as a whole and the estimation of the goodness-of-fit of the kinematic synergy model is reliable.

That said, the Reviewer is certainly right in noting that, if each synergy is fitted independently as done for the topographical mapping, its R2 coefficient may be affected by the possible rotated configurations of the synergy space. Actually, when mapping the synergies onto the cortical surface, the rotation of the PCs may lead to different R2 values and, consequently, will affect the topographical organization, as correctly pointed out by the Reviewer. However, the kinematic PCs that were examined in the present paper were identified and represented as elementary, meaningful grasping primitives (see Results and Video 1). Hence, in our opinion, the topographical organization of those synergies (as reported and assessed with the procedure described in paragraph two section “Cortical mapping of the three group synergies”) may indicate that the cortex encodes functional modules.
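The distinction between the joint fit and the individual fits can be verified numerically with the sketch below. It uses two random orthonormal vectors and a random data series, not the Reviewer's original numbers, and it computes R2 without an intercept, as in the Reviewer's formula; the rotation angle and the function name `r_squared` are arbitrary choices.

```python
import numpy as np

def r_squared(y, x):
    """R2 of a no-intercept least-squares fit of y on the columns of x."""
    beta, *_ = np.linalg.lstsq(x, y, rcond=None)
    resid = y - x @ beta
    return 1 - (resid @ resid) / (y @ y)

rng = np.random.default_rng(8)
x, _ = np.linalg.qr(rng.normal(size=(5, 2)))   # two orthonormal "synergies" of 5 elements
y = rng.normal(size=5)                         # 5-element data series

theta = np.pi / 6                              # arbitrary rotation within the synergy plane
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])
z = x @ rot                                    # rotated synergies spanning the same space

joint_before, joint_after = r_squared(y, x), r_squared(y, z)
single_before = [r_squared(y, x[:, [k]]) for k in range(2)]
single_after = [r_squared(y, z[:, [k]]) for k in range(2)]
```

In this example `joint_before` and `joint_after` coincide because both sets of vectors span the same subspace, whereas the entries of `single_before` and `single_after` differ, which is the Reviewer's point about mapping each synergy separately.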

While we agree with the Reviewer that a direct assessment of functional topography against other possible rotations is a very interesting issue that needs to be explored to achieve a deeper understanding of the cortical organization of hand movement control, we think that this falls beyond the scope of the current study, primarily for the following reasons:

– The kinematic synergy model accounts for a portion (40%) of the variance of fMRI data. For this reason, we hypothesize that different rotations of the synergy space can actually be encoded in the brain, as well as additional pieces of information (force, temporal patterns of movements, action goals). Without a better model able to explain a larger amount of fMRI variance, the assessment of different rotations of the synergy space may have a limited validity. We discussed this aspect in the Beyond synergies paragraph.

– An encoding of different or rotated configurations of the synergy space is plausible in the motor system: synergies are flexible configurations (Turvey, 2007) and are therefore likely to modify or adapt to task demands rather than being rigid postural schemes (Latash et al., 2007). In addition, the solution to many distinct problems of motor control, including the Degrees of Freedom problem, is hardly unique (Bernstein, 1967). Hence, both synergies and their rotated versions may coexist in the brain, representing strategies adopted to solve the Degrees of Freedom problem in multiple possible ways, achieving the one-to-many organization that has been posited by some authors (Latash et al., 2007).

– Though the principal components obtained from the synergy space were encoded in each subject separately, we showed a highly consistent similarity of the first three synergies across subjects (Video 1). However, with the current procedure, slight inter-subject differences in the principal components could not be considered during the assessment of the topographical mapping. We believe that the topographical assessment requires the definition of stable population-level synergies to allow for the identification of the optimal rotation of each component and to test their arrangements in an independent group of subjects.

– The present study focused only on hand-grasping movements. We expect that the addition of more action types, such as non-grasping object-directed movements or intransitive actions, may lead to the description of synergies that are different from those described here for grasping. For this reason, the cortical organization described in this study may reflect only in part the overall cortical organization of hand movement control.

In summary, in this paper we reported that the three synergies topographically arranged on the cortex are the best descriptors (as Principal Components) of posture space and correspond to meaningful grasping primitives. Although this may not hold for rotated versions of those synergies, such versions could in theory fit the brain activity patterns equally well, or even better.

We have discussed these important aspects, the caution they warrant in the generalization of our results, and the need for additional dedicated studies with specific goals in the “Limitations” and “Beyond synergies” sections. Finally, in agreement with the Reviewer's suggestions, we attenuated the description of the topographical organization of hand synergies and removed the sentence stating the invariance of individual R2 coefficients to rotation.

European Research Council (ERC-291166 The Hand Embodied)

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We thank Mirco Cosottini and Luca Cecchetti for help with data collection, technical assistance, and critical discussions; Arash Ajoudani and Alessandro Altobelli for their help with additional experiments.

Ethics

Human subjects: This study was approved by the Ethical Committee at the University of Pisa, Italy. Participants received a detailed explanation of all the study procedures and risks and provided a written informed consent according to the protocol approved by the University of Pisa Ethical Committee (1616/2003). All participants retained the right to withdraw from the study at any moment.
