Abstract

Schools of fish and flocks of birds can move together in synchrony and decide on new directions of movement in a seamless way. This is possible because group members constantly share directional information with their neighbors. Although detecting the directionality of other group members is known to be important to maintain cohesion, it is not clear how many neighbors each individual can simultaneously track and pay attention to, and what the spatial distribution of these influential neighbors is. Here, we address these questions on shoals of Hemigrammus rhodostomus, a species of fish exhibiting strong schooling behavior. We adopt a data-driven analysis technique based on the study of short-term directional correlations to identify which neighbors have the strongest influence over the participation of an individual in a collective U-turn event. We find that fish mainly react to one or two neighbors at a time. Moreover, we find no correlation between the distance rank of a neighbor and its likelihood to be influential. We interpret our results in terms of fish allocating sequential and selective attention to their neighbors.

Author summary

Schooling fish exhibit impressive group-level coordination in which multiple individuals move together in a seamless way. This is possible because each individual in the group responds to the movement of other group members. But how many individuals does each fish pay attention to? Which are the influential neighbors? It is necessary to answer these questions in order to understand how directional information propagates across a group. Our research shows that in the rummy-nose tetra species there is a limited number of influential neighbors which are not necessarily the closest ones.

Funding: LJ was funded by a grant from the China Scholarship Council (CSC NO. 201506040167). LG acknowledges support from EPSRC grant EP/I013717/1. RE has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No 655235 "SmartMass". VL was supported by doctoral fellowships from the scientific council of the University Paul Sabatier. ZH was supported by the National Natural Science Foundation grants 61374165, 31261160495. This study was supported by grants from the Centre National de la Recherche Scientifique and University Paul Sabatier (project Dynabanc). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Collective motion phenomena such as swarming, flocking and schooling behavior have been observed in a large variety of animal species ranging from bacteria to humans [1]. Several theoretical models have been proposed to explain how such large scale coordination patterns emerge from “microscopic level” interaction rules among individual animals [2–7]. These models have been instrumental in improving our understanding of collective motion in real animal groups by providing an indication of which interaction mechanisms are sufficient to reproduce realistic patterns of collective behavior. In particular, most models agree on the fact that two types of interaction are responsible for maintaining group cohesion to achieve coherent collective motion: attraction and alignment.

More recent improvements in remote sensing and video-tracking technologies [8–10] have made it possible to automate data collection and to test theoretical models directly against highly resolved empirical movement data in various species. Generally, these studies have confirmed the importance of attraction and alignment behavior in the formation and maintenance of collective movement patterns [11–15]. However, there is less scientific consensus about how these interaction rules are implemented in the sensory-motor responses of individuals. This lack of agreement underscores the importance of answering the following question: how do individuals mediate interactions with multiple neighbors? [16].

Specifically, theoretical studies have postulated a number of factors that are likely to affect the probability and intensity of interactions: distance (metric neighborhood) [2–7], position rank (topological neighborhood) [17], projected size (visual neighborhood) [18–20], and spatial arrangement around a focal individual (Voronoi neighborhood) [13]. Each of these different definitions of influential neighborhood is supported to some extent by computational models and empirical observations.

Rather than siding with one or more of the proposed neighborhood definitions, we adopt a fully data-driven approach with minimalist modeling assumptions. The simplest hypothesis consists of assuming that fish copy the actions of their neighbors, but not instantaneously: the fish reaction takes time to process sensory information and to trigger the appropriate behavioral response. Those assumptions impose a temporal constraint given by the sequential occurrence of the perception of the neighbors’ actions, and the movement response [21, 22]. We thus assume that animals following a particular neighbor in a new direction are subject to a time-delay when copying the heading of influential neighbors.

Considerable work has already appeared on the identification of these time-delays. The delays with which individuals align with each other have in fact been exploited to determine social hierarchies in animal groups, as shown, e.g., for pigeon flocks [23], where the leadership network is constructed with link weights given by the delay for which pairwise angle correlation is maximal. Improvements on how to identify such delays from movement data have proposed the use of time-dependence in pairwise angle correlation [24]. A computational analysis, based on similarities between trajectories (Fréchet distance), has also been proposed and implemented in a visual analytic tool [25]. A different approach has made use of a time-ordering procedure on the pairwise angle correlation to determine temporary leader/follower relations in foraging pairs of echolocating bats [26]. The analysis of the bat trajectories was instrumental in identifying transient leadership and coupling it to sensory biases of the species. However, only pairs of individuals were considered and group influence on individual behavior was not investigated.

Since identifying influential neighbors is key to unraveling the mechanisms of interaction, there is a need in collective behavior studies to establish transient leadership from the dynamics of the individual trajectories. One way to bridge this gap consists of determining which influential individuals have their heading copied most closely by others, how many such influential neighbors exist, and where they are located in the group.

Fish have the ability to choose not only when to copy the heading of another individual, but also the extent to which this heading is copied, that is the similarity and the pace at which fish match the trajectory’s curvature of another individual [11, 27]. The closer two (or more) fish are to this matching, the more aligned they are (even if with some delay), and the more faithfully they are following the movement path of the transient leader.

Here, we introduce a procedure that allows us to identify the influential neighbors of fish moving in a group, and we test it in a series of experiments with groups of two and five individuals of the freshwater tropical fish Hemigrammus rhodostomus swimming in a ring-shaped tank (see details in Materials and methods). In this set-up, fish swim in a highly synchronized and polarized manner, and can only head in two directions, clockwise or anticlockwise, regularly switching from one to the other. We base our procedure for identifying influential neighbors on time-dependent directional correlations between fish, focussing our analysis on the interactions that occur during these collective U-turns. Indeed, during U-turns, fish have to make a substantial change of direction to reverse their heading, which makes it easier to extract the correlation resulting from direct interactions between individuals rather than from other incidental correlations, e.g., their channeled motion in the ring-shaped tank. Moreover, as correlation does not imply causal influence, we need to control for potential spurious correlations. We do so by constructing a null model of collective U-turns to show that the patterns of interaction observed in the experiments are not due to random processes.

Results

Dynamics of collective U-turns

Hemigrammus rhodostomus performs burst-and-coast swimming behavior that consists of sudden heading changes combined with brief accelerations followed by quasi-passive, straight decelerations [15]. Moreover, fish spend most of their time swimming in a single group along the wall of the tank. Fish regularly change their position within the group [28], so that every individual fish can be found at the front of the group.

A typical collective U-turn event starts with the spontaneous turnaround of a single fish (hereafter called the initiator), mostly located at the front of the group [28]. This sudden change of behavior triggers a collective reaction in which all the other individuals in the group make a U-turn themselves, so that, after a short transient, all individuals adopt the same final direction of motion as the initiator. Overall, we analyzed 1586 U-turns of which 1111 were observed in groups of 2 fish and 475 in groups of 5 fish. Fig 1 shows two examples of collective U-turns in groups of N = 2 (left column, panels ABC) and N = 5 fish (right column, panels DEF; see also supplementary S8 Fig and supplementary S1 and S2 Videos in the Supplementary Information).

Fig 1A shows a first fish F1 (red color) swimming close to the upper-left region of the tank, followed by a second fish F2 (purple color) at a distance d12 ≈ 8.5 cm, swimming in the same direction. Right before the U-turn starts (Fig 1A), fish F1 reduces its speed (circles become closer to each other), the distance d12 decreases (to ≈ 5.1 cm), and F2 also reduces its speed. Then, both fish perform a change of direction which lasts about 1 second and during which fish F2 clearly follows fish F1 (see the corresponding circles at each instant of time in Fig 1B). Once the U-turn is completed (Fig 1C), F1 accelerates again, and so does F2, which also adopts the direction of motion of F1. The distance d12 increases again (≈ 9.5 cm), due to the larger velocities, and remains of the same order along the depicted trajectory.

The situation is less clear when we try to describe collective U-turns in larger groups. Fig 1D, 1E and 1F show a collective U-turn for the case where N = 5. Before the U-turn, fish F2 (orange) seems to be the fish that the rest of the group follows, the first circle of its trajectory being the most advanced one in the direction of motion. In fact, a position order can be inferred from Fig 1D: F2, F3, F5, F1 and F4. However, it is rather complicated to extract from Panel E precise information about which fish is the initiator of the U-turn, in which order the other fish follow, and therefore, who is influencing whom, especially if time-delays and reaction times are taken into account. The same holds for the information about the positions of the fish after the U-turn, provided by Panel F.

In order to describe rigorously the individual behavior of the N fish during a U-turn, we introduce the angle ϕi(t) as an instantaneous measure of the direction of motion of a fish Fi; see Fig 2. We assume that the instantaneous heading of a fish Fi can be defined in terms of the velocity vector $\vec{v}_i(t) = (v_{x,i}(t), v_{y,i}(t))$, so that $\phi_i(t) = \operatorname{atan2}(v_{y,i}(t), v_{x,i}(t))$. The heading of a fish ϕi allows us to characterize the angle of incidence of the fish relative to the wall, θwi = ϕi − ψi, where ψi is the angle formed by the position vector of the fish with the horizontal line (see Fig 2). The angle of incidence θwi is an individual measure that doesn’t depend on the heading of another fish. When a fish Fi is swimming along the wall, the value of θwi is around ±90° (we choose, by convention, the positive sign for the anticlockwise angle). In our experiments, most of the time the absolute value of the angle of incidence is close to 90°; equivalently, |sin(θwi(t))| ≈ 1. When the motion is perpendicular to the wall, the incidence is zero if the fish points towards the wall (θwi = 0°), and maximal if the fish points towards the center of the tank (θwi = 180°); in both cases, sin(θwi(t)) = 0.
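As a minimal illustration of these definitions, the sketch below computes the heading ϕi and the angle of incidence θwi from a position and a velocity. It assumes that the tank center sits at the origin of coordinates and that angles are wrapped to [−180°, 180°]; the function names are illustrative and are not taken from the original analysis.

```python
import math

def heading(vx, vy):
    """Heading phi (radians) from the velocity components."""
    return math.atan2(vy, vx)

def wrap(angle):
    """Wrap an angle (radians) into [-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))

def angle_of_incidence(x, y, vx, vy):
    """theta_w = phi - psi, assuming the tank centre is at (0, 0)."""
    psi = math.atan2(y, x)  # angular position of the fish
    return wrap(heading(vx, vy) - psi)

# A fish at (1, 0) moving in the +y direction swims anticlockwise along the wall.
theta_w = angle_of_incidence(1.0, 0.0, 0.0, 1.0)
print(round(math.degrees(theta_w)), round(math.sin(theta_w), 6))
```

With this sign convention, reversing the velocity gives an angle of incidence close to −90°, and a velocity pointing along the position vector (towards the wall) or against it (towards the center) gives sin(θwi) ≈ 0.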

Fig 2. Angles and lengths characterizing the relative position of two fish.

Angle ψj denotes the angular position of fish Fj with respect to the horizontal (positive values fixed in the anticlockwise direction); angle ϕi is the heading of fish Fi; θwi is the angle of incidence of fish Fi with respect to the outer wall; dij is the distance between Fi and Fj; θij is the viewing angle of Fi with respect to Fj (not necessarily equal to θji), and ϕij = ϕj − ϕi is the heading difference of Fi with respect to Fj.

The change of sign of angle θwi can serve as an indicator that a U-turn has taken place. In fact, this allows us to delimit the individual U-turns with precision and, consequently, to determine the start and the end of a collective U-turn.

We define the start and end times ts,i and te,i of the individual U-turn of fish Fi in terms of the absolute value of the angle of incidence, |θwi(t)|. Once a U-turn has been detected, we obtain the time ts,i at which |θwi(t)| has decreased (from approximately 90°) below a given threshold, and the time te,i at which |θwi(t)| has increased again above another given threshold (see Materials and methods for more details).

Thus, the start of a collective U-turn is determined by the time ts at which the first individual U-turn starts, while the end of a collective U-turn is given by the time te at which the last individual U-turn finishes. That is:
$$t_s = \min_{1 \le i \le N} t_{s,i}, \qquad t_e = \max_{1 \le i \le N} t_{e,i} \tag{1}$$
For each collective U-turn, we have made a convenient time shift so that ts = 0. Then, te denotes not only the end time but also the duration of the collective U-turn.
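The detection steps described above can be sketched as follows. This is only an illustration under stated assumptions: the two thresholds (80° here) are hypothetical placeholders for the values given in Materials and methods, the series of θwi is in degrees, and the turn is assumed to pass through θwi = 0° so that |θwi| actually decreases during the turn. The function names are illustrative.

```python
def individual_u_turn(theta_w, dt, thr_start=80.0, thr_end=80.0):
    """Return (t_start, t_end) of the first U-turn found in a series of
    angles of incidence (degrees), or None if theta_w never changes sign.
    thr_start and thr_end are hypothetical thresholds on |theta_w|."""
    # Locate the first sign change of theta_w (middle of the turn).
    mid = next((k for k in range(1, len(theta_w))
                if (theta_w[k - 1] > 0) != (theta_w[k] > 0)), None)
    if mid is None:
        return None
    # Walk back to the earliest sample of the run lying below thr_start.
    s = mid - 1
    while s > 0 and abs(theta_w[s - 1]) < thr_start:
        s -= 1
    # Walk forward to the first sample above thr_end.
    e = mid
    while e < len(theta_w) - 1 and abs(theta_w[e]) <= thr_end:
        e += 1
    return s * dt, e * dt

def collective_u_turn(turns):
    """Earliest individual start and latest individual end."""
    return min(t[0] for t in turns), max(t[1] for t in turns)

dt = 0.02  # time-step of the experiments, in seconds
f1 = [90, 88, 75, 40, 10, -20, -60, -85, -90, -90]  # made-up series
f2 = [90, 90, 90, 86, 70, 30, -15, -50, -82, -90]   # made-up series
turns = [individual_u_turn(f, dt) for f in (f1, f2)]
t_s, t_e = collective_u_turn(turns)
duration = t_e - t_s  # equals t_e once time is shifted so that t_s = 0
```

In this toy example the first fish starts turning before the second one, so the collective start is set by the first fish and the collective end by the second.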

We also introduce an instantaneous measure of how similar the directions of motion of individual fish are across the group. We define the instantaneous group polarization P(t) as the following function of normalized fish velocity vectors:
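$$P(t) = \frac{1}{N} \left\| \sum_{i=1}^{N} \vec{e}_i(t) \right\| \tag{2}$$
where $\vec{e}_i(t) = \vec{v}_i(t) / \|\vec{v}_i(t)\|$ is the unit vector along the velocity of fish Fi. When all the fish have the same direction, the polarization is maximal and P(t) = 1. The minimum value P(t) = 0 is reached instead when the normalized velocity vectors cancel.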
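A short sketch of the polarization, assuming it is computed as the norm of the mean unit velocity vector (which is consistent with the two limiting values quoted above). Headings are given in radians and the function name is illustrative.

```python
import math

def polarization(headings):
    """Norm of the mean unit heading vector; headings in radians."""
    n = len(headings)
    sx = sum(math.cos(h) for h in headings)
    sy = sum(math.sin(h) for h in headings)
    return math.hypot(sx, sy) / n

aligned = polarization([0.3] * 5)        # all fish share one direction
opposed = polarization([0.0, math.pi])   # two fish heading in opposite directions
```

For a pair of fish, the polarization drops towards zero when one fish has already reversed its direction and the other has not, which is why P(t) traces a V-shape during a U-turn.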

Figs 3 and 4 depict the two U-turns introduced in Fig 1, in terms of the polarization P(t) and the sine of the angle of incidence of each fish with respect to the outer wall θwi(t). The duration of the two illustrated collective U-turns is te = 0.94 s for N = 2 and te = 1.5 s for N = 5.

Fig 3. (A) Individual fish trajectories in the tank during the U-turn. Each individual is represented by a unique color. The temporal sequence is indicated by circles equally spaced over time with a time-step of 0.04 s (empty circles) and 0.1 s (filled circles). Arrows denote direction of motion. Grey wide line is the tank’s border. (B) Group polarization P(t), with a minimum value Pmin ≈ 0.27 reached at t ≈ 0.46 s. (C) Sine of the angle of incidence of fish to the wall: when parallel to the wall, sin(θw) = 1 (anti-clockwise direction) or sin(θw) = −1 (clockwise). The three vertical lines of each color indicate for each fish the beginning, the middle and the end of its U-turn, with the middle representing the time when a fish has finally reversed its original direction. (D) Interaction with influential neighbors: arrows point from influential neighbors to the focal fish and have the same color as the focal fish. (E) Fish bursting activity and their influential neighbors. Dots at i = 1, 2 correspond to bursting activity, blank corresponds to coasting. Dots at i − 0.5 represent bursting activity of the neighbor influencing fish i.

Fig 4. The displayed temporal sequence is drawn from the fish trajectories from one second before the U-turn begins until one second after its end. Symbols in all panels are the same as in Fig 3. (A) Individual trajectories in the tank during the U-turn. (B) Group polarization with a minimum value Pmin ≈ 0.59 reached at t ≈ 0.66 s. (C) Sine of the angle of incidence of fish to the wall θw. The three vertical lines of each color indicate for each fish the beginning, the middle and the end of its U-turn. Here the middle time means the instant where sin(θw) = 0. (D) Interactions with influential neighbors: arrows point from influential neighbors to the focal fish and have the same color as the focal fish. (E) Fish bursting activity and their influential neighbors. If there is more than one influential neighbor, the fish Fj with the largest index value j is shown. Grey lines in Panels BCDE denote the start and end of the collective U-turn.

For both group sizes, the group polarization (Figs 3B and 4B) before and after the U-turn is quite close to 1, showing that before and after the collective U-turn, all individual fish maintain essentially the same common direction. During the U-turn, the polarization decreases, describing a sharp V-form with a minimum at P(t) ≈ 0.27 for N = 2 and P(t) ≈ 0.60 for N = 5. The minimum is reached at approximately half the duration of the collective U-turn, tm = (ts + te)/2: tm = 0.47 s for N = 2 and tm = 0.75 s for N = 5.

Figs 3C and 4C show the change of direction individually for each fish in both U-turns: from anticlockwise to clockwise direction for N = 2, and vice versa for N = 5. Fig 3C clearly indicates that at t ≈ 0.3 s, the fish F1 has almost completed its individual U-turn, while F2 has just started to change direction: sin(θw2(0.3)) ≈ 0.98, while sin(θw1(0.3)) ≈ −0.5.

In Fig 4C, a similar ordering can be inferred from the times of departure from the bottom line at ordinate sin(θwi) = −1 + δ, where δ > 0 is a small parameter with respect to the range of ordinate values; we used δ = 0.1. Thus, the order is 2-3-1-5-4. However, the order in which individual fish change the sign of their angle of incidence θwi is different, 2-1-3-5-4, and also different is the arrival order to the top line at ordinate sin(θwi) = 1 − δ: 2-5-1-4-3. Moreover, some of these departure and arrival times are almost identical (see, e.g., F1 and F4), and the behavior of the fish during the U-turn is completely different. These difficulties in establishing a consistent order show that another criterion is necessary to identify the relation of influence between fish.
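The three orderings compared above (departure from the bottom line, sign change, arrival at the top line) can be extracted with the same small routine by changing the level that is crossed. The sketch below uses made-up series for three fish turning from clockwise to anticlockwise; only δ = 0.1 is taken from the text, and the function names are illustrative.

```python
def first_crossing(series, level):
    """Index of the first sample strictly above `level`, or None."""
    return next((k for k, v in enumerate(series) if v > level), None)

def crossing_order(sin_theta, level):
    """Fish labels sorted by the time they first exceed `level`.
    `sin_theta` maps a fish label to its series of sin(theta_w) values."""
    times = {fish: first_crossing(s, level) for fish, s in sin_theta.items()}
    return sorted((f for f in times if times[f] is not None), key=times.get)

delta = 0.1  # value used in the text
toy = {      # made-up series, one value per frame
    1: [-1.0, -1.0, -0.95, -0.5, 0.3, 0.8, 0.95, 1.0, 1.0],
    2: [-1.0, -0.7, -0.3, 0.2, 0.6, 0.95, 1.0, 1.0, 1.0],
    3: [-1.0, -0.95, -0.8, -0.6, -0.2, 0.4, 0.7, 0.85, 0.97],
}
departure = crossing_order(toy, -1 + delta)
sign_change = crossing_order(toy, 0.0)
arrival = crossing_order(toy, 1 - delta)
```

Even in this toy case the departure order differs from the sign-change order, which illustrates why a ranking based on a single crossing level is not a reliable indicator of who influences whom.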

We have based our criterion to decide if a fish is an influential neighbor of another fish on the average value of the time-dependent directional correlation between the two fish along a time window.

For each pair of fish Fi and Fj, we define the directional correlation Hij as a function of the heading of Fi evaluated at time t and the heading of Fj evaluated at a delayed time t − τ, where τ is the time-delay [26]:
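$$H_{ij}(t, \tau) = \vec{e}_i(t) \cdot \vec{e}_j(t - \tau) = \cos\big(\phi_i(t) - \phi_j(t - \tau)\big) \tag{3}$$
where $\vec{e}_i(t)$ denotes the unit vector along the heading of Fi at time t. The function Hij(t, τ) is in fact the cosine of the angle formed by the heading of Fi at time t and the heading of Fj at time t − τ, and is a measure of how aligned fish Fi at time t is with fish Fj at time t − τ. The values of Hij(t, τ) lie between −1 (when fish swim in opposite directions) and 1 (when fish have the same direction), and Hij(t, τ) equals zero when fish have perpendicular directions.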
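A minimal sketch of the delayed directional correlation, assuming headings are stored as one value (in radians) per frame and that the delay is expressed as a whole number of frames. The toy data and the function name are illustrative.

```python
import math

def directional_correlation(phi_i, phi_j, k, lag):
    """H_ij at frame k for a delay of `lag` frames: cosine of the angle
    between the heading of F_i at frame k and that of F_j at frame k - lag."""
    if k - lag < 0:
        raise IndexError("delayed frame lies before the start of the series")
    return math.cos(phi_i[k] - phi_j[k - lag])

# Toy data: F_j turns steadily and F_i repeats the same headings 3 frames later.
phi_j = [0.1 * k for k in range(20)]
phi_i = [0.0] * 3 + phi_j[:-3]

h_true_lag = directional_correlation(phi_i, phi_j, k=10, lag=3)
h_no_lag = directional_correlation(phi_i, phi_j, k=10, lag=0)
```

The correlation is maximal at the delay with which the follower actually copies its neighbor, and lower at other delays.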

By averaging Hij(t, τ) along a time-window of length (2w + 1)Δt, we are able to quantify how much the focal fish Fi is copying the moving direction of its neighbor with a time-delay τ by means of the following function [26]
$$C_{ij}(t, \tau, w) = \frac{1}{2w + 1} \sum_{k=-w}^{w} H_{ij}(t + t_k, \tau) \tag{4}$$
where tk = kΔt (the time-step in our experiments is Δt = 0.02 s). The time-window length parameter w has been determined by means of a sensitivity analysis (pairwise similarity matrix), finding that w = 2 yields the most satisfactory results; see Section “Parameter selection” in Materials and methods and S5 Fig.

The average directional correlation Cij(t, τ, w) allows us to characterize a fish Fj as an influential neighbor of a focal fish Fi at time t with time-delay τ, if the value of Cij(t, τ, w) is larger than a given threshold Cmin. Details on how w and Cmin are obtained are given in Sections “Optimal setting parameters for influential neighbors identification” and “Parameter selection” in Materials and methods.
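The identification criterion can be sketched as a scan over candidate delays, keeping those for which the window-averaged correlation exceeds the threshold. This is a sketch under stated assumptions: headings are per-frame values in radians, delays are whole numbers of frames, the value c_min = 0.99 is made up for the toy data (the actual Cmin is given in Materials and methods), and the scan starts at 2 frames, i.e. 0.04 s at Δt = 0.02 s.

```python
import math

def averaged_correlation(phi_i, phi_j, k, lag, w=2):
    """C_ij: mean of H_ij over the 2w + 1 frames centred on frame k."""
    if k - w - lag < 0 or k + w >= len(phi_i):
        raise IndexError("time window or delayed frame outside the series")
    window = range(k - w, k + w + 1)
    total = sum(math.cos(phi_i[m] - phi_j[m - lag]) for m in window)
    return total / (2 * w + 1)

def influential_delays(phi_i, phi_j, k, lags, c_min, w=2):
    """Delays (in frames) at which F_j qualifies as influential for F_i."""
    return [lag for lag in lags
            if averaged_correlation(phi_i, phi_j, k, lag, w) > c_min]

# Toy data: F_i repeats the headings of F_j three frames later.
phi_j = [0.1 * m for m in range(30)]
phi_i = [0.0] * 3 + phi_j[:-3]
delays = influential_delays(phi_i, phi_j, k=15, lags=range(2, 8), c_min=0.99)
```

With smooth toy data several neighboring delays pass the threshold around the true one; how a single delay is then selected is not covered by this sketch.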

Fig 5 shows the directional correlation H12 and its time-average C12 between fish F1 and F2 along the collective U-turn depicted in Fig 3. Left (resp. right) panels aim to indicate the alignment of fish F1 (resp. F2) at each time t with respect to the alignment of fish F2 (resp. F1) at an earlier time t − τ. Panels A and C show respectively that for all τ, there is always an interval of time during which H12(t, τ) ≈ −1 and C12(t, τ) ≈ −1 (dark region), meaning that for all time-delays there is always an interval of time in which fish have opposite directions. Moreover, the larger the time-delay, the wider the black region where the direction of F1 is opposite to the direction of F2 at the previous time.

On the other hand, the figures of the directional correlation of F2 with F1, especially Panel D, show a connected region in which the correlation C21(t, τ) remains positive and above the threshold (yellow in the figure) around τ ≈ 0.42 s, where H21 ≈ 1 during the whole time interval [−0.5 s, 2 s]. This strongly suggests that, during this time interval, F2 is copying the behavior of F1 with a 0.42 s time-delay, denoted τ2,1 for this specific U-turn. Thus, one can consider that F1 is influencing F2 with time-delay τ2,1, while F2 is not influencing F1 in this specific case. These influence dynamics are illustrated in Fig 3D by drawing an arrow at time t from Fj to Fi when Fj satisfies the condition Cij(t, τ, w) > Cmin for being an influential neighbor of Fi at time t, which in turn receives this influence and responds by copying the exhibited heading with a time-delay τ.

Using the same procedure for the N = 5 case depicted in Fig 4, we draw Fig 6 that shows F1 copying F2 with a time-delay τ1,2 ≈ 0.5 s (Panels A and E). F1 also copies F3 and F5 with, respectively, τ1,3 ≈ 0.2 s (Panels B and F) and τ1,5 ≈ 0.1 s (Panels D and H), but it doesn’t copy F4 (Panels C and G). The influential neighbors of F1 are thus F2, F3 and F5, at different times and with different time-delays. We have calculated the rest of the correlations for all pairs of fish (see S1 Fig for an overview of all the heading correlations). As for the N = 2 case, these relations are illustrated by arrows going from the influential neighbors to the reacting fish in Fig 4D.

Effect of bursting on the transmission of information

The specific behavior of H. rhodostomus, namely, the successive alternation of bursts and coasts [15], leads us to ask whether these abrupt changes of acceleration and speed can provide information that other fish could use to adjust their own movement. To address this aspect we study whether there is any correlation between the bursting activity of one fish at time t and the fact that this fish is an influential neighbor of another fish shortly after time t.

A burst corresponds to a brief phase of acceleration during which most changes in fish heading occur [15]. Panels E in Figs 3 and 4 show the bursting activity of each fish Fi, i = 1, …, N, and that of its influential neighbors. For each fish Fi, we draw a dot at time t and ordinate i if fish Fi is displaying a burst precisely at time t. Dot color at ordinate i corresponds to fish Fi’s color. The absence of a dot at a given time denotes that the fish is in a coasting phase at that time.

A second row of colored dots is drawn at ordinate i − 0.5 for some values of t when two conditions are met: (1) Fish Fi is being influenced at those times by one or more fish Fj, j ∈ {1, …, N}, j ≠ i, whose identity is given by the color of the dots, and (2) the influential fish Fj was bursting when it was influencing Fi at time t − τ earlier. If Fi has more than one influential neighbor at time t, the dot drawn at time t in row i − 0.5 has the color of the Fj fish with the highest index j.

In Fig 3E, red dots at i = 1 mean that fish F1 is bursting at those time-steps and coasting at the other time-steps, and red dots at i − 0.5 = 1.5 indicate that, first, F1 is the influential fish of F2 at those time-steps, and second, F1 was bursting when it was earlier influencing F2. In turn, there are two possible reasons to explain the absence of red dots at i − 0.5 = 1.5 for certain time values: either F2 has no influential neighbor, or F1 was coasting. To assess which of the two explanations is valid, one needs to look at Fig 3D. For example, the absence of dots at i − 0.5 = 1.5 between 0.57 s and 0.62 s is due to F2 having no influential neighbors, while the absence of dots in the same row between 0.75 s and 0.81 s results from the fact that F1, which is the influential neighbor of F2, is in a coasting phase at time t − τ (in this example the delay was found to be τ = 0.42 s).

Fig 3E shows that the bursting activities of both the focal fish and its influential neighbor are not directly correlated, suggesting that the primary source of information for fish to adjust their movements is the distance, orientation and angular position of their neighbors [15]. The same conclusion is obtained for N = 5. By focusing on fish F2 for example, Fig 4E shows that there is no systematic overlap between the yellow dots at i = 2 and those at i − 0.5 for i ≠ 2, suggesting that the correlation between the bursting activity of a fish and that of their influential neighbors is marginal.

Number of influential neighbors

For all U-turns, we have counted the number of frames in which a fish is an influential neighbor, that is, the number of frames where the above described condition for identifying influential neighbors is met. When there are only two fish, a fish is found to be the influential neighbor 30% of the time spent in a U-turn. In groups of five fish, this proportion grows up to 62%.

We have counted the number of influential neighbors Nif a fish Fi has during a U-turn in groups of five fish, finding that in most cases, a fish has only one or two influential neighbors (for 58% of the time spent in a U-turn Nif = 1 or 2); see Fig 7A. The most frequent case is Nif = 1 (43%). Having more than one influential neighbor is frequent (19%), but less than having no influential neighbors (38%). The cases where there are more than two influential neighbors are negligible (less than 4% of the total time spent in U-turns).
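The proportions reported above are fractions of time, i.e. of (frame, focal fish) pairs, falling in each class of Nif. A minimal sketch of that tally, assuming the influential neighbors have already been identified for every frame and stored as sets (the data layout and the function name are hypothetical):

```python
from collections import Counter

def nif_distribution(frames):
    """Proportion of (frame, focal fish) pairs having each number of
    influential neighbors. `frames` is a list of dicts mapping a focal
    fish to the set of its influential neighbors at that frame."""
    counts = Counter(len(neigh) for frame in frames for neigh in frame.values())
    total = sum(counts.values())
    return {n: c / total for n, c in sorted(counts.items())}

# Made-up example with three fish observed over two frames.
frames = [
    {1: {2}, 2: set(), 3: {1, 2}},
    {1: {2}, 2: set(), 3: {1}},
]
dist = nif_distribution(frames)
```

The returned proportions sum to one, as do the experimental values quoted above for zero, one, and more than one influential neighbor (38%, 43% and 19%).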

Fig 7. Cumulative analysis of collective U-turns across 475 experimental (blue) and 1000 artificial (red) observations in groups of N = 5 fish. (A) Number of influential neighbors. (B) Distance rank of influential neighbors with respect to the focal fish. (C) Position rank of influential neighbors in the group. (D) Turning rank of influential neighbors. Histograms represent the proportion of time during which influential neighbors have been observed in a given class. The procedure to construct the artificial observations is presented later and in the section “Null model” in Materials and methods.

For each fish Fi, we have calculated the respective distance dij(t) at which the other N − 1 fish Fj are from Fi during the U-turns, thus establishing a rank order among the neighbors of Fi. We have then compared the influence of close neighbors with that of distant neighbors, finding no correlation between the distance rank of a neighbor and the influence it exerts on the focal fish. This is shown in Fig 7B, where we have depicted the distribution of the distance rank of influential neighbors with respect to a focal fish. The figure shows that fish spent the same proportion of time (≈ 25%) being an influential neighbor of a focal fish independently of their distance rank. In other words, influential neighbors are not necessarily the closest ones.
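The distance rank used here can be sketched as a simple sort of the neighbors of a focal fish by their distance dij at a given frame. The positions and the function name below are illustrative.

```python
import math

def distance_ranks(positions, focal):
    """Map each neighbor of `focal` to its distance rank (1 = closest).
    `positions` maps a fish label to its (x, y) position at one frame."""
    fx, fy = positions[focal]
    dist = {j: math.hypot(x - fx, y - fy)
            for j, (x, y) in positions.items() if j != focal}
    ordered = sorted(dist, key=dist.get)
    return {j: rank for rank, j in enumerate(ordered, start=1)}

# Made-up positions (arbitrary units) for a group of five fish.
positions = {1: (0.0, 0.0), 2: (3.0, 0.0), 3: (1.0, 0.0), 4: (0.0, 5.0), 5: (2.0, 2.0)}
ranks = distance_ranks(positions, focal=1)
```

With four neighbors there are four distance ranks, so a uniform distribution of influence over ranks corresponds to about 25% of the time per rank.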

When trying to identify events of causal influence by means of correlations, it is crucial to keep in mind that correlation does not imply causation. We thus have controlled the effects of potential chains of influence, where e.g. fish F1 is highly correlated with F3 not because F1 is directly influencing F3, but because F1 is influencing fish F2, which in turn is influencing F3. To check the impact of these chains of influence on our results, we have removed from our data all the pairwise influence data that correspond to the following situation: if F1 is influenced by both F2 and F3 and F2 is simultaneously influenced by F3 (or F3 is influenced by F2), then we removed the pairwise correlation (focal fish, influential neighbor) corresponding to (F1, F2) (or (F1, F3)). After removing 7172 out of 69703 data points and recomputing the results with the remaining data, we found that our results remain practically unchanged.

We have also calculated the position rank that each fish occupies in the group during a collective U-turn, finding that influential neighbors are mostly located in the front region of the group: 32% in the leading (most advanced) position, and 20% in the second place; see Fig 7C. Notably, influential neighbors can also be found in the back of the group (in 29% of the cases in the fourth or fifth position), and even in the last position (a non-negligible 13% of cases).

We also paid attention to the order in which each fish starts its individual U-turn during a collective U-turn, finding that influential neighbors are those that most frequently turn earlier (32% of the cases), and that this relation decreases linearly; see Fig 7D. It is again noticeable that influential fish can be found to be the last turning fish (in 8% of the cases).

The apparently surprising fact that influential fish can be found in the back of the group and that the last fish turning can be an influential fish is due to the anisotropic perception of fish and their relative orientations during U-turns. But these findings have to be understood in the light of our specific time-dependent characterization of influential neighbor. If, for instance, F1 turns first and influences F2, F2 will turn with some time-delay after F1. Then, when F2 is at half of its individual turning process, F2 can be rotating in the same direction as F1 in such a way that F1, influenced by F2, slightly adjusts its direction. We would then say that F2, which is the last turning fish, has influenced F1, the first turning fish.

In order to compare different collective U-turns, we define a normalized time $\hat{t} = (t - t_s)/(t_e - t_s)$ in terms of the actual time t and the starting and ending times of each U-turn, so that the duration of a U-turn is now 1. Thus, $\hat{t} = -1$ corresponds to a time as long as the U-turn duration previous to the start of the U-turn, and $\hat{t} = 2$ corresponds to a time as long as the U-turn duration after the end of the U-turn. We have calculated the instantaneous value of the average speed, the average group polarization 〈P〉 and the average number of influential neighbors 〈Nif〉. Here, angle brackets refer to the average across all fish in the U-turn along a time-window containing the collective U-turn.
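A short sketch of the time normalization, assuming it is the linear rescaling that maps the start of the collective U-turn to 0 and its end to 1 (so that one U-turn duration before the start maps to −1 and one duration after the end maps to 2). The function name is illustrative.

```python
def normalized_time(t, t_start, t_end):
    """Rescale t so that the U-turn spans the interval from 0 to 1."""
    if t_end <= t_start:
        raise ValueError("the U-turn must have a positive duration")
    return (t - t_start) / (t_end - t_start)

# A U-turn lasting 1.5 s, with time already shifted so that it starts at 0.
t_s, t_e = 0.0, 1.5
marks = [normalized_time(t, t_s, t_e) for t in (-1.5, 0.0, 0.75, 1.5, 3.0)]
```

After this rescaling, quantities measured in U-turns of different durations can be averaged at the same normalized instant.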

Fig 8A and 8B show respectively the time evolution of the average speed and the average group polarization during the collective U-turns in groups of 5 fish. The description of the specific U-turn presented in Fig 4 is also valid for the general case: the speed decreases before the U-turn, reaches a minimum at half the U-turn duration, and then grows to a higher value after the U-turn. A very similar behavior was found in groups of 2, 4, 8 and 10 fish of the same species in [28]. At the same time, the polarization is very high and almost constant outside the U-turn, and exhibits a perfect V-shape during the U-turn, with the high values reached at exactly the instants where the U-turn starts and ends, and the minimum value at the middle of the U-turn. As expected, the average group polarization significantly decreases during the U-turn to almost half the value it has outside the U-turn. Right after reaching this minimum, there is a sharp increase of speed and polarization as more fish adopt the new direction of motion.

We depict here the temporal dynamics of the average speed, the average group polarization, the number of influential neighbors and its variation, computed over 475 experimental (blue) and 1000 artificial (red) recordings of collective U-turns. (A) Average speed. (B) Average group polarization. (C) Average number of influential neighbors per focal fish. (D) Average of the absolute variation in the number of influential neighbors |ΔNif| divided by the number of influential neighbors Nif, defined in Eq (5): 〈η(t)〉. The horizontal axis denotes normalized time (t − ts)/(te − ts), where ts and te denote the start and end of the collective U-turn respectively. The procedure to construct the artificial observations is presented later and in the section “Null model” in Materials and Methods.

Fig 8C shows that before the U-turn the average number of influential neighbors increases until a maximum value is reached right before the start of the U-turn (). During more than one half of the U-turn, this number decreases until a minimum (), and grows again beyond the end of the U-turn until a second maximum (, twice the height of the minimum). After that, all fish have completed their U-turns and the average number of influential neighbors decreases again.

When the polarization is very high, the time-delay with which influential neighbors are detected is often too small in comparison with biologically realistic reaction times τR, so that these influential neighbors are not taken into account (we used τR = 0.04 s; see Section “Optimal setting parameters for influential neighbors identification” in Materials and methods). This is the reason why the average number of influential neighbors appears to be smaller in regions outside the U-turn than when the U-turn is just about to start ( or slightly after its end (). Meanwhile, the decrease of this number in the middle of the U-turn has a different origin: once a fish has started to turn around, there is no real need to update its alignment according to all its neighbors. That fish can safely reverse its motion by keeping the alignment with only one of those neighbors, or even by not paying attention to them for some period of time.

Another indicator of how fish make decisions while turning is how frequently a focal fish pays attention to other individuals. We define the relative variation of the number of influential neighbors per fish Nif(t) between two successive time-steps as follows:
η(t) = |Nif(t + Δt) − Nif(t)| / Nif(t), (5)
denoting by Δt the time-step between frames (Δt = 0.02 s).
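A minimal sketch of this indicator is given below. It assumes that the relative variation is the absolute change in the number of influential neighbors between two successive frames divided by the number at the earlier frame, as described in the caption of Fig 8D. How frames with no influential neighbor are treated is not stated in the text; returning `None` for them is an assumption made here, and the function name and the toy counts are hypothetical.

```python
def relative_variation(n_if):
    """Return eta[k] = |N[k+1] - N[k]| / N[k] for successive frames.

    Frames where N[k] == 0 yield None because the ratio is undefined there
    (assumption: the source does not say how this case is handled).
    """
    eta = []
    for current, following in zip(n_if, n_if[1:]):
        if current == 0:
            eta.append(None)
        else:
            eta.append(abs(following - current) / current)
    return eta


# Toy series of influential-neighbor counts for one focal fish.
counts = [1, 1, 2, 2, 1, 0, 1]
eta = relative_variation(counts)

# Average over the frames where eta is defined.
defined = [value for value in eta if value is not None]
mean_eta = sum(defined) / len(defined)
print(eta, mean_eta)
```

Averaging such series across fish and U-turns, on the normalized time axis, would give a curve comparable to the one in Fig 8D.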

We have depicted the time-evolution of the average 〈η(t)〉 in Fig 8D, finding that 〈η(t)〉 remains essentially constant before, during and after the U-turn event, the amplitude of its variation being smaller than 10% of the signal (0.007 and 0.08, respectively).

Since the average number of influential neighbors is smaller when fish are engaged in the U-turn than right before or right after the U-turn, a constant average 〈η(t)〉 suggests that fish adjust their heading more frequently during the U-turn than outside the U-turn. Indeed, in the middle of a U-turn, no real common direction of motion exists (), that is, there is a high diversity of headings, so that fish have to frequently update their direction by paying attention to different neighbors.

Spatial organization of influential neighbors

We are now interested in determining the dynamical spatial organization of the influential neighbors of a focal fish. The relative state of a fish Fj with respect to a focal fish Fi is characterized by several parameters: the relative position of the neighbor (the position vector of Fj minus that of Fi, in Cartesian coordinates), the distance dij between them, the viewing angle of Fj relative to the direction of Fi [26], which is the angle θij with which Fi perceives Fj (note that θij is not necessarily equal to θji), the relative velocity (the velocity of Fj minus that of Fi), and the relative heading ϕij = ϕj − ϕi. All these quantities are time-dependent. We have calculated their average value for all the U-turns in a uniform spatial grid of square cells to facilitate the interpretation of the vector field of these continuous variables. Each square cell, of side 20 mm, shows the average of all the values falling in that cell, whose number varies from cell to cell.
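The quantities listed above can be computed as in the following sketch. Several choices here are assumptions rather than the authors' implementation: the heading of a fish is taken as the direction of its velocity, relative quantities are taken as neighbor minus focal (matching ϕij = ϕj − ϕi), positive viewing angles are to the left of the focal fish, and the grid is expressed in a frame where the focal fish heads along +x. All function names are hypothetical.

```python
import math


def wrap_angle(angle):
    """Wrap an angle in radians to the interval [-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


def relative_state(pos_i, vel_i, pos_j, vel_j):
    """Relative state of neighbor Fj as seen from focal fish Fi.

    Assumption: the heading of a fish is the direction of its velocity.
    """
    dx, dy = pos_j[0] - pos_i[0], pos_j[1] - pos_i[1]
    heading_i = math.atan2(vel_i[1], vel_i[0])
    heading_j = math.atan2(vel_j[1], vel_j[0])
    return {
        "relative_position": (dx, dy),
        "distance": math.hypot(dx, dy),
        "viewing_angle": wrap_angle(math.atan2(dy, dx) - heading_i),
        "relative_velocity": (vel_j[0] - vel_i[0], vel_j[1] - vel_i[1]),
        "relative_heading": wrap_angle(heading_j - heading_i),
    }


def grid_cell(state, cell_side=20.0):
    """Index of the square cell (side in mm) containing the neighbor,
    in a frame where the focal fish heads along the +x axis."""
    d, theta = state["distance"], state["viewing_angle"]
    x, y = d * math.cos(theta), d * math.sin(theta)
    return (math.floor(x / cell_side), math.floor(y / cell_side))


# Focal fish at the origin swimming along +x; neighbor ahead and to the left.
state = relative_state((0.0, 0.0), (100.0, 0.0), (50.0, 50.0), (0.0, 100.0))
print(state["distance"], state["viewing_angle"], grid_cell(state))
```

Accumulating the relative velocity per cell index and dividing by the number of samples in each cell would give an averaged vector field of the kind shown in Fig 9A.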

Fig 9A shows the density map of the relative position of the influential neighbor with respect to the focal fish when N = 2. The intensity of color is proportional to the frequency of occupation of the grid cell, showing that the influential neighbor is mostly located in front of the focal fish and at a distance of one to three body lengths from the focal fish. The same information is quantified in Panel B with a heat map in polar coordinates, highlighting the most frequent location of the influential neighbor.

The average relative velocity is shown in Fig 9A (arrows), superimposed on the density map. The vector field shows that when the influential neighbor is in front of or behind the focal fish (sin〈θij〉 ≈ 0), both fish move at similar speed, although the focal fish is slightly faster (the small black arrows point in the opposite direction to the red one), and the difference in heading is also small. However, when the influential neighbor is on the sides of the focal fish, relative speed and heading difference tend to vary more as the distance between them increases.

The distributions of distances dij and exposure angles θij between a focal fish and its neighbors are depicted in Panels C and D of Fig 9 respectively. We find, on the one hand, that their mean separation is 62.6 mm ± 29.7 mm (mean and standard deviation of the histogram in Fig 9C), a value that is consistent with previous results where it was shown that the behavioral reactions of a fish depend on the angular position of its neighbors, as a consequence of the anisotropic perception of the environment [15].

On the other hand, the distribution of the exposure angle of fish Fj to the focal fish Fi is narrower when Fj is influencing Fi than when Fj is a neighbor of Fi, not necessarily influencing Fi. As both distributions are centered on θij = 0, this shows that Fj is more frequently located in front of Fi when Fj is an influential neighbor of Fi than in the case when Fj is just a neighbor of Fi.

Fig 10 shows similar results for groups of N = 5 fish. Influential neighbors are more frequently located in front of the focal fish (although with a slight shift to the right; see Panels A and B) and at a mean distance of 67.5 mm ± 40.6 mm (Panel C).

In turn, the velocity field has a smaller intensity and is much more homogeneous than in the case where N = 2. A slight asymmetry can also be observed (not noticed when N = 2) with fish located in front and slightly to the right of the focal fish having a higher velocity than those located elsewhere. Moreover, the distribution of exposure angles is more dispersed than in the case of two fish, meaning that influential neighbors are exposed to the focal fish with a larger diversity of angles, something that is simply due to the higher number of fish.

The difference in the homogeneity of the velocity field between groups of 5 and 2 individuals is not necessarily the result of averaging over a larger number of individuals. Although averaging over fish data pairs may reduce the uncertainty in the extracted parameter values, it is well-known that the level of homogeneity in the direction of motion of the school increases with group size [29]. But one also ought to consider that specific values of delay and curvature the individuals adopt during the U-turns could help to limit variability in coordinating the group. Some theoretical studies support this idea: simplified models of velocity alignment with additive noise have shown semi-analytically the existence of delay and rate of turn values that minimise the fluctuations in the variance of the individual speed [30], and flocking models of self-propelled particles have also shown that delay can be tuned to increase stability and alignment of the group [31].

Finally, we have analyzed the variation of the time-delay τ as a function of both the distance between the focal fish and its influential neighbors dij and the difference of heading ϕij, finding that in both cases N = 2 and N = 5, the time-delay increases with respect to both the distance dij and the heading difference ϕij (see Fig 11). This result can be understood because during a U-turn the fish speed is decreasing and two fish can display larger reaction times the more separated they are and the less aligned they are.

Time delay τ extracted from the empirical observations as a function of heading difference ϕij and separation distance dij. (A) N = 2, (B) N = 5. In both cases, the larger the heading difference and the distance, the longer the time-delay.

A null model to detect spurious correlations

As already mentioned in the introduction, establishing causal influence on the basis of correlation measures requires controlling for spurious effects. Although our experimental data correspond to a specific collective behavior in which individuals influence each other, the relatively short time-windows over which cross-correlations are averaged and the use of several parameters through sensitivity analysis can weaken the accuracy of our results. To demonstrate that the particular detections of influential neighbors are not purely due to chance, we generated random artificial U-turn events by bootstrapping the data and applying the same procedure used to analyze collective U-turns in our experiments.

The null model is built for groups of 5 fish, for which our experimental data provide M = 2375 individual trajectories (5 × 475 collective U-turns). For every fish Fi, i = 1, …, M, the trajectory is rotated so that the individual turning point of the fish (where sin(θwi) = 0) is located in the upper part of the tank, by randomly sampling the new angular position ψi in the interval [π/2 − ξ, π/2 + ξ], where ξ is a small angle (we used ξ = π/12). Similarly, the time scale of each fish is shifted by sampling the instant of turning in the time interval [−ζ, ζ], where ζ is a short time (we have used ζ = 1 s). Then, five trajectories are randomly sampled, each one from a different randomly sampled collective U-turn, and mirrored if necessary so that the five individual U-turns are done in the same direction, clockwise or anti-clockwise. This way, the five fish of the artificial U-turn make their individual U-turn approximately at the same place and approximately the same time. For more details, see the section “Null model” in Materials and methods.
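The construction described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure used in the study: each trajectory is reduced to a hypothetical record of times and angular positions along the ring (radial position is ignored), mirroring is done by reflecting angles about the turning point, and the data layout and all function names are invented for the example. Only the values ξ = π/12 and ζ = 1 s come from the text.

```python
import math
import random

XI = math.pi / 12   # half-width of the angular interval (value from the text)
ZETA = 1.0          # half-width of the time interval, in seconds (from the text)


def align_trajectory(traj, target_direction, rng):
    """Rotate, time-shift and, if needed, mirror one individual trajectory.

    `traj` is a hypothetical record with keys "t" (times, s), "angle"
    (angular positions, rad), "k_turn" (index of the individual turning
    point) and "direction" (+1 or -1, sense of rotation after the turn).
    """
    psi = rng.uniform(math.pi / 2 - XI, math.pi / 2 + XI)  # new turning angle
    delta = rng.uniform(-ZETA, ZETA)                       # new turning time
    t_turn = traj["t"][traj["k_turn"]]
    a_turn = traj["angle"][traj["k_turn"]]
    # Mirror about the turning point when the sense of rotation differs.
    sign = 1 if traj["direction"] == target_direction else -1
    return {
        "t": [t - t_turn + delta for t in traj["t"]],
        "angle": [psi + sign * (a - a_turn) for a in traj["angle"]],
        "k_turn": traj["k_turn"],
        "direction": target_direction,
    }


def artificial_u_turn(u_turns, rng, group_size=5):
    """Combine fish drawn from distinct real U-turns into one artificial one."""
    chosen = rng.sample(u_turns, group_size)   # distinct collective U-turns
    target = rng.choice([1, -1])               # common sense of rotation
    return [align_trajectory(rng.choice(group), target, rng) for group in chosen]


def toy_trajectory(direction, offset):
    """Toy individual U-turn: the angle reverses its trend at frame 5."""
    times = [0.02 * k for k in range(11)]
    angles = [offset + direction * 0.01 * abs(k - 5) for k in range(11)]
    return {"t": times, "angle": angles, "k_turn": 5, "direction": direction}


rng = random.Random(0)
toy_u_turns = [
    [toy_trajectory(rng.choice([1, -1]), rng.uniform(0.0, 2 * math.pi))
     for _ in range(5)]
    for _ in range(8)
]
group = artificial_u_turn(toy_u_turns, rng)
print([round(fish["angle"][fish["k_turn"]], 3) for fish in group])
```

Because the five fish come from different recordings, any correlation detected between them in such a group cannot reflect real interaction, which is the purpose of the null model.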

We have produced 1000 artificial collective U-turns; S9 Fig shows a collection of 10 of them. The results of our analysis are shown in red in Figs 7 and 8. As expected, they reveal clear differences between artificial and experimental U-turns.

Fig 7A shows that in artificial U-turns a focal fish has no influential neighbor for more than 63% of the time, while in the experiments this proportion was less than 39%. The analysis also reveals that in artificial U-turns a focal fish has one influential neighbor for less than 28% of the time, while in the experiments the proportion rises to 43%. Similarly, Fig 8C shows that the average number of influential neighbors is much smaller in artificial U-turns (≈ 0.4) than in real U-turns, where it is almost always greater than 1. Note that the increase of this number during U-turns in artificial data is the consequence of the channeled motion of fish by the corridor. Moreover, the variation of 〈η(t)〉 along time (Fig 8D), including the transients preceding and following the U-turn, decreases in artificial U-turns while it remains constant and with a higher value in experiments.

Fig 7B shows that distance rank has no significant effect on which fish is the influential one, both in experiments and in artificial U-turns. The decreasing number of influential neighbors comes from the fact that the tank is circular and the method we use. If the tunnel had been a straight corridor, we should have detected no decrease in our null model. However, in a circular tank, because of the geometrical constraints imposed by the curvature, even when two fish are both swimming in the same direction (i.e., clockwise or anti-clockwise), as the distance between fish increases, our method will detect a decrease of correlation. While Fig 7C confirms that influential neighbors are slightly more often ranked in the first position of the group, this effect is much more pronounced in the experiments. In fact, Figs 7B, 7C and 7D and 8A and 8B show that the selected null model satisfactorily reproduces the typical spatiotemporal behavioral patterns of real U-turns: the position and turning ranks are almost identical, as well as the variation of the average speed and the average group polarization, although the V-shape of the average polarization in real U-turns is significantly sharper than in artificial U-turns.

An additional, albeit expected, result of our null model is the homogeneous (isotropic) spatial distribution of “influential neighbors”, while in real collective U-turns influential neighbors are mostly located in front of the focal fish; see S10A and S10B Fig, compared with Fig 10A and 10B.

Discussion

By sharing information with other group members, schooling fish and other collectively moving animals can potentially improve their navigational accuracy (e.g. the many wrongs principle [32]), take better decisions (e.g. to avoid a predator [33]), or improve their abilities to sense the environment [34]. However, there are both physical and practical reasons why information is expected to be shared with only a few neighbors. Physical reasons involve material limitations, such as visual occlusions. Practical reasons often refer to trade-offs between sharing information, so that the group collectively selects a direction of motion, and deciding independently [35, 36].

Assuming that correlations between fish behavior rely to some extent on a causal influence, our analysis reveals that in groups of H. rhodostomus, during a collective U-turn, at any moment in time each fish only pays attention to a small number of neighbors whose identity regularly changes. We also find that the phases during which a focal fish is affected by one or two influential neighbors are interspersed with other phases during which its movement appears uninfluenced by the movement of neighbors. Moreover, influential fish are mostly located in front of the focal fish. The distance between a focal fish and its influential neighbors is about two body-lengths and the relative exposure angle is smaller than 60 degrees.

Our results bring insights on the way information on the neighborhood is processed by fish. Instead of having a synchronous update based on a fixed number of neighbors (topological neighborhood) or on all neighbors located within a fixed distance (metric neighborhood), our results suggest an asynchronous updating that does not depend on the distance between a focal fish and its influential neighbors. A similar asynchronous updating scheme has been previously introduced by Bode et al. [37] in a flocking model showing that it can give rise to emergent topological interactions consistent with the measures done on starling flocks [38].

It is however worth noting that our experiments, performed on small group sizes, may have prevented us from detecting any influence of the distance, since each of the four neighbors is located between one and three body lengths away. In larger groups of fish moving in an unconstrained space, we expect the effective neighborhood of fish to result from the interplay between an asynchronous updating on a small number of neighbors and a modulation of the strength of interactions with the distance between fish [15].

Previous studies on the number and the spatial arrangement of influential neighbors led to different results depending on the species and on the procedure used to analyse the data. The work by Ballerini et al. [39] provides evidence that each bird within a starling flock (Sturnus vulgaris) coordinates its motion with a fixed number of closest neighbors, irrespective of their distance, while in mosquitofish (Gambusia holbrooki), one single nearest neighbor was sufficient to account for the large majority of the observed interaction responses [12]. In barred flagtails (Kuhlia mugil), it has been shown that different kinds of neighborhoods (Voronoi neighborhood and the k nearest neighbors, k ≈ 6 to 8) were compatible with experimental data in a tank [13]. Our study points to a low number of influential neighbors. There are multiple possible explanations for the differences in the number of interacting neighbors found across the scientific literature. (i) It is possible that different animal groups interact with different numbers of neighbors. (ii) Temporal factors are also important [37], as interactions can be integrated in time to produce effectively larger neighborhoods. Here, we propose a third explanation (iii) based on the consideration that interaction responses such as attraction, alignment and avoidance are qualitatively different mechanisms that rely on different sensory-motor responses and, consequently, on different interacting neighborhoods. In particular, attraction and repulsion require processing information about the position of neighbors, while alignment is intrinsically a response dependent on orientation and velocity.
These different interactions are likely to rely on different neural circuits (motion and form are typically processed by different brain areas in many animal groups [40, 41]) and hence might depend on different sets of influential neighbors: for instance, a focal individual could avoid collisions with its Voronoi neighbors, be attracted towards a different neighborhood of visually salient individuals and only process alignment information for one or two selected neighbors.

It is thus natural to suggest that influential neighbors are intrinsically associated with different interaction mechanisms, which might also explain why different studies point to different neighborhoods.

Our method for identifying influential neighbors is based on the computation of the time-dependent directional correlation between a focal fish and its neighbors. Of course, correlation does not imply causation, so that inferring causal influence between fish from directional correlation requires an extremely cautious methodology.

The methodology we propose here is based on two solid procedural cornerstones. First, the data used in our study were carefully selected from a clearly recognizable behavior, the collective U-turns, where influence from neighbors undoubtedly exists, and thus should be, to some extent, responsible for a fundamental part of the correlations detected by our method. Time-delay between individuals’ direction choices has already been used to measure the interactions between group members in animal flocking. Specifically, Nagy et al. [23] used correlation delay times to reconstruct flight hierarchies in flocks of pigeons. Their approach consisted in integrating delay times over the entire trajectory to obtain a “leadership mark” for each individual. Our assumption is instead that the time-delay results from the individuals’ behavior and their environment, which varies in time depending on the information being gathered. To detect the response delay of each individual, we have therefore followed the approach employed in [26], which allows for a change of delay over time. In fact, it is easy to show that the time delay between the same pair of fish is not constant, as revealed by our analysis of pairs of fish (see Materials and methods). Applying Nagy et al.’s method to different subsets of data in the same experiment, we found that the time delays between the same pair of fish vary substantially (see S2 Fig). The second methodological cornerstone is provided by the results of the null model, which clearly show that the correlations we detected come from causal influence between neighbors and not from spurious random coincidences. The results of the null model also confirm that distance rank has no effect.

Identifying the number and position of influential neighbors is an essential step towards reconstructing behavioral cascades of information propagation across a group. Our method provides an accurate basis for mapping interaction networks that does not rely on any assumption about the channel (e.g., vision, sound or hydrodynamic interactions) mediating information transfer. We are confident that by adopting our technique to map interactions in different species and different experimental contexts we will gain a much more detailed understanding of the distributed information processing taking place in fish schools.

Materials and methods

Ethics statement

Our experiments have been approved by the Ethics Committee for Animal Experimentation of the Toulouse Research Federation in Biology N°1 and comply with the European legislation for animal welfare.

Experimental procedures and data collection

Hemigrammus rhodostomus (rummy-nose tetras, Fig 12A) were purchased from Amazonie Labège (http://www.amazonie.com) in Toulouse, France. Fish were kept in 150 L aquariums on a 12:12 hour, dark:light photoperiod, at 27.7°C (±0.5°C) and were fed ad libitum with fish flakes. The average body length of the fish used in these experiments was 31 mm (± 2.5 mm). The experimental tank (120 × 120 cm) was made of glass and was set on top of a box to isolate fish from vibrations. The setup was placed in a chamber made by four opaque white curtains surrounded by four LED light panels to provide an isotropic lighting. A ring-shaped corridor was set inside the experimental tank filled with 7 cm of water of controlled quality (50% of water purified by reverse osmosis and 50% of water treated by activated carbon) heated at 28.1°C (±0.7°C) (Fig 12B). The corridor was made of a vertical circular outer wall of radius 35 cm and a circular inner wall with a conic shape of radius 25 cm at the bottom, so that the effective width of the corridor available to fish for swimming ranges from 10 cm at the bottom to 12 cm at the surface. The conic shape was chosen to avoid the occlusion on videos of fish swimming too close to the inner wall. Fish were randomly sampled from their breeding tank for a trial and were used at most in only one experiment per day. Groups of 2 or 5 fish were introduced in the experimental tank and acclimatized to their new environment for a period of 10 minutes. Their behavior was then recorded for one hour by a Sony HandyCam HD camera filming from above the setup at 50 images per second in HDTV resolution (1920x1080p). We performed 10 trials for each group size of 2 and 5 fish.

Data extraction and pre-processing

The positions of each fish on each frame were tracked with idTracker 2.1 [10]. Fish were sometimes misidentified by the tracking software, for instance when two fish were swimming too close to each other for a long period of time. In those cases, the missing positions were corrected manually. All sequences with 50 consecutive missing positions or fewer were interpolated. Longer sequences of missing values were checked by eye to determine whether interpolating was reasonable or not; if not, that is, if the trajectory did not look like a straight line, merging positions with those of the closest neighbors was considered. Time series of positions were converted from pixels into meters. The origin of the coordinate system was set to the center of the ring-shaped tank. The body orientation of fish was measured using the first axis of a principal component analysis of the fish shapes detected by idTracker 2.1.

Detection and quantification of collective U-turns

Since the experiments were performed in an annular setup, the direction of rotation can be converted into a binary value: clockwise or anti-clockwise. We choose the anti-clockwise direction as the positive values for angular position. Before a U-turn event, all fish move in the same direction, say clockwise. Then, one fish, not necessarily the one located at the front of the group, changes its direction of motion to anti-clockwise direction. After a short transient, the other fish of the group display the same direction change, from clockwise to anti-clockwise. We defined the whole process of changing direction as a collective U-turn (see examples in Fig 1 and in S8 Fig). After data extraction and pre-processing, we found 1111 and 475 collective U-turns in groups of 2 and 5 fish, respectively. The duration distribution of collective U-turns in groups of 2 fish is shown in S3 Fig while the results for groups of 5 fish are shown in S4 Fig. Most of the collective U-turns last between 1 and 3 seconds, while the individual turning time usually lasts between 0.4 and 1 second.

The procedure used to define an individual U-turn for a fish Fi is as follows: we first determine the time tm,i at which the angle of incidence of fish Fi changes sign (from negative to positive or vice versa). Then, starting from tm,i, we go backward in time step by step until the first time at which the absolute value of the angle of incidence exceeds a first threshold. We denote this time by ts,i. Similarly, we start again from tm,i and go forward step by step until the first time at which the absolute value of the angle of incidence exceeds a second threshold. We denote this time by te,i. To determine the values of the two thresholds, we first compute the moving average of the angle of incidence over a period of 50 time steps (1 s in real time) before and after the middle point tm,i, respectively, with a window of 5 time steps (0.1 s in real time). Then we set the threshold values as the maximum values of the absolute moving average. Doubling the length of the period of time over which the average is computed, or doubling the width of the window, does not affect the results. Finally, the time at which the collective U-turn starts (resp. ends) is defined by (resp. ).
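The search for the start and end of an individual U-turn can be sketched as below. This is a simplified illustration: the two thresholds are passed in as plain numbers (the moving-average step that sets them in the study is not reproduced here), and the function names, the toy signal and the threshold value 0.42 rad are hypothetical.

```python
def sign_change_index(theta):
    """Index of the first frame whose sign differs from the previous frame."""
    for k in range(1, len(theta)):
        if theta[k - 1] * theta[k] < 0:
            return k
    return None


def individual_u_turn(theta, threshold_start, threshold_end):
    """Return frame indices (start, middle, end) of an individual U-turn.

    `theta` is the angle of incidence per frame. The thresholds are given
    as inputs here (assumption); the function returns None when the angle
    of incidence never changes sign.
    """
    middle = sign_change_index(theta)
    if middle is None:
        return None
    # Walk backward until |theta| first exceeds the start threshold.
    start = middle
    while start > 0 and abs(theta[start]) <= threshold_start:
        start -= 1
    # Walk forward until |theta| first exceeds the end threshold.
    end = middle
    while end < len(theta) - 1 and abs(theta[end]) <= threshold_end:
        end += 1
    return start, middle, end


# Toy angle of incidence: a linear ramp crossing zero between frames 10 and 11.
theta = [0.1 * (k - 10.5) for k in range(22)]
print(individual_u_turn(theta, 0.42, 0.42))
```

If the threshold is never exceeded, the walk stops at the first or last available frame, so a caller may want to treat results that touch the boundaries as unreliable.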

Position rank in a group

The relative position of a fish Fi in a group of N fish is calculated by projecting the position vector of the fish onto the average group velocity vector. This defines an axis along the direction of motion of the group, with respect to which the fish are ranked: the first fish in the group is the fish whose projection is the most advanced in the direction of motion of the group, the second fish in the group is the second most advanced, and so on. Relative distances between fish are not taken into account when establishing the rank.
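A minimal sketch of this ranking is shown below, assuming two-dimensional positions and velocities given as coordinate pairs; the function name and the sample values are hypothetical.

```python
import math


def position_ranks(positions, velocities):
    """Rank fish by their projection on the average group velocity.

    Returns a list where entry i is the rank of fish i (1 = most advanced
    in the direction of motion of the group).
    """
    n = len(positions)
    mean_vx = sum(v[0] for v in velocities) / n
    mean_vy = sum(v[1] for v in velocities) / n
    norm = math.hypot(mean_vx, mean_vy)
    if norm == 0:
        raise ValueError("average group velocity is zero; direction undefined")
    ux, uy = mean_vx / norm, mean_vy / norm
    # Scalar projection of each position on the group direction of motion.
    projections = [p[0] * ux + p[1] * uy for p in positions]
    order = sorted(range(n), key=lambda i: projections[i], reverse=True)
    ranks = [0] * n
    for rank, fish in enumerate(order, start=1):
        ranks[fish] = rank
    return ranks


# Five fish moving roughly along +x (positions in mm, velocities in mm/s).
positions = [(0, 0), (30, 5), (-20, -3), (60, 0), (10, 8)]
velocities = [(100, 0), (90, 10), (110, -10), (100, 5), (100, -5)]
print(position_ranks(positions, velocities))
```

Only the ordering of the projections matters, so the choice of origin for the position vectors does not change the ranks.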

Optimal setting parameters for influential neighbors identification

Four parameters are used to identify influential neighbors: the time-delay τ, the window size w, the correlation threshold Cmin above which individuals are supposed to be interacting, and the threshold ε for selecting more than one influential fish.

The time delay must be specified along the whole trajectory of the focal fish: it is thus a series of values , where M is the number of time-steps or frames in the individual U-turn. The parameters Cmin, ε and w are in turn given for all time and for all fish by means of a sensitivity analysis described in the next section.

Assume for now that the three values Cmin, ε and w are known, and denote by Fi the focal fish and by Fj one of its neighbors. Then, the series of time-delays is built recursively as follows (actually only w is required to extract the time delays).

Denote by Γi(tk) the highest value of the pairwise directional correlation Cij of the velocity of fish Fi at time tk with the velocity of Fj at each time-step in the range of the previous time-steps :
(6)
Then, the time-delays , k = 1, …, Mi, are determined by the smallest value of the time-delay τr ∈ Rk where Γi(tk, w) reaches its maximum. For t1, the maximum correlation is reached at , for some time-delay . We set for the initial value of the recurrence. For the rest of time-delays , k = 2, …, Mi, the size of Rk is based on the assumption that if, at some time t, Fi copies the behavior that Fj displayed at a previous time t − τ, then, after time t, Fi will not copy the behavior that Fj displayed at any time earlier than t − τ.

Time-delays obtained with more complicated and time consuming procedures such as the time-ordered technique developed in [26] or through the similarity analysis based on Fréchet distances [25] would in principle produce similar values.

Fig 13B shows the distribution of time-delays obtained with this procedure in groups of two fish. The distribution is clearly bimodal, with a first peak at τ = 0 and a second one around τ = 0.4 s. Considering a reaction time threshold of 50-100 ms for a fish to integrate information and reach a decision [42], we cannot attribute small values of time-delays to situations where the behavioral decision of the focal fish has been influenced by its neighbors. This is confirmed by the analysis of the spatial distribution of the extracted time-delays (Fig 13A), where we show that the lowest average values of τ are found mostly when the neighbor was behind the focal fish, in a zone with the lowest perception [15], while the highest values of τ > 0.4 s are found when the neighbor is located in front of the focal fish. This has led us to consider in our analyses only situations where τ > τR = 0.04 s.

(A) Spatial distribution of time-delays obtained by selecting the maximum of the pairwise correlation between the focal fish and its neighbor. The color of each bin represents the mean value of all the cases in that bin. An angle of 0° corresponds to the influential neighbor being in front of the focal individual. (B) Spatially integrated distribution of time-delays. The data here are the same as those used in panel A. The dotted line corresponds to the reaction time threshold τR = 0.04 s.

Parameter selection

Although the time-delays are determined once w is known, they also strongly depend on Cmin and ε, as the value of these three parameters must be fixed at the same time. This is done by means of a sensitivity analysis in which we have tested the following 40 combinations of parameter values: w ∈ {0, 1, 2, 3, 4}, ε ∈ {3, 5}, and Cmin ∈ {0.995, 0.99, 0.95, 0.5}.

Each combination (Cmin, ε, w) gives rise to four histograms like those depicted in Fig 7. These histograms constitute the solution of our method of analysis, and can be characterized by a vector in 19 dimensions: (i) the 5 proportions of the number of influential neighbors in groups of 5 fish, (ii) the 4 proportions of their distance rank, (iii) the 5 proportions of their position rank, and (iv) the 5 proportions of their turning rank. This allows us to determine how similar the results arising from two different combinations of (Cmin, ε, w) are, by computing the cosine similarity of the two corresponding vectors.

The cosine similarity of two vectors u and v, denoted here S(u, v), is the cosine of the angle between these two vectors: S(u, v) = u · v / (‖u‖ ‖v‖). Thus, two collinear vectors are such that S(u, v) = 1 independently of their magnitude, while two perpendicular vectors are such that S(u, v) = 0. In our case, the components of the vectors are positive, so 0 ≤ S(u, v) ≤ 1 for any two combinations of parameter values. Moreover, as the components are proportions, collinearity implies identity, both in direction and magnitude. Thus, S(u, v) = 1 means that both results are identical, while S(u, v) = 0 means that they differ as much as possible.
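As a minimal sketch of this comparison, the following code computes the cosine similarity of two result vectors and enumerates the 40 parameter combinations listed earlier. The function name and the short toy vectors are illustrative assumptions (the vectors in the study have 19 components); this is not the code used by the authors.

```python
import itertools
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Parameter values tested in the sensitivity analysis (from the text).
C_MIN_VALUES = [0.995, 0.99, 0.95, 0.5]
EPSILON_VALUES = [3, 5]
W_VALUES = [0, 1, 2, 3, 4]
parameter_grid = list(itertools.product(C_MIN_VALUES, EPSILON_VALUES, W_VALUES))
print(len(parameter_grid))

# Toy result vectors of proportions (hypothetical values, 4 components only).
result_a = [0.4, 0.4, 0.15, 0.05]
result_b = [0.35, 0.45, 0.15, 0.05]
print(cosine_similarity(result_a, result_b))
```

Evaluating `cosine_similarity` for every pair of combinations in `parameter_grid` would fill a symmetric 40 × 40 matrix with ones on the diagonal, of the kind shown in S5 Fig.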

S5 Fig shows the cosine similarity matrix for the 40 combinations we have tested. Note that the matrix is symmetric with respect to the diagonal, where the similarity equals 1. Except for Cmin = 0.5, all similarity values are in the narrow range [0.96, 1], showing that all combinations yield practically the same results. The highest dissimilarity is found in the white-yellow lines, where one of the combinations is (Cmin, ε, w) = (0.5, 3, 2).

The selection of parameter values is thus done as follows.

We choose w = 2, which corresponds to the higher dissimilarity regions. The selected time window size is sufficiently large that the jagged nature of the movement data is smoothed out, but not so large that the actual turns get washed out from the data.

Using ε = 3 or ε = 5 yields very similar results and we have arbitrarily chosen ε = 3.

The selection of Cmin is done by a specific procedure, which consists in calculating the number of data points that remain available for our analysis for each value of Cmin. S6 and S7 Figs show that the larger Cmin is, the fewer data points remain available, and vice versa. We might be inclined to choose a sufficiently small Cmin in order to get the maximum number of data points. However, according to our definition of influential neighbor, Cmin should be sufficiently large to select only the real influential neighbors. We have thus chosen the highest value which provides a sufficiently large number of data points, that is, the largest value before the fall of the number of data points in S11 Fig, Cmin = 0.95. This value preserves 61% (23830) and 76% (69703) of data points for N = 2 and N = 5 respectively.
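The retained percentages quoted above follow from the counts reported in the captions of S6 and S7 Figs, taking the number of data points at Cmin = 0.5 as the baseline. The short check below is only a sketch; the helper name is ours.

```python
def retained_fraction(n_at_cmin, n_at_baseline):
    """Fraction of data points kept at a given Cmin, relative to Cmin = 0.5."""
    return n_at_cmin / n_at_baseline

# (points kept at Cmin = 0.95, points available at Cmin = 0.5), per group size N.
# Counts are those given in the captions of S6 Fig (N = 2) and S7 Fig (N = 5).
counts = {2: (23830, 39381), 5: (69703, 91827)}

for n_fish, (kept, baseline) in counts.items():
    pct = 100 * retained_fraction(kept, baseline)
    print(f"N = {n_fish}: {pct:.0f}% of data points retained")
```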

Null model of collective U-turns

We want to design artificial collective U-turns in groups of 5 fish where all fish perform an individual U-turn at more or less the same place and more or less the same time, and in the same direction (clockwise or anti-clockwise). Fish must coincide in time and space to constitute a “group”, but individual U-turns must happen in an absolutely independent way. Correlations at hand in this paper are thus reduced to a minimum, while preserving the general aspect of a group of fish changing direction.

Our experimental data provide us with 5 × 475 = 2375 trajectories of individual fish, which we have conveniently normalized and combined to build 1000 groups of 5 fish changing direction in the same spatiotemporal interval. This is done as follows.

The whole trajectory of a fish Fi during a U-turn takes place in an interval of time [ts,i, te,i], where ts,i is the instant at which the individual U-turn of fish Fi starts, and te,i is the time at which the individual U-turn ends. See the paragraph above Eq (1). The trajectory of fish Fi in radial coordinates is given by
{(ρi(tk), ψi(tk)) : k = 1, …, Ni}, (7)
where ρi(tk) is the radius (distance of the fish from the center of the tank), ψi(tk) the already defined angle position (computed anticlockwise as positive), and Ni is the number of time-steps tk in the trajectory.

Denote by Ti the instant at which fish Fi effectively turns, i.e., Fi is perpendicular to the wall: sin(θwi(Ti)) = 0. In well-defined individual U-turns such as the ones we are using in our data, this happens only once per U-turn. Accordingly, (ρi(Ti), ψi(Ti)) denotes the fish position at time Ti.

Although we would like to have absolutely uncorrelated fish, it would not make sense to use groups of trajectories that do not reproduce a consistent U-turn, e.g., if one fish makes its U-turn much later than another, or on the other side of the tank. We thus try to decorrelate fish trajectories as much as possible, while preserving at the same time the typical spatiotemporal shape of real collective U-turns.

The decorrelation of all individual U-turns is done with the following two steps:

Spatial rotation: For all individual fish Fi in all U-turns, we rotate its trajectory by an angle −ψi(Ti) + π/2 + ξi, where ξi is a random number sampled uniformly in [−π/12, π/12], so that the new location of fish Fi at the time Ti when it performs its individual U-turn is in the upper part of the tank around π/2, in [5π/12, 7π/12].

Time shift: For all individual fish Fi in all U-turns, we shift the time scale by a value −Ti + ζi, where ζi is a random number sampled uniformly in [−1, 1] s, so that Fi makes its individual U-turn at around time 0, in [−1, 1] seconds.
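The two decorrelation steps can be sketched as follows. The representation of a trajectory as a list of (t, ρ, ψ) samples and the function name `decorrelate` are assumptions made for illustration; angles are left unwrapped for simplicity.

```python
import math
import random

def decorrelate(trajectory, t_turn, psi_turn, rng):
    """Rotate and time-shift one individual U-turn.

    trajectory: list of (t, rho, psi) samples (hypothetical layout).
    t_turn, psi_turn: time T_i and angular position psi_i(T_i) of the turn.
    Returns the transformed trajectory and the random offsets (xi, zeta).
    """
    xi = rng.uniform(-math.pi / 12, math.pi / 12)   # angular jitter
    zeta = rng.uniform(-1.0, 1.0)                   # time jitter, in seconds
    d_psi = -psi_turn + math.pi / 2 + xi            # spatial rotation
    d_t = -t_turn + zeta                            # time shift
    # The radius is unchanged: the rotation is about the center of the tank.
    shifted = [(t + d_t, rho, psi + d_psi) for t, rho, psi in trajectory]
    return shifted, xi, zeta
```

After the transformation, the sample that was at (Ti, ψi(Ti)) sits at time ζi and angle π/2 + ξi, as described in the two steps above.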

The artificial collective U-turn is thus built as follows:

Select randomly 5 real collective U-turns, and, from each collective U-turn, select randomly one trajectory. Rotate and time-shift trajectories according to the process described above.

Select randomly one of the 5 fish as the fish of reference Fref for building the artificial U-turn. If necessary, mirror the trajectories of other fish so that all fish move in the same direction as Fref with respect to the center of the tank, i.e., clockwise or anti-clockwise.

Then, the fish of reference of the artificial U-turn will make its individual U-turn at time ζref ∈ [−1, 1] s and position (ρref(Tref), π/2 + ξref). The other four fish Fj will make their individual U-turn at time ζj ∈ [−1, 1] s and position (ρj(Tj), π/2 + ξj) respectively, for j = 1, …, 5, j ≠ ref.
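The selection part of this construction can be sketched as below. Several things here are assumptions of the sketch rather than details given in the text: the 5 collective U-turns are drawn without repetition, each trajectory carries a hypothetical direction label (+1 for anticlockwise, −1 for clockwise), and the mirroring itself is only flagged, not performed.

```python
import random

GROUP_SIZE = 5

def draw_artificial_group(directions, rng):
    """Pick the members of one artificial collective U-turn.

    directions[u][f]: hypothetical direction label (+1 or -1) of fish f
    in real collective U-turn u.
    Returns (members, ref, mirror): members is a list of (u, f) pairs,
    ref is the index of the reference fish within members, and mirror[j]
    tells whether trajectory j must be mirrored to match the reference.
    """
    # One trajectory from each of 5 distinct real collective U-turns.
    uturns = rng.sample(range(len(directions)), GROUP_SIZE)
    members = [(u, rng.randrange(GROUP_SIZE)) for u in uturns]
    # Random reference fish; the others are mirrored if they move the other way.
    ref = rng.randrange(GROUP_SIZE)
    ref_u, ref_f = members[ref]
    ref_dir = directions[ref_u][ref_f]
    mirror = [directions[u][f] != ref_dir for u, f in members]
    return members, ref, mirror
```

Calling this function 1000 times, and applying the rotation and time shift to each selected trajectory, would produce a set of artificial groups of the size used here.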

We have depicted in S9 Fig a set of artificial U-turns for comparison with the real U-turns shown in S8 Fig. Note that in these figures the time-scale has been shifted again so that collective U-turns start at t = 0 s.

S2 Fig. Different values of τ* for different subsets of the same data set computed with the method of Nagy et al. [23].

Consider the dataset of U-turns of 2 fish composed of U-turn number 1 to U-turn number 36, coming from the same experiment, and divide it into two subsets SA and SB containing respectively the U-turns [1,…,18] and the U-turns [19,…,36]. (A) Average directional correlation Cij with respect to time-delay τ for the U-turns from dataset SA. Red star and dashed blue vertical line denote τ* = 0.96. (B) Cij for the U-turns from dataset SB. Red star: τ* = 0.32. (C) Cij for all the U-turns in data set SA ∪ SB. Red star: τ* = 0.80. The method of Nagy et al. is based on the assumption that the pairwise interaction between two individuals in a group has a constant time-delay τ*. However, Panels A and B provide different values of τ* for different data sets, showing that the method of Nagy et al. is not suitable for studying our data, and that the method we introduce here, which is based on the detection of dynamic time-delays, has potential for a broader range of applications.

S4 Fig. Distribution of the average duration (in seconds) of (A) individual and (B) collective U-turns in groups of 5 fish.

S5 Fig. Cosine similarity matrix for the 40 parameter combinations tested.

Matrix of 40 × 40 square cells, where each cell corresponds to the similarity value SV arising from the comparison of the two parameter combinations shown in the corresponding horizontal and vertical axes. We considered 40 parameter combinations, hence the size of the matrix. The similarity value SV is represented by the color of the cell, where the brightest red color corresponds to SV = 1 and the white color to SV = 0.92. For instance, the top-left cell displays a similarity value of SV = 0.95, showing how similar the results are when comparing the two combinations {ε = 5, Cmin = 0.995, w = 0} (horizontal axis) and {ε = 3, Cmin = 0.5, w = 4} (vertical axis). Cells along the diagonal correspond to the comparison of two identical parameter combinations and therefore SV = 1 there.

S6 Fig. Available data for different values of the average directional correlation threshold Cmin in the case of N = 2 fish.

Small panels: (there are 10, one per experiment) Number of data points available from the respective experiment for each value of Cmin in [0.5, 1]. The values of Cmin are denoted by small circles. Three specific values are shown by arrows: 0.6, 0.95 and 0.995. The value highlighted in red corresponds to the value we chose and is denoted by a star instead of a circle. Each vertical line corresponds to the fish that is taken as being the focal fish: F1 (red) and F2 (cyan). For instance, selecting Cmin = 0.6 in the upper-left small panel, 700 data points will be available for both fish. For Cmin = 0.95, around 450 points will be available for both fish.

Leftmost higher panel: Total number of data points available from all fish from all the experiments (summary of the 10 small panels, i.e., there is only one –pink– line). Vertical axis: ratio between the available number of data points for Cmin and the number of data points available for Cmin = 0.5. Total data points available from all the experiments (for Cmin = 0.5): 39381; data points available for Cmin = 0.95: 23830.

S7 Fig. Available data for different values of the average directional correlation threshold Cmin, in the case of N = 5 fish.

Small panels: (there are 10, one per experiment) Number of data points available from the respective experiment for each value of Cmin in [0.5, 1]. The values of Cmin are denoted by small circles. Three specific values are shown by arrows: 0.6, 0.95 and 0.995. The value highlighted in red corresponds to the value we chose and is denoted by a star instead of a circle. Each vertical line corresponds to the fish that is taken as being the focal fish: F1 (red), F2 (yellow), F3 (green), F4 (blue) and F5 (magenta). For instance, selecting Cmin = 0.6 in the third small panel of the upper row, 55 data points will be available for each one of the 5 fish. For Cmin = 0.95, around 75 points will be available for each fish.

Leftmost higher panel: Total number of data points available from all fish from all the experiments (summary of the 10 small panels, i.e., there is only one –pink– line). Vertical axis: ratio between the available number of data points for Cmin and the number of data points available for Cmin = 0.5. Total data points available from all the experiments (for Cmin = 0.5): 91827; data points available for Cmin = 0.95: 69703.

S11 Fig. Number of available data points for different values of Cmin.

Solid black line: Remaining data points for each value of Cmin for N = 2 according to the leftmost panel in S6 Fig. Red line: same thing, for N = 5, according to S7 Fig. Dashed line: highest number of available data points before the sharp fall of the black curve at Cmin = 0.95.

Acknowledgments

We would like to acknowledge Schloss Dagstuhl Leibniz Zentrum für Informatik in Germany and all the participants at the seminar 16022 on Geometric and Graph-based Approaches to Collective Motion organized in 2016, where many ideas developed in this paper were discussed. In particular we would like to thank Martin Beye, Anael Engel, Marc van Kreveld, Frank Staals and Goce Trajcevski. We also thank Pierre Tichit and Gérard Latil for technical assistance.