Abstract Extensive research on multisensory processing has established that temporal and/or spatial proximity of sensory information leads to the percept of a unified multisensory event, at both the behavioral and the neuronal level (e.g., Stein, Huneycutt, & Meredith, 1988). Binding of multiple sensory inputs has also been demonstrated for stimuli presented with a certain degree of temporal disparity (e.g., Vatakis & Spence, 2010). A classical example of crossmodal interaction is the well-known sound-induced flash illusion (SIFI), whereby a brief flash paired with two auditory beeps is perceived as two distinct flashes (Shams, Kamitani, & Shimojo, 2000). The SIFI is considered an example of auditory dominance, in which auditory stimulation modulates visual perception for audiovisual presentations that fall within the temporal window of integration. Studies on the SIFI (Andersen, Tiippana, & Sams, 2004; Shams, Kamitani, & Shimojo, 2002) have shown diminished performance in 1 flash-2 beeps presentations (the SIFI) and in 2 flash-1 beep presentations (considered different from the SIFI but not yet explained), whereas performance in 1 flash-1 beep presentations is excellent. That is, presenting unequal numbers of inputs (2 versus 1) from different sensory modalities in close temporal and spatial proximity affects participant performance, whereas presenting equal numbers of inputs does not. We claim that the diminished performance in the 1 flash-2 beeps and 2 flash-1 beep conditions does not reflect two different illusions; rather, both are examples of crossmodal binding rivalry between the unequal numbers of sensory inputs presented. That is, presenting multiple sensory inputs in close spatial and temporal proximity leads to a rivalry between the inputs that are to be integrated. The strength of this rivalry will depend on several factors known to affect multisensory integration.
As has been previously shown, a unified multisensory percept is more robust if the visual input is presented slightly before or in synchrony with the auditory input (Keetels & Vroomen, 2012; van Wassenhove, Grant, & Poeppel, 2007; Vatakis & Spence, 2007, 2008). When the auditory input precedes the visual input, binding is weaker, leading to a less integrated percept. Moreover, binding is highly dependent on timing, with temporally proximal presentations taking precedence over distal presentations (e.g., Vatakis & Spence, 2010). Thus, asynchronous stimuli, even when presented within the temporal window of integration, are bound with different strengths, and synchronous presentations yield the strongest binding. These findings motivate our crossmodal binding rivalry hypothesis: we propose that the rivalry between unequal numbers of sensory inputs varies with the robustness of their binding. When the visual input is synchronous with or leads the auditory input, binding is robust, leading to a stronger rivalry with the spare stimulus. This stronger rivalry results in a lower percentage of illusory percepts and slower reaction times. If, on the other hand, the binding between the auditory and visual inputs is weak, the rivalry between them and the spare stimulus is less intense, resulting in faster responses and more frequent illusory percepts. In general, we expect slower reaction times in illusion conditions than in conditions with equal numbers of visual and auditory inputs, and more accurate responses in bimodal conditions with equal numbers of inputs than in unimodal conditions. We have tested the rivalry hypothesis directly by using the classical SIFI with multiple timing presentations, which have never been tested together in one experimental set-up. More specifically, we used stimulus onset asynchronies of 0, 25, 50, and 100 ms, with the auditory beep presented before or after the visual flash.
Illusion conditions and test conditions were intermixed to avoid biased responding (in terms of the number of flashes reported) and to ensure that the task was not too difficult for participants. The proposed project will allow us to evaluate the rivalry hypothesis for multiple audiovisual inputs, providing a common explanation for both the 1 flash-2 beeps and the 2 flash-1 beep presentations, while also allowing us to revisit the role of auditory dominance in the double flash illusion.
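The design described above (onset asynchronies of 0, 25, 50, and 100 ms with the beep before or after the flash, and intermixed illusion and test conditions) might be laid out as in the following minimal Python sketch. It is illustrative only and is not the authors' experimental code. The sign convention for the asynchronies, the condition list (only the three flash-beep pairings named in the abstract), the number of repetitions, and the names `SOAS_MS`, `CONDITIONS`, and `build_trials` are all assumptions.

```python
import random
from itertools import product

# Signed onset asynchronies in ms. Assumed convention: negative values mean
# the beep precedes the flash, positive values mean the beep follows it.
SOAS_MS = sorted({sign * soa for soa in (0, 25, 50, 100) for sign in (-1, 1)})

# (flashes, beeps, kind). Only the pairings named in the abstract are listed;
# any further test or unimodal conditions are omitted here.
CONDITIONS = [
    (1, 2, "illusion"),  # 1 flash-2 beeps (the SIFI)
    (2, 1, "illusion"),  # 2 flash-1 beep
    (1, 1, "test"),      # equal number of inputs
]


def build_trials(reps, seed):
    """Return a shuffled list of trials so that conditions are intermixed.

    `reps` (repetitions per condition and asynchrony) and `seed` are
    hypothetical parameters chosen for this sketch.
    """
    trials = [
        {"flashes": flashes, "beeps": beeps, "kind": kind, "soa_ms": soa}
        for (flashes, beeps, kind), soa in product(CONDITIONS, SOAS_MS)
        for _ in range(reps)
    ]
    # A seeded generator keeps the intermixed order reproducible.
    random.Random(seed).shuffle(trials)
    return trials


if __name__ == "__main__":
    for trial in build_trials(reps=1, seed=0)[:5]:
        print(trial)
```

Shuffling the full list, rather than blocking by condition, is one simple way to intermix illusion and test trials so that participants cannot anticipate the number of flashes.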