The majority of work on the perception of transparency has focused on static images with luminance-defined contour junctions, but recent work has shown that dynamic image sequences with dynamic image deformations also provide information about transparency. The present study demonstrates that when part of a static image is dynamically deformed, contour junctions at which deforming and nondeforming contours are connected facilitate the deformation-based perception of a transparent layer. We found that the impression of a transparent layer was stronger when a dynamically deforming area was adjacent to static nondeforming areas than when presented alone. When contour junctions were not formed at the dynamic–static boundaries, however, the impression of a transparent layer was not facilitated by the presence of static surrounding areas. The effect of the deformation-defined junctions was attenuated when the spatial pattern of luminance contrast at the junctions was inconsistent with the perceived transparency related to luminance contrast, while the effect did not change when the spatial luminance pattern was consistent with it. In addition, the results showed that contour completions across the junctions were required for the perception of a transparent layer. These results indicate that deformation-defined junctions that involve contour completion between deforming and nondeforming regions enhance the perception of a transparent layer, and that the deformation-based perceptual transparency can be promoted by the simultaneous presence of appropriately configured luminance and contrast—other features that can also by themselves produce the sensation of perceiving transparency.

Introduction

Many materials refract light passing through the air–material interface. When the material has a curved surface, it often causes complex image deformations in any background patterns. Such image deformations, generated by a transparent material, can be a cue that causes the visual system to recognize a transparent layer. Although an observer can easily perceive image deformation due to refraction in a still image, it has been shown that static image deformations do not serve as a strong cue to perceptual transparency, as shown in figures 1 and 12 of Sayim and Cavanagh (2011). Some previous studies have reported that static image deformations produce only weak impressions of water and hot air (Kawabe & Kogovšek, 2017; Kawabe, Maruya, & Nishida, 2015). However, dynamic image deformations do serve as an effective cue for an observer to perceive a transparent layer (Kawabe & Kogovšek, 2017; Kawabe, Maruya, & Nishida, 2014, 2015). Here we need to pay attention to the fact that dynamic image deformations do not always come from the optical perturbation of the background by transparent materials. They may come from the physical deformation of a material itself as in the case of a flapping flag or a dynamic facial expression. In such situations, the visual system needs to resolve some ambiguity about the source of dynamic image deformations.

Previous studies have shown that a specific band in the spatiotemporal frequency of an image deformation served as a critical cue to the recognition of a transparent rigid layer (Kawabe et al., 2014), a transparent liquid layer (Kawabe et al., 2015), and transparent hot air (Kawabe & Kogovšek, 2017). Specifically, the visual system uses dynamic image deformations having a temporal frequency of 2–10 Hz and a spatial frequency of 0.13 to 1.03 cycles per degree (cpd) as diagnostic visual features, suggesting the existence of a transparent water layer (Kawabe et al., 2015). In addition, decreasing the magnitude of an image deformation that has the spatiotemporal deformation frequency optimally suggestive of transparent water alters the perceptual bias from water to hot air (Kawabe & Kogovšek, 2017).

In the present study, we examined whether visual features other than the spatiotemporal deformation frequency played a role in enhancing the perception of a transparent layer. Specifically, by means of deformation maps that consisted of two-dimensional band-pass noise, we deformed part of a natural image so that deformation-defined contour junctions were generated between deforming and nondeforming regions, and examined how the presence of the deformation-defined junctions affected the perception of a transparent layer.

Traditionally, vision science researchers have proposed that luminance information plays a fundamental role in the perception of transparency. Metelli (1974) tried to explain transparency perception on the basis of a physical model called the Episcotister model, which is based on additive color mixing. Subsequent studies argued against this physical model–based explanation, and instead proposed that spatial luminance patterns called contour junctions provide a visual cue for perceiving transparency (Adelson & Anandan, 1990; Beck & Ivry, 1988; Singh & Anderson, 2002). Specifically, the spatial pattern of luminance across contour junctions serves as a strong cue for human observers to perceive the presence of overlapping surfaces. Not only luminance-defined junctions but also texture-defined junctions play a strong role in segregating overlapping surfaces from each other. Previous studies (Kawabe & Miura, 2004, 2006) showed that when vertical and horizontal stripes defined by Gabor patches overlapped each other, the junction generated by these texture-defined borders also served as a strong cue to the stratification of the stripes. Thus, irrespective of the attributes that define the junctions, evaluating image information across contour (or border) junctions is likely an essential function of the visual system as it determines the structure of overlapping surfaces.

The present study tested whether contour junctions defined by dynamic image deformations appear to be due to a transparent layer. In this study, a “deformation-defined junction” refers to a contour junction caused by an image deformation along a continuous contour wherein one spatial side of the contour dynamically deforms while the other side remains unchanged. As experimental stimuli, a part of an image was deformed so that deformation-defined junctions were generated across deforming and nondeforming areas. We predicted that the sharp change of image deformation would be interpreted as coming from optical disturbance due to the intervention of a transparent layer, and would thus enhance the perception of a transparent layer. By using the stimuli, we tested whether dynamic image deformations involving deformation-defined junctions made a stronger contribution to the perception of a transparent layer than dynamic image deformations without the junctions.

Experiment 1

Methods

Observers

All observers in this study (nine females, one male, mean age 37.2 years, SD = 3.29 years) were unaware of the specific purpose of the experiments. They reported having normal or corrected-to-normal visual acuity. Participants were recruited from outside of the laboratory and paid for their participation. Ethical approval for this study was obtained from the ethical committee at Nippon Telegraph and Telephone Corporation (H28-008 by NTT Communication Science Laboratories Ethical Committee). The experiments were conducted according to the principles that have their origin in the Helsinki Declaration. Written informed consent was obtained from all participants.

Stimuli of Experiment 1a

A stimulus clip consisted of two vertical columns of five rings (Figure 1). The diameter of each ring was 1.44° of visual angle, and its contour width was 0.12°. The center-to-center separation between the rings was 1.92°. The rings were presented against a neutral gray background with a luminance of 29 cd/m2. The luminance of the rings was 21.0 cd/m2. In the junction condition, a central rectangular area, which subtended 1.92° (width = 64 px) × 7.68° (height = 256 px), was dynamically deformed at a fixed spatiotemporal frequency. To deform an image, we first created two deformation maps, each measuring 64 px × 256 px × 120 frames (see the supplemental file “Example psychology code” for actual implementation). One map was used for horizontal deformations and the other for vertical deformations. In each trial, each pixel in the maps was initially given a random value that was drawn from white noise. Then, the map was spatiotemporally filtered so as to contain only spatial frequencies below 0.26 cpd and temporal frequencies below 2 Hz. Finally, the amplitude of the map was normalized to the range of −0.36° to 0.36°. Based on the maps, a stimulus image with the rings was deformed by means of a standard image warp procedure (Glasbey & Mardia, 1998). Repeating this manipulation for each of the 120 frames of the maps generated a stimulus clip consisting of 120 frames and lasting for 2 s. In the no-junction condition, the nondeforming part of the rings had a luminance of 29 cd/m2, which was equivalent to the background luminance. In the one-column condition, only one vertical column of the rings was presented in the stimulus area undergoing the image deformation. Similarly to the no-junction condition, the one-column condition did not have deformation-defined junctions.
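The map construction and warp described above can be sketched as follows. This is only an illustrative sketch and not the authors' supplemental code. Several choices are assumptions made here: an ideal (hard-edged) low-pass filter applied in the Fourier domain, noise generated on a larger square canvas (`canvas_px`, a hypothetical parameter) and then cropped so that frequencies below 0.26 cpd can be represented, a 60 Hz frame rate inferred from 120 frames in 2 s, a pixel scale of 0.03° per pixel inferred from 64 px = 1.92°, and a bilinear backward warp standing in for the "standard image warp procedure."

```python
import numpy as np

DEG_PER_PX = 1.92 / 64   # assumed pixel scale: 64 px span 1.92 deg in the text
FRAME_RATE = 120 / 2.0   # assumed frame rate: 120 frames shown in 2 s


def make_deformation_map(width=64, height=256, n_frames=120, canvas_px=256,
                         sf_cutoff_cpd=0.26, tf_cutoff_hz=2.0,
                         max_shift_deg=0.36, seed=None):
    """Low-pass filtered white noise, returned as pixel shifts (frames, height, width)."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal((n_frames, canvas_px, canvas_px))
    spectrum = np.fft.rfftn(noise, axes=(0, 1, 2))
    # Frequency axes in Hz (time) and cycles per degree (space).
    ft = np.fft.fftfreq(n_frames, d=1.0 / FRAME_RATE)
    fy = np.fft.fftfreq(canvas_px, d=DEG_PER_PX)
    fx = np.fft.rfftfreq(canvas_px, d=DEG_PER_PX)
    radial_sf = np.hypot(fy[:, None], fx[None, :])
    # Ideal low-pass mask: keep components below both cutoffs.
    keep = (np.abs(ft)[:, None, None] < tf_cutoff_hz) & (radial_sf[None] < sf_cutoff_cpd)
    filtered = np.fft.irfftn(spectrum * keep, s=noise.shape, axes=(0, 1, 2))
    # Crop the central region and scale the peak shift to max_shift_deg.
    top = (canvas_px - height) // 2
    left = (canvas_px - width) // 2
    cropped = filtered[:, top:top + height, left:left + width]
    cropped = cropped / np.max(np.abs(cropped)) * max_shift_deg
    return cropped / DEG_PER_PX


def warp_frame(image, dx_px, dy_px):
    """Backward warp: output(y, x) samples image at (y + dy, x + dx), bilinearly."""
    h, w = image.shape
    yy, xx = np.mgrid[0:h, 0:w].astype(float)
    src_x = np.clip(xx + dx_px, 0, w - 1)
    src_y = np.clip(yy + dy_px, 0, h - 1)
    x0 = np.floor(src_x).astype(int)
    y0 = np.floor(src_y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    wx = src_x - x0
    wy = src_y - y0
    upper = image[y0, x0] * (1 - wx) + image[y0, x1] * wx
    lower = image[y1, x0] * (1 - wx) + image[y1, x1] * wx
    return upper * (1 - wy) + lower * wy
```

Under these assumptions, one clip would be produced by calling `make_deformation_map` twice (once for horizontal and once for vertical shifts) and applying `warp_frame` to the central 64 px × 256 px region of the ring image for each of the 120 frames, leaving the flanking regions untouched.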

We employed a lower spatiotemporal deformation frequency that was outside of the optimal band for deformation-based transparency perception (Kawabe et al., 2015). Hence, based on the previous study, it was expected that the deformation-based perceptual transparency would be weak unless an additional manipulation was given to the stimuli. Moreover, using the lower spatiotemporal deformation frequency did not induce any reduction of apparent image contrast due to spatiotemporal pooling in the visual system, and hence, ruled out the involvement of a conventional luminance-based transparency cue (contrast change) in determining the appearance of a transparent layer in our stimulus setting.

Stimuli of Experiment 1b

The stimuli in the junction condition were identical to those used in Experiment 1a. In the offset condition, the nondeforming part of the rings in the junction condition was vertically offset by 0.96° so that junctions were not generated between the deforming and nondeforming areas. In the discontinuous flanker condition, the nondeforming part of the rings in the junction condition was added to the one-column condition as used in Experiment 1a. In this condition, no deformation-defined junctions were generated.

Stimuli of Experiment 1c

In the neutral condition, stimuli identical to those used in the junction condition of Experiment 1a were employed. In the consistent condition, the nondeforming part of the rings was given a luminance of 7.2 cd/m2 so that the spatial pattern of luminance contrast was consistent with the interpretation that the central area of the stimuli was covered with a perceptually transparent layer. In the inconsistent condition, the nondeforming part of the rings was given a luminance of 36.4 cd/m2 so that the spatial pattern of luminance contrast was inconsistent with the interpretation of a luminance-based perceptual transparency.
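One way to see why a flanking ring luminance of 7.2 cd/m2 is consistent with a transparent central layer and 36.4 cd/m2 is not is to apply the transmittance equation of Metelli's episcotister model, alpha = (p − q) / (a − b), to the stated luminances. The sketch below is an illustration added here and is not an analysis reported by the authors. It assumes that the model can be applied directly to luminances, that the background (29 cd/m2) is unchanged by the layer, and that a physically plausible layer requires 0 < alpha ≤ 1. The function and variable names are hypothetical.

```python
BACKGROUND = 29.0    # cd/m2, background inside and outside the central area
CENTER_RING = 21.0   # cd/m2, ring luminance inside the deforming central area


def episcotister_alpha(a, b, p, q):
    """Metelli's transmittance: a, b seen in plain view; p, q seen through the layer."""
    return (p - q) / (a - b)


def check_flank(flank_ring):
    """Return (alpha, plausible) for a given ring luminance in the flanking areas."""
    alpha = episcotister_alpha(a=BACKGROUND, b=flank_ring,
                               p=BACKGROUND, q=CENTER_RING)
    return alpha, 0.0 < alpha <= 1.0


for name, flank in [("neutral", 21.0), ("consistent", 7.2), ("inconsistent", 36.4)]:
    alpha, plausible = check_flank(flank)
    print(f"{name}: alpha = {alpha:.2f}, plausible layer = {plausible}")
```

Under these assumptions the neutral and consistent configurations give an alpha between 0 and 1, whereas the inconsistent configuration gives a negative alpha, reflecting the reversal of contrast polarity across the boundary.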

Procedure

Each observer was individually tested in a dimly lit room. The observer sat at a distance of 65 cm from the CRT display. To start each session, the observer clicked a green button in the PsychoPy interface. Within 5 s of the click, the first trial started. A stimulus clip was presented for 2 s. The task of the observer was to carefully view the clip, and to rate the impression of a transparent layer on a 5-point scale where 1 = A strong impression of no transparent layer, 2 = A moderate impression of no transparent layer, 3 = Ambiguous, 4 = A moderate impression of a transparent layer, and 5 = A strong impression of a transparent layer. Two seconds after the observer reported the rating score by pressing the appropriate key, the next trial started. In each experiment, a session included 60 trials consisting of 3 experimental conditions × 20 repetitions. The order of the trials was pseudorandomized across the observers.
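The trial structure of a session (3 conditions × 20 repetitions, in pseudorandom order) can be sketched as below. This is a minimal sketch using only the Python standard library. The function name and the idea of seeding the shuffle per observer are assumptions made for illustration, and the condition labels are those of Experiment 1a.

```python
import random

CONDITIONS = ["junction", "no-junction", "one-column"]  # Experiment 1a labels


def make_trial_order(conditions, repetitions=20, seed=None):
    """Return a shuffled list in which each condition appears `repetitions` times."""
    rng = random.Random(seed)  # a per-observer seed is an assumption, not from the text
    trials = [cond for cond in conditions for _ in range(repetitions)]
    rng.shuffle(trials)
    return trials


session = make_trial_order(CONDITIONS, repetitions=20, seed=1)
print(len(session))  # 60 trials per session
```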

Results and discussion

In Experiment 1a, we examined how the impression of a transparent layer varied with the presence or absence of deformation-defined junctions. The purpose of this experiment was to check whether the addition of the deformation-defined junctions to stimuli would facilitate the perception of a transparent layer (Movie 1). The results are shown in Figure 2a. A one-way repeated measures analysis of variance (ANOVA) for the rating scores showed the main effect of the conditions to be significant, F(2, 14) = 9.279, p < 0.005, partial η2 = 0.57. Multiple comparison tests (Ryan, 1959) showed that the rating scores in the junction condition were significantly different from the scores in the no-junction condition (p < 0.008, Cohen's d = 1.26) and the one-column condition (p < 0.008, d = 1.30). By using a two-tailed one-sample t test, we also tested whether rating scores in each condition deviated from 3, which corresponded to “an ambiguous impression,” and found that the scores in the junction condition differed significantly from 3 (t[7] = 5.22, p < 0.002), while the scores in the other two conditions did not (ts[7] = 0.58 and 0.83, ps > 0.5 and 0.4, for the no-junction and one-column conditions, respectively). This showed that dynamic image deformations with deformation-defined junctions produced a stronger impression of a transparent layer than dynamic image deformations without junctions, indicating that the deformation-defined junctions may serve as a strong cue for the observer to perceive a transparent layer.
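The comparison of mean ratings against the scale midpoint of 3 can be sketched as a one-sample t statistic. The ratings below are invented purely for illustration (eight hypothetical observer means, giving 7 degrees of freedom as in the reported t[7] values) and are not the data of this study. The value 2.365 is the standard two-tailed critical t for 7 degrees of freedom at the 0.05 level.

```python
import math
import statistics


def one_sample_t(scores, mu=3.0):
    """Return (t, df) for a one-sample t test of the mean of `scores` against `mu`."""
    n = len(scores)
    mean = statistics.fmean(scores)
    sd = statistics.stdev(scores)  # sample standard deviation (n - 1 denominator)
    return (mean - mu) / (sd / math.sqrt(n)), n - 1


# Hypothetical per-observer mean ratings, NOT taken from the study.
example_scores = [3.8, 4.1, 3.5, 4.4, 3.9, 3.6, 4.2, 3.7]
t_value, df = one_sample_t(example_scores)
T_CRIT_DF7 = 2.365  # two-tailed critical value, alpha = 0.05, df = 7
print(df, abs(t_value) > T_CRIT_DF7)
```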

In Experiment 1b, to eliminate the effect of the junctions on the perception of a transparent layer, we manipulated the continuity of the rings between deforming and nondeforming areas. In addition to the junction condition as used in Experiment 1a, we employed two additional conditions (Figure 1b; Movie 2). In the offset condition, the nondeforming region of the junction condition was vertically shifted so that the contours were made spatiotemporally discontinuous. In the discontinuous flanker condition, the nondeforming region of the junction condition was added to the one-column condition as used in Experiment 1a, so that the contours were made spatiotemporally discontinuous. Similarly to Experiment 1a, we asked the observers to rate their impression of a transparent layer, and those results are shown in Figure 2b. A one-way repeated measures ANOVA again showed a significant main effect of the conditions, F(2, 14) = 4.371, p < 0.035, partial η2 = 0.38. Multiple comparison tests showed that the rating score in the junction condition was different from the score in the offset condition (p < 0.0504, d = 0.85) and the discontinuous flanker condition (p < 0.0504, d = 1.01), with only marginal significance but a reliable effect size. Similarly to Experiment 1a, we examined whether rating scores in each condition deviated from 3, and found that the scores in the junction condition differed significantly from 3 (t[7] = 5.93, p < 0.0006), while the scores in the other two conditions did not (ts[7] = 1.00 and 0.80, ps > 0.3 and 0.4, for the offset and discontinuous flanker conditions, respectively). These results indicate that when the junctions were eliminated from the stimuli, the perception of a transparent layer was attenuated even when both deforming and nondeforming areas existed in the stimuli.

Although we observe subjective contours at the boundary between deforming and nondeforming areas, they do not play a causal role in the perception of transparency from our stimuli. Several studies have suggested that subjective contours serve as a cue to surface stratification (Gillam & Marlow, 2014; Sato, 1983). In addition, some previous studies (Kanizsa, 1979; Kingdom, 2008) discussed that subjective contours produced by T-junctions contributed to the emergence of a luminance-based transparent surface. Although we did not directly measure the strength of the subjective contours, the stimuli in the offset condition had a configuration that was related to an abutting contour (Soriano, Spillmann, & Bach, 1996), and seemed to cause a stronger impression of a subjective contour than the other two conditions. Despite the generation of the strong subjective contour, no distinct effect of the offset condition on a transparent layer perception was observed. In this respect, we suggest that the subjective contours were unlikely to be a key feature inducing the perception of a transparent layer in our stimulus setting.

In Experiment 1c, we explored the interaction between deformation-based and luminance-based transparent layer perception. As shown in Figure 1c and Movie 3, we manipulated the luminance of the rings in the nondeforming region so that the spatial pattern of luminance contrast at the junction was consistent or inconsistent with perceiving transparency of the central region in stimuli (Anderson, 1997; Anderson, Singh, & Meng, 2006; Kingdom, 2008; Singh & Anderson, 2002), and examined how the luminance contrast cue affected the impression of a transparent layer in the presence of a deformation-defined junction. The results are shown in Figure 2c. A one-way repeated measures ANOVA showed a significant main effect of the conditions, F(2, 14) = 6.490, p < 0.02, partial η2 = 0.48. The rating scores in the neutral condition were significantly different from the scores in the inconsistent condition (p < 0.02, d = 1.06). In addition, the rating scores in the consistent condition were significantly different from the scores in the inconsistent condition (p < 0.02, d = 1.12). Similarly to the previous experiments, we examined whether rating scores in each condition deviated from 3 (where a score of 3 meant “ambiguous, neither clearly transparent nor nontransparent”), and found that the scores for the neutral condition were significantly different from 3 (t[7] = 5.85, p < 0.0007). The scores in the consistent condition also differed significantly from 3 (t[7] = 5.85, p < 0.0007). The scores in the inconsistent condition did not (t[7] = 0.88, p > 0.4). The results indicate that the perception of a transparent layer due to dynamic image deformation is also influenced by spatial patterns of luminance contrast at contour junctions.
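The partial η2 values reported for Experiments 1a to 1c can be checked against the F ratios and degrees of freedom using the standard identity partial η2 = (F × df_effect) / (F × df_effect + df_error). The short sketch below applies this identity to the reported values; the function name is chosen here for illustration.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (F, df_effect, df_error) as reported in the text for each experiment.
reported = {"1a": (9.279, 2, 14), "1b": (4.371, 2, 14), "1c": (6.490, 2, 14)}
for name, (f_value, df1, df2) in reported.items():
    print(name, round(partial_eta_squared(f_value, df1, df2), 2))
```

The computed values (0.57, 0.38, and 0.48) agree with the effect sizes given in the text.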

Because the same group of observers participated in all three experiments, it is possible to analyze statistical differences among the experimental conditions of each experiment by means of a single ANOVA. Accordingly, we conducted an additional ANOVA to compare the seven conditions as a within-subjects factor. In the analysis, by using Dunnett's method, we assessed whether each experimental condition was statistically different from the no-junction condition as a control. Because the stimuli were identical, the data from the junction conditions in Experiments 1a and 1b and from the neutral condition in Experiment 1c were averaged for each individual. By using the data, we conducted a one-way repeated measures ANOVA with the conditions as a within-subjects factor, and found a significant main effect, F(6, 49) = 3.537, p = 0.00548, partial η2 = 0.30. Dunnett's method showed that the conditions that were significantly different from the no-junction condition were the following two: the junction condition (t = 3.038, adjusted p = 0.019) and the consistent condition (t = 3.173, adjusted p = 0.0134). These results again suggest that the stimuli with deformation-defined junctions produced a stronger impression of a transparent layer.

Experiment 2

Purpose

The results of Experiment 1b left a question as to whether object completions across junctions were required to see a transparent layer. Specifically, there was a possibility that the completion of contours across deforming and nondeforming regions played a role in transparent layer perception. In this experiment, instead of half rings, we added straight lines to the nondeforming regions so that the lines were connected to half rings in the deforming regions. We manipulated the length of the additional lines, and examined how the length influenced the strength of a transparent layer impression. We assumed that at the junctions, contour completions would be established as the length of the additional lines in the nondeforming region increased. Hence it was predicted that the transparent layer impression would increase with the line length if the contour completion of stimuli played a major role in seeing a transparent layer. On the other hand, if the visual system was sensitive to the junction itself, transparent layer impressions would be high even when the line length in the nondeforming region was short.

Methods

Observers

All six observers (four females, two males, mean age 32.7 years, SD = 2.90 years) were unaware of the specific purpose of the experiment. Three of them had participated in Experiment 1.

Apparatus and stimuli

Apparatus was identical to that used in Experiment 1. Stimulus properties were identical to those used in Experiment 1 except for the following. All stimulus clips of this experiment contained the no-junction condition of Experiment 1a in the deforming region. To the nondeforming region, horizontal straight lines were added, connecting the terminators of the rings in the deforming regions (Figure 3a). The line length was manipulated to produce the following eight levels of visual angular separation of the line ends: 0.12°, 0.24°, 0.36°, 0.48°, 0.6°, 0.72°, 0.84°, and 0.96°. The vertical width of the lines was 0.12°. The luminance of the lines was 21 cd/m2, which was identical to the luminance of the rings.
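For readers reconstructing the stimuli, the eight line lengths can be converted from visual angle to pixels. The sketch below assumes the pixel scale implied by Experiment 1 (64 px = 1.92°, i.e., 0.03° per pixel), which should carry over because the apparatus was identical; the conversion itself is an illustration added here.

```python
DEG_PER_PX = 1.92 / 64  # assumed pixel scale, from 64 px = 1.92 deg in Experiment 1

# Eight line lengths in degrees: 0.12, 0.24, ..., 0.96.
line_lengths_deg = [0.12 * k for k in range(1, 9)]
line_lengths_px = [round(length / DEG_PER_PX) for length in line_lengths_deg]
print(line_lengths_px)  # [4, 8, 12, 16, 20, 24, 28, 32]
```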

Figure 3. (a) Snapshots taken from stimulus video clips as used in Experiment 2. Line lengths in nondeforming regions increase from left to right panels. (b) Experimental results in Experiment 2. Error bars denote standard errors of the mean (N = 6).

Procedure

The procedure was also identical to that used in Experiment 1 except for the following. In this experiment, we alternately presented a clip without the horizontal lines and a clip with the lines. Each clip lasted for 2 s, and was repeatedly presented at an identical spatial location until the observers made a judgment. The task of the observer was to carefully compare the clips, and rate the impression of a transparent layer by using a 5-point scale where 1 = A strong impression of a transparent layer without horizontal lines, 2 = A moderate impression of a transparent layer without horizontal lines, 3 = Ambiguous, 4 = A moderate impression of a transparent layer with horizontal lines, and 5 = A strong impression of a transparent layer with horizontal lines. Our reason for using a relative comparison task in Experiment 2 was to efficiently extract the effect of contour. The next trial started 2 s after the observer reported the rating score by pressing the appropriate key. Each session included 64 trials consisting of 8 experimental conditions × 8 repetitions. Each observer participated in two sessions. The order of the trials was pseudorandomized across the observers.

Results

For each observer, mean rating scores were calculated for each line length. The group mean data are shown in Figure 3b. We conducted a one-way repeated measures ANOVA with the line length as a within-subjects factor. The main effect was significant, F(7, 35) = 57.695, p < 0.0001, partial η2 = 0.92. Multiple comparison tests showed that the rating scores for lengths of 0.12°, 0.24°, or 0.36° were significantly different from the rating scores for other lengths (p < 0.05). By using the Holm-Bonferroni test, we also assessed whether the rating scores differed significantly from 3 (i.e., a vague impression), and observed that the rating score was significantly higher than 3 when the length of the lines was greater than or equal to 0.36° (p < 0.0001).
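The Holm-Bonferroni step-down procedure used to compare each line length against the midpoint of 3 can be sketched as follows. The p values in the example are arbitrary placeholders chosen to illustrate the procedure and are not results from this study.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans: True where the null hypothesis is rejected."""
    m = len(p_values)
    # Visit hypotheses from the smallest to the largest p value.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest p value is compared against alpha / (m - k + 1).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one test fails, all larger p values are retained
    return reject


# Placeholder p values for illustration only.
example_p = [0.01, 0.04, 0.03, 0.005]
print(holm_bonferroni(example_p))  # [True, False, False, True]
```

In this example the two smallest p values pass their thresholds (0.0125 and about 0.0167), the third (0.03) fails against 0.025, and testing stops there.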

When the lines in the nondeforming region were short, impressions of transparency were weak. The results indicate that the mere presence of a deformation-defined junction was not sufficient to trigger transparent layer impressions. When the lines were long, transparent impressions were strong. Therefore, the establishment of contour completion at junctions seemed critical. The results also indicate that the closure of complete objects at the junctions, which the rings had in Experiment 1, was not required to trigger a transparent layer perception.

General discussion

The present study found that the visual system uses deformation-defined junctions to judge the presence of a transparent layer on the basis of dynamic image deformations. Moreover, the deformation-based transparency-perception stimulus could interact with the luminance-contrast–based transparency-perception stimulus in determining the overall appearance of our stimuli. Contour completion at the junctions plays a significant role in the perception of a transparent layer.

Why do the contour junctions influence perceived transparency? Image deformations arise from the states of various types of materials and/or objects—for example, the physical deformation of a material body (Paulun, Schmidt, van Assen, & Fleming, 2017; Spröte & Fleming, 2016) and an animate agency (Kawabe, 2017). As described above, dynamic image deformations also arise from any optical deformation due to refraction at the surface of materials (Kawabe et al., 2014, 2015). Therefore, the visual system needs to deal with ambiguity about the source of image deformations. We assume that the role of contour junctions may be to help the visual system to interpret the ambiguous deformation as being due to a transparent layer. At the junctions in our stimulus setting, deforming regions are separated from nondeforming regions by a sharp deformation-defined discontinuity. It is unlikely that such sharp changes in image deformation along a single object's contour occur in the physical deformation of objects and agencies. Rather, this sharp discontinuity is more likely to arise from the edge of an overlaid transparent surface than from a bending or warping of the shape of the textured surface. As such, the visual system possibly uses a deformation-defined contour junction as evidence for the optical modulation of background patterns by a transparent layer that refracts light.

We did not observe a strong transparent layer impression when no deformation-based junctions were included in the stimuli. The results are consistent with previous studies (Kawabe & Kogovšek, 2017; Kawabe et al., 2014, 2015), in that the spatiotemporal deformation frequency we employed was not optimal for a deformation-based transparent layer perception as described above. On the other hand, if the spatiotemporal frequency of image deformation had been optimal, a vivid impression of a transparent layer could have been obtained (Kawabe & Kogovšek, 2017; Kawabe et al., 2014, 2015) without deformation-defined contour junctions. In this regard, it appears that the effective feature of a deformation-based transparency perception need not necessarily be a single characteristic, and the visual system may recruit the evidence from multiple available diagnostic features in order to judge whether a transparent layer exists in a scene. Without a junction, the visual system may rely more on the specific band of spatiotemporal deformation frequency of a dynamic image deformation, while with a junction, in addition to the spatiotemporal deformation frequency, the visual system also relies on the patterns of deformation-defined junctions as a diagnostic cue to the presence of a transparent layer.

The results of Experiment 1c indicate that the perception of transparency that is caused by deformation is made more likely by the simultaneous presence of appropriately configured luminance and contrast—other features that can also by themselves produce the sensation of perceiving transparency. Thus, features of an image that have been researched and considered separately as producers of this visual effect also interact with each other.

In the real world, a material which refracts light may also scatter light, and/or have a low transmittance. When a deformed area has a lower contrast than a nondeformed area, the visual system may heuristically exploit both luminance and deformation cues, leading to the interpretation that the deformation comes from a material which refracts light and somehow lowers luminance contrast. In this respect, we would like to speculate that the visual system recruits the evidence for the existence of a transparent layer from various image attributes, and synthetically and heuristically determines the appearance of a surface stratification, or estimates the likelihood that dynamic image deformations come from a transparent layer. On the other hand, in Experiment 1c, an additional cue for luminance-based transparency did not increase the rating scores of a transparent layer. We speculate that this may reflect a ceiling effect: Because the neutral condition already produced a sufficiently strong impression of a transparent layer, there may have been no room for a significant difference between the neutral and consistent conditions to emerge. Although the present study did not further pursue this issue, the parametric variation of the different cues (luminance cues and contour lengths in nondeforming regions) in future studies will give further insights into the cue combination mechanisms for the perception of a transparent layer.

Acknowledgments

This research was supported by Grants-in-Aid for Scientific Research on Innovative Areas (15H05915) from the Japanese Ministry of Education, Culture, Sports, Science, and Technology.
