All animal procedures adhered to the laws governing animal experimentation issued by the German Government. For all experiments, we used 3- to 12-week-old C57Bl/6 (n = 3), Chattm2(cre)Lowl (n = 34; ChAT:Cre, JAX 006410, The Jackson Laboratory), and Tg(Pcp2-cre)1Amc (n = 5; Pcp2, JAX 006207) mice of either sex. The transgenic lines were cross-bred with the Cre-dependent red fluorescence reporter line Gt(ROSA)26Sortm9(CAG-tdTomato)Hze (Ai9tdTomato, JAX 007905) for a subset of experiments. Owing to the exploratory nature of our study, we did not use randomization and blinding. No statistical methods were used to predetermine sample size.
Animals were housed under a standard 12-h day–night rhythm. For recordings, animals were dark-adapted for ≥ 1 h, then anaesthetized with isoflurane (Baxter) and killed by cervical dislocation. The eyes were removed and hemisected in carboxygenated (95% O₂, 5% CO₂) artificial cerebrospinal fluid (ACSF) solution containing (in mM): 125 NaCl, 2.5 KCl, 2 CaCl₂, 1 MgCl₂, 1.25 NaH₂PO₄, 26 NaHCO₃, 20 glucose, and 0.5 L-glutamine (pH 7.4). Then, the tissue was moved to the recording chamber of the microscope, where it was continuously perfused with carboxygenated ACSF at ~37 °C. The ACSF contained ~0.1 μM sulforhodamine-101 (SR101, Invitrogen) to reveal blood vessels and any damaged cells in the red fluorescence channel. All procedures were carried out under very dim red (>650 nm) light.
A volume of 1 μl of the viral construct (AAV9.hSyn.iGluSnFR.WPRE.SV40 or AAV9.CAG.Flex.iGluSnFR.WPRE.SV40 (AAV9.iGluSnFR) or AAV9.Syn.Flex.GCaMP6f.WPRE.SV40, Penn Vector Core) was injected into the vitreous humour of 3- to 8-week-old mice anaesthetized with 10% ketamine (Bela-Pharm GmbH & Co. KG) and 2% xylazine (Rompun, Bayer Vital GmbH) in 0.9% NaCl (Fresenius). For the injections, we used a micromanipulator (World Precision Instruments) and a Hamilton injection system (syringe: 7634-01, needles: 207434, point style 3, length 51 mm, Hamilton Messtechnik GmbH). Owing to the fixed angle of the injection needle (15°), the virus was applied to the ventronasal retina. Imaging experiments were performed 3–4 weeks after injection.
Sharp electrodes were pulled on a P-1000 micropipette puller (Sutter Instruments) with resistances >100 MΩ. Single cells in the inner nuclear layer were dye-filled with 10 mM Alexa Fluor 555 (Life Technologies) in a 200 mM potassium gluconate (Sigma-Aldrich) solution using the buzz function (50-ms pulse) of the MultiClamp 700B software (Molecular Devices). Pipettes were carefully retracted as soon as the cell began to fill. Approximately 20 min were allowed for the dye to diffuse throughout the cell before imaging started. After recording, an image stack was acquired to document the cell’s morphology, which was then traced semi-automatically using the Simple Neurite Tracer plugin implemented in Fiji (https://imagej.net/Simple_Neurite_Tracer).
All drugs were bath applied for at least 10 min before recordings. The following drug concentrations were used (in μM): 10 gabazine (Tocris Bioscience; ref. 50), 75 TPMPA (Tocris Bioscience; ref. 50), 50 L-AP4 (L-(+)-2-amino-4-phosphonobutyric acid, Tocris Bioscience) and 0.5 strychnine (Sigma-Aldrich; ref. 51). Drug solutions were carboxygenated and warmed to ~37 °C before application. Pharmacological experiments were exclusively performed in the On and Off ChAT-immunoreactive bands, which are labelled in red fluorescence in ChAT:Cre × Ai9tdTomato crossbred animals.
We used a MOM-type two-photon microscope (designed by W. Denk, MPI, Heidelberg; purchased from Sutter Instruments/Science Products). The design and procedures have been described previously (ref. 52). In brief, the system was equipped with a mode-locked Ti:Sapphire laser (MaiTai-HP DeepSee, Newport Spectra-Physics), two fluorescence detection channels for iGluSnFR or GCaMP6f (HQ 510/84, AHF/Chroma) and SR101/tdTomato (HQ 630/60, AHF), and a water immersion objective (W Plan-Apochromat 20×/1.0 DIC M27, Zeiss). The laser was tuned to 927 nm for imaging iGluSnFR, GCaMP6f or SR101, and to 1,000 nm for imaging tdTomato. For image acquisition, we used custom-made software (ScanM by M. Müller and T.E.) running under IGOR Pro 6.3 for Windows (Wavemetrics), taking time-lapsed 64 × 16 pixel image scans (at 31.25 Hz) for glutamate and 32 × 32 pixel image scans (at 15.625 Hz) for calcium imaging. For visualizing morphology, 512 × 512 pixel images were acquired.
For light stimulation, we focused a DLP projector (K11, Acer) through the objective, fitted with band-pass-filtered light-emitting diodes (LEDs) (green, 578 BP 10; and blue, HC 405 BP 10, AHF/Chroma) to match the spectral sensitivity of mouse M- and S-opsins. LEDs were synchronized with the microscope’s scan retrace. Stimulator intensity (as photoisomerization rate, 10³ P* per s per cone) was calibrated as described previously (ref. 52) to range from 0.6 and 0.7 (black image) to 18.8 and 20.3 for M- and S-opsins, respectively. Owing to technical limitations, intensity modulations were weakly rectified below 20% brightness. An additional, steady illumination component of ~10⁴ P* per s per cone was present during the recordings because of two-photon excitation of photopigments (for detailed discussion, see refs 52 and 53). The light stimulus was centred before every experiment, such that its centre corresponded to the centre of the recording field. For all experiments, the tissue was kept at a constant mean stimulator intensity level for at least 15 s after the laser scanning started and before light stimuli were presented. Because the stimulus was projected through the objective lens, the stimulus projection plane shifted when focusing at different IPL levels. We therefore quantified the resulting blur of the stimulus at the level of photoreceptor outer segments. We found that a vertical shift of the imaging plane by 50 μm blurred the image only slightly (2% change in pixel width), indicating that different IPL levels (total IPL thickness = 41.6 ± 4.8 μm, mean ± s.d., n = 20 scans) can be imaged without substantial change in stimulus quality.
Four types of light stimuli were used (Fig. 1): (i) full-field (600 × 800 μm) and (ii) local (100 μm in diameter) chirp stimuli consisting of a bright step and two sinusoidal intensity modulations, one with increasing frequency (0.5–8 Hz) and one with increasing contrast; (iii) 1-Hz light flashes (500 μm in diameter, 50% duty cycle); and (iv) binary dense noise (20 × 15 matrix of 20 × 20 μm pixels; each pixel displayed an independent, balanced random sequence at 5 Hz for 5 min) for space–time receptive field mapping. In a subset of experiments, we used three additional stimuli: (v) a ring noise stimulus (10 annuli with increasing diameter, each annulus 25 μm wide), with each ring’s intensity determined independently by a balanced 68-s random sequence at 60 Hz repeated four times; (vi) a surround chirp stimulus (annulus; full-field chirp sparing the central 100 μm corresponding to the local chirp); and (vii) a spot noise stimulus (100 or 500 μm in diameter; intensity modulation like ring noise) flickering at 60 Hz. For all drug experiments, we showed in addition: (viii) a stimulus consisting of alternating 2-s full-field and local light flashes (500 and 100 μm in diameter, respectively). All stimuli were achromatic, with matched photo-isomerization rates for mouse M- and S-opsins.
For each scan field, we used the relative positions of the inner (ganglion cell layer) and outer (inner nuclear layer) blood vessel plexus to estimate IPL depth. To relate these blood vessel plexi to the ChAT bands, we performed separate experiments in ChAT:Cre × Ai9tdTomato mice. High-resolution stacks throughout the inner retina were recorded in the ventronasal retina. The stacks were then first corrected for warping of the IPL using custom-written scripts in IGOR Pro. In brief, a raster of markers (7 × 7) was projected in the x–y plane of the stack and for each marker the z positions of the On ChAT band were manually determined. The point raster was used to calculate a smoothed surface, which provided a z offset correction for each pixel beam in the stack. For each corrected stack, the z profiles of tdTomato and SR101 labelling were extracted by manually drawing ROIs in regions where only blood vessel plexi or the ChAT bands were visible. The two profiles were then matched such that 0 corresponded to the inner vessel peak and 1 corresponded to the outer vessel peak. We averaged the profiles of n = 9 stacks from three mice and determined the IPL depth of the On and Off ChAT bands to be 0.48 ± 0.011 and 0.77 ± 0.014 AU (mean ± s.d.), respectively. The s.d. corresponds to an error of 0.45 and 0.63 μm for the On and Off ChAT bands, respectively. In the following, recording depths relative to blood vessel plexi were transformed into IPL depths relative to ChAT bands for all scan fields (Fig. 1b), with 0 corresponding to the On ChAT band and 1 corresponding to the Off ChAT band.
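The depth transformation described above can be sketched as a linear rescaling. This is a minimal illustration that assumes the mapping between the two depth scales is linear; the function and constant names are hypothetical and not taken from the original analysis code.

```python
# Depths of the ChAT bands on the vessel-plexus scale
# (0 = inner plexus, 1 = outer plexus), as reported in the text.
ON_CHAT = 0.48
OFF_CHAT = 0.77


def vessel_to_chat_depth(depth_vessel, on_chat=ON_CHAT, off_chat=OFF_CHAT):
    """Rescale a vessel-relative depth so that On ChAT = 0 and Off ChAT = 1.

    Assumption: the transformation is linear between the two bands and is
    extrapolated linearly outside them.
    """
    return (depth_vessel - on_chat) / (off_chat - on_chat)


if __name__ == "__main__":
    for d in (0.0, 0.48, 0.77, 1.0):
        print(d, vessel_to_chat_depth(d))
```

Under this assumption, scan fields close to the outer vessel plexus map to depths well above 1 on the ChAT-band scale.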
Data analysis was performed using Matlab 2014b/2015a (Mathworks Inc.) and IGOR Pro. Data were organized in a custom-written schema using the DataJoint for Matlab framework (github.com/datajoint/datajoint-matlab; ref. 54).
Regions-of-interest (ROIs) were defined automatically by a custom correlation-based algorithm in IGOR Pro. First, the activity stack in response to the dense noise stimulus (64 × 16 × 10,000 pixels) was de-trended by high-pass filtering the trace of each individual pixel above ~0.1 Hz. For the 100 best-responding pixels in each recording field (highest s.d. over time), the trace of each pixel was correlated with the trace of every other pixel in the field. Then, the correlation coefficient (ρ) was plotted against the distance between the two pixels and the average across ROIs was computed (Extended Data Fig. 1a). A scan field-specific correlation threshold ($\rho_{\mathrm{threshold}}$) was determined by fitting an exponential between the smallest distance and 5 μm (Extended Data Fig. 1b). $\rho_{\mathrm{threshold}}$ was defined as the correlation coefficient at λ, where λ is the exponential decay constant (space constant; Extended Data Fig. 1b). Next, we grouped neighbouring pixels with $\rho > \rho_{\mathrm{threshold}}$ into one ROI (Extended Data Fig. 1c–e). To match ROI sizes with the sizes of BC axon terminals, we restricted ROI diameters (estimated as effective diameter of area-equivalent circle) to range between 0.75 and 4 μm (Extended Data Fig. 1b, g). For validation, the number of ROIs covering single axon terminals was quantified manually for n = 31 terminals from n = 5 GCaMP6-expressing BCs (Extended Data Figs 1g, 2a–c).
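The size criterion can be illustrated with a short sketch of the area-equivalent diameter. The pixel area is a free parameter here because the physical pixel size is not stated in this section; the function names are hypothetical.

```python
import math


def effective_diameter(n_pixels, pixel_area_um2):
    """Diameter (in um) of the circle whose area equals the ROI area."""
    area = n_pixels * pixel_area_um2
    return 2.0 * math.sqrt(area / math.pi)


def keep_roi(n_pixels, pixel_area_um2, d_min=0.75, d_max=4.0):
    """Apply the 0.75-4 um diameter criterion from the text."""
    return d_min <= effective_diameter(n_pixels, pixel_area_um2) <= d_max


if __name__ == "__main__":
    # Hypothetical example: ROIs of 1-40 pixels at 0.5 um^2 per pixel.
    for n in (1, 5, 20, 40):
        print(n, round(effective_diameter(n, 0.5), 2), keep_roi(n, 0.5))
```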
The glutamate (or calcium) traces for each ROI were extracted (as ΔF/F) using the image analysis toolbox SARFIA for IGOR Pro (ref. 55) and resampled at 500 Hz. A stimulus time marker embedded in the recorded data served to align the traces relative to the visual stimulus with 2 ms precision. For this, the timing for each ROI was corrected for sub-frame time-offsets related to the scanning. Stimulus-aligned traces for each ROI were imported into Matlab for further analysis.
For the chirp and step stimuli, we down-sampled to 64 Hz for further processing, subtracted the baseline (median of first 20–64 samples), computed the median activity r(t) across stimulus repetitions (5 repetitions for chirp, >30 repetitions for step) and normalized it such that $\max_t |r(t)| = 1$.
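These preprocessing steps can be sketched as follows. The sketch assumes that the baseline is subtracted per repetition and that the normalization scales the trace to a peak absolute value of 1; both are assumptions, and the function name is hypothetical. Down-sampling is omitted.

```python
from statistics import median


def preprocess(trials, n_baseline=20):
    """Baseline-subtract, take the median across repetitions, normalize.

    trials: list of repetitions, each a list of samples of equal length.
    """
    # Subtract each repetition's baseline (median of its first samples).
    corrected = []
    for trace in trials:
        base = median(trace[:n_baseline])
        corrected.append([v - base for v in trace])
    # Median across repetitions at every time point.
    r = [median(samples) for samples in zip(*corrected)]
    # Scale so that the largest absolute value is 1 (assumed normalization).
    peak = max(abs(v) for v in r)
    return [v / peak for v in r] if peak > 0 else r


if __name__ == "__main__":
    reps = [[1, 1, 3, 1], [1, 1, 5, 1], [1, 1, 4, 1]]
    print(preprocess(reps, n_baseline=2))
```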
For dye-injected BCs, axon terminals were labelled manually using the image analysis toolbox SARFIA for IGOR Pro. Then, ROIs were estimated as described above and assigned to the reconstructed cell if at least two pixels overlapped with the cell’s axon terminals.
We mapped the receptive field from the dense noise stimulus and the response kernel to the ring noise stimulus by computing the glutamate/calcium transient-triggered average. To this end, we used Matlab’s findpeaks function to detect the times $t_i$ at which transients occurred. We set the minimum peak height to 1 s.d., where the s.d. was robustly estimated using:
$$\sigma = \frac{\mathrm{median}(|\dot{c}(t)|)}{0.6745}$$
We then computed the glutamate/calcium transient-triggered average stimulus, weighting each sample by the steepness of the transient:
$$F(x, y, \tau) = \frac{1}{M} \sum_{i=1}^{M} \dot{c}(t_i)\, S(x, y, t_i + \tau)$$
Here, $S(x, y, t)$ is the stimulus, $\dot{c}(t)$ is the time derivative of the glutamate/calcium trace, τ is the time lag and M is the number of glutamate/calcium events.
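A minimal sketch of a transient-triggered average for a one-dimensional stimulus (for example, a single ring or spot) is given below. It rests on several assumptions that are not confirmed by the text: events are taken as local maxima of the differentiated trace, the robust s.d. is the median absolute value divided by 0.6745, and trace and stimulus share one sampling grid. A simple local-maximum search stands in for Matlab's findpeaks, and all names are hypothetical.

```python
from statistics import median


def robust_sd(values):
    """Robust s.d. estimate: median absolute value divided by 0.6745."""
    return median([abs(v) for v in values]) / 0.6745


def transient_triggered_average(trace, stimulus, n_lags):
    """Average the stimulus preceding each event, weighted by steepness.

    kernel[lag] refers to the stimulus `lag` samples before the event.
    """
    # Differentiate: d[i] is the change from sample i to sample i + 1.
    d = [b - a for a, b in zip(trace, trace[1:])]
    threshold = robust_sd(d)
    # Events: local maxima of the derivative at or above one robust s.d.,
    # restricted to indices with enough stimulus history.
    events = [i for i in range(max(1, n_lags - 1), len(d) - 1)
              if d[i] > d[i - 1] and d[i] >= d[i + 1] and d[i] >= threshold]
    kernel = [0.0] * n_lags
    for i in events:
        for lag in range(n_lags):
            kernel[lag] += d[i] * stimulus[i - lag]
    m = len(events)
    return [k / m for k in kernel] if m else kernel


if __name__ == "__main__":
    stim = [-1, -1, 1, -1, -1, -1, 1, -1, -1, -1]
    resp = [0, 0, 0, 1, 1, 1, 1, 2, 2, 2]
    print(transient_triggered_average(resp, stim, n_lags=2))
```

For the dense noise stimulus, the same weighting would be applied independently to every stimulus pixel, yielding one kernel per pixel.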
For the receptive field from the dense noise stimulus, we smoothed this raw receptive field estimate using a 3 × 3-pixel Gaussian window for each time lag separately and used singular value decomposition (SVD) to extract temporal and spatial receptive field kernels. To extract the receptive field’s position and scale, we fitted it with a 2D Gaussian function using Matlab’s lsqcurvefit. Receptive field quality ($Qi_{\mathrm{RF}}$) was measured as one minus the fraction of the variance of the spatial receptive field map $F_{\mathrm{map}}$ that was not explained by the Gaussian fit $\hat{F}_{\mathrm{map}}$:
$$Qi_{\mathrm{RF}} = 1 - \frac{\mathrm{Var}[F_{\mathrm{map}} - \hat{F}_{\mathrm{map}}]}{\mathrm{Var}[F_{\mathrm{map}}]}$$
Response quality index. To measure how well a cell responded to a stimulus (local and full-field chirp, flashes), we computed the signal-to-noise ratio
$$Qi = \frac{\mathrm{Var}[\langle C \rangle_r]_t}{\langle \mathrm{Var}[C]_t \rangle_r}$$
where C is the T by R response matrix (time samples by stimulus repetitions), while $\langle \cdot \rangle_x$ and $\mathrm{Var}[\cdot]_x$ denote the mean and variance across the indicated dimension, respectively (ref. 2).
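As an illustration, the quality index can be computed from a response matrix as sketched below, assuming it is the variance over time of the repetition-averaged response divided by the repetition-averaged variance over time. The function name is hypothetical and population variances are used throughout.

```python
from statistics import fmean, pvariance


def quality_index(C):
    """Signal-to-noise ratio of a T-by-R response matrix.

    C: list of T rows (time samples), each holding R repetitions.
    """
    # Signal: variance over time of the mean across repetitions.
    mean_over_reps = [fmean(row) for row in C]
    signal = pvariance(mean_over_reps)
    # Noise: mean across repetitions of each repetition's variance over time.
    repetitions = list(zip(*C))
    noise = fmean(pvariance(rep) for rep in repetitions)
    return signal / noise


if __name__ == "__main__":
    identical = [[0, 0], [1, 1], [2, 2]]   # two identical repetitions
    opposite = [[1, -1], [-1, 1]]          # repetitions cancel on average
    print(quality_index(identical), quality_index(opposite))
```

With identical repetitions the index equals 1, and it falls towards 0 as the repetitions become uncorrelated with their mean.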
For further analysis, we used only cells that responded well to the local chirp stimulus ($Qi_{\mathrm{chirp}}$ > 0.3) and resulted in good receptive fields ($Qi_{\mathrm{RF}}$ > 0.2).
Polarity index. To distinguish between On and Off BCs, we calculated the polarity index (POi) from the step response to local and full-field chirp, respectively, as
where b = 2 s (62 samples). For cells responding solely during the On-phase of a step of light POi = 1, while for cells only responding during the step’s Off-phase POi = −1.
Opposite polarity index. The number of opposite polarity events (OPi) was estimated from individual trials of local and full-field chirp step responses (first 6 s) using IGOR Pro’s FindPeak function. Specifically, we counted the number of events that occurred during the first 2 s after the step onset and offset for Off and On BCs, respectively. The total number of events across all trials was then divided by the number of stimulus trials. If OPi = 1, there was on average one opposite polarity event per trial.
High frequency index. The high frequency index (HFi) was used to quantify spiking (compare with ref. 28) and was calculated from responses to individual trials of the local and full-field chirps. For the first 6 s of each trial, the frequency spectrum was calculated by fast Fourier transform (FFT) and spectra were averaged across trials for individual ROIs. Then, $\mathrm{HFi} = \log(F_1/F_2)$, where $F_1$ and $F_2$ are the mean power between 0.5–1 Hz and 2–16 Hz, respectively.
Response transience index. The step response (first 6 s) of local and full-field chirps was used to calculate the response transience (RTi). Traces were up-sampled to 500 Hz and the response transience was calculated as
$$\mathrm{RTi} = 1 - \frac{r(t_{\mathrm{peak}} + \alpha)}{r(t_{\mathrm{peak}})}$$
where α = 400 ms is the read-out time following the peak response $t_{\mathrm{peak}}$. For a transient cell with complete decay back to baseline RTi = 1, whereas for a sustained cell with no decay RTi = 0.
Response plateau index. Local and full-field chirp responses were up-sampled to 500 Hz and the plateau index (RPi) was determined as:
$$\mathrm{RPi} = \frac{r(t_{\mathrm{peak}} + \alpha)}{r(t_{\mathrm{peak}})}$$
with the read-out time α = 2 s. A cell showing a sustained plateau has an RPi = 1, while for a transient cell RPi = 0.
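The transience and plateau indices can be sketched together, assuming both compare the response at a read-out time α after the peak with the peak response itself (RTi as one minus that ratio, RPi as the ratio). The function names and the sampling-rate argument are hypothetical.

```python
def _peak_and_readout(r, fs, alpha):
    """Return the peak value and the value alpha seconds after the peak."""
    peak = max(range(len(r)), key=lambda i: r[i])
    readout = peak + round(alpha * fs)
    return r[peak], r[readout]


def transience_index(r, fs=500, alpha=0.4):
    """RTi: 1 for complete decay to baseline, 0 for no decay."""
    r_peak, r_read = _peak_and_readout(r, fs, alpha)
    return 1.0 - r_read / r_peak


def plateau_index(r, fs=500, alpha=2.0):
    """RPi: 1 for a sustained plateau, 0 for a transient response."""
    r_peak, r_read = _peak_and_readout(r, fs, alpha)
    return r_read / r_peak


if __name__ == "__main__":
    step_response = [0.0, 1.0, 0.5, 0.25]   # toy trace sampled at 1 Hz
    print(transience_index(step_response, fs=1, alpha=1))
    print(plateau_index(step_response, fs=1, alpha=2))
```

The sketch assumes a baseline-subtracted trace with a positive peak and a trace long enough to contain the read-out sample.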
Tonic release index. Local chirp frequency and contrast responses were up-sampled to 500 Hz and the baseline (response to 50% contrast step) was subtracted. Then, the glutamate traces were separated into responses above ($r_{+}$) and below ($r_{-}$) baseline and the tonic release index (TRi) was determined as:
$$\mathrm{TRi} = \frac{\sum_t |r_{-}(t)|}{\sum_t |r_{+}(t)| + \sum_t |r_{-}(t)|}$$
For a cell with no tonic release TRi = 0, whereas for a cell with maximal tonic release TRi = 1.
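A minimal sketch of the tonic release index follows, assuming it is the summed response below baseline expressed as a fraction of the summed absolute response above and below baseline. This form is an assumption chosen to match the stated limits of 0 and 1; the function name is hypothetical.

```python
def tonic_release_index(r):
    """Fraction of the baseline-subtracted response lying below baseline."""
    above = sum(v for v in r if v > 0)
    below = sum(-v for v in r if v < 0)
    total = above + below
    return below / total if total else 0.0


if __name__ == "__main__":
    print(tonic_release_index([1.0, 2.0, 3.0]))     # only increases
    print(tonic_release_index([1.0, -1.0, 0.0]))    # symmetric modulation
    print(tonic_release_index([-1.0, -2.0]))        # only decreases
```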
Response delay. The response delay (t ) was defined as the time from stimulus onset/offset until response onset and was calculated from the up-sampled local chirp step response. Response onset (t ) and delay (t ) were defined as and , respectively.
We used sparse principal component analysis, as implemented in the SpaSM toolbox by K. Sjöstrand et al. (http://www2.imm.dtu.dk/projects/spasm/), to extract sparse response features from the mean responses across trials to the full-field (12 features) and local chirp (6 features), and the step stimulus (6 features) (as described in ref. 2; see Extended Data Fig. 4b). Before clustering, we standardized each feature separately across the population of cells.
BC-terminal volume profiles were obtained from electron microscopic reconstructions of the inner retina (refs 6 and 10). To isolate synaptic terminals, we removed those parts of the volume profiles that probably corresponded to axons. We estimated the median axon density for each type from the upper 0.06 units of the IPL and subtracted twice that estimate from the profiles, clipping at zero. Profiles were smoothed with a Gaussian kernel (s.d. = 0.14 units IPL depth) to account for jitter in depth measurements of two-photon data. For the GluMI cell, we assumed the average profile of CBC types 1 and 2.
We used a modified mixture of Gaussian model (ref. 56) to incorporate the prior knowledge from the anatomical BC profiles. For each ROI i with IPL depth $d_i$, we define a prior over anatomical types c as
$$p_i(c) = \frac{\mathrm{IPL}(d_i, c)}{\sum_{c'} \mathrm{IPL}(d_i, c')}$$
where IPL(d, c) is the IPL terminal density profile as a function of depth and anatomical cell type. For example, all ROIs of a scan field taken at an IPL depth of 1.7 were likely to be sorted into clusters for CBC types 1 and 2, while a scan field taken at a depth of 0 received a bias for CBC types 5–7 (Extended Data Fig. 4a).
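The parameters of the mixture of Gaussian model are estimated as usual, with the exception of estimating the posterior over clusters. Here, the mixing coefficients are replaced by the prior over anatomical types, resulting in a modified update formula for the posterior:
$$p(c \mid x_i) = \frac{p_i(c)\, \mathcal{N}(x_i \mid \mu_c, \Sigma_c)}{\sum_{c'} p_i(c')\, \mathcal{N}(x_i \mid \mu_{c'}, \Sigma_{c'})}$$
where $p_i(c)$ is the prior over anatomical types for ROI i, $x_i$ is the feature vector of ROI i, and $\mu_c$ and $\Sigma_c$ are the mean and covariance matrix of component c.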
All other updates remain the same as for the standard mixture of Gaussians algorithm (ref. 57). We constrained the covariance matrix for each component to be diagonal, resulting in 48 parameters per component (24 for the mean, 24 for the variances). We further regularized the covariance matrix by adding a constant (10⁻⁵) to the diagonal.
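The posterior step of such a depth-informed mixture model can be sketched as follows. This is an illustration under stated assumptions, not the original implementation: the prior is taken to be the terminal density at the ROI's depth normalized across types, components have diagonal covariance, at least one type has a non-zero prior, and all function names are hypothetical.

```python
import math


def depth_prior(densities):
    """Normalize per-type terminal densities at one depth into a prior."""
    total = sum(densities)
    return [d / total for d in densities]


def log_diag_gaussian(x, mean, var):
    """Log density of a Gaussian with diagonal covariance."""
    return -0.5 * sum(math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
                      for xi, m, v in zip(x, mean, var))


def posterior(x, prior, means, variances):
    """Posterior over components, with the prior replacing mixing weights."""
    logs = [math.log(p) + log_diag_gaussian(x, m, v) if p > 0 else -math.inf
            for p, m, v in zip(prior, means, variances)]
    top = max(logs)                      # subtract the maximum for stability
    weights = [math.exp(l - top) for l in logs]
    z = sum(weights)
    return [w / z for w in weights]


if __name__ == "__main__":
    prior = depth_prior([1.0, 3.0])
    means = [[0.0, 0.0], [0.0, 0.0]]
    variances = [[1.0, 1.0], [1.0, 1.0]]
    print(posterior([0.3, -0.2], prior, means, variances))
```

When two components fit a ROI equally well, the posterior reduces to the depth prior, which is how the anatomical profiles bias cluster assignment.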
The clustering was based on a subset (~83%) of the data (the first 11,101 recorded cells). The remaining ROIs were then automatically allocated to the established clustering (n = 2,210 ROIs).
For each pair of clusters, we computed the direction in feature space that optimally separated the clusters, $w = \Sigma^{-1}(\mu_1 - \mu_2)$, where $\mu_1$ and $\mu_2$ are the cluster means in feature space and $\Sigma$ is the pooled covariance matrix. We projected all data on this axis and standardized the projected data according to cluster 1 (that is, subtracted the projected mean of cluster 1 and divided by its s.d.). We computed d′ as a measure of the separation between the clusters: $d' = |\tilde{\mu}_1 - \tilde{\mu}_2|$, where $\tilde{\mu}_1$ and $\tilde{\mu}_2$ are the means of the two clusters in the projected, normalized space.
We also performed a more constrained clustering in which we divided the IPL into five portions without overlap based on stratification profiles. We then clustered each zone independently using a standard mixture of Gaussian approach and a cluster number determined by the number of BC types expected in each portion. The correlation between the cluster means of our clustering and the more constrained clustering was 0.97 for the full-field chirp traces, indicating high agreement.
Field entropy. Field entropy ($S_{\mathrm{Field}}$) was used as a measure of cluster heterogeneity within single recording fields and was defined as $S_{\mathrm{Field}} = -\sum_i p_i \log_2 p_i$, where i indexes the clusters in one recording field and $p_i$ corresponds to the fraction of the field’s ROIs assigned to the ith cluster. $S_{\mathrm{Field}}$ = 0 if all ROIs of one recording field are assigned to one cluster and $S_{\mathrm{Field}}$ increases if ROIs are equally distributed across multiple clusters. In general, high field entropy indicates high cluster heterogeneity within a single field.
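Field entropy can be computed from the cluster labels of the ROIs in one recording field, as in the sketch below. It assumes a Shannon entropy over the fraction of ROIs per cluster with a base-2 logarithm; the base and the function name are assumptions.

```python
import math
from collections import Counter


def field_entropy(cluster_labels):
    """Shannon entropy (bits) of the cluster assignments in one field."""
    counts = Counter(cluster_labels)
    n = len(cluster_labels)
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())


if __name__ == "__main__":
    print(field_entropy([3, 3, 3, 3]))      # one cluster only
    print(field_entropy([1, 2, 1, 2]))      # two equally common clusters
    print(field_entropy([1, 2, 3, 4]))      # four equally common clusters
```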
Analysis of response diversity. To investigate the similarity of local and full-field chirp responses across clusters (Fig. 3), we determined the linear correlation coefficient between any two cluster pairs. The analysis was performed on cluster means. For every cluster, correlation coefficients were averaged across clusters with the same and opposite response polarity, respectively. We used principal component analysis (using Matlab’s pca function) to obtain a 2D embedding of the mean cluster responses. The principal component analysis was computed on all 14 local and 14 full-field cluster means. If not stated otherwise, the non-parametric Wilcoxon signed-rank test was used for statistical testing.
Pharmacology. To analyse drug-induced effects on BC clusters (Fig. 4, Extended Data Figs 7, 8), response traces and receptive fields of ROIs in one recording field belonging to the same cluster were averaged if there were at least 5 ROIs assigned to this cluster. Spatial receptive fields were aligned relative to the pixel with the highest s.d. before averaging.
Centre–surround properties. To estimate the signal-to-noise ratio of ring maps of single ROIs, we extracted temporal centre and surround kernels and normalized the respective kernel to the s.d. of its baseline (first 50 samples). For further analysis, we included only ROIs with $|\mathrm{Peak}_{\mathrm{centre}}|$ > 12 s.d. and $|\mathrm{Peak}_{\mathrm{surround}}|$ > 7 s.d. Ring maps of individual ROIs were then aligned relative to their peak centre activation and averaged across ROIs assigned to one cluster. To isolate the BC surround, the centre rings (first two rings) were cut and the surround time and space components were extracted by singular value decomposition (SVD). The surround space component was then extrapolated across the centre by fitting a Gaussian and an extrapolated surround map was generated. To isolate the BC centre, the estimated surround map was subtracted from the average map and centre time and space components were extracted by SVD. The estimated centre and surround maps were summed to obtain a complete description of the centre–surround structure of BC receptive fields. Across clusters, the estimated centre–surround maps captured 92.5 ± 1.9% of the variance of the original map. Owing to the low signal-to-noise ratio, the temporal centre–surround properties of individual ROIs were extracted as described above using the centre and surround space kernels obtained from the respective cluster average.
The 1D Gaussian fits of centre and surround space activation were used to calculate centre and surround ratios (CSRs) for various stimulus sizes. Specifically, the CSR was defined as
where S corresponds to the stimulus radius and ranged from 10 to 500 μm, with a step size dx of 1 μm. Time kernels for different stimulus sizes were generated by linearly mixing centre and surround time kernels, weighted by the respective CSR.
BC spectra. The temporal spectra of BC clusters were calculated by Fourier transform of the time kernels estimated for a local (100 μm in diameter) and full-field (500 μm in diameter) light stimulus (see centre–surround properties). Owing to the lower SNR of time kernels estimated for the full-field stimulus, kernels were cut 100 ms before and at the time point of response, still capturing 86.7 ± 14.7% of the variance of the original kernel. The centre of mass (Centroid) was used to characterize spectra of different stimulus sizes and was determined as
$$\mathrm{Centroid} = \frac{\sum_{n} f(n)\, x(n)}{\sum_{n} x(n)}$$
where x(n) corresponds to the magnitude and f(n) represents the centre frequency of the nth bin.
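The centroid is the magnitude-weighted mean frequency of a spectrum. A minimal sketch, assuming the standard spectral centroid and using a hypothetical function name:

```python
def spectral_centroid(freqs, magnitudes):
    """Magnitude-weighted mean of the bin centre frequencies."""
    total = sum(magnitudes)
    return sum(f * x for f, x in zip(freqs, magnitudes)) / total


if __name__ == "__main__":
    freqs = [1.0, 2.0, 3.0]                           # bin centres in Hz
    print(spectral_centroid(freqs, [1.0, 1.0, 1.0]))  # flat spectrum
    print(spectral_centroid(freqs, [0.0, 0.0, 2.0]))  # power in the top bin
```

A shift of the centroid towards higher frequencies indicates that a larger share of the kernel's spectral magnitude lies in the faster bins.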
Surround chirp and spot noise data. To investigate the effects of surround-only activation and stimulus size on temporal encoding properties across BC clusters, response traces and estimated kernels of ROIs in one recording field belonging to the same cluster were averaged if there were at least five ROIs assigned to this cluster. The spectra for kernels estimated from local and full-field spot noise stimuli were calculated as described above.
Time kernel correlation. To analyse the similarity of temporal kernels estimated for a specific stimulus size (Fig. 5i, j), we computed the linear correlation coefficient of each kernel pair from clusters with the same response polarity. We then calculated the average correlation coefficient for every cluster (Fig. 5i) and across all cluster averages (Fig. 5j).
Data (original data and clustering results) as well as Matlab code are available from http://www.retinal-functomics.org.