AES Rome 2013 Paper Session P16

P16 - Spatial Audio—Part 2: 3-D Microphone and Loudspeaker Systems

P16-1 Recording and Playback Techniques Employed for the “Urban Sounds” Project—Angelo Farina, Università di Parma - Parma, Italy; Andrea Capra, University of Parma - Parma, Italy; Alberto Amendola, University of Parma - Parma, Italy; Simone Campanini, University of Parma - Parma, Italy
The “Urban Sounds” project, born from a cooperation between the Industrial Engineering Department at the University of Parma and the municipal institution La Casa della Musica, aims to record characteristic soundscapes in the town of Parma with a dual purpose: preserving for posterity an archive of sound fields, recorded with advanced 3-D surround techniques, that documents Parma in 2012; and creating a “musical” Ambisonics composition that leads the audience through a virtual tour of the town. The archive includes recordings of various “soundscapes,” such as streets, squares, schools, churches, meeting places, public parks, the train station, the airport, and everything else considered unique to the town. This paper details the advanced digital sound processing techniques employed for recording, processing, and playback.
Convention Paper 8903
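
As a rough illustration of the Ambisonics format mentioned in the P16-1 abstract, the sketch below encodes a mono signal into first-order B-format. It assumes the traditional convention (W scaled by 1/sqrt(2), azimuth measured counter-clockwise from the front, elevation measured upward); the function name `encode_b_format` is hypothetical, and nothing here is taken from the paper's actual processing chain.

```python
import math

def encode_b_format(samples, azimuth_deg, elevation_deg):
    """Encode a mono signal as first-order B-format (assumed traditional
    convention: W carries the signal scaled by 1/sqrt(2))."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    gains = {
        "W": 1.0 / math.sqrt(2.0),          # omnidirectional component
        "X": math.cos(az) * math.cos(el),   # front-back
        "Y": math.sin(az) * math.cos(el),   # left-right
        "Z": math.sin(el),                  # up-down
    }
    return {name: [g * s for s in samples] for name, g in gains.items()}

# A source directly to the left (azimuth 90 degrees) on the horizontal plane:
# almost all directional energy ends up in the Y channel.
b_format = encode_b_format([1.0, 0.5, -0.25], azimuth_deg=90.0, elevation_deg=0.0)
```

A composition of the kind described would mix many such encoded sources and recorded sound fields; the decoding stage for a specific loudspeaker layout is not shown.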

P16-2 Robust 3-D Sound Source Localization Using Spherical Microphone Arrays—Carl-Inge Colombo Nilsen, University of Oslo - Oslo, Norway; Squarehead Technology AS - Norway; Ines Hafizovic, SquareHead Technology AS - Oslo, Norway; Sverre Holm, University of Oslo - Oslo, Norway
Spherical arrays are attracting growing interest in spatial audio reproduction, especially in Higher Order Ambisonics. In many audio applications, sound source detection and localization are crucial, which calls for robust localization methods suitable for spherical arrays. The well-known ESPRIT direction-of-arrival estimator is not directly applicable to spherical arrays for 3-D applications. The eigenbeam ESPRIT (EB-ESPRIT) is based on the spherical harmonics framework and is derived specifically for spherical arrays. However, the ESPRIT method is in general susceptible to errors in the presence of correlated sources, and spatial decorrelation is not possible for spherical arrays. In our new implementation of spherical harmonics-based ESPRIT (SA2ULA-ESPRIT), robustness against correlation is achieved by spatial decorrelation incorporated directly in the algorithm formulation. The simulated performance of the new algorithm is compared to that of EB-ESPRIT.
Convention Paper 8904

P16-3 Parametric Spatial Audio Coding for Spaced Microphone Array Recordings—Archontis Politis, Aalto University - Espoo, Finland; Mikko-Ville Laitinen, Aalto University - Espoo, Finland; Jukka Ahonen, Akukon Ltd. - Helsinki, Finland; Ville Pulkki, Aalto University - Aalto, Finland
Spaced microphone arrays for multichannel recording of music performances, when reproduced in a multichannel system, exhibit reduced inter-channel coherence that translates perceptually to a pleasant ‘enveloping’ quality, at the expense of accurate localization of sound sources. We present a method to process spaced microphone recordings using the principles of Directional Audio Coding (DirAC), based on knowledge of the array configuration and the frequency-dependent microphone patterns. The method achieves quality equal to or better than that of the standard high-quality version of DirAC, and it improves on the common one-to-one channel playback of spaced multichannel recordings by offering improved and stable localization cues.
Convention Paper 8905
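
The reduced inter-channel coherence mentioned in the P16-3 abstract can be illustrated with a minimal sketch. It uses the zero-lag normalized cross-correlation as a simple broadband stand-in for coherence (true coherence is frequency dependent) and models the microphone spacing as a pure 48-sample delay; the function name, the white-noise test signal, and the delay value are all assumptions made for illustration, not details from the paper.

```python
import math
import random

def interchannel_correlation(x, y):
    """Zero-lag normalized cross-correlation of two channels (range -1..1)."""
    n = min(len(x), len(y))
    x, y = x[:n], y[:n]
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mean_x) ** 2 for a in x) *
                    sum((b - mean_y) ** 2 for b in y))
    return num / den if den > 0.0 else 0.0

rng = random.Random(0)                      # fixed seed for repeatability
noise = [rng.gauss(0.0, 1.0) for _ in range(4000)]

# Coincident microphones: both channels carry the same signal.
coincident = interchannel_correlation(noise, noise)

# Spaced microphones: the second channel is a delayed copy of the first.
delay = 48                                  # assumed delay in samples
spaced = interchannel_correlation(noise[:-delay], noise[delay:])

print(f"coincident: {coincident:.3f}, spaced: {spaced:.3f}")
```

White noise decorrelates completely under any delay, so the contrast here is more extreme than it would be for musical material.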

P16-4 Optimal Directional Pattern Design Utilizing Arbitrary Microphone Arrays: A Continuous-Wave Approach—Symeon Delikaris-Manias, Aalto University - Helsinki, Finland; Constantinos A. Valagiannopoulos, Aalto University - Espoo, Finland; Ville Pulkki, Aalto University - Aalto, Finland
A frequency-domain method is proposed for designing directional patterns from arbitrary microphone arrays employing the complex Fourier series. A target directional pattern is defined and an optimal set of sensor weights is determined in a least-squares sense, adopting a continuous-wave approach. It is based on discrete measurements with high spatial sampling ratio, which mitigates the potential aliasing effect. Fourier analysis is a common method for audio signal decomposition; however, in this approach a set of criteria is employed to define the optimal number of Fourier coefficients and microphones for the decomposition of the microphone array signals at each frequency band. Furthermore, the low-frequency robustness is increased by smoothing the target patterns at those bands. The performance of the algorithm is assessed by calculating the directivity index and the sensitivity. Applications such as virtual microphone synthesis, beamforming, and binaural and loudspeaker rendering are presented.
Convention Paper 8906
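
The directivity index used as a performance measure in the P16-4 abstract can be sketched numerically. The example assumes an axisymmetric pattern D(theta) and the standard definition DI = 10 log10(2 |D(0)|^2 / integral of |D(theta)|^2 sin(theta) over 0..pi); the first-order test patterns and the helper names are illustrative assumptions, not the patterns or the code used in the paper.

```python
import math

def directivity_index_db(pattern, steps=20000):
    """Directivity index in dB of an axisymmetric pattern D(theta),
    integrated over the sphere with the midpoint rule."""
    dtheta = math.pi / steps
    integral = 0.0
    for k in range(steps):
        theta = (k + 0.5) * dtheta
        integral += abs(pattern(theta)) ** 2 * math.sin(theta) * dtheta
    on_axis_power = abs(pattern(0.0)) ** 2
    return 10.0 * math.log10(2.0 * on_axis_power / integral)

def first_order(alpha):
    """First-order pattern alpha + (1 - alpha) * cos(theta)."""
    return lambda theta: alpha + (1.0 - alpha) * math.cos(theta)

for name, alpha in [("omni", 1.0), ("cardioid", 0.5), ("hypercardioid", 0.25)]:
    print(f"{name}: {directivity_index_db(first_order(alpha)):.2f} dB")
```

For these first-order patterns the integral has a closed form, which gives a convenient check: the cardioid has a directivity factor of 3 and the hypercardioid a factor of 4.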

P16-5 Layout Remapping Tool for Multichannel Audio Productions—Tim Schmele, Fundació Barcelona Media - Barcelona, Spain; David García-Garzón, Universitat Pompeu Fabra - Barcelona, Spain; Umut Sayin, Fundació Barcelona Media - Barcelona, Spain; Davide Scaini, Fundació Barcelona Media - Barcelona, Spain; Universitat Pompeu Fabra - Barcelona, Spain; Daniel Arteaga, Fundació Barcelona Media - Barcelona, Spain; Universitat Pompeu Fabra - Barcelona, Spain
Several multichannel audio formats coexist in the recording industry, with reduced interoperability among them. This diversity of formats leaves the end user with limited accessibility to content and/or audience. In addition, the preservation of recordings made for a particular format comes under threat should that format become obsolete. To tackle such issues, we present a layout-to-layout conversion tool that converts recordings designated for one particular layout to any other layout. This is done by decoding the existing recording to a layout-independent equivalent and then encoding it to the desired layout through different rendering methods. According to expert opinion, the tool has proven useful. Simulations show that after several consecutive conversions the results exhibit a decrease in spatial accuracy and an increase in overall gain. This suggests that consecutive conversions should be avoided and that only a single conversion from the originally rendered material should be done.
Convention Paper 8907
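
The decode-then-encode idea in the P16-5 abstract can be sketched under strong simplifying assumptions. Here the "layout-independent equivalent" is taken to be one virtual point source per source loudspeaker, and the encoding step is pairwise constant-power panning on a horizontal ring. Both choices, and the names `pan_gains`, `remap_matrix`, and `remap`, are hypothetical; the tool described in the paper supports several rendering methods that are not reproduced here.

```python
import math

def pan_gains(azimuth, target_az):
    """Constant-power gains for a virtual source at `azimuth` (degrees,
    counter-clockwise from the front) on a ring of at least two
    loudspeakers given by `target_az`."""
    n = len(target_az)
    order = sorted(range(n), key=lambda i: target_az[i] % 360.0)
    gains = [0.0] * n
    a = azimuth % 360.0
    for k in range(n):
        i, j = order[k], order[(k + 1) % n]       # adjacent loudspeaker pair
        span = (target_az[j] - target_az[i]) % 360.0 or 360.0
        offset = (a - target_az[i]) % 360.0
        if offset <= span:                        # source lies inside this pair
            frac = offset / span
            gains[i] = math.cos(frac * math.pi / 2.0)
            gains[j] = math.sin(frac * math.pi / 2.0)
            break
    return gains

def remap_matrix(source_az, target_az):
    """matrix[t][s] is the gain from source channel s to target channel t."""
    columns = [pan_gains(a, target_az) for a in source_az]
    return [[col[t] for col in columns] for t in range(len(target_az))]

def remap(channels, matrix):
    """Apply the remapping matrix to a list of equally long channels."""
    length = len(channels[0])
    return [[sum(g * ch[i] for g, ch in zip(row, channels))
             for i in range(length)] for row in matrix]

# Example: a front L/C/R layout remapped to plain stereo.
lcr = [30.0, 0.0, -30.0]
stereo = [30.0, -30.0]
matrix = remap_matrix(lcr, stereo)
stereo_out = remap([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]], matrix)
```

Chaining such matrices (layout A to B, then B to C) multiplies them together, which is one way to experiment with the accumulated spatial error that the abstract warns about.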

P16-6 Discrimination of Changing Loudspeaker Positions of 22.2 Multichannel Sound System Based on Spatial Impressions—Ikuko Sawaya, Science & Technology Research Laboratories, Japan Broadcasting Corp. - Setagaya, Tokyo, Japan; Kensuke Irie, Science & Technology Research Laboratories, Japan Broadcasting Corp. - Setagaya, Tokyo, Japan; Takehiro Sugimoto, NHK Science & Technology Research Laboratories - Setagaya-ku, Tokyo, Japan; Tokyo Institute of Technology - Midori-ku, Yokohama, Japan; Akio Ando, NHK Science & Technology Research Laboratories - Setagaya-ku, Tokyo, Japan; Kimio Hamasaki, NHK Science & Technology Research Laboratories - Setagaya, Tokyo, Japan
In 22.2 multichannel reproduction, we have sometimes listened to material played back over a loudspeaker arrangement different from the one used in production and could not clearly perceive a difference in spatial impression between the two. In this paper we discuss how changing some of the loudspeaker positions from the reference affects the spatial impression in a 22.2 multichannel sound system. Subjective evaluation tests were carried out in which the azimuthal and elevation angles were altered from the reference in each condition. Experimental results showed that, with some loudspeaker arrangements, listeners did not perceive a difference in spatial impression from the reference. On the basis of these results, we discuss the ranges of loudspeaker positions within which the spatial impression remains equivalent to the reference when the reproduced material contains signals in all channels of the 22.2 multichannel sound system.
Convention Paper 8909

P16-7 Modeling Sound Localization of Amplitude-Panned Phantom Sources in Sagittal Planes—Robert Baumgartner, Austrian Academy of Sciences - Vienna, Austria; Piotr Majdak, Austrian Academy of Sciences - Vienna, Austria
Localization of sound sources in sagittal planes (front/back and top/down) relies on listener-specific monaural spectral cues. A functional model approximating human processing of spectro-spatial information is applied to assess localization of phantom sources created by amplitude panning of coherent loudspeaker signals. Based on model predictions we investigated the localization of phantom sources created by loudspeakers positioned in the front and in the back, the effect of loudspeaker span and panning ratio on localization performance in the median plane, and the amount of spatial coverage provided by common surround sound systems. Our findings are discussed in the light of previous results from psychoacoustic experiments.
Convention Paper 8910
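
Amplitude panning between two median-plane loudspeakers, as discussed in the P16-7 abstract, can be sketched with a two-dimensional vector-base formulation. This is one common panning law, chosen here for illustration; the abstract does not state which law the authors used, and the function name and the angle convention (polar angle in the median plane, 0 degrees in front, 90 degrees above, 180 degrees behind) are assumptions.

```python
import math

def median_plane_gains(target_deg, spk_a_deg, spk_b_deg):
    """Power-normalized gains (g_a, g_b) that place a phantom source at
    `target_deg` between two median-plane loudspeakers."""
    def unit(angle_deg):
        angle = math.radians(angle_deg)
        return math.cos(angle), math.sin(angle)   # (front, up) components

    px, pz = unit(target_deg)
    ax, az = unit(spk_a_deg)
    bx, bz = unit(spk_b_deg)

    # Solve g_a * a + g_b * b = p with Cramer's rule.
    det = ax * bz - az * bx
    if abs(det) < 1e-12:
        raise ValueError("loudspeaker directions must not be collinear")
    g_a = (px * bz - pz * bx) / det
    g_b = (ax * pz - az * px) / det

    norm = math.hypot(g_a, g_b)                   # keep total power constant
    return g_a / norm, g_b / norm

# A 30-degree loudspeaker span with the phantom source aimed at the middle.
g_front, g_raised = median_plane_gains(15.0, 0.0, 30.0)
print(f"front: {g_front:.4f}, raised: {g_raised:.4f}")
```

The gains only express where the panning law aims the phantom source; whether listeners actually localize it there is the question the model in the paper addresses.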