Audio Session Programming Guide

Working with Categories

An audio session category is a key that identifies a set of audio behaviors for your app. By setting a category, you indicate your audio intentions to the system, such as whether your audio should continue when the Ring/Silent switch is set to silent. The seven audio session categories in iOS, along with a set of override and modifier switches, let you customize your app’s audio behavior.

Each audio session category specifies a particular pattern of “yes” and “no” for a set of audio behaviors, such as whether your audio mixes with other audio and whether it is silenced by the Ring/Silent switch, as detailed in Table B-1.

Most apps only need to set the category once, at launch. That said, you can change the category as often as you need to, and can do so whether your session is active or inactive. If your session is inactive, the category request is sent when you activate your session. If your session is already active, the category request is sent immediately.

Choosing the Best Category

The precise behaviors associated with each category are not under your app’s control, but rather are set by the operating system. Apple may refine category behavior in future versions of iOS. Your best strategy is to pick the category that most accurately describes your intentions for the audio behavior you want. The appendix, Audio Session Categories and Modes, summarizes behavior details for each category.

To pick the best category, consider:

Do you want to play audio, record audio, do both, or just perform offline audio processing?

Is audio that you play essential or peripheral to using your app? If essential, the best category is one that supports continued playback when the Ring/Silent switch is set to silent. If peripheral, pick a category that goes silent with the Ring/Silent switch set to silent.

Is other audio (such as the Music app) playing when the user launches your app? Checking during launch enables you to branch. For example, a game app could choose a category configuration that allows music to continue if it’s already playing, or choose a different category configuration to support a built-in app soundtrack otherwise.
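
The launch-time branch described in the last question can be sketched in Swift as follows. This is a minimal sketch, not a definitive implementation: the function name configureGameAudioSession is hypothetical, and it assumes that the ambient category suits a game that lets the user’s music continue while the solo ambient category suits a game that plays its own soundtrack.

```swift
import AVFoundation

/// Hypothetical helper: picks a category at launch depending on whether
/// another app (such as the Music app) is already playing audio.
@discardableResult
func configureGameAudioSession() throws -> AVAudioSession.Category {
    let session = AVAudioSession.sharedInstance()

    // Mix with the user's music if it is playing; otherwise take over
    // audio output so the game can play its own soundtrack.
    let category: AVAudioSession.Category =
        session.isOtherAudioPlaying ? .ambient : .soloAmbient

    try session.setCategory(category, mode: .default, options: [])
    try session.setActive(true)
    return category
}
```

Returning the chosen category lets the caller decide whether to start the built-in soundtrack.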

The following list describes the categories and the audio behavior associated with them. The AVAudioSessionCategoryAmbient category allows other audio to continue playing; that is, it makes your app mixable. The remaining categories indicate that you want other audio to stop when your session becomes active. However, you can customize the non-mixing “playback” and “play and record” categories to allow mixing, as described in Fine-Tuning a Category.

AVAudioSessionCategoryAmbient—Playback only. Plays sounds that add polish or interest but are not essential to the app’s use. Using this category, your audio is silenced by the Ring/Silent switch and when the screen locks.

AVAudioSessionCategorySoloAmbient—(Default) Playback only. Silences audio when the user switches the Ring/Silent switch to the “silent” position and when the screen locks. This category differs from the AVAudioSessionCategoryAmbient category only in that it interrupts other audio.

AVAudioSessionCategoryPlayback—Playback only. Plays audio even with the screen locked and with the Ring/Silent switch set to silent. Use this category for an app whose audio playback is of primary importance.

Note: If you choose an audio session category that allows audio to keep playing when the screen locks, you must add the audio value to the UIBackgroundModes key in your app’s Info.plist file. See UIBackgroundModes in Information Property List Key Reference for more information. You should normally not disable the system’s sleep timer via the idleTimerDisabled property. If you do disable the sleep timer, be sure to reset this property to NO when your app no longer needs to prevent screen locking. The sleep timer ensures that the screen goes dark after a user-specified interval, saving battery power.

AVAudioSessionCategoryRecord—Record only. Use AVAudioSessionCategoryPlayAndRecord if your app also plays audio.

AVAudioSessionCategoryPlayAndRecord—Playback and record. The input and output need not occur simultaneously, but can if needed. Use for audio chat apps.

AVAudioSessionCategoryAudioProcessing—Offline audio processing only. Use this category when your app processes audio without playing or recording it.

AVAudioSessionCategoryMultiRoute—Playback and record. Allows simultaneous input and output for different audio streams, for example, USB and headphone output. A DJ app would benefit from using the multiroute category. A DJ often needs to listen to one track of music while another track is playing. Using the multiroute category, a DJ app can play upcoming tracks through the headphones while the current track is played for the dancers.

Expanding Options Using the Multiroute Category

The multiroute category works slightly differently than the other categories. All other categories follow the “last in wins” rule, where the last device plugged into an input or output route is the dominant device. The multiroute category, however, enables the app to use all of the connected output ports instead of only the last-in port. For example, if you are listening to audio through the HDMI output route and plug in a set of headphones, the other categories move the audio to the headphones. With the multiroute category, your app can continue playing audio through the HDMI output route while also playing audio through the headphones.

Your app can send different audio streams to different output routes. For example, your app could send one audio stream to the left headphone, another audio stream to the right headphone, and a third audio stream to the HDMI route. Figure 2-1 shows an example of sending multiple files to different audio routes.

Figure 2-1 Sending different files to different audio routes

Depending on the device and any connected accessories, the following are valid output route combinations:

USB and headphones

HDMI and headphones

LineOut and headphones

The multiroute category supports the use of a single input port.
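
As a rough illustration of the multiroute behavior described above, the following Swift sketch selects the multiroute category and lists the output ports that the current route exposes. The function name logMultiRouteOutputs is hypothetical, and which ports appear depends on the device and its connected accessories. Assigning individual streams to specific channels is not shown here.

```swift
import AVFoundation

/// Hypothetical helper: activates a multiroute session and prints each
/// output port in the current route with its channel count.
func logMultiRouteOutputs() throws {
    let session = AVAudioSession.sharedInstance()
    try session.setCategory(.multiRoute, mode: .default, options: [])
    try session.setActive(true)

    // With the multiroute category, the route can contain several output
    // ports at once (for example, HDMI and headphones).
    for output in session.currentRoute.outputs {
        let channelCount = output.channels?.count ?? 0
        print("\(output.portName) [\(output.portType.rawValue)]: \(channelCount) channel(s)")
    }
}
```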

Setting Your Audio Session Category

For most iOS apps, setting your audio session category at launch—and never changing it—works well. This provides the best user experience because the device’s audio behavior remains consistent as your app runs.
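
A minimal Swift sketch of setting the category once at launch might look like the following. It assumes an app whose audio is essential, so it picks the playback category, and it assumes a conventional app delegate; substitute the category that matches your own intentions.

```swift
import AVFoundation
import UIKit

final class AppDelegate: UIResponder, UIApplicationDelegate {
    func application(
        _ application: UIApplication,
        didFinishLaunchingWithOptions launchOptions: [UIApplication.LaunchOptionsKey: Any]?
    ) -> Bool {
        do {
            // Set the category once. The session is still inactive here, so
            // the request takes effect when the session is activated later.
            try AVAudioSession.sharedInstance()
                .setCategory(.playback, mode: .default, options: [])
        } catch {
            print("Could not set the audio session category: \(error)")
        }
        return true
    }
}
```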

Using Modes to Specialize the Category

While categories set the base behaviors for your app, modes specialize an audio session category. Set the mode for a category to further define the audio behaviors of your app. There are seven modes to choose from, including:

AVAudioSessionModeDefault—Default mode that works with all categories and configures the device for general usage.

AVAudioSessionModeVoiceChat—For Voice over IP (VoIP) apps. This mode can be used only with the AVAudioSessionCategoryPlayAndRecord category. Signals are optimized for voice through system-supplied signal processing, and the mode sets AVAudioSessionCategoryOptionAllowBluetooth.

The set of allowable audio routes is optimized for the voice chat experience. When the built-in microphones are used, the system automatically chooses the best combination of built-in microphones for the voice chat.

AVAudioSessionModeVideoChat—For video chat apps such as FaceTime. The video chat mode can only be used with the AVAudioSessionCategoryPlayAndRecord category. Signals are optimized for voice through system-supplied signal processing, and the mode sets AVAudioSessionCategoryOptionAllowBluetooth and AVAudioSessionCategoryOptionDefaultToSpeaker.

The set of allowable audio routes is optimized for the video chat experience. When the built-in microphones are used, the system automatically chooses the best combination of built-in microphones for the video chat.

Note: Apple recommends that apps using voice or video chat also use the Voice-Processing I/O audio unit. The Voice-Processing I/O unit provides several features for VoIP apps, including automatic gain correction, adjustment of voice processing, and muting. See Voice-Processing I/O Unit in Audio Unit Hosting Guide for iOS for more information.

AVAudioSessionModeGameChat—For game apps. This mode is set automatically by apps that use a GKVoiceChat object and the AVAudioSessionCategoryPlayAndRecord category. Game chat mode uses the same routing parameters as the video chat mode.

AVAudioSessionModeMeasurement—For apps that need to minimize the amount of system-supplied signal processing to input and output signals. This mode can only be used with the following categories: record, play-and-record, and playback. Input signals are routed through the primary microphone for the device.
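
To illustrate how a mode is paired with a category, here is a short Swift sketch that configures a session for voice chat. The function name configureVoiceChatSession is hypothetical. The sketch relies on the rule stated above that the voice chat mode requires the play-and-record category.

```swift
import AVFoundation

/// Hypothetical helper: configures and activates a session for a VoIP call.
func configureVoiceChatSession() throws {
    let session = AVAudioSession.sharedInstance()

    // The voice chat mode is valid only with the play-and-record category,
    // so both are requested together in a single call.
    try session.setCategory(.playAndRecord, mode: .voiceChat, options: [])
    try session.setActive(true)
}
```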

You can programmatically influence the audio output route. When using the AVAudioSessionCategoryPlayAndRecord category, audio normally goes to the receiver (the small speaker you hold to your ear when on a phone call). You can redirect audio to the speaker at the bottom of the phone by using the overrideOutputAudioPort:error: method.
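
A hedged Swift sketch of this override follows. The function name routeOutputToSpeaker is hypothetical. Note that, as far as the AVAudioSession documentation describes it, the override is temporary and is dropped when the route changes or the session deactivates, so an app that always wants the speaker may prefer the AVAudioSessionCategoryOptionDefaultToSpeaker option instead.

```swift
import AVFoundation

/// Hypothetical helper: sends play-and-record output to the built-in
/// speaker, or removes the override so output returns to the receiver.
func routeOutputToSpeaker(_ useSpeaker: Bool) throws {
    let session = AVAudioSession.sharedInstance()
    try session.setCategory(.playAndRecord, mode: .default, options: [])

    // .speaker forces the built-in speaker; .none clears the override.
    try session.overrideOutputAudioPort(useSpeaker ? .speaker : .none)
}
```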

Finally, you can enhance a category to automatically lower the volume of other audio when your audio is playing. This could be used, for example, in an exercise app. Say the user is exercising along to the Music app when your app wants to overlay a verbal message, for instance, “You’ve been rowing for 10 minutes.” To ensure that the message from your app is intelligible, apply the AVAudioSessionCategoryOptionDuckOthers option to your audio session. When ducking takes place, all other audio on the device (apart from phone audio) lowers in volume. Apps that use ducking must manage their session’s activation state. Activate the audio session prior to playing the audio and deactivate the session after playing the audio.
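
The activate, play, deactivate sequence for ducking can be sketched in Swift as follows. This is an illustrative sketch under stated assumptions: the PromptPlayer class is hypothetical, it assumes the verbal message is a local audio file played with AVAudioPlayer, and it uses the playback category as the base for the ducking option.

```swift
import AVFoundation

/// Hypothetical helper: plays a short spoken prompt over other audio,
/// ducking that audio only while the prompt is playing.
final class PromptPlayer: NSObject, AVAudioPlayerDelegate {
    private var player: AVAudioPlayer?

    func play(promptAt url: URL) throws {
        let session = AVAudioSession.sharedInstance()
        try session.setCategory(.playback, mode: .default, options: [.duckOthers])

        // Activating the session is what causes other audio to duck.
        try session.setActive(true)

        let player = try AVAudioPlayer(contentsOf: url)
        player.delegate = self
        self.player = player  // Keep a strong reference during playback.
        player.play()
    }

    func audioPlayerDidFinishPlaying(_ player: AVAudioPlayer, successfully flag: Bool) {
        self.player = nil
        // Deactivating the session lets other audio return to full volume.
        try? AVAudioSession.sharedInstance()
            .setActive(false, options: .notifyOthersOnDeactivation)
    }
}
```

Deactivating in the delegate callback, rather than right after calling play(), keeps other audio ducked until the prompt has actually finished.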

Recording Permission

Starting in iOS 7, your app must ask for and receive permission from the user before it can record audio. If the user does not give your app permission to record audio, then only silence is recorded. The system automatically prompts the user for permission when you use a category that supports recording and the app attempts to use an input route.

Instead of waiting for the system to prompt the user for recording permission, you can also use the requestRecordPermission: method to manually ask the user for recording permission. Using the requestRecordPermission: method allows your app to get recording permission from the user at a time that doesn’t interrupt the natural flow of the app. This provides a smoother experience for the user.
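
A minimal Swift sketch of asking for permission at a moment of your choosing follows. The function name requestMicrophoneAccess and its handler parameter are hypothetical. The sketch assumes the response block may arrive on a background thread, so it hops to the main queue before calling back. On recent iOS versions, the app is also expected to declare a microphone usage description (NSMicrophoneUsageDescription) in its Info.plist file, and newer SDKs mark this method as deprecated in favor of an equivalent on AVAudioApplication; check the current AVAudioSession reference before relying on it.

```swift
import AVFoundation

/// Hypothetical helper: asks the user for recording permission and reports
/// the answer on the main queue.
func requestMicrophoneAccess(then handler: @escaping (Bool) -> Void) {
    AVAudioSession.sharedInstance().requestRecordPermission { granted in
        // Deliver the result on the main queue so the caller can safely
        // update the user interface.
        DispatchQueue.main.async {
            handler(granted)
        }
    }
}
```

A caller might invoke this when the user first taps a record button, and start recording only if the handler receives true.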