Session 503, WWDC 2016

AVFoundation is a powerful framework for media operations, providing capture, editing, playback, and export. Learn about new APIs and methods for media playback. Achieve gapless playback transitions between assets, create seamless loops, simplify your playback logic with "autowait", and see how to deliver an even faster playback startup experience.

[ Music ]

Good morning.

[ Applause ]

Welcome to our session on Advances in AVFoundation Playback.

My name is Sam Bushell.

Today we're going to talk about some new enhancements that we've added to try and smooth over some rough edges that some developers have found challenging.

So AVFoundation provides APIs for a very broad selection of multimedia activities, including playback, capture, export, and many kinds of editing.

I'll be focusing mostly on playback.

AVFoundation supports playback from a very wide selection of media formats from local storage.

And in most cases you can take the same file, and you can put it on a web server, and then AVFoundation can play that over the network.

The file format in this case is the same,but the IO is over the network.

We call this progressive download playback.

Once we start downloading that file,even if the network characteristics change,we will continue with the same file.

HTTP Live Streaming is more dynamic.

Generally, the base URL refers to a master playlist which introduces multiple playlists for the same content but varying in bit rate and format and maybe in language.

And each of these playlists references segments containing the actual compressed media.

So let's talk about what we're going to talk about today.

We're going to discuss the playback changes to do with the pre-playback buffering period.

We're going to introduce a new API to simplify looping playback of a single file.

And then we'll spend the rest of our time discussing a popular topic: optimization of startup time in playback apps.

Let's start by waiting for the network.

Because when we play media over the Internet, we're at the mercy of the network.

We don't want to start too soon, or playback may stall.

We don't want to start too late or the user may give up on us.

We want to start at that Goldilocks moment and start playback when we have enough data that we'll be able to play consistently and not stall.

Here is the existing API.

AVPlayerItem provides three Boolean properties.

playbackLikelyToKeepUp, playbackBufferFull,and playbackBufferEmpty.

playbackLikelyToKeepUp is true if AVFoundation's algorithm believes that if you were to start playing now, you could keep on playing without stalling until you got to the end.

playbackBufferFull is true if the buffer holds as much data as it's going to.

So if you haven't started playing back yet, you might as well.

playbackBufferEmpty means that you are stalling or you're about to stall.

So for progressive download playback in iOS 9 and earlier, AVFoundation clients must monitor these properties themselves and wait until playbackLikelyToKeepUp is true or playbackBufferFull is true before setting the AVPlayer's rate property to 1.

For HTTP Live Streaming, the rules are simpler.

You can set AVPlayer's rate property to 1 as soon as the user chooses to play, and it will automatically wait to buffer sufficient media before playback begins.

We are streamlining the default API contract in the 2016 OS releases.

iOS, macOS, tvOS.

For apps linked on or after iOS 10, macOS Sierra, tvOS 10, the same rules for HLS will also apply to progressive download playback.

When the user clicks play, you can immediately set AVPlayer's rate property to 1 or call the play method, which is the same thing.

And AVFoundation will automatically wait to buffer enough to avoid stalling.

If the network drops out during playback and playback stalls, the rate property will stay set to 1.

And so it will buffer again and automatically resume when sufficiently buffered.

If you're using the AVKit or MediaPlayer framework to present your playback UI, it already supports automatic waiting for buffering, and it will continue to.

If your application uses AVFoundation directly and you build your own playback UI, you may need to make some adjustments.

So what should we call this new API?

Well, the word Autoplay has been used in QTKit and also in HTML5, but we came to the conclusion that from the perspective of this AVPlayer API, the playback is not the automatic part.

It's the waiting.

So the formal name for this API is automaticallyWaitsToMinimizeStalling.

But you can call it Autoplay if you like.

Network playback now looks like a state machine with three states.

Paused, waiting, and playing.

We start in the paused state until the user chooses to play.

And then the app calls play, and we move to the waiting state.

When the playbackLikelyToKeepUp property becomes true, the player progresses to the playing state.

Now, if the buffer should become empty, the player will switch back to the waiting state until we're likely to keep up again.

Should the user pause, we'll return to the paused state.

Now there's one further transition available.

Recall that in iOS 9 and earlier, before this change, you could call play before playback was likely to keep up, and playback would start immediately even if it might stall.

So we preserved this semantic by providing another method, playImmediately(atRate:), which jumps you straight into the playing state from either the paused or the waiting states.

Be aware that this may lead to a stall that the patient waiting state would avoid.

So be careful.

AVPlayer's rate property might not mean what you thought it meant.

Let's recap so everyone's clear.

The player's rate property is the app's requested playback rate.

Not to be confused with the timebase rate of the player item, which is the rate at which playback is actually occurring.

We've added two new properties in this release to give you more detail.

One is the timeControlStatus, which tells you which of these states you're in: paused, waiting, or playing.

And if you're in the waiting state, the reasonForWaitingToPlay property tells you why.

For example, you could be in the waiting state, so the AVPlayer's rate property could be 1.

The timebase's rate would be 0 because you're waiting.

The timeControlStatus would say waitingToPlayAtSpecifiedRate.

And the reasonForWaitingToPlay could be AVPlayerWaitingToMinimizeStallsReason.
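
To make the states concrete, here's a minimal sketch (assuming `player` is your AVPlayer; the observation approach is ours, not the session's) of watching these properties with key-value observing:

```swift
import AVFoundation

// A minimal sketch, assuming `player` is an AVPlayer you've set up elsewhere.
// Both properties are key-value observable, so we can drive UI from them.
let statusObservation = player.observe(\.timeControlStatus, options: [.new]) { player, _ in
    switch player.timeControlStatus {
    case .paused:
        print("paused")
    case .waitingToPlayAtSpecifiedRate:
        // reasonForWaitingToPlay explains the wait,
        // e.g. AVPlayer.WaitingReason.toMinimizeStalls.
        print("waiting: \(player.reasonForWaitingToPlay?.rawValue ?? "unknown")")
    case .playing:
        print("playing")
    @unknown default:
        break
    }
}
```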

So with that background, I'd like to introduce my friend Moritz Wittenhagen, who is much braver than me, as he is going to attempt a network playback demo live on stage.

So everyone cross your fingers and give him a hand.

[ Applause ]

Well, good morning everyone.

I want to start by showing you a little bit of the setup we have on stage here.

And I have my iPad, which you can see mirrored on the screen there.

And that iPad is joining a network that is hosted by my Mac.

And what that allows me to do is I can use the Network Link Conditioner to actually limit the network connection that this iPad has available.

I can do that using the Network Link Conditioner preference pane.

Sam will tell you in a minute where to find that.

And I've set up a profile called Slow Server that limits this to a mediocre network connection that's a little slower than the media bitrate that we actually want to play.

It's currently turned off.

And we'll leave it off, and let's look at what the iPad does in a decent network situation.

So what I have here is just a selection, and I can just select one video.

Let me do that.

And what you see is that the video immediately loads, and we see that we're currently not playing.

You see this wonderful engineering UI underneath that gives us all the properties and functionality involved in automatic waiting.

This is really just taken from AVPlayer and AVPlayerItem.

So these are the properties that you have available if you need to know what automatic waiting is doing.

So right now we are paused, so the rates are all zero.

Current time is at zero.

But the interesting thing is, since we're in a fast network, we've loaded 39 seconds of the video, which is actually the whole video.

And we're currently likely to keep up.

What that means is that when I just hit play now, the video just starts playing without any problem.

Now we want to see what happens in a bad network situation.

So let's turn on the Network Link Conditioner on the Mac.

Here we go.

And now not much changed for this video.

Because as I said, it was already buffered.

It had already buffered the whole video.

So when I go back and load this again, I want you to pay attention to loadedTimeRanges and isPlaybackLikelyToKeepUp again.

So let's do it.

Reload the video.

And now what we see is that loadedTimeRanges only slowly increases.

And isPlaybackLikelyToKeepUp is false.

Eventually it will become true.

And at that moment, we're at the same state that we were in before, where we're now ready to play and playback will just start.

Now let's try this one more time, and this time I will hit play right after I've loaded the video.

So this time we don't have enough data, and we go into this waiting state.

And you see the spinner telling the user that playback is waiting.

Eventually we will become ready to play, and playback just starts.

There's one more thing we can do.

And that is immediate playback.

So let's also try this.

I go into the video and immediately click Play Immediately.

And we see that playback starts, but then we quickly run into a stall because we didn't have enough buffer to play to the end.

In that case, we'll go into the waiting state and re-buffer until we have enough to play through.

And that was a short demo of automatic waiting.

Let's go back to Sam and the slides.

[ Applause ]

Thanks, Moritz.

Let's recap what was happening in the middle there.

So when we set a slower network speed, close to the data rate of the movie, the movie started out paused.

When he hit play, it went into the waiting state.

Because playback was not yet likely to keep up.

Notice that at this time, the player's rate was 1, but the timebase rate was 0.

After a few seconds, AVFoundation determined that playback was likely to keep up, and so it set the state to playing, and now you see that the player rate and the timebase rate are both 1.

It may have occurred to you that there's a little bit more detail available in the timeControlStatus than in the player's rate property.

Next, let's talk about gapless transitions and looping. It's necessary to load media data and decode some of it before you can actually start playing it out.

This process of filling up the playback pipelines before playback starts is called preroll.

So what we'd like to be able to do here is to have AVFoundation be in on the plan.

If AVFoundation knows about playback item B early enough, then it can begin prerolling and decoding before item A has finished playing out.

And so it can optimize the transition from A to B.

If item B is super short, then AVFoundation may even start work on the transition to item C.

AVFoundation's tool for achieving this is AVQueuePlayer.

AVQueuePlayer is a subclass of AVPlayer, which has an array of AVPlayerItems called the play queue.

The current item is the one in the first position of the array.

Now you can use AVQueuePlayer to optimize transitions between items that are different, but for the case of looping, you can create multiple AVPlayerItems from the same AVAsset.

This is just another optimization, since AVFoundation does not have to load and parse the media file multiple times.

And just a reminder, the play queue is not a playlist.

Please do not load the next 10,000 items that you think you might like to play into the play queue.

That's not going to be efficient.

The purpose of the play queue is to provide information about items to be played in the near future so that AVFoundation can optimize transitions.

The design pattern when you want to loop a single media file indefinitely is to make a small number of AVPlayerItems and put them in the AVQueuePlayer's queue with their actionAtItemEnd property set to advance.

When playback reaches the end of one item, it will be removed from the play queue as playback advances to the next one.

And when you get the notification that that has happened, you can take that finished item, set its current time back to the start, and put it on the end of the play queue to reuse it.

We call this pattern the treadmill.

And you can implement the treadmill pattern yourself using AVQueuePlayer.

We have sample code to help.

The slightly tricky detail is that you have to set up key-value observing to watch when the item is removed and then seek it back to the start.

And then add it to the end of the play queue again.
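
As an illustration, here's a minimal sketch of that flow; the class name and the two-item queue are our assumptions, and the session's actual sample code handles more cases:

```swift
import AVFoundation

// A minimal sketch of the treadmill, assuming `asset` is the AVAsset to loop.
final class Treadmill {
    let player = AVQueuePlayer()
    private var observation: NSKeyValueObservation?

    init(asset: AVAsset) {
        for _ in 0..<2 {
            let item = AVPlayerItem(asset: asset)
            item.actionAtItemEnd = .advance
            player.insert(item, after: nil) // nil appends to the end of the queue
        }
        startObserving()
    }

    private func startObserving() {
        observation = player.observe(\.currentItem, options: [.old]) { [weak self] player, change in
            guard let self = self, let finished = change.oldValue ?? nil else { return }
            // Deactivate KVO while we change the play queue, to avoid recursion.
            self.observation = nil
            finished.seek(to: .zero, completionHandler: nil)
            player.insert(finished, after: nil)
            self.startObserving()
        }
    }
}
```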

As you can see, in this code we are deactivating our KVO observer while we change the play queue to avoid any chance of recursion.

So this is clearly doable.

It's just a little fiddly.

And the feedback that we received was that it would be awfully swell if we could make this easier.

So we're introducing AVPlayerLooper, which implements the treadmill pattern for you.

You give it an AVQueuePlayer.

[ Applause ]

You give it an AVQueuePlayer and a template AVPlayerItem, and it constructs a small number of copies of that AVPlayerItem, which it then cycles through the play queue until you tell it to stop.

Adopting AVPlayerLooper, the code for the simple case is really much simpler.
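
For comparison, a sketch of what adoption might look like, with `videoURL` standing in for your media file:

```swift
import AVFoundation

// A sketch of the AVPlayerLooper version; `videoURL` is assumed.
let player = AVQueuePlayer()
let templateItem = AVPlayerItem(url: videoURL)

// The looper builds copies of the template item and cycles them through the
// play queue. Keep a strong reference to it while looping should continue.
let looper = AVPlayerLooper(player: player, templateItem: templateItem)

player.play()
```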

So I want to give you a demo of this on an iPad I have over here.

So here's a piece of sample code.

Video Looper, I'm going to launch that.

And I have added a media file of my own here, and we're going to play it with AVPlayerLooper.

[ Music ]

Don't you feel mellow?

Okay, this is clearly looping, and the code is pretty much what I pointed out.

It's fairly simple.

This would be an appropriate tool to use, for example, if you have a tvOS app and you'd like to loop background video behind a title menu.

All right, let's return to slides.

We've talked a bit about how to loop.

I want to spend a moment on what to loop.

Ideally, if you have both audio and video tracks, they should be precisely the same length.

Why? Well, if the audio track is longer, then that means that near the end there's a period of time when audio should be playing but video should not.

We have an empty segment of video, so what should the video do?

Should it go away?

Should you freeze on one frame?

Conversely, if the video track is longer, then there's a period of time when the audio should be silent.

So when you build media assets for looping, take the time to make sure that the track durations match up.

In QuickTime Movie files, the track duration is defined by the edit list.

Now if the media asset to loop is not entirely under your control, another possibility is that you could set the AVPlayerItem's forwardPlaybackEndTime to the length of the shortest track.

This will have the effect of trimming back the other tracks to match.
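
As a sketch of that idea (assuming `asset` already has its tracks loaded; the helper logic is ours, not the session's):

```swift
import AVFoundation

// A sketch: trim playback to the shortest track so all tracks end together
// at the loop point. Assumes `asset`'s tracks are already loaded.
let item = AVPlayerItem(asset: asset)
if let shortest = asset.tracks.map({ $0.timeRange.duration }).min() {
    item.forwardPlaybackEndTime = shortest
}
```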

All right, next let's look at an optimization that we've made in the playback pipeline that may have an impact on your applications.

Suppose that we are currently playing, and the list of playing tracks changes.

For example, we could change the subtitle language or the audio language.

Audio from English to French.

Here I'll change the subtitle language from English to Spanish.

Or we could remove the AVPlayerLayer that was displaying the video.

Or we could add an AVPlayerLayer and begin displaying video.

Well, in all of these cases in iOS 9, AVFoundation will pause playback, adjust the playback pipelines to match the list of enabled tracks, and then resume playback.

In some cases, this even causes video to snap back to a key frame.

Well, I will say we have received constructive feedback from users and developers about this.

And so I'm happy to say that in iOS 10 and its other 2016 siblings, these changes will no longer cause playback to pause.

Adding or removing the only AVPlayerLayer on a playing AVPlayer, changing the subtitle language or the audio language on a playing AVPlayer, or manually disabling or enabling tracks.

We think that this is an enhancement for users and developers.

However, it's a significant change in API behavior, and so I would ask you, please take a look in the seeds and see if it leads to any complications in your apps.

If you find an issue with this that looks like it's a bug on our side, then please provide feedback by filing a bug using the Apple Bug Reporter system.

And as always when filing a bug, please try to give us everything we need in order to reproduce the problem ourselves.

Next, a word about wide color. In our APIs, we generally represent color space choices through the use of enumerated strings, since they're easier to print and display and debug.

But in media files, these are represented by numbers.

And these standard tag numbers are defined in an MPEG specification called Coding Independent Code Points.

That sounds like a paradox, doesn't it?

How can you be coding independent code points?

Well, it's less of a paradox if you read it as codec independent code points.

The job of the spec is to make sure that the assignment of these tag numbers is done in a manner that is harmonious across all codecs and file formats.

So the interpretation of numbers will be the same in QuickTime Movie, MPEG-4, H.264, and so forth.

All right, with that background, let's look at a few new APIs.

We have introduced a new media characteristic that tells you that a video track is tagged with wider color primaries, something wider than the Rec. 709 primaries.

If your app finds that there is wide gamut video, it might be appropriate for your app to take steps to preserve it, so it isn't clamped back into the 709 space.

If not, it's actually generally best to stay within Rec. 709 for processing.

So you can specify a working color space when you set up an AVPlayerItemVideoOutput or an AVAssetReaderOutput.

And you will then receive buffers that have been converted into that color space.

You can also specify a target color space when setting up an AVAssetWriterInput, in which case the source image buffers that you provide will be converted into that color space prior to compression.
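
For instance, here's a hedged sketch of setting a target color space on an AVAssetWriterInput; the codec and dimensions are placeholder assumptions:

```swift
import AVFoundation

// A sketch of specifying a target color space for compression; the codec and
// dimensions are illustrative placeholders.
let outputSettings: [String: Any] = [
    AVVideoCodecKey: AVVideoCodecType.h264,
    AVVideoWidthKey: 1920,
    AVVideoHeightKey: 1080,
    // Source buffers will be converted to this color space before encoding.
    AVVideoColorPropertiesKey: [
        AVVideoColorPrimariesKey: AVVideoColorPrimaries_ITU_R_709_2,
        AVVideoTransferFunctionKey: AVVideoTransferFunction_ITU_R_709_2,
        AVVideoYCbCrMatrixKey: AVVideoYCbCrMatrix_ITU_R_709_2,
    ],
]
let writerInput = AVAssetWriterInput(mediaType: .video, outputSettings: outputSettings)
```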

With AVPlayerItemVideoOutput or AVAssetReaderOutput, if you don't want image buffers to be converted into a common color space, then you should set the AVVideoAllowWideColorKey to true, and then you'll receive buffers in their original color space.

This is effectively a promise that whatever software receives and processes those buffers, whether it's ours or yours, will examine and honor their color space tags.

There are analogous properties for configuring video compositions.

First, you can specify a working color space for an entire video composition.

Alternatively, if you have a custom video compositor, you may choose to make it wide color aware.

You can declare that your custom video compositor is wide color aware, and that it examines and honors color space tags on every single source frame buffer, by implementing the optional supportsWideColorSourceFrames property and returning true.

Rounding it out with a reminder: if you create picture buffers manually, for example using a pixel buffer pool or Metal, then you should explicitly set the color space tags on every buffer by calling Core Video's APIs.

Most developers won't need to do this.

In most cases, when you're using a color-space-aware API for source buffers, that'll take care of tagging them for you.

By popular request, I'm going to spend the rest of our time discussing some best practices for optimizing playback startup time.

I'll talk about local file playback first.

And then we'll move on to HTTP Live Streaming.

Now some of these optimization techniques may be counterintuitive at first.

They require you to consider things from the perspective of AVFoundation.

And to think about when it gets the information it needs to do what your app is asking it to do.

For example, here is a straightforward piece of code for setting up playback of a local file.

We start with the URL to the file.

We create an AVURLAsset representing the media in that file.

We then create an AVPlayerItem to hold the mutable state for playback, and an AVPlayer to host playback.

And then we create an AVPlayerLayer to connect video playback into our display hierarchy.
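
A sketch of that straightforward order, with `fileURL` standing in for your file:

```swift
import AVFoundation

// The straightforward order described above; `fileURL` is assumed.
let asset = AVURLAsset(url: fileURL)
let playerItem = AVPlayerItem(asset: asset)
let player = AVPlayer(playerItem: playerItem)   // pipeline setup begins here...
let playerLayer = AVPlayerLayer(player: player) // ...before video was requested
```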

Now this code is correct, but it has a small flaw, which you may not initially see.

As soon as the player item is set as the player's current item, the player starts setting up the playback pipeline.

Now it doesn't know the future.

It doesn't know that you're going to set an AVPlayerLayer later.

So it sets things up for audio-only playback.

And then when the AVPlayerLayer is added, now AVFoundation knows that the video needs to be decoded too.

And so now it can reconfigure things for audio and video playback.

Now, as I said earlier, we have made enhancements in this year's OS releases, so minor changes to the list of enabled tracks do not necessarily cause an interruption.

But it's still ideal to start with the information that AVFoundation needs in order to get things right the first time.

So I'm going to change this code a little bit.

I'm going to move where the AVPlayerItem is connected to the AVPlayer.

So now the player is created with no current item, which means it has no reason to build playback pipelines yet.

And that doesn't change when you add the AVPlayerLayer.

Playback pipelines don't get built until the player item becomes the current item.

And by that point, the player knows what it needs to get things right the first time.

We can generalize this.

First, create the AVPlayer and AVPlayerItem objects.

And set whatever properties you need to on them, including connecting the AVPlayer to an AVPlayerLayer or an AVPlayerItem to an AVPlayerItemVideoOutput.

Now this might seem crazy, but if you just want playback to start right away, you can tell the player to play before you give it the item to play.

Why would you do this?

Well, if you do it the other way around, the player initially thinks that you wanted to display the still frame at the start of the video.

And it might waste some time on that before it gets the message that actually you just want playback.

Again, starting with the actual goal may shave off a few milliseconds.
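
Putting those tips together, here's a sketch of the recommended order, under the same `fileURL` assumption:

```swift
import AVFoundation

// The recommended order, with the same assumed `fileURL`.
let asset = AVURLAsset(url: fileURL)
let playerItem = AVPlayerItem(asset: asset)

let player = AVPlayer()                          // no current item yet
let playerLayer = AVPlayerLayer(player: player)  // video is wanted: known up front

player.play()                                    // state the goal first
player.replaceCurrentItem(with: playerItem)      // pipelines built once, correctly
```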

Let's move on to HLS.

The timeframes we're trying to optimize with HLS are longer, because they're dominated by network IO, which is much slower than local file storage.

So the potential benefits of optimizations are much more noticeable.

The network IO breaks down into four pieces.

Retrieving the master playlist; that's the URL you passed to AVURLAsset.

If the content is protected with FairPlay Streaming, retrieving content keys; retrieving the selected variant playlists for the appropriate bitrate and format of video and audio; and retrieving some media segments that are referenced in that playlist.

Now the media segments will be the highest amount of actual data transfer, but with network IO we need to think about round-trip latency.

Some of these stages are serialized.

You can't download things from a playlist until you've received the playlist.

So a thing to think about then is, can we do any of these things before the user chooses to play?

For example, maybe in your app you display a title card when content is first selected, and that gets the user to say, is this actually the one I wanted to play?

Or do I want to read some information about it.

So the question is, could we do some small amount of network IO speculatively when the user has identified the content they probably want to play, before they make it official?

Well, AVURLAsset is a lazy API.

It doesn't begin loading or parsing any data until someone asks it to.

To trigger it to load data from the master playlist, we need to ask it to load a value that would derive from it, like duration or availableMediaCharacteristicsWithMediaSelectionOptions.

Duration is easy to type.

You don't have to provide a completion handler here unless you're actually going to do something with that value.
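
A sketch of that speculative kick-off, with `streamURL` assumed:

```swift
import AVFoundation

// Speculatively trigger the master playlist fetch while the user is still on
// the title card; `streamURL` is assumed.
let asset = AVURLAsset(url: streamURL)
asset.loadValuesAsynchronously(forKeys: ["duration"]) {
    // Requesting the key is what triggers the IO; we don't need the value here.
}
```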

Speaking of playlists, they compress really easily, and we've supported compressing them with gzip for many years.

On a giant iPad Pro, there are a lot of pixels, so we could go up to a big variant for full screen.

But if we play picture-in-picture, we don't need such a high resolution anymore.

And a lower bitrate variant could reduce the size of our cache and help us make more memory available for other apps.

If the network connection is slow on any device, then that's going to be the limiting factor.

So what this means is that AVFoundation needs to take into account both the display dimensions and the network bitrate when choosing the variant.

AVFoundation uses the AVPlayerLayer's size on the screen to evaluate the dimensions.

So set up your AVPlayerLayer at the correct size, and connect it to the AVPlayer as early as you can.

It can be hidden behind other UI if you're not ready to show video yet.

On a Retina iOS device, it's currently necessary to set contentsScale manually.
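
A small sketch of that setup; `player` and the host `view` are assumptions:

```swift
import AVFoundation
import UIKit

// Size the layer early, and set contentsScale on a Retina device so variant
// selection sees true pixel dimensions. `player` and `view` are assumed.
let playerLayer = AVPlayerLayer(player: player)
playerLayer.frame = view.bounds
playerLayer.contentsScale = UIScreen.main.scale
view.layer.addSublayer(playerLayer)
```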

As for bitrate, well, AVFoundation is in a bit of a chicken-and-egg situation when it comes to playback first beginning.

It has to choose some variant, but it does not know what bitrate it's going to get.

Once it's begun downloading segments, it can use the statistics from those downloads to adjust the choice of variant.

But for that first variant, it hasn't gathered any statistics yet.

So AVFoundation's base algorithm is to pick the first applicable variant in the master playlist.

If that's a low bitrate option, the user will start out seeing something blurry, but AVFoundation will soon decide what the actual network bitrate is and switch up to the appropriate variant.

Well, the question is, what if you would like to try to improve that initial choice?

Well, remember, there is a tradeoff you have to make between initial quality and startup time.

A higher bitrate first segment takes longer to download.

And that means it will take longer to start.

You might decide that it's best to start with a lower bitrate variant in order to start faster.

Well, one way to make the tradeoff is to figure out a minimum acceptable quality level you'd like to see on a particular size of screen, and start there.

Then AVFoundation will switch to a higher quality after playback begins, as the network allows.

And maybe you know one thing that AVFoundation doesn't.

Maybe your app just played a different piece of content.

And maybe you can use that playback's access log to make a better guess about the bitrate that the next playback session is going to get.

So let's suppose that you come up with a heuristic based on startup quality and recent bitrate statistics.

And you decide on a way to choose which variant you want to start with.

Well, how do we plug that choice into AVFoundation?

There are two techniques that have been used.

Here's the first technique.

On the server, you have to sort your variants from highest to lowest.

Like that.

And then in your app, you need to set the player item's preferredPeakBitRate to your bitrate guess.

This will eliminate the higher bitrate variants from initial selection.

Shortly after playback starts, you should reset that control back to zero, which will allow AVFoundation to move up to a higher bitrate variant if the network improves.
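
Here's a sketch of that first technique; `player`, `playerItem`, and the `estimatedBitrate` heuristic value are assumptions:

```swift
import AVFoundation

// Technique 1, sketched; `player`, `playerItem`, and `estimatedBitrate` are assumed.
playerItem.preferredPeakBitRate = estimatedBitrate // e.g. 2_000_000 bits per second
player.replaceCurrentItem(with: playerItem)
player.play()

// Shortly after playback begins (say, when timeControlStatus becomes .playing),
// lift the cap so AVFoundation can switch up as the network allows.
playerItem.preferredPeakBitRate = 0
```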

The second technique is to dynamically rewrite the master playlist in your app and move your preferred choice to the top of the list.

To do this, use a custom URL scheme for the AVURLAsset, and implement the AVAssetResourceLoaderDelegate protocol, in which you can supply that rewritten playlist in response to the load request for that custom URL scheme.
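
And a sketch of the second technique; the custom scheme name is made up, and the actual playlist fetching and rewriting are elided:

```swift
import AVFoundation

// Technique 2, sketched. The custom scheme routes the master playlist request
// through our delegate, which answers with a rewritten playlist.
final class PlaylistRewriter: NSObject, AVAssetResourceLoaderDelegate {
    func resourceLoader(_ resourceLoader: AVAssetResourceLoader,
                        shouldWaitForLoadingOfRequestedResource loadingRequest: AVAssetResourceLoadingRequest) -> Bool {
        guard loadingRequest.request.url?.scheme == "rewritten-hls" else { return false }
        let rewritten = Data() // placeholder: fetch the real playlist, reorder variants
        loadingRequest.dataRequest?.respond(with: rewritten)
        loadingRequest.finishLoading()
        return true
    }
}

let rewriter = PlaylistRewriter() // keep a strong reference; the delegate is not retained
let asset = AVURLAsset(url: URL(string: "rewritten-hls://example.com/master.m3u8")!)
asset.resourceLoader.setDelegate(rewriter, queue: .main)
```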

I want to remind you to profile your code too.

Look for any delays before you call AVFoundation.

In particular, you do not need to wait for likelyToKeepUp to become true before you set the player rate.

You don't need to now, and in fact, you never have for HLS.

Make sure that you release AVPlayers and AVPlayerItems from old playback sessions so that they do not waste bandwidth in the background.

You can use the Allocations instrument in Instruments to check the lifespans of AVPlayer and AVPlayerItem objects.

And if you have an application that does other network activity, consider whether you should suspend it during network playback so that the user can take full advantage of available bandwidth for playback.

To sum up: we've introduced a new API called AVPlayerLooper to simplify using the treadmill pattern to loop playback of a single item.

Changing the set of enabled tracks during playback no longer always causes a brief pause.

And we've looked at the AVFoundation APIs that you can use to prepare your app for wide color video.

Finally, we talked about optimizing playback startup for local files and for HLS.

In short, avoid accidentally asking for work you don't need.

And for the work you do need, see if you can do it earlier.

We'll have more information at this URL about this session, including sample code that we've shown.

We have some related sessions that you might like to see in person or catch up on online.

The bottom one is an on-demand-only one that you can watch in the app.

Thank you for your attention.

It's been a pleasure.

I hope you have a great week.
