Understanding Lync Video Quality Reports

In any Lync Server environment, be it a test lab or a production deployment, the Monitoring Server role and its accompanying reports should be considered a key component. With improvements in Lync 2013 this role is even easier to deploy than in past versions, providing a wide array of data captured by endpoints across the various communications modalities. Leveraging the User Activity Reports is one of the best ways to review this detailed information and then make informed decisions when planning bandwidth for video conferencing.

This article does not cover the actual deployment and configuration of these components. This is one of the easier parts of Lync to deploy and is covered in the Deploying Monitoring section of the TechNet documentation. What this article does cover, however, is where to locate and how to read Media Quality Reports for both peer-to-peer and multiparty conference calls, focusing specifically on the video payload information.

The concepts explained here will be important to understand for an upcoming article which will explore in greater detail the various capabilities of H.264 SVC as utilized by Lync Server 2013. This new codec provides for a much larger range of resolutions than earlier codecs like Real-Time Video so extrapolating this information from the reports can be a little more complicated than in the past.

Environment

Instead of just randomly looking at reports for various past sessions it would be better to create new calls in which the activity is deliberate and known. It can be very difficult to reverse-engineer the reports to make assumptions on what the call scenario might have been, so as a learning exercise it would be prudent to work forwards instead of backwards.

The example scenario in this article will be a Lync video call placed between two Lync 2013 clients running on different Windows 8 workstations of roughly the same capabilities. To keep this simple each system is capable of encoding and sending up to 720p HD video when using SVC and each display is set to a resolution of at least 1920 pixels wide.

Test Video Calls

To provide a few different call logs to compare, make a pair of video calls between two endpoints in the following scenarios. Leave each call up for at least 5 minutes to ensure that call quality details are recorded and that average bit rate numbers will show some measurable difference from peak numbers. Also make sure not to manipulate the window size outside of the instructions on either client, as this may pollute the data and the results will not match what is explained in this article.

Throughout these test calls it will be important never to resize the video window prior to ending the call, as the video resolution recorded in the reports would then not reflect the intended result. This concept will be explained later, so for now simply end the calls unobtrusively.

Default Video

Start a video call between the two workstations and leave the video window at the default, minimum size on both clients.

The image on the left is from a Lenovo T410 connected to a 24” monitor set to 1920×1200 (16:10), while the image on the right is a screenshot of a Surface Pro set at a resolution of 1920×1080 (16:9). Notice that the Lync video window, at the smallest allowed size, is slightly larger on the Surface display due to the tablet interface in Windows. This is important to understand as the size of the window which displays the video will directly impact the resolution that the client will request the other party to send.

To measure the actual pixel resolution of these two video windows each full screenshot was cropped down to just the video and the image dimensions were captured as follows:

Workstation    Desktop Display Dimensions    Video Window Dimensions
T410           1920 x 1200                   408 x 230
Surface Pro    1920 x 1080                   620 x 350
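The relationship between window size and requested resolution can be sketched in a few lines of Python. This is an illustrative approximation, not the documented client algorithm: the ladder values are the commonly seen H.264 SVC resolutions discussed later in this article, and 424×240 is used as the floor since that is what both default-size windows received in the first test call.

```python
# Sketch (an assumption, not the documented Lync 2013 algorithm) of how a
# receiving client might pick the resolution to request: the largest common
# H.264 SVC ladder entry that fits the video window, capped by what the
# sending system is able to encode (720p in this article's test setup).

LADDER = [(424, 240), (640, 360), (960, 540), (1280, 720), (1920, 1080)]

def requested_resolution(window_w, window_h, sender_max=(1280, 720)):
    """Return the largest ladder entry fitting the window and sender cap."""
    best = LADDER[0]  # assume 424x240 as the floor for a default window
    for w, h in LADDER:
        fits_window = w <= window_w and h <= window_h
        fits_sender = w <= sender_max[0] and h <= sender_max[1]
        if fits_window and fits_sender:
            best = (w, h)
    return best

# The measured windows from the table above:
print(requested_resolution(408, 230))    # T410 default window   -> (424, 240)
print(requested_resolution(620, 350))    # Surface default       -> (424, 240)
print(requested_resolution(1920, 1080))  # full screen, 720p cap -> (1280, 720)
```

Note that the Surface window (620×350) is still too short for the 640×360 step, which matches the first test call's recorded 424×240 in both directions.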

After 5 minutes or more simply end the call by hitting the hang-up button (or Ctrl+Enter) on either client.

Full Screen Video

Start a second video call between the same two workstations and immediately change the video window to Full Screen View on both clients.

Make sure to use the Full Screen View button on the window, as opposed to the Maximize button, which would leave a window border that prevents the video from using 100% of the screen. Depending on the resolution of the monitor, this small difference could be enough to prevent the inbound video stream from stepping up to the maximum possible resolution supported by the system.

After 5 minutes or more simply end the call by hitting the hang-up button (or Ctrl+Enter) on either client without minimizing either video window. The call must end in full screen mode for the maximum resolution to be recorded.

Media Quality Reports

The call details will typically be available on the monitoring server within a minute or so after completion, so the first call should be ready to view at this point.

Review First Test Call

Using a browser access the Lync Monitoring Server Reports Home Page and select User Activity Report under the Call Diagnostic Reports (per-user) section.

http://<SQLServerFQDN>/ReportServer_<SQLInstanceName>
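For convenience, the address above can be assembled programmatically. A minimal Python sketch follows; the server FQDN and instance name passed in are hypothetical placeholders, not servers from this article.

```python
# Build the Lync Monitoring Server Reports home page URL from the SQL
# Server FQDN and Reporting Services instance name, following the template
# shown above. The example values are hypothetical placeholders.

def reports_home_url(sql_server_fqdn, sql_instance_name):
    return f"http://{sql_server_fqdn}/ReportServer_{sql_instance_name}"

print(reports_home_url("sql01.contoso.com", "MONITORING"))
# http://sql01.contoso.com/ReportServer_MONITORING
```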

Click the View Report button in the upper right-hand corner to search for database entries in the default time frame, which covers all activities and modalities in the past 24 hours.

Look for the first test call listed under the Peer-to-Peer Sessions section by identifying the users, modality (e.g. Video) and time and click the Detail button.

If the first test call is not yet shown then refresh the page (e.g. F5) to reset the results and update the To: time, and then re-run the search.

Once the Peer-to-Peer Session Detail Report is loaded the page will be broken up into several main sections, including:

Modalities – Will list which modalities were involved in this session (e.g. Audio, Video, Instant messaging).

Media Quality Report – A detailed report of all media stream information which is the focus of this article.

Diagnostic Reports – A list of diagnostic headers for specific SIP messages in the session.

By default the Media Quality Report section will be minimized and includes three major sections itself:

Call Information – Some basic information about the caller and callee workstations and clients.

Media Line (Main Audio) – Detailed statistics and information about the audio portion of the session.

Media Line (Main Video) – Detailed statistics and information about the video portion of the session.

As explained earlier only the video portion of these records will be discussed, but before jumping into the video statistics it is important to review the call information first to understand the media direction and encoder or decoder capabilities.

Expand the Media Quality Report section and review the Call Information section. The relevant items are listed below for the previous test call.

When looking at the video stream details make sure to note which endpoint was defined as the Caller versus the Callee, as this is the only way to know which video stream is which. In this test call the user on the Lenovo laptop (jim@mslync.net) placed the video call to the client on the Surface (jeff@mslync.net).

Scroll down to the Media Line (Main Video) section and take note of the highlighted items.

The level of information in this section is sufficient to make a few assumptions about what the media path might have been. Firstly, the Caller Connectivity is reported as Direct for both endpoints, meaning that the Edge Server was not required to relay media for this call. As the subnets are the same it would be safe to assume that both endpoints were on the same network, which they were. Additionally, the Caller inside value is True for one workstation and False for the other, most likely indicating that the caller was connected to some type of third-party VPN which provided direct access to the Lync Front End pool.

Also worth noting is the Caller connection type, which recorded that the caller workstation was on a wired network connection (as indicated by the Ethernet string), while the other workstation was connected over Wi-Fi. This may be important when looking at bandwidth and packet loss, as it is possible that the callee workstation did not have a stable connection at the time of the call.

The Transport setting is a good way to tell whether the ideal option of using UDP for media was available. Oftentimes firewalls are misconfigured and, when media is established between Lync endpoints, especially when going through an Edge Server, UDP will not be available and the transport protocol will fall back to TCP, which is not preferred for real-time communications.

Scroll further down to the Video Stream sections near the bottom of the page to find details on both of the video streams involved in the call.

Video Stream                   Caller > Callee    Callee > Caller
Codec                          H264               H264
Resolution                     424×240            424×240
Inbound Frame Rate             14.9416            14.9451
Outbound Frame Rate            14.9780            14.9448
Frame Rate Loss                0.01%              0.00%
Average Allocated Bandwidth    350 Kbps           350 Kbps
Average Bit Rate               166 Kbps           159 Kbps
Maximum Bit Rate               517 Kbps           470 Kbps
CIF Quality Ratio              100.00%            100.00%
VGA Quality Ratio              0.00%              0.00%
HD Quality Ratio               0.00%              0.00%

The details above show that both video streams utilized H.264 for video (as would be expected between Lync 2013 clients) and sent the same 424×240 resolution at roughly 15 frames per second. The average and maximum bit rates were basically the same in each direction as well. Even though the video window on the Surface was measured at 620×350, that was not large enough to trigger a step-up to the next resolution available in the H.264 SVC codec in Lync 2013.

Review Second Test Call

Return to the User Activity Report then locate and open the record for the second test call. Expand the Media Quality Report section and then scroll down to the Video Stream sections.

Video Stream                   Caller > Callee    Callee > Caller
Codec                          H264               H264
Resolution                     1280×720           1280×720
Inbound Frame Rate             28.2137            19.4241
Outbound Frame Rate            28.3465            19.4218
Frame Rate Loss                0.01%              0.00%
Average Allocated Bandwidth    2324 Kbps          2500 Kbps
Average Bit Rate               1370 Kbps          1016 Kbps
Maximum Bit Rate               3221 Kbps          2745 Kbps
CIF Quality Ratio              0.00%              0.00%
VGA Quality Ratio              8.00%              65.00%
HD Quality Ratio               90.00%             34.00%

The details above show increases in resolution, frame rates, and bit rates across the board. What is interesting about this specific call is that although both clients sent video at 720p, the streams were not categorized with the same quality. The caller's outbound stream was rated HD quality for almost the entire call, while the callee's outbound stream was largely classified as VGA quality. This phenomenon is explained in more detail in the next section, but the lower frame rate and lower bit rate on that stream indicate that lower-quality video was transmitted even though the resolution was no different.

Resolution & Quality

As was alluded to earlier, reading these reports is not always as straightforward as it might seem. In the first test call the actual session and the recorded details are quite linear and there do not seem to be any surprises. But there are a couple of parameters which need to be understood when looking at more complicated call scenarios.

Firstly, the resolution field will only ever contain a single entry, so what happens if different resolutions were sent, as can happen when the video window size is changed during the call? The resolution is reported by the client at the termination of the call, or whenever video was last stopped, so only the last resolution used will be recorded. This is important to understand because, when trying to find out what resolution was actually sent during a full-screen video call, the call must be ended while the video is still in full screen. If the window is decreased from full screen view before the call is hung up then a lower resolution may appear in the records.

Secondly, the three quality ratio fields (CIF, VGA, and HD) do not reflect the resolution alone. These fields are a measurement of overall video quality which takes into account resolution, frame rate, and any frame or packet loss. The resulting video quality is categorized and reported as one or more of the three. In the first example both video streams were rated as CIF quality for the entire duration of the call when encoded at 424×240 resolution at 15fps, even with 0% loss. Yet in the second example one 720p stream was labeled as HD quality for 90% of the call while the other 720p stream was primarily rated as VGA quality. The lower frame rate and bit rate on that stream point to the reasons for the decreased quality.

In most cases the following resolutions will be reported in the quality range shown, but there is currently a bug in the reporting which does not categorize 1080p video at all. This is not a comprehensive list of resolutions, just a sampling of the most commonly seen ones.

Resolution    Quality
1920×1080     –
1280×720      HD
960×540       HD
640×360       VGA
424×240       CIF
352×288       CIF
320×180       CIF

In the event of reduced frame rate or during limited bandwidth scenarios the quality can be reported lower than the resolution would typically indicate. For example the following video stream details were captured from a test call in which the callee was on a Wi-Fi network with limited signal and bandwidth.

Notice that although the resolution was recorded as 1280×720, the quality was reported as VGA for 86% of the call and HD for only 13%. Without knowing what the actual call experience was, these results could be read in one of two ways. First it could be assumed that the video window was not in full screen, but was instead only increased to a size large enough to trigger a VGA resolution for the majority of the call (e.g. 640×360), and then was increased to full screen for the remaining 13% of the call duration, based on the last reported resolution of 720p. In most cases that would be a good assumption, but as this test call was run in full screen mode the entire time, that reading would be false.

Instead the explanation is that, with the limited available bandwidth, the client was only able to receive a less-than-HD quality stream even though the resolution was scaled up to 720p. The first clue is that the frame rate is a bit below 30, and the second is that the average bit rate is well below 1Mbps while the maximum barely topped 1Mbps. For a normal 720p video call using H.264 SVC in Lync 2013 the average bit rate should be closer to 1Mbps, with a maximum in the area of 1.5 to 2.5Mbps.
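The pattern described above can be expressed as a simple heuristic. To be clear, this is an assumption-laden sketch and not Microsoft's published rating algorithm: the 30fps and ~1Mbps reference points for 720p, and the one-band downgrade for starved streams, come only from the observations in this article.

```python
# Hedged heuristic (not the actual QoE rating algorithm): a stream's
# recorded quality band drops below what its resolution implies when the
# frame rate or bit rate is starved. The nominal 30 fps / 1000 Kbps
# defaults reflect this article's 720p observations and would need to be
# adjusted for other resolution classes.

NOMINAL_QUALITY = {
    (1280, 720): "HD",
    (960, 540): "HD",
    (640, 360): "VGA",
    (424, 240): "CIF",
    (352, 288): "CIF",
    (320, 180): "CIF",
}

DOWNGRADE = {"HD": "VGA", "VGA": "CIF", "CIF": "CIF"}

def likely_quality(resolution, avg_fps, avg_kbps,
                   nominal_fps=30.0, nominal_kbps=1000.0):
    """Guess the dominant quality rating for a stream at this resolution."""
    band = NOMINAL_QUALITY.get(resolution)
    if band is None:
        return None  # e.g. 1080p, which the reports do not categorize
    # Starved streams get rated one band below their nominal rating.
    if avg_fps < 0.8 * nominal_fps or avg_kbps < 0.8 * nominal_kbps:
        band = DOWNGRADE[band]
    return band

print(likely_quality((1280, 720), 28.2, 1370))  # healthy 720p stream -> HD
print(likely_quality((1280, 720), 19.4, 1016))  # reduced frame rate  -> VGA
```

Run against the second test call's table, this reproduces the split seen there: the caller's 28fps/1370Kbps stream rates HD while the callee's 19fps/1016Kbps stream falls to VGA.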

To further illustrate this point, look what happens when even less bandwidth is available, coupled with receiving video from a workstation encoding video at only 15fps.

The frame rate has dropped even further, from 25fps to 15fps, and the Average Allocated Bandwidth is reported at less than 300Kbps. Again, a resolution of 1280×720 was encoded, but the quality of the received video was noticeably poor compared to a normal 720p session. This is reflected by the low bit rates and the CIF classification (82%) of quality for the majority of the call.

Multiparty Video Calls

The examples above capture basic peer-to-peer video calls, but what happens during Lync conference calls with many clients connected and multiple concurrent video streams being transmitted? A Conference Detail Report will be formatted roughly the same, except that multiple participants will be listed. Every participant that was in the meeting at any point in time will be listed with their own unique report, along with the modalities that client participated in during the call.

The Conference Modalities section will list a separate Media Quality report for each participant which captures the video history for each and every active stream in the conference. So the beginning of the report will look nearly identical except that the other endpoint is now a Conference URI, as all video streams are negotiated between the client and the Lync AVMCU. A single audio session is recorded as the Main Audio as the AVMCU still continues to mix all outbound participant audio streams into a single inbound stream for each client, just as in previous versions of OCS or Lync. But each Lync 2013 client is capable of receiving up to a maximum of 6 different concurrent video streams, which will be reported in individual video sections.

Listed below are the sections of a report from an example conference in which every possible capability was involved, including more than 6 active video participants and at least two Roundtable video devices.

Header                          Section                            Description
Call Information                                                   Caller and Callee identification and endpoint specifications
Media Line (Main Audio)         Audio Stream (Caller -> Callee)    Outbound audio stream
                                Audio Stream (Callee -> Caller)    Inbound audio stream
Media Line (Main Video)         Video Stream (Caller -> Callee)    Outbound video stream
                                Video Stream (Callee -> Caller)    First inbound video stream
Media Line (Panoramic Video)    Video Stream (Caller -> Callee)    Outbound panoramic stream
                                Video Stream (Callee -> Caller)    Inbound panoramic stream
Media Line (Main Video 2)       Video Stream (Callee -> Caller)    Second inbound video gallery stream
Media Line (Main Video 3)       Video Stream (Callee -> Caller)    Third inbound video gallery stream
Media Line (Main Video 4)       Video Stream (Callee -> Caller)    Fourth inbound video gallery stream
Media Line (Main Video 5)       Video Stream (Callee -> Caller)    Fifth inbound video gallery stream
Media Line (Main Video 6)       Video Stream (Callee -> Caller)    Sixth inbound video gallery stream

The main difference between conference and peer reports is quite evident after advancing past the Main Video section, as 5 additional sections may be shown. They may not all include actual data, depending on how many participants were connected to the meeting and how many had video enabled. Also, the Panoramic Video section will only be included if at least one RoundTable/CX5000 device is present in the conference call, and this item can appear in either conference or peer call reports.

The Main Video 6 section is a unique case which is different from the other 5 standard video streams. As mentioned earlier the maximum number of concurrent inbound video streams supported by the Lync 2013 client is 6. Five for a fully populated gallery view plus one additional stream for the panoramic video, if applicable. So why is there a section for a seventh inbound video stream then? Apparently this stream is an ‘extra’ stream used to negotiate video for a new active speaker in the event that all the other streams are already in use, and when that new speaker’s video replaces a past speaker’s video tile in the gallery then the older stream (e.g. Main Video 2) is stopped. When yet another new speaker needs to appear on the gallery view an unused stream is still available, and so on. This approach allows a new stream to be quickly negotiated before breaking down a previous stream.

What this all means, though, is that the data from a given stream is just a total of all bandwidth used by that 'slot' for the duration of the call, so it is impossible to know which participant was shown in which slot, and for how long. And as different participants may be sending different resolutions, frame rates, and qualities, even over different codecs (H.264 SVC or RTV), it can be very complicated to use this data either to calculate total bandwidth used or to plan for how much is needed. The realistic approach is to average out how much bandwidth might be used per video tile during 2, 3, 4, 5 and 6 person video calls across different screen sizes. Either way, as more participants are added to the gallery each video tile will be reduced in size to make room for the new tile, which lowers the resolution of each individual stream in step. This behavior helps to keep bandwidth utilization under control. These concepts will be covered in more detail in a future article, along with example bandwidth numbers for various video conferencing scenarios.
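As a rough sketch of the per-tile averaging approach described above (every number in the example is a hypothetical input, not a measured value):

```python
# Sketch of the planning approach described above: the report only records
# per-slot totals, so a client's inbound conference video bandwidth has to
# be estimated from an averaged per-tile bitrate. All figures passed in
# are hypothetical placeholders, not measurements.

def inbound_conference_kbps(per_tile_avg_kbps, tiles, panoramic_kbps=0):
    """Estimate average inbound video bandwidth for one conference client.

    A Lync 2013 client shows at most 5 remote gallery tiles; with the
    optional panoramic stream that makes up to 6 concurrent inbound
    video streams.
    """
    if tiles > 5:
        raise ValueError("gallery view shows at most 5 remote video tiles")
    return per_tile_avg_kbps * tiles + panoramic_kbps

# e.g. five tiles at a hypothetical 250 Kbps average each, plus panoramic:
print(inbound_conference_kbps(250, 5, panoramic_kbps=350))  # -> 1600
```

In practice the per-tile average itself shrinks as tiles are added (each tile gets smaller, so each stream steps down in resolution), which is why measured numbers per gallery size are needed rather than a single constant.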

Comments

Very good article! Understanding the relationships between the resolution, codec, bit rate and frame rate against the available bandwidth is a challenge, as well as the workstation. Are there other products or scripts that could be used to gain greater detail into the call? I'm interested in what happens during the session, how often the quality ratio is changing.

There is no single 'typical' video call in Lync, but in a future article I will cover various call scenarios and the actual bandwidth measured. As a ball-park average a single H.264/SVC peer-to-peer video call between two Lync 2013 clients can range from an average bit rate of ~300Kbps for 240p resolution up to ~3000Kbps for 1080p resolution. I've measured calls as low as ~110Kbps average when using the smallest video window possible in lower frame rate scenarios (240p@15fps).

Solutions with the Lync client embedded like the CX7000 and CX8000 will report media quality data no differently than standard Lync desktop clients. Natively registered standards-based endpoints like the HDX and Group will populate CDR information like client version strings but the media quality data (QoE) will only be reported by the Lync client on the other end of the call.

I'm trying to understand the report: why is the average allocated bandwidth 297 Kbps while at the same time the report shows a maximum bit rate of 926 Kbps? It seems to me the CAC policy is not applying? Like in your report, I see in my report that the maximum bit rate exceeds the applied bandwidth limit; how is that possible? How can I ensure that if the site video session limit is set to 512 Kbps (and 64 for audio) the Lync 2013 client uses only what is specified in the CAC policy?

I have read several sources which state that for optimal audio quality an average round trip of less than 200ms is required but I can’t find any document which states an optimal round trip for video.

When I look at the media quality reports of some of our video calls the average video round trip can be as high as 1500ms and yet the report doesn’t highlight that value with yellow or red. For the same calls the audio round trip is around 25ms.

From the screenshot of the "Media Line" report I see that the Caller AV Edge Server field has been filled in with an IP address, and I can find the report under the "Edge server" filter even though the call is a Direct type and both endpoints are on the internal network. May I know why, for a Direct call type, the report contains a Caller AV Edge Server IP and can be found under the Edge server filter?