Firefox 23 Test Plan

Summary

The following is the test plan for Firefox 23 from Nightly through to Release. Use this document as a reference for what is being tested to validate the quality of Firefox 23. After the release, this document will serve as an archive of what was done to validate this release.

If you have some free time, please pick a task below and get in touch with one of the leads.

Manual Testing

WebRTC

Using the MozTrap test as a guideline, verify that Firefox 23.0.1 is no worse than Firefox 22.0 when making AppRTC calls of >5 minutes in length. When testing, follow these guidelines:

Have only one browser open on each machine at any given time

Have only one call running at any given time on a particular machine

Make sure the caller and callee are always on *different* machines.

Please don't test any 3+-way calls for this sanity check; we just want to see the results for 1:1 (basic) calling (a sketch of the call path being exercised appears after these guidelines)

If you find regressions, report a bug and CC Randell and Maire; they can help determine whether the regressions are real

Tip: be sure to provide detailed steps to reproduce, the observed results, and detailed information about your test environment; more information is better than not enough.
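
For reference, the 1:1 call path being exercised reduces to two RTCPeerConnections exchanging an offer/answer and trickling ICE candidates to each other. Below is a minimal loopback sketch in TypeScript, written with today's standard API names (Firefox in this era shipped prefixed equivalents such as navigator.mozGetUserMedia and mozRTCPeerConnection); it wires both peers together in one page instead of going through a signaling server, so it is illustrative only and is not AppRTC's actual implementation.

  // Minimal 1:1 WebRTC loopback: two peers in one page, wired directly
  // together instead of through a signaling server (illustrative sketch).
  async function loopbackCall(remoteVideo: HTMLVideoElement): Promise<void> {
    const caller = new RTCPeerConnection();
    const callee = new RTCPeerConnection();

    // Trickle ICE candidates directly; AppRTC relays these via its server.
    caller.onicecandidate = (e) => { if (e.candidate) callee.addIceCandidate(e.candidate); };
    callee.onicecandidate = (e) => { if (e.candidate) caller.addIceCandidate(e.candidate); };

    // Render whatever media the callee receives.
    callee.ontrack = (e) => { remoteVideo.srcObject = e.streams[0]; };

    // The caller sends its camera and microphone.
    const stream = await navigator.mediaDevices.getUserMedia({ audio: true, video: true });
    stream.getTracks().forEach((t) => caller.addTrack(t, stream));

    // Offer/answer exchange, normally relayed by the signaling server.
    const offer = await caller.createOffer();
    await caller.setLocalDescription(offer);
    await callee.setRemoteDescription(offer);
    const answer = await callee.createAnswer();
    await callee.setLocalDescription(answer);
    await caller.setRemoteDescription(answer);
  }

A >5 minute 1:1 AppRTC call exercises this same path plus AppRTC's signaling, so a regression anywhere in capture, encoding, transport, or rendering should surface during the sanity check.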

High-risk areas need coverage on a broader range of platforms/hardware

What happened here reveals a quite serious communication problem between QA and development. The QA person working on the H264... support feature was told it was only for Windows 7 and 8 and never got further updates. Bug 847267 (which caused bug 901944) wasn't added as a dependency of the tracking bug (bug 799318) either. Feature information tracking should be improved; otherwise we might be left with the sole option of spamming developers with weekly emails requesting updates (a situation unpleasant for everyone).

I think we might be getting to a point where visibility of engineering work to QA (or lack thereof) is negatively impacting the quality of the product

status emails are sent prior to uplift but might miss certain uplifts -- we should increase the frequency of these emails

use need-info request on tracking bugs to get the information we need, and block on it if necessary

We were not able to reproduce this issue despite having hardware previously known to reproduce these types of issues

We are doomed unless we can get engineers engaged in fixing this

QA is doing everything we can, Marc to talk to Bob about dev involvement

Tracy: Is it the same machine in the RelEng farm building the "broken" builds?

No, it's an issue with how optimization works at build time.

Question whether doing twin RC builds is worth the QA/RelEng resources as compared to just respinning on occasion

QA needs to run full automation on both builds, even if we don't ship build 2

RelEng needs to generate both builds, and clean up build 2 if we decide to ship build 1

Stability data is rarely, if ever, sufficient to say conclusively whether a build is good or not

Having multiple builds has not prevented us from having to ship a follow-up build in this instance

After 4 releases (24 weeks) dealing with this, can we do some statistical, risk, and cost-benefit analysis to justify doing this further?

For further consideration, what's our strategy if we have no more leads and have to just live with this issue?

Feature lists for this release were not centralized anywhere, so at least one feature did not get to QA: Network Monitor.

The beta scope was not adapted to the new two-betas-per-week process, which created QA overhead.

Invalid/unclear tests in MozTrap.

Unclear tests should be updated. If their owners can't do that, the QA team is willing to take it on. Some of these tests can lead to invalid bugs (e.g. bug 900457).

Some of these tests are present in the ESR smoke tests too (e.g. the save-and-open test, the play-videos test, etc.); we took the liberty of updating those where it was only necessary to change the links in order to be able to run the tests.

Otilia to get SV to roll up unclear/failed tests into the Release/Beta testing emails

Mitigating Strategy

The following is a list of actions we can/should take to build on our successes and prevent similar failures.

[ashughes] come up with a better way to track high priority areas that aren't necessarily in the scope of feature testing and the existing Beta checklist

[mschifer] work more closely with Asa and developers to get greater visibility, not just into features but also into high-risk landings

[ashughes] add features from current release to regression suite for next release or two

All-hands Post-mortem

Discussion

should we make Beta 10 the RC1 and offer it to Beta users, to get more stability soak time?

relationship building between QA, Dev, and other teams

Actions

joduinn to investigate desktop repack - channel changing

bsmedberg to run his script on build#1 crash data

if we can find the crash in build #1 data, then we'll continue making build #2 automatically

if we can't, then we'll stop making build #2 and make more dot releases :(

result: found no crashes with build 1, so we need to discuss dropping build 2 from our standard process; this will mean potentially doing more .1 releases and some weekend work, a risk and trade-off I think we are willing to make based on prevalence
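
For illustration, a build 1 vs. build 2 comparison of this kind could look like the sketch below. This is a hypothetical sketch, not the actual script; it assumes the public crash-stats SuperSearch API and its signature facet, and the build IDs in the last line are placeholders.

  // Compare crash signatures between two candidate builds via the public
  // crash-stats SuperSearch API (hypothetical sketch, not the real script).
  interface SignatureFacet { term: string; count: number; }
  interface SuperSearchResult { total: number; facets: { signature: SignatureFacet[] }; }

  async function signatures(buildId: string): Promise<SignatureFacet[]> {
    const url = "https://crash-stats.mozilla.org/api/SuperSearch/" +
      `?product=Firefox&build_id=${buildId}&_facets=signature&_results_number=0`;
    const resp = await fetch(url);
    return ((await resp.json()) as SuperSearchResult).facets.signature;
  }

  async function compareBuilds(build1Id: string, build2Id: string): Promise<void> {
    const [sigs1, sigs2] = await Promise.all([signatures(build1Id), signatures(build2Id)]);
    const inBuild2 = new Set(sigs2.map((s) => s.term));
    // Signatures seen in build 1 but absent from build 2 are the candidates
    // for a build-specific (e.g. optimization-related) crash.
    for (const s of sigs1) {
      if (!inBuild2.has(s.term)) {
        console.log(`build1-only signature: ${s.term} (${s.count} crashes)`);
      }
    }
  }

  // Placeholder build IDs in the YYYYMMDDHHMMSS format Firefox builds use.
  compareBuilds("20130801000000", "20130802000000");

Signatures unique to one build are only a first pass; as noted above, stability data is rarely conclusive on its own.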

lsblakk to set up a meeting about doing an RC instead of Beta 10

QA supports this: stop doing Beta 10 and offer Beta/RC users the same RC1 build; this grants us a couple more days of stability soak time

lsblakk to speak to Laura about putting up a tab for users about a specific change - timing, finding out sooner, and guidelines for what causes us to do this and when

lsblakk - check with Axel whether having a hook on l10n repos for key/visible strings would help protect localization teams from errors like [fr] tools in the future, especially for string changes (a sketch of what such a hook could check follows)
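
The idea would be to flag any incoming change that touches a configured list of key/visible string IDs. The TypeScript sketch below is hypothetical (the string IDs are placeholders, and a real Mercurial hook would need to be wired up to feed it the incoming diff); it reads a unified diff on stdin and exits nonzero if a key string is touched.

  // Hypothetical l10n hook helper: scan a unified diff (stdin) for changes
  // to key/visible strings; the IDs below are illustrative placeholders.
  import * as readline from "readline";

  const KEY_STRING_IDS = ["toolsMenu.label", "appMenuButton.tooltip"]; // placeholders

  const rl = readline.createInterface({ input: process.stdin });
  let flagged = 0;
  rl.on("line", (line: string) => {
    // Only inspect added/removed diff lines, skipping the +++/--- headers.
    if (!/^[+-]/.test(line) || /^[+-]{3}/.test(line)) return;
    for (const id of KEY_STRING_IDS) {
      if (line.includes(id)) {
        console.warn(`key string touched: ${id}: ${line}`);
        flagged += 1;
      }
    }
  });
  // A nonzero exit from a pretxnchangegroup-style hook rejects the push.
  rl.on("close", () => process.exit(flagged > 0 ? 1 : 0));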

ashughes/tracy to start a discussion in a channel meeting about getting more value out of Aurora; its current value is low because it has our smallest audience. Tracy suggests rebranding might help; Alex suggested scrapping Aurora; Jonath advocates for three branches (1 dev, 2 testing); Brendan sees neither a need nor urgency, due to B2G