Create Firebase Remote Config Experiments with A/B Testing

When you use Firebase Remote Config to push an update to an application
with an active user base, you want to make sure you get it right. You might
be uncertain about the following:

The best way to implement a feature to optimize the user experience. Too
often, app developers
don't learn that their users dislike a new feature or an updated user
experience until their app's rating in the app store declines. A/B testing
can help measure whether your
users like new variants of features, or whether they
prefer the app as it currently exists. Plus, keeping most of your users in a
control group ensures that most of your user base can continue to use your app
without experiencing any changes to its behavior or appearance until the
experiment has concluded.

The best way to optimize the user experience for a business goal. Sometimes
you’re implementing product changes to maximize a metric like revenue or
retention. With A/B testing, you set your business objective, and Firebase does
the statistical analysis to determine if a variant is outperforming the
control group for your selected objective.

To A/B test feature variants with a control group, do the following:

Create your experiment.

Validate your experiment on a test device.

Manage your experiment.

Important: Make sure you have the minimum SDK version for A/B Testing:
Google Play Services 11.4.2 for Android and Firebase SDK for iOS 4.5.0.

Create your experiment

On the A/B Testing page of the Firebase console, click Create experiment,
and then select Remote Config when prompted for the service you want to
experiment with.

Enter a Name and optional Description for your experiment, and
click Next.

Fill out the Targeting fields, first choosing the app to use in your experiment. You can
also target a subset of your users to participate in your experiment
by choosing one or more of the following options:

Version: One or more versions of your app

User audience: Analytics audiences used to target users
who might be included in the experiment

User property: One or more Analytics user properties for
selecting users who might be included in the experiment

Prediction: Groups of users predicted by machine learning to
engage in a particular behavior

Country/Region: One or more countries or regions for selecting
users who might be included in the experiment

Device language: One or more languages and locales used to select
users who might be included in the experiment

Set the Percentage of target users: Enter the percentage of your app's
user base matching the criteria set under Target users that you want
to evenly divide between the control group and one or more variants in
your experiment. This can be any percentage between 0.01% and 100%.
Users are randomly assigned to each experiment, including duplicated
experiments.
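To sketch how this split behaves, the following illustrative Python snippet (not the Firebase implementation, which runs server-side) randomly exposes a target percentage of users and divides them evenly among the control group and variants:

```python
import random

def assign_users(user_ids, exposure_pct, num_groups, seed=0):
    """Randomly expose `exposure_pct` percent of users and divide them
    evenly among `num_groups` groups (control + variants).
    Illustrative only; Firebase performs this assignment for you."""
    rng = random.Random(seed)
    groups = {g: [] for g in range(num_groups)}
    for uid in user_ids:
        # Each user independently has an exposure_pct% chance of inclusion...
        if rng.random() * 100 < exposure_pct:
            # ...and included users land in a uniformly random group.
            groups[rng.randrange(num_groups)].append(uid)
    return groups

groups = assign_users(range(10_000), exposure_pct=10, num_groups=2)
exposed = sum(len(v) for v in groups.values())
# Roughly 10% of the 10,000 users are exposed, split between
# the control group and one variant.
```

Because assignment is per-user random, the realized exposure hovers around the requested percentage rather than matching it exactly.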

Optionally, set an activation event to ensure that only users who have
first triggered some Analytics event are counted in your experiment,
then click Next.

For the experiment's Goals, select the primary metric to track, and
add any desired additional metrics from the dropdown list. These include
built-in objectives (engagement, purchases, revenue, retention, etc.),
Analytics conversion events, and other Analytics events. When
finished, click Next.

In the Variants section you'll choose a control group and at least one
variant for the experiment. Use the Choose or create new list
to add one or more parameters to
experiment with. You can create a parameter that has not
previously been used in the Firebase console, but it must exist in your
app for it to have any
effect. You can repeat this step to add multiple parameters to your
experiment.

For each variant, you have the option of adding
variant-level targeting
in which you select from the same list of targeting options available at
the experiment level—Version, User audience, User property,
Prediction, Country/Region, and Device language.

Optionally, to add more than one variant to your experiment, click Add
another variant.

Change one or more parameters for specific variants. Any parameters you
leave unchanged serve the same values to experiment participants as they do
to users not included in the experiment.

Click Review to save your experiment.

Validate your experiment on a test device

Each Firebase app installation has an instance ID token (or registration token)
associated with it. You can use this token to test specific experiment variants
on a test device with your app installed. To validate your experiment on a
test device, register the device's token for a specific variant in the Firebase
console before you start the experiment.

Manage your experiment

Whether you create an experiment with Remote Config or the Notifications composer,
you can then validate and start your experiment, monitor your experiment while
it is running, and increase the number of users included in your running
experiment.

When your experiment is done, you can take note of the settings used by the
winning variant, and then roll out those settings to all users. Or, you can
run another experiment.

Start an experiment

To validate that your app has users who would be included in your
experiment, check for a number greater than 0% in the
Targeting and distribution section (for example, 1% of users matching
the criteria).

To change your experiment, click Edit.

To start your experiment, click Start Experiment. You can run up to 24
experiments per project at a time.

Monitor an experiment

Once an experiment has been running for a while, you can check in on its
progress and see what your results look like for the users who have participated
in your experiment so far.

Click Running, and then click the title of your experiment. On this
page, you can view various statistics about your running experiment,
including your goal metric and other metrics. For each metric, the following
information is available:

Improvement: A measure of the improvement of a metric for a given
variant as compared to the baseline (or control group). Calculated by
comparing the value range for the variant to the value range for the
baseline.

Probability to beat baseline: The estimated probability that a given
variant beats the baseline for the selected metric.

Probability to be the best variant: The estimated probability that a
given variant beats other variants for the selected metric.

Value per user: Based on experiment results, this is the predicted range
that the metric value will fall into over time.

Total value: The observed cumulative value for the control group or
variant. The value
is used to measure how well each experiment variant performs, and is used
to calculate Improvement, Value range, Probability to beat
baseline, and Probability to be the best variant. Depending on the
metric being measured, this column may be labeled "Duration per user,"
"Retention rate," or "Conversion rate."
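As a rough illustration of the Improvement figure (A/B Testing compares value ranges with a statistical model, not the simple point comparison shown here), the snippet below computes a variant's relative change over the baseline:

```python
def relative_improvement(baseline_value, variant_value):
    """Percent change of a variant's metric relative to the baseline.
    A simplified sketch; the console's Improvement figure is derived
    from value ranges, not single point estimates."""
    if baseline_value == 0:
        raise ValueError("baseline value must be nonzero")
    return (variant_value - baseline_value) / baseline_value * 100

# e.g. a baseline retention rate of 20% rising to 23% in the
# variant is a +15% relative improvement
improvement = relative_improvement(20.0, 23.0)
```

A positive result means the variant outperformed the baseline on that metric; a negative result means it underperformed.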

To increase the number of users included in your experiment, click
Increase Distribution, and then select an increased percentage to add
more eligible users to your experiment.

After your experiment has run for a while (at least 24 hours), data on this
page indicates which variant, if any, is the "leader." Some measurements
are accompanied by a bar chart that presents the data
in a visual format.

Roll out an experiment to all users

After an experiment has run long enough that you have a "leader," or winning
variant, for your goal metric, you can roll out the experiment to 100% of users.
This allows you to select a value to publish in Remote Config for all users
moving forward. Even if your experiment has not created a clear winner, you can
still choose to roll out a variant to all of your users.

The console displays a dialog with an option to increase the percentage of users who are in the currently running experiment. Input a number greater than the current percentage and click Send. The experiment will be pushed out to the percentage of users you have specified.

Duplicate or stop an experiment

Click Completed or Running, hover over your experiment, click the
context menu
(more_vert), and then click Duplicate or
Stop.

User targeting

You can target the users to include in your
experiment using the following user-targeting criteria.

Each criterion below lists the operators it supports and the values it
accepts, along with notes where relevant.

Version

Operators: contains, does not contain, matches exactly, contains regex

Enter a value for one or more app versions that you want to include in the
experiment.

When using any of the contains, does not contain, or
matches exactly operators, you can provide a comma-separated list of
values.

When using the contains regex operator, you can create regular
expressions in RE2
format. Your regular expression can match all or part of the target version
string. You can also use the ^ and $ anchors to match the
beginning, end, or entirety of a target string.
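To sketch how anchored version patterns behave, the example below uses Python's re module (whose syntax overlaps with RE2 for these constructs, though RE2 is a distinct engine); the version strings are made up for illustration:

```python
import re

versions = ["1.2.0", "1.2.1", "2.0.0", "1.20.3"]

# Unanchored pattern: matches any version string containing "1.2",
# including "1.20.3", which is probably not intended.
partial = [v for v in versions if re.search(r"1\.2", v)]

# Anchored pattern with ^ and $: matches the 1.2.x series only.
exact_series = [v for v in versions if re.search(r"^1\.2\.\d+$", v)]

print(partial)       # ['1.2.0', '1.2.1', '1.20.3']
print(exact_series)  # ['1.2.0', '1.2.1']
```

Anchoring with ^ and $ is the usual way to avoid accidentally targeting versions that merely contain the pattern as a substring.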

User audience(s)

Operators: includes all of, includes at least one of, does not include all
of, does not include at least one of

Select one or more Analytics audiences to target users who might be
included in your experiment.

User property

Operators for text: contains, does not contain, exactly matches, contains regex

Operators for numbers: <, ≤, =, ≥, >

An Analytics user property is used to select users who might be included
in an experiment, with a range of options for selecting user property
values.

On the client, you can set only string values for user
properties. For conditions that use numeric operators,
the Remote Config service converts the value of the corresponding
user property into an integer/float.
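A minimal sketch of that coercion, assuming a plain numeric parse (the service's exact conversion rules aren't spelled out here, and the helper name is hypothetical):

```python
def matches_numeric_condition(user_property_value: str, op: str, target: float) -> bool:
    """Coerce a string user property to a number and apply a numeric
    operator. Illustrative only; the Remote Config service performs
    this conversion server-side."""
    try:
        value = float(user_property_value)
    except ValueError:
        # A value that can't be parsed as a number can't satisfy
        # a numeric condition.
        return False
    comparisons = {
        "<": value < target,
        "<=": value <= target,
        "=": value == target,
        ">=": value >= target,
        ">": value > target,
    }
    return comparisons[op]

print(matches_numeric_condition("42", ">=", 40))  # True
print(matches_numeric_condition("abc", ">", 0))   # False
```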

When using the contains regex operator, you can create regular
expressions in RE2
format. Your regular expression can match all or part of the user
property's value. You can also use the ^ and $ anchors to match the
beginning, end, or entirety of a target string.

Prediction

Operators: N/A

Target groups of users defined by Firebase Predictions—for example,
those who are likely to stop using your app, or users who are likely to
make an in-app purchase.
Select one of the values defined by the Firebase Predictions tool. If an
option is not available, you may need to opt-in to Firebase Predictions by
visiting the Predictions section of the Firebase console.

Device country

Operators: N/A

One or more countries or regions used to select users who might be included
in the experiment.

Device language

Operators: N/A

One or more languages and locales used to select users who might be included
in the experiment.

This targeting criterion is only available for Remote Config.

Variant-level targeting

A/B Testing with Remote Config provides an additional capability for
advanced use cases: variant-level targeting. With this feature, you can limit
the application of a config to a subset of users within a variant. This makes
it possible to test how different targeting schemes affect key metrics. The
configuration defined for the variant is applied only for users who meet both
the experiment-level targeting conditions and the variant-level targeting
conditions; remaining users in the variant receive the default configuration.
Note that metrics are calculated for all users in the variant, whether or not
they met the variant-level targeting conditions, in order to avoid selection
bias.
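The selection logic described above can be sketched as follows (the helper and parameter names are hypothetical, chosen only to make the two-level condition explicit):

```python
def config_for_user(user, experiment_targeting, variant_targeting,
                    variant_config, default_config):
    """A user receives the variant's config only when they satisfy BOTH
    the experiment-level and the variant-level targeting conditions;
    other users assigned to the variant fall back to the default config
    but still count toward the variant's metrics."""
    if experiment_targeting(user) and variant_targeting(user):
        return variant_config
    return default_config

# Example: the experiment targets app version 2.x, and this variant
# additionally targets users in Germany.
user = {"version": "2.1.0", "country": "FR"}
cfg = config_for_user(
    user,
    experiment_targeting=lambda u: u["version"].startswith("2."),
    variant_targeting=lambda u: u["country"] == "DE",
    variant_config={"button_color": "green"},
    default_config={"button_color": "blue"},
)
print(cfg)  # {'button_color': 'blue'} - in the experiment, but not in DE
```

Note that in this sketch, as in the real feature, the French user above still counts toward the variant's metrics even though they received the default configuration.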

A/B Testing metrics

When you create your experiment, you choose a metric that is used to compare
experiment variants, and you can also choose other metrics to track to help
you to better understand each experiment variant and detect any significant
side-effects (such as app crashes). The following tables provide details on how
goal metrics and other metrics are calculated.

Goal metrics

Metric

Description

Daily user engagement

The number of users who have your app in the foreground each day for long
enough to trigger the user_engagement Analytics event.

Retention (1 day)

The number of users who return to your app on a daily basis.

Retention (2-3 days)

The number of users who return to your app within 2-3 days.

Retention (4-7 days)

The number of users who return to your app within 4-7 days.

Retention (8-14 days)

The number of users who return to your app within 8-14 days.

Retention (15+ days)

The number of users who return to your app 15 or more days after
they last used it.

Notification open

Tracks whether a user opens the notification sent by the Notifications composer.

Purchase revenue

Combined value for all ecommerce_purchase and
in_app_purchase events.

Estimated AdMob revenue

Estimated earnings from AdMob.

Estimated total revenue

Combined value for purchase and estimated AdMob revenues.

first_open

An Analytics event that triggers when a user first opens an app after
installing or reinstalling it. Used as part of a conversion funnel.

notification_open

An Analytics event that triggers when a user opens a notification
sent by the Notifications composer. Used as part of a conversion funnel.

Other metrics

Metric

Description

Crash-free users

The percentage of users who have not encountered errors in your app that
were detected by the Firebase Crash Reporting SDK during the experiment. To
learn more, see Crash-Free User
Metrics.

notification_dismiss

An Analytics event that triggers when a notification sent by
the Notifications composer is dismissed (Android only).

notification_receive

An Analytics event that triggers when a notification sent by
the Notifications composer is received while the app is in the background (Android only).

BigQuery data export

You can access all analytics data related to your A/B tests in
BigQuery. BigQuery
allows you to analyze the data using BigQuery SQL, export it to another cloud
provider, or use the data for your custom ML models.
See Link BigQuery to Firebase
for more information.

To get started, make sure that your Firebase project is linked to BigQuery.
Select Settings > Project Settings from the left
navigation bar, then select Integrations > BigQuery > Link. This page
displays options to export analytics data to BigQuery
for all apps in the project.

To query analytics data in the context of an experiment, you can open the
experiment results page, and select View in BigQuery. This opens the
BigQuery console's query composer with an example query of experiment data
preloaded for your review. Note that, because Firebase data in BigQuery is
updated only once daily, the data available in the experiment
page may be more up to date than the data available in the BigQuery console.
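For orientation, a query along the following lines groups users and purchase events by experiment variant. The project and dataset names are placeholders you would replace with your own, and the console's preloaded query is the authoritative starting point for your actual schema:

```sql
-- Hypothetical project/dataset names. Experiments surface in the
-- Analytics export as user properties named firebase_exp_<experiment number>,
-- whose value identifies the variant.
SELECT
  exp.value.string_value AS variant_id,
  COUNT(DISTINCT user_pseudo_id) AS users,
  COUNTIF(event_name = 'in_app_purchase') AS purchases
FROM
  `my-project.analytics_123456789.events_*`,
  UNNEST(user_properties) AS exp
WHERE
  exp.key = 'firebase_exp_1'
GROUP BY variant_id;
```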

Note: There is no charge for exporting data from A/B Testing, and BigQuery
provides generous free usage limits. See BigQuery
Pricing, or the BigQuery
sandbox for more
information.