Description

What?

The MobileFrontend extension application comprises of a set of core modules and a set of features that rely on them. Some-to-most of these features can be enabled or disabled by setting a configuration variable, a flag, to a truthy value. Moreover, most of these features also have additional configuration variables, which change the behaviour of specific parts of the feature.

The number of features in the application is large and their configuration and location in the codebase sufficiently complex that it's non-trivial to identify and document them after the fact. A feature management system for MobileFrontend, therefore, must not only standardise how a feature is enabled and configured, but also where it's defined and documented as well. Moreover, this information should be immediately available to both MobileFrontend developers and to stakeholders, e.g. the Product Owner of the Reading Web team, and users.

A feature's behaviour can be implemented in PHP and/or JS, and CSS – LESS strictly. Consequently, the feature management system must make information about the feature available in those execution environments. A rough feature flagging system emerged in MobileFrontend's RL modules definitions, wherein modules added as dependencies of skins.minerva.scripts and skins.minerva.beta.scripts were "enabled" features, i.e. that their modules would be loaded by the client. By including the set of RL modules to load in a feature's definition, the feature management system can allow developers to concentrate on features rather than unnecessarily complex RL module hierarchies.

MobileFrontend can operate in either "stable" or "beta" mode. Generally speaking, all features are expected to be enabled in beta mode and will, at some point, be "promoted" from beta to stable mode. Given that the majority of requests to the Wikipedias are serviced by Varnish, promoting a feature requires considering how the distinct parts of the feature's implementation will interact when the feature is enabled, i.e. the response from Varnish (HTML) may or may not be up to date but the JS and CSS will be. The feature management system will handle this by updating the state of the application to reflect that the feature is disabled.

How?

in both PHP and JS, the feature manager will allow a client to query whether a feature is enabled or disabled, e.g.

interface Feature {
function getName();
function getResourceLoaderModules();
// Allow a feature to determine whether or not it //can// be enabled, e.g. the notifications
// feature requires the Echo extension to be installed.
function canBeEnabled( $context );
// Per T141087#2518683: should the feature indicate that the feature is enabled/disabled in CSS,
// i.e. should it add feature-foo or no-feature-foo CSS classes (respectively) to body element.
function shouldEnableInCss();
// Get the feature-specific hooks that should be registered.
function getHooks();
}

Hydrate the state of the JS feature manager from the forwarded state (from mw.config presumably)

Identify 5 features that can be migrated to basic feature management

Migrate the features

Due to caching, it's likely that there'll be discrepancies between the two different representations of the feature manager state, the classes added to the body element and the mw.config, when a feature is enabled or disabled on the backend. The client-side feature manager should detect this discrepancy.

the feature manager will add classes to the head – or body – element of the DOM if a feature is enabled or disabled, e.g. language-switcher and no-language-switcher respectively

these classes will be added in PHP and delivered as part of the initial response if the feature manager detects a discrepency in the JS state and these classes, then it must handle it as discussed above

This is great, it would allow for the feature flagged code to not care about cached/not-cached HTML given the feature-flags machinery would detect and inform the flag query about it.

updating the state of the application to reflect that the feature is disabled, which should probably be the default behaviour

Meaning for example:

That if we disable language-button-bottom

Cached HTML will have in mw.config and the CSS classes of language-button-bottom

In (updated) JS features.status('language-button-bottom') will return true since the cached HTML page says it has the language-button-bottom HTML

At that point it can make the feature work (either always include the module in extension.json, or load it async with mw.loader). I don't think the feature flag management should be concerned about module loading, although maybe it should.

Fresh HTML will come with the updated no-language-button-bottom and mw.config

At that point, the FE module won't initialize the feature or load the module if that was the approach decided.

The thing to consider is if the feature flag manager should care about loading the frontend modules for the feature.

My take on it is no. Keep it simple, just a backend+frontend store that get's hydrated from config.

The frontend code will have to decide if they want to send the FE modules regardless of config, or load them async depending on what the features manager says. Both have tradeoffs in different situations, so I'm of the opinion that it shouldn't be involved with asset loading.

I haven't seen a proposal though, and it may be interesting to couple those two, so looking forward to that.

Regarding interface, they look good. In the frontend I'd add some sugar for the common case, something like:

Features.whenEnabled('language-button-bottom').then(function(){// Feature is enabled and all RL modules are loaded, CSS classes are setM.require('language-button-bottom/init')// or whatever else you need to do.})

Is what I'd mostly want to use as an API.

By returning a promise we can also trigger the .catch if it is disabled, like:

Thanks for starting this epic. I'd like to use the list of features listed at 1 to learn more about how and why the new proposed system is going to work. I've tried dividing my reasoning/questions into the following distinct points:

How are features going to be determined? Are the features we're talking about going to be similarly identified as in 1?

Say, we call Special:Nearby a feature. How will all pages benefit from having an html or body class for this feature? Won't we be shipping unnecessary information for all pages since this feature is a separate page and we won't care about it unless we're on the Special:Nearby page.

Will we consider well established features such as Wikidata descriptions in search results a feature or a non-feature which is already tested and is found to be useful so that we decided to keep it around for the foreseeable future? Will everything be a feature in our codebase?

How long will a feature live behind a toggle? In the long-run the more features we have, the more tech debt we'll have. What will the plan for graduating (either promoting or removing) a feature be?

Is the new system going to change the way we deliver new features to users? Are we going to merge unfinished features to master and not worry about deployment as these features will be turned off by default until we feel they're ready to be used?

How will we write tests going forward? Do we have to write tests for cases with different combination of features? For example, do we need to write a test for Wikidata descriptions as taglines when the lead image feature is enabled and when it's disabled? I see our integration tests getting unnecessarily big because we have to take multiple combinations of features into account when testing.

As I understand it, the new system will help us toggle individual features as opposed to the beta mode where all new features are developed. This, in turn, makes it easy to manage and maintain features for us developers. I, however, think that the new solution won't be any different from the current implementation in beta in that the codebase will be filled with if checks to determine whether a feature is enabled or not. We'd surely be extracting features implementations to individual functions/files for easy reading but that can be achieved by commenting the codebase abundantly. Did I understand the proposed solution incorrectly? If yes, please explain it to me using a sample feature, for example, using the "Read in another language" feature.

Thanks for replies in advance. I'd love to learn more about the proposed solution and ask more questions afterwards.

Is the new system going to change the way we deliver new features to users? Are we going to merge unfinished features to master and not worry about deployment as these features will be turned off by default until we feel they're ready to be used?

I think having new additions to the codebase being disabled by default and, consequently, having to explicitly enable them, is A Good Thing™. AFAICT we've slowly moved to this process anyway, the language switcher page action button being the most recent example that comes to mind. I mean to say that I don't think this proposal is a new idea, I think it's an extraction of our current practices with a little sugar sprinkled on top.

Regardless of where we end up: I don't think that merging unfinished features to master is a good idea and I'll always worry when we decide so to do. Flagging features is about having control – and increased confidence as a result – over when and for whom features are enabled.

How will we write tests going forward? Do we have to write tests for cases with different combination of features? For example, do we need to write a test for Wikidata descriptions as taglines when the lead image feature is enabled and when it's disabled? I see our integration tests getting unnecessarily big because we have to take multiple combinations of features into account when testing.

This is an excellent question with an equally excellent example and I think that it applies regardless of whether or not this proposal is worked on.

As always, it's up to the developer(s), reviewer(s), and, per our norms, the Tech Lead, to use their best judgement. I think it's fair to say that if the two (or more) features interact/conflict with each other, then we should be writing acceptance tests to cover those scenarios – at the very least! If we're not doing that at the moment, then I think that our integration tests are dangerously small.

How are features going to be determined? Are the features we're talking about going to be similarly identified as in 1?

I'd say that's a good starting point as it's that page – and the emails that followed about missing features – that got me thinking about this in the first place.

When I've been talking about a feature, I've been talking about a discrete unit of functionality, something that we'd typically cover with an acceptance test or two, which ranges from small, e.g. the watchstar, to large, like the language switcher.

I'll update the epic to include a section with some examples of what I consider to be features.

Say, we call Special:Nearby a feature. How will all pages benefit from having an html or body class for this feature? Won't we be shipping unnecessary information for all pages since this feature is a separate page and we won't care about it unless we're on the Special:Nearby page.

You're right that it's redundant to add a CSS class in this case. I'll update the epic to reflect that this should be done by default but there'll be exceptions, e.g. Special:Watchlist.

Will we consider well established features such as Wikidata descriptions in search results a feature or a non-feature which is already tested and is found to be useful so that we decided to keep it around for the foreseeable future? Will everything be a feature in our codebase?

The example of Wikidata descriptions aside – have Wikidata descriptions been found to be useful yet? ;) – in short: ideally, yes.

As I understand it, the new system will help us toggle individual features as opposed to the beta mode where all new features are developed. This, in turn, makes it easy to manage and maintain features for us developers.

… and hopefully implement and unit test too – as well as for stakeholders to discover via Special:MobileFeatures, i.e. without intervention from developers.

I, however, think that the new solution won't be any different from the current implementation in beta in that the codebase will be filled with if checks to determine whether a feature is enabled or not. We'd surely be extracting features implementations to individual functions/files for easy reading but that can be achieved by commenting the codebase abundantly.

There are a whole lot of if checks to determine if features (or parts of features) are enabled but, most of the time, they are scattered across different parts of the codebase. This system is, in part, about creating a place to find them.

You're right to point out that commenting code (inline documentation) will also help. Moreover, writing good high- and low-level documentation isn't orthogonal to this goal – I think having a single system for defining, loading, and initialising features will make them easier to document!

Did I understand the proposed solution incorrectly. If yes, please explain it to me using a sample feature, for example, using the "Read in another language" feature.

I don't think you've misunderstood anything. That you have questions means I need to do a better job explaining my thoughts. I think that submitting a couple of rough changes might help here.

This is a minor gripe, but I'd like to point out that adding a lot of no-<feature> classes to the body if no extra styling is needed sounds cumbersome. Perhaps when the directory structure is changed to reflect these features (I'm assuming it will be), each feature could have a styles folder and will only include a no-<feature> class to the body if a sheet exists for it in the folder.

I talked about this with Sam today. Reflecting, through having these discussions we seem to have come up with a better way to manage features.

I'm wondering if more abstraction is needed or whether people feel like the original problems that led us to creating this task have been solved.@bmansurov@phuedx@Jhernandez any thoughts on that? What's missing?

Do we have any feature that we would like to implement in our new FMS (Feature Management System)? I don't see a value in creating a system without any features in it. First, we should what we want to refactor to be FMS-ready, and then implement the FMS scaffolding. Writing FMS just for sake of writing it doesn't seem valuable.

Do we have any feature that we would like to implement in our new FMS (Feature Management System)? I don't see a value in creating a system without any features in it. First, we should what we want to refactor to be FMS-ready, and then implement the FMS scaffolding. Writing FMS just for sake of writing it doesn't seem valuable.

This effort was often stalled on the fact that the Reading Web team (the maintainers of MF) don't have an accepted definition of "a feature." I still maintain that a feature is anything that affects the UX, e.g. the language switcher, edit links, section toggling, notifications, lazily loaded images, etc. This is somewhat reinforced by a cursory ag for getConfigVariable (see MobileContext#getConfigVariable, which is our basic feature flagging system), making a note of the name of the feature flag itself:

MinervaShowCategoriesButton

MinervaEnableFontChanger

MinervaEnableBackToTop

MFLazyLoadReferences

MFLazyLoadImages

MFShowFirstParagraphBeforeInfobox

MFExpandAllSectionsUserOption

MinervaEnableFontChanger

If the intention of this epic was only feature flagging, then we've already got something in the codebase that handles this and it might be enough.

However, IIRC, my intention was to propose a way of reorganising the codebase around the larger features (see my definition above and the Feature interface in the description), and not, as I saw it, leave it as a giant bundle of interdependent modules with shared glue. Would a giant refactor be worth it? Let's decide!