For the past couple of months, in collaboration with researchers, I've been applying machine learning to RUM metrics in order to model the microsurvey we've been running since June on some wikis, with the goal of gaining some insight into which RUM metrics matter most to real users.

What happened?
On September 24, 2018, a series of malicious edit attempts were detected on translatewiki.net. These included attempts to inject malicious JavaScript, as well as threatening messages and porn.

We use both synthetic and RUM testing for Wikipedia. These two ways of testing performance are best friends and help us verify regressions. Today, we will look at two regressions where having metrics from both helped us.

One of the Performance Team's responsibilities at Wikimedia is to keep track of Wikipedia's performance. Why is performance important for us? In our case it is easy: we have so many users that if we have a performance regression, we are really affecting people's lives. Maybe you remember our hiring tweet from a couple of years ago?

This blog post will describe a bit about how we are utilizing the "Task Types" feature in Phabricator to facilitate better tracking of work and to streamline workflows with custom fields. Additionally, I will be soliciting feedback about use cases that could take further advantage of this feature.

Just as we were settling into nova-network (along with other early OpenStack adopters), the core developers were already moving on. A new project (originally named 'Quantum' but eventually renamed 'Neutron') would provide stand-alone APIs, independent from the Nova APIs, to construct all manner of software-defined networks. With every release Neutron became more elaborate and more reliable, and became the standard for networking in new OpenStack clouds.

On August 1st, we finished up a 6 week transition process which eliminated the last non-forward-secret cipher supported by our TLS termination, which was bare AES128-SHA. This change guarantees that all TLS connections to Wikipedia and its sister projects use cipher suites that provide forward secrecy. Wikipedia is the first major high-traffic site to take this step.
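Forward secrecy comes from the key exchange: suites negotiated with ephemeral (EC)DHE keys stay confidential even if the server's long-term key later leaks, while a suite like AES128-SHA implies static RSA key exchange and offers no such guarantee. As a minimal illustration (not our production configuration), here is a sketch that classifies OpenSSL-style cipher suite names by that rule:

```typescript
// Classify an OpenSSL-style cipher suite name by its key exchange.
// Forward secrecy requires ephemeral (EC)DHE; names like "AES128-SHA"
// imply static RSA key exchange and lack it. Note that TLS 1.3 suite
// names (e.g. TLS_AES_128_GCM_SHA256) don't encode the key exchange at
// all, because every TLS 1.3 handshake is forward secret by design.
function isForwardSecret(cipherSuite: string): boolean {
  return /^(ECDHE|DHE)-/.test(cipherSuite);
}

console.log(isForwardSecret("ECDHE-ECDSA-AES256-GCM-SHA384")); // true
console.log(isForwardSecret("AES128-SHA")); // false
```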

Between 2017-11-20 and 2017-12-01, the Wikimedia Foundation ran a direct response user survey of registered Toolforge users. 141 email recipients participated in the survey which represents 11% of those who were contacted.

One particularly interesting topic discussed during the Hackathon Technical Debt session (T194934) was that of the contagious aspect of technical debt. Although this makes sense in hindsight, it's not something that I had really given much thought to previously.

I would like to share my deepest gratitude for everyone who responded to the Wikimedia Communities and Contributors Survey. The survey has closed for this year. The quality of the results has improved because more people responded this year. We heard from over 200 people who work in volunteer developer spaces like Phabricator, IRC, MediaWiki, mailing lists, and many others, which was a large increase from last year.

Running all tests for MediaWiki and matching what CI/Jenkins is running has been a constant challenge for everyone, myself included. Today I am introducing Quibble, a Python script that clones MediaWiki, sets it up, and runs test commands.

The Popups MediaWiki extension previously used HTML UI templates inflated by the mustache.js template system. This provided good readability but added an 8.1 KiB dependency* for functionality that was only used in a few places. We replaced Mustache with ES6 syntax without changing existing device support or readability and now ship 7.8 KiB less of minified uncompressed assets to desktop views where Popups was the only consumer.
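To give a feel for the kind of change involved (the names below are illustrative, not Popups' actual code): a Mustache template string rendered through mustache.js becomes a plain function built on an ES6 template literal, with a small helper mirroring the HTML escaping Mustache applies to `{{...}}` interpolations.

```typescript
// Before (conceptually), rendered via the mustache.js library:
//   Mustache.render('<a href="{{url}}">{{title}}</a>', data)
// After: a plain function using an ES6 template literal, no library needed.

// Mirrors the escaping Mustache applies to {{...}} interpolations.
function escapeHtml(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

function renderPreviewLink(data: { url: string; title: string }): string {
  return `<a href="${escapeHtml(data.url)}">${escapeHtml(data.title)}</a>`;
}

console.log(renderPreviewLink({ url: "/wiki/Tree", title: "Tree" }));
// <a href="/wiki/Tree">Tree</a>
```

The template stays readable in place, and the whole mustache.js dependency disappears from the bundle.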

I've spent the last few months building new web servers to support some of the basic WMCS web services: Wikitech, Horizon, and Toolsadmin. The new Wikitech service is already up and running; on Wednesday I hope to flip the last switch and move all public Horizon and Toolsadmin traffic to the new servers as well.

Yesterday we deployed Thumbor support for Wikimedia-hosted private wikis. While 99.9% of our traffic is for public-facing wikis, the Wikimedia Foundation hosts a number of private MediaWiki instances on the same infrastructure. Those wikis facilitate work for various groups in the movement, from community-run projects like OTRS, to local chapters, staff or the board. They're essential to the Wikimedia Movement, but by being private they're an architectural special case.

This is a digest of the updates from several weeks of changelogs which are published upstream. This is an incomplete list as I've cherry-picked just the changes which I think will be of significant interest to end-users of Wikimedia's Phabricator. Please see the upstream changelogs for a detailed overview of everything that's changed recently.

Work on JADE (our auditing system) is continuing: we have a database schema designed and some code written for the backend service, and we have planned an event-based architecture plus content-handled Jade and Jade_talk namespaces within MediaWiki.

Draftquality data is cached in the ORES extension and is made available to other extensions.

Long ago, the Wikimedia Operations team made the decision to phase out use of Ubuntu servers in favor of Debian. It's a long, slow process that is still ongoing, but in production Trusty is running on an ever-shrinking minority of our servers.

In the last blog post I described where Thumbor fits in our media thumbnailing stack. Introducing Thumbor replaces an existing service, and as such it's important that it doesn't perform worse than its predecessor. We came up with a strategy to reach feature parity and ensure a launch that would be invisible to end users.

This is your friendly but final warning that we are replacing Selenium tests written in Ruby with tests in Node.js. There will be no more reminders. The Ruby stack will no longer be maintained. For more information see T139740 and T173488.

New language support for Bengali, Greek, and Tamil. New advanced edit quality support for Albanian and Romanian. We cleaned up the old 'reverted' models where better support is available. We're working on moving to a new dedicated cluster. We improved some models by exploring new sources of signal and cleaning datasets. We started work on JADE and presented on The Keilana Effect at Wikimania.

The current physical servers for the <wiki>_p Wiki Replica databases are at the end of their useful life. Work started over a year ago on a project involving the DBA team and cloud-services-team to replace these aging servers (T140788). Besides being five years old, the current servers have other issues that the DBA team took this opportunity to fix:

24% of Wikipedia edits over a three month period in 2016 were completed by software hosted in Cloud Services projects. In the same time period, 3.8 billion Action API requests were made from Cloud Services. We are the newly formed Cloud Services team at the Foundation, which maintains a stable and efficient public cloud hosting platform for technical projects relevant to the Wikimedia movement. -- https://blog.wikimedia.org/2017/09/11/introducing-wikimedia-cloud-services/

Today, we discovered a major regression in Wikilabels. We've patched the issue and made an emergency deployment. We also deleted some labels that were saved while the system was compromised. In this post, we'll describe what happened.

Today, I'm writing to announce a breaking change in ORES that will come out about a month from now. It will only change how information about prediction models is stored and reported. This information is used by some tools to set thresholds at specified levels of confidence (e.g. "give me the threshold that gives 90% recall"). In this blog post, I'll explain how this is currently done and how it will be done once we deploy the change.
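The "threshold that gives 90% recall" idea can be sketched as a small search over candidate score cutoffs. This is an illustration of the concept only, not ORES's actual implementation (ORES precomputes and serves these statistics from its model information):

```typescript
// Given model scores and true labels, find the strictest (highest) score
// threshold whose recall (true positives / all positives) still meets the
// target. Illustrative only; not how ORES itself computes thresholds.
function thresholdAtRecall(
  scores: number[],
  labels: boolean[],
  targetRecall: number
): number {
  const positives = labels.filter(Boolean).length;
  // Candidate thresholds, highest first: prefer the strictest cutoff.
  const candidates = Array.from(new Set(scores)).sort((a, b) => b - a);
  for (const t of candidates) {
    const tp = scores.filter((s, i) => s >= t && labels[i]).length;
    if (tp / positives >= targetRecall) return t;
  }
  return 0; // no cutoff reaches the target (e.g. no positive labels)
}

console.log(
  thresholdAtRecall([0.9, 0.8, 0.4, 0.2], [true, true, true, false], 0.9)
); // 0.4
```

A tool can then treat any edit scoring at or above the returned threshold as worth reviewing, knowing roughly what fraction of damaging edits that cutoff catches.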

Toolsadmin.wikimedia.org is a management interface for Toolforge users. On 2017-08-24, a new major update to the application was deployed which added support for creating new tool accounts and managing metadata associated with all tool accounts.

Back in year zero of Wikimedia Labs, shockingly many services were confined to a single box. A server named 'virt0' hosted the Wikitech website, Keystone, Glance, LDAP, RabbitMQ, ran a puppetmaster, and did a bunch of other things.

At 1100 UTC on June 23rd, ORES started to struggle. Within a half hour, it had fully choked and could no longer respond to any requests. It took us 10 hours to diagnose the problem, fix it, and be confident it was resolved. We learned some valuable lessons when studying and addressing this issue.

The Reading Web team recently discovered a bug in Firefox wherein a load event is fired when Firefox loads certain pages from its Back-Forward Cache (BFCache). To JavaScript on those pages, this event is a second load event (the first having been fired before the user navigated away from the page). This proved to be problematic for the cornerstone of our instrumentation, the EventLogging extension and delayed the deployment of Page Previews for approximately three months.
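A common defensive pattern for this class of bug is to key instrumentation off `pageshow` and check `event.persisted`, which browsers set to `true` when the page comes out of the BFCache, while also guarding against duplicate firing. The sketch below is illustrative only; the names are hypothetical and this is not EventLogging's actual fix.

```typescript
// Sketch: run load-time instrumentation at most once per real page view.
// On a BFCache restore, "pageshow" fires with event.persisted === true
// (and, per the Firefox bug described above, a second "load" may fire too).
const events: string[] = [];
function logPageLoad(): void {
  events.push("pageload"); // stand-in for the real instrumentation call
}

let loadLogged = false;
function onPageShow(event: { persisted: boolean }): void {
  if (event.persisted || loadLogged) {
    return; // restored from BFCache, or already instrumented: skip
  }
  loadLogged = true;
  logPageLoad();
}

// In a browser this would be wired up as:
//   window.addEventListener("pageshow", onPageShow);
onPageShow({ persisted: false }); // initial navigation: logged
onPageShow({ persisted: true });  // back-button BFCache restore: skipped
console.log(events.length); // 1
```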

Tool owners want to create accessible and pleasing tools. The choice of fonts has previously been difficult, because directly accessing Google's large collection of open source and freely licensed fonts required sharing personally identifiable information (PII) such as IPs, referrer headers, etc. with a third-party (Google). Embedding external resources (fonts, CSS, JavaScript, images, etc.) from any third-party into webpages hosted on Toolforge or other Cloud VPS projects causes a potential conflict with the Wikimedia Privacy Policy. Web browsers will attempt to load the resources automatically and this will in turn expose the user's IP address, User-Agent, and other information that is by default included in an HTTP request to the third-party. This sharing of data with a third-party is a violation of the default Privacy Policy. With explicit consent Toolforge and Cloud VPS projects can collect and share some information, but it is difficult to secure that consent with respect to embedded resources.

The Wikimedia Foundation’s new Scoring Platform team, led by Aaron Halfaker, will be working on democratizing access to AI, developing new types of AI predictions, and pushing the state of the art with regards to ethical practice of AI development.

The shared Elasticsearch cluster hosted in Toolforge was upgraded from 2.3.5 to 5.3.2 today (T164842). This upgrade comes with a lot of breaking API changes for clients and indexes, and should have been announced in advance. @bd808 apologizes for that oversight.

Two outages with documentation. Revscoring 2.0 coming with better model information and "thresholds". New support for Romanian, Albanian, Tamil, Greek, and Bengali. We're officially welcoming @awight to the team!

Debian Stretch was officially released on Saturday[1], and I've built a new Stretch base image for VPS use in the WMF cloud. All projects should now see an image type of 'debian-9.0-stretch' available when creating new instances.

We are currently in the final stages of deploying Thumbor to Wikimedia production, where it will generate media thumbnails for all our public wikis. Up until now, MediaWiki was responsible for generating thumbnails.

Back in the dark ages of Labs, all instance puppet configuration was handled using the puppet ldap backend. Each instance had a big record in ldap that handled DNS, puppet classes, puppet variables, etc. It was a bit clunky, but this monolithic setup allowed @yuvipanda to throw together a simple but very useful tool, 'watroles'. Watroles answered two questions:

One of the goals of the Wikimedia Performance Team is to improve the performance of MediaWiki and the broader software stack used on Wikimedia wikis. In this article we’ll describe a small performance improvement we’ve implemented for MediaWiki and recently deployed to production for Wikimedia. It highlights some of the unique problems we encounter on Wikimedia sites and how new web standards can be leveraged to improve performance.

The first very visible step in the plan to rename things away from the term 'labs' happened around 2017-06-05 15:00Z when IRC admins made the #wikimedia-labs IRC channel on Freenode invite-only and set up an automatic redirect to the new #wikimedia-cloud channel.

Updates now coming to the Phame blog! We made presentations and gathered new collaborators at the Wikimedia Hackathon 2017 in Vienna. ORES is back in api.php. Wikilabels has stats. ORES in CODFW fell over for a while, but it's back.

I wanted to let you know about an upcoming experimental Reddit AMA ("ask me anything") chat we have planned. It will focus on artificial intelligence on Wikipedia and how we're working to counteract vandalism while also making life better for newcomers.

In this update, I'm going to change some things up to try and make this update easier for you to consume. The biggest change you'll notice is that I've broken up the [#] references in each section. I hope that saves you some scrolling and confusion. You'll also notice that I have changed the subject line from "Revision scoring" to "Scoring Platform" because it's now clear that, come July, I'll be leading a new team with that name at the Wikimedia Foundation. There'll be an announcement about that coming once our budget is finalized. I'll try to keep this subject consistent for the foreseeable future so that your email clients will continue to group the updates into one big thread.

I hosted the AI Wishlist session at the Developer Summit (T147710). At that session, we brainstormed a set of AIs that we think would be interesting to implement. Generally, I asked people to do their best to follow a template that would help us remember why the AI was important, what it would help with, and what resources might help get it implemented. See artificial-intelligence

Last week @Jdlrobson pinged me by email about a performance improvement his team noticed for large wiki articles on the mobile site in our synthetic tests run on WebPageTest. The improvement looked like this, a sudden drop in SpeedIndex (where lower is better):