The Technology department supports reliable, fast, and secure global access to the Wikimedia projects. The team covers performance, availability, development infrastructure, technical operations, security, architecture, release management, analytics engineering, and Research. As one of the larger teams, it supports most of the core operations that keep our projects, services and development pathways available to as many people as possible, on as many devices as possible, in a manner that safeguards users’ privacy and trust.

We support progress by providing tooling and infrastructure that make it possible for product developers to enhance and augment the capabilities of our software. We create pathways for creative and motivated individuals to translate their ideas into working software that is reliable, easy-to-use, secure, and scalable.

We work closely with the Product team to support ongoing initiatives and cover development dependencies. We provide counsel by assisting product teams and other units within the organization and the movement to make good choices on technology by assessing costs, analyzing usability, anticipating failures, evaluating privacy, reviewing security, projecting impact, and suggesting alternatives.

We also work with the Product audiences, providing research before new products or features are built. We conduct, enable, and review research to validate and iterate on concepts, and to ensure that products are usable and built around users’ needs. We conduct research and explore services so that we can surface the best ways to end barriers to contribution. We use formal collaborations with industry and academia to scale the efforts of the organization.

Our new initiatives for 2017-18 include a renewed emphasis and focus on the MediaWiki platform, launching the Wikimedia Cloud Services Team, and expanding the machine learning capability of ORES to continue helping our editors create high-quality content faster and more easily. Also for this year, all of Technology's work is programmatic. Highlights of our programs include a concerted effort to reduce technical debt and to strengthen the technical community inside and outside the Foundation. A key initiative in our Research team will be an effort to increase content in multiple languages using recommendation technology to help our editors prioritize their work.

Strategic priorities: This work applies strongly to all strategic priorities: Reach, Community, and Knowledge. It is the baseline work needed so that all Wikimedia sites keep running reliably for editors and readers all over the world 24/7. In the absence of this work, no other program at the Foundation (or in the community) can be executed.

The Wikimedia Foundation operates one of the world’s most popular web site properties, and it continues to expand with deployments of additional features and services as part of its programmatic work. These resources need to be maintained with high levels of availability, reliability, security, and performance.

We will maintain the availability of Wikimedia’s sites and services for our global audiences and ensure they’re running reliably, securely, and with high performance. We will do this while modernizing our infrastructure and improving current levels of service when it comes to testing, deployments, and maintenance of software and hardware.

Objective 2: Pay down technical debt and allow upgrading of the core OpenStack platform to modern, supported releases by replacing the current network topology layer with OpenStack Neutron, which has become the standard for most OpenStack deployments.

Objective 3: Increase availability of compute resources for the IaaS product by expanding deployment of physical resources beyond the current single broadcast domain

Outcome 5: We have effective and easy-to-use testing infrastructure and tooling for developers.

Over the last decade and a half the Wikimedia Foundation has accrued what is termed “technical debt”: historical choices and technical limitations that limit the velocity of development. The primary goal of this program is to develop and implement practices that help the entire organization to identify and prioritize the resolution of technical debt properly. This program will have a positive multiplying effect on the speed and quality of all other programs the Foundation implements.

Wikimedia's software products and platforms have a diverse collection of technical communities including code contributors, documentation contributors, bug reporters, API consumers, volunteers who build innovative solutions to on-wiki workflow issues, researchers who examine the data generated by the Wikimedia projects, value-added vendors who provide services and support based on Wikimedia free and open-source software products, and true 'third parties' who install and use FLOSS software produced by the Wikimedia movement on their own computers for various reasons. These audiences contribute directly and indirectly to the broadest goal of the movement: to collect and disseminate knowledge. However, they have not always been well recognized for these contributions and supported in their work. The technical community support project will attempt to begin to address this shortcoming by providing better documentation, facilitating community building, and establishing better pathways for communication between these communities and the Foundation.

We will expand and strengthen our technical communities, focusing on understanding their needs and measuring the progress and outcome of our efforts. In particular, we will focus on three traditionally underserved communities: tool and bot developers; API and data consumers; and third-party users of our software.

Outcome 1: Becoming a technical contributor to the Wikimedia movement by creating and maintaining 'tools' (bots, webservices, etc) and other innovative solutions is easier than it has been historically because documentation is easier to find, more comprehensive, and descriptive of start to finish steps needed to solve common problems. Cloud Services product users feel comfortable sharing their knowledge with others as part of a community with a culture of sharing via documentation and mutual support.

Objective 1: Collaborate with community to find volunteers willing to form a documentation Special Interest Group to update documentation of existing Cloud Services products

Objective 2: Create tutorial content for common issues including but not limited to: creating initial account, deploying a functional web service, deploying a functional bot, and running periodic jobs with variations. Where applicable, produce variants for more than one implementation language (e.g. PHP, Python, etc).

Outcome 2: The adoption of Wikimedia technology can be reliably measured

Objective 1: Design a set of formal KPIs (key performance indicators) to measure the growth and diversity of our technology audience

Outcome 3: Value-added vendors who provide services and support based on Wikimedia software and true 'third parties' who install and use software produced by the Wikimedia movement on their own computers are more confident in recommending, deploying, and extending Wikimedia FLOSS projects.

Objective 1: Establish canonical point of contact for third-parties by promoting the existence of a dedicated technical liaison for software projects with support for third-party users

Objective 2: Clarify the Foundation’s short- and long-term commitments to third-party users. Create, publish, and promote a multi-tiered, third-party support level system for Wikimedia software projects. Document the support level of existing FLOSS projects and ensure that the documented levels of support are delivered.

Outcome 4: Collaboration with researchers in industry and academia is further scaled and supported, so that more findings and datasets are published and disseminated under an open license. This helps us solve strategically important questions.

Objective 1: Organize and host the annual Wiki Research Workshop to help align the interests of the academic community to issues of strategic importance for the movement. Continue to successfully run a research workshop at a major conference, as we have for the past 3 years.

Objective 2: Maintain the current capacity for formal research collaborations with industry and academia to reduce the overall cost for the organization to conduct research projects. As of March 2017, the Wikimedia Research department works with 30 collaborators under the terms of our Open Access policy.

Outcome 5: Organize Wikimedia Developer Summit as a three day meeting of ~50 senior technical contributors focusing on one strategic theme announced before the call for participation and scholarship requests start.

Objective 1: Developer Summit web page published four months before the event includes dates and location (at least nearest airport), main theme, call for participation, call for scholarship requests, and calendar with deadlines. A good representation of non-WMF stakeholders related to the main theme are invited and participate at the event (preferred) or online.

Objective 2: A process allows prospective participants to submit statements and proposals about the main theme, and allows the Program Committee to review them and notify participants of its decisions. Discussions start before the event with the involvement of all the relevant stakeholders, in order to identify the points that need to be addressed at the event.

Objective 3: Activities during the Summit are well documented, especially outcomes and actions, which will be compiled in a systematic way for better evaluation and followup.

Artificial Intelligence (AI) has great potential to help our projects scale by reducing the work that our editors need to do and enhancing the value of our content to readers. However, AIs also have the potential to perpetuate biases and silence voices in novel and insidious ways. ORES is a high-capacity machine learning prediction service that is already heavily adopted within and outside the Wikimedia Foundation. By expanding the service to support new wiki processes and implementing auditing tools, we will help identify and mitigate the effects of prediction bias.

We will build a new production platform for integrated development, testing, deployment, and hosting of applications. This will greatly reduce the complexity and increase the speed of delivering a service and maintaining it throughout its lifecycle, with fewer dependencies between teams and greater automation and integration. The platform will offer more flexibility through support for automatic high-availability and scaling, abstraction from hardware, and a streamlined path from development through testing to deployment. Services will be isolated from each other for increased reliability and security.

Wikimedia developers, as well as third-party users, benefit from the ability to easily replicate the stack for development or their own use cases.

This work also represents an investment in the future; although this will not yet significantly materialize within FY17-18, this project will eventually result in significant cost savings on both capital expenditure (through consolidation of hardware capacity) and staff time (by streamlining development, testing, deployment and maintenance).

Strategic Priorities: Provide some of the tools and data to be able to measure progress on strategic priorities. Also address specific data needs of our community such as updating http://stats.wikimedia.org (the community’s main source of metrics for Wikimedia projects) and revamping infrastructure on the Cloud Services environment for better data access.

Our data is not as discoverable and accessible as it should be for both the Foundation and our communities. This is most notable for data in the edit ecosystem. This program aims to make data of higher quality and to improve data access; the more accessible that data is, the more impact it can have. Most of the focus of this program is on infrastructure and tools for better public data access; however, we also include some improvements to private datasets.

Make Wikimedia data easily available for both the Foundation and the different Wiki communities by providing better tools, infrastructure, and access to data for editors, communities, and Foundation staff.

Outcome 1: Foundation staff and community have better tools to access data.

Objective 1: Wikistats 2.0 redesign. Wikistats is the de facto source of statistics about the Wikimedia projects for the community; this work includes developing a basic and an advanced frontend and an API-powered backend.

Objective 2: Better visual access to EventLogging data

Objective 3: Experiments with real-time data and community support for new datasets available

Objective 1: Provide reliable and available access to Wikimedia database dumps by upgrading the hardware used and consolidating access by internal teams, Cloud Services users, external mirrors, and HTTPS downloaders to the new canonical location.

Objective 2: Complete migration of production database replica access for Cloud Services customers to the new high-availability cluster, which uses row-based replication technology to provide a more consistent view of production data.

Objective 4: Provision a cluster for public Data Lake access in labs that can be used as a Quarry backend. In this iteration the Data Lake will include historical data about editing (revisions, pages, users) for all Wikimedia projects since the beginning. Data is optimized to be queried in an analytics-friendly way that allows for simple and fast queries.

Although Wikimedia currently operates two data centers each independently capable of serving our core sites and services, many of our services – including our most important core platform component (MediaWiki) – are only active in a single data center at any point in time, with the other data center being on standby. Switching between the two data centers is currently a very involved manual process with significant impact to the availability of our services for our users and substantial risk of failure. By extending existing services (and MediaWiki in particular) with support for serving requests from multiple data centers concurrently, this impact can be minimized and currently unused performance benefits can be leveraged.

We will improve availability and performance for our users, while also minimizing the impact from fail-over testing and catastrophes. We will do this by expanding our multi-data center capabilities to serve requests from multiple data centers simultaneously.

There are significant gaps in knowledge in Wikipedia today, both in the articles available in different languages and in the depth of content available in existing articles. Recommendation systems that can help editors identify prioritized missing content across Wikipedia editions and contribute towards closing the gaps are key for accelerating the article creation rate.

Outcome 1: Interested editors will be able to use recommendation services that will allow them to have relevant information about the articles they want to edit immediately at their repository. Editathon organizers benefit from automatically generated templates and recommendations that can help them in onboarding new or less experienced editors.

Objective 1: Build, improve, and expand algorithms that can provide more detailed recommendations to editors about how an article could be expanded. This step will require running natural or controlled experiments and will involve recommendations at different levels of granularity (from section recommendations, to reference and image recommendations all the way to potentially providing guidance on how to expand, for example, sections, by offering statistics about typical section features).

Objective 2: Develop and gather design requirements for how the algorithms’ results should be exposed to the editors. This objective requires the continuation of the work with the community of editors and editathon organizers started in FY16-17.

Objective 4: Build Labs API(s) that can be used by researchers and developers to use and surface the recommendations in other products and research initiatives. (Note that building the productionized API(s), when relevant, will be done in collaboration with Product teams and is not captured in this objective.)

The 'services' in the Wikimedia Cloud Services team name encompasses a collection of products that build upon the utility of the core infrastructure as a service (IaaS) product to present a well rounded and useful platform for volunteers. This helps solve the technical problems of the Wikimedia movement.

Empower volunteers to create technical solutions to the problems of on-wiki communities with a minimal investment of time and low friction for transferring maintainership from one individual to another.

Outcome 1: Members of the Wikimedia movement are able to develop and deploy technical solutions with a reasonable investment of time and resources on the Wikimedia Cloud Services Platform as a Service (PaaS) product.

Objective 2: Migrate Tool Labs account workflows from Wikitech to Striker where they are easier to integrate with the new user onboarding workflow and easier to maintain

Milestone 1: Maintain high overall customer satisfaction for the Tool Labs product as measured by the annual developer survey

Outcome 2: The 'Labs, labs, labs' branding confusion is eliminated. Branding is separated, so that all of these are no longer referred to as just ‘Labs’: the infrastructure as a service product, the platform as a service product, the team that manages those products, and the community that uses them to produce technical solutions.

Outcome 3: Wikimedia community members, Foundation staff, and potential contributors are aware of the breadth of products and services offered by the Cloud Services team.

Objective 1: Promote available services and products at relevant conferences, hackathons, and within the Wikimedia communities

Outcome 4: Support requests from Cloud Services users are addressed in a best effort manner without interrupting core operational and development work by the Cloud Services team towards other program goals.

Objective 1: Provide first line technical support resources to triage and respond to Cloud Services managed product support requests

Wikimedia projects rely on verifiability as one of their core policies. There has been growing interest in building a stronger technological foundation for how sources are represented, stored and reused by contributors across Wikimedia projects. Sourcing of statements is a high priority in projects like Wikidata and a range of technical and programmatic initiatives (such as Citoid, the Wikipedia Library, OABot) have been designed to facilitate the creation of references. Despite over 10 years of community-driven efforts to design better ways to support citation-related work in Wikipedia, it’s only with the advent of Wikidata that these efforts have started to coalesce. The present program aims to develop a deeper understanding of how Wikimedia contributors use sources and lay the foundation for better technological support around sources and citations.

In the next fiscal year, we will conduct research aiming to: improve the user experience of editors and readers around sources and citations; quantify citation coverage across Wikimedia contents, identifying gaps and areas of low citation quality; help contributors identify topic areas of Wikimedia projects in greater need of sourcing work so that citation quality gaps are addressed. We will leverage our network to establish new formal collaborations and answer research questions related to the coverage, quality and accessibility of citations across Wikimedia projects. We will also continue to lead the WikiCite series, which started in 2016, in order to help align community and technical efforts related to citation data and infrastructure.

Outcome 1: Quantitative research is available to help Wikipedia and Wikidata contributors focus and prioritize their sourcing efforts.

Objective 1: Estimate what proportion of content in Wikipedia or Wikidata is unsourced and in need of citations. Estimate what proportion of existing sources cited across Wikimedia projects are accessible by the general public.

Objective 2: Collect and analyze clickthrough data for footnotes and external links to understand how readers interact with them (after discussing and reviewing privacy and security implications)
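The estimates named in the objectives above (the share of unsourced content, or the share of footnote views that lead to a click) are proportions measured on a sample, so they are most useful when reported with an uncertainty range. The following is a minimal sketch of one common way to do that, a Wilson score interval. It is an illustration only, not the method the Research team uses; the function names and the sample counts are hypothetical.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Return a (low, high) Wilson score interval for a sample proportion.

    successes: number of sampled items with the property of interest
    n:         sample size
    z:         normal quantile (1.96 corresponds to roughly 95% confidence)
    """
    if n <= 0:
        raise ValueError("sample size must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must be between 0 and n")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    # Clamp to [0, 1] to absorb floating-point noise at the extremes.
    return max(0.0, centre - half), min(1.0, centre + half)


def estimate_share(flags):
    """Point estimate and interval for a list of booleans.

    flags: one boolean per sampled item, e.g. True when a sampled
           statement has no citation (hypothetical labelling scheme).
    """
    n = len(flags)
    hits = sum(1 for f in flags if f)
    low, high = wilson_interval(hits, n)
    return hits / n, low, high


if __name__ == "__main__":
    # Made-up sample: 120 of 400 sampled statements lack a citation.
    sample = [True] * 120 + [False] * 280
    share, low, high = estimate_share(sample)
    print(f"estimated share: {share:.3f} (interval {low:.3f} to {high:.3f})")
```

The same helper could be applied to clickthrough counts (clicks out of footnote impressions), assuming the events are collected in a way that has passed the privacy and security review mentioned in Objective 2.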

Only 10-15% of Wikipedia editors are known to be female. The lack of gender diversity has long been acknowledged by the Wikimedia community. We want to focus on specific drivers of the lack of gender diversity identified in the academic literature, design frameworks that can change such drivers, and measure the impact of such changes on contributor diversity in Wikipedia. (Please read more details about the program documented in Meta.)

Objective 2: Design frameworks to change the current socio-technical infrastructure to address at least one of the underlying causes of lack of representativeness (“Lack of confidence” is considered one such underlying cause). This step will take place in collaboration with the community of editors already experienced in this area and it has already started.

Objective 3: Run experiment(s) to assess whether the recommended design will have the desired outcome