We plan to publish up to a dozen stories on this blog and on social media:
• profiles of WikiWomen who are making exceptional contributions to our cause
• reports about worldwide edit-a-thons on International Women’s Day
• community picks of great articles about women on Wikimedia sites
• overviews of the best research studies about gender diversity on Wikipedia
• reports on programs that seem to be addressing these issues successfully
• best practices for increasing diverse contributions

Through these stories, we hope to surface answers to these key questions:
• How are women around the world contributing to Wikimedia today?
• How can we support gender diversity more effectively in our communities?
• What knowledge is still missing in Wikimedia content about women and gender? How can we fill those gaps?

If you would like to submit a story on this topic, please review these guidelines and post your draft or outline, then email us. We will review new submissions for this editorial theme until March 10th, 2015.

We look forward to working together to grow and diversify our movement!

Hindi Wikipedians met to discuss a conference (‘sammelan’) in Delhi, to bring together editors dispersed across India. Photo by Muzammiluddin, freely licensed under CC-BY-SA 4.0.
In July 2012, a group of five Hindi Wikipedians started a discussion on the Hindi Wikipedia Village Pump to explore the possibility of holding a Sammelan (conference) for Hindi Wikipedia, against the backdrop of Wikimania 2012 and Malayalam Wiki Sammelan. The idea was to bring together the geographically dispersed Hindi community and to drive a coordinated approach for the growth of the Hindi Wikipedia. During the last few years, this need had been felt by Hindi Wikipedians on a number of occasions. In March 2014, when I was working as Programme Officer of the Centre for Internet and Society, Bangalore, I tried to elicit the opinions of Hindi Wikipedians on the village pump about the possibility of holding this Hindi Wiki Sammelan. The idea was welcomed by all Hindi Wikipedians and most of them favored Delhi as the location for the event.

Unlike other Indian regional language Wikipedias, the Hindi Wikipedia has a very special set of characteristics. Its contributors are geographically dispersed across the country, with practically no face-to-face interaction. There have only been a handful of workshops for Hindi Wikipedia. A disturbing trend is that, apart from a few dedicated contributors, the editors keep changing frequently. Nevertheless, the number of editors, articles and overall edits on Hindi Wiki has exceeded all other Indian language Wikipedias. Therefore, as a precursor to the Hindi Sammelan, efforts were initiated to hold a Hindi Sammelan Meetup with a few dedicated editors as well as individuals concerned about the growth of the Hindi Wikipedia. At the Wikimedia Foundation, Asaf Bartov supported this initiative and said on the Hindi Wiki Sammelan Project Page: “We at the Wikimedia Foundation are eager to provide the resources to make this event possible.”

In line with this objective, a Hindi Wiki Sammelan Meetup was organized in Delhi on February 14-15, 2015. The event was attended by 15 people, including three administrators of the Hindi Wikipedia: Ashish Bhatnagar, Aniruddha Kumar and Sanjeev Kumar. Also present were two reviewers: Piyush Maurya and myself. The event was supported by the Centre for Internet and Society and was coordinated by Abhishek Suryawanshi.

During our discussions, we decided that before planning a pan-India Hindi Wiki Sammelan, we would work on a Wiki Sammelan in Delhi this year. Participants also reviewed the idea of holding outreach programs in a number of colleges and universities. Here are some of the suggestions which were endorsed:

Plan for “Wikipedian-in-residence” positions for the growth of Hindi Wikipedia in collaboration with various organizations.

Use “Hindi Fortnight” programme in Central Government organizations for the growth of Hindi Wikipedia.

Aim for a syndicated weekly Wikipedia editing tutorial column for Hindi newspapers in the north.

Plan Wikipedia programmes for radio and television.

Make effective use of social media.

Plan a better integration with different regional languages. Since many of the languages in India, such as Marathi, Konkani and Bhojpuri, use the Devanagari script, Hindi Wikipedia outreach in these regions (Maharashtra, Goa, Bihar, etc.) could be planned in harmony.

Distribute the workload: During the meeting, many participants agreed to oversee outreach activities, especially in Delhi, Lucknow and Punjab.

If this initial meetup is successful in focusing our efforts to promote the Hindi Wikipedia, we hope that the proposed Wiki Sammelan events (at the local level in Delhi as well as the actual Hindi Wiki Sammelan at the national level) can support the future growth and development of Hindi Wikipedia. We also hope these events can serve as a model for building a coordinated approach between other wiki communities that are geographically dispersed.

This post was written by Jason Evans, Wikimedian in Residence at the National Library of Wales

The focus during the first weeks of the residency has been on meeting with teams from various departments in the Library. The fact that I have worked with many of the staff for nearly ten years made introductions a little easier. However, this was primarily a chance to clarify the nature of the residency and to promote its goals and objectives. These meetings also spawned excellent ideas which have helped shape plans thus far.

A major objective for the residency is to hold a number of editathons and plans are already firming up. The first editathon, on the 10th of April, will focus on Welsh photographers including Philip Jones Griffiths, whose defining images captured the horrors of the Vietnam war. Events are being planned on a variety of topics including medieval Welsh law, World War I, the Welsh colony in Patagonia, and Welsh rugby. Editathons will include an introduction to Wikipedia and basic training for new editors.

Library staff will also be involved. Following introductory presentations, all staff and library volunteers will be offered training workshops so that they can become editors themselves, and I have already spoken to a number of people who are keen to get started.

Despite being in the midst of a major restructuring process, staff throughout the institution have reacted positively to the arrival of a Wikipedian. They are keen to get involved and to support the project. As such, a number of initiatives are already being developed. The exhibitions department has agreed to trial the use of QRpedia codes in a major upcoming exhibition, and the web team are working on installing a ‘Cite on Wikipedia’ button into our online resources, which will generate a ready-made web citation in wiki markup.

Discussions have opened with an external partner – People’s Collection Wales – about changing its licence policy so that future contributions could be uploaded to Wikimedia Commons. Perhaps most exciting are plans to share around 20,000 digital images from the library’s collection. Once we have ironed out a few technical issues, we should be able to use GLAM-Wiki tools to upload en masse to Wikimedia Commons and allow the world a glimpse of our hidden treasures!

During the Tech budget and resourcing meeting for the 2014-2015 Annual Plan, one of the ideas proposed was possibly sourcing an incubator group to (re)“build Wikipedia or other major project in line with the Vision from the ground up, without prior constraints from existing technology, processes”, or communities. The idea was that, even if it didn’t succeed, it would cause the organization “to think differently, to create energy around being BOLD,” and catalyze the movement.

This had some currency with many of the participants[1], even the C-level[2] involved, until a director argued that this was infeasible due to the Innovator’s Dilemma. Ignoring the obvious misreading of the book, he argued that because this might destroy the existing order inside the organization, it couldn’t be done by the organization itself, and thus the proposal died despite never going up for consensus consideration.[3]

Deciding that it was politically stupid to point out their Reader’s Digest understanding of a deeply flawed business text, I instead argued that an organization built around vision, rather than profits, does not have the same constraints that allow disruptive technologies to spell its undoing.

That argument didn’t carry weight because people with more experience than me were sure that this initiative would be defunded in the next annual plan and that no one would ever get behind a project that is a direct threat to them. Incubation outside the WMF was the only possibility.

…

It’s sad that people don’t bother to know the most basic lived history of their own industries (or have a terribly short memory).

The Mozilla Firefox project was created by Dave Hyatt and Blake Ross as an experimental branch of the Mozilla browser.

The Phoenix name was kept until April 14, 2003, when it was changed because of a trademark dispute with the BIOS manufacturer, Phoenix Technologies (which produces a BIOS-based browser called Phoenix FirstWare Connect). The new name, Firebird, met with mixed reactions, particularly as the Firebird database server already carried the name.

The project which became Firefox started as an experimental branch of the Mozilla Suite called m/b (or mozilla/browser). After it had been sufficiently developed, binaries for public testing appeared in September 2002 under the name Phoenix.

Hyatt, Ross, Hewitt and Chanial developed their browser to combat the software bloat of the Mozilla Suite (codenamed, internally referred to, and continued by the community as SeaMonkey), which integrated features such as IRC, mail and news, and WYSIWYG HTML editing into one software suite.

Dave Hyatt would leave Netscape[4] for Apple in 2002 and go on to architect the number one competitor to Firefox: Safari and WebKit (the core of Safari and Google Chrome). Blake Ross would work at Netscape/Mozilla until 2004 and be nominated the next year for Wired magazine’s top Rave Award, Renegade of the Year, as all of Mozilla’s resources had been redirected to Firefox, a project started internally by two employees to combat the poor direction of the original Mozilla project.

…

So yeah, Fuck you.

In the months since, whenever I mentioned this to a WMF staff member, I would often pretty much have to hold him or her back from wanting to switch to this team, if it were to exist.

March 2, 2015. Semantic MediaWiki 2.1.1 has now been released. This minor release provides bugfixes for the current 2.1 branch of Semantic MediaWiki. See the page Installation for details on how to install, upgrade or update.

Sebastiaan was project leader at the Dutch Wikimedia chapter. He did tons of great work, and made a point of recording activities with his camera. Particularly worth mentioning is his huge involvement in the GLAM area.

Sebastiaan will pursue his career elsewhere. His friendly, cooperative attitude will be sorely missed. Thank you, Gerard

On February 19, the European Commission held a “high-level roundtable meeting” on copyright reform in Brussels, Belgium. The hearing was presided over by Commissioner Günther Oettinger and aimed at determining “how to facilitate access to knowledge and heritage through libraries, education and cultural heritage institutions, while at the same time making sure that copyright remains a driver for creativity and investment”. Wikipedian Lukas Mezger (User:Gnom) was invited to participate, representing the European Wikimedia chapters.

Copyright in Europe is largely shaped by European Union law. In 2014, the European Commission started a legislative effort to tackle 21st century problems in copyright law. This is why the EU Wikimedia Chapters have joined together to coordinate their political work, creating the Free Knowledge Advocacy Group EU. The group aims to achieve three common goals that lie at the heart of the Wikimedia movement: Freedom of panorama, public domain for public works, and free use of orphan works. Earlier, this blog discussed the European Parliament’s report regarding the planned reform.

For the hearing on February 19, we decided to present two key points to the Commissioner and his team. First, the European Wikimedia chapters support the idea of further harmonizing copyright law in Europe (as opposed to the so-called content industry, which sees the existing fragmented national rules as a competitive advantage). Second, we demand the creation of a mandatory freedom of panorama rule in the entire EU (this has so far been left to the member states to decide). (Freedom of panorama permits taking photos or videos of buildings in public places.)

Due to fragmented freedom of panorama rules in Europe, this is how the European Commission building in Brussels has to be displayed on Wikipedia. Photo by Stephane Mignon, CC BY 2.0.

Since the hearing focused on the interests of civil society, participants represented various interest groups. Representatives from Europeana, the library associations EBLIDA and LIBER, the Association of European Film Archives, the European Consumer Organisation, and the European Writers’ Council shared ideas that are close to those of the Wikimedia movement. As a community comprised of creators such as authors, programmers, and photographers on the one hand, as well as users such as researchers and ordinary readers on the other, we have a special perspective on the copyright reform debate.

The hearing was an exciting event for the European Wikimedia chapters. We were able to present our key positions at the highest political level. Wikimedia being invited to the hearing shows that our movement is recognized as a partner for civil society dialog on the European Union’s political stage. The Free Knowledge Advocacy Group EU will make good use of this connection to push for a harmonization of freedom of panorama and public domain government works so we can better share the world’s knowledge and cultural heritage. We will continue to get involved during the creation of the European Commission’s draft bill, which is expected in the fall of this year.

The Wikipedia gender gap is well documented and is one of the biggest challenges facing the global Wikimedia movement. To help support this campaign Wikimedia UK is running a retrospective review of its projects related to gender over the last few years. This will take place during March – Women’s History Month.

As a chapter Wikimedia UK recognises the importance of all types of diversity within our community and gender is an important aspect of this. With only around 8-14% of Wikipedia editors identifying as female there is much to be done to ensure that the incredible knowledge resources of the Wikimedia projects are reflective of the sum of all knowledge.

We are also seeking personal opinion pieces from people involved in some of those projects to explore how we can do better and why it is important we as a movement take on this challenge. If you would like to share your gender gap-related projects and stories, please do get in touch, either through the comments or by emailing stevie.benton-at-wikimedia.org.uk

In related news, the Wikimedia Foundation has recently announced that its Inspire Grants Programme will focus on supporting projects related to the Wikipedia gender gap until the end of April. This is an experiment that, if successful, may see a more theme-focused grants programme in future. You can find out more about the programme here.

There are two ways of improving the content of Wikidata. It can be by adding large amounts of statements or by adding more details to existing data. As I was adding the details, I found that several award winners do not have an article. Adding them in Wikidata is easy and obvious.

A.G. Mojtabai, for instance, received the award in 1986. Adding a red link in Wikipedia is not hard either. Thanks to the Redwd template, I linked the author to both Wikidata and Reasonator. One issue is that all these authors and the award are primarily known on the English Wikipedia. Consequently, their work and relevance have a limited audience at this time.

It would be nice if the presence of great information at Wikidata led to articles in other languages. The question is very much whether it does. Thanks, GerardM

Lillian Smith, who is obviously white, openly embraced controversial positions on matters of race and gender equality. She was a southern liberal unafraid to criticize segregation and work toward the dismantling of Jim Crow laws, at a time when such actions almost guaranteed social ostracism.

The Lillian Smith Book award honours those authors who, through their outstanding writing about the American South, carry on Smith's legacy of elucidating the condition of racial and social inequity and proposing a vision of justice and human understanding.

It is obvious that these writers are important as sources for the subject and consequently, registering them as award winners is important. This was done by harvesting the information from the article using Magnus's LinkedItems tool.

Obviously more can be done.

include all her work in Wikidata; it does not need a Wikipedia article

include all the works of the prize-winning authors

add dates as a qualifier for the award winners

complete the list of award winners

work on similar awards

There is always more that can be done on a subject as relevant as this.

The new version of MediaWiki has been on test wikis and MediaWiki.org since February 25. It will be on non-Wikipedia wikis from March 3. It will be on all Wikipedias from March 4 (calendar).

You can use the Content Translation tool on Wikipedia in Punjabi and Kyrgyz. You can ask for the tool in other languages.

Editing the fake blank line in VisualEditor is now simpler. This change also fixed a few bugs. [4][5][6][7]

The TemplateData editor now warns you if a related page already has TemplateData. [8]

The TemplateData table now tells you if a template doesn't take any parameters. [9]

Meetings

You can read the notes from the last meeting with the VisualEditor team.

You can join the next weekly meeting with the VisualEditor team. During the meetings you can tell developers which bugs are the most important. The meeting will be on March 4 at 16:00 (UTC). See how to join.

You can join technical meetings in France and Mexico this year. You will be able to ask for help if you can't pay yourself. [10]

The approach of a new tool developed at the hackathon in Bern is really interesting. It limits itself to five generations and it shows pictures of the people. It is nice to see Princess Catharina-Amalia.

It is also great to notice that you can have the same information in, for instance, Chinese or Russian. When you click on one of the persons in the genealogy, it will produce the genealogy for that person.

At this time the new tool is very much in development. It is great to show why hackathons are so relevant. Thanks, GerardM

“First Women, Second Sex: Gender Bias in Wikipedia”

“it is not women’s inferiority that has determined their historical insignificance; it is their historical insignificance that has doomed them to inferiority” ~ Beauvoir

The problem of the Gender Gap in Wikipedia can mean several things; a gap in editors, or a gap in the content, and of course the relationship between the two. An arXiv preprint titled “First Women, Second Sex: Gender Bias in Wikipedia” [1] addresses the gap on the content side, with justification by many Simone de Beauvoir quotes. The authors use an ensemble of three methods—DBPedia metadata, language modelling, and network theory—to show not just inequality in encyclopedia inclusion, but degrees of sexism in how biographies are included. For instance, how different genders meet notability is quantifiably different, as is the centrality of biographies in their link structure.

The initial metadata technique is an inspection of DBPedia data mashed up with a separate dataset from previous research based on pronoun counting techniques. This method is a bit shaky as it relies on the combination of two derived datasets, especially in an era when Wikidata can deliver data closer to the source. Nevertheless, the researchers find that 15.5% of the biographies in their final dataset are of women. Digging further, biographies are separated by subclass: athletes, politicians, military personnel, and all others are more heavily male; only artists and royalty are female-biased. Another finding from this type of infobox scraping is that female biographies are much more likely to have the spouse parameter filled.

Moving into the natural language realm, the paper inspects bigrams of the biographies’ text. The top words associated with men are “played”, “football” and “league”; for women, the top are “actress”, “women’s” and “her husband”. This already starts to hint at the notion that men are notable for what they do, rather than only their static characteristics. To investigate further, Linguistic Inquiry and Word Count (LIWC) and two measures, frequency and burstiness, are employed for semantic classification. The semantic category where male biographies score significantly higher is cognitive mechanisms, which encompasses words like “became”, “known”, and “made”; meanwhile female biographies have significantly more sexual words like “love”, “passion”, and “sex”.
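To make the bigram step concrete, here is a minimal sketch of counting adjacent word pairs across a set of texts. It ranks by raw counts, which is an assumption made for illustration; the paper may rank terms by a different association measure. The function names and the toy sentences are hypothetical and are not taken from the paper or its data.

```python
from collections import Counter


def bigrams(text):
    """Return the adjacent lowercase word pairs in `text` (whitespace tokenised)."""
    words = text.lower().split()
    return list(zip(words, words[1:]))


def top_bigrams(texts, n=3):
    """Count bigrams over all texts and return the n most frequent with counts."""
    counts = Counter()
    for text in texts:
        counts.update(bigrams(text))
    return counts.most_common(n)


# Hypothetical toy sentences, invented purely for illustration.
toy_texts = ["she and her husband moved", "her husband was a painter"]
print(top_bigrams(toy_texts, n=1))
```

On a real corpus one would also strip punctuation and compare the counts between male and female biographies rather than looking at a single group.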

The last domain explored is network structure. Each biography links to and is linked from other biographies, forming a directed graph. The first interesting thing to note is that in chi-squared testing between 4 link types (female–female, female–male, male–male, male–female), only female–female links occur more often than expected. Next, the biographies are ranked by PageRank on this graph, which determines their importance or “centrality”. Any subsetting of biographies by removing the lowest-ranked articles, it is found, reduces the female ratio of the subset below the total figure.
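The comparison of observed and expected link counts can be sketched as follows. This assumes the simplest null model, in which the gender of the linking biography and the gender of the linked biography are independent; the paper's exact test setup may differ. The link counts below are hypothetical toy numbers, not the paper's data.

```python
def expected_counts(observed):
    """Expected link counts if source and target gender were independent.

    `observed` maps (source_gender, target_gender) -> number of links.
    """
    total = sum(observed.values())
    row, col = {}, {}
    for (src, dst), n in observed.items():
        row[src] = row.get(src, 0) + n
        col[dst] = col.get(dst, 0) + n
    return {(s, d): row[s] * col[d] / total for s in row for d in col}


def chi_squared(observed):
    """Pearson chi-squared statistic over all (source, target) cells."""
    expected = expected_counts(observed)
    return sum((observed.get(k, 0) - e) ** 2 / e for k, e in expected.items())


# Hypothetical toy counts: F = female biography, M = male biography.
toy_links = {("F", "F"): 30, ("F", "M"): 70, ("M", "F"): 70, ("M", "M"): 830}

expected = expected_counts(toy_links)
for cell in sorted(toy_links):
    print(cell, "observed", toy_links[cell], "expected", round(expected[cell], 1))
print("chi-squared:", round(chi_squared(toy_links), 2))
```

In this toy table the female to female cell has 30 observed links against 10 expected, so it is over-represented, while the two mixed cells fall below expectation.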

The authors wrap up their conclusions within the context of feminist theory. They argue the notion of gender roles is evident in Wikipedia in the way that metadata shows that men are more often known to be sportspeople, and women to be artists, royalty or spouses of someone else. Likewise the language of biographies is biased. That “her husband” and “first woman” are top terms in female articles indicates a failure in the Finkbeiner test. Furthermore the authors claim this exhibits “objectification” in light of the evidence that the “cognitive processes” of men were shown to be more significant than women, and that the “sexual” category is the only one in which women are more frequently described than men. Finally, as viewed from the network structure results, female biographies are less central to the encyclopedia. This is said to be because of historical philosophy and today’s notability guidelines, that “reason and objectivity are gendered male”—a feminist metaphysical view. The explanation of female articles tending to link to other female articles more than expected, the authors imagine, is due to women-led gender gap addressing efforts.

Overall, this article provides a wide variety of methods to measure the gender gap, which proves a high-level point from many perspectives. It is situated in feminist thought, but multiple returns to Beauvoir make the final analysis seem superficial and generic. Additionally, the simplifying assumptions of English-only and derived datasets leave open the criticism that the larger points cannot be disentangled from a few extra biases introduced by language- and processing-inherited lenses. The authors admit as much in their limitations, where they also acknowledge not questioning the gender binary. What we have here, though, is an increment to a growing pile of methods and techniques proving the gender gap which, ideologically, does not need, but can always benefit from, additional statistical legitimacy.

Wikipedia’s SOPA Strike considered as international political movement

A chronology of the events leading up to the SOPA Strike on Wikipedia is presented. The author then analyzes Wikipedia’s forums debating whether and how to restrict access to the site for a day. Debate participants are classified by such characteristics as national origin, history of editing Wikipedia, and stated arguments for and against. Simple quantitative analyses of population percentages and relative contribution are performed. Konieczny then tests various hypotheses about the nature of the protest, to see which one fits the facts.

Konieczny shows that experienced Wikipedians were generally supportive of a protest but were more likely to express misgivings about losing neutrality. Americans also participated in a greater proportion than their prevalence on the English Wikipedia. However the process also allowed non-US citizens and free culture idealists to have significant leverage over the debate on Wikipedia, and thus on American national politics. Konieczny tries to show that Wikipedia is thus an international social movement in the broader free culture movement. Konieczny ends the paper with a speculation that the many pro-blackout single-purpose accounts may reflect a new political consciousness among the young and internet-savvy.

Konieczny’s analysis gives us a very detailed, fascinating picture of what arguments were made in public on Wikipedia forums during a crucial few weeks. However, this may omit some of the most influential discussions, by insiders, taking place person-to-person and in chat rooms. The paper also omits discussion of the influence of the Wikimedia Foundation, as an American institution responding to an American legal threat.

When Konieczny asserts the existence of a rising transnational “Net Generation”, he’s presented very little evidence. A skeptical or quietist Wikipedian might still conclude that the encyclopedia wasn’t acting as an organ of democracy, but was briefly overrun by a Twitter trending topic. If Konieczny is right, we may see other internet-based communities also being pressed into service, or more permanent institutions being developed to serve this new community.

Full disclosure: I (NeilK) was intimately involved with the SOPA Strike movement on Wikipedia, as a technologist on the WMF staff, and as a concerned Wikipedian who weighed in on the very forums analyzed in this paper, in favor of a blackout.

Assignment designed to convince students of Wikipedia’s “fundamental untrustworthiness” achieves the opposite

An article in Communications in Information Literacy[3] reports on the outcome of a senior-level course at Duquesne University where students “created or modified a Wikipedia entry and tracked the modifications made by others to the entry, while they also explored the concept of the ‘wisdom of crowds’ in contrast to the ‘wisdom of experts’ through the course readings and discussions”. The class also wrote a new article collectively (Paramount Film Exchange (Pittsburgh)), and engaged in various breaching experiments. E.g. “the instructor inserted a defamatory falsehood into the page of Luke Ravenstahl, the mayor of Pittsburgh at the time, and asked students to see how long it took the falsehood to disappear. Within five minutes, it was gone.” One student created an article that “seeks to promote a specific company, Accord Curtains, and it is purposefully manipulative.” Another student vandalized an article about an NFL player and “Not even 5 seconds later, I had a message from a Wikipedia policeman informing me about the repercussions of doing such a thing to a Wikipage… It really opened my eyes as to how incredible and powerful the internet is to society.”

Students subsequently wrote papers answering the question “What are Wikipedia pages good for?”. Two and a half years after the class, participants were asked what they had learned about Wikipedia from the assignment for their post-college life. Five of them responded (a rather small sample, a limitation admitted by the authors), largely sticking to the judgment they had expressed in their original papers, reporting that “they came into the class convinced that Wikipedia was an unreliable source but that learning about the creation and community editing of Wikipedia pages made the site more reliable to them.”

In the paper’s conclusion, the authors comment:

“The instructor came into the unit assuming that he would be ushering students into an epiphany: Wikipedia, a source they loved and relied upon and rarely questioned, was actually rife with junk information because anyone—even they—could change anything at will. … How this failed! The students took away the pragmatic lesson that Wikipedia was generally reliable, almost always useful, and that its self-policing mechanisms were mostly effective, particularly when it came to popular or especially controversial pages.”

Similar findings are reported in an unrelated case study, titled “Attitude Changes When Using Wikipedia in Higher Education”[4], which involved 23 students at Williams College, evaluating their “attitudes before and after participating in collaborative wiki assignments. Results from the study showed a statistically significant positive shift in attitudes [about Wikipedia and wikis in general] before and after using the wiki.”

Reasons for contributing: Ego vs. social norms in the US and South Korea

This study,[5] roughly, asks why people are uploading (contributing) content to Wikipedia, comparing respondents from two culturally different countries, namely collectivist South Korea and the individualistic United States. It uses the usual convenience sample of college students (reached through an online survey). In a 2012 survey involving only Korean students (previous coverage: “Do social norms influence participation in Wikipedia?”), the authors had found that users might be motivated by the fact that “uploading content on Wikipedia is a socially desirable act”.

In the present study, the authors test whether a number of factors are positively correlated with intent to upload content on Wikipedia, based on the psychological theories such as theory of planned behavior, situational theory of problem solving, and roles of ego involvement (which represents the self-concept of individuals), subjective norm (a person’s perception of the social pressure to perform or not to perform the behavior in question), and descriptive norm (beliefs about what is actually done by the majority of one’s social circles).

In total, the authors present nine hypotheses. Ego involvement is found to be highly significant, but not differentiating between the two cultures, which the authors interpret as an indicator that globalization and the Internet are bridging the cultural gap, an interesting conclusion that deserves further discussion. The norms are found to be mostly irrelevant (only the descriptive norm is significant for the American sample group, and neither is for the Korean one, contrary to the prior studies on Korean Internet users with regard to the subjective norm), as is the attitude on uploading behavior. Another possible explanation offered by the authors regarding the small difference between the two cultures concerns the individualistic values embedded in, or self-oriented nature of, Web 2.0 applications and social media, and the authors repeat their proposition that it is likely due to globalizing factors (suggesting that the young Korean generation, despite living in a collectivist culture, is significantly affected by individualistic global media). Overall, the authors conclude that cultural differences play a relatively small role in explaining the differences in American and Korean attitudes towards uploading content to Wikipedia.

The study also reports on the interestingly low popularity of Wikipedia in South Korea: only about 50% of Korean students used Wikipedia, whereas almost 99% of American students did. The authors did propose some interesting explanations for this finding (such as a hypothesis that uploading content on Wikipedia might be regarded as a challenge to the established authority of traditional encyclopedias), but unfortunately they are not backed up with any significant evidence. Given South Korea’s popular image as one of the most advanced countries when it comes to Internet use, the issue of Wikipedia’s poor popularity there—as the authors note themselves—is one that is worth investigating in future studies.

Undergraduates confused by references in Wikipedia articles

It is no surprise that students like to use Wikipedia. A paper[6] in New Library World adds to the debate on the perceptions, motivations, and attitudes of students who use this site by asking the following research question: “How do undergraduates actually use Wikipedia and how does this resource influence their subsequent information-gathering?” The study used the usual convenience sample of 30 American undergraduates, who were given a topic (Internet privacy), directed to the corresponding page, and asked to draft a paper on that topic, using Wikipedia as their starting point. Of particular interest to us are the author’s comments on Wikipedia’s references. First, there’s the (unfortunately, short and unjustified) comment that “it is common for Wikipedia articles to have two or more “Notes” and “References” sections, which [is] confusing”. Second, that “following Wikipedia references were least preferred as next steps in the research process”, about as likely as “going to the library catalog”, and less so than “going to Google for more information,” “accessing the library’s databases”, or simply “returning to Wikipedia”. When asked which Wikipedia references they would follow if they were to do so, there was a significant preference for the references cited first, regardless of their quality. A number of respondents expressed an opinion that first references are somehow “better”, not realizing that Wikipedia footnotes are ordered simply by the order they appear in the article. Regarding their use of Wikipedia itself, “respondents overwhelmingly indicated that they used Wikipedia because it was easy to access” (similar to Google), thus displaying a marked preference for convenience, visibility and accessibility over authority and quality of the source or their bibliographies. 
The author also notes that while the students understand that, in theory, scholarly sources are the best (and better than Wikipedia), they are more interested in “reasonably good” than “accurate” information, either because of difficulties in accessing or interpreting the “most credible” sources, or perhaps because of their skepticism towards authority.

The author concludes that one of the best solutions is to involve students in the process of creation and editing of Wikipedia pages, though she sees that as a method to educate students about Wikipedia’s imperfections, rather than as a way to improve Wikipedia’s quality, a task she seems to regard as better suited for faculty and librarians. She also offers some worthwhile suggestions to “Wikipedia developers” regarding the goal of pursuing collaboration with academic libraries, by noting that “it may be worth for Wikipedia to develop a visualized ranking mechanism for its references”, an idea that is certainly worth discussing further.

Briefly

ClueBot as a rebel among conquerors, followers and cowboys

There are four archetypes of Wikipedians on featured articles: Conqueror, Follower, Rebel, and Cowboy, according to the article “Measuring Creativity of Wikipedia Editors”[7]. The study investigated the quantity and rate of change of edits among editors over time, paying attention to their relative positions. The article describes the four personas of editors on the article Boston. A conqueror shows strong bursts of activity, sustains high volume over time, and is a first mover. A follower is low volume, but still sustained, and positively correlated to a conqueror. A rebel (which, hilariously, they found the software ClueBot to be) is low volume, sustained, but negatively correlated to a conqueror. Lastly, a cowboy is erratic with spiky contributions, and uncorrelated to other users.
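
As a rough illustration of how the four personas could be told apart, the Python sketch below labels an editor from a per-period series of edit counts and its Pearson correlation with a reference conqueror's series. This is not the paper's method: the function names, the volume threshold, and the correlation cutoff are assumptions chosen for illustration, and the sketch ignores burstiness and first-mover timing entirely.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def classify_editor(edits, conqueror_edits, high_volume=50, corr_cutoff=0.3):
    """Label a per-period edit-count series with one of the four archetypes.

    The thresholds are hypothetical defaults, not values taken from the study.
    """
    if mean(edits) >= high_volume:
        return "conqueror"  # sustained high volume
    corr = pearson(edits, conqueror_edits)
    if corr >= corr_cutoff:
        return "follower"   # low volume, moves with the conqueror
    if corr <= -corr_cutoff:
        return "rebel"      # low volume, moves against the conqueror
    return "cowboy"         # low volume, uncorrelated with the conqueror
```

For example, with a conqueror series of `[80, 90, 70, 85, 95, 75]`, an editor whose counts rise and fall in the opposite direction at low volume would be labelled a rebel under these assumed cutoffs.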

This study is not very broad in terms of the number or types of articles in question: only 79 articles were considered. And given the naming of their archetypes, clearly the authors aren’t aware that Wikipedians have already transcended into classifying themselves by an entire ecosystem of WikiFauna.

Using Wikipedia to correct public misconceptions about Africa

An article titled “Wikipedia for Africanists”,[8] coauthored by Hans Muller, a Wikipedian in Residence at the African Studies Centre in Leiden (Netherlands), describes the usefulness of Wikipedia for that academic discipline: “Using Wikipedia, Africanists can benefit in two ways: as readers they can quickly obtain a sourced but non-academic outline of topics of interest, and as outreach writers, they can inform the public worldwide about recent insights and attempt to solve (the many) misunderstandings on African topics with unprecedented efficiency.”

“Use and Perception of Wikipedia among Medical Students in a Nigerian University”[12] From the abstract: “[In a survey with 60 respondents,] 91.7% of the medical students have used Wikipedia;… 50.9% of the students use Wikipedia to complement lecture notes, 43.6% for research project as well as to complete class assignment, 14% of them use it to modify content of articles; … the challenges faced by the students are scantiness of information of some articles, unavailability of/inability to obtain articles on some topics from the site, and inaccuracy/unreliability of content of articles.”

“Where Non-Science Majors Get Information about Science and How They Rate that Information”[13] From the abstract: “We report on a study of 400 undergraduate non-majors students enrolled in introductory astronomy courses at the University of Arizona … Overall, students reported getting information from a variety of online sources when looking up a topic for their own knowledge, including internet searches (71%), Wikipedia (46%), and online science sites (e.g. NASA) (45%). When asked where they got information for course assignments, most reported from assigned readings (82%) but a large percentage still reported getting information from online sources such as internet searches (60%), Wikipedia (30%) and online science sites (e.g. NASA) (20%). Overall, students rated professors/teachers and textbooks at the most reliable sources of scientific information and rated social media sites, blogs and Wikipedia as the least reliable sources of scientific information.”

“Integration of multiple network views in Wikipedia”[14] From the abstract: “[We analyze] the networks of editors interacting on Wikipedia pages. We propose the prediction of article quality as a task that allows us to quantify the informativeness of alternative network views. We present three fundamentally different views on the data that attempt to capture structural and temporal aspects of the edit networks.”

“Experimental evaluation of learning performance for exploring the shortest paths in hyperlink network of Wikipedia”[15] From the abstract: “…in three separate learning sessions of 20 minutes students read series of 62 sentences built by using 22 unique hyperlinks that form the eleven shortest paths and answered pre-test and post-test multiple-choice questionnaires about recall of sentences … For experiment group (n=24) 62 sentences were chained in such an ordering that corresponds to traversing cumulatively a series of associative trails leading from concept Tourism in Malta to concept Euro coins of Malta along alternative parallel shortest paths in hyperlink network of Wikipedia category Malta. For control group (n=10) same sentences had randomized ordering. For both unique hyperlinks and consecutive pairs of hyperlinks experiment group reached higher degrees of recall than control group”. (See also Wikipedia:Wiki Game)

“Educational exploration based on conceptual networks generated by students and Wikipedia linkage”[16] (by the same author)

“Citations to Wikipedia in Canadian Law Journal and Law Review Articles”[17]

“Identifying Featured Articles in Spanish Wikipedia”[20] From the abstract: “…the first study to automatically assess information quality in Spanish Wikipedia, where Featured Articles identification is evaluated as a binary classification task. Two popular classification approaches like Naive Bayes and Support Vector Machine (SVM) are evaluated …”

“Predicting the Popularity of Trending Articles in the Arabic Wikipedia Using Data Mining Techniques”[21]

“Revision history: Translation trends in Wikipedia”[22] From the abstract: “This paper uses Mossop’s taxonomy of editing and revising procedures to explore a corpus of translated Wikipedia articles to determine how often transfer and language/style problems are present in these translations and assess how these problems are addressed.”

We plan to improve the blog’s contents and features in coming months, based on what you and others tell us.

Here are our goals for this survey:
• understand who our current blog users are
• find out what you like / don’t like, by user group
• learn what other users think of the blog
• identify key content and feature improvements
• inform our editorial strategy for communications

User groups to be surveyed include blog authors and readers, contributors, developers, and donors, as well as foundation staff.

We will run this survey for a couple of weeks and post the results at the end of March, both here on the blog and on Meta.

February 27, 2015

For many categories it is obvious what they should include. The members of "Indiana State Senators", for instance, will all be human. So when we know that an article in that category is about a human, we can safely deduce that the person holds or held a position as "member of the Indiana State Senate". We can do similar things for alumni or members of a sports team.
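
This deduction can be sketched in a few lines of Python. It is an illustration only: the `CategoryMember` record, the `harvest_category` function and every item ID in the usage example are hypothetical placeholders, and in practice the member data would come from a Wikipedia category listing. The only Wikidata identifiers assumed here are P31 (instance of), Q5 (human) and P39 (position held).

```python
from dataclasses import dataclass

HUMAN = "Q5"            # Wikidata item for "human"
POSITION_HELD = "P39"   # Wikidata property "position held"


@dataclass(frozen=True)
class CategoryMember:
    item_id: str            # Wikidata item linked to the article
    instance_of: frozenset  # P31 ("instance of") values already on that item


def harvest_category(members, prop, value):
    """Return (item, property, value) statements for members known to be human.

    Members that are not marked as human, or that have no P31 value yet,
    are skipped rather than guessed at.
    """
    return [(m.item_id, prop, value) for m in members if HUMAN in m.instance_of]


# Hypothetical category contents; none of these IDs are real items.
members = [
    CategoryMember("Q_SENATOR_A", frozenset({HUMAN})),
    CategoryMember("Q_LIST_PAGE", frozenset({"Q_NOT_HUMAN"})),
    CategoryMember("Q_UNKNOWN", frozenset()),
]
statements = harvest_category(members, POSITION_HELD, "Q_INDIANA_SENATE_MEMBER")
```

Only the first member yields a statement here; the other two are left for a human to review.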

When we harvest such categories regularly, Wikidata will become more inclusive than any Wikipedia. This is because we can harvest from similar categories from any Wikipedia.

We can, and we should, harvest Wikipedia categories regularly. It will enrich Wikidata and we will become more aware of the full scope of the information held in all the Wikimedia wikis. Thanks, GerardM

Armenian students participating in WikiCamps divide their time between editing Wikipedia and physical activities. Here they use their bodies to spell out “Wiki Camp”. Photo by Beko, freely licensed under CC BY-SA 4.0

WikiCamps is a new educational project organized by Wikimedia Armenia to encourage young people aged 14-20 to edit Wikipedia. This program provides a healthy balance of work and fun, ensures the safety of participants, and seems particularly effective for engaging teenagers. Making this possible wasn’t easy, and it took time and effort.

So far, four WikiCamps have been held in Armenia. Each one took several months to study and plan — and required quite a bit of work to implement. Two camps were held in the summer of 2014, one took place in the fall and another was held in the winter. The summer WikiCamps were attended by 135 students, who created 5,425 new articles and added more than 22 million bytes of content to the article namespace on the Armenian Wikipedia. The fall and winter WikiCamps had 73 participants, who created more than 997 Wikipedia articles, adding more than 3 million bytes as well as improving 576 articles on Wiktionary.

It is generally easier to organize training events for somewhat older users (e.g., university students) than to work with younger participants. Adults are better acquainted with research, which is the cornerstone of Wikipedia, and do not need as much attention to contribute quality content. This is consistent with worldwide editing patterns, which suggest that the majority of Wikipedia editors tend to be adults. However, this is not the case in Armenia, where extensive training sessions for teenagers were held with very promising results.

Press conference about WikiCamps in Wikimedia Armenia. Pictured from left to right are Lilit, Mher and Susanna. Photo by Beko, freely licensed under CC BY-SA 3.0

This project started with a budget of US $2,000, which was left over from a Wikimedia Foundation grant to Wikimedia Armenia. Susanna Mkrtchyan, President of the Armenian chapter, suggested using these funds to hold camps for school children aged 14-20, so they could learn to edit Wikipedia in a collaborative and safe atmosphere. “When I mentioned our experience with WikiCamps, everyone was excited,” says Susanna, “But we must be careful when working with children. I’m a grandmother and an educator, so people trust me with their children. Safety was our first priority during the camp. We kept an eye on everyone during the day and we closed the camp at 10 PM.”

Every day the WikiCamp started with warm up exercises and sports. Editing Wikipedia came next, but for only 4 hours a day. The rest of the day was spent practicing favorite hobbies. Mkrtchyan said, “They wanted to contribute more to be the best editors, but we didn’t let them. We wanted them to dance, do sports, play music. We wanted to always keep them active.” On the other hand, a competition for the highest contributor was held every day, to encourage campers to do their best during editing hours. It was extra work for the organizers to check all participants’ contributions daily, but it helped campers focus on editing during assigned hours, so they wouldn’t forget the main purpose of the camp: to actively contribute to Wikimedia projects.

This well-divided timetable successfully encouraged students to add high-quality content to Wikimedia projects in a short time. It also helped impart a love for Wikipedia and for contributing to free knowledge. Mixing editing with fun activities and not letting them exceed the time limit increased the campers’ passion for editing Wikipedia. And every night, the campers also engaged in another competition: composing the best song about Wikipedia. They used known song melodies and wrote new lyrics about being in love with Wikipedia. During each camp, students were also sent on two expeditions to discover new places. “They don’t just contribute to Wikipedia. Their character also changes, which is more important for them,” Susanna notes.

Project leaders helped newbies choose articles to edit, but everyone had the option to select a topic to research — or translate articles from other languages. Younger campers were invited to edit Wiktionary, since it is a simpler assignment — while the older campers edited Wikipedia (many of them had already participated in WMAM’s Education Program). This well-prepared division of work, combined with daily competitions, provided two important motivations for engaging participants. Camp fees were covered for those who made the highest contributions in previous camps (or on Wikimedia projects in general), while newbies paid their own fees. This seems like an effective reward system for participants.

There was not much money to advertise the camps. Instead, Mkrtchyan wrote press releases and invited journalists to press conferences in which Wikimedia Armenia announced each of the WikiCamps. Social networks replaced usual advertising methods, as the project depended mainly on word of mouth, press coverage, and social media, rather than customary high-cost advertising campaigns. Some of these ideas may not be possible elsewhere, but this approach worked well in Armenia.

Wikimedia Armenia’s WikiCamp project was recognized as one of the “coolest Wikimedia Chapter projects” of the year at Wikimania 2014. Our chapter agrees and we are very excited to be sharing this story and experience with our community.

These Armenian WikiCamps exceeded all our expectations. This experience shows that thinking out of the box to empower users can be very productive, with the right amount of preparation. This pilot contributed a large amount of high-quality content, while keeping participants active and engaged as Wikipedia participants.

Since early 2012, the Wikimedia Foundation has been creating partnerships with mobile carriers in selected regions of the world to waive data charges for accessing Wikipedia. To many people, the utility of such a program might be hard to understand. That’s why the Wikimedia Foundation works to create awareness of the Wikipedia Zero program, so that mobile carriers and mobile users can discover what free access to Wikipedia means for sharing knowledge across the globe.

The video above (which was narrated by Wikipedia founder Jimmy Wales and animated by Sasha Fornari) is one way to create awareness of Wikipedia Zero; it explains how the program works by using symbols, narration, animation and music — to communicate a complex concept in an inclusive way. The script was written so that anyone with access to video editing software and a microphone could re-record the dialogue track in their own language, and then mix it with the music from the video (available here). Captions have been created in English and the open captions on Wikimedia Commons allow for the timecode to be copied and the script to be translated. Below is the script for the video, which runs at just under two minutes:

Together, we are creating the most comprehensive encyclopedia that has ever existed – Wikipedia. It’s also free; free to read, free to edit, free to share. It is available in hundreds of languages, and it’s accessible to anyone with access to the internet or a mobile phone. Roughly 6 out of every 7 people today have mobile access. Mobile technology is the future of knowledge sharing, it has the potential to bring Wikipedia to billions of people. However, despite Wikipedia’s free content, most people simply can’t afford the data charges to access Wikipedia. That’s why the non-profit that supports Wikipedia runs a program called Wikipedia Zero, which works with mobile carriers to waive the data charges for accessing Wikipedia. Removing the cost of accessing Wikipedia may sound trivial, but it’s one small change that makes a huge difference. Students will do their homework and research careers. Doctors will study treatments. Small businesses will find knowledge to innovate. People will better understand their own history, society, and culture. We invite mobile operators all over the world to make knowledge truly free. Wikipedia belongs to all of us. Imagine a world in which every single human being on the planet has equal access to the sum of all knowledge.

Thanks to Jimmy Wales for providing narration, to Sasha Fornari for his motion graphics, and to the Wikimedia Foundation’s communications team members who developed the script and gave feedback on all the iterations of the video — as well as to the people who contributed their designs to the noun project that Sasha remixed, and Andy R. Jordan for the music.

Since my very early days of Open Source contribution, which go back to my early days of college life, Red Hat had been my dream organization! The reasons behind this were probably many. During those initial days of my Open Source journey, I had been inspired by many Open Source advocates, and most of them were Red Hat employees. Also, many of the events that I had attended during my early college days used to be held at the Pune Red Hat office.

Well, as long as I had this dream of getting the appointment letter from Red Hat, life was exciting....but somehow I never thought about how I would react once I had it!

On Monday, when I walked into this office building, I was just as scared as a child going to school for the first time. Things then happened one after the other...each one more exciting than the previous one! New people, new space, new desk, new laptop, new monitor, new desk phone...and finally an introduction to some new work. Wait, did I mention the welcome kit? It's like a complete package...with all the stationery one could need at the office. Well, I love stationery...and as a friend of mine rightly said, it was indeed Disneyland for me.

Can't speak much about the work, since I have not done much yet....but definitely the people and the place are just what I had dreamt about!

Labs is a wonderful and successful project; more virtual machines are added all the time. More data is produced all the time and more people rely on it all the time.

Sounds good? It is!

From a management point of view it becomes increasingly problematic because for many of the most valuable Wikimedians it has become a production resource and, as Labs is growing really quickly, it easily escapes the boundaries set earlier. Staffing, hardware: it could all be better and it should all be better.

Having the best possible Labs will grow Labs even more. The best will outwit and outperform expectations. Classical budget thinking is a disservice to what we may achieve: share more information as widely as possible. One approach is to maintain a risk analysis of the services provided by Labs. It will help management to manage, to think and to use funds when the need and the justification are bigger than the budget.

Today new virtual machines have been started that are beginning to produce ZIM files based on the latest dumps. This will improve off-line reading of our projects a lot. In future, the ZIM files will be and remain fresh.

February 25, 2015

The new Education Toolkit provides a blueprint for implementing successful Wikimedia programs. Photo by María Cruz, licensed under CC-BY-SA 3.0.

About the Education Toolkit

The Learning & Evaluation and the Education teams at the Wikimedia Foundation, together with the Education Collaborative, have created the Education Toolkit, the first in a series of program toolkits — guides for implementing effective Wikimedia programs. The program toolkits aim to share best practices among the experiences of Wikimedia program leaders from all over the world, to create a blueprint for designing successful Wikimedia programs.

From beginning to end, the Education program toolkit walks users through different phases of an education program:

• Best practices for planning new and growing programs and developing partnerships with educators and the Wikimedia community
• Tips for finding resources and accessing tech support for running a program
• Ideas for teaching and assignments
• Strategies for evaluation
• Ways to connect with other community members

The content is organized based on learning area and topic, using learning patterns, problem and solution pairings, to help complete the toolkit. Those newer to the education program can begin at the start and follow through each step while more experienced program leaders can easily jump to the section that is most relevant to their work at that time.

Efforts to better understand programmatic work at the Wikimedia Foundation started in 2013. Through a series of investigations, workshops, and community consultations, the Learning and Evaluation team began to map the most replicated information about Wikimedia programs. The Wikipedia Education Program has been very popular around the world and the way the program is carried out has changed. Through an analysis of shared goals, common struggles and successes, a number of key lessons were captured to create the Education Toolkit — the first toolkit from the Learning & Evaluation team this year.

An education program manager consulted about this project wrote about its benefits: “Education programs are mutually beneficial activities with a high potential for meaningful impact. While students may benefit in a number of ways, their contributions benefit Wikimedia projects and users around the world.”

What do we know about the Education Program?

Educators and school administrators find contributions to Wikimedia projects to be a low cost way of incorporating and teaching technology in the classroom. Students also learn important objectives such as research and writing skills, information and media literacy.

In 2014, the Wikipedia Education Program Team at the Wikimedia Foundation (WMF) began mapping more than 70 educational programs in 66 countries, almost half of which are in the Global South. This mapping was shared in the team’s Quarterly Review. The mapping revealed that although 71% of assignments are on Wikipedia, many require students to translate rather than write or expand articles. The other 29% of student assignments contribute content to Wikimedia Commons and other sister projects. Further, unlike the US/Canada program, which focuses on university students who complete assignments for academic credit, education programs in other countries serve students of all ages: 60% serve participants at universities, 20% at secondary schools, and 13% through teacher training programs. We also learned that many students, in different parts of the world, are learning to contribute to Wikimedia projects for fun: only 30% of education programs are part of a formal course, and 23% are part of structured extracurricular programs such as Wikicamps and Wikiclubs.

Research for the interviews included reading reports, blog posts, newsletters and combing through threads on the Education-L mailing list. And we were blown away by the rich anecdotes, stories of successes, discoveries, hacks and strategies that Collab members shared in interviews.

Most interestingly, many program leaders began their stories by saying, “We are the only ones who are working on this kind of program.” In fact, the interviews uncovered several similarities across programs in different countries. By curating learning pattern experiences, and organizing them into a program toolkit, we hope to pave an easier way for program leaders to collaborate in identifying common experiences and effective strategies.

We believe that this type of resource will make it easier for program leaders throughout the world to develop more effective educational programs, without having to start from scratch. In addition to sharing lessons learned, the Education Toolkit will become a central place for people to start conversations about challenges they face running programs and share experiences that others can benefit from. Since learning patterns (like Wikipedia articles) can be created, and edited, by anyone, we hope that this toolkit will expand as more and more people use it, learn from its lessons, and share new lessons!

Many people argue there’s a crisis of information in the 21st century. Facts come wrapped in claims and counterclaims, they say, with dubious sourcing and various grains of credibility. This generation of students sees digital literacy as a core skill set. How can instructors help students determine which facts are trustworthy?

Thomas Leitch’s book, “Wikipedia U: Knowledge, Authority and Education in the Digital Age,” examines questions of authority that Wikipedia skeptics often put front and center in their concerns. Leitch dedicates significant time to addressing those skeptics. Along the way, he shows how Wikipedia assignments can promote not only digital literacy skills, but an opportunity to instill a sense of personal mastery over knowledge that is unparalleled elsewhere in academia.

Leitch outlines connections between traditional knowledge production in academia and on Wikipedia. For students to evaluate online information, Leitch suggests, they should try producing it themselves. Wikipedia, he writes, is “an unexcelled laboratory for examining and comparing different models of authority” (page 6).

Leitch draws distinctions between “academic learning,” knowledge received in a classroom, and “higher learning.” Keeling and Hersh define higher learning as embracing “the active and increasingly expert use of that knowledge in critical thinking, problem solving and coherent communication” (86). Actively using Wikipedia, writes Leitch, becomes a kind of “apprenticeship” for students to practice exercising their own sense of authority over knowledge, what Char Booth calls “learning through Wikipedia.”

Traditional term papers, Leitch suggests, are a practice in passive, received knowledge. Leitch suggests that direct engagement with Wikipedia reveals how authority is created and defined.

Most undergraduate students aren’t given opportunities to practice their sense of authority and mastery. Students rarely confront challenges to their received knowledge in a classroom, meaning they have few opportunities to test or expand knowledge in the real world. Without examining the sources of their own authority, it’s difficult for them to judge the authority of others.

Wikipedia becomes an experience, a field trip to the public sphere. Students engage in discourse firsthand, and see how information emerges from a consensus. Students witness the oftentimes heated debates between perspectives of knowledge. Knowledge, they learn, isn’t just something to consume. It’s something to engage with. Even, Leitch suggests, something to play with.

Leitch highlights play as an often overlooked aspect of Wikipedia in academia. Wikipedia allows student editors to act out, and engage with, ever-changing claims to authority. When a student contributes to Wikipedia, they put their knowledge into play. Other editors can pick it up, and the play begins.

For an example, Leitch looks to Wiki Ed board member Bob Cummings’ Wikipedia assignment. Leitch writes that students “focus on Wikipedia as a platform for refining and extending their own authority, not as an authority to be accepted, questioned, or dismissed itself” (104). When claims to authority compete, students may defer to any outside claim. By contributing to a shared resource, they practice their authority. In turn, student editors see, firsthand, what determined their own claim to that authority. This prepares them, better than abstract exercises, to test the claims of others.

The history of this play is right there on any article’s talk page. Leitch encourages using this history for insight on how Wikipedians build knowledge. Examples range from arguments over “Grease” performer Olivia Newton-John’s national identity (British or Australian?) to deciding which assessments of the Bush administration are relevant.

This consensus-building process should be familiar to academics. Leitch suggests that Wikipedia isn’t so distinct. Rather, it reveals the evolution of knowledge unfolding in real-time, at a faster pace. Sources are still cited, evaluated, and critiqued. Wikipedians, and great scholarship, both “stand on the shoulders of giants.”

Wikipedia assignments challenge student editors to articulate and communicate their learning from many angles. This is a hallmark of higher learning. When students produce knowledge, they see how knowledge is produced. When student editors earn authority on knowledge, they see how authority is earned. Wikipedia is a rare opportunity to practice both.

February 24, 2015

What happens when photography students know they’re working for publication, rather than a classroom exercise? What if that publisher was the sixth-most visited website in the world, where millions of people could see those images?

That’s what happens when you combine Wikimedia Commons and Wikipedia with photography assignments.

We’ve already seen measurable benefits for students who contribute to Wikipedia. Students love that their work makes a difference. It’s visible, and has a tangible effect on the people who read it. This spring semester alone, 4.3 million people saw the work of student editors.

It’s a simple twist on a familiar assignment. Students find articles on Wikipedia that aren’t illustrated. That can mean anything, from household items to local monuments or buildings. Then, they upload their images to Wikimedia Commons through a Creative Commons license.

If you’re curious to see the range of photos that apply, our colleague Ryan McGrady’s course at North Carolina State University has some excellent examples. Not only are there some wonderful images from local institutions and events, but there are some contributions from a student’s visit to Damascus, Syria, from just before the city was transformed by civil war.

Students hold themselves to a higher standard when they shoot for a real audience. It’s a chance not only to practice, but to show what they already know. They develop their skills while showcasing what they’re capable of.

Wikipedia assignments can adapt to fit any class, even if it’s already underway. We can help you start immediately, with staff and print materials to help you have a successful assignment.

If you’re looking for a new way to engage your students in real-world, practical problem solving for photography courses while teaching digital literacy, contact Ryan McGrady, ryan@wikiedu.org.

For Black History Month, many new Wikipedia articles about black culture were created in edit-a-thons across the United States, such as this at the “BlackLivesMatter” event at the Schomburg Center in New York City. Photo by Terrence Jennings, under CC BY-SA 4.0

Black History Month

Black History Month is celebrated annually in the United States in February, to commemorate the history of the African diaspora. For this occasion, Wikipedians worked together to honor black history and to address Wikipedia’s multicultural gaps in the encyclopedia, hosting Wikipedia edit-a-thons throughout the United States, from February 1 to 28, 2015.

Maira Liriano, one of the key institutional organizers of the #BlackLivesMatter Edit-a-thon in New York, summarized the goals of this project to reporters, as reported on Innovation Trail: “There is a bias and a lack of people of color involved in creating Wikipedia and many subjects are also missing from Wikipedia. So events like today are in part to make people aware of that and then to empower them and give them the information they need to correct that bias.”

Libraries proved to be ideal places for these edit-a-thons. At the Aaron Douglas Reading Room, librarians located reference texts and provided suggestions for further research. A list of Wikipedia articles to edit and create was prepared for the Schomburg Center Edit-a-thon and used by many of the satellite events.

AfroCROWD, Brooklyn

On February 7th and 8th in Brooklyn, kickoff events took place for a new initiative, the Afro Free Culture Crowdsourcing Wikimedia (AfroCROWD), which seeks to increase the number of people of African descent who actively partake in the Wikimedia and free knowledge, culture and software movements. The workshops were open to all Afrodescendants, including but not limited to individuals who self-identify as African, African-American, Afro-Latino, Biracial, Black, Black-American, Caribbean, Garifuna, Haitian or West Indian.

Events were held at the Brooklyn Public Library. Wikipedia trainings and overviews were given in some of the many languages spoken by our target population: French, Garifuna, Haitian Kreyòl, Igbo, Yoruba, Spanish and Twi. Affiliate project pages such as WikiProject Haiti were also introduced — and organizers announced the new Garifuna language Wikipedia incubator, the fruit of a collaboration between AfroCROWD and Wikimedia NYC.

AfroCROWD’s next three events will be HaitiCROWD on 3/14, AfricaCROWD on 4/4 and AfrolatinoCROWD on 4/12. HaitiCROWD will focus on resources in the Haitian Kreyòl, French and English Wikipedias, as well as growing the Haitian Wikipedia, which is now available free of charge to many Haitians in Haiti through the Digicel/Wikimedia Foundation Wikipedia Zero initiative. The workshop series will culminate in an edit-a-thon on June 20th at the Brooklyn Public Library.

SUNY Purchase

A #BlackLivesMatter Edit-a-thon was also held at SUNY Purchase, Westchester County, NY on Saturday February 7th.

We wish to thank all participants who made these edit-a-thons possible! It’s really exciting to see so many new editors join forces to help fill the multicultural gaps in Wikipedia — and to honor black history together.

India is one of the most populous countries in the world, so it is no surprise that it has many universities. The alumni of Indian universities or colleges can be found through a Wikipedia category.

Whenever tools are down I have been adding these alumni to Wikidata. It seems obvious that not all universities and colleges are represented. It is certain that many alumni cannot be found in these categories. This is because there may be no article about them or they have not been included in the category.

It is relatively easy to do this for India given that English is the main language for subjects about India. For China, Russia and Japan it is not so easy. Someone else has to get involved as I do not know the languages.

All of Labs is down again, so this time my customary hyperlinks are sadly absent. Thanks, GerardM

A few days ago, we published the Wikimedia Foundation’s report for the timespan from October to December 2014 (the second quarter of our fiscal year), which you can find in PDF form below. As of today, it is also available as a wiki page and (for easy online presentation) on Google Slides.

This is the first report in a new format. Since 2008, we have been publishing updates about the Foundation’s work on a monthly basis, also on this blog. As announced in November, we are now changing this to a quarterly rhythm; a main reason being to better align it with the quarterly planning and goalsetting process that has been extended to the entire organization since Lila Tretikov became Executive Director in 2014.

The new format reflects this in various ways. For each of the highlighted key priorities, colors (red/yellow/green) indicate clearly whether the quarterly goal was met or not. Besides a slide with overall “Key insights and trends” (see below), there are also “what we learned” sections throughout the document which summarize what the corresponding team considers the most important takeaways informing future work in that area. The report has the form of a slide deck suitable for a 90 minute presentation, keeping the amount of detail limited and linking to corresponding quarterly review meeting documentation for further detail. The Foundation began holding these quarterly team meetings in December 2012 to ensure accountability and create opportunities for course corrections and resourcing adjustments. By now, this process involves almost every WMF team or department.

Please refer to the links above for the full report. But to offer an excerpt from the “Key insights and trends” section (slide 5):

Readership: Globally, pageviews are flat. Mobile is growing, desktop is shrinking. Given a growing global potential audience, this means we need to invest in the readership experience, with focus on mobile.
We have learned that we can move at highest velocity on mobile apps due to their self-contained nature.

Performance: The implementation of HHVM across Wikimedia sites is an engineering success story and demonstrates that dedicated focus in the area of site performance can pay off relatively quickly.

Fundraising: Mobile matters — thanks to focused effort, we were able to increase the mobile revenue share from 1.7% to 16.1% (2013 vs. 2014 year-end campaign).

This being the first report in this new format, we will surely tweak format, content (including the choice of key metrics) and process for the subsequent issues. Comments continue to be welcome here or on Meta-wiki.

Asking Ever Bigger Questions with Wikidata

A New Era

Simultaneous discovery can sometimes be considered an indication of a paradigm shift in knowledge, and last month Magnus Manske and I seem to have had a very similar idea at the same time. Our ideas were to look at gender statistics in Wikidata and to slice them up by date of birth, citizenship, and language. (Magnus’ blog post, and my own.) At first it seems like quite an elementary and naïve analysis, especially 14 years into Wikipedia, but only within the last year has this type of research become feasible. Like a baby taking its first steps, Wikidata and its tools ecosystem are maturing. That challenges us to creatively use the data in front of us.

Doing biography analysis before Wikidata was much harder. To know the gender of an article’s subject you’d resort to natural language processing or hacks like counting gendered categories and guessing based on first name. Even more, the effort had to be duplicated for each language that had to be translated. Now the promise of language-free semantic data, and tools like Wikidata Query and Wikidata Toolkit, are here. The process is easier because it is more database-like: select, group by, apply, and combine.
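As a rough illustration of that select, group by, apply, and combine pattern, here is a minimal sketch in plain Python. The records are hand-made stand-ins for Wikidata items (they are not real query results), and the function name `gender_share_by` is made up for this example; neither Wikidata Query nor Wikidata Toolkit is used here.

```python
from collections import Counter, defaultdict

# Hypothetical records standing in for Wikidata items about people.
# Real items would carry these values as statements.
people = [
    {"gender": "female", "birth_year": 1815, "citizenship": "United Kingdom"},
    {"gender": "male", "birth_year": 1952, "citizenship": "United Kingdom"},
    {"gender": "female", "birth_year": 1867, "citizenship": "Poland"},
    {"gender": "male", "birth_year": 1879, "citizenship": "Germany"},
    {"gender": "female", "birth_year": 1906, "citizenship": "United States"},
]


def gender_share_by(records, key):
    """Group records by key(record) and return the female share of each group."""
    groups = defaultdict(Counter)
    for record in records:                          # select every record
        groups[key(record)][record["gender"]] += 1  # group by key, apply a count
    return {                                        # combine into one result
        group: counts["female"] / sum(counts.values())
        for group, counts in groups.items()
    }


# Slice the same data two ways: by century of birth and by citizenship.
by_century = gender_share_by(people, lambda r: r["birth_year"] // 100 * 100)
by_country = gender_share_by(people, lambda r: r["citizenship"])
```

The same function serves every slicing (date of birth, citizenship, language) because only the grouping key changes.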

With this new simplicity, let’s review what we have imagined so far. Here’s a non-exhaustive introduction to the state of creative question-asking:

Pushing Ourselves to Think Even Bigger

Can we think even bigger if we use more of the available data? Thinking about the fact that every claim may have an attached reference, Markus Krötzsch always wants to know, for a given set of claims, what references must be believed in order to believe the set of claims. With that notion we could look at all the claims associated with all the items of a given language, and thus the required belief system of that language. At this point we could ask: what are the differences in the belief systems of any two languages?

Another way we could test the fundamental principles of knowledge and culture is to consider the chains made by the subclass of, instance of, or cause of properties. Every language is present at different links of each chain. So we can look at the differences in ways in which languages organize a hierarchy of concepts – or if they think it’s a hierarchy at all.

Much fun for logicians and epistemologists. But we can also ask more socially important questions, questions about how language and society relate. What biases do we have that we aren’t even aware of? The method, for which I’ve proposed a PhD, could be conducted as follows. We’re aware of sexism in our societies, and as you’ve seen we’ve started to build a statistical profile of how it manifests in Wikidata. Likewise we’re cognizant of racism and homophobia. We might next look at rates people appear in Wikidata by race and desire. Let’s assume we could train a model to say that these kinds of distributions are types of social biases. Next we could search every property in Wikidata to see if it indicated social bias. If successful we may find overlooked stigmas and phobias in society.

I claim that our theoretical question-answering ability has paradigmatically shifted with the growing up of Wikidata. Soon enough you won’t even need to be a sophisticated programmer to whisper your questions into the system. So next time you’re reading, browsing, querying or displaying Wikidata, challenge yourself to think about how to analyse it too.

In light of the Europeana 2020 strategic plan, what shape should Europeana’s relationship to Wikipedia and the wider Wikimedia community take in the years to come? To find an answer to this question, a Europeana Network Task Force was formed with a mix of people drawn from the Europeana network of GLAMs as well as active members of the Wikimedia community. It was chaired by Jesse de Vos from the Netherlands Institute for Sound and Vision, now at Wikimedia Netherlands, and had a clear mission to reach final recommendations that benefited both Europeana and Wikimedia.

The first task was to create an overview of all past Wikimedia activities that Europeana has had involvement in. This list of past and current projects was built on-wiki (where else?) and shows very clearly the depth and breadth of the existing relationship.

The second half of the six-month review then focused on developing 10 strategic recommendations which would make this relationship even stronger over the next five years.

“Europeana has a long standing relationship with the Wiki community. With the development of the GLAMwiki Toolset, publishing to Wikimedia has become an intrinsic part of our publication policy and we would like to expand on this relationship for the benefit of our data partners. The recommendations in the report of the taskforce amplify that ambition, and we will investigate how we can act upon each of them.”
(Harry Verwayen, Europeana Foundation)

This can be achieved by considering a Wikimedia-component to both current and future projects. Europeana can also play an important role by enhancing relationships between GLAMs and the Wikimedia network, as well as sharing knowledge about practices in each of these communities.

An important aspect of the report was the recommendation that Europeana further integrate its systems and technology with Wikipedia and other Wikimedia platforms.

Wikidata is a key part of this, as a fast-growing project with enormous potential for linking collections, carrying out authority control, and synergy with Europeana’s systems.

The introduction of a dedicated Wikimedia coordinator and ‘product owner’ was another significant outcome. The creation of this position means that each of the ten recommendations can be fulfilled to their full potential. This new team member can also look at the opportunities to integrate Wikimedia in each of the major forthcoming Europeana projects, such as “1914-18”, “Sounds” and “Fashion”.

A final element to the report considers the possibilities for cooperation between Europeana and Wikimedia as they seek external funding for projects, with the possibility of Europeana becoming Wikimedia’s first movement-partner.

Each of the different strands to the report offers groundwork for exciting developments that can form future planning, and we would like to thank all members of the Task Force, as well as numerous others who gave their valuable insights during the creation of this report.

At the École normale supérieure de Lyon we have to do a programming project during the first part of our master’s degree curriculum. Some of us were very interested in working on natural language processing and others on knowledge bases. So, we tried to find a project that could allow us to work on both sides and, quickly, the idea of an open source question answering tool came up.
This tool has to answer a lot of different questions, so one of the requirements of this project was to use a huge generalist knowledge base in order to have a usable tool quickly. As one of us was already a Wikidata contributor, and inspired by the example of the very nice but ephemeral Wiri tool by Magnus Manske, we quickly chose to use Wikidata as our primary data source.

This is why, after four months of hard work from seven people, we are happy to introduce Platypus, the new English speaking interface for Wikidata.

Platypus, the true “Jimbo Alpha”?

Platypus, using advanced natural language processing techniques and Wikidata, is able to answer a lot of questions, from the simple ones like “What is the birth date of Douglas Adams?” to the strange “What are the daughters of the wife of the husband of the wife of the president of the United States?” Currently most questions that may be answered using a single statement from Wikidata are supported. Platypus is also able to do simple spell checking in order to answer questions like “What is the cappyttal of Franse?”.

As computer scientists, we love mathematics, so Platypus is also able to simplify a lot of mathematical formulas written in a natural-like syntax like “sqrt(180)”, in Mathematica syntax like “Sum[1/n^42, {n,1,Infinity}]” or even in LaTeX like “\sum_{i=0}^n i^2”.

Why Wikidata is amazing

Wikidata was a very good choice because its strong database of labels and aliases allowed us to easily find the Wikidata entities matching a given term using the search suggestion API of Wikidata. So it is very easy to map natural-language terms to Wikidata identifiers and then use the statements to answer most simple questions like “When was X born?”.
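To give an idea of what such a lookup can look like, here is a minimal sketch. It assumes the search suggestion API mentioned above is the `wbsearchentities` module of the Wikidata web API; the helper names `search_url` and `candidate_ids` are made up for this example and are not taken from the Platypus code. The response is a canned, trimmed sample so that the example runs without network access.

```python
import json
from urllib.parse import urlencode

API = "https://www.wikidata.org/w/api.php"


def search_url(term, language="en", limit=5):
    """Build a wbsearchentities request URL for a natural-language term."""
    params = {
        "action": "wbsearchentities",
        "search": term,
        "language": language,
        "limit": limit,
        "format": "json",
    }
    return API + "?" + urlencode(params)


def candidate_ids(response_text):
    """Extract (identifier, label) pairs from a wbsearchentities JSON response."""
    data = json.loads(response_text)
    return [(hit["id"], hit.get("label", "")) for hit in data.get("search", [])]


# Canned response reduced to the fields used here (assumed shape of a real reply).
sample = '{"search": [{"id": "Q42", "label": "Douglas Adams"}]}'

url = search_url("Douglas Adams")
ids = candidate_ids(sample)
```

Once a term is mapped to an identifier such as Q42, answering a simple question comes down to reading the relevant statement on that item.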

Platypus is also an amazing excuse to improve Wikidata: questions for which Platypus does not give the answer are often an occasion to add relevant data to Wikidata, and different formulations of questions are sometimes an occasion to add aliases to properties in order to improve their discoverability. It also made us discover vandalism on various Wikidata items. As an example, the result of the query “Barack Obama” was broken for a day because of a change to the English label of its item on Wikidata. After the vandalism was reverted and the cache purged, Wikidata was clean again and this question worked.

We are also looking forward to improvements to Wikidata like the addition of support for quantities with units in order to increase the number of answerable questions.

Conclusion

The student project finished a few weeks ago, but the open source project continues. We are currently working on adding support for other languages like French, improving overall performance, and investigating how to add context to questions in order to answer things like “What is his birthdate?” after “Who is the president of the United States?” or “Where is the closest Wikimedia user group?”. People are welcome to help us on these points, or more generally to improve Platypus.

More on IWCLUL: now on the sessions. The first session of the day was by the invited speaker Kimmo Koskenniemi. He is applying his two-level formalism in a new area, old literary Finnish (example of old literary Finnish). By using two-level rules for old written Finnish together with OMorFi, he is able to automatically convert old text to standard Finnish dictionary forms, which can be used, in the main example, as an input text to a search engine. He uses weighted transducers to rank the most likely equivalent modern day words. For example the contemporary spelling of wijsautta is viisautta, which is an inflected form of the noun viisaus (wisdom). He only takes the dictionary forms, because otherwise there are too many unrelated suggestions. This avoids the usual problems of too many unrelated morphological analyses: I had the same problem in my master’s thesis when I attempted using OMorFi to improve Wikimedia’s search system, which was still using Lucene at that time.

Jeremy Bradley gave a presentation about an online Mari corpus. Their goal was to make a modern English-language textbook for Mari, for people who do not have access to native speakers. I was happy to see they used a free/copyleft Creative Commons license. I asked him whether they considered Wiktionary. He told me he had discussed it with a person from Wiktionary who was against an import. I will be reaching out to my contacts to see whether another attempt will succeed. The automatic transliteration between Latin, Cyrillic and IPA was nice, as I have been entertaining the idea of doing transliteration from Swedish to Finnish for WikiTalk, to make it able to function in Swedish as well by only using Finnish speech components. One point sticks with me: they had to add information about verb complements themselves, as they were not recorded in their sources. I can sympathize with them based on my own language learning experiences.

Stig-Arne Grönroos’ presentation on Low-resource active learning of North Sámi morphological segmentation did not contain any surprises for me after having been exposed to this topic previously. All efforts to support languages where we have to cope with limited resources are welcome and needed. Intermediate results are better than working with nothing while waiting for a full morphological analyser, for example. It is not completely obvious to me how this tool can be used in other language technology applications, so I will be happy to see an example.

Miikka Silverberg gave a presentation about OCR, using OMorFi: can morphological analyzers improve the quality of optical character recognition? To summarize heavily, OCR performed worse when OMorFi was used, compared to just taking the top N most common words from Wikipedia. As I understood it, this is not exactly the same problem as the large number of readings generated by a morphological analyser, but something different and related.

February 22, 2015

In 2014, I posted a few photos, I continued to work on technical communications at Wikimedia before a role change, I learned more about myself, I moved to California, and I hiked a lot.

2014 in failures

Let's begin with what didn't work and get it out of the way. In January 2014, I started posting some of my photos on this site. I have accumulated dozens of thousands of photos over the past eight years, but published only a small fraction of them. By starting to publish a selection of them here, my goal was to create a momentum that would encourage me to process my backlog and publish my collections here and on Wikimedia Commons.

The photos didn't last, but they might come back.

The momentum didn't really last, though, and I ended up stopping after posting only seven photo articles. In retrospect, I think the issue wasn't really the photos themselves, but rather the accompanying texts. I've acknowledged this failure, and recently decided to retire the “Photo” section of this website. The photos are still online, but I've removed the navigation shortcut to that section.

I may resume posting photos in the future, although it's not a priority at the moment. If I do, I might change the format of the posts and only feature the photos with a very short text, if any.

2014 in work

During most of 2014, I continued to work as Technical Communication Manager at the Wikimedia Foundation, the nonprofit that operates Wikipedia.

Part of this work involved reviewing technical posts for the Wikimedia blog; I notably edited and published a series of candid essays written by students who participated in the Google Code-in program. In their “discovery reports”, they outlined their first steps as members of the Wikimedia technical community, and provided a newcomer's perspective on tools and processes regularly used by experienced contributors.

In 2014, I continued to work on Technical Communications at the Wikimedia Foundation, before transitioning to a new position.

I attended the Zürich hackathon, as well as Wikimania, the annual Wikimedia conference, whose 2014 edition was in London. At Wikimania, I presented on Tech News and put together a poster so that attendees could learn about it even if they couldn't attend the presentation.

In September, my role at the Wikimedia Foundation changed, and I started working on other projects, most notably the File metadata cleanup drive. The drive is an initiative to decrease the number of files (on Wikimedia sites) whose information can't be read by programs.

In September, my role at the Wikimedia Foundation changed, and I started to work on other projects, like the File metadata cleanup drive.

2014 in self-discovery

2013 had been a turning point for me, in that I had discovered that I was likely on the high-functioning part of the autistic spectrum. In 2014, a few experts officially confirmed that hypothesis. When asked why this had not been detected earlier in my life, the prevailing hypothesis was that I had unwittingly compensated for this social blindness with a higher intelligence, as suggested by tests performed in 2013. I like to think of it as having my own emulated emotion chip.

I feel like I deserve a membership card or something.

Throughout 2014, I continued to research and read on this topic. Doing so, I've continued to better understand my blind spots, and explored what I now refer to as my “super-powers”, a fancy way of characterizing the unique way in which my brain works.

Notably, I started reading on a variety of specialized topics I was not familiar with but that intrigued me. Doing so, I discovered that I was very fast at picking up and understanding new concepts and disciplines. I had had a feeling that that was the case for a long time, but experimenting with this skill was particularly fun and rewarding (I've recently been reading about Civil engineering and Human spaceflight).

2014 in transatlantic move

The biggest change in 2014 was our emigration from France to the US. As part of a role change at the Wikimedia Foundation, I relocated to the San Francisco Bay Area (again). The relocation process was easier this second time around, in part because my partner was able to relocate with me this time, and also because we decided to get organized.

Transitioning from a completely-remote environment to a tech open-space has required some adjustments, but overall we're very happy to have relocated.

2014 in physical activity

I do go outside sometimes, and as someone intrigued by the concept of Quantified Self, I try to keep metrics about my life whenever possible. Physical activity is one of the easiest things to track thanks to dedicated mobile apps.

| Activity | Distance (km) | Distance (mi) |
|---|---|---|
| Running | 178 | 110 |
| Hiking (inc. snowshoeing) | 163 | 101 |
| Downhill skiing | 105 | 65 |
| Cycling | 49 | 30 |
| Cross-country skiing | 21 | 13 |

I love to hike and I occasionally run. In 2014, I knew we were going to relocate to sunny California, so I decided to take advantage of the snowy Alps while we were still in France.

Some days, taking the chairlift isn't nearly as fun as snowshoeing to the summit.

It had been years since I had skied downhill, but after a couple of days it all came back and I enjoyed it a lot. I also started snowshoeing, which was a really nice complementary activity. Where downhill skiing involves sprints and adrenalin, snowshoeing involves endurance and beautiful lesser-used forest trails.

The year ahead

2015 is already well underway, but it's not too late to mention what I'm planning to do this year.

Regarding my work at the Wikimedia Foundation, I'm continuing to lead the File metadata cleanup drive, and I'm hoping to continue to drive down the number of files missing machine-readable metadata. I also have a few smaller projects in the pipeline, notably the Template taxonomy.

Regarding personal work and recreation, I've started to learn Spanish again. My goal is to be able to handle basic communication by Summer, when I may visit Mexico City. Hopefully, by then, I'll be able to say more than “¡Hola!”, “Soy una tortuga” and “El elefante come la manzana”.

I've also decided to learn the piano; we'll see how far I can go in one year. Considering that I'm a total beginner, I can only make progress!

This year, I'm starting (from scratch) to learn to play the piano.

Last, I intend to continue to populate this site with historical and new content. My priority at the moment is to finish writing about past projects before embarking on new ones, but I do think there will be room to post new content before next year's “year in review” post.

An effort started in September 2014 by the Wikimedia Foundation to fix file description pages and tweak templates to ensure that multimedia files consistently contain machine-readable metadata across Wikimedia wikis.

A short while after Wikipedia was created in 2001, contributors started to upload pictures to the site to illustrate articles. Over the years, Wikimedians have accumulated over 22 million files on Wikimedia Commons, the central media repository that all Wikimedia sites can pull from. In addition, nearly 2.5 million other files are spread out across hundreds of individual wikis.

MediaWiki, the software platform used for Wikimedia sites, wasn't originally designed for multimedia content. We've made good progress with better upload tools, for example, but the underlying system still very much focuses on text.

On MediaWiki, each file has a file description page that contains all the information ("metadata") related to the picture: what it depicts, who the author is, what rights and limitations are associated with it, etc. Many wikis have developed templates (reusable bits of wikicode) to organize such file metadata, but a lot of information is still unstructured in plain wikitext.

In October 2014, the Wikimedia Foundation launched an initiative to develop a new underlying system for file metadata using the same technology powering Wikidata. This project is still in the early stages, and even when it becomes available, it will take a long time to migrate the existing metadata to structured data.

The goal of the File metadata cleanup drive is to make the migration process for those 24+ million files less tedious, by making sure that robots can process most of the files automatically.

Evolution of the file description page

The upcoming Structured data project aims to build a system where you edit the metadata using a form, you view it in a nice format, and robots can understand the content and links between items.

With Structured data, robots will know exactly what field refers to what kind of information. This will make it easier for humans to search and edit metadata.

Many files on Wikimedia Commons aren't actually very far from that model. Many files have an "Information template", a way to organize the different parts of the metadata on the page. Information templates were originally created to display metadata in a consistent manner across files, but they also make it possible to make the information easier to read for robots.

This is achieved by adding machine-readable markers to the HTML code of the templates. Those markers say things like "this bit of text is the description", and "this bit of text is the author", etc. and robots can pick these up to understand what humans have written.
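To make this more concrete, here is a minimal sketch of how a robot could pick up such markers with Python's standard `html.parser` module. The marker ids and the page layout are assumptions made for illustration: the ids are modeled on the `fileinfotpl_*` style, and the sample HTML puts each marker directly on the element holding the value, which may differ from how the real templates are laid out.

```python
from html.parser import HTMLParser

# Assumed marker ids mapped to metadata fields (illustrative, not an official list).
MARKERS = {"fileinfotpl_desc": "description", "fileinfotpl_aut": "author"}
VOID_TAGS = {"br", "hr", "img"}  # tags without a closing tag, ignored for nesting


class MarkerReader(HTMLParser):
    """Collect the text found inside elements that carry a known marker id."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self.current = None  # field being read, if any
        self.depth = 0       # nesting depth inside the marked element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.current is not None:
            self.depth += 1
            return
        field = MARKERS.get(dict(attrs).get("id"))
        if field:
            self.current, self.depth = field, 1
            self.fields[field] = ""

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self.current is None:
            return
        self.depth -= 1
        if self.depth == 0:
            self.current = None

    def handle_data(self, data):
        if self.current is not None:
            self.fields[self.current] += data


# Hypothetical fragment of a rendered file description page.
page = (
    '<td id="fileinfotpl_desc">A <b>lighthouse</b> at dusk</td>'
    '<td id="fileinfotpl_aut">Example Photographer</td>'
    '<td>unmarked text</td>'
)
reader = MarkerReader()
reader.feed(page)
fields = reader.fields
```

Text without a marker (the last cell in the sample) is simply ignored, which is the situation where robots would otherwise have to guess.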

This situation is ideal for the migration, because it tells robots exactly how to handle the bits of metadata and which field they belong to.

Current information and license templates can be read by machines if they contain special markers. Robots will be able to migrate many files to structured data automatically if they use those templates.

If the machine-readable markers aren't present, the robots need to guess which field corresponds to which type of content. This makes it more difficult to read the metadata, and their parsing of the text is less accurate. The good news is that by just adding a few markers to the templates, all the files that use the template will automatically become readable for robots.

If a file contains information and license templates, but they don't have the special markers, it's difficult for robots to migrate it. Fortunately, it's easy to add the special markers.

Things become fuzzier for robots when the information isn't organized with templates. In this case, robots just see a blob of text and have no idea what the metadata is saying. This means that the migration has to be made entirely by human hands.

If the file's metadata only contains wikitext, we need to organize the content by adding an information and a license template manually. Those templates need to contain the special markers.

Fixing files and templates

Many files across wikis are in one of the latter states that aren't readable by robots, and about 700,000 files on Commons are missing an information template as well. To fix them so they can be easily migrated in the future, we need an inventory of files missing machine-readable metadata.

That's where MrMetadata comes into play. MrMetadata (a wordplay on Machine-Readable Metadata) is a dashboard tracking, for each wiki, the proportion of files that are readable by robots. It also provides an exhaustive list of the "bad" files, so we know which ones to fix.
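The kind of bookkeeping such a dashboard does can be sketched in a few lines of Python. This is only an illustration of the idea, not MrMetadata's actual code: the `dashboard` function, the file names and the readable/unreadable flags are all made up for the example.

```python
def dashboard(files):
    """Return the share of machine-readable files and the sorted list of files to fix."""
    to_fix = sorted(name for name, readable in files.items() if not readable)
    share = (len(files) - len(to_fix)) / len(files) if files else 0.0
    return share, to_fix


# Hypothetical inventory for one wiki: file name -> can robots read its metadata?
wiki_files = {
    "File:Bridge.jpg": True,
    "File:Old_map.png": False,
    "File:Market.jpg": True,
    "File:Portrait.jpg": False,
    "File:Harbour.jpg": True,
}

share, to_fix = dashboard(wiki_files)
print(f"{share:.0%} of files are machine-readable; {len(to_fix)} left to fix")
```

Running the same computation again after a template has been fixed shows how the proportion moves, which is how progress on a wiki can be followed over time.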

Each wiki storing images has a dedicated dashboard showing the proportion of files with machine-readable metadata, and providing a list of the files to fix.

Once the files have been identified, a multilingual how-to explains how to fix the files and the templates. Fixing templates is easy: you just add a few machine-readable markers, and you're done. For example, the English Wikivoyage went from 9% to 70% in just a few weeks. Fixing individual files requires more manual work, but there are tools that make this less tedious.

Get involved

If you'd like to help with this effort, you can look for your wiki on MrMetadata, bookmark the link, and start going through the list. By looking at each file, you'll be able to determine whether it has a template (where you can add markers) or whether you need to add the template as well.

The multilingual how-to provides a step-by-step guide to fixing files and templates. It's currently available in more than a dozen languages.

If you add markers to the templates, wait a couple of days for MrMetadata to update, so you can see the remaining files missing machine-readable information. The multilingual how-to provides a step-by-step guide to fixing files and templates.

Adding special markers to the templates can improve metadata readability very quickly. The English Wikivoyage went from 9% to 70% of "good" files in just a few weeks.

Impact

An assessment of impact conducted in January 2015 showed that, in three months, the cleanup drive had contributed to eliminating a third of the files missing machine-readable metadata across all wikis. Most of this progress was driven by editing file templates on the wikis with the most files. Over this period we gained 3 percentage points in the total proportion of files with machine-readable metadata.

In three months, over a third of the files missing machine-readable metadata were fixed.

The challenge at this point is that most of the low-hanging fruit (templates that were on lots of pictures) has been exhausted, and most of the remaining files don't have templates. This means that we need to add the templates ourselves to structure information that is currently in raw wikitext, which will take more time. This will be done by running focused campaigns using bots on large sets of files whenever possible.

February 21, 2015

The Stern–Gerlach Medal is one of many awards Wikidata knows about. Information is often available in a list within the articles. In some languages there are links to all those who received the award.

Having all the awards and all the people who received them in Wikidata is a massive undertaking. It can be argued that everyone who received an award has some notability.

Some people think that awards are not that important to categorise. Their way of thinking means that awards specifically relevant within a culture or a language become underrepresented. This is, however, an effect that diminishes over time.

It would be good if the lists were available for Wikipedias to use. When such lists become a service from Wikidata, it is easy to provide minimal information for the people who do not have an article yet. For best results it helps when all the associated labels are available. Thanks, GerardM

February 19, 2015

A wonderful new view is available thanks to Vizidata. It shows where people were born and where people died. The data is from a Wikidata dump so it is sadly static. Given that it is from Wikidata, you can safely assume that the data also exists in a Wikipedia ...

Italy is very prominent in this view. It is because a lot of effort went into extracting data from the Italian Wikipedia. It follows that all the people the Italians care for are included as well. The fun thing about a view like this is that it is a historic view of what Wikidata covers and does not cover.

Apparently hardly anyone died in Africa in all the centuries. Thanks, GerardM

In October 2014, Twitter filed suit in federal court against the DOJ to establish its right to publish more detailed information about the national security requests the company receives from the government. The DOJ had denied them permission to report the number of national security requests within any useful ranges. The government is insisting on a reporting practice that, in our opinion, is misleading and non-transparent, especially for smaller organizations.

Current permissible reporting standards require organizations to report the number of national security requests in ranges or “bands”. For example, companies can report the number of national security letters (NSLs) received in bands of 1000, such as 0-999, meaning that an organization that received zero NSLs must report in the same band as one that received 999 NSLs. Twitter seeks confirmation that it can report that it received zero national security requests, when applicable.

The Wikimedia Foundation joined this amicus brief because we believe transparency is vital to the Wikimedia movement and that true transparency cannot be achieved without accuracy and completeness. We support any effort that permits more transparency on these kinds of demands, given the significant policy issues of these practices. As the brief underscores:

“This case is about an Internet company’s desire to be open and honest about its role—or lack thereof—in national security investigations in the post-Edward Snowden era. . . . Amici believe that to truly have a government for the people, by the people, we must have an informed citizenry.”

We, along with the co-signers of this brief, hope that the court will hear the case on its merits and provide much-needed clarity on these sensitive topics — and enable all organizations to inform their users on the practices of the government within reasonable ranges of accuracy.

Participatory grantmaking works because of committees such as this one. These community members review proposals for funding and help decide what to fund. Photo by Adam Novak, CC-BY-SA-3.0.

Because of Wikipedia’s unique structure, the grantmaking conducted by the Wikimedia Foundation (WMF) must be infused with the same spirit of collaboration, transparency, and participation that underpins the entire Wikimedia project. A new report published by The Lafayette Practice, and commissioned by WMF, provides the first in-depth insight into the Foundation’s grantmaking practices and reveals some interesting findings about our particular model.

As authors of “Who Decides?: How Participatory Grantmaking Benefits Donors, Communities and Movements,” The Lafayette Practice (TLP) found that Participatory Grantmaking (1) is a powerful movement-building strategy that leads to efficient transfer of money, knowledge and the promotion of self-determination. The idea behind this practice is to include representatives from the population that the funding will serve in the grantmaking process and in decisions about how funds are allocated.

The report found that the Wikimedia Foundation is the largest known Participatory Grantmaking Fund, through grants that support our communities and the movement more broadly. WMF’s total grants exceed all of the other funds documented in The Lafayette Practice’s original “Who Decides” report. In that study, the highest documented grantmaker budget was $2.37 million in 2012. WMF’s grantmaking budget for 2014-15 is over $7 million.

Central to the findings of this study is that our grantmaking processes and practices reflect the core ethos, mission, and model of Wikipedia and Wikimedia projects. In the same way that Wikipedia articles are born and grown on a public platform through the collaboration of a global community, so too are our grant proposals workshopped and reviewed on public wikis, as well as improved by volunteer editors.

Our four grantmaking programs have differing degrees of participation, where decisions are made in cooperation with volunteer committee members, Board members, WMF staff — and with input from the larger community. The committees behind these grantmaking programs — the Grant Advisory Committee, the Funds Dissemination Committee, and the Individual Engagement Grants Committee — are an incredible and diverse group of community members who engage in the tough work of reviewing proposals for funding and helping to make decisions about what to fund with limited resources available. It’s certainly no easy task.

The report also found that the Foundation’s grantmaking program has the largest peer-review participation of any funder of this kind, with many diverse community members from around the world involved in the decision-making committees. And in the same way that anyone can become a Wikipedia editor, anyone who edits Wikipedia can submit a proposal to the Wikimedia Foundation.

We agree with the Lafayette Practice’s assessment that our grantmaking is “innovative and groundbreaking” and we believe passionately in the participatory nature of our work.

Ultimately, the Wikimedia movement’s deeply ingrained values of collaboration, transparency and expanding access to information are reflected not only on the projects, but are also central to the way funds are allocated — and the way we support our community in sharing free knowledge with the world.

This blog has been updated with a new title, replacing “Research” with “Report.” This update reflects that this report was commissioned by the Wikimedia Foundation, as is clear from the first paragraph. We have also added a footnote about the Wikipedia article on Participatory Grantmaking.

(1) The Wikipedia article on Participatory Grantmaking was written primarily by Wikimedia Foundation staff in their capacity as Wikimedia volunteer editors. This was done on their own time, using their personal editor accounts, with the intent to share information with the larger philanthropic sector about a practice that is very much aligned with wikiculture. The article, which meets Wikipedia policies and guidelines, was developed based largely (but not exclusively) on the original report by the Lafayette Practice about participatory grantmaking, which was not funded by WMF. The study cited in the article did not include the Wikimedia Foundation. The subsequent report about the WMF’s participatory grantmaking approach was commissioned by the Foundation in the months following the original report and is not referenced in any version of the Wikipedia article.

One corner of the room – detail of photo by Machi Takahashi of ATR Creative who joined the event from Tokyo and was one of the speakers. CC BY-SA 2.0

This post was written by Kimberly Kowal of the British Library and was originally published here. Reused with kind permission.

Without looking, you can’t know what’s there. That was our experience locating maps amongst the one-million British Library images released to the public domain. We had not guessed that 50,000 images of maps were lurking there. So how were they singled out?

Answer: with the help of our friends (the crowd!) using several methods.

Semi-Manually
A dedicated team of volunteers looked at individual images and applied the tag “map” on flickr. The work was organised using a synoptic index in Wikimedia Commons, providing a systematic method of looking at each volume and tracking shared progress. Over 29,000 map images were identified in this way.

Day-long event
The British Library hosted a one-day event, in concert with Wikimedia UK, to which volunteers were invited to kick-start the effort. In between working, the 30 participants enjoyed tours and talks from speakers representing online mapping efforts, including OpenStreetMap and Stroly. The day’s activities were captured in Gregory Marler’s engaging description, Lost in Piles of Maps, and a series of photographs from ATR Creative.

Ongoing crowd activity
The bulk of the work took place online over the next two months. With the wiki tools built by J.heald to guide and coordinate contributions, 51 volunteers approached the work, book by book, often focussing on geographic areas of interest. Together, they made short work of what was a huge task; 28% of the books were completed after the first 72 hours; 60% were reviewed in the first 20 days; after five weeks over 20k new maps were found in 93% of the source volumes.

Automated methods
But surely maps can be identified automatically? It’s true that well before the organised effort just described, one user produced algorithm-guided tags for this image set, which resulted in the addition of well over 15k map tags.

By the end of December 2014, every image in every book had been reviewed, and between the manual and automatic tagging, over 50k maps had been found. Since then, we have been working to clean up the data, including reviewing rogue tags, rotating images, splitting maps, and removing duplicates, to derive a final set of data. Next step: georeferencing.

What are the most important messages

In this blog post, the term message means a translatable string in software; technically, when a message is shown to users, they see different strings depending on the interface language.

MediaWiki software includes almost 5,000 messages (~40,000 words), or almost 24,000 messages (~177,000 words) if we include extensions. Since 2007, we have maintained a list of about 500 messages which are used most frequently.

Why? If translators can translate a few hundred words per hour, and translating messages is probably slower than translating running text, it will take weeks to translate everything. Most of our volunteer translators do not have that much time.
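As a rough back-of-the-envelope check of that estimate, the sketch below divides the word counts quoted above by a translation speed. The 300 words per hour and the 10 hours per week of volunteer time are assumptions chosen for illustration only, not measured figures, and `weeks_needed` is a hypothetical helper name.

```python
# Approximate word counts quoted in the text above.
WORDS_CORE = 40_000
WORDS_WITH_EXTENSIONS = 177_000


def weeks_needed(words, words_per_hour=300, hours_per_week=10):
    """Estimate weeks of work for one translator.

    Both rates are assumed values for illustration, not measurements.
    """
    return words / words_per_hour / hours_per_week


print(round(weeks_needed(WORDS_CORE), 1))             # core messages only
print(round(weeks_needed(WORDS_WITH_EXTENSIONS), 1))  # core plus extensions
```

Under these assumptions a single volunteer would need a few months for the core messages alone and about a year with extensions; different rates change the numbers but not the order of magnitude.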

Assuming that the messages follow a long tail pattern, a small number of messages are shown* to users very often, like the Edit button at the top of page in MediaWiki. On the other hand, most messages are only shown on rare error conditions or are part of disabled or restricted features. Thus it makes sense to translate the most visible messages first.

Concretely, translators and i18n fans can monitor the progress of MediaWiki localisation easily, by finding meaningful numbers in our statistics page; and we have a clear minimum service level for new locales added to MediaWiki. In particular, the Wikimedia Language committee requires that at the very least all the most important messages are translated in a language before that language is given a Wikimedia project subdomain. This gives an incentive to kickstart the localisation in new languages, ensures that users see Wikimedia projects mostly in their own language and avoids linguistic colonialism.

The screenshot shows an example page with messages replaced by their key instead of their string content. Click for full size.

Some history and statistics

The usage of the list for monitoring was fantastically impactful in 2007 and 2009 when translatewiki.net was still ramping up, because it gave translators concrete goals and it allowed us to streamline the language proposal mechanism, which had been trapped in a dilemma between a growing number of requests for language subdomains and a growing number of seemingly-dead open subdomains. There is some more background on translatewiki.net.

There is much more to do, but we now have a functional tool to motivate translators! To reach the peak of 2011, the least translated language among the first 181 will have to translate 233 messages, which is a feasible task. The 300th language is 30 % translated and needs 404 more translations. If we reached such a number, we could confidently say that we really have Wikimedia projects in 280+ languages, however small.

* Not necessarily seen: I’m sure you don’t read the whole sidebar and footer every time you load a page in Wikipedia.

Process

First, at Wikimedia, we logged all requests to fetch certain messages by their key for about 30 minutes. We used this as a proxy variable to measure how often a particular message is shown to the user, which again is a proxy of how often a particular message is seen by the user. This is in no way an exact measurement, but I believe it is good enough for the purpose. After the 30 minutes, we counted how many times each key was requested and we sorted by frequency. The result was a list containing about 17,000 different keys observed in over 15 million calls. This concluded the first phase.

In the second phase, we applied a rigorous human cleanup to the list with the help of a script, as follows:

We removed all keys not belonging to MediaWiki or any extension. There are lots of keys which can be customized locally, but which don’t correspond to messages to translate.

We removed all messages which were tagged as “ignored” in our system. These messages are not available for translation, usually because they have no linguistic content or are used only for local site-specific customization.

We removed messages called fewer than 100 times in the time span and other messages with no meaningful linguistic content, like messages where there are only dashes or other punctuation which usually don’t need any changes in translation.

We removed any messages we judged to be technical or not shown often to humans, even though they appeared high in this list. This includes some messages which are only seen inside comments in the generated HTML and some messages related to APIs or EXIF features.
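The counting phase and the automatic part of the cleanup can be sketched roughly as follows. This is a minimal illustration, not the script we actually used: the input format (one requested key per log entry), the `messages` dictionary, the `ignored_keys` set and the function names are all assumptions made for the example. Only the 100-call threshold comes from the steps above, and the last step (human judgement about technical messages) is not automated here.

```python
from collections import Counter
import string

MIN_CALLS = 100  # threshold taken from the cleanup step above


def count_keys(requested_keys):
    """Phase 1: count how often each message key was requested."""
    return Counter(requested_keys)


def has_linguistic_content(text):
    """True if the text contains anything besides ASCII punctuation and whitespace."""
    return any(ch not in string.punctuation and not ch.isspace() for ch in text)


def clean(counts, messages, ignored_keys, min_calls=MIN_CALLS):
    """Phase 2 (automatic part): filter keys and rank them by frequency.

    counts       : Counter returned by count_keys()
    messages     : hypothetical dict mapping each key that belongs to
                   MediaWiki or an extension to its source string
    ignored_keys : hypothetical set of keys tagged as "ignored"
    """
    kept = []
    for key, calls in counts.most_common():
        if key not in messages:   # local customization, not a real message
            continue
        if key in ignored_keys:   # not available for translation
            continue
        if calls < min_calls:     # requested too rarely in the time span
            continue
        if not has_linguistic_content(messages[key]):  # e.g. only dashes
            continue
        kept.append((key, calls))
    return kept


# Tiny made-up log: every key and count below is invented for the example.
log = (["edit"] * 500 + ["dash"] * 300 + ["local-custom"] * 200
       + ["ignored-msg"] * 150 + ["rare-error"] * 3)
messages = {"edit": "Edit", "dash": "--",
            "rare-error": "Something went wrong", "ignored-msg": "x"}
print(clean(count_keys(log), messages, ignored_keys={"ignored-msg"}))
```

A list produced this way would still need the manual review described in the last step, since frequency alone cannot tell whether a message is actually shown to humans.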

Discoveries

In this process some points emerged that are worth highlighting.

310 messages (62 %) of the previous list (from 2011) are in the new list as well. Superseded user login messages have now been removed.

Unsurprisingly, there are new entries from new highly visible extensions like MobileFrontend, Translate, Collection and Echo. However, except for a dozen languages, translators didn’t manage to keep up with such messages in the absence of a list.

I just realized that we are probably missing some high visibility messages only used in the JavaScript side. That is something we should address in the future.

We slightly expanded the list from 500 to 600 messages, after noticing there were few or no “important” messages beyond this point. This will also leave some breathing space for when messages on the list get removed from the software.

We did not follow a manual passage as in the original list, which included «messages that are not that often used, but important for a proper look and feel for all users: create account, sign on, page history, delete page, move page, protect page, watchlist». A message like “watchlist” got removed, which may raise suspicions: but it’s “just” the HTML title of Special:Watchlist, more or less as important as the name “Special:Watchlist” itself, which is not included in the list either (magic words, namespaces or special page names are not included). All in all, the list seems plausible.

Conclusion

Finally, the aim was to make this process reproducible so that we could do it yearly, or even more often. I hope this blog post serves as documentation to achieve that.

I want to thank Ori Livneh for getting the key counts and Nemo for curating the list.

The talk tried to cover three aspects: The decision driving process in a large distributed project, the technical and planning aspects of such a complex migration that nobody has tried before, and describing some of the functionality of the new tool (Phabricator) itself.

“Wikipedia is an amazing forum for anyone to participate in Feminist Digital Humanities,” she told us. She said she saw Wikipedia as a way to strengthen women’s voices by documenting the lives of women, and contributing quality content about feminism. That meant drawing a direct line between studying Wikipedia and contributing to it. “It’s one thing to study theory, but it’s a powerful thing to get an opportunity to put theory to practice.”

Alicia is a HASTAC scholar and an intern with the Digital Humanities Department at Richard Stockton College. She’s also a blogger, and will be working on social media for Stockton’s board of trustees this spring. But this was her first foray into Wikipedia.

“I honestly never gave much thought to how edits were made or how the information on Wikipedia got there,” she said. “In my first few edits I had other editors contact me within a day about correct formatting of citations … This was both intimidating and exciting, to realize I was a part of a network of editors who were paying attention to what I was doing. It pushed me to strive for better edits, knowing that there was a community both noticing the good work I did, but also anything that didn’t meet their standards.”

Once she’d learned the ropes, however, Wikipedia presented an opportunity that aligned with Alicia’s interests in raising social awareness.

“As with many web-based academic forums and archives, Wikipedia has been accused of having a sexist bias running throughout its articles. This may be due to the lack of women editors,” she said. Pointing out Wikipedia’s gender gap, she mentioned a study from the Wikimedia Foundation that found “women only contribute to editing 13% of the articles on Wikipedia,” and quoting Sue Gardner, mentioned that the typical editor is a young man. “Without diverse representation in the editing community, there is a tendency for this gap to be reflected in an under-representation of a large group of people and concerns in the articles. This is exactly what Feminist Digital Humanities addresses and looks to repair. So by just taking the time to edit Wikipedia, you can be a part of Feminist Digital Humanities efforts.”

She worked to maintain a neutral tone throughout her article, “much like the process for any research paper. You must write in a non-biased, non-personal tone, with all information backed up with secondary sources,” she said. She edited sentences at a time, to make sure other editors could easily track what she was doing and offer guidance along the way. “It’s definitely different to know that your work is in collaboration with a whole network of contributors.”

Alicia is proud of the contributions she made to Wikipedia, expanding Feminist Digital Humanities out of its stub status. “Where a paper will most likely only be read by a single professor, these edits are documented on the web for all to see. While the stakes are high, so are the rewards. There is much gratification in leaving your personal mark on something that will help others to learn.”

These are the reasons Dr. Koh said were behind her contributions to Wikipedia. “Students were scared by the assignment but also felt it was completely rewarding,” Dr. Koh told us, “much more than a final research paper, because they could see the tangible results of them actually contributing to knowledge.”

I think one of the reasons for my discomfort with many of the arguments I see online (especially in the Twitter context, like the #stopwadhwa2015 campaign I recently discussed) comes down to my disposition, as manifested in my career. I'm slow to anger and accusation, I'm influenced by Buddhist notions of compassion and inter-being, professionally I've spent my time facilitating consensus and teaching (including conflict management). There's been much discussion about online shaming recently, for which I also have a reticence. (The only shaming I enjoy without much guilt is dog shaming.) My sentiments are captured and were shaped by a suitably geeky maxim known as Hanlon's Razor: "Never attribute to malice that which is adequately explained by stupidity." I think this to myself at least once a day, especially when I feel slighted.

Even so, I absolutely support identifying problematic behavior and holding those who engage in it accountable. And I recently came to realize that my use of Hanlon's Razor is privileged. I was speaking with a colleague, a woman of color, who was rightfully complaining of microaggressions (a notion I'm sympathetic to, although I find the term imprecise). I responded that perhaps it is "better to assume stupidity than malice" but immediately realized -- and acknowledged -- that this is an easy maxim to hold from a privileged position. I thought of this yesterday too, while listening to This American Life's segment on "cops see it differently". In Miami Gardens, the police's effort to crack down on crime took the form of a competitive race to "bring in the numbers." This runaway system, for which no one has been found accountable, generated enough "stop-and-question" contacts to include half the city's population, including thousands of children and a 99-year-old "suspicious" man. The worst case was that of Earl Sampson, who became an easy target for the police whenever they needed a boost: he was arrested 111 times, 71 of which were for trespassing at his workplace! In this case and others, small systemic stupidities can be an injustice greater than personal malice.

commons ⇄ freedom, equality ⇄ good future

Same as last year, my main topic has been “protecting and promoting intellectual freedom, in particular through the mechanisms of free/open/knowledge commons movements, and in reframing information and innovation policy with freedom and equality outcomes as top.”

Rather than repeating the three doubts I expressed last year under the heading “intellectual freedom” (my evaluation of these has not much changed), I will take the subject from a different angle: the “theory of change” I have been espousing. This theory is not new to me. Essentially it is what attracted me to following the free software movement circa 1990 — its potential of extensive, pro-freedom socio-economic reform from the bottom up. That and wanting to run a unix-like on my computer — a want satisfied without respect to freedom as soon as I could use a Sun workstation at work, and for many years now would have been satisfied by OS X. I never cared very much about being able to read, modify, and share all of the software on my computer — the socio-economic implications of those capabilities make them interesting, to me. The claimed ends of the theory are in the ‘for a good future’ slogan I’ve occasionally used at least since 1998. I occasionally included the theory in blog posts (2006) and presentations (2008). Much of my ‘critical cheering’ last year (doubt) and before has largely been about my perhaps unreasonable wish that ‘free/open’ organizations and movements would take the theory I do and act as I think follows. I could easily be wrong on the theory or best actions it implies. Accordingly, I ratcheted down critical cheering in 2014; hopefully most but not all of what remained was relatively fun or novel. Instead I focused more sharply on the theory, e.g., in Sleepwalking past Freedom’s Commons, or how peer production could increase democracy, equality, freedom, and innovation, all of them!

The theory could be attacked from a number of angles — I’d love to see that done and learn of new vulnerabilities. For example, commons might not significantly affect freedom and equality, these may not be the right values, and one might consider a ‘good future’ to be one with maximum hierarchy, spectacle, even war (I repeatedly argue that future tech and culture will be marvels in most plausible futures, and that is a reason to reject ones that do not have freedom and equality as top values, but also something that makes it hard to see how a future — or present — could be different or better with more knowledge economy/policy-driven freedom and equality). But this isn’t a cheap refutation post (see below) and I don’t have very practical doubts about those values and what they imply constitutes a good future.

But I do have practical doubts about the first leg of the theory. Summary of that leg before getting to doubts: Commons-based knowledge production simultaneously destroys rents dependent on freedom infringing regimes, diminishing the constituency for those regimes, grows the constituency and policy imagination for freedom respecting regimes, and not least, directly increases freedom and equality.

Doubts:

Effects could be too small to matter, or properly attributed to generational or other competition among firms, not commons-based production. Consider Wikipedia, a success of commons-based production if there is one. Such success may not be possible in other sectors, especially ones that command top policy attention (drugs and movies) — policy imagination has not been increased. The traditional encyclopedia industry was already mostly destroyed by Microsoft Encarta when Wikipedia came along. The encyclopedia industry was not a significant constituency for freedom infringing regimes, so its destruction matters not for future policy. Encyclopedias were readily accessible at libraries, vastly more useful info of the sort found in encyclopedias is accessible online now, excluding Wikipedia, and ‘freedoms’ to modify and distribute are just not relevant to nearly all humans.

I claim that the best knowledge policy reform is that which favors commons and that the reforms traditionally proposed by copyright and patent reformers are relatively futile because such proposals if implemented would not significantly change the knowledge economy to produce freedom and equality nor grow the constituencies for such changes — rather they are just about who, how, and for how much the outputs of production under freedom infringing regimes may be used — so-called balance, not the tilt I demand. But perhaps the usual set of reform proposals is the best that can be hoped for, especially given decades of discourse and organization-building around those proposals, and almost none about commons-favoring reform. Further, perhaps the usual set of reform proposals is best without qualification — commons-based production is a culturally marginal (in software; wholly irrelevant in most other sectors) arrangement that ought be totally ignored by policy.

Various (sometimes semi-) free/open movements within various sectors (e.g., software, education, research publication) are having some policy successes, without (as far as I know) usually considering themselves to be as or more central to shaping knowledge policy as usual things fitting under ‘copyright reform’ and ‘patent reform’ but this could be just what needs to happen. The important thing is that commons-based knowledge production entities act to further their interests with minimal distance from current policy discourse, not that they have any distracting and possibly discrediting theory about doing so relative to overall knowledge policy.

Only the first of these gives me serious pause, though my discounting the last two might be a matter of (dis)taste — my feeling is that most of the people involved thoroughly identify with the trivia of copyright, patent, and similar law, even if they think those laws need serious reform, and act as if commons-based production is something to be protected from reform in the bad direction, but not at all central. Sadly if my feeling is accurate, the second and third doubts probably ought give me more pause than they do.

Despite these doubts, the potential huge win-win (freedom and equality, without conflict) of reorienting the knowledge economy and policy around commons-based production makes robust discourse (at the least) on this possibility urgent, even if tilt probability is low. One of the things that makes me favor this approach is that reform can be very incremental — indeed, it is by far the most feasible reform of any proposed — we just need a lot more of it. Push-roll towards tilt!

The most damning observation is perhaps that I’m only talking, and mostly on this very blog. I should change my ways, but again, this is not a cheap refutation post.

Software Freedom/Futurism/Science Fantasy

I recently wrote that “it’s much easier to take software freedom as a serious issue of top importance if one has a ‘futurist’ bent. This will also figure in a forthcoming post from me casting doubt on everything in this post and the rest from 2014.”

How important are computers to human arrangements, and how large is the range of plausible computer-involved arrangements, and how much can those realized be changed? Should anyone besides programmers and enthusiasts care about software specifically, any more or less than they care about the conditions under which any tool is created and distributed? (Contrast with other tools would be good here, but I’ll leave for another time.)

The vast majority of people seem to treat software as any other tool — they want it to work as well as possible, and to be as cheap as possible, the only difference being that their intuitions about quality and cost of software may be worse than their intuitions for the quality and cost of, for example, bridges. Arguably nearly everyone has been and perhaps still is correct.

But one doesn’t need to be much of a futurist to see software getting much more important — organizations good at using software ‘eating’ the lunches of those less good at using software, software embedded in everything or designing everything (and anything else being obsolete), regulating and mediating every sort of arrangement — with lots of plausible variation as to how this happens.

Now the doubt: does future-motivated interest in software freedom share more with interest in science fiction (i.e., moralistic fantasy) or with interest in future studies and the many parts of various social sciences that aim to improve systems going forward in addition to understanding current and past ones? If the latter, why is software freedom ignored by all of these fields? Possibly most people who do think software is becoming very important are not convinced that software freedom is an important dimension to consider. If so (I would love to see some kind of a review on the matter) it would be most reasonable to follow the academic consensus (even if it is one of omission; that consensus being that software freedom is not interesting or important enough to investigate) and, if one cares about the ethical dimensions of software, focus instead on the ones the consensus says are important.

Two additional posts last year in which I claim software freedom is of outsized and underappreciated importance (of course I don’t usually restrict myself to only software, but consider software a large and growing part of knowledge embodying cumulative innovation, and of the knowledge economy leading to more such accumulation) and some of many from years past (2006, 2006, 2007, 2007). The first from 2006 highlights the most obvious problem with the future. I had forgotten about that post when mentioning displacement of movies by some other form as the height of culture in 2013 — one has to squint to see such displacement even beginning yet. The second isn’t about the future but is closely related: alternative history.

Uncritical Cheering

I feared that many of my posts last year were uncritical cheering (see critical cheering above and last year). Looking back at posts where I’m promoting something, I have usually included or at least hinted at some amount of criticism (e.g., 1, 2). I don’t feel too bad. But know that most of the things I promote on my blog are very likely to fail or otherwise be inconsequential — if they were sufficiently mainstream and established they’d be sufficiently covered elsewhere, and I likely wouldn’t bother blogging about them.

One followup: I cheered the publication of the first formally peer-reviewed and edited Wikipedia article in Open Medicine — a journal which has since ceased publishing.

Freeway 980

I continue to blog about removing freeway 980, which cuts through the oldest parts of Oakland. Doubt: I don’t know whether full removal would be better (at least when considering feasibility) than capping the portion of 980 which is below grade. I intended to read about freeway capping, come to some informed opinion, and blog about it. I have not, but supposedly Oakland mayor Libby Schaaf has mentioned removing 980. Hopefully that will spur much more qualified people to publish analyses of various options for my reading pleasure. ConnectOakland is a website dedicated to one removal/fill scenario.

Politics

I’m satisfied enough with the doubt in my two posts about Mozilla’s leadership debacle, but I’ll note apparent tension between fostering ideological diversity and shunning people who would deny some people basic freedoms. I don’t think this one was fairly clear cut, but there are doubtless far more difficult cases in the world.

Refutation

I fell further behind, producing no new dedicated collections of refutations of my 8+ year old posts. My very next post will be one, but as with previous such posts, the refutations will be cheap — flippant rather than drilling down on doubts I may have gained over the years. Again these observations (late, cheap) are what led me last year to initiate a thematic doubt post covering the immediately previous year. How was this one?

In the report the Committee makes a series of important recommendations which will influence the future digital skills of the UK. Perhaps chief among these is the recommendation to “define the internet as a utility service, available for all to access and use.” This would place access to the internet on an equal footing with access to water and energy and is an acknowledgement of just how fundamental the internet is to our modern way of life.

The report makes a number of recommendations and statements that are of particular interest to Wikimedia UK. The countries that ranked above the UK in a recent digital study had all “invested heavily in digital ‘foundations’, including up-skilling the population in technical expertise and digital capability…” These skills are extremely important and we believe that the use of Wikipedia and other open knowledge projects as both teaching and learning tools can offer great benefits to digital literacy.

The Wikimedia movement globally is making great strides towards the acceptance and appreciation of Wikipedia as a learning and teaching tool. Countries such as Israel, Serbia and Sweden are taking advantage of the capacity and scale of the free encyclopedia in creative ways within their education systems. The UK should likewise adopt the use of Wikipedia and other open knowledge projects.

One point from the report indicates a key, and timely, shift – effectively, the “3Rs” will become “3Rs and a D”. Explicitly stated in objective four of the report: “No child leaves the education system without basic numeracy, literacy and digital literacy.” This is a very welcome development. The use of Wikipedia in formal education settings from Key Stage 4 onwards could make a significant contribution to not only digital literacy skills but to core life skills such as critical thinking.

Many universities are reporting that undergraduates arrive with gaps not only in digital skills but in critical thinking and key information literacy too. We are currently exploring a number of ways in which using the Wikimedia projects can bridge these crucial skills gaps effectively. We welcome thoughts on this so please get in touch if you’d like to be involved.

The report calls for a single “Digital Agenda” within government following recent initiatives looking at developing greater digital democracy and the use of digital tools to replace civil courts (digital justice?). There needs to be a great deal of joined up thinking and shared ownership of work within government and the public sector for these initiatives to be introduced effectively and efficiently into one distinct package to support digital citizenship. The voluntary sector has a significant role to play in this.

If you would like to learn more about why Wikipedia belongs in education, please contact our education organiser, Dr Toni Sant – toni.sant@wikimedia.org.uk

When 25,000 books from the early days, English texts from 1473-1700, become available, it is quite something. Many of these texts are the earliest sources on many subjects in English.

All of them deserve to be registered in Wikidata. The most relevant question would be: how do we serve our public best? Yes, it starts with indicating that these books exist, but it is easy enough to point people in the right direction: the direction where these books can be found to be read.

It seems obvious. When books are (finally) available under a free license, it is important for people to find them.
Thanks, GerardM

February 16, 2015

Mr Tlili is a Tunisian politician who died. What is refreshing is that there is at least one decent list of members of the current parliament and, as is fitting, it is in French. Without the assistance of Google Translate the articles are too difficult for me.

There is also a category, and it has a problem. It links the current members in French to every member of the Tunisian parliament. From a Wikidata point of view that is fatally flawed. It is however part and parcel of a category of subjects that is underdeveloped. Our Wikiverse does not really care about Farfarawayistan. Its problem is seen as the diversity that is in genders and, while important, it easily ignores what is far, far away. As you can see in the picture, there are a fair bunch of women in the Tunisian parliament.

Even people who research are interested in diversity. They want to know how diversity differs in different languages. Those different languages mean different cultures, cultures that by and large are not really well known in our Wikipedias as they are far, far away. Consequently Wikidata does not serve them the data they need.

I am happy with the Tunisian list. It means that Tunisia is no longer as far, far away.
Thanks, GerardM

From the University of Southern Indiana’s Intro to Mass Communication course, taught by Dr. Chad Tew, this week we’re sharing Wikipedia articles, created or expanded by students, and about journalists killed while reporting.

I have received an email from Lluís Madurell, a Catalonian Wikipedia editor, and reproduce it here (verbatim, but with links added) with his kind permission:
My name is Lluís Madurell (U:Lluis_tgn). I am a Catalan Wikimedian from 2009 and member of Amical Wikimedia.
In november Lorena Tomás contacted Amical Wikimedia. She said that you approached them to inform about the project you are running on the translation of articles about key chemistry matters from English into Catalan.
Lorena was willing to participate throught ICIQ (Institute of Chemical Research in Catalonia). ICIQ contacted Kippelboy and Kippel contacted me because this chemistry research center is in Tarragona (city 100km south of Barcelona), I am from this city and I already do diferent wiki-projects in my city. So we started mailing, we met in December and I created this Wikiproject page (https://ca.wikipedia.org/wiki/Viquiprojecte:ICIQ). ICIQ started the promotion of this project to all of his reserachers and this wednesday I did a small presentation about Wikipedia and this project to the 7 researchers that show up to the meeting (https://commons.wikimedia.org/wiki/File:Introduction_about_Wikipedia_at_ICIQ_by_Amical_Wikimedia.jpg). So the project is starting.
All of this maybe useful for you If you are stll a WiR in Royal Society of Chemistry, maybe not.
Thanks for openning this oportunity for us, your WiR got International :)
Best regards.
Working with my Wikipedia friends overseas, to develop material in multiple languages, is one of the things I enjoy most about contributing to Wikipedia.

Asking Ever Bigger Questions with Wikidata

A New Era

Simultaneous discovery can sometimes be considered an indication of a paradigm shift in knowledge, and last month Magnus Manske and I seem to have had a very similar idea at the same time. Our ideas were to look at gender statistics in Wikidata and to slice them up by date of birth, citizenship, and language. (Magnus’ blog post, and my own.) At first it seems like quite elementary and naïve analysis, especially 14 years into Wikipedia, but only within the last year has this type of research become feasible. Like a baby taking its first steps, Wikidata and its tools ecosystem are maturing. That challenges us to creatively use the data in front of us.

To do biography analysis before Wikidata was much harder. To know the gender of an article you’d resort to natural language processing or hacks like counting gendered categories and guessing based on first name. Even more, the effort had to be duplicated for each language that had to be translated. Now the promise of language-free semantic data, and tools like Wikidata Query and Wikidata Toolkit, are here. The process is easier because it is more database-like: select, group by, apply, and combine.
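
The select, group by, apply, and combine steps can be sketched in a few lines of Python. This is only a minimal illustration, assuming a handful of made-up records in place of a real Wikidata export; the field names (`gender`, `citizenship`, `birth_year`) and the `female_share_by` helper are hypothetical and are not part of Wikidata Query or Wikidata Toolkit.

```python
from collections import Counter, defaultdict

# Hypothetical, hand-made records standing in for biography items.
# A real analysis would read these from a Wikidata dump or query result.
people = [
    {"gender": "female", "citizenship": "Tunisia", "birth_year": 1950},
    {"gender": "male", "citizenship": "Tunisia", "birth_year": 1948},
    {"gender": "male", "citizenship": "India", "birth_year": 1972},
    {"gender": "female", "citizenship": "India", "birth_year": 1980},
    {"gender": "male", "citizenship": "India", "birth_year": None},
    {"gender": None, "citizenship": "India", "birth_year": 1990},
]


def female_share_by(records, key):
    """Return {group: share of records in that group whose gender is female}."""
    # Select: keep only records that state both a gender and the grouping key.
    selected = [r for r in records if r["gender"] and r[key] is not None]

    # Group by: count genders within each value of the key.
    groups = defaultdict(Counter)
    for r in selected:
        groups[r[key]][r["gender"]] += 1

    # Apply and combine: compute one ratio per group and gather them in a dict.
    return {
        group: counts["female"] / sum(counts.values())
        for group, counts in groups.items()
    }


shares = female_share_by(people, "citizenship")
```

Passing `"birth_year"` instead of `"citizenship"` slices the same records by date of birth; records missing either value are simply left out of the counts.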

With this new simplicity, let’s review what we have imagined so far. Here’s a non-exhaustive introduction to the state of creative question-asking so far:

Pushing Ourselves to Think Even Bigger

Can we think even bigger if we use more of the available data? Thinking about the fact that every claim may have an attached reference, Markus Krötzsch always wants to know: for a given set of claims, what references must be believed in order to believe the set of claims? With that notion we could look at all the claims associated with all the items of a given language, and thus the required belief system of that language. At this point we could ask: what are the differences in the belief systems of any two languages?
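
One very rough way to picture this question in code: if each claim lists the references attached to it, then the references one must accept for a set of claims are, at their simplest, the union of those lists, and two languages can be compared by set difference. The sketch below assumes exactly that simplification (every attached reference is required), and the claim and reference names are invented placeholders rather than real Wikidata identifiers.

```python
# Hypothetical claims per language: each claim maps to the references attached to it.
claims_by_language = {
    "en": {
        "claim-1": {"ref-A", "ref-B"},
        "claim-2": {"ref-B"},
    },
    "hi": {
        "claim-1": {"ref-A"},
        "claim-3": {"ref-C"},
    },
}


def required_references(claims):
    """Union of all references attached to the given claims.

    Simplifying assumption: believing a claim means believing every
    reference attached to it.
    """
    required = set()
    for refs in claims.values():
        required |= refs
    return required


def belief_difference(lang_a, lang_b):
    """References required by lang_a's claims but not by lang_b's."""
    refs_a = required_references(claims_by_language[lang_a])
    refs_b = required_references(claims_by_language[lang_b])
    return refs_a - refs_b


only_en = belief_difference("en", "hi")
only_hi = belief_difference("hi", "en")
```

If a claim could be supported by any one of several references rather than all of them, the union would overstate what must be believed, and the question becomes closer to choosing a smallest covering set.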

Another way we could test the fundamental principles of knowledge and culture is to consider the chains made by the subclass of, instance of, or cause of properties. Every language is present at different links of each chain. So we can look at the differences in ways in which languages organize a hierarchy of concepts – or if they think it’s a hierarchy at all.

Much fun for logicians and epistemologists. But we can also ask more socially important questions, questions about how language and society relate. What biases do we have that we aren’t even aware of? The method, for which I’ve proposed a PhD, could be conducted as follows. We’re aware of sexism in our societies, and as you’ve seen we’ve started to build a statistical profile of how it manifests in Wikidata. Likewise we’re cognizant of racism and homophobia. We might next look at the rates at which people appear in Wikidata by race and desire. Let’s assume we could train a model to say that these kinds of distributions are types of social biases. Next we could search every property in Wikidata to see if it indicated social bias. If successful we may find overlooked stigmas and phobias in society.

I claim that our theoretical question-answering ability has paradigmatically shifted with the growing up of Wikidata. Soon enough you won’t even need to be a sophisticated programmer to whisper your questions into the system. So next time you’re reading, browsing, querying or displaying Wikidata, challenge yourself to think about how to analyse it too.

February 14, 2015

In the last 8-9 months of my life, I have gone through some of the most beautiful experiences of my life, which have helped me grow, personally as well as professionally. Now it's time to move on and implement those learnings in making something bigger.

Scrollback had given me a new identity. I probably got famous on social media like never before! I could never refer to it as just another organization I worked for and it will always hold the same place in my heart. But just like all good things should come to an end, this journey also needs to.

I am leaving Scrollback in a week's time. I am joining Red Hat and moving back to Pune. This was never an easy decision. I have always loved Scrollback, as a product, as an Open Source community and of course, most importantly, I have loved the work I had been doing and the people I had around me, to always guide, support and help me.

My career graph, if we plot it now, has a lot of ups and downs. I have worked as a PHP developer in the initial 6 months of my work life, before moving to Scrollback as a Technical Evangelist. Somehow, I couldn't do much evangelism here. With the need of the organization, I moved more into community management. But, well, no kidding, I am not getting any younger and need to start thinking of building one profile, in any one domain, which I can sustain for the rest of my life! When the Red Hat offer came across, I saw it as an opportunity to experiment with another new profile and see if this can be the one which can ultimately settle me down.

It's tough (well, I would like to believe impossible) to take the Scrollbacker out of me now... it's too deep in my blood. I will always keep contributing to Scrollback, in all possible ways.

I will be joining Red Hat as a Technical Writer. There are a lot of reasons behind my decision to join Red Hat, out of which I guess I have already told the most important one: to experiment with a different profile. Another big reason is the fact that this offer came from Red Hat! Red Hat has been my dream organization since my early days of college! This Open Source organization has a constant reputation of hiring really cool people, and giving the ultimate work environment, along with the required freedom both at work as well as to maintain the work-life balance. I had wanted to join the cool gang for years and finally got the chance now!

This paragraph is unnecessary. It's only for all those who have been hearing (or spreading) rumours around this decision of mine. So, if you are not one of them, feel free to skip it. My decision was not influenced by the fact that the offered job location was Pune. I love Pune as a city, I never disagree with that, but, had the job location been the remotest village in India (or any part of the world), things would not have changed. Also, I am not taking this job because my "believed" boyfriend is at the same office! Trust me, even if that was true, I don't take career decisions based on emotional influences!

Having said most of the things, just to let everyone know, I am going to continue all my Mozilla activities the way I have been. Rather, I will be taking up a few community responsibilities again, since I am moving back to Pune and I have worked way more closely with this community than the Bangalore community. I will also continue contributing to Scrollback as a volunteer, so in case you need to reach out to me for any Scrollback related queries, I am still available.

I find it astounding to learn that Creative Commons is in dire financial straits. As Wikimedians we are part of a world that is shaped by copyright law and the fight for free and fair licenses. When a crucial player like Creative Commons cannot take its role, it shows our weakness. It indicates that we are fighting a losing battle because our priorities are wrong.

Creative Commons deserves our support. We rely on Creative Commons.

It is one of those organisations that the WMF could do something special for. For instance a fund raiser on their behalf. <grin> WMF is good at that </grin> and in this way commit ourselves more to free and fair licenses.
Thanks, GerardM

In watching the Twitter storm about #stopwadhwa2015 I'm struck by two things, but first, some context—which as we will see, is problematic in Twitter discourse. Vivek Wadhwa is an American tech entrepreneur, columnist, pundit, and researcher. He's been a vocal advocate for more gender and racial diversity in technology and is a co-author of Innovating Women: The Changing Face of Technology. Upon his being quoted in Newsweek's "What Silicon Valley Thinks of Women", a number of women (roughly aligned with Geek Feminism) protested and asked that Wadhwa stop presuming to speak for them. (Even if the article's author, Nina Burleigh, was sympathetic to women, the graphic associated with it set the frame for controversy.) In response to this challenge, Wadhwa attempted to defend himself on Twitter. This led to further critique, best documented in Amelia Greenhall's post "Quiet, Ladies. @wadhwa is Speaking Now". Greenhall had been interviewed for a story on On the Media's TLDR podcast, but it was removed, leading to further antagonism.

I'm struck by how inappropriately people use Twitter, such as breaking up a longer missive into separate tweets or trying to have a sensible disagreement. Little that is nuanced or complex can be expressed in 140 characters. As I note in the forthcoming book "because comment is reactive, it's inherently contextual; yet, it's also hypotextual, shedding context with ease which leads to confusion and retorts of 'WTF?!?' in response." If you find yourself trying to have a challenging conversation, you're using the wrong medium. Twitter is best used for links, status updates, and the whimsical; people who try to use it for more will often find themselves frustrated—I admit I may be old school. (I recommend Jon Ronson's recent "How One Stupid Tweet Blew Up Justine Sacco’s Life" on this point.)

This case also exemplifies what happens when a well-intentioned male is confronted by Geek Feminism. The well intentioned may have been thanked and congratulated for his disposition and efforts in the past, but Geek Feminism says no cookie for you. As I discuss in "The Obligation to Know: From FAQ to Feminism 101," geekdom is culturally laden and one is expected to educate oneself on the rudiments. To avoid making stupid 101-type mistakes, one has to read up on the common problematic behaviors of misogynists and allies alike; this can be alienating (even for women) just as it is for the tech "newbie" who is told to go away and RTFM. As Kelly Ellis noted, the fact that Wadhwa presumes "nerds" can only be men is an egregious mistake for someone who supposedly studies the technology gender gap. It insults women because it further polices an identity they have claimed (sometimes at great expense) and, worse yet, it is clueless to this fact.

Update 2015-02-14: And all of this is assuming that Wadhwa is a well-intended but clueless newbie. As a friend reminded me, the critique is that he's opportunistically exploiting the issue (to the detriment of actual women) for his own advancement.

Privilege is often understood by way of metaphor. Like any tool, a given metaphor is apt for some tasks more so than others. Even so, “when you only have a hammer everything looks like a nail.” Hence, it’s worth considering the merits of metaphors related to privilege.

privilege as an invisible knapsack

Peggy McIntosh’s (1990) original metaphor for privilege was that it was “like an invisible weightless knapsack of special provisions.” This is useful in that it speaks to relative advantage and that privilege is often hidden or unseen.

privilege as the “lowest difficulty setting”

In the contemporary technology context, John Scalzi (2012) likened privilege to playing a game on the lowest difficulty setting: monsters are easier to kill and there’s more health and bonus packs. This not only translates McIntosh’s notion into the digital realm, but improves upon the knapsack insofar as it recognizes that you can still lose the game: there are still challenges to overcome, yet “The lowest difficulty setting is still the easiest setting to win on. The player who plays on the ‘Gay Minority Female’ setting? Hardcore.”

privilege and smoking

Tim Wise and Kim Case (2013) also speak to the non-deterministic character of privilege on outcomes; they liken it to smoking: many people smoke without getting cancer, but it is highly correlated with it. Similarly, whiteness is highly correlated with advantage relative to people of color.

privilege as an intergenerational relay race

McNamee and Miller’s (2004: 49) metaphor nicely accounts for the importance and perpetuation of social, economic, and cultural capital by likening privilege to an intergenerational relay race: "children born to wealthy parents start at or near the finish line, while children born into poverty start behind everyone else. Those who are born close to the finish line do not need any merit to get ahead. They already are ahead. The poorest of the poor, however, need to traverse the entire distance to get to the finish line on the basis of merit alone."

Finally, meritocracy is also spoken of by way of metaphor, often as a bubble. The meritorious are said to “rise to the top,” much like champagne bubbles. This naive notion of meritocracy is that there is an inherent quality of superiority which causes a person to rise through a transparent and fluid medium. Alan Fox (1956) and Alice Marwick (2013) have critiqued this metaphor.

Wise T and Case KA (2013) Pedagogy for the privileged: Addressing
inequality and injustice without shame or blame. In: Case KA (ed.),
Deconstructing Privilege: Teaching and Learning as Allies in the
Classroom, New York: Routledge.

Avner and Darya fell in love while touring Israel with other Wikipedians. Here they are at Mount Eitan. Photo by Deror Avi, freely licensed under CC-BY-SA 4.0

On Wikipedia’s 14th birthday, Avner proposed to me. It seemed very natural, as we both feel that Wikipedia’s community is family.

Avner and Darya first met at a Wikimedia Israel Meetup.

I remember meeting Avner for the first time at a Wikimedia Meetup with volunteers, hosted at the Wikimedia Israel office by Jan-Bart de Vreede, chair of the Wikimedia Foundation’s Board of Trustees. Wiki-Academy had just successfully completed its 6th conference and everyone was in good spirits. Avner came in late. He knew everyone in the room, except for a girl in red who caught his eye.

Avner asked around and learned that I was the Wikipedian in Residence at the National Library of Israel. So naturally, the next time he saw me at the library, he started a conversation. The first thing we talked about was a recent complaint the Library’s Reference Desk in Wikipedia had received. He was unaware that I was in charge of the Reference Desk and that I wasn’t very happy being criticized. So we kept our distance during the Hebrew Wikipedia birthday that summer.

Avner and Darya at the Hebrew Wikipedia’s 11th Birthday.

Not ready to give up hope, Avner wrote to me and we started corresponding. Soon after, we went out and in no time we realized this was the real thing! As we are both Wikipedians, we got involved together in many of the chapters’ activities, including Elef Millim, a ‘thousand words’ tour of Israeli landmarks (see photo above).

Finally, Avner decided it was time. He thought it over very carefully; I didn’t realize a thing. On Wikipedia’s 14th birthday, organized by Wikimedia Israel, Avner was to give a lecture titled “How to find love in Wikipedia and what do Wikipedians do when they are in love.” It was supposed to be a theoretical lecture on couples in Wikipedia. He even invited my brother to the lecture: I still didn’t figure out that something was up.

The lecture started, and Avner spoke about how he began writing in Wikipedia and how we met. Not exactly a theoretical lecture. :) Then, his last piece of advice for finding love was to dedicate an article to your beloved. He dedicated an entry on a well-known poem by Natan Alterman called ‘Eternal Meeting’. He then read a verse from that poem, looking straight at me, his voice trembling a bit:

You stormed in to me
I’ll forever play your tune
(Avner’s addition) Will you give me your hand in marriage …
‘Eternal Meeting’ by Natan Alterman

Avner and Darya kiss after his proposal.

Then Avner took out a ring. We embraced. I put the ring on my finger and we kissed, while the audience applauded. We were greeted with so much love from the Wikipedia community! We are incredibly happy to be able to share this moment with everyone. It was perfect!

Wikipedia is our passion. Naturally, we will be celebrating our honeymoon at Wikimania 2015!

3D Love: This mathematically-defined heart shape is one of the many ways that love is represented on Wikimedia sites. By Chiph588, CC0.

What do we know about love? What can we learn from Wikipedia and its sister sites?

For Valentine’s Day, we asked Wikimedians to share their favorite articles or images about love, from Wikipedia and sister projects.

Together, we collected a wide range of insightful articles, images, videos, sounds, quotes and websites on the many different ways this topic is represented in our wikis: from platonic to fraternal, divine or romantic love.

Here are some of our favorites, based on over 77 community recommendations, shared via email, social media and on Wikimedia sites this week.

Articles

Love
Good introduction to the many types of love, and how they vary between cultures and viewpoints. This article is well-written, factual, and nuanced, with helpful context. Did you know that a core concept of Confucianism is Ren (“benevolent love”, 仁), which focuses on duty, action and attitude in a relationship, rather than love itself?
Suggested by Karam Wajeeh Abutabaq (Facebook). Romeo and Juliet painting by Frank Dicksee, public domain.

Valentine’s Day
A well-researched overview of Valentine’s Day, how this holiday came about, and how it is celebrated in many world regions. Did you know that Saint Valentine’s Day marks the beginning of spring in some cultures? Or that some Islamic countries ban the sale of Valentine’s Day items?
Suggested by Fabrice Florin (WMF). Antique Valentine’s Day card is public domain.

Golden Rule
This article describes the ethic of reciprocity behind this maxim: “Treat others as one would like them to treat you.” Community member Lotje chose it because “you can pick in any language or religion,” and supporter Anika adds it can be practiced both “on-wiki and off-wiki.”
Suggested by Lotje. Golden rule image by Bernard d’Agesci, public domain.

Parvati
Pārvatī is the Hindu goddess of love, fertility and devotion. She represents the gentle and nurturing aspect of Hindu goddess Shakti. Community member Wyatt Brown adds: “Lord Shiva and his Parvati companion have been doing some pretty epic stuff together, for a very long time. :)”
Suggested by Wyatt Brown (Google+). Picture of Shiva and Parvati is public domain.

Images

Smallbones writes: “‘Love’s Messenger’ is a wonderful painting I found and uploaded three years ago in February while working with a previously unknown colleague on a series of articles on Pre-Raphaelite paintings. She was absolutely wonderful in helping me work through the series. (…) Finding a colleague like this is the greatest pleasure that I have working on Wikipedia. (…) Even though we still haven’t met, and there is no romantic love between us, this is my ‘love letter’ to P., and to all my great colleagues on Wikipedia.”
Suggested by Smallbones. Painting by Marie Spartali Stillman, public domain.

Multimedia

Created as a ‘Valentine from Wikipedia’, this video documents the creation of an article about the ‘Love Dart’ — along with candid reactions from people on the street. Did you know that some snails and slugs make a little arrow (or ‘love dart’) inside their bodies before mating? Wikipedia editor Susan Hewitt thinks this may have been the origin of Cupid’s arrow.
Suggested by Michael Guss. Video by Victor Grigas (WMF), licensed under CC-BY-SA 3.0. View it on YouTube.

American silent film directed by Cecil B. DeMille (1920). As described on Wikipedia, this 90-minute romantic comedy tells the story of Robert and Beth Gordon, who are married but share little. He runs into a young woman at a cabaret — and the Gordons are soon divorced. Over time, they are drawn back to each other — and they fall in love again. The end title claims that “a man would rather have his wife as a sweetheart than any other woman”: it invites women to always look their best and “learn when to forget that you’re his wife.”
Suggested by Geni. Film by Cecil B. DeMille is public domain. (Note this file may not play well on some browsers.)

Audio recording of Cole Porter’s “Let’s Do It”, performed by Linda November and Artie Schroeck.
Suggested by 98.114.44.226. Audio by Linda November and Artie Schroeck, ARTLIN Enterprises, CC-BY-SA-3.0. (Note this file may not play well on some browsers.)

Projects

Wikilove – The Encyclopedia of Love
Wikilove is a project dedicated to emotional bonds and love rituals. Founder Alexis de Maud’huy says the main idea of Wikilove is to “educate about emotional intelligence.” Did you know that he first created this project as a wedding gift to his wife? It has now grown to nearly a million pages and has become a valuable resource about emotional issues.
Suggested by Keegan (WMF) and Jessica Robell.

Feel the Love
A wonderful gallery of love images on the Signpost, created in response to our call for wiki content on this topic. Its creator, Pine, writes: “If anyone felt that there was too little love in Wikimedia, I hope that this gallery will change their minds!” Thanks, Pine, we feel the love, and it is much appreciated. :) Suggested by Fabrice Florin (WMF).

Love Wikiquotes
A rich collection of quotable statements about love. You can pick quotes in any language, on a wide variety of emotional states related to love. Suggested by Nemo.

To-morrow is Saint Valentine’s day,
All in the morning betime,
And I a maid at your window,
To be your Valentine.
Then up he rose, and donn’d his clothes,
And dupp’d the chamber-door;
Let in the maid, that out a maid
Never departed more.

Thanks for sharing the love!

Thanks to everyone who contributed to this great community-created love collection!

We are particularly grateful to Pine, Nemo, Fae and Geni, some of our most active participants, for their helpful contributions. Many thanks as well to all the folks who shared their suggestions on Facebook, Twitter and Google+.

Your collective suggestions broadened our perspectives about love, in all of its forms. Together, we found some really well-written, factual and nuanced articles, as well as many humorous, dramatic or beautiful images, which gave us a better understanding about love and why it matters.

What do you think about this curation experiment? Did you learn anything new? Should we do it again? If so, what themes should we focus on next? Please chime in with your ideas and suggestions in the comments below. We hope that collaborations like these can help us discover new ways to share useful information, combining the wikis, our blog and social media.

Thanks again for sharing the love — and happy Valentine’s Day to all Wikimedians!

Wiki Ed is looking for current translation and language courses to translate Wikipedia articles.

These assignments present students with an opportunity to put their skills into public service. Student editors get real-world experience in translating for a broader public while deepening an appreciation for cross-cultural understanding.

The assignment also taps into students’ intrinsic motivation to apply their skills beyond the classroom. They know the translation exercise they do will matter long after it’s graded.

How does a translation assignment work? It’s actually simple. As a source text, students find a high-quality article on their target-language Wikipedia. They check to see if the corresponding article is on their native-language Wikipedia. If it’s missing or short, they translate the article into their native language.

Here’s an example. An English-speaking student studying Spanish would find a Good Article on the Spanish Wikipedia. Then, they would add translated content to the corresponding article on the English Wikipedia.

By translating geographic and cultural articles from other parts of the world that are otherwise missing or incomplete, a student at an American or Canadian university can expand English Wikipedia’s diversity of content for millions of people.

You can get started right away. We can help you find articles to assign to students, or they can find them on their own. We’ll offer a flexible assignment outline, training materials for student editors, and support during the assignment.

This assignment is a rare opportunity to apply translation to a widely read source of information for people around the world. If you are interested in bringing this opportunity to your students, contact Ryan McGrady at ryan@wikiedu.org.

‘Impact’ is a perennial concern for organisations, including Wikimedia chapters. It means showing that what you’re up to makes a difference, in our case by contributing to free knowledge.

It’s a familiar topic if you’re a researcher, and it can affect whether you get funding. It’s one thing to be able to say that your article has appeared in a journal with a circulation of 10,000 copies, but that doesn’t necessarily show that it has influenced people. Ideally you want to see people talking about your research, sharing it with other people, and using it to inform their own work. This is often measured by counting how many times an article is cited in other publications, but that approach misses the likes of social media and newspapers. Altmetric.com measures the digital impact of articles, and recently announced that they are now including Wikipedia in their statistics.

This is a significant step. Wikipedia is the 6th most visited website in the world and receives about 500 million unique visitors every month. Not only is it one of our first sources of information in the digital age, it is read on an incredible scale. If your work is being used there, it is reaching far more people than would otherwise be possible.

So why is the inclusion of Wikipedia something to celebrate?

In short, it’s another step towards recognising the reach and importance of Wikipedia, and it might encourage academics to interact with it. Already groups are considering Wikipedia as part of their outreach work when applying for funding. The Atlas of Hillforts Project of Oxford University’s School of Archaeology specifically mentioned Wikipedia in terms of data dissemination and received £950,000 from the Arts and Humanities Research Council. One more incentive might help people get involved, and it creates a positive feedback loop: the better the quality of information on Wikipedia, the more likely academics will be to improve it.

Importantly, this move might help encourage open access. Researchers and academics generally understand conflict of interest issues, so the key way of making it more likely that Wikipedia will cite your work is to make it available to as wide an audience as possible through open access.

Overall, any initiative which might increase the quality of Wikipedia in the long run and improve its reputation is surely a good thing.

February 12, 2015

Most users of Wikipedia aren’t aware that Wikipedia, like other online services, has a support staff. Granted, it is an all-volunteer staff and it can sometimes take months to get an answer to your question because of the backlog that often exists. But it’s a free service, and no one will try to upsell you to Wikipedia’s premium version. Many who’ve used the service have been quite satisfied with the results.

That said, there are complicating matters and some ethical issues involved that will be discussed in a subsequent blog post. In this post I’d like to focus on how you can access the Wikipedia support system. The following screencast is designed to do just that.

Once you’ve submitted your email request, it is entered in a support ticket system (often referred to by Wikipedians as “OTRS,” the name of the software used to run the system). Your ticket will eventually be processed by a Wikipedia volunteer. All subsequent communication will be via email between you and the volunteer who, with any luck, will help you with your request.