Hey, have you ever had a look at Wiktionary? Did you ever come across a word like Mòcheno? Did you know how to pronounce it correctly, or were you just wondering? Let me tell you, recording a pronunciation and uploading it is a laborious, if not tedious, process: first, one needs recording software; then one has to record, save in the correct file format and upload the file to Wikimedia Commons, including preparing the file description page; and finally the recording has to be included in the Wiktionary entry. What would you say if you could do all this in your browser, without having to care about uploading, file description preparation, recording software and how these play together?

Unfortunately, this project was, like a lot of GSoC projects, not completed. Since IEG does not support extension work, I would like to fork the code and remove the Upload Wizard dependency: in modern browsers that support the audio API, only FormData is required for uploading, and none of Upload Wizard's user-interface elements are needed. In the end, the solution I am going to offer is a wizard-guided process for adding good pronunciations, driven by JavaScript.
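To illustrate how little is needed for the upload itself, here is a minimal sketch of the text fields for a MediaWiki "action=upload" request. The helper name buildUploadFields and the UploadRequest shape are hypothetical, not taken from the extension; the field names (action, format, filename, text, comment, token) are those of the MediaWiki action API.

```typescript
// Hypothetical helper (not from the extension): collects the text fields
// of a MediaWiki "action=upload" request. The binary audio data would be
// added separately under the "file" field when the FormData is built.
interface UploadRequest {
  filename: string;        // target file name on Commons
  descriptionText: string; // wikitext of the file description page
  comment: string;         // upload summary
  csrfToken: string;       // token obtained from the API beforehand
}

function buildUploadFields(req: UploadRequest): Record<string, string> {
  return {
    action: "upload",
    format: "json",
    filename: req.filename,
    text: req.descriptionText,
    comment: req.comment,
    token: req.csrfToken,
  };
}

// In a browser, the fields could then be copied into a FormData object:
//   const fd = new FormData();
//   Object.keys(fields).forEach((k) => fd.append(k, fields[k]));
//   fd.append("file", audioBlob, req.filename);
```

Keeping the field assembly separate from FormData makes this part testable without a browser.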

Learn: If user preferences are unknown to the tool, a learn page will be displayed first, telling the contributor a few words about the flow of this tool, best practices for recording and that it is going to request access to the microphone when proceeding (to check everything is installed correctly, we'll request access to your microphone ...)

Access and Preferences: If the contributor decides to proceed, the wizard will be greyed out and request access to the user's microphone. If successful, it continues to the next step and asks for preferences, including a section with a license picker and a section asking how the user would like to be attributed. When the user decides to proceed, these preferences will be saved to the Wiktionary user account. These two steps will be skipped when the contributor invokes the tool again in future. User accounts must not be shared, so it's okay to save the license selection.
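Saving the preferences to the user account could look roughly like the sketch below. It assumes the MediaWiki "action=options" API module and a "userjs-" prefixed option; the option name, the RecorderPreferences shape and both function names are hypothetical.

```typescript
// Assumed shape of what the preferences step collects.
interface RecorderPreferences {
  license: string;     // identifier chosen in the license picker
  attribution: string; // how the contributor wants to be credited
}

// Hypothetical option name; "userjs-" is the prefix MediaWiki reserves
// for options set by user scripts and gadgets.
const PREF_OPTION = "userjs-pronunciation-recording";

// Fields for an "action=options" request that stores the preferences as JSON.
function buildSaveOptionsFields(
  prefs: RecorderPreferences,
  csrfToken: string
): Record<string, string> {
  return {
    action: "options",
    format: "json",
    optionname: PREF_OPTION,
    optionvalue: JSON.stringify(prefs),
    token: csrfToken,
  };
}

// Returns null when nothing usable is stored, in which case the learn and
// preferences steps would be shown again.
function parseStoredPreferences(
  raw: string | null | undefined
): RecorderPreferences | null {
  if (!raw) return null;
  try {
    const value = JSON.parse(raw);
    if (
      value &&
      typeof value.license === "string" &&
      typeof value.attribution === "string"
    ) {
      return { license: value.license, attribution: value.attribution };
    }
  } catch (e) {
    // Malformed JSON is treated like missing preferences.
  }
  return null;
}
```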

Recording: Now a recording toolbar is shown with the word and IPA to speak. It allows recording multiple samples (listing them for playback) so that, in the end, the contributor can decide which one to submit. If it turns out that simple audio visualization (graph) and manipulation tools (cutting, volume level) are required, they could also be implemented.
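If visualization and volume tools turn out to be needed, the underlying arithmetic is small. The following is only a sketch, assuming the recorded sample is available as a mono Float32Array with values between -1 and 1; the function names are made up for illustration.

```typescript
// Largest absolute sample value in each of `buckets` equal-width slices,
// for example one slice per pixel column of a canvas graph.
function computePeaks(samples: Float32Array, buckets: number): number[] {
  const peaks: number[] = [];
  const size = Math.ceil(samples.length / buckets);
  for (let b = 0; b < buckets; b++) {
    const end = Math.min((b + 1) * size, samples.length);
    let peak = 0;
    for (let i = b * size; i < end; i++) {
      peak = Math.max(peak, Math.abs(samples[i]));
    }
    peaks.push(peak);
  }
  return peaks;
}

// Scales a copy of the samples so that the loudest one reaches `targetPeak`.
function normalize(samples: Float32Array, targetPeak: number): Float32Array {
  let peak = 0;
  for (let i = 0; i < samples.length; i++) {
    peak = Math.max(peak, Math.abs(samples[i]));
  }
  const out = new Float32Array(samples.length);
  if (peak === 0) return out; // silence stays silence
  const gain = targetPeak / peak;
  for (let i = 0; i < samples.length; i++) {
    out[i] = samples[i] * gain;
  }
  return out;
}
```

Cutting would then simply mean taking a sub-range of the same array before upload.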

Once the contributor has chosen the best sample, it is uploaded to Wikimedia Commons without having to leave the page. Upload progress will be shown.

As soon as the upload is completed, the Wiktionary entry is edited automatically, adding the audio sample. When successful, a thank-you or success message is shown, and either the page is reloaded, or the sound file is dynamically inserted at the correct position in the current page rendering, or the fresh page text is retrieved via the API (not sure yet whether the media player respects the mw.hook for new content).
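The automatic edit boils down to finding the right place in the entry's wikitext. The sketch below makes strong assumptions that will not hold on every Wiktionary: that the entry has a heading named "Pronunciation" and that pronunciation items are list lines starting with "*". The function name and the example audio line are hypothetical; each wiki's real layout and template would have to be configured.

```typescript
// Inserts `audioLine` after the last list item that directly follows the
// first "Pronunciation" heading. Returns null if no such heading exists,
// so the caller can fall back to asking the contributor.
function insertAudioLine(wikitext: string, audioLine: string): string | null {
  const lines = wikitext.split("\n");
  let headingIndex = -1;
  for (let i = 0; i < lines.length; i++) {
    if (/^=+\s*Pronunciation\s*=+\s*$/.test(lines[i])) {
      headingIndex = i;
      break;
    }
  }
  if (headingIndex === -1) return null;

  // Skip over list items already present in the section.
  let insertAt = headingIndex + 1;
  while (insertAt < lines.length && lines[insertAt].charAt(0) === "*") {
    insertAt++;
  }
  lines.splice(insertAt, 0, audioLine);
  return lines.join("\n");
}
```

Returning null instead of guessing keeps the tool from damaging entries whose structure it does not recognize.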

To enable Wiktionary users to use the tool as soon as possible, I intend to fork the extension, add the missing features and provide it as an RL module at Wikimedia Commons, together with instructions on how to load it from other wikis. Plan B, if the Commons community believes that there are license incompatibilities, is to host it on Tool Labs; if this turns out to be too unreliable, plan C is a dedicated Labs instance.

During software development, multiple feedback cycles are scheduled, allowing Wiktionary community members to actively steer the product in the right direction and ensuring that the software we produce fits their needs. Most funded time will be spent coding JavaScript, bringing the browser's audio API, FormData for file upload and canvas or SVG elements for audio visualization into play. The ultimate goal is the creation of a gadget. Rillke owns a Tool Labs account and is capable of using SSH tunnels, databases and SFTP, so the gadget could alternatively be hosted there. Usage metrics will be sent to, saved and evaluated at Tool Labs, respecting the privacy policy, of course. There are no intentions of paying external usability testers as long as we get sufficient useful feedback from the Wiktionary community.

Safari can be expected to support getUserMedia (the required API) in one of the next versions, as Apple hasn't announced the contrary. Internet Explorer will most likely not support the API used, because Microsoft declared that they would like to implement something different. If a large share of the users willing to add pronunciation recordings cannot do so due to missing browser support, we will consider authoring or adding a Flash Player shim, but we'll first try without it, as it means a lot of extra effort and promotes technologies that are not favourable. Rillke is going to ask Microsoft STC, if he is able to, how they suggest implementing audio recording in Internet Explorer.
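Before offering the recording step, the tool can check whether the browser exposes getUserMedia at all. A minimal sketch follows; it takes a navigator-like object so it can be tried outside a browser, and it checks the vendor-prefixed names browsers have used in addition to the unprefixed ones. The NavigatorLike type and the function name are illustrative only.

```typescript
// Only the properties this check looks at; a real `navigator` has many more.
interface NavigatorLike {
  mediaDevices?: { getUserMedia?: unknown };
  getUserMedia?: unknown;
  webkitGetUserMedia?: unknown;
  mozGetUserMedia?: unknown;
}

// True if any known spelling of getUserMedia is available.
function canRecordAudio(nav: NavigatorLike): boolean {
  if (nav.mediaDevices && typeof nav.mediaDevices.getUserMedia === "function") {
    return true;
  }
  return (
    typeof nav.getUserMedia === "function" ||
    typeof nav.webkitGetUserMedia === "function" ||
    typeof nav.mozGetUserMedia === "function"
  );
}
```

In the gadget this would be called with the global navigator object; a false result is the point where a fallback, or a hint to switch browsers, would be shown.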

Old browser testing, Wiktionary and Wikidata community feedback, bot assistance

EUR 200 (about $ 274.50)

EUR 200

Volunteers, Contest winners

Wikimedia merchandise for volunteers (WMF shop): providing volunteers who help organize community engagement (making the tool known, organizing a pronunciation rally, including authoring banners, awards, advertising, etc.) with small gifts, and providing them with merchandise items for contest winners (will be requested on demand). Please consider the number of Wiktionaries when evaluating the amount requested here.

By easing the creation, uploading and inclusion process, the gadget offers substantial benefits for Wiktionary users creating pronunciation samples. Visitors of Wiktionary, especially non-native speakers, will hopefully get more audio samples, and Wikimedia Commons community members profit from getting perfect file description pages. The open source world and MediaWiki extension development may, depending on the IEG-FDC's decision, get positive impulses from the gadget development.

Pronunciation Recording Gadget will be able to encode to Ogg-Opus (container: Ogg; codec: Opus) client-side with reasonable performance (speed and quality are promising). This is achieved by a port of opus-tools/opusenc to JavaScript using Emscripten. Whether this capability is used for sample lengths of a few seconds depends on several aspects, such as metadata support (MediaWiki, Ogg/Vorbis container), how well browsers cope with the 1 MiB JavaScript encoder chunk, and Wikimedia Ops support for transcoding Opus-encoded files in Timed Media Handler (which provides differently encoded files for different browsers).

Without a FLAC or Opus encoder, it's not suitable for recording full Wikipedia articles if the resulting audio would be longer than 5 minutes. If you would like to write a JavaScript FLAC/Opus encoder for me, please comment below. It will be a huge pile of work (approx. 5 months with 10-20 hrs/week of coding). If you find a JavaScript FLAC encoder, just point me in its direction and things will be a lot easier. I know there is speex.js, but it appears to cut the input at the beginning and the quality is not that great.
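The length limit follows from the size of uncompressed audio. As a rough estimate, assuming 16-bit mono PCM at 44100 Hz in a WAVE file with the usual 44-byte header (the recording parameters are an assumption, not a decision made in this proposal):

```typescript
// Size in bytes of an uncompressed PCM WAVE file of the given duration.
// The defaults are assumed recording parameters, not fixed by the proposal.
function uncompressedWavBytes(
  seconds: number,
  sampleRate: number = 44100,
  channels: number = 1,
  bitsPerSample: number = 16
): number {
  const headerBytes = 44;
  return headerBytes + seconds * sampleRate * channels * (bitsPerSample / 8);
}

const singleWord = uncompressedWavBytes(2);    // 176444 bytes, about 172 KiB
const fiveMinutes = uncompressedWavBytes(300); // 26460044 bytes, about 25 MiB
```

A two-second word is small enough to upload as is, while five minutes already means roughly 25 MiB per take under these assumptions.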

There are parties interested in getting Opus encoder support, the W3C has made a draft allowing clients to request audio samples in a format of their choice, and Firefox on Windows even records to this format by default when using its MediaRecorder API. However, this limits the ability to edit the audio sample, as editing would have to be done prior to encoding. This could probably be achieved by creating a MediaStream from the recorded sample data, or an AudioBuffer in an AudioContext from the in-memory WAVE file. Both have the disadvantage that they are rendered in real time (thus the user has to wait for the complete recorded sample to be played).
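For reference, building the in-memory WAVE file from raw samples is straightforward. This is a sketch under assumptions (mono input as a Float32Array in the range -1 to 1, 16-bit little-endian PCM output); the function name encodeWav is made up here.

```typescript
// Encodes mono float samples as a 16-bit PCM WAVE file held in memory.
function encodeWav(samples: Float32Array, sampleRate: number): ArrayBuffer {
  const bytesPerSample = 2;
  const dataSize = samples.length * bytesPerSample;
  const buffer = new ArrayBuffer(44 + dataSize);
  const view = new DataView(buffer);

  const writeAscii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };

  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + dataSize, true);                // size of the rest of the file
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true);                          // size of the fmt chunk
  view.setUint16(20, 1, true);                           // format 1 = PCM
  view.setUint16(22, 1, true);                           // one channel
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * bytesPerSample, true); // bytes per second
  view.setUint16(32, bytesPerSample, true);              // block align
  view.setUint16(34, 16, true);                          // bits per sample
  writeAscii(36, "data");
  view.setUint32(40, dataSize, true);

  for (let i = 0; i < samples.length; i++) {
    // Clamp to [-1, 1], then scale to the signed 16-bit range.
    const s = Math.max(-1, Math.min(1, samples[i]));
    const scaled = s < 0 ? s * 0x8000 : s * 0x7fff;
    view.setInt16(44 + i * bytesPerSample, Math.round(scaled), true);
  }
  return buffer;
}
```

Because the result is a plain ArrayBuffer, it can be wrapped in a Blob for playback or upload, or handed to an encoder.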

First, I am going to send a message to the Grease pit and the equivalent discussion forums in other language Wiktionaries. I am also going to directly contact people who regularly upload pronunciations to Commons (either via IRC or on-wiki). I am looking for one or two Wiktionary community members interested in working as consultants and testers, supporting the creation of an awesome tool that really fits Wiktionary's needs. Depending solely on the goodwill of volunteers is too risky for me, but volunteer help is of course also welcome. Just add your name in the volunteers: section if you're willing to give me a hand. A project page and a feedback page will be created at Commons; Pronunciation Recording will offer a link to both of them. Shortly before completing Phase I and Phase II (cf. #Measures of success), I am going to ask for first feedback, either through the consultants or directly.

Offering this easy-to-use tool will enrich Wiktionary's editing tools, enhancing efficiency and therefore possibly increasing participation; it will also improve Wiktionary's quality for visitors if more spoken samples are available.

Code will be open source. Audio editing tools might be useful for other projects, beyond the bounds of the WMF as well.

I will optionally follow the MediaWiki JavaScript coding conventions here, so that some of the code that turns out to be useful can be backported to the extension. The code will be packed into modules, so, no matter what, some upload, recording and UI code, as well as icons, will be created that might be useful for other projects as well.

Let's be pragmatic here rather than writing a philosophical essay with great analysis: if the small number of community members I am going to contact are happy with my implementation and make use of the tool, then so am I, leaving aside the question of whether technical tools encourage participation or whether it is the climate in the community.

… but we're wondering how many users you expect to use the tool to upload sound files at the end of 6 months

First of all, I think purely focusing on user numbers without distinguishing which kind of users the PRG attracted (power users who e.g. recommend PRG to their mates, integrate it into Wiktionary's workflow; or users who only rarely contribute something; users who pay a lot of attention to do it correctly; users who are more sloppy about their actions) is not a suitable marker for success.

But it is a good start to develop a strategy for getting a clue about the impact. First we have to know:

How many active users are there in Wiktionaries? And what does the edit-count distribution look like? [graphs and numbers to be added]

How many Wiktionary users have a microphone or other suitable recording device at their disposal? [needs survey]

How many of them are inclined to record pronunciations? [needs survey]

How many of #3 will use PRG?

In numbers

We agreed in IRC that we are going to plot the number of pronunciation uploads and the number of users producing them versus time. Global usage is another interesting indicator we might want to measure.
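The numbers for such a plot can be derived from a plain list of upload events. Here is a minimal sketch, assuming each event carries an ISO 8601 timestamp (as the MediaWiki API returns them) and a user name; the types and the function name are hypothetical.

```typescript
interface UploadEvent {
  timestamp: string; // ISO 8601, e.g. "2014-03-15T14:52:00Z"
  user: string;
}

interface MonthlyStats {
  uploads: number; // pronunciation uploads in that month
  users: number;   // distinct users who produced them
}

// Groups upload events by calendar month ("YYYY-MM").
function monthlyStats(events: UploadEvent[]): Record<string, MonthlyStats> {
  const result: Record<string, MonthlyStats> = {};
  const seenUsers: Record<string, Record<string, boolean>> = {};
  for (const event of events) {
    const month = event.timestamp.slice(0, 7);
    if (!result[month]) {
      result[month] = { uploads: 0, users: 0 };
      // No prototype, so user names such as "constructor" cannot collide
      // with inherited properties.
      seenUsers[month] = Object.create(null);
    }
    result[month].uploads++;
    if (!seenUsers[month][event.user]) {
      seenUsers[month][event.user] = true;
      result[month].users++;
    }
  }
  return result;
}
```

Each key of the result is one point on the time axis, with two series: uploads and users.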

Measuring use

As outlined by Infovarius, this can be done by adding a template to the file description page and then querying templatelinks.

Or, and this allows more accurate tracking, by a service running on Tool Labs to which additional information like account age, user groups and edit count is submitted.
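For the template-based variant, the pages transcluding a tracking template can also be listed through the API rather than the database. A sketch, assuming the action API's "list=embeddedin" query restricted to the File namespace (6); the template title used in the example is hypothetical, and the function name is made up.

```typescript
// Builds an API URL that lists file pages transcluding `templateTitle`.
function buildTrackingQuery(
  apiBase: string,
  templateTitle: string,
  limit: number = 500
): string {
  const params: Array<[string, string]> = [
    ["action", "query"],
    ["format", "json"],
    ["list", "embeddedin"],
    ["eititle", templateTitle],
    ["einamespace", "6"], // File namespace
    ["eilimit", String(limit)],
  ];
  const query = params
    .map(([key, value]) => encodeURIComponent(key) + "=" + encodeURIComponent(value))
    .join("&");
  return apiBase + "?" + query;
}

// Example with a hypothetical tracking template name:
const trackingUrl = buildTrackingQuery(
  "https://commons.wikimedia.org/w/api.php",
  "Template:Recorded with PRG",
  50
);
```

Results beyond the limit would need the API's continuation mechanism, which is left out of this sketch.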

Rillke, as a community administrator at Wikimedia Commons mainly caring about technical aspects, has gathered a lot of experience in JavaScript coding over the last 2 years, which led to the development of several tools for both less experienced ([1]) and more experienced ([2], [3]) users, including upload implementations. Rillke has furthermore contributed several small fixes to, for example, Upload Wizard.

Ungoliant MMDCCLXIV. English Wiktionary member since January 2011 and administrator since August 2012. Made over 30000 contributions in that project. Has some experience with programming but little with JavaScript.

Infovarius. Russian Wiktionary member and administrator. Made over 40000 personal and 400000 automated edits in that particular project. Runs self-programmed bot for arbitrary tasks. Also planning to provide compatibility with Wikidata.

Due to an unforeseeable lack of time, I was unable to talk to the community in time. I therefore withdraw this proposal. Let's see how the work on the WMF side progresses in the next half year. If it doesn't, I am probably going to re-launch this proposal. In the meanwhile I have more time for familiarizing myself with Wiktionary. Thanks for all the comments at the talk page. -- Rillke (talk) 17:58, 22 October 2013 (UTC)

Ok, Rillke - sorry to see you withdrawing this time, hope you'll resubmit in a future round! Siko (WMF) (talk) 16:17, 28 October 2013 (UTC)

There was zero progress after I withdrew the request, only some automated substitutions and i18n updates: Commits to the extension: Last real codebase update was on 2013-09-23. Thus, there is still a need for an actually working solution that Wiktionary users could benefit from. -- Rillke (talk) 08:13, 6 March 2014 (UTC)

First of all: I think this tool has much potential and could reduce the amount of work necessary to create pronunciation recordings to a minimum. It's a great idea! I created a few recordings for de.wiktionary.org and at first I had a few problems figuring out which steps are necessary to get a proper recording into Wiktionary.
I am not familiar with programming and things like that but I have two notes regarding this tool:

1. To ensure a standard in audio quality it would be great to connect this tool to Auphonic.com. This Austrian company runs a few algorithms on audio files and is able to improve the quality of a file. It's maybe even more important that they standardize the volume and metadata of a recording. They offer an API but I'm not sure if this tool and Auphonic could work together properly.

2. In my case the pronunciation recordings are a byproduct of a podcast I produce. When I upload my recordings to Commons, they are already done. It would be nice to use those finished recordings in the same way I could create a new one with this tool. My recommendation is to add a context menu for uploading existing files in the recording section. That would still improve my workflow a lot, because I would not have to upload my files to Commons and add the recording at de.wiktionary.org separately. Also, I could use my favorite DAW to alter the audio as I want to. I know this tool is meant for more inexperienced users, but it could make the lives of more sophisticated users easier, too.

1. If auphonic.com is inclined to provide us with free API access for at least the next 5 years, I will be inclined to consider building on top of their services. I am going to ask them. As this would involve communication with external servers, it would be an extra button or setting that a user has to explicitly opt in to, agreeing to their terms of service. They would also have to agree to keep the same standards as the WMF has with regard to data protection etc.

In case this fails, I can imagine a custom implementation of the Adaptive Leveler without having to rely on third-party services. De-noise seems to be more complicated.

We'll mention some tricks to ensure a minimum of quality in the first step. I imagine that these tricks will be fetched from a page maintained by the community, enabling Wiktionary contributors to adjust their demand for a specific level of quality. If that page does not exist in a Wiktionary, a default will be displayed.

2. Yes, this sounds like a great idea. This will be implemented if the proposal is selected. Maybe not as a context menu, but there will be a way to insert recordings from the local file system, e.g. with drag and drop into the recording window. Depending on the input format and browser, playback might not be available for these kinds of samples before they're uploaded. -- Rillke (talk) 14:52, 15 March 2014 (UTC)

Currently, I am using Shtooka Recorder, semi-automated patching, and uploading via PyWikiBot - working but complicated to set up. Good to have a better approach.

Quickly redoing and then being able to choose is imho the best approach to a 1st step of quality assurance of recordings.

Should be working from outside Wiktionaries as well. Commons, Wikidata, Wikipedia all have demands, too. Users should be able to choose the target wiki for their uploads as per an installation option of the wiki, Commons being the default.

As a user-selectable option, having both an oscillogram and a spectrogram shown per recording would be nice. Could be added later as well, since computing them is not trivial. Praat has free open-source code doing that.

Doing entire sentences or phrases should be possible, they are needed.

Do you think this project should be selected for an Individual Engagement Grant? Please add your name and rationale for endorsing this project in the list below. Other feedback, questions or concerns from community members are also highly valued, but please post them on the talk page of this proposal.

Community member: add your name and rationale here.

I think this would be a very valuable project. In my experience, Wiktionary contributors tend not to be very vocal when it comes to the technical side of things, so I think the silence here should not be seen as a concern. This, that and the other (talk) 05:18, 8 March 2014 (UTC)

The main principle of the wiki is that everybody makes his contribution to it. It worked well for the definitions, the proof being that we now have +2.5M pages. However, a lot of pronunciations are still missing to improve the quality of our projects. This is a great idea we have here, and Rillke seems quite motivated to develop it. Let's give him the means to achieve it. -- Quentinv57(talk) 14:13, 8 March 2014 (UTC)

Nothing special to say, just a message to strongly support your work. I do not wish to become a consultant (mainly due to a lack of time) but I will be glad to test your extension. Pamputt (talk) 12:38, 9 March 2014 (UTC)

Based on the demo video, I think this would be a very useful gadget. Wiktionary would benefit from having more pronunciation info. Many people are unfamiliar with pronunciation transcription schemes, but most people are able to speak their native languages aloud without difficulty. A gadget like this would make it easy for them to record themselves speaking and uploading the files. Wiktionary also covers placenames and surnames, and if users uploaded pronunciations of these, Wikipedia would also benefit (en.WP's article on The Hague already includes the same audio file of the city's Dutch name that Wiktionary has, similar audio, though users might initially add it to WP, could then be copied to WP). PS, I agree with This, that and the other's assessment of Wiktionary as rather taciturn. :b -sche (talk) 00:03, 14 March 2014 (UTC)

I support this as well. Wiktionary has a number of active editors from various backgrounds. I probably won't record a lot in Russian (my native tongue) but I can do it on request and will probably request others to record words in other languages I'd like to have recording for. Many languages lack audio recordings altogether. This could also be used in various linguistic discussions (which lead to decisions about the provided information), currently people can only use IPA or someone else's recordings. --Anatoli (talk) 01:06, 14 March 2014 (UTC)

This tool will encourage me to record Armenian pronunciations for all Armenian words I create on Wiktionary—something I don't do now because of the complexity of uploading. --Vahagn Petrosyan (talk) 06:58, 14 March 2014 (UTC)

The project sounds good to me. I hope that many users will participate. The realization seems to be very user-friendly. Keep up the good work! Best regards --Yoursmile (talk) 12:59, 15 March 2014 (UTC)

This tool would simplify the process of creating pronunciation recordings to a level that's more reasonable for unexperienced users and could - if it's done properly - secure a minimum standard of quality. The time and effort which are necessary to create a recording are too high at the moment. There are too many steps and work that has to be done manually, which could be done by software. This project could solve those problems. --LarsvonSpeck (talk) 13:47, 15 March 2014 (UTC)

I would like to add audio files of English words to Wiktionary, but I gave up on learning the recording and uploading process long ago. I would add IPA transcriptions to more words, but I often don't know how to accurately transcribe words in English, despite being a native speaker. I think this new method would be extremely beneficial to language learners. I would love to be able to add recordings of terms that can't be found in any other dictionary. Ultimateria (talk) 20:41, 22 March 2014 (UTC)

I'd certainly welcome this. Hopefully it would be able to remember metadata about the speaker (e.g. their regional accent) to save re-entering it. I am curious if this would also work on mobile browsers or if recording on devices would require a separate app? (iOS doesn't even have ogg playback support yet, so I'm guessing not on iOS) —Pengo (talk) 04:41, 25 March 2014 (UTC)

Yeah, mobile would be awesome but currently only BlackBerry Browser 10 implements the Web-APIs necessary (so I would even lack test devices) and gadgets cannot be deployed to mobile targets easily; the only way getting community JavaScript executed seems to be MediaWiki:Mobile.js. This development will focus on desktop devices first and I'll, if time permits, provide tools for collecting metrics about users on mobile willing to record pronunciation. The issue is that coding an app, even when using cross-platform-helpers like PhoneGap, for every single device type is becoming quite time-expensive. Nonetheless, I'll give it a try. -- Rillke (talk) 08:29, 25 March 2014 (UTC)

This is a great idea. It's only the hassle of uploading audio files that's put me off hitherto; I regularly add IPA. This gadget will probably lead to adding audio pronunciations becoming part of my routine when creating entries. I'm so meta even this acronym (talk) 18:13, 25 March 2014 (UTC)

I am confident that Rillke has the skills to successfully complete this proposal. I also think it would be quite useful. Bawolff (talk) 17:10, 27 March 2014 (UTC)

Simply brilliant! I think I've recorded a few hundred sound samples, especially of single words and the like. Going through the normal uploading process is tedious business, so this gadget would be such a boon to the project. I would just like to stress that the audio templates that link the files have to be formatted to have proper links to the Commons file page, not just a link to the file itself. This has been a problem I've noticed on English Wikipedia. Also, it would be great if it suggested standard formats for language prefixes, like the "en-" for English or "de-" for German. Keep up the good work! Peter Isotalo (talk) 17:42, 27 March 2014 (UTC)

This tool would be a great help. It would simplify the process of creating Hungarian pronunciation recordings for the English Wiktionary. --Panda10 (talk) 21:49, 27 March 2014 (UTC)

Rillke has clearly looked at the existing progress, and prepared for this round. His JavaScript knowledge makes him a good candidate to pick this up. Some of the software he's worked on (e.g. commons:MediaWiki:EnhancedStash.js) is in related areas of the API. Superm401 | Talk 09:27, 29 March 2014 (UTC)

I think this project could be valuable. I used to record pronunciations and it brings several problems. However, this proposal doesn't solve the problems faced by people who do mass pronunciation recordings; it comes with a new solution for how to enrich Wiktionary or Wikiversity with audio recordings. I think having an easy way to make such recordings directly on the "word" page would automatically attract more contributors.--Juandev (talk) 16:42, 30 March 2014 (UTC)

I tried the existing extension a few months ago, and really want to see more. With the endorsements from the devs above, I have full confidence in this candidate. Quiddity (talk) 23:00, 31 March 2014 (UTC)

I think this is an excellent idea, and I know that if this existed I would spend at least a few days just going through recording pronunciations for obscure English words I know. (Along with making a recording basically every time I look something up on Wiktionary that doesn't have one yet) Zellfaze (talk) 17:51, 2 April 2014 (UTC)

I support endorsement of this project. Pronunciation aids for Wikipedia and Wiktionary would be phenomenally helpful to me on a near-daily basis, and above editors have expressed considerable confidence in the coding skills of the applicants such that they should be able to carry out the project capably. Chubbles (talk) 07:25, 7 April 2014 (UTC)

Reposted from nl.wiktionary: Seems like a really, really good idea. The words in Dutch here have recently largely been added by someone with admirable stamina and diligence, after a long drought, because it used to be really tedious to upload to media commons. I do have a question about that. At Commons, files need to be uploaded as nl-something.ogg or so, but then alphabetized under 'something', otherwise everything ends up under N. Is this automatically done? Another thing is: can we upload sentences? We try to give an example sentence for every word here, and I have heard from people who want to learn Dutch that that is really useful. It could be made more so if the audio is there too. Again, this would require some adaptation at Commons, perhaps a different category than just the word category. It would really be nice if you could automate that. Jcwf (talk) 01:26, 8 April 2014 (UTC)

Thank you for re-posting here. I guess you are referring to the sort key? This one looks quite easy to implement. Yes, it will be automatically done.

Sentences are useful, sure. However, it would not only require a different category system at Commons but also a different inserting and detection mechanism for the Wiktionary entry. I have to check how feasible this is first, and possibly it does not fit in the time schedule set out above. Uploading and recording longer samples, let's say of about 7 seconds, should work the same as recording single words of 2 s length. Just the whole procedure around it will differ. In case I do not manage to produce sentence recording in round 1/2014, and given that the tool is going to be popular and used, I'll consider it for a renewal request. -- Rillke (talk) 08:30, 8 April 2014 (UTC)
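The sort key mentioned above can be derived from the file name once the language prefix is known. This is a minimal sketch; the function names are hypothetical, and it assumes the tool already knows the language code it put in front of the word. DEFAULTSORT is MediaWiki's standard magic word for setting a page's category sort key.

```typescript
// Strips the extension and the known language prefix from a file name,
// e.g. ("nl-something.ogg", "nl") gives "something".
function sortKeyFromFilename(filename: string, languagePrefix: string): string {
  const base = filename.replace(/\.[^.]+$/, "");
  const prefix = languagePrefix.toLowerCase() + "-";
  if (base.toLowerCase().indexOf(prefix) === 0) {
    return base.slice(prefix.length);
  }
  return base; // no prefix found: sort under the full name
}

// Wikitext line that would go on the file description page.
function defaultSortLine(filename: string, languagePrefix: string): string {
  return "{{DEFAULTSORT:" + sortKeyFromFilename(filename, languagePrefix) + "}}";
}
```

Passing the prefix explicitly avoids guessing it from the name, which would go wrong for hyphenated words.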

Endorse What a great project! Jane023 (talk) 20:41, 14 April 2014 (UTC)

Endorse. It's a pity the extension was never finished, but this seems like a great way to take it forward. I'm confident that Rillke has the required skills, and has obviously thought carefully about how to implement this. the wub"?!" 21:21, 19 April 2014 (UTC)

It could be the easiest way to obtain pronunciation files. --Dvdgmz (talk) 11:33, 21 April 2014 (UTC)

Support ! This is an awesome project. Yug (talk) 11:46, 1 October 2014 (UTC)

Support: Pronunciations are a real plus on Wiktionary projects and, given the difficulty of editing pages, such a tool is a godsend. -- Automatik (talk) 00:22, 12 December 2014 (UTC)

This tool would be very useful at Wikivoyage, for survival dictionaries. --Felipe (talk) 00:20, 29 April 2016 (UTC)