Tag: mercurial

One of the projects I’ve been really excited about recently is getting ESLint working for a lot of our JavaScript code. If you haven’t come across ESLint or linters in general before they are automated tools that scan your code and warn you about syntax errors. They can usually also be set up with a set of rules to enforce code styles and warn about potential bad practices. The devtools and Hello folks have been using eslint for a while already and Gijs asked why we weren’t doing this more generally. This struck a chord with me and a few others and so we’ve been spending some time over the past few weeks getting our in-tree support for ESLint to work more generally and fixing issues with browser and toolkit JavaScript code in particular to make them lintable.

One of the hardest challenges with this is that we have a lot of non-standard JavaScript in our tree. This includes things like preprocessing as well as JS features that either have since been standardized with a different syntax (generator functions for example) or have been dropped from the standardization path (array comprehensions). This is bad for developers as editors can’t make sense of our code and provide good syntax highlighting and refactoring support when it doesn’t match standard JavaScript. There are also plans to remove the non-standard JS features so we have to remove that stuff anyway.

So a lot of the work done so far has been removing all this non-standard stuff so that ESLint can pass with only a very small set of style rules defined. Soon we’ll start increasing the rules we check in browser and toolkit.

How do I lint?

From the command line this is simple. Make sure and run ./mach eslint --setup to install eslint and some related packages then just ./mach eslint <directory or file> to lint a specific area. You can also lint the entire tree. For now you may need to periodically run setup again as we add new dependencies, at some point we may make mach automatically detect that you need to re-run it.

Why lint?

Aside from just ensuring we have standard JavaScript in the tree linting offers a whole bunch of benefits.

Linting spots JavaScript syntax errors before you have to run your code. You can configure editors to run ESLint to tell you when something is broken so you can fix it even before you save.

Linting can be used to enforce the sorts of style rules that keep our code consistent. Imagine no more nit comments in code review forcing you to update your patch. You can fix all those before submitting and reviewers don’t have to waste time pointing them out.

Linting can catch real bugs. When we turned on one of the basic rules we found a problem in shipping code.

With only standard JS code to deal with we open the possibility of using advanced like AST transforms for refactoring (e.g. recast). This could be very useful for switching from Cu.import to ES6 modules.

ESLint in particular allows us to write custom rules for doing clever things like handling head.js imports for tests.

Mozreview is close to being able to lint your code and give review comments where things fail

Work is almost complete on a linting test that will turn orange on the tree when code fails to lint

What’s next?

I’m grateful to all those that have helped get things moving here but there is still more work to do. If you’re interested there’s really two ways you can help. We need to lint more files and we need to turn on more lint rules.

The .eslintignore file shows what files are currently ignored in the lint checks. Removing files and directories from that involves fixing JavaScript standards issues generally but every extra file we lint is a win for all the above reasons so it is valuable work. Also mostly straightforward once you get the hang of it, there are just a lot of files.

We also need to turn on more rules. We’ve got a rough list of the rules we want to turn on in browser and toolkit but as you might guess they aren’t on because they fail right now. Fixing up our JS to work with them is simple work but much appreciated. In some cases ESLint can also do the work for you!

ESLint becomes the most useful when you get warnings before even trying to land or get your code reviewed. You can add support to your code editor but not all editors support this so I’ve written a mercurial extension which gives you warnings any time you commit code that fails lint checks. It uses the same rules we run elsewhere. It doesn’t abort the commit, that would be annoying if you’re working on a feature branch but gives you a heads up about what needs to be fixed and where.

To install the extension add this to a hgrc file, I put it in the .hg/hgrc file of my mozilla-central clone rather than the global config.

After that anything that creates a commit, so that includes mq patches, will run any changed JS files through ESLint and show the results. If the file was already failing checks in a few places then you’ll still see those too, maybe you should fix them up too before sending your patch for review? 😉

The offending changeset that broke hgchanges yesterday turns out to be a merge from an ancient branch to current tip. That makes the diff insanely huge which is why things like hgweb were tripping over it. Kwierso point out that just ignoring those changesets would solve the problem. It’s not ideal but since in this case they aren’t useful changesets I’ve gone ahead and done that and so hgchanges is now updating again.

My handy tool for tracking changes to directories in the mozilla mercurial repositories is going to be broken for a little while. Unfortunately a particular changeset seems to be breaking things in ways I don’t have the time to fix right now. Specifically trying to download the raw patch for the changeset is causing hgweb to timeout. Short of finding time to debug and fix the problem my only solution is to wait until that patch is old enough that it no longer attempts to index it. That could take a week or so.

Nearly a year ago I showed off the first version of my webapp for displaying recent changes in mercurial repositories. I’ve heard directly from a number of people that use it but it’s had a few problems that I’ve spent some of my spare time trying to solve. I’m now pleased to say that a new version is up and running. I’m not sure it’s accurate to call this a rewrite since it was entirely incremental but when I look back over of the code changes there really isn’t much that wasn’t touched in some way or other. So what’s new? For you, a person using the site absolutely nothing! So what on earth have I been doing rewriting the code?

Hopefully I’ve been making the thing a whole lot more stable. Eagle eyed users may have noticed that periodically the site would stop updating, sometimes for days at a time. The site is split into two pieces. One is a django website to display the changes. The other is the important bit that reads the change data out of mercurial and puts it into a database that the website can use. I do this caching because it would be too expensive to work it out on the fly. The problem is that the process of reading the data out of mercurial meant keeping a local version of every repository on the server (some 5GB for the four currently tracked) and reading the changes using mercurial’s python API was slow and used a lot of memory. The shared hosting environment that I run this out of kept killing the process when it used too many resources. Normally it could recover but occasionally it would need a manual kick. Worse there is some mercurial bug where sometimes pulling a repository will end up with an inconsistent database. It’s rare so many people don’t see it but when you’re pulling repositories every ten minutes it starts happing every couple of weeks, again requiring manual work to fix the problem.

I did dabble with getting this into shape so I could run it on a mozilla server. PAAS seemed like the obvious option but after a lot of work it turned out that the database size restrictions there made this impossible. Since then I’ve seem so many horror stories about PAAS falling over that I think I’m glad I never got there. I also tried getting a dedicated server from Mozilla to run this on but they were quite rightly wary when I mentioned that the process used a bunch of memory and wanted harder figures before committing, unfortunately I never found out a way to give those figures.

The final solution has been ripping out the mercurial API dependency. Now rather than needing local mercurial repositories the import process instead talks to them over the web. The default mercurial web APIs aren’t great but luckily all of the repositories I care about have pushlog installed which has a good JSON API for retrieving recently pushed changesets. So now the process is to pull the recent pushlog, get the patches for every changeset and parse them to work out which files have changed. By not having to load the mercurial structures there is far less memory used and since the process now has to delay as it downloads each patch file it keeps the CPU usage low.

I’ve also made the database itself a lot more efficient. Previously I recorded every changeset for every repository, but many repositories have the same changeset in them. So by noting that the database gets a little more complicated but uses much less space and since you only have to parse the changeset once the import process becomes faster. It took me a while to get the website performing as fast as before with these changes but I think it’s there now (barring some strange slowness in Django that I can’t identify). I’ve also turned on some quite aggressive caching both on the client and server side, if someone has visited the same page as you’re trying to access recently then it should load super fast.

So, the new version is up and running at the same location as the old. Chances are you’re using it already so if you’re noticing any unexpected slowness or something not looking right please let me know.

Back in the old days when we used CVS of all things for our version control we had a wonderful tool called bonsai to help query the repository for changes. You could list changes on a per directory basis if you needed which was great for keeping an eye on certain chunks of code. I recall there being a way of getting an RSS feed from it and I used it when I was the module owner of the extension manager to see what changes were landed that I hadn’t noticed in bugs.

Fast forward to today and we use mercurial instead. When we switched there was much talk of how we’d get tool parity with CVS but bonsai is something that has never been replaced fully. Oh hgweb is decent at looking at individual files and browsing the tree, but you can’t get that list of changes per directory from it. I believe you can use the command line to do it but who wants to do that? Lately I’ve been finding need of those directory RSS feeds more and more. We’re now periodically uplifting the Add-on SDK repository to mozilla-central, it’s really important to spot changes that have been made to that directory in mozilla-central so we can also land them in our git repository and not clobber them the next time we uplift. I’m also the module owner of toolkit, which is a pretty big sprawling set of files. It seems like everytime I look I find something that landed without me noticing. I don’t make for a good module owner if I’m not keeping an eye on things so I’d really like to see when new files are added there.

So I introduce the Hg Change Feed, the result of mostly just a few days of work. Every 10 minutes it pulls new changes from mozilla-central and mozilla-inbound. A mercurial hook looks over the changes and adds information about them to a MySQL database. Then a simple django app displays that information. As you browse through the directories in the tree it shows only changesets that affected files beneath that directory. For any directory you can also get an RSS feed of the same. Plug that into IFTTT and you have an automated system to notify you in pretty much any way you’d like about new changes you’d be interested in.

One other neat trick that this does is mostly ignore merge changesets. Only if a merge actually makes a change not already present in either of the merge parents (mostly happens when resolving merge conflicts) will it show up in the list of changes, because really you don’t need to hear about changes twice.

So play with it, let me know if you find it useful or if you think things are missing. I can also add other mercurial repositories if people want. Some caveats:

It only retains the last 2000 changesets from any repository in an effort to keep the DB size small and fast, it also only shows the last 200 changesets for each page, or just the last 20 in the feeds. These can be tweaked easily enough and I’ve done basically no benchmarking to say those are the right values.

The site isn’t as fast as I’d like, particularly listing changes for the top level directory takes nearly 5 seconds. I’ve thrown some basic caching in place to help alleviate that for now. I bet someone who has more MySQL and django experience than me could tell me what I’m doing wrong.

I’m off on vacation tomorrow so I guess I’m announcing this then running away, sorry if that means it takes me a while to respond to comments.

Want to help out and make it better? Go nuts with the source. There’s a readme that hopefully explains how to set up your own instance.

The extension I’ve been working on in my spare time for the past couple of weeks is now available as a first (hopefully not too buggy) release. It lets you open WebApps in Thunderbird, properly handling loading new links into Firefox and making all features like spellchecking work in Thunderbird (most other extensions I found didn’t do this). You can read more about the actual extension at its homepage.

Mostly I’ve been really encouraged during the development of this at just how far our platform has come for developing restartless add-ons. When we first made it possible in Firefox 4 there was a whole list of things that were quite difficult to do but we’ve come a long way since then. While there are still things that are difficult there are lots of things that are now pretty straightforward. My add-on loads simple XUL overlays, style overlays, installs JS XPCOM components with category manager registration, all similar to older add-ons. In fact I’m struggling to think of things that it is still hard to do though I’m sure other more prolific developers will have plenty of comments on that!

The other thing I’ve been doing with this extension is experimenting with git and GitHub. I think it’s been an interesting experience, there are continual arguments over which is better between git and mercurial with many pros and cons listed. I think most of these were done some time ago before mercurial and git really matured because from what I’ve seen there is really little difference between the two. They have slightly different default branching styles, but both can do the same kind of branching that the other can if you want and there are a few other minor differences but nothing that would really make me all that bothered over deciding which to use. I think the only place where git has a bonus is with GitHub, and really as far as I can see there isn’t a reason why someone couldn’t develop a similar site backed by mercurial repositories, it’s just that no-one really has.

GitHub is pretty nice with built-in basic issue tracking and documentation though it still has some frustrating issues. It seems odd for example that you can’t fork your own project, only someone else’s, but that’s only a minor niggle really. As project hosting goes I can’t say I’ve come across anything better that I can remember.