To make things more realistic and “testable” inside xpcshell, I created a crawler using Jetpack and standard Firefox XPCOM components. I have a feeling that QA might be interested in this as well, so I posted it to the Jetpack Gallery.

It is not configurable from the outside yet, but I have plans for that. I am planning on having JetCrawl surf every night for a set period to increase the collection of data I have to work with.

I need this data for crafting a new Places Query API that is fast and well tested against a rather large collection of bookmarks and history. The URLs that JetCrawl uses are taken from the Alexa Top 100, so it is a common set to boot.

Automated tests will work better with this data, since it hits predictable URLs and the actual Places APIs create the data in the first place.

If you want to try it out, please use a new profile. You can stop it by closing the tab. There is no UI as of yet.

>The timeline work I mentioned earlier has been going well. The patches are pretty well on their way. Until this lands, I am keeping the instrumentation patch unbitrotted each week.

On the “startup” team, we use the Bugzilla whiteboard to mark all startup-related bugs with “[ts]”. In the meantime, before the startup patch lands, I propose that you feel free to mark [ft] in the whiteboard and list any function names you want timed, along with cold/warm start, platform, etc.

I will run the [ft] query often and add the timers to the instrumentation patch(es), and update the bug with the details.

>At Mozilla, we need to understand how Firefox is used in the wild. Knowing what “typical” profiles are like and having automated tests that attempt to model real-world situations is a big plus for writing well-performing code.

Just in case anyone else needs to collect data about Firefox use or model “typical” user data for performance testing, here is how Drew and I quickly put together our “Places” toolkit:

1. A client-side collection script that gathers metrics from a user’s Places database and posts them to our collection URL.

2. A server-side script to generate a places.sqlite database based on the metrics we are collecting.

I focused on the database generation.

For now, we are doing this so we can create a test (mock) SQLite database with as many records as we wish, or based on the min, max, or average of the users that post to the places-stats collection URL.

So the basic flow is:

1. Have users visit https://places-stats.mozilla.com and run the collection script.
2. Get a large number of users (and varied types of users) posting their stats to the collection URL.
3. Be able to produce a “power user”, “average user”, and “light user” places.sqlite database on the fly from data hosted at places-stats.mozilla.com.
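The collection side boils down to counting rows in the main Places tables and shipping the numbers off as JSON. Here is a minimal sketch of that idea; the table names (moz_places, moz_bookmarks, moz_historyvisits) are the real Places tables, but the output format is my own invention and the real collection script may differ:

```python
# Sketch of a client-side stats collector. Assumes a copy of
# places.sqlite is available locally; the stats dict layout is
# hypothetical, not the format the real script posts.
import json
import sqlite3

def collect_stats(db_path):
    """Count rows in the main Places tables and return a stats dict."""
    conn = sqlite3.connect(db_path)
    stats = {}
    for table in ("moz_places", "moz_bookmarks", "moz_historyvisits"):
        try:
            stats[table] = conn.execute(
                "SELECT COUNT(*) FROM %s" % table).fetchone()[0]
        except sqlite3.OperationalError:
            stats[table] = 0  # table missing in this profile
    conn.close()
    return stats

if __name__ == "__main__":
    # In the real flow these numbers would be POSTed to the
    # places-stats collection URL rather than printed.
    print(json.dumps(collect_stats("places.sqlite")))
```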

I wrote a Python script for the aggregate data collection and database generation.

To make this an easy, fast exercise in software re-use, I used Django’s db module to reverse engineer the Places schema into a set of Python models.

Once you have Django set up, you can run the famous ‘manage.py inspectdb’, which queries your SQLite db schema and outputs the corresponding django.db Python classes.
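Under the hood, inspectdb just introspects the database’s schema. This is not Django itself, but a rough illustration of that introspection step using sqlite3’s PRAGMA, with a trimmed-down stand-in for the moz_places table (the real table has more columns):

```python
# Illustration of the schema-introspection step inspectdb performs,
# using sqlite3's PRAGMA table_info on a simplified, hypothetical
# subset of the moz_places columns.
import sqlite3

def describe_table(conn, table):
    """Return (column_name, column_type) pairs for a table."""
    # PRAGMA table_info rows are (cid, name, type, notnull, default, pk).
    return [(row[1], row[2]) for row in
            conn.execute("PRAGMA table_info(%s)" % table)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE moz_places (id INTEGER PRIMARY KEY, "
             "url LONGVARCHAR, title LONGVARCHAR, visit_count INTEGER)")
for name, coltype in describe_table(conn, "moz_places"):
    print(name, coltype)
```

inspectdb takes this kind of schema information and emits a ready-to-use django.db model class per table.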

The generation script populates “Places”, History, Bookmarks, Favicons, Input History and Keywords. I still have a few more entity types to generate, but this is sufficient for the testing we need to do now.
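The generation itself is just bulk inserts against those models. Here is a sketch of the idea in plain sqlite3 rather than Django, against a simplified, hypothetical subset of the moz_places schema; the URL list and the row counts are invented stand-ins:

```python
# Sketch of the generation idea: bulk-insert synthetic history rows
# into a trimmed, hypothetical subset of the moz_places schema.
import sqlite3

# Stand-in for the Alexa Top 100 list the real crawler uses.
URLS = ["http://example%d.com/" % i for i in range(100)]

def generate_places_db(path, n_places):
    """Create a mock places database with n_places synthetic rows."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS moz_places ("
                 "id INTEGER PRIMARY KEY, url LONGVARCHAR, "
                 "title LONGVARCHAR, visit_count INTEGER DEFAULT 0)")
    rows = [(URLS[i % len(URLS)] + "page%d" % i, "Page %d" % i, i % 10)
            for i in range(n_places)]
    conn.executemany("INSERT INTO moz_places (url, title, visit_count) "
                     "VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn

# A "light user" might get a few hundred rows and a "power user" tens
# of thousands -- these thresholds are invented for illustration.
conn = generate_places_db(":memory:", 500)
```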

The basic lesson learned is that you can build an effective, one-off data collection/metrics tool quickly and easily. I am sure others at Mozilla need tools like this, so do not hesitate to ping me with questions.

>It’s quite amazing how much traction we get out of irc at work. It’s pretty much a rule to keep the conversation inside irc. This makes a whole lot of sense, as we end up with all kinds of documentation in the chat logs. I have become very accustomed to using chat for work conversations, debugging, and whatnot. I love the decentralized model. This is above and beyond what I have ever experienced. So cool.

Between irc, blogs, and bugzilla, we are all having a lot of online conversations. Email is important, but secondary. I likes. 99% of the communication is captured in a pretty meaningful way, unlike the typical corporate pattern of massive reliance on email, which is so full of spam and nonsense.

>The cool thing about really getting your hands dirty deep inside of chrome is learning how the Mozilla developer workflow works. So cool. As an extension developer, I was used to a workflow where I code, restart Firefox, and test. When you work on Mozilla, you code and write tests, run make and make check on the module or modules you are touching, start your build, then test if need be…

So cool. I am having a ton of fun. At my last job I was leading a project, and now I am back in the position of a student, my favorite place to be:)