Friday, December 7, 2012

I am from Europe. It's a place across a big body of water from the
United States, generally in the direction of the east. There are a few key
differences; among them is the approach to democracy. We would typically
have a large body of people making all important decisions (like a parliament)
and a small body of people making more mundane decisions (like a government).
In a typical scenario, the government has considerably less power than
the parliament, but it's also much more agile, hence better suited for
making quick decisions. A good example is the budget - the government creates
a budget that is then voted on by the parliament. As far as I understand,
the idea is not to vote on every detail, but instead to create a budget that
will be approved by the parliament.

The PSF is almost like this. There is a large body of people (PSF members)
and a small body of people (PSF board). There is one crucial difference
- the PSF members have power only on paper. The only votes that ever happen
are for broadening the powers of the board, electing the board
or voting in new members. The board makes all the actual decisions.

This is not to say that the board makes bad decisions - I seriously cannot
pinpoint a single time when that happened. I'm very happy with
the board and with its policies. However, I don't feel I have any voting
power. I perfectly understand why it is that way - the PSF members are a big
group and even finding a way for everyone to vote in a reasonable manner
would be a mission. As a European, I think it's a mission worth attempting,
but for now I will stay on as a non-voting member (also known as
emeritus) and wait for the board to decide on everything.

Thursday, July 12, 2012

DISCLAIMER: This post is incredibly self-serving. It only makes sense
if you believe that open source is a cost-effective way of building software
and if you believe my contributions to the PyPy project are beneficial to the
ecosystem as a whole. If you would prefer me to go and "get a real job",
you may as well stop reading here.

There is a lot of evidence that startup creation costs are plummeting.
The most commonly mentioned factors are the cloud, which brings down hosting
costs, Moore's law, which does the same, ubiquitous internet, platforms and open
source.

Putting all the other things aside, I would like to concentrate on open source
today. Not because it's the most important factor -- I don't have enough data
to support that -- but because I'm an open source platform provider working
on PyPy.

Open source is cost-efficient. As Alex points out, PyPy is operating on
a fraction of the funds and manpower that Google is putting into V8 or Mozilla
into {Trace,Jaeger,Spider}Monkey, yet you can list all three projects in
the same sentence without hesitation. You would call them "optimizing dynamic
language VMs". The same can be said about projects like GCC.

Open source is also people - there is typically one or a handful of individuals
who "maintain" the project. Those people are employed in a variety of
professions. In my
experience they either work on their own or for corporations (and corporate
interests often take precedence over open source software), have shitty jobs
(which don't actually require you to do a lot of work) or scramble along
like me or Armin Rigo.

Let me step back a bit and explain what I do for a living. I work on NumPy,
which has managed to generate slightly above $40,000 in donations so far. I do
consulting about optimization under PyPy. I look for other jobs and do random
stuff. I think I've been relatively lucky. Considering that I live in a
relatively cheap place, I can dedicate roughly
half of my time to other pieces of PyPy without too much trouble.
That includes stuff that
no one else cares about, like performance tools, buildbot maintenance,
release management, making json faster, etc.

Now, the main problem for me with regards to this lifestyle is that you can
only gather donations for "large" and "sellable" projects. How many people
would actually donate to "reviews, documentation and random small performance
improvements"? The other part is that predicting what will happen in the near
future is always very hard for me. Will I be able to continue contributing to
PyPy or will I need to find a "real job" at some point?

I believe we can come up with a solution that both creates a reasonable
economy that makes working on open source a viable job and comes with
relatively low overheads. Gittip and Kickstarter are recent additions
to the table and I think both fit very well into some niches, although not
particularly the one I'm talking about.

I might not have the solution, but I do have a few postulates about such
an economic model:

It cannot be project-based (like Kickstarter). In my opinion, it's much
more efficient just to tell individuals "do what you want". In other words
-- no strings attached. It would be quite a lot of admin to deliver
each simple feature as a Kickstarter project. There are shades
of gray here: "do stuff on PyPy", for example, is a possible project that's vague
enough to make sense.

It must be democratic -- I don't think a government agency or any sort
of intermediate panel should decide.

It should be possible for both corporations and individuals to donate.
This is probably the major shortcoming of Gittip.

There should be a cap, so we don't end up with a Hollywood-ish system where
the privileged few make lots of money while everyone else is struggling.
Personally, I would like to have a cap
even before we reach that sort of amount, at (say) 2/3 of what you could
earn at a large company.

It might sound silly, but there can't be a requirement that a recipient must
reside in the U.S. It might sound selfish, but this completely rules out
Kickstarter for me.

The problem is that I don't really have any good solution -- can we make
startups donate 5% of their future exit to fund individuals who
work on open source with no strings attached? I heavily doubt it. Can we
make VCs fund such work? The potential benefits are far beyond their event horizon,
I fear. Can we make individuals donate enough money? I doubt it, but I would
love to be proven wrong.

Thursday, April 19, 2012

I'm a technology nomad. We changed camels for high powered, fossil fuel burning jets. I'm working from any place that has an internet connection, which is typically within hundreds of meters from any physical location I happen to be at. I create open source software that brings value to various people, using mostly loose change and scraps from large corporations for a living. It's not that much value, after all, who uses PyPy, but the important part is the sign - it's a small, albeit positive change in the open source ecosystem that in turn makes it cheaper to create software stacks, which ends up helping young companies trying to make a dent in the universe. I'm a plumber fixing your pipes, one of the many.

And I need a visa for that. I want to have a stamp in my passport that will state all of the above and provide a few clues as to what it actually means:

I will not stay in your country for very long.

The exact place does not matter at all to me - it's all one big internet.

I'll not seek employment at McDonalds and I have a pretty good track record, go read my bitbucket contributions.

Open Source is software you run into every day - and this is also because of people like me.

And yet I'm failing. People running immigration are so detached from my reality we don't even send postcards to each other. Every single border officer is suspicious and completely confused as to why and how I do all of that.

How can we change it? How can we end the madness of pointless paperwork?

Tuesday, February 14, 2012

Obviously I'm biased, but I think PyPy is progressing fairly well. However, I would like to mention some areas where I think PyPy is lagging -- not living up to its promises, or where the design decisions simply didn't turn out as well as we hoped. In a fairly arbitrary order:

Whole program type inference. This decision has been haunting the separate compilation effort for a while. It's also one of the reasons why RPython errors are confusing and why the compilation time is so long. This is less of a concern for users, but more of a concern for developers and potential developers.

Memory impact. We never scientifically measured the memory impact of PyPy on examples. There are reports of outrageous PyPy memory usage, but they're usually very cryptic "my app uses 300M" and not really reported in a way that's reproducible for us. We simply have to start measuring memory impact on benchmarks. You can definitely help by providing us with reproducible examples (they don't have to be small, but they have to be open source).
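
As a rough idea of what a reproducible memory report could look like, here is a minimal sketch (not something PyPy's benchmark suite does, as far as this post says). It runs a workload in a child process and reads the peak resident set size from the standard library's resource module, which is Unix-only. The function name peak_rss_of_child and the one-line workload are made up for illustration.

```python
import resource
import subprocess
import sys


def peak_rss_of_child(argv):
    """Run argv to completion and return the peak RSS of child processes.

    Units are platform-dependent: kilobytes on Linux, bytes on macOS.
    The value is a high-water mark over every child waited for so far,
    so use a fresh measuring process for each benchmark.
    """
    subprocess.run(argv, check=True)
    return resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss


if __name__ == "__main__":
    # Hypothetical workload: build a list of a million integers.
    workload = "data = list(range(1000000)); print(len(data))"
    print(peak_rss_of_child([sys.executable, "-c", workload]))
```

Running the same script once under CPython and once under PyPy (sys.executable is whichever interpreter launched it) gives two numbers that someone else can reproduce, which is more useful than "my app uses 300M".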

The next group are all connected. The fundamental question is: What to do in the situation where the JIT does not help? There are many causes, but, in general, PyPy is often inferior to CPython in all of these cases. A representative, difficult example is running tests. Ideally, for perfect unit tests, each piece of code should be executed only once. There are other examples, like short running scripts. It can all be addressed by one or more of the following:

Slow runtime. Our runtime is slow. This is caused by a combination of using a higher-level language than C and a relative immaturity compared to CPython. The former is at least partly a GCC problem. We emit code that does not look like hand-written C and GCC does a worse job at optimizing it. A good example is operations on longs, which are about 2x slower than CPython's, partly because GCC is unable to effectively optimize code generated by PyPy's translator.
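
To check a claim like the one about longs on your own machine, a small timeit micro-benchmark is enough. This is only a sketch under my own assumptions (operand size, multiplication as the operation, the name bench_long_multiply); it is not the measurement behind the 2x figure above.

```python
import timeit


def bench_long_multiply(bits=4096, repeat=5, number=1000):
    """Return the best observed time, in seconds, for one big-int multiply."""
    # Two arbitrary operands of roughly `bits` bits each.
    a = (1 << bits) - 1
    b = (1 << bits) - 3
    timer = timeit.Timer(lambda: a * b)
    # Take the minimum over several runs to reduce scheduling noise.
    return min(timer.repeat(repeat=repeat, number=number)) / number


if __name__ == "__main__":
    print("seconds per multiply:", bench_long_multiply())
```

Run it under both interpreters and compare the two numbers. The lambda call adds some overhead of its own, so treat the result as a ratio between interpreters rather than an absolute cost.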

Too large JIT warmup time. This is again a combination of issues. Partly this is one of the design decisions of tracing on the metalevel, which takes more time, but partly this is an issue with our current implementation that can be addressed. It's also true that in some edge cases, like running large and complex programs with lots and lots of megamorphic call sites, we don't do a very good job tracing. Because a good example of this case is running PyPy's own test suite, I expect we will invest some work into this eventually.
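
Warmup is easy to see if you time the same code in consecutive batches instead of once overall. The sketch below does that; the names time_batches and workload and the batch sizes are arbitrary choices for illustration, not part of any PyPy tooling.

```python
import time


def time_batches(func, batches=20, calls_per_batch=1000):
    """Call func repeatedly and return the wall-clock time of each batch."""
    timings = []
    for _ in range(batches):
        start = time.perf_counter()
        for _ in range(calls_per_batch):
            func()
        timings.append(time.perf_counter() - start)
    return timings


def workload():
    # Hypothetical hot loop: sum of squares of small integers.
    total = 0
    for i in range(200):
        total += i * i
    return total


if __name__ == "__main__":
    for index, seconds in enumerate(time_batches(workload)):
        print(index, seconds)
```

On an interpreter with a tracing JIT you would expect the first batches to be slower than the later ones, while on an interpreter without a JIT the batches should be roughly flat. A program that exits during those first batches pays the tracing cost and sees little of the benefit, which is the short-running-script case described above.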

Slow interpreter. This one is very similar to the slow runtime - it's a combination of using RPython and the fact that we did not spend much time optimizing it. Unlike the runtime, we might solve it by having an unoptimizing JIT or some other medium-level solution that would work well enough. There were some efforts invested, but, as usual, we lack enough manpower to proceed as rapidly as we would like.

Thanks for bearing with me this far. This blog post was partly influenced by accusations that we're doing dishonest PR by claiming that PyPy is always fast. I don't think this is the case and I hope I clarified some of the weak spots, both here and on the performance page.

EDIT: For what it's worth, I don't mention interfacing with C here. That's not because I think it's not relevant; it simply did not quite fit with the other stuff in this blog post. Consider the list non-exhaustive.