Monday, November 16, 2015

tl;dr; We are willing to commercially support the CPython C API in PyPy,
so if you want PyPy to support library X, get in touch with me
at fijal@baroquesoftware.com. Read further for more details.

Python owes a whole lot of its success to the ease of integration
with existing POSIX APIs and legacy applications. For the most part,
this means calling C (or Fortran) in various ways, with the Python
C API being the most commonly used one, either directly or
with the help of tools like Cython.

Historically, calling C was one of the sore points of PyPy. We originally
had no way to do it, then we implemented ctypes which is loved by few,
hated by many. Next we went ahead and implemented cffi which is both
a better way to call C and a much better supported one. cffi has been
a stunning success, becoming one of the most commonly used PyPI
packages, with over 1.5 million downloads every month and growing.

This addressed the basic problem of PyPy -- how do I call C? It has
however not addressed a very important problem -- "how do I integrate
with legacy applications?". We were willing to take a step further,
implementing a subset of CPython C API, labelling it "forever beta",
"incomplete" and "slow", just to support the legacy software.

Now, to address the growing need, we're willing to take a step further.
We discussed ways to make the CPython C API faster and more stable with
the promise of supporting it completely in the future. However, since it
is mostly about supporting legacy codebases, my company, baroquesoftware,
and I want to structure it as a commercial contract.

This is an open bid to find funding primarily through commercial partners
to implement speed and completeness improvements to the CPython C API and
merge them into PyPy. The end product will be available for free to
everyone under the MIT license, but the funding will be structured
as a commercial contract with all the benefits of one.

Get in touch with me, preferably by email at fijal@baroquesoftware.com,
for more details regarding what can be done for what sort of budget.

Friday, March 20, 2015

tl;dr; We decided to go to Y Combinator with HippyVM, our high
performance PHP implementation, and we did not get
through after two rounds of interviews.

But I suppose there is more to it, so keep reading....

The whole story started with a small disaster, but let's start at the
beginning. We applied to Y Combinator
a bit haphazardly in 2012 for the 2013 summer batch, without expecting
to get through to the interview. The main reason for me to apply was precisely
the 7 ideas talk given by Paul Graham as a keynote at PyCon US, mentioning
the "sufficiently smart compiler". For those readers who don't know, PyPy
is a fast Python compiler, but we also developed a language and a framework
called RPython that's suitable for implementing fast dynamic languages,
so we decided to check if it works for PHP, which is how HippyVM was born.

Well, I thought, we have a framework that's
as close as it gets these days to a "sufficiently smart compiler", so I decided
to submit -- why not. When we got the Y Combinator invitation, I was in Europe,
away from my usual place of residence, South Africa. We got
tickets, went to the airport and.... it turned out my visa for the US had been
left at home. Note: the US tries not to admit the fact that they keep visa
information in any sort of system, so if you get a new passport you are either
allowed to use your old passport or you need to apply for a new visa. No
way to transfer to a new passport. Oh well -- fortunately for the most part we
live in the 21st century and a few calls, DHLs and tickets later, I landed in
San Francisco for a weekend with the interview scheduled for Saturday.
That ended up in 3h of being detained at SFO, since nobody flies to SF
for a weekend carrying two sets of clothes, a laptop and a sleeping bag.

The idea

The idea was simple - we have enough expertise in compilers to do hard things
and PHP is the most widely deployed dynamic language. Also, people are selling
various "PHP optimizers" for money that don't really do much. We can do better.
At the time HHVM was really not working very well and there was no other
competition.

The actual interview

We actually ended up having two sets of interviews, which I think is
pretty unusual. The first team was probably very confused, so they sent us
down to the second one. The positive
part of the interview was that people (at least those that use Python)
generally recognize our work. The negative part was that 3 months is nowhere
near enough to bring any tangible results in the compiler world. We required 1-2
years of work to provide anything tangible, and that does not fit into their
model. Paul Buchheit asked us half-jokingly why all the cool compiler guys
are from Europe (which is as far as I know not true, but Europe
is overrepresented). I didn't have an answer at the time but later that day
it became blatantly obvious that it's all about long-term vision. Compilers
take more time than Americans typically have in their sights. PyPy is 10 years
old and it's "the new kid on the block". We were told we should be home
cranking out code until we can get to something showable.
I walked out from the interview pretty sure we would not get in.

The aftermath

Unsurprisingly, we didn't get in. We ended up having a very good one-day
PyPy sprint in San Francisco. We do not fit in the model. Now this brings
me to an interesting point, one that Lars Bak made to me -- there is no
money in infrastructure like programming languages. Very few people are
willing to invest in such companies, and the contenders these days are
all Open Source without a decent funding model, or backed by a large
corporation (Oracle, Google, Microsoft),
or both. I have no idea how to go about sponsoring research like PyPy or
building a business model around it. Despite bringing a lot of value to the
system (and I don't mean just PyPy; also CPython, Ruby etc.) there does not
seem to be a good way to build a business model.

There are good reasons why you want your infrastructure to be either
Open Source or backed by a large stable entity, and I'm very much for that;
the world is a better place than it was during the ColdFusion days and we're all
better off. However, we're missing a business model where infrastructure
people can get attention from VCs and a revenue model that somehow corresponds
to the value they're bringing.

Post-mortem

HippyVM got a little funding at the beginning to get us to some sort of
prototype. Within a bit over a year we got to a point where we were
able to run MediaWiki with a significant speedup over Zend PHP. However,
the HHVM team these days counts between 30 and 60 people (that's what I can
guess from the photo) and HHVM is available for free. Sure, it's tied to
Facebook, but it seems to be enough to deter any business in this area. We
would not be able to outcompete HHVM by enough (usually enough is 2x faster)
on real life workloads with a fraction of the team and a fraction of their
funding, so we went on to improving PyPy.
We did achieve most of what HHVM does at a small
percentage of the cost, but the difficulties in funding generally caused
the HippyVM project to come to a stall.

What now?

I do consulting. Most of it is PyPy-related, so I'm pretty happy. However,
I'm still trying to find a model where basic research and infrastructure work
can provide revenue which is related to the amount of value it's bringing to
companies. Ideas welcome :-)