Monday, December 7, 2009

I find it difficult to write a history of Sage without writing a history of my personal involvement with mathematical software. I have loved using calculators since my earliest memories. My father somehow got a mechanical electric adding machine for me when I was quite young, back in the 1970s, and I spent a great deal of time filling ribbons of paper with it. Then he got me an old TI scientific calculator from a yard sale, with an LED readout. At age 11, when I moved from Oregon to Texas, I bought a sliderule, which was pretty exciting for a while. Then in junior high I finally got a real scientific calculator, a little Casio, whose instruction manual I devoured. This was my first introduction to trigonometry, statistics, and many other bits of computational mathematics. I was prepared though, since I had done elementary school on an unusual work-at-your-pace alternative curriculum, and had already worked through 10th grade level mathematics.

The first mathematical computer software I ever used was Mathematica, back in 1992 on a Windows 3.1 PC, when I was a 17-year-old undergraduate at Northern Arizona University in Flagstaff, Arizona.

Naturally, my copy of Mathematica was pirated, since like many students I was extremely poor at the time. The only thing I found it useful for at the time was drawing 3d plots (just for fun), and even then it was frustrating, since there was no way to interactively change the viewpoint. I also obtained a copy of MATHCAD somehow, which I found much more useful than Mathematica. This is perhaps not surprising, because at the time I was a computer engineering undergraduate, so I was taking courses in Physics, Electrical Engineering, Programming, etc. I was definitely not a mathematics major: I remember once finishing a multivariable calculus class and thinking ``this is the last mathematics class I'll ever take"! I also used Maple for about an hour or two in a computer lab for a mathematics course I took, but found it very cumbersome; this was a token 1-hour introduction to math software that professors gave their students, perhaps to justify the grant that paid for the computer lab. At the time, I viewed Maple, Mathematica, and MATHCAD as software that didn't really go beyond scientific calculators in any exciting way.

My next encounter with mathematics software was in early 1994 when I became a mathematics major, after accidentally encountering an abstract algebra book misfiled under computer science in a used bookstore, and being instantly mesmerized by ideas such as groups, rings, and fields. I was in another small computer lab and stumbled on printouts of the documentation (Northern Arizona University professor) Mike Falk had for Cayley, which was the predecessor to Magma. I was amazed and excited, since the capabilities and ideas in Cayley went so far beyond anything I had thought was even possible in mathematical software before. I'm pretty sure I never got to actually use Cayley though, since like now, the software was very expensive and hard to get. Instead, I just spent a lot of time reading the reference manual full of its beautiful examples of computing with groups, rings, and fields, which were the very objects that had enticed me into mathematics in the first place.

In fact, after that brief encounter with Cayley, I didn't touch mathematics software again, or even do any nontrivial computer programming for 3 years. This was because in 1994 I became intensely interested in theoretical mathematics, and spent most of my time during the next 3 years systematically doing exercises in mathematics books, ranging from basic books on linear algebra and combinatorics (which was big in Arizona) to Hartshorne's Algebraic Geometry.

In 1997, I was a graduate student at Berkeley, and didn't want to teach so much, so I landed a job (funded by the NSF VIGRE program) in the department for one year doing programming of curriculum materials for an undergraduate linear algebra course, along with fellow graduate student Tom Insel. Though I had been using Linux for years, I had never thought about free software, or that actual people could contribute to it. I thought of Linux as ``Unix that I can install on my own computer", and back then one still paid for Linux by buying a CD, since downloading over a modem was way too slow. Tom and I spent a lot of time working on our project together, and he told me how he had written some software included in Slackware (a Linux distribution), so he got free copies of the CD when new versions came out. Hey, anybody can contribute to Linux!

I also remember Tom complaining frequently about how we were forced to program in MATLAB for our project, and he mentioned many other alternatives that would have been better for what we were doing, including Java. We were doing GUI programming, with a tiny, tiny bit of actual mathematics thrown in, and MATLAB's handle-based system for writing graphical user interfaces was really painful. I had a lot of experience with C/C++/Windows 3.1 GUI programming from my computer science undergraduate days, and agreed that MATLAB was pretty awkward for what we were doing at the time. In retrospect, what we did was probably pointless, and perhaps never got used. However, the experience was extremely valuable for both of us, and I'm glad NSF funded it.

In the meantime, my Ph.D. thesis was going nowhere, despite nearly 3 years of graduate school. One day, I heard about a problem Ken Ribet was asking all the graduate students: ``Is there a prime number p such that the Hecke algebra at level p is ramified at p?" It's the first research problem I can ever remember hearing that was almost certainly impossible to solve without using a computer. Fellow grad students Janos Csirik (now at D.E. Shaw), Matt Baker (now at Georgia Tech), and I searched and found one paper by Hijikata (?), I think from the mid 1970s, which gave an algorithm that might allow one to answer the above question for specific p's. But to implement the algorithm, it was necessary to compute class numbers of a huge number of quadratic fields, and none of the mathematics software I had ever heard of until then could do this. Janos and Matt mentioned PARI, and I installed it on my computer. And indeed, it could quickly and easily compute class groups! PARI was also the first free mathematical software I encountered.

So I started coding up the algorithm (the Eichler-Selberg trace formula) in PARI. I had a lot of experience with C++, which is a real programming language with user defined data types, exception handling, templates, etc. I remember in 1992 carefully reading several C++ books cover-to-cover, and I wrote a large amount of code (video games!) in C++ long, long ago. In contrast, PARI was an immediate shock. This was a language with no local variables, no real scoping, only a couple of builtin types, and for a while I thought (incorrectly) that entire function definitions always had to be on one line, since that was the case in example code I found. But the algebraic number theory algorithms implemented in the internal library were amazing, deep, and very fast. So I implemented the trace formula, and ran it to try to answer Ribet's question. The program did not work---basic consistency checks failed. It turned out that there was a major bug in the algorithm for computing class groups. In particular, the function qfbclassno silently returned wrong answers.

You would think that qfbclassno would be fixed by now. But no. It's only frickin' 12 years later!! I just checked right now, and the documentation for PARI still says ``Important warning. For $D < 0$, this function may give incorrect results when the class group has a low exponent (has many cyclic factors), because implementing Shanks's method in full generality slows it down immensely.'' This is buried in the documentation. The only change is that I think in 1997 the documentation said that the authors were ``too lazy'' to implement the full algorithm. Note: In the 1990s the function was called classno instead. I found in tutorial.tex from pari-1.39a.tar.gz:

Type {\tt classno($-$10007)}. GP tells us that the result is 77. However, you may have noticed in the explanation above that the result is only usually correct. This is because the implementers of the algorithm have been lazy and have not put the complete Shanks algorithm in PARI, causing it to fail in certain very rare cases. In practice, it is almost always correct, and the much more powerful {\tt buchimag} program, which {\it is} complete, can give confirmation.
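To make concrete what a class number computation for $D < 0$ involves, here is a minimal Python sketch. It is not Shanks's method and not what PARI does; it simply counts reduced primitive binary quadratic forms $ax^2 + bxy + cy^2$ of discriminant $D$, which is slow but exhaustive. The function name class_number is made up for this illustration.

```python
from math import gcd


def class_number(D):
    """Count reduced primitive forms (a, b, c) with b^2 - 4ac = D, for D < 0.

    A form is reduced when -a < b <= a <= c, with b >= 0 whenever a == c.
    """
    if D >= 0 or D % 4 not in (0, 1):
        raise ValueError("D must be negative and congruent to 0 or 1 mod 4")
    h = 0
    a = 1
    # For a reduced form, |D| = 4ac - b^2 >= 3a^2, which bounds a.
    while 3 * a * a <= -D:
        for b in range(-a + 1, a + 1):
            numerator = b * b - D
            if numerator % (4 * a) != 0:
                continue  # c would not be an integer
            c = numerator // (4 * a)
            if c < a:
                continue  # not reduced
            if a == c and b < 0:
                continue  # boundary case: keep only b >= 0
            if gcd(gcd(a, abs(b)), c) == 1:
                h += 1  # primitive reduced form found
        a += 1
    return h


print(class_number(-23))
print(class_number(-163))
```

Because every reduced form is enumerated, a count like this can serve as an independent (if slow) cross-check on a faster routine for small discriminants such as the $-10007$ in the tutorial excerpt.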

So I worked around that problem, and was able to run the algorithm for all primes up to about 300, but didn't find any primes as in Ribet's question. A few weeks later, at the Arizona Winter School (AWS) in March 1998 in Tucson, Arizona, I mentioned this to Joe Wetherell (who was another Berkeley grad student), while we were walking to lunch, and he told me he had written a program---also in PARI---for computing with modular symbols (following John Cremona's book), with which I might be able to push the computation a little further. He gave me a copy later, and I started playing around with it, and computed the discriminants of all of the Hecke algebras of prime level up to about 500. NOTE: That program lives on here, in case you're interested: tables. And it still works (here with GP 2.4.3)!!!

Again, I didn't find any examples, and I wrote to Ken Ribet to tell him. Then I hopped on a plane and flew to Cambridge, England, to visit Kevin Buzzard.

Once I got settled in Cambridge, I rechecked my calculations for some reason... and found an example for the prime $p=389$! Somehow, I had just missed the example in my previous check. I was extremely excited as I wrote to Ken Ribet, with my first ever genuine contribution to research mathematics, which appears in a big paper Ken published on the Manin-Mumford conjecture---I had shown that his new theorem definitely did not prove that conjecture for all modular curves. I was also hooked on computing modular forms, and started making tables. It was also the height of the mad cow disease scare in England, so I became a vegetarian.

I tried to push the PARI program that Joe Wetherell had given me to make bigger tables, but it very quickly ran out of steam. The main algorithms that the modular symbols algorithm relies on are all manner of linear algebra over the rational numbers, including computation of characteristic polynomials, and kernels of sparse and dense matrices, and also factorization of polynomials over the integers. Despite its first rate algebraic number theory capabilities, PARI was (and still is) terrible at linear algebra with really big matrices over the rational numbers. Also, I found the PARI programming language unbearably naive.

So I spent the summer of 1998 writing a much more general C++ program for computing with modular forms called HECKE. Here it is: hecke. If you grab the file hecke-july99.gz from that web page, and drop it on just about any Linux box, it should just work. NOTE: Here is Hecke, which I just tried on a Core 2:

% ./hecke-july99
HECKE Version 0.4, Copyright (C) 1999 William A. Stein
HECKE comes with ABSOLUTELY NO WARRANTY.
This is free software, and you are welcome to redistribute it
under certain conditions; read the included COPYING file for details.

HECKE: Modular Forms Calculator Version 0.4 (July 9, 1999)

William Stein
Send bug reports and suggestions to was@math.berkeley.edu.
Type ? for help.

Anyway, I spent all my time for many months writing HECKE. This program built on several other C++ math libraries, including LiDIA and NTL. It doesn't use PARI, because using PARI from C is... really weird, and LiDIA/NTL had equivalent functionality at the time. Much of the time I spent on HECKE involved (1) designing algorithms, generalizing work in Cremona's book, etc., and (2) implementing and optimizing algorithms for linear algebra over the rational numbers. For (2), Kevin Buzzard hooked me up with the same code Cremona used, which Cremona had got from some South American student at Cambridge named Luis (?). I then spent a large amount of time optimizing that code for my computations. The actual linear algebra algorithms in that code were very naive compared to the algorithms in Sage now, but they were better than anything available in any other software I was aware of, or in research papers at the time.

I was able to compute many fairly sophisticated tables using HECKE, and it soon became the canonical software for computing with modular forms, since it was the only software generally available for such computations. This is perhaps similar to how NAUTY was for a very long time the canonical software for computing graph automorphism groups. Naturally, I made HECKE freely available on my webpage. I remember once getting an email from Ken Ono, who has probably written over 100 papers on modular forms, many inspired by concrete examples. He had run across HECKE on my web page, installed it, and was totally blown away by the capabilities of HECKE, and how useful it would be for his research. A whole new world had opened up to him, and he told me he was promptly ordering a new fast computer specifically to run HECKE on. I was happy to help. Also, my thesis work was starting to go well, because the computations I was doing were suggesting interesting new (and do-able) mathematics, left and right.

Then, in 1999, David Kohel---who had been a Berkeley grad student with me until December 1996---was visiting Berkeley from Sydney, Australia, and told me about implementing algorithms related to his thesis in Magma. I think this was the first I had ever heard of Magma, despite Magma having been around for several years. Magma was expensive and nearly impossible to get unless you knew the right person, since it was sold via informal channels. I think there was an install on the computers in Berkeley, but those computers were ancient vintage 1990-ish Sun workstations, so nobody would seriously try to use them. Anyway, David had implemented code for computing with rational quaternion algebras, and this was the only implementation of that algorithm in the world. Coincidentally, I had extended an old idea of Ribet to come up with a new algorithm for computing Tamagawa numbers of modular abelian varieties (see ants and compgrp). I really, really wanted to implement my algorithm, because it would allow me to compute all of the invariants in the Birch and Swinnerton-Dyer conjecture (except Sha) for most rank 0 modular abelian varieties, which would be a huge step forward. But my algorithm fundamentally relied on exactly the computations in rational quaternion algebras that David Kohel had implemented in Magma. And that was no small undertaking---it's a complicated algorithm, it takes somebody familiar with the relevant theory months to implement and optimize, and it builds on many other basic capabilities. I had a thesis to finish. David---who was then officially a Magma developer---was able to give me a copy of Magma for my own computer, which had his code in it. Combining all this with HECKE, and copying and pasting, I was the first person ever to systematically compute Tamagawa numbers of general modular abelian varieties (at primes of multiplicative reduction).

So in 1999 David Kohel put me in a situation where I was fundamentally dependent on a closed source non-free program in order to continue my own research. Ironically, during the same afternoon in my apartment in Berkeley, he mentioned the GNU General Public License (GPL), and suggested I release HECKE under the GPL. Before that moment, I had never even heard of the GPL. I did as he suggested, but had no idea what it meant really. That was perhaps fortunate, since the dependencies of HECKE are NTL and LiDIA, and the NTL license is GPL, but the LiDIA license is GPL incompatible, so technically I guess HECKE can't be distributed. Incidentally, LiDIA is still licensed under a GPL-incompatible license, despite many emails I've received suggesting the license would change to GPL---licenses don't change easily, due to having to get agreement from all copyright owners.

I obviously had to actually use Magma at some level in order to do these computations. Despite having a lot of experience with programming, I initially found the Magma language and system extremely hard to learn or do anything with. It was certainly much harder to use initially than PARI. On the one hand, Magma is a massive system, with thousands of commands and thousands of pages of reference manual documentation, but on the other hand there is very little in the way of ``introspection", i.e., given an object A, it is hard to get context sensitive help about A. However, one thing was clear: Magma was dramatically better at dense linear algebra over the rational numbers than HECKE. It was a whole different world. We're talking jaw dropping speed. In fact, Magma in the late 1990s on an old computer was far faster at large linear algebra over the rationals than Maple or Mathematica are today on the latest hardware. I would soon find out the reason.

Allan Steel is an enthusiastic Australian, who was an undergraduate at University of Sydney in the early 1990s and fell in love with the Magma project. I guess David Kohel told John Cannon about me in 1999, and Allan happened to be visiting Berkeley for a conference, so Allan and I met. We ended up talking a huge amount over 3 days. Allan answered my every question about the language and the system---usually with an answer about how he had implemented it that way for a certain reason. So within a few days I knew Magma well enough to be quite productive in it. Also, Allan gave me some hints about why linear algebra was so fast: together with polynomial and integer arithmetic, asymptotically fast linear algebra was one of his main interests. He explained an exciting array of algorithms, many of which he had developed or---more importantly---made practical. There were dozens of tricks that I had never heard of, such as rational reconstruction, multimodular algorithms, p-adic algorithms, etc., which were far beyond what I had done with HECKE, or seen in any other software. And they meant that it would be possible to push my modular forms computations much farther. All I had to do was rewrite HECKE in Magma.
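As a rough illustration of one of the tricks named above, rational reconstruction recovers a fraction n/d from its residue modulo m by running the extended Euclidean algorithm and stopping early. The Python sketch below is the textbook version under the usual assumption that |n| and d are both at most sqrt(m/2); it says nothing about how Magma implements the idea, and the name rational_reconstruct is invented here.

```python
from fractions import Fraction
from math import gcd, isqrt


def rational_reconstruct(a, m):
    """Return the fraction n/d with n = a*d (mod m) and |n|, d <= sqrt(m/2).

    Raises ValueError if no such fraction exists.
    """
    a %= m
    bound = isqrt((m - 1) // 2)  # ensures 2 * bound^2 < m, so the answer is unique
    # Invariant of the extended Euclidean steps: r_i = t_i * a (mod m).
    r0, r1 = m, a
    t0, t1 = 0, 1
    while r1 > bound:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1
        t0, t1 = t1, t0 - q * t1
    if t1 == 0 or abs(t1) > bound or gcd(r1, abs(t1)) != 1:
        raise ValueError("no rational reconstruction within the bound")
    if t1 < 0:
        r1, t1 = -r1, -t1  # normalise to a positive denominator
    return Fraction(r1, t1)


# Usage: hide -7/13 as a residue modulo a prime, then recover it.
m = 10007
residue = (-7 * pow(13, -1, m)) % m
print(residue, rational_reconstruct(residue, m))
```

In a multimodular algorithm the same step is applied to each entry of an answer computed modulo a product of primes, which is why the arithmetic can stay in small integers until the very end.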

It was my last year of graduate school at Berkeley, and I probably should have been writing my thesis, but instead John Cannon flew me down to Sydney, Australia, to work with the Magma group for a month and do a complete new implementation of all the algorithms in HECKE on top of Magma. I shared an office with Claus Fieker, a German who has implemented a large amount of the algebraic number theory in Magma, among other things. As I started doing this, I had some serious concerns, including: Magma did not allow users to define their own types (or classes), there is no exception handling, there is no ``eval statement", no way for users to write compiled code, Magma is closed source, and Magma is not free. I raised all of these concerns with Cannon and others, and was assured at the time that they would all be addressed really soon, except the free part. Regarding free, they said that I could give copies of Magma to whoever I wanted, so long as I checked with John Cannon first.

I don't remember why exactly, but I remember once during that month in 1999 going on a walk through the park near the U Sydney campus near a pond of ducks, sitting on a bench, and realizing that I was making a huge sacrifice of my freedom as a researcher by going down this path. Magma was not open source---John Cannon had absolute control over the system. Magma was not free. And as a language, Magma was significantly behind C++. It didn't even have a sensible notion of scope, and one added new data types by entering entries in big tables, then recompiling the kernel. I asked Cannon why it was so far behind, and he explained that the grants he was able to secure simply wouldn't pay for language design. The people who supported Magma with funding (mainly granting agencies) would only support implementing and optimizing mathematical algorithms, and the license fees only paid enough to support ``maintenance", which mainly meant the person who collected the license fees and distributed Magma via ftp. Ten years later the Magma language has hardly improved at all. They finally have exception handling, but still no user defined types, etc., etc. The library of functionality implemented in Magma is huge though; the issues with the language didn't stop people from implementing many exciting algorithms, which have supported huge amounts of research in number theory and other areas.

I sat down on that park bench, and realized what a dangerous path I was taking in giving up so much freedom so early in my career. I resolved at that moment not to do it. At that moment I started designing what would eventually become Sage. I started thinking about the language I would implement (I had taken a course in writing interpreters when I was a student), about implementing all the linear algebra algorithms Allan had hinted at (but given no details), etc. I then realized that if I did this, I would have to do it by myself, since almost everybody I knew used Magma, and would consider my plan too difficult and pointless. I wouldn't get to do number theory for years. My spirit broke. And Cannon told me that many of my issues with Magma would be addressed within a year.

I spent the next 5 years writing code in and using Magma. I gave dozens and dozens of talks all over, and convinced anybody who wanted to do computations with modular forms that Magma was the way to go. I gave away free copies of Magma (with John Cannon's official blessing), taught undergraduate courses using Magma, and was generally very productive. Also in 2003, I had a real job and money, so I started forming this philosophy of software, where I would judge what software I used purely based on capability and functionality, and not on price or openness. I started using Microsoft Windows full time, since it best supported my PDA and had the widest range of software. I bought some $500 (with educational discount) suite of Adobe software for video editing, photos, vector graphics, etc. And of course I used Magma. I was a well paid academic at a well-endowed university (Harvard), and wanted the best that money could buy.

In 2002, William Randolph Hearst III also donated money to Harvard to buy me a cluster of computers, and by 2003, I wanted to easily script running lots of computations in parallel. Since Magma didn't have any parallel capabilities, I stumbled on some language called ``Python" (Version 2.3), which looked a lot like Magma, but was designed for general purpose scripting. I started using it to run many computations in parallel on that cluster. It was the best tool I could find for the job.

I also started working much harder on making the number theory data I was computing with Magma available online, and naturally I turned to Python. Dimitar Jetchev (a Harvard undergrad) and I wrote Python code to make the data easily queryable via a web interface, and also wrote code that made it so one could do computations in Magma (and PARI in some cases) over the web. One incarnation of this is hosted on the Magma website: calc

As I learned about Python, a funny thing happened. I had by this point developed a large list of issues with Magma. For example, the documentation and examples in the Magma reference manual aren't automatically tested to ensure they give the claimed output, and they often get out of sync with the actual code. Python is in many ways similar to Magma -- the language itself feels somewhat similar, and it has the same ``batteries included" philosophy. The surprising thing was that Python had solved the dozens of major problems I had with Magma! I was excited about this in 2003, so during my next (and last) trip to work with the Magma group in Sydney, I started trying to incorporate these solutions into my new Magma code, and to explain what I had learned to John and others. Right after I returned from Sydney, I recall excitedly explaining all of this to David Goldschmidt at a reception at an AMS meeting.
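One concrete instance of documentation that is automatically tested is Python's standard doctest module, which runs the examples embedded in a docstring and compares the printed results with what the docstring claims. The post does not say which Python features were meant, so take the following only as a plausible sketch; is_prime and check_examples are names made up for the illustration.

```python
import doctest


def is_prime(n):
    """Return True if n is a prime number.

    >>> is_prime(389)
    True
    >>> is_prime(391)
    False
    """
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def check_examples(func):
    """Run the examples in func's docstring; return (failed, attempted)."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(
        func.__doc__, {func.__name__: func}, func.__name__, None, 0
    )
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test)


result = check_examples(is_prime)
print(result.failed, result.attempted)
```

If the implementation and the documented output ever drift apart, the failed count becomes nonzero, which is exactly the out-of-sync situation described above.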

I was quite surprised when a month or two later, in early 2004, John Cannon really soundly rejected my ideas and even had one of his employees rewrite my new modular abelian varieties code to get rid of them. In retrospect, now that I run a large software development project, I can understand why he didn't get what I was doing. But I was really surprised then. Then I went to a big Magma conference at IHP in Paris, where Manjul Bhargava (a young professor at Princeton), I, and many other people gave a series of talks. I recall listening to Manjul's talk as he described a research problem he was working on, and during his talk he explained that the whole thrust of his research was seriously stymied because Magma is closed source. He needed to adapt some relatively minor part of some algorithm in Magma related to quadratic forms, and simply couldn't due to it not being in the interpreter level of Magma. This just didn't seem right.

I also had lunch with John Cannon during that workshop, where he explained some big plans he had to edit a huge sequence of volumes about the mathematics behind algorithms implemented in Magma. I suggested that it would be nice if these were freely available, and he did not think that would be possible. He made good on his idea, and I actually published a paper in the first such volume: bsdmagma. I wrote that paper in Windows using Microsoft Word (and converted to LaTeX using WinEdt, a non-free LaTeX frontend, only at the very end)!

Finally, at that IHP workshop I learned two other things. First, I learned that a workshop is an incredibly efficient way to develop mathematical software, far more efficient than the Magma model of hiring people for months at a time and flying them to Sydney, and certainly vastly more efficient than what Maple, Mathematica, and Matlab do, which literally costs hundreds of millions of dollars per year. Second, I recall overhearing conversations about the Magma language during tea breaks by some locals who were interested in the workshop, but had not drunk the Magma koolaid, and in these conversations it became clear that I wasn't the only person that found the Magma language to be deficient.

I started reading Slashdot in early 2004, mainly for the interesting tech news, and the comments kept mentioning ``open source", which was honestly something I had paid almost no attention to until then. Intrigued, I decided to look around and see how open source mathematics software had done since I had abandoned it in 1999. NTL was no longer being actively developed, the LiDIA project was nearly dead, and PARI hadn't changed much (except to break all my old code), though it had gained some more algorithms for relative number fields. So in five years, the situation with the open source number theory software environment had got worse. I realized that I was probably partly to blame, having tried to convince every number theorist I knew to use Magma, and often given them free access to ensure they could. I had helped hook a generation.

About this time I was also writing an elementary number theory book (which eventually became this book: ent). I had planned to have a chapter about number theory using each of Mathematica, PARI, Magma, and Maple. I had the first three programs, but didn't have access to Maple. Somebody suggested that being a faculty member and writing a book should be a good argument for Maple to send me a free copy, so I contacted them. A person from Maplesoft called me back, and explained that though I was writing a book with a chapter on Maple, they could not give me a free copy. However, they could give me the special academic discount of 500 dollars. I asked if he could do better, and he called me back the next day and said: ``If you can get 4 of your colleagues to also buy Maple at 250 dollars/each then I can sell you Maple for 250 dollars." I was offended, so I ``obtained'' a ``trial'' copy, and started writing my chapter anyways. I started trying all the same things as I had easily done in PARI and Magma, e.g., checking primality of numbers, etc. I was totally surprised to find that Maple was terrible, being massively slower than PARI, Magma or Mathematica for most elementary number theory computations relevant to my book. (Maplesoft was bought by some Japanese company a few months ago, by the way.)

I also installed Linux in a virtual machine (under Windows), to see what all the fuss was about, and found I started using Linux all the time instead of Windows, because the software was better (even Magma runs much better under Linux than Windows). I deleted Windows and installed Linux. I was also starting to definitively realize that my huge list of problems with Magma would never, ever get resolved, and was getting increasingly frustrated by these problems because Python didn't have them.

I started talking a lot with Thomas Barnet-Lamb about a crazy idea to create a new open source math software system with readable implementations of algorithms, and nothing hidden in some stupid proprietary layer. Thomas was then a first year Harvard grad student who had won some international computer programming competition, so I figured he would enjoy talking about software. I also talked a lot with Dylan Thurston about this crazy idea; Dylan had started grad school at the same time as me at Berkeley, graduated the same time, had the same first two jobs as me, and was also an Assistant Professor. Both Thomas and Dylan gave me many ideas for programming languages to consider, including OCaml (which Thomas liked), Haskell (which Dylan was a huge fan of), etc. After having used Magma for years, with its highly optimized algorithms, I desperately needed a fast language. But I also wanted a language that was easy to read, and that mathematicians could pick up without too much trouble, since I wanted people like Manjul to someday use this system and not have their research cut off. And I knew from experience that unreadable source code is no better than closed source.

I'm not going to go into negatives of any languages. Though I used Python a lot, for a long time I didn't consider it seriously at all for this crazy project, since I tried implementing some basic arithmetic algorithms in Python and found that they were vastly too slow to compete with Magma (or C). I had also tried quite hard to use SWIG to make C++ available in Python, but SWIG is extremely frustrating, and has horrible performance (due to multiple layers of wrapping), at least compared to what Magma could do.

In October 2004, I was flying back from Europe (the Paris Magma conference) and started reading the Python/C API reference manual straight through. I realized that Python is far, far more than just an interpreter. It is a C library that implements everything you need, and has a well defined and well documented API. I did some sample benchmarks on the plane, and found not surprisingly that I could write code as extensions to Python that was just as fast as anything one could write for Magma by modifying the Magma kernel, since under the hood, both were written in C. Also, on the flight, I realized that because the Python/C interface uses reference counting, it would be vastly easier to write the C extensions I would need using some sort of language I would design. I got home and somehow stumbled onto Pyrex, which was exactly what I was planning to write. I tried it out, did benchmarks, and realized that I had a winner.

With Pyrex and Python, I could implement algorithms and make them as fast as anything in Magma, assuming I could figure out the right algorithm. Moreover, the dozens of issues I had with Magma, many of which were simply a function of them not having the resources to do language development, were already solved in Python. And Python would continue to move forward with no work from me. It was mid-2004 and because of Python, the overall software ecosystem was much better than in 1999, despite open source number theory software having not moved forward much.

I started going to (and sometimes hosting) the Boston Python user group meetings, which were quite large, and gave me much useful feedback. And I decided it was time to move past my test and prototype stage and get to work. My plan, as I had explained it to Thomas, was to create a complete new system from the ground up using Python + Pyrex. All the code would have an easy to read Python implementation that was well documented, in some cases there would be a much faster Pyrex implementation of the same code, etc. With my naive plan in hand, I sat down with the main elliptic curves file of the PARI source code, and started to translate.

I think I made it through one function. Where some might have doggedly persisted for years with such an approach, I quickly ran out of patience. In fact, when it comes to software and programming I can be extremely impatient. I realized that my entire plan was insane, and would take too long. I had discussions with Thomas, Dylan, and others, and everybody I knew who was seriously into number theory computation was using Magma, so I realized that I was going to have to do this entire project myself. So I realized translating was doomed. Somehow, even with all my experience, I had massively underestimated the complexity of the algorithmic edifice that is any serious mathematical software system.

I read the PARI C API reference, and used Pyrex to write a wrapper so that I could call some basic PARI functions from Python. I implemented basic rational and integer types using Pyrex and GMP, and the performance was reasonable. One day, I was using Matplotlib (a Python library) to draw some plots for Barry Mazur that involved explicit computation with the incomplete Gamma function, and was frustrated because neither PARI nor Magma had an implementation of this special function at the time. Harvard had a Mathematica site license, so I had a copy of Mathematica, and I wrote code using the pexpect Python library to hold open a single Mathematica session and use it to compute the incomplete Gamma function. Problem solved. This was when the interfaces between Sage and other mathematics software systems were born.
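The idea of holding one external session open and sending it expressions can be sketched roughly as follows. This is only an illustration under loud assumptions: it uses the standard library `subprocess` module rather than pexpect, and a child Python process stands in for Mathematica; the `Session` class and `CHILD` script are hypothetical names, not the original code.

```python
import subprocess
import sys

# Hypothetical stand-in for an external math system: a child Python process
# that reads one expression per line and prints its value.
CHILD = (
    "import sys, math\n"
    "for line in sys.stdin:\n"
    "    print(eval(line, {'math': math}), flush=True)\n"
)


class Session:
    """Keep a single child process alive and reuse it for every request."""

    def __init__(self):
        self.proc = subprocess.Popen(
            [sys.executable, "-u", "-c", CHILD],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
            bufsize=1,
        )

    def eval(self, expr):
        # Send one line to the child, then read back one line of output.
        self.proc.stdin.write(expr + "\n")
        self.proc.stdin.flush()
        return self.proc.stdout.readline().strip()

    def close(self):
        # Closing stdin ends the child's read loop, so it exits cleanly.
        self.proc.stdin.close()
        self.proc.wait()
        self.proc.stdout.close()
```

The point of the design is that the startup cost of the external program is paid once, and every later call is just a line written to and read from the same process.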

In January 2005, I was at the AMS meeting in Atlanta, Georgia, hacking on my code, and David Joyner walked up to me and asked what I was doing. Until then, I had not shown my Python/Pyrex math software project to people. There were a few reasons, including feeling sure that it was massively too difficult to pull off, that working on something like this would seriously piss off John Cannon, etc. But feeling brave, I showed David what I was doing and I was surprised that he found it interesting. I promised to post a copy online, which he could download.

David Joyner is the first to admit that it's a good idea to make software easy for him! So I had to make it easy for him to download and install my program, which I called Manin at the time (after one of my favorite mathematicians). My target audience wasn't ``Debian"; it wasn't ``Python programmers"; it wasn't elite hackers---it was David Joyner. I had to make this program trivial for him to install, work with, develop, etc. I thought about how it had literally taken me huge amounts of time just to build Python, GMP, PARI, etc. all from source in a directory for development, and realized that there was no way in hell David would get anywhere on Manin if I told him that his first step was to figure out how to build all those programs, then get back to me. So I set up something that would do it all automatically in a self contained way. He tried it, it ``just worked", and he got really excited and started writing code for Manin. David is a coding theorist, and wanted group theory and coding theory functionality in Manin, but didn't want to write it all himself from scratch, so he asked in email about making Manin and GAP talk to each other somehow. I showed him my pexpect code for controlling Mathematica, and he adapted it to create a GAP interface.

David also works at the US Naval Academy where he evidently teaches a lot of Calculus and Differential Equations courses. He was having so much fun with Manin, he asked about adding something to do symbolic calculus to Manin. This was 2005, and I personally had never used any symbolic calculus software aside from Mathematica and Maple 12 years earlier, since I viewed computational symbolic calculus as pretty much irrelevant for most computational number theory, and the Calculus I had taught never required a computer since computers weren't allowed on exams. (I now think no computational technique should be a priori viewed as irrelevant to research in number theory!) So I asked David about the available open source options, and he said they were Axiom and Maxima, neither of which I had ever heard of. I can't remember how we chose Maxima instead of Axiom, but it was some combination of Maxima being easier to build, easier to understand, and having about 1000 times as many users. In any case, like with PARI, GMP, and Python, I added Maxima and GAP to the Manin distribution. I also changed the name from Manin to SAGE = Software for Algebra and Geometry Experimentation.

David also convinced me Sage needed commutative algebra. At first, he talked to people and tried to implement everything from scratch, but even the resulting arithmetic was dreadfully slow. Groebner bases would be a nightmare waiting on the horizon. We were both impatient, so we decided to try to find an open source program already out there, and just use it. There were two choices: Macaulay 2 and Singular, which had a lot of overlap in functionality. For what we wanted---basic commutative algebra---they both did all we needed. Singular built from source easily in a few minutes on every computer I cared about. Macaulay 2 was ridiculously hard to build and took a long time. I think based mostly on that, we chose Singular. Also, it was encouraging that Singular had a relatively large development team, and did better in some benchmarks I tried.

At the same time as all this, I was traveling a lot and interviewing for tons of tenure/tenure track jobs all over the place. I got some job offers with tenure, and suddenly had the crazy idea that if I worked fulltime on SAGE for a year or two, my career could not be destroyed. This really encouraged me. My number theory research slowed a lot, and I spent all my extra time on SAGE for a while.

Remember David Kohel, who six years ago in 1999 first introduced me to Magma? It turns out that like me he spent years and years writing a large library of software on top of Magma for number theory and cryptography research. However, at some point he had a fairly serious falling out with the Magma group, whose details I will omit. Suffice to say, like me he was motivated to at least look for other options. He started building and using Sage, and started doing huge amounts of work on Sage as well, e.g., introducing morphisms and categories systematically into Sage, and implementing tons of code related to elliptic curves, algebraic varieties, etc. David Kohel was a Biologist as an undergraduate and has an amazing eye for general structure. He also had many technical issues with Magma, which were mostly different than mine. For example, he felt that the Magma developers had made a mistake with the design of morphisms, and he didn't want Sage to make the same mistakes. And he was right to worry, since for things I didn't care too much about, I would usually just copy Magma... or as David would say, ``copy Magma's mistakes".

I moved to San Diego and Joe Wetherell who first introduced me to modular symbols in 1997 also got involved in Sage development, though mostly from the conversation point of view. Joe had long ago quit grad school to start a software company in the early 1990s, then retired from that and went back to grad school, so he had a fairly mature perspective on software development, and he knew a huge amount about number theory and optimized algorithms. So 2005 was a long, long year in which David Kohel in Australia, David Joyner in Maryland, and me in San Diego, wrote code.

At the end of 2005, the three of us had written a ton of code, integrated numerous components together, and finally had something. On December 6, Jaap Spies mentioned Sage on the sci.math.symbolic newsgroup, in response to which some guy named Richard Fateman declared Sage a curiosity and made some unencouraging assertions about the way the world works (regarding users, funding, etc.):

I was certain deep down inside that Sage would fail anyways, that what we wanted to do with Sage was totally impossible, so Fateman's comments couldn't discourage me further. I just didn't care that Sage was doomed. I couldn't help pushing further.

John Cannon found out about Sage, maybe as a result of the postings on sci.math.symbolic, and right before Christmas in 2005 he sent me this email:

This is to formally advise you that your permission to run a general-purpose calculator based on Magma ends on Dec 31, 2005. This was originally set up at your request so students in your courses at Harvard could have easy access to Magma.

Please confirm receipt of this letter.

Wishing you a happy Christmas,
John

This single email seriously scared me. Though I had been working on Sage very hard for nearly a year at this point, I honestly didn't then expect Sage to really be able to replace Magma for me. Magma was the commercially funded result of fulltime work over decades (really starting in 1973 with the first version of Cayley). The amount of work to get from what I had with Sage in December 2005 to what I had with Magma in December 2000, was still absolutely momentous. I didn't even know if it could be done by a single human being. Moreover, as far as I could tell many of the critical linear algebra algorithms I needed (to make the difference between a calculation taking a minute or a year) existed only in the secret kernel of Magma and Allan Steel's head, and they were going to stay locked there forever as far as I could tell. For example, in June 2004 (before Sage existed), Allan and I were together at the ANTS VI conference. I started asking Allan to explain some of the algorithms, and he would explain things to a point, but not nearly enough to do an actual implementation. And he gave me this look, like he knew I was trying to get something out of him.

Isn't it weird that mathematics can be done that way? In 2004, almost everybody in the world doing serious computations with elliptic curves, modular forms, etc., was using Magma. Magma was the industry standard, Magma had won for the foreseeable future. David Kohel and I were a big reason why. And yet what kind of mathematics is it, when much of my work fundamentally depends on a bunch of secret algorithms? That's just insane. Moreover, it turns out that these algorithms I'm alluding to are really beautiful (and they are now standard and in some cases better than what's in Magma, in my opinion, thanks to great work of people like Giesbrecht, Pernet, Kaltofen, Storjohann, Saunders, Albrecht, etc.).

Anyway, John Cannon's email above seriously scared me. I wasn't in any way confident that Sage would ever replace Magma for my work and teaching, and I had big plans involving interactive mathematical web pages. These plans were temporarily on hold as I was drawn into Sage. But they were still there. What John did with that email is tell me, in no uncertain terms, that if I was going to create those interactive mathematical web pages, they couldn't depend on Magma. ``This is to formally advise you that your permission to run a general-purpose calculator based on Magma ends." I was scared. It was also the first time I saw just how much power John Cannon had over my life and over my dreams. That email was sent on a whim. I hadn't got any official permission to run that Magma calculator for a specific amount of time (just open ended permission). What John made crystal clear to me was that he could destroy my entire longterm plans on a whim. I looked around for other options, and there just weren't any. Sage had to succeed. But still I was certain that it just wasn't humanly possible, given that I had to do almost all the work, with limited funding and time.

At this time I had an NSF grant, and also startup money at UCSD, hence I could rebudget some of my NSF grant. David Joyner suggested that we run a ``Sage Days", which I guess was named after the East Coast Computer Algebra Days (ECCAD). David and I organized the first one, and David did an amazing job inviting a great cast of speakers, including Steve Linton (of GAP), Sebastian Pauli (of KANT), etc., and Joe Buhler who also lives in San Diego made sure we scheduled the workshop so that a lot of people would show up. We had Sage Days in early February, and I released Sage version 1.0 during my talk, which started the workshop. The talks went well, people were extremely enthusiastic about Sage, and the coding sprints were intense: the first version of Sagetex was written; the current sophisticated GAP interface was written then by Steve Linton; Kiran Kedlaya and David Roe wrote the Macaulay 2 interface; and I had the first spark of insight about how to create the Sage Notebook, after watching a talk by Robert Kern about some failed attempt to give IPython a notebook interface. That was the first time I realized a notebook style interface had some value. And Gonzalo Tornaria got us to finally start using revision control for our source (!), which meant way more people could easily contribute. (I had used revision control before with Magma, but with Sage I had been just taking snapshots regularly.)

As a direct result of Sage Days 1, development picked up. Then I moved to University of Washington (Seattle) two months later. Somehow, during the summer of 2006 I was invited by MSRI to run a 2-week summer workshop on modular forms for about 40 graduate students. I invited David Kohel and a few other speakers. At this point, honestly some aspects of Sage sucked. I had written a massive amount of code, for a really wide range of things. Power series, fraction fields, number fields, modular symbols, linear algebra, etc. I probably should have just used Magma for the workshop, and indeed at least one speaker entirely did. But I didn't... and this was a turning point. Some of the students, such as David Harvey (grad student at Harvard), Robert Bradshaw (Seattle), Craig Citro, and many others, became highly interested in fixing the numerous flaws they ran into with Sage. After the talks, we had huge all night coding sprints in the dorm lobby. Students constantly asked me questions about how to do things in Sage, and my answer was usually: ``It's easy. Implement it and send me a patch!" They made a t-shirt for the conference with this quote on it.

Next we started planning Sage Days 2 in Seattle in late 2006. This second Sage Days was also well attended and resulted in major fundamental development directions. For example, David Harvey led a charge to redesign the coercion model and David Roe got obsessed with implementing every model imaginable of the p-adics (this still isn't really done over 3 years later). Sebastian Pauli gave a talk in which he explained what anybody who takes an algebraic number theory course knows (or pays attention to Weil), which is that there is a number field and function field analogy and that all the algorithms carry over. Guess what---Magma has a sophisticated implementation of all the relevant algorithms in the function field case, due to work of Florian Hess that built on work of the KANT group and others, and PARI has absolutely nothing for algebraic number theory over function fields. Sebastian explained that in fact Magma is the only program in the world that provides both sides of this analogy, I think hoping that we would do something about this problem. (Now it is 2009, and still nothing at all has happened---Magma is the only program in the world for the function field half of algebraic number theory. Gees.)

We had about 13 talks on the first day of Sage Days 2. At the end of the day, David Savitt (a student of Richard Taylor, and now a professor in Arizona) looked at me and declared me insane.

After Sage Days 2, I spent over 2 very, very painful months implementing David Harvey's proposed coercion model. I learned (or rather, remembered) how difficult certain types of changes to a large interrelated library can be. Also, a student from San Diego---Alex Clemesha---followed me to Seattle, and I paid him fulltime to work on Sage using my startup money. He implemented 2d graphics for mathematicians (instead of scientists, which is what matplotlib provides), and he also helped a lot with the first version of the Sage Notebook. In fact, he was a big Mathematica user before he started using Sage, and he really missed the Mathematica Notebook, so he wanted something similar in Sage. When he used Mathematica, he had a job programming webpages using webMathematica, so really wanted something that combined the notebook idea with a webpage. We came up with various ideas. Then I hired an undergraduate, Tom Boothby, who had just quit a 6-year career in web programming to go back to college. Together, the three of us figured out how to write AJAX applications, and the first version of the notebook was born.

It was a controversial decision at the time to write a webapp instead of a traditional local GUI application. There were many reasons we made this choice, but for me it was mainly because (1) I had written some serious local GUI applications before and knew that they are not easy to write, not portable, and hard to build from source, and (2) I had tried out wxMaxima (the local GUI Maxima interface) and was just totally shocked to see how bad it was, due to having to reinvent the wheel---they would have to implement font dialogs, tabs, everything from scratch; in contrast, with a web application much of that comes for free. So my motivation was entirely to create a desktop application quickly. That it happened to later make it possible for people to collaborate easily, use a Sage notebook over the web, etc., is a nice bonus. And, it's clear by now that web applications (like Facebook, Gmail, etc.) are extremely popular, and will only get way more popular in the future.

In 2007, the Sage project started really picking up steam. Bobby Moretti, another UW undergraduate, got obsessed with making it possible to actually do symbolic calculus in Sage itself. Until Bobby's code was added to Sage in mid 2007, absolutely all symbolic Calculus in Sage had to be done via explicit unnatural calls to Maxima, and involved embarrassing and confusing conversions. Bobby, me and others spent a lot of work designing Sage's first symbolic calculus interface, and Bobby wrote a pure Python ``proof of concept" reference implementation that used Maxima via a pseudotty behind the scenes for everything. This took him months, and probably had a negative impact on all other aspects of his life. But he heroically pulled it off. It went into Sage and changed things dramatically---suddenly, Sage could actually be used for some undergraduate courses. This increased interest in Sage dramatically.

A few months later, in November 2007, Sage was nominated for the Trophees du Libre, and Martin Albrecht presented Sage at the meeting for finalists. We won first place in the Scientific Software category. This resulted in a blitz of publicity (e.g., several slashdot articles, and articles in papers around the world in many languages), and greatly increased the number of Sage downloads.

Around this time we also have Sage Days 5 at the Clay Mathematics Institute, and Craig Citro convinces us to switch to a 100% peer review and 100% doctest policy on all new Sage code. Also, I hire Michael Abshoff to do release management for one year, which temporarily frees me up to work more on coding, grant proposals, and my own research.
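For readers who have not seen a doctest, here is a rough sketch of what such a policy asks of a function: its docstring carries examples that can be executed and checked automatically. The `gcd` function and the `check_docstring_examples` helper are hypothetical illustrations built on Python's standard `doctest` module, not code from Sage.

```python
import doctest


def gcd(a, b):
    """Return the greatest common divisor of two integers.

    >>> gcd(12, 18)
    6
    >>> gcd(7, 0)
    7
    """
    while b:
        a, b = b, a % b
    return abs(a)


def check_docstring_examples(func):
    """Run the examples in func's docstring; return (failed, attempted)."""
    parser = doctest.DocTestParser()
    # Build a test from the docstring, making func visible to its examples.
    test = parser.get_doctest(
        func.__doc__, {func.__name__: func}, func.__name__, None, 0
    )
    runner = doctest.DocTestRunner(verbose=False)
    results = runner.run(test)
    return results.failed, results.attempted
```

Under a policy like the one described, a new function without such examples would simply not be accepted, so the documentation and the tests are written at the same time.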

There is of course much, much more to this story. But it's too recent, and sometimes a story shouldn't be told until enough time has elapsed.

About Me

I am a professor of mathematics at University of
Washington. In my mathematics research, I use the Birch and
Swinnerton-Dyer conjecture as motivation to explore the
constellation of conjectures and questions about arithmetic invariants of elliptic curves. I do many explicit computations, and started the Sage Mathematical Software project. Currently, I'm working very hard on https://cloud.sagemath.com.