This is a soft question but it's not meant as a big-list question. I have recently been asked whether I want to provide feedback at the pre-beta stage on a forthcoming website that will provide a platform for data sharing, and rather than giving just my personal opinion I'd rather consult other mathematicians first. I was going to write a blog post but then I thought that MathOverflow was a more suitable place since I have a question and I'm looking for answers of a certain type rather than general comments. The website seems to be aimed mostly at scientists who want to share raw data, so at first I thought it probably wouldn't be much use to mathematicians since our data is (or are if you prefer) mostly highly interlinked -- the connections are often more interesting than what they connect.

But on further reflection, it seems to me that a good data sharing site could be a valuable resource, even if it doesn't do absolutely everything any mathematician would ever want. For instance, Sloane's database is fantastically useful. A rather different sort of database that is also useful is Scott Aaronson's Complexity Zoo. So useful databases exist already. Is this an aspect of mathematical life that could be greatly expanded given the right platform? And if so, what should the platform be like?

I don't know anything about the design of the site, but if I'm going to comment intelligently on what features it would need to have to be useful to mathematicians, I'd like to be armed with some examples of the kind of data sharing we might actually go in for. Here are a few ideas off the top of my head.

Diophantine equations: one could have a list of what is known about various different ones.

Mathematical problems: listed in some nice categorized way, each problem accompanied by a description, complete with reading list, of what you really ought to know before thinking about the problem. (As an example, if you are thinking about the P versus NP problem, then you really ought to know about the Razborov/Rudich natural proofs paper.)

Key examples in various different areas and subareas of mathematics.

Sometimes you have a whole lot of related mathematical properties with a complicated pattern of implications between them. Under such circumstances, it could be nice to have this information presented in a nice graphical way (something I think this site may be able to do well -- they seem to be keen on visualization) with links to proofs of the implications or counterexamples that demonstrate when the implications do not hold. (The example I'm thinking of while writing this is different forms of the approximation property for Banach spaces, but there are presumably several others.)

List of special functions, with the main facts about each one that are used to prove things about it.

List of integrals that can be evaluated, with descriptions of how they can be evaluated.

List of important irrational numbers with their decimal expansions to vast numbers of places. (I'm not sure why this would be useful but it might be amusing.)

These are supposed to be examples where people could usefully pool the background knowledge that they pick up while doing research. I'm not particularly pleased with them: they should be thought of as a challenge to come up with better ones, which almost certainly exist. If you've ever thought, "Wouldn't it be nice if there's somewhere where I could look up X," then X would make a great answer. I think the most interesting answers would be research-level answers (unlike some of the suggestions above).

If there were a site with a lot of databases, it would make a great place to browse: it would be much easier to find useful data there than if it were scattered all around the internet.

One constraint on answers: there should be something about a suggested database that makes it unsuitable for Wikipedia, since otherwise putting it on Wikipedia would appear to be more sensible.

As a practical matter, you might suggest (assuming these people are not close to StackOverflow) that instead of providing a website-based service for people to set up databases, they provide a software package for users to install on their own websites, with a free version that is open source and useful, and a paid version that adds technical support and more features such as incremental indexing, internet impact, special templates for those with mobile devices, etc. Gerhard "Some Things In Life Cost" Paseman, 2011.06.21
–
Gerhard Paseman Jun 21 '11 at 23:00

We advocated the need for an open, worldwide graphbase to collect and distribute graphs and programs for their generation, analysis, manipulation, and drawing.

And here is their Abstract:

In order to evaluate, compare, and tune graph algorithms, experiments on well designed benchmark sets have to be performed. Together with the goal of reproducibility of experimental results, this creates a demand for a public archive to gather and store graph instances. Such an archive would ideally allow annotation of instances or sets of graphs with additional information like graph properties and references to the respective experiments and results. Here we examine the requirements, and introduce a new community project with the aim of producing an easily accessible library of graphs. Through successful community involvement, it is expected that the archive will contain a representative selection of both real-world and generated graph instances, covering significant application areas as well as interesting classes of graphs.
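
As a rough illustration of what "annotation of instances with graph properties" could mean in practice, here is a minimal Python sketch. The record layout, the field names and the `annotate_graph` helper are assumptions made for this example; they are not taken from the project described in the abstract.

```python
from collections import deque

def annotate_graph(n, edges):
    """Return basic properties of an undirected simple graph on vertices 0..n-1."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    # Breadth-first search from vertex 0 to test connectivity.
    seen = {0} if n else set()
    queue = deque(seen)
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return {
        "vertices": n,
        "edges": sum(len(nb) for nb in adj.values()) // 2,
        "degree_sequence": sorted((len(adj[v]) for v in range(n)), reverse=True),
        "connected": len(seen) == n,
    }

# Hypothetical archive entry: the 5-cycle, stored as an edge list plus metadata.
record = {
    "name": "C5",
    "edges": [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)],
    "source": "generated",
}
record["properties"] = annotate_graph(5, record["edges"])
print(record["properties"])
```

A real archive would presumably store many more invariants and references to experiments; the point is only that the annotations can be computed once and stored next to the instance.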

I agree with André Henriques regarding algebraic topology. In addition to cohomology of $K(\mathbb{Z},n)$ data, I'd also like to see pages of spectral sequences computing various other things of interest, e.g. homotopy groups of spheres, complex cobordism, modules over real $K$-theory, etc. This subject has a massive dearth of examples, especially of examples using spectral sequences. One place where some computations exist is Bob Bruner's webpage, and I like the way he displays some of those spectral sequences as JPEGs. I envision something like that going into this database, but with more computations because more mathematicians can contribute.

In a related vein, I would LOVE to have a repository of applets and other computer code mathematicians have written to help them do computation. For instance, Aaron Mazel-Gee has an Adem relation calculator which would have helped me a ton back when I was starting to learn this material. I've found various other applets (e.g. from Christian Nassau's webpage linked from Bob Bruner's page) but they are not easy to find. If we had a database it seems it would be a natural place to put up code we've written along with documentation to help others with similar computations. Obviously algebraists have Sage and GAP but there seems to be nothing like that for algebraic topology. If there is such a place please let me know, especially if it helps with the bookkeeping in spectral sequence computations!

I'd quibble with saying 'this subject has a massive dearth of examples' --- spectral sequences are almost exclusively developed by and for examples, of which there are a spectacular plethora. The (well, a) real problem is that until recently typesetting spectral sequences was sufficiently difficult that people did it quite infrequently, even in research papers whose main content was spectral sequence computations. Tilman Bauer's sseq package was a big improvement, but many kinds of SS displays are still painful to TeX.
–
cdouglas Jun 25 '11 at 1:14

What I meant was mostly with regards to the "modules over $K$-theory" and similar things. I'm thinking of ring spectra and modules over them. In that setting there seem to be very few examples. You can get Eilenberg-MacLane spectra, spectra of the form $\Sigma^\infty X$ for a space $X$, and classical (generalized) cohomology theories. Beyond that I have found very few examples. For spectral sequence computations I agree that there are many examples.
–
David White Jun 25 '11 at 14:12

So people are working on Sage packages for doing things with the Steenrod algebra; does this count as algebraic topology?
–
Sean Tilson Jun 29 '11 at 23:26

Several times this week I wished I could find a list of adjunctions and the explicit monad and comonad thereby generated. May as well throw in the algebras thus generated, as well as the Kleisli and Eilenberg-Moore categories. The usefulness would come from having all the definitions properly unwound, so that one would recognize the 'familiar' objects immediately.

In the flavor of "Wouldn't it be nice if there's somewhere where I could look up X," I think the database could probably use a list of online lectures which are freely available, e.g. some entries given by this MO question.

In the same vein, I could also imagine listing lecture notes people have prepared for courses, but this would be harder since there are so many places online where professors have posted their lecture notes and it can be difficult to say when one set of lecture notes is "better" than another. Still, having online lectures as well as resources for teaching (e.g. calculus applets) would fit the bill of things I like to look up but often have to search all over the web for.

I would like such a database to include a list of conferences by field of study and location of the conference. As it stands several people have webpages where they post about conferences and there's also the AMS page but some conferences still slip through and I don't hear about them till they're over. One would assume that a giant database which all professional mathematicians could contribute to would not have this problem. Conference organizers could post to this database and in this way get the word out to a much larger audience. It's also not at all hard (using say SQL) to make your database self-cleaning, i.e. remove conferences that have already happened.
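
To make the "self-cleaning" remark concrete, here is a small sketch using Python's built-in `sqlite3` module. The table name, the column names and the sample rows are all made up for illustration; the only real idea is a `DELETE` keyed on the end date.

```python
import sqlite3
from datetime import date

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE conferences (name TEXT, field TEXT, location TEXT, end_date TEXT)"
)
# Hypothetical sample entries; dates are stored as ISO strings (YYYY-MM-DD),
# so plain string comparison agrees with chronological order.
conn.executemany(
    "INSERT INTO conferences VALUES (?, ?, ?, ?)",
    [
        ("Example Topology Workshop", "algebraic topology", "Chicago", "2011-05-20"),
        ("Example Combinatorics Meeting", "combinatorics", "Prague", "2011-09-02"),
    ],
)

def remove_past_conferences(conn, today):
    """Delete every conference that ended before `today`; return how many were removed."""
    cur = conn.execute(
        "DELETE FROM conferences WHERE end_date < ?", (today.isoformat(),)
    )
    conn.commit()
    return cur.rowcount

removed = remove_past_conferences(conn, date(2011, 6, 25))
remaining = [row[0] for row in conn.execute("SELECT name FROM conferences")]
print(removed, remaining)
```

In a live system one would probably run the cleanup on a schedule, or simply filter on the date when querying and keep old entries as an archive.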

I like that this is community wiki, so I can throw out a crazy idea and not feel bad if people vote it down. I've come across a number of questions on MO of the flavor "please give me a textbook recommendation for subject X." What about having a page in the database for mathematicians to rank textbooks in various fields, e.g. by voting for them?

The obvious problem I foresee with this is that different textbooks are good for different levels of learning, but I suppose this page could be geared towards undergrads or early grad students who don't yet know which books will serve them best. By the time you are an expert in a field you don't need textbook recommendations, so this aspect of the database would need to be purely pedagogical and geared towards people starting in a subject.

As it stands, I need to search several websites (e.g. Google Books, Amazon, Goodreads, etc.) for recommendations and these can come from literally anyone. With this database I assume we'd be restricting who can post to the database so that only professional mathematicians could. So I would know that I could trust the recommendations as coming from a mature and knowledgeable source rather than an undergrad who just got a bad grade and blames the textbook.

While the OEIS is open & easily downloaded, the algorithms behind superseeker are not. The OEIS could really benefit from an open and usable database of sequence transformations.
–
Kevin O'Bryant Oct 30 '11 at 22:28

@KevinO'Bryant: The superseeker source has been public since at least 2010. You can download it at oeis.org/ol.html which is linked to from the bottom of each page on the OEIS.
–
Charles Jul 14 '14 at 14:32

I often find myself in situations where I have an object with certain properties, so a way of searching for theorems about objects with these properties would be very convenient. This could be expanded into a database where each theorem has a computer-verifiable proof.
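
A very naive version of such a search can be sketched in a few lines of Python. The entries, the tag vocabulary and the `applicable_theorems` function are hypothetical, and this is plain tag matching: a theorem is returned when all of its listed hypotheses appear among the properties you supply, with no attempt at understanding the statements.

```python
# Hypothetical entries: each theorem is tagged with the hypotheses it needs.
THEOREMS = [
    {
        "name": "Heine-Borel theorem",
        "hypotheses": {"subset of R^n", "closed", "bounded"},
        "conclusion": "compact",
    },
    {
        "name": "Extreme value theorem",
        "hypotheses": {"continuous real-valued function", "nonempty compact domain"},
        "conclusion": "attains a maximum and a minimum",
    },
    {
        "name": "Wedderburn's little theorem",
        "hypotheses": {"division ring", "finite"},
        "conclusion": "commutative (a field)",
    },
]

def applicable_theorems(known_properties):
    """Names of theorems whose hypotheses are all among the known properties."""
    known = set(known_properties)
    return [t["name"] for t in THEOREMS if t["hypotheses"] <= known]

# Extra properties are allowed; only the listed hypotheses must be present.
print(applicable_theorems({"finite", "division ring", "associative"}))
```

The hard part, of course, is agreeing on the vocabulary of properties and recognising when two differently worded hypotheses are the same.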

Without semantic parsing of theorems (i.e. being able to decompose a theorem into semantic subcomponents/objects, not just a syntactic collection of tags) this would be of very limited use. And I don't think we're getting any kind of semantic search in mathematics soon...
–
Marcin Kotowski Jun 22 '11 at 9:34

@Marcin Kotowski: Well, I still want to see this in a DB.
–
Per Alexandersson Jun 22 '11 at 16:37

My feeling about this is closest to the answer by Timothy Chow. First, a place to look up a definition or a named theorem would be better off somewhere else. There are existing databases that have proved themselves useful. The OEIS is far more useful than I would ever have expected. John Cremona's database of elliptic curves is the result of a lifetime's hard work, although I have not used it myself. Again it is not my field, but my understanding is that GAP includes many databases that have been built up over years. There are also Thistlethwaite's Knotscape and Bar-Natan's Knot Atlas, which are databases of knot tables and knot invariants. There is also Nauty, which produces great lists of graphs and related structures.

It seems to me that all of these become far more useful if the database is part of a computer algebra package. Furthermore, it is not just the data but the calculations that can be done with it that make it useful to mathematicians.

For numerical analysts and scientific computing folks, a database of 'standard problems' with given geometry, parameters, tolerances and input/output specifications, and a mechanism for storing and comparing (curated) computational attacks on these. For example, the problem of lid-driven cavity flow is considered a major test for computational fluid dynamics, and it would be great to have an agreed set of 3 or 4 sub-problems (laminar flows, angled walls, incompressible flows, nearly incompressible flow) on which the performance of algorithms could be compared.

Comparisons of the performance of algorithms on a given problem according to criteria such as accuracy, storage needed, and efficiency would be useful. Code in a given language would be useful, but one probably cannot insist on this.

There's a dearth of such 'standard problems', and thus algorithms purportedly approximating solutions for the same problem are rarely compared.
NIST has an example of such standard problems for models of micromagnetics.
http://www.ctcms.nist.gov/~rdm/mumag.org.html
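
To show the shape such a 'standard problem' record and comparison could take, here is a toy Python sketch. It uses a deliberately trivial stand-in problem (a one-dimensional integral with a known value) rather than cavity flow, and the record fields, the tolerance and the `evaluate` helper are assumptions made for the example.

```python
import math

# Hypothetical "standard problem" record: inputs, reference answer, tolerance.
PROBLEM = {
    "name": "integral of sin on [0, pi]",
    "f": math.sin,
    "a": 0.0,
    "b": math.pi,
    "reference": 2.0,   # exact value of the integral
    "tolerance": 1e-6,
}

def trapezoid(f, a, b, n):
    """Composite trapezoid rule with n subintervals."""
    h = (b - a) / n
    return h * (0.5 * f(a) + sum(f(a + i * h) for i in range(1, n)) + 0.5 * f(b))

def simpson(f, a, b, n):
    """Composite Simpson rule with n subintervals (n must be even)."""
    h = (b - a) / n
    odd = sum(f(a + i * h) for i in range(1, n, 2))
    even = sum(f(a + i * h) for i in range(2, n, 2))
    return h / 3 * (f(a) + 4 * odd + 2 * even + f(b))

def evaluate(method, problem, n):
    """Run one method on the problem and report cost and accuracy."""
    value = method(problem["f"], problem["a"], problem["b"], n)
    error = abs(value - problem["reference"])
    return {
        "method": method.__name__,
        "evaluations": n + 1,  # function evaluations used by either rule
        "error": error,
        "within_tolerance": error <= problem["tolerance"],
    }

results = [evaluate(m, PROBLEM, 64) for m in (trapezoid, simpson)]
for r in sorted(results, key=lambda r: r["error"]):
    print(r)
```

For a real benchmark the reference solution would itself need to be curated, and criteria such as storage and wall-clock time would be recorded alongside the error.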

Here is a model to emulate: The Complexity Zoo, and the associated Complexity Garden, already mentioned by Timothy. If you ever encounter an unknown (to-you) complexity class acronym (PPP, LOCCFL, QIP, ...), then this is the source to consult. Does this fail the Wikipedia test? You can look up each acronym individually, but to have the entire Zoo (I count [grep] 569 entries) in front of you as you try to identify features of the animals is extremely useful. And in fact most Wikipedia entries for the individual classes link to The Complexity Zoo for further information.

It seems to me that it would be a good start to list existing resources that are already known to be useful. These should be a good indication of what other resources would be useful, if not examples of resources that would benefit from being located in a single centralized place instead of scattered across the web. Here are three categories of resources that come to mind.

Tables of data for small examples of various important families of mathematical objects. In addition to those that have already been listed, I'll mention Cremona's elliptic curve data and a list of data about small matroids on David Haws's site.

Code repositories for implementations of important algorithms that aren't readily available in a standard mathematical computation package. Currently these seem to be scattered across people's personal homepages, and run on a variety of computing environments, although Sage is making a creditable attempt to unify everything.

Dynamic surveys, such as those maintained by the Electronic Journal of Combinatorics. These are typically maintained by a single person, although some of them might benefit from wikification. The NP Optimization Compendium might be an example of this; the subject is so vast that it is hard for just one or two people to keep it up to date.

It seems like this fails the Wikipedia test.
–
Qiaochu Yuan Jun 22 '11 at 0:03

@Qiaochu: I don't understand your comment. Are you talking about the particular example "Noether's theorem" (which indeed is easy to find on Wikipedia) or are you making some other point? I could provide other examples of named theorems for which it took me some time and effort before I figured out what the theorem exactly said.
–
André Henriques Jun 22 '11 at 17:40

I believe Qiaochu is referring to the last paragraph in the question, and is suggesting that such a list would be appropriate on Wikipedia.
–
S. Carnahan♦ Jun 23 '11 at 11:26

Lie groups and their representations certainly deserve a database. (See e.g. the Atlas project, and Olver's books, which are the only references I am aware of for the classification of smooth group actions on $\mathbb{R}^n$ for small $n$.)

Also there is a project to create an internet database of solutions to the Einstein equations and I guess that other important PDEs could use a database too.

The one thing that can really be a problem, particularly when you are not that familiar with some area, is when you become interested in objects of some kind with a particular property and you do not know that these are usually being referred to as, say, Garfield-Dilbert-FooBars. More generally speaking, sometimes you do not know the name of a mathematical definition and that name alone would help immensely. I for one would like to have a hierarchical database where you can find such names. For instance, you search for "surface", "complete" and "regular" and (among maybe other things), it tells you what a K3 surface is. There should be plenty more examples like this.

A database of notation for various mathematical concepts, covering cultural differences. For example, the different ways to denote the open-closed interval, or the different symbols that are used for strict subset. LaTeX symbols or macros, when they exist, would also be useful. This would have greatly helped me when reading things for the first time and later with writing. Such material is definitely out of scope for Wikipedia.

Another useful database would be of logics and logical theories. These could range from propositional logics to higher order and infinitary logics. The theories should include various fragments of arithmetic, algebraic theories, theories of strings, etc. For each, I would be interested in what is known about decidability (and computational complexity), completeness, interpolation, etc. Currently, I use the Stanford Encyclopedia of Philosophy and Wikipedia but what I want may often not be there or not succinctly presented for reference purposes.

@André: I hadn't seen that before, but I like it more than $[2,\infty)$, which is often introduced at a time when students do not yet have a proper respect for $\infty$.
–
Noah Stein Jun 22 '11 at 13:00

@André: the Czech public school system also employs angle brackets, but with the opposite meaning: these two intervals would be written as $\langle2,4)$ and $\langle2,\infty)$.
–
Emil Jeřábek Jun 22 '11 at 15:16

I forgot another perversity, namely that the end-points would often be delimited by a semicolon $\langle2;4)$ instead of a comma, on account of the latter being used as the decimal mark.
–
Emil Jeřábek Jun 22 '11 at 15:26

+1 for the logical theories - in the same spirit as fragments of arithmetic, also a collection of first-order languages and the objects they describe (for example, the theory of the p-adics seems to have been studied in a few different setups). There are scads of interrelated examples with different properties floating around in model theory.
–
Scott McKuen Jun 24 '11 at 14:14

Here is a database on consequences of and equivalent formulations of the axiom of choice, which is searchable by keyword and axiom form, and which contains hundreds of formulations of choice-like axioms, while providing the implication relations between them and the known models that separate them. I have had occasion to use this database in my research, and the data is useful. I can easily imagine, however, a greatly improved interface or more visual manner of presenting the data, and perhaps this is the kind of situation you seek.

More generally, I can imagine some kind of database keeping track of the various independence results over ZFC (or other theories), especially when combinations of statements are considered, and the models that achieve them.

I can imagine a database of the logical relations between various large cardinal notions and their strengths. The mere fact that the large cardinal chart in Kanamori's book The Higher Infinite is so often consulted proves that a much larger and more comprehensive version of that information would be useful.
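
The data model behind such implication databases is essentially a directed graph of recorded implications plus a list of recorded separations, and both derived implications and derived non-implications can be read off from it. Here is a minimal Python sketch of that idea. The handful of entries are well-known facts over ZF used as sample data, while the dictionary layout, the function names and the placeholder reference string are assumptions made for the example.

```python
from collections import deque

# An entry A -> [B, ...] records a known proof (over ZF) that A implies B.
IMPLIES = {
    "Axiom of Choice": ["Zorn's Lemma", "Boolean Prime Ideal Theorem", "Countable Choice"],
    "Zorn's Lemma": ["Axiom of Choice"],
    "Boolean Prime Ideal Theorem": ["Hahn-Banach Theorem"],
}

# An entry (A, B) records a known model in which A holds and B fails.
SEPARATED = {
    ("Boolean Prime Ideal Theorem", "Axiom of Choice"): "placeholder for a reference to a separating model",
}

def consequences(statement):
    """All statements reachable along recorded implications (including the start)."""
    seen = {statement}
    queue = deque([statement])
    while queue:
        s = queue.popleft()
        for t in IMPLIES.get(s, []):
            if t not in seen:
                seen.add(t)
                queue.append(t)
    return seen

def status(a, b):
    """Classify 'a implies b' as derivable, refuted, or not recorded."""
    if b in consequences(a):
        return "implies"
    # If x does not imply y, x implies a, and b implies y, then a cannot imply b.
    for x, y in SEPARATED:
        if a in consequences(x) and y in consequences(b):
            return "does not imply"
    return "not recorded"

print(status("Zorn's Lemma", "Hahn-Banach Theorem"))
print(status("Hahn-Banach Theorem", "Zorn's Lemma"))
print(status("Countable Choice", "Hahn-Banach Theorem"))
```

The same structure would serve for a large cardinal chart, with the visual presentation layered on top of the graph.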

Joel, the idea has been suggested that a "wiki"-like platform be set up to discuss determinacy. I may get serious about it in a few months, once the tenure hurdle is over.
–
Andres Caicedo Jun 25 '11 at 5:21

Victoria Gitman and I have started the large cardinal site. More info later.
–
Joel David Hamkins Oct 31 '11 at 12:53

I would like a mathematical search-thesaurus; this would be a list of descriptions people thought of to use as search terms (and that I might think of using) and for each description, a list of phrases that appeared in documents which contained stuff relevant to the descriptions. A recent example might be: "gently falling curve" yields "roller coaster physics". A prototype of a thesaurus could be built out of the MathOverflow database.

I could imagine such a thing rapidly becoming 1) bloated and 2) not particularly useful. The search terms any given person thinks of for a particular concept are likely to be highly idiosyncratic. Doing this intelligently seems like a difficult "semantic web"-type problem rather than a database problem.
–
Qiaochu Yuan Jun 21 '11 at 22:54

If the data were automatically assembled, with no processing, I agree. I think if the contributions came from a certain class of the population, you might find more uniformity in the kinds of terms. Once a good initial list is prepared, it could be expanded by contributions from people outside the class. This could be done through volunteers who were motivated in building the list, or through data collection from that part of the population who are willing to rank initial attempts by the software to match their descriptions. Gerhard "Email Me About System Design" Paseman, 2011.06.21
–
Gerhard Paseman Jun 21 '11 at 23:10

It would be really useful to have a database containing various computations of (co)homology/homotopy groups of various spaces that arise in algebraic topology. Note: there is so much known out there that one would have to first think really hard about how to organize it all.

Here's an example:
I could imagine that, for certain users, listing the first 30 integral cohomology groups of the spaces $K(\mathbb Z,1)$, $K(\mathbb Z,2)$, $K(\mathbb Z,3)$, and $K(\mathbb Z,4)$ could be more useful¹ than listing all the cohomology groups of all the $K(\mathbb Z,n)$'s. The reason is that, in order to do the latter, the information has to be packaged in a certain way that might be hard to understand: the user would need to unpack that information before she can access it.

¹ Of course, it's even better to have both pieces of information available.
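
The difference between "packaged" and "unpacked" information can be seen already in the two easy cases. The Python sketch below produces the unpacked table (degrees 0 through 29) for $K(\mathbb Z,1)\simeq S^1$ and $K(\mathbb Z,2)\simeq \mathbb{CP}^\infty$ from their closed-form descriptions. The function names and the table layout are made up for the example, and $K(\mathbb Z,3)$ and $K(\mathbb Z,4)$ are deliberately left out: their integral cohomology involves torsion and would have to be entered from actual computations rather than a one-line formula.

```python
def cohomology_K_Z_1(k):
    """H^k(K(Z,1); Z), using K(Z,1) ~ S^1: Z in degrees 0 and 1, zero otherwise."""
    return "Z" if k in (0, 1) else "0"

def cohomology_K_Z_2(k):
    """H^k(K(Z,2); Z), using K(Z,2) ~ CP^infinity: Z in even degrees, zero in odd ones."""
    return "Z" if k % 2 == 0 else "0"

# The "unpacked" form: one explicit entry per degree 0..29.
table = {
    "K(Z,1)": [cohomology_K_Z_1(k) for k in range(30)],
    "K(Z,2)": [cohomology_K_Z_2(k) for k in range(30)],
}
for space, groups in table.items():
    print(space, " ".join(groups))
```

A user who only wants to know one group in one degree can read it off the table without knowing either formula, which is the point being made above.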

For example, a while ago Greg Kuperberg found for me the small ones here, and I joyfully have in my possession 9 DVDs (around 40 gigabytes, in a very efficient special purpose format) containing the 11,084,874,829 triple systems on 19 points as found by [P. Kaski and P. R. J. Östergård, The Steiner triple system of order 19, Math. Comp. 73 (2004), 2075–2092.]