This is very inspirational but not sure if this is done in the best way. The core value of Mendeleev's approach is to find gaps and predict things that should exist but we haven't found yet. However for this to happen, one must isolate the "atoms" that can't be further broken down and identify the pattern in these atoms. The periodic table created in Figure 4 of this paper doesn't exactly sounds like isolation of "atoms" of data structures.

You're right based on the modern day thinking of what the periodic table is but you're wrong to suggest that people knew what they were doing from the very start.

The formulation of the periodic table is quite like modern day data science. Chemists even before Mendeleev observed patterns and tried to build a model to predict missing ones. The beauty is that this was done without understanding the underlying mechanism of how it all worked, ie what makes an element (protons), what makes their periodicity (electrons). That came 50+ years later when we discovered protons and electron orbitals.

So going back to the article, their approach is fine because they're just formulating some model from observed patterns. Their reasoning for the underlying mechanism could be completely wrong but that's fine too until a better model comes along.

I am fascinated with unifying theories such as the periodic table (Mendeleev's) and the Copernican view of planets orbiting the sun. The previous, Earth-centric model had more deviations than predictive power. The periodic table is also an example of a structure with tremendous predictive power.

I think it's also interesting to distinguish between systems that reveal intrinsic order (e.g. periodic table), and systems that superimpose external order (e.g. a performance review framework).

Behavioural economics is an example of a field that desperately needs a modern-day equivalent of Mendeleev to come around and structure.

I can't find it now but I recently read of one or two guys who systematically tabulated juggling tricks. Gaps in the table led to new tricks which astonished experienced jugglers -- the way gaps in the periodic table led Mendeleev to predict properties of undiscovered elements. One guy gives talks on juggling at math conferences and on math at juggling conferences.

Newton’s theory of gravitation unified the motion of apples falling from a tree, cannonballs thrown into walls, the motion of planet earth around the sun, and even motion of all the known planetary bodies.

Here's an eclectic mix of examples from various perspectives across multiple domains. Understanding the representation of knowledge and the fundamental structure that underpins the connective essence of things is one of the prime ideas that drives my thinking, and as a consequence much of what I post about on here is related to that in some way. Below is a brief sampling of examples related to this idea, and for a more extensive list, browse through my postings (and if you can see how they all connect, let me know!).

In general, see the Classification of Finite Simple Groups [0]. In terms of specific examples, the minimal classifications of knots [1], trees [2], and braids [3] are 3 instances that popped to mind. From a number-theoretic perspective, the paper Enumerating the Rationals and its derivative works are interesting reads. In terms of spatial structure and spatial decomposition, look at the classification of crystal structures [4], lattices, polytopes (zonotopes are particularly interesting), periodic and aperiodic tilings, the classification of space-filling curves, and the classification of closed surfaces [5]. From a relational perspective, look at the classification of topological spatial relations [6] and Allen's interval algebra [7]. And the one my intuition finds most intriguing is the classification of Group Theory Single Axioms and its relation to my conjecture that division is primary.

But the whole idea of figuring out first principles and "fundamental" design choices is quite appealing. They have a section saying that a big goal here would be automatic design of optimal data structures. They talk of machine learning techniques (they mention Bayesian optimization and reinforcement learning).

That seems a very interesting research direction. In the same vein there's this talk by Jeff Dean about how they use ML at Google to replace all sorts of heuristics in data structure design (bloom filters etc.) with machine learning to optimize performance, though from what I recall it doesn't automatically change the algorithm itself:

Commenting on paper like this requires time to digest and reflect on the content. For a conceptual paper such as OP I would need at least a day (to figuratively sleep on it) but returning a couple of days later to make (hopefully) informed comment is no longer interactive. There is archival and future reference value, of course, but then rebuttals to possible misconceptions will not accompany the comment.

I only post when I think I can somehow add value to the discussion, at least value for some people who might read the comment. Obviously some people with in-depth knowledge of the paper and field in general will probably already be aware of the link I posted (for example), but others might find it an interesting avenue to explore. I know I often come to HN comments for this reason: a topic seems interesting, I want to learn of related stuff and/or get opinions from people with deeper expertise.

So even if it's only for "future reference", it might actually bring value _for some_. At least that's what I tell myself :)

Yeah, that and tracking long-running topics are things old-fashioned forums (phpBB style) are better at than the newer, more ranking-oriented ones (HN, reddit, ...), since old discussions can be pushed up again.

"""Mendeleev was a friend and colleague of the Sanskritist Böhtlingk, who was preparing the second edition of his book on Pāṇini[47] at about this time, and Mendeleev wished to honor Pāṇini with his nomenclature.[48] Noting that there are striking similarities between the periodic table and the introductory Śiva Sūtras in Pāṇini's grammar"""

this is probably off topic, but I've waited before asking until I found the closest topic from my perspective:

A while back (order months or years but certainly within last 2 years) someone mentioned a paper, I believe it was here on HN either as the topic or in a comment, but perhaps it may have been on IRC.

I am no longer able to find back this paper, and have only a vague recollection of it (I had read the abstract and the introduction and then postponed reading until I forgot about it).

It dealt with the problem of say 2 peers each having their own list or set, and part of their list is in common, but both may also have entries the other doesn't have. The problem was finding the most efficient way such that both end up with the union of both lists or sets. A brute force way would be to each send a copy of their list to the other, a slightly less brute way would be to have only one send a full copy, and have the other return his difference. But the paper detailed a more efficient method, which obviously I can't remember...

Does this description ring a bell? Does anyone know the paper I am trying to locate?

I actually have a similar question to yours--I lost track of a set of notes that was linked to in a HN comment, and would like to find them again!

The notes were a (draft?) PDF for a textbook on algorithms--much like Mathematics for Computer Science by Lehman & Leighton [1]--but instead, the topic was narrowly restricted to algorithms with a cryptographic or otherwise number-theoretic basis. In particle, hashes, content-addressable storage, and Merkle-trees were covered.

Right, the solution I remembered was a bit more complex, it required counting bloom filters and a subtraction operation to get a filter for the set difference. But the paper mentioned in a sibling comment claims to have even less overhead.

A few months ago I saw a program like this that would derive a data structure for you based on what operations you wanted to perform (inserts and queries, for example) but I haven't been able to find it again.

Whether it’s correct or not, it’s not a great sentence, and would benefit from a rewrite.

Language is weird sometimes, so I’m not certain, but your argument seems circular and incorrect. Use of ‘is’ doesn’t cause ‘data structures’ to be singular a collective noun. The noun is plural, so the correct verb is ‘are’. The sentence would need to name the singular collective in order to use ‘is’. For example: “A set of data structures is how we store and access data.” That would be correct. Arguing that “Hard disks is how we store and access data” is correct and that “hard disks” is a singular collective noun, because the sentence used ‘is’, is not normally accepted grammar.

Normally I try to get past grammatical errors in scientific literature if they do not affect the meaning of the content. Authors may not be writing in their native language, and the literary aspects of the piece aren't its purpose. When it's the third sentence in the abstract and the first of two questions that the paper is fundamentally seeking to answer, it does make me twitch a little.

Isn't CLRS mostly a secondary source that puts together previous literature? Citations usually prefer a primary source that lets you place a "discovery" accurately in date and place (aka conference/journal).