Sure, the world is complicated, but not as complicated as you might think. It turns out that most organic molecules—the kind of chemicals that make food tasty, perfumes fragrant, and life alive—derive from a few relatively simple architectures.

Together with a bunch of data-minded colleagues, Alan Lipkus of the Chemical Abstracts Service took a deep dive into his organization’s century-old library of 24 million organic compounds—most of them synthetic. They found that more than half are built from just 143 basic shapes, or “frameworks.” And the rest? Well, building those requires the other 836,565 cataloged frameworks.
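To get a feel for how lopsided that is, here is a quick back-of-the-envelope sketch using only the two framework counts quoted above. The variable names are ours, chosen for illustration.

```python
# Framework counts as quoted in the article.
top_frameworks = 143          # frameworks behind more than half of the compounds
other_frameworks = 836_565    # frameworks needed for the remaining compounds

# Total number of cataloged frameworks implied by those two figures.
total_frameworks = top_frameworks + other_frameworks

# Fraction of all frameworks that the top 143 represent.
share = top_frameworks / total_frameworks

print(f"{total_frameworks:,} frameworks in total")
print(f"top {top_frameworks} make up {share:.3%} of them")
```

In other words, roughly 0.017 percent of the cataloged frameworks account for more than half of the compounds.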

Why do a handful of fundamental shapes get all the work? In part because chemists typically create new molecules—in the search, say, for potential new drugs—from the ones they’re familiar with. It’s cheaper. But Lipkus hopes that showcasing this lopsided approach will encourage researchers to work farther out on the long tail of molecular geometry. “A lot of structures have not been fully explored,” he says. “There could be interesting things to discover.” Here’s a snapshot of the newly discovered shape-alphabet.

Top 30 Molecular Shapes

Molecules are clusters of atoms joined like Tinkertoys. The range of possible structures is vast, but they can all be categorized by “molecular framework”: the underlying rings and connectors. Most common by far is the hexagon, a ring of six atoms, one at each corner, that is the basis for nearly 10 percent of known organic compounds. Here are the 30 most common frameworks, with frequency of occurrence in parentheses.
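One way to picture "the underlying rings and connectors" is to treat a molecule as a graph and keep snipping off dangling atoms until only rings, and the chains that link them, are left. The sketch below is a minimal illustration of that idea and not necessarily the procedure Lipkus's team used. The function name `framework`, the integer atom labels, and the bond-list input are all assumptions made for the example, and it ignores element types and bond orders entirely.

```python
def framework(bonds):
    """Return the atoms left after repeatedly pruning dangling atoms.

    `bonds` is a list of (atom, atom) pairs with hypothetical integer labels.
    """
    # Build an adjacency map from the bond list.
    neighbors = {}
    for a, b in bonds:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    # Atoms with at most one neighbor are side-chain ends; strip them
    # one by one, since each removal can expose a new dangling atom.
    leaves = [atom for atom, nbrs in neighbors.items() if len(nbrs) <= 1]
    while leaves:
        atom = leaves.pop()
        if atom not in neighbors:
            continue  # already removed earlier
        for other in neighbors.pop(atom):
            neighbors[other].discard(atom)
            if len(neighbors[other]) <= 1:
                leaves.append(other)

    # Whatever survives is made of rings plus the linkers between them.
    return set(neighbors)


# A six-atom ring (atoms 0-5) with one extra atom (6) hanging off it.
ring_with_side_chain = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 6)]
print(framework(ring_with_side_chain))
```

Run on the ring with one side atom, the pruning removes atom 6 and leaves the six ring atoms, which is the hexagon described above. A plain chain with no ring, such as `[(0, 1), (1, 2)]`, prunes away to nothing.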