Schema

A Datomic schema defines the set of possible attributes that we can use.

In Molecule we write this definition in a schema definition file.

Schema definition file

Molecule provides an intuitive and type-safe DSL to model your schema in a schema definition file.
After each change to this file you need to compile your project with sbt compile so that the
sbt-molecule plugin can generate the Molecule DSL from your definitions.
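
A schema definition file for the Seattle example used on this page might look like the sketch below. Only the names mentioned in the text come from this page (SeattleDefinition, @InOut(2, 8), Community.name with fulltextSearch and doc, Community.category, one[Neighborhood], Neighborhood.name). The import path, the url attribute and the doc string are assumptions and may differ in your Molecule version.

```scala
// Assumption: the schema DSL lives in this package. The path has changed
// between Molecule versions, so check the version you depend on.
import molecule.schema.definition._

// Up to 2 input attributes, molecules of up to 8 attributes.
@InOut(2, 8)
object SeattleDefinition {

  trait Community {
    // Cardinality-one String with a fulltext index and a doc hint.
    val name         = oneString.fulltextSearch.doc("A community's name")
    // Assumed extra attribute, for illustration only.
    val url          = oneString
    // Cardinality-many String with a fulltext index.
    val category     = manyString.fulltextSearch
    // Cardinality-one reference to an entity with Neighborhood attributes.
    val neighborhood = one[Neighborhood]
  }

  trait Neighborhood {
    val name = oneString
  }
}
```

This file is only input for the sbt-molecule plugin. It is not called from application code.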

The outer object SeattleDefinition encapsulates our schema definition. The name of this object has to end with “Definition”
in order for the sbt-molecule plugin to recognize it.

Custom ScalaDoc generation

The sbt-molecule plugin also generates ScalaDoc documentation for the custom DSL generated from the schema definition file.
Attribute types are explained, and an optional doc(<text...>) can be added to give a hint about the attribute when
working with the code in the IDE. The doc text given for the Community.name attribute, for example, then shows up
as a hint in the IDE.

Molecule arity

The @InOut(2, 8) arity annotation instructs the sbt-molecule plugin to generate boilerplate code with the ability to create
molecules with up to 8 attributes including up to 2 input attributes.

While developing your schema you might set the first arity parameter (input attributes) to 0 and
later, when your schema is stabilizing, add the ability to make input molecules by setting it to 1, 2 or 3 (the maximum).
Using parameterized input attributes can be a performance
optimization, since using input values in Datalog queries allows Datomic to cache the query.

The second arity parameter sets the maximum number of attributes in a molecule you can build (this doesn’t affect
how many attributes you can define in each namespace). The maximum arity is 22, the same as for tuples.

Namespaces

Attribute names in Datomic are namespaced keywords with the lexical form :<namespace>/<name>, which Molecule code writes as <Namespace>.<attribute>. Molecule lets you
define the <Namespace> part with the name of the trait, like Community in the Seattle
example above. In this way Molecule can construct the full name of the Community.category attribute etc.

Namespace != Table

If you come from an SQL background, you might at first think of a namespace as
a table with columns (attributes). This is not the case: an
entity in Datomic can associate values of attributes from any namespace.
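
One way to picture this is as a bag of attribute/value pairs rather than a table row. The plain-Scala sketch below is only a conceptual model: Entity, namespaceOf, namespacesOf and the sample values are hypothetical and are not part of Molecule or Datomic.

```scala
object NamespaceIsNotTable {
  // Hypothetical, simplified stand-in for a Datomic entity:
  // an id plus attribute/value pairs. Not a Molecule or Datomic type.
  final case class Entity(id: Long, attrs: Map[String, Any])

  // Namespace part of an attribute name written as <Namespace>.<attribute>.
  def namespaceOf(attr: String): String = attr.takeWhile(_ != '.')

  // All namespaces that a single entity draws attributes from.
  def namespacesOf(e: Entity): Set[String] = e.attrs.keySet.map(namespaceOf)

  // Illustrative values only.
  val example: Entity = Entity(
    id = 101L,
    attrs = Map(
      "Community.name" -> "Example community",
      "Community.url"  -> "http://example.org",
      "Person.name"    -> "Example contact"
    )
  )

  def main(args: Array[String]): Unit =
    println(namespacesOf(example))
}
```

Here a single entity carries attributes from both the Community and the Person namespace, which a row in a single table could not do.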

So, when we build a molecule

val toughCommunities = Community.name.Neighborhood.name("Tough").get

we shouldn’t think of it like a

“Community table with name field with a join to Neighborhood table with a name field set to ‘Tough’” (wrong!)

but rather think of it as

“Entities with a communityName attribute having a reference to an entity with a neighborhoodName value ‘Tough’”

Partitions

“All entities created in a database reside within a partition. Partitions group data together, providing locality of reference
when executing queries across a collection of entities. In general, you want to group entities based on how you’ll use them.
Entities you’ll often query across - like the community-related entities in our sample data - should be in the same partition
to increase query performance. Different logical groups of entities should be in different partitions. Partitions are discussed
in more detail in the Indexes topic.”

In the schema definition file we can organize namespaces in partitions with objects:
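
A partitioned definition could be sketched as follows. The partition names gen and lit, the Person and Book namespaces and the attributes title, cat, author, name and gender are taken from the surrounding text. The object name PartitionTestDefinition, the arity values, the import path and the attribute types (plain strings here) are assumptions.

```scala
// Assumption: import path as in recent Molecule versions. Check your version.
import molecule.schema.definition._

@InOut(0, 4)
object PartitionTestDefinition {

  // Partition `gen` (general). Partition names are lowercase.
  object gen {
    trait Person {
      val name   = oneString
      val gender = oneString // assumed type
    }
  }

  // Partition `lit` (literature).
  object lit {
    trait Book {
      val title  = oneString
      val cat    = oneString // assumed type
      // Reference to a namespace in another partition,
      // qualified with that partition's name.
      val author = one[gen.Person]
    }
  }
}
```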

Here we have a gen (general) partition and a lit (literature) partition. Each partition can contain as many
namespaces as you want. This can also be a way to structure large domains conceptually. The partition name has to be
lowercase and is prepended to the namespaces it contains.

When we build molecules the partition name is prepended to the namespace like this:

lit_Book.title.cat.Author.name.gender.get === ...

Since Author is already defined as a related namespace we don’t need to prepend the partition name there.

When we insert a Person the created entity will automatically be saved in the gen partition (or whatever we call it).

Attribute types

In the Seattle example the attributes are defined with types that should be
fairly self-explanatory: oneString, manyString etc. define the cardinality and type of an attribute.

In the example above we saw a reference from Community to Neighborhood defined as one[Neighborhood]. We would for instance
likely define an Order/OrderLine relationship in an Order namespace as many[OrderLine].
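
The Order/OrderLine relationship mentioned above could be defined along these lines. This is a minimal sketch: the object name OrderDefinition, the arity values, the import path and all attribute names are hypothetical. Only many[OrderLine] in an Order namespace comes from the text.

```scala
// Assumption: import path as in recent Molecule versions. Check your version.
import molecule.schema.definition._

@InOut(0, 8)
object OrderDefinition {

  trait Order {
    // Cardinality-one attributes hold a single value per entity.
    val number     = oneInt
    // Cardinality-many reference: one Order points to many OrderLine entities.
    val orderLines = many[OrderLine]
  }

  trait OrderLine {
    val product  = oneString
    val quantity = oneInt
  }
}
```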

Attribute options

Each attribute can also have some extra options:

| Option | Indexes | Description |
|---|---|---|
| doc | | Attribute description. |
| uniqueValue | ✔︎ | Attribute value is unique to each entity. Attempts to insert a duplicate value for a different entity id will fail. |
| uniqueIdentity | ✔︎ | Attribute value is unique to each entity and "upsert" is enabled. Attempts to insert a duplicate value for a temporary entity id will cause all attributes associated with that temporary id to be merged with the entity already in the database. |
| indexed | ✔︎ | Generates an index for this attribute. By default all attributes are set with the indexed option automatically by Molecule, so you don't need to set this. |
| fulltextSearch | ✔︎ | Generates an eventually consistent fulltext search index for this attribute. |
| isComponent | ✔︎ | Specifies that an attribute whose type is :db.type/ref is a component. Referenced entities become subcomponents of the entity to which the attribute is applied. When you retract an entity with :db.fn/retractEntity, all subcomponents are also retracted. When you touch an entity, all its subcomponent entities are touched recursively. |
| noHistory | | Whether past values of an attribute should not be retained. |

Datomic indexes the values of all attributes having an option except for the doc and noHistory options.

As you saw, we added fulltextSearch to some of the attributes in the Seattle definition above. Molecule’s schema
definition DSL lets you choose only the options that are allowed for each attribute type.
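
As a rough illustration, options are chained onto the attribute type. The option names below are the ones listed above. The object, namespace and attribute names, the doc string, the arity values and the import path are hypothetical, and whether a given combination or order of options is accepted may depend on your Molecule version.

```scala
// Assumption: import path as in recent Molecule versions. Check your version.
import molecule.schema.definition._

@InOut(0, 8)
object OptionsDefinition {

  trait User {
    // Unique with upsert semantics, plus a doc hint for the IDE.
    val email      = oneString.uniqueIdentity.doc("Login email of the user")
    // Unique without upsert: a duplicate value for another entity fails.
    val badge      = oneString.uniqueValue
    // Eventually consistent fulltext index.
    val bio        = oneString.fulltextSearch
    // Past values are not retained.
    val loginCount = oneInt.noHistory
    // Referenced Address entities are retracted together with the User.
    val addresses  = many[Address].isComponent
  }

  trait Address {
    val street = oneString
  }
}
```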