An object id (OID) is something that uniquely identifies an object. Within a JVM this is typically the object's pointer. In a relational database table, a row is uniquely identified in its table by its primary key. When persisting objects to a database you need a unique identifier for the objects, this allows you to query the object, define relationships to the object, and update and delete the object. In JPA the object id is defined through the @Id annotation or <id> element and should correspond to the primary key of the object's table.

You must never change the id of an object. Doing so will cause errors, or strange behavior depending on your JPA provider. Also do not create two objects with the same id, or try persisting an object with the same id as an existing object. If you have an object that may be existing use the EntityManagermerge() API, do not use persist() for an existing object, and avoid relating an un-managed existing object to other managed objects.

An object id can either be a natural id or a generated id. A natural id is one that occurs in the object and has some meaning in the application. Examples of natural ids include email addresses, phone numbers, and social insurance numbers. A generated id (also known as a surrogate id) is one that is generated by the system. A sequence number in JPA is a sequential id generated by the JPA implementation and automatically assigned to new objects. The benefits of using sequence numbers are that they are guaranteed to be unique, allow all other data of the object to change, are efficient values for querying and indexes, and can be efficiently assigned. The main issue with natural ids is that everything always changes at some point; even a person's social insurance number can change. Natural ids can also make querying, foreign keys and indexing less efficient in the database.

In JPA an @Id can be easily assigned a generated sequence number through the @GeneratedValue annotation, or <generated-value> element.

Table sequencing uses a table in the database to generate unique ids. The table has two columns, one stores the name of the sequence, the other stores the last id value that was assigned. There is a row in the sequence table for each sequence object. Each time a new id is required the row for that sequence is incremented and the new id value is passed back to the application to be assigned to an object. This is just one example of a sequence table schema, for other table sequencing schemas see Customizing.

Table sequencing is the most portable solution because it just uses a regular database table, so unlike sequence and identity can be used on any database. Table sequencing also provides good performance because it allows for sequence pre-allocation, which is extremely important to insert performance, but can have potential concurrency issues.

In JPA the @TableGenerator annotation or <table-generator> element is used to define a sequence table. The TableGenerator defines a pkColumnName for the column used to store the name of the sequence, valueColumnName for the column used to store the last id allocated, and pkColumnValue for the value to store in the name column (normally the sequence name).

Errors such as "table not found", "invalid column" can occur if you do not have a SEQUENCE table defined in your database, or its schema does not match what you have configured, or what your JPA provider is expecting by default. Ensure you create the sequence table correctly, or configure your @TableGenerator to match the table that you created, or let your JPA provider create you tables for you (most JPA provider support schema creation). You may also get an error such as "sequence not found", this means you did not create a row in the table for your sequence. You must insert an initial row in the sequence table for your sequence with the initial id (i.e. INSERT INTO SEQUENCE_TABLE (SEQ_NAME, SEQ_COUNT) VALUES ("EMP_SEQ", 0)), or let your JPA provider create your schema for you.

Sequence objects use special database objects to generate ids. Sequence objects are only supported in some databases, such as Oracle, DB2, and Postgres. Usually, a SEQUENCE object has a name, an INCREMENT, and other database object settings. Each time the <sequence>.NEXTVAL is selected the sequence is incremented by the INCREMENT.

Sequence objects provide the optimal sequencing option, as they are the most efficient and have the best concurrency, however they are the least portable as most databases do not support them. Sequence objects support sequence preallocation through setting the INCREMENT on the database sequence object to the sequence preallocation size.

In JPA the @SequenceGenerator annotation or <sequence-generator> element is used to define a sequence object. The SequenceGenerator defines a sequenceName for the name of the database sequence object, and an allocationSize for the sequence preallocation size or sequence object INCREMENT.

Errors such as "sequence not found", can occur if you do not have a SEQUENCE object defined in your database. Ensure you create the sequence object, or let your JPA provider create your schema for you (most JPA providers support schema creation). When creating your sequence object, ensure the sequence's INCREMENT matches your SequenceGenerator's allocationSize. The DDL to create a sequence object depends on the database, for Oracle it is, CREATE SEQUENCE EMP_SEQ INCREMENT BY 100 START WITH 100.

This can occur if your sequence object's INCREMENT does not match your allocationSize. This results in the JPA provider thinking it got back more sequences than it really did, and ends up duplicating values, or with negative numbers. This can also occur on some JPA providers if your sequence object's STARTS WITH is 0 instead of a value equal or greater to the allocationSize.

Identity sequencing uses special IDENTITY columns in the database to allow the database to automatically assign an id to the object when its row is inserted. Identity columns are supported in many databases, such as MySQL, DB2, SQL Server, Sybase, and PostgreSQL. Oracle supports IDENTITY columns as of Oracle 12c. If using an older version, it is possible to simulate them using sequence objects and triggers.

Although identity sequencing seems like the easiest method to assign an id, they have several issues. One is that since the id is not assigned by the database until the row is inserted the id cannot be obtained in the object until after commit or after a flush call. Identity sequencing also does not allow for sequence preallocation, so can require a select for each object that is inserted, potentially causing a major performance problem, so in general are not recommended.

In JPA there is no annotation or element for identity sequencing as there is no additional information to specify. Only the GeneratedValue's strategy needs to be set to IDENTITY.

Sequence objects are database constructs and can be instantiated as distinct objects as needed. They provide incremental values when their "next" method is called. Note that the increments can be values greater than one, for example incremented by two's, three's etc. An 'on insert' trigger can be constructed for each table that calls a sequence object created for that specific table primary key asking for the next value and inserting it into the table with the new record. Identity datatypes in products like MySQL or SQL Server are encapsulated types that do the same thing without requiring the set up of triggers and sequences (although PostgreSQL at least has automated much of this with its Serial and Big Serial pseudo datatypes (these actually create an int/bigint column and run a "macro" creating the sequence and 'on insert' trigger when the CREATE TABLE statement is executed.)).

This typically occurs because the @Id was not configured to use an @GeneratedValue(strategy=GenerationType.IDENTITY). Ensure it is configured correctly. It could also be that your JPA provider does not support identity sequencing on the database platform that you are using, or you have not configured your database platform. Most providers require that you set the database platform through a persistence.xml property, most provider also allow you to customize your own platform if it is not directly supported. It may also be that you did not set your primary key column in your table to be an identity type.

Identity sequencing requires the insert to occur before the id can be assigned, so it is not assigned on persist like other types of sequencing. You must either call commit() on the current transaction, or call flush() on the EntityManager. It may also be that you did not set your primary key column in your table to be an identity type.

A common issue is that the generated Id is part of a child object's Id through a OneToOne or ManyToOne mapping. In this case, because JPA requires that the child define a duplicate Basic mapping for the Id, its Id will be inserted as null. One solution to this is to mark the Column on the Id mapping in the child as insertable=false, updateable=false, and define the OneToOne or ManyToOne using a normal JoinColumn this will ensure the foreign key field is populated by the OneToOne or ManyToOne not the Basic. Another option is to first persist the parent, then call flush() before persisting the child.

Identity sequencing does not support sequence preallocation, so requires a select after each insert, in some cases doubling the insert cost. Consider using a sequence table, or sequence object to allow sequence preallocation.

A composite primary key is one that is made up of several columns in the table. A composite primary key can be used if no single column in the table is unique. It is generally more efficient and simpler to have a one-column primary key, such as a generated sequence number, but sometimes a composite primary key is desirable and unavoidable.

Composite primary keys are common in legacy database schemas, where cascaded keys can sometimes be used. This refers to a model where dependent objects' key definitions include their parents' primary key; for example, COMPANY's primary key is COMPANY_ID, DEPARTMENT's primary key is composed of a COMPANY_ID and a DEP_ID, EMPLOYEE's primary key is composed of COMPANY_ID, DEP_ID, and EMP_ID, and so on. Although this generally does not match object-oriented design principles, some DBA's prefer this model. Difficulties with the model include the restriction that employees cannot switch departments, that foreign-key relationships become more complex, and that all primary-key operations (including queries, updates, and deletes) are less efficient. However, each department has control over its own employee IDs, and if needed the database EMPLOYEE table can be partitioned based on the COMPANY_ID or DEP_ID, as these are included in every query.

Other common usages of composite primary keys include many-to-many relationships where the join table has additional columns, so the table itself is mapped to an object whose primary key consists of the pair of foreign-key columns and dependent or aggregate one-to-many relationships where the child object's primary key consists of its parent's primary key and a locally unique field.

There are two methods of declaring a composite primary key in JPA, IdClass and EmbeddedId.

An IdClass defines a separate Java class to represent the primary key. It is defined through the @IdClass annotation or <id-class> XML element. The IdClass must define an attribute (field/property) that mirrors each Id attribute in the entity. It must have the same attribute name and type. When using an IdClass you still require to mark each Id attribute in the entity with @Id.

The main purpose of the IdClass is to be used as the structure passed to the EntityManagerfind() and getReference() API. Some JPA products also use the IdClass as a cache key to track an object's identity. Because of this, it is required (depending on JPA product) to implement an equals() and hashCode() method on the IdClass. Ensure that the equals() method checks each part of the primary key, and correctly uses equals for objects and == for primitives. Ensure that the hashCode() method will return the same value for two equal objects.

TopLink / EclipseLink : Do not require the implementation of equals() or hashCode() in the id class.

An EmbeddedId defines a separate Embeddable Java class to contain the entities primary key. It is defined through the @EmbeddedId annotation or <embedded-id> XML element. The EmbeddedId's Embeddable class must define each id attribute for the entity using Basic mappings. All attributes in the EmbeddedId's Embeddable are assumed to be part of the primary key.

The EmbeddedId is also used as the structure passed to the EntityManagerfind() and getReference() API. Some JPA products also use the EmbeddedId as a cache key to track an object's identity. Because of this, it is required (depending on JPA product) to implement an equals() and hashCode() method on the EmbeddedId. Ensure that the equals() method checks each part of the primary key, and correctly uses equals for objects and == for primitives. Ensure that the hashCode() method will return the same value for two equal objects.

TopLink / EclipseLink : Do not require the implementation of equals() or hashCode() in the id class.

A common model is to have a dependent object share the primary key of its parent. In the case of a OneToOne the child's primary key is the same as the parent, and in the case of a ManyToOne the child's primary key is composed of the parent's primary key and another locally unique field.

JPA 1.0 does not allow @Id on a OneToOne or ManyToOne, but JPA 2.0 does.

One of the biggest pitfalls when working with composite primary keys comes with the implementation of associations to entity classes whose tables have multi-column primary keys (using the @JoinColumns annotation). Many JPA implementations may throw seemingly inconsistent exceptions when not specifying referencedColumnName for every @JoinColumns annotation, which JPA requires for composite foreign keys (even if all referenced column names are equal to the ones in the referencing table). See http://download.oracle.com/javaee/5/api/javax/persistence/JoinColumns.html

Unfortunately JPA 1.0 does not handle this model well, and things become complicated, so to make your life a little easier you may consider defining a generated unique id for the child. JPA 1.0 requires that all @Id mappings be Basic mappings, so if your Id comes from a foreign key column through a OneToOne or ManyToOne mapping, you must also define a Basic@Id mapping for the foreign key column. The reason for this is in part that the Id must be a simple object for identity and caching purposes, and for use in the IdClass or the EntityManagerfind() API.

Because you now have two mappings for the same foreign key column you must define which one will be written to the database (it must be the Basic one), so the OneToOne or ManyToOne foreign key must be defined to be read-only. This is done through setting the JoinColumn attributes insertable and updatable to false, or by using the @PrimaryKeyJoinColumn instead of the @JoinColumn.

A side effect of having two mappings for the same column is that you now have to keep the two in synch. This is typically done through having the set method for the OneToOne attribute also set the Basic attribute value to the target object's id. This can become very complicated if the target object's primary key is a GeneratedValue, in this case you must ensure that the target object's id has been assigned before relating the two objects.

Some times I think that JPA primary keys would be much simpler if they were just defined on the entity using a collection of Columns instead of mixing them up with the attribute mapping. This would leave you free to map the primary key field in any manner you desired. A generic List could be used to pass the primary key to find() methods, and it would be the JPA provider's responsibility for hashing and comparing the primary key correctly instead of the user's IdClass. But perhaps for simple singleton primary key models the JPA model is more straight forward.

TopLink / EclipseLink : Allow the primary key to be specified as a list of columns instead of using Id mappings. This allows OneToOne and ManyToOne mapping foreign keys to be used as the primary key without requiring a duplicate mapping. It also allows the primary key to be defined through any other mapping type. This is set through using a DescriptorCustomizer and the ClassDescriptoraddPrimaryKeyFieldName API.

Defining an Id for a OneToOne or ManyToOne in JPA 2.0 is much simpler. The @Id annotation or id XML attribute can be added to a OneToOne or ManyToOne mapping. The Id used for the object will be derived from the target object's Id. If the Id is a single value, then the source object's Id is the same as the target object's Id. If it is a composite Id, then the IdClass will contain the BasicId attributes, and the target object's Id as the relationship value. If the target object also has a composite Id, then the source object's IdClass will contain the target object's IdClass.

...publicclassPhonePKimplementsSerializable{privateStringtype;privatelongowner;publicPhonePK(){}publicPhonePK(Stringtype,longowner){this.type=type;this.owner=owner;}publicbooleanequals(Objectobject){if(objectinstanceofPhonePK){PhonePKpk=(PhonePK)object;returntype.equals(pk.type)&&owner==pk.owner;}else{returnfalse;}}publicinthashCode(){return(int)(type.hashCode()+owner);}//getter and setters with names matching the properties with many-to-one relationships}

One issue with table sequencing is that the sequence table can become a concurrency bottleneck, even causing deadlocks. If the sequence ids are allocated in the same transaction as the insert, this can cause poor concurrency, as the sequence row will be locked for the duration of the transaction, preventing any other transaction that needs to allocate a sequence id. In some cases the entire sequence table or the table page could be locked causing even transactions allocating other sequences to wait or even deadlock. If a large sequence pre-allocation size is used this becomes less of an issue, because the sequence table is rarely accessed. Some JPA providers use a separate (non-JTA) connection to allocate the sequence ids in, avoiding or limiting this issue. In this case, if you use a JTA data-source connection, it is important to also include a non-JTA data-source connection in your persistence.xml.

Table sequencing also allows for truly sequential ids to be allocated. Sequence and identity sequencing are non-transactional and typically cache values on the database, leading to large gaps in the ids that are allocated. Typically this is not an issue and desired to have good performance, however if performance and concurrency are less of a concern, and true sequential ids are desired then a table sequence can be used. By setting the allocationSize of the sequence to 1 and ensuring the sequence ids are allocated in the same transaction of the insert, you can guarantee sequence ids without gaps (but generally it is much better to live with the gaps and have good performance).

One paranoid delusional fear that programmers frequently have is running out of sequence numbers. Since most sequence strategies just keep incrementing a number it is unavoidable that you will eventually run out. However as long a large enough numeric precision is used to store the sequence id this is not an issue. For example if you stored your id in a NUMBER(5) column, this would allow 99,999 different ids, which on most systems would eventually run out. However if you store your id in a NUMBER(10) column, which is more typical, this would store 9,999,999,999 ids, or one id each second for about 300 years (longer than most databases exist). But perhaps your system will process a lot of data, and (hopefully) be around a very long time. If you store your id in a NUMBER(20) this would be 99,999,999,999,999,999,999 ids, or one id each millisecond for about 3,000,000,000 years, which is pretty safe.

But you also need to store this id in Java. If you store the id in a Java int, this would be a 32 bit number , which is 4,294,967,296 different ids (well actually 2,147,483,648 positive ids), or one id each second for about 100 years. If you instead use a long, this would be a 64 bit number, which is 9,223,372,036,854,775,808 different ids, or one id each millisecond for about 300,000,000 years, which is pretty safe. I would recommend using a long vs an int however, as I have seen int ids run out on large databases before (what happens is they become negative until they wrap around to 0 and start getting constraint errors).

JPA supports three different strategies for generating ids, however there are many other methods. Normally the JPA strategies are sufficient, so you would only use a different method in a legacy situation.

Sometimes the application has an application specific strategy for generating ids, such as prefixing ids with the country code, or branch number. There are several ways to integrate a customize ids generation strategy, the simplest is just define the id as a normal id and have the application assign the id value when the object is created.

TopLink, EclipseLink : Several additional sequencing options are provided. A UnaryTableSequence allows a single column table to be used. A QuerySequence allows for custom SQL or stored procedures to be used. An API also exists to allow a user to supply their own code for allocating ids.

Hibernate : A GUID id generation options is provided through the @GenericGenerator annotation.

A database table can be defined to have a trigger that automatically assign its' primary key. Generally this is normally not a good idea (although some DBAs may think it is), and it is better to use a JPA provider generated sequence id, or assign the id in the application. The main issue with the id being assigned in a trigger is that the application and object require this value back. For non-primary key values assigned through triggers it is possible to refresh the object after committing or flushing the object to obtain the values back. However this is not possible for the id, as the id is required to refresh an object.

If you have an alternative way to select the id generated by the trigger, such as selecting the object's row using another unique field, you could issue this SQL select after the insert to obtain the id and set it back in the object. You could perform this select in a JPA @PostPersist event. Some JPA providers may not allow/like a query execution during an event, they also may not pick up a change to an object during an event callback, so there may be issues with doing this. Also some JPA providers may not allow the primary key to be un-assigned/null when not using a GeneratedValue, so you may have issues. Some JPA providers have built-in support for returning values assigned in a trigger (or stored procedure) back into the object.

TopLink / EclipseLink : Provide a ReturningPolicy that allows for any field values including the primary key to be returned from the database after an insert or update. This is defined through the @ReturnInsert, @ReturnUpdate annotations, or the <return-insert>, <return-update> XML elements in the eclipselink-orm.xml.

If the application generates its' own id instead of using a JPA GeneratedValue, it is sometimes desirable to perform this id generation in a JPA event, instead of the application code having to generate and set the id. In JPA this can be done through the @PrePersist event.

Sometimes your object or table has no primary key. The best solution in this case is normally to add a generated id to the object and table. If you do not have this option, sometimes there is a column or set of columns in the table that make up a unique value. You can use this unique set of columns as your id in JPA. The JPA Id does not always have to match the database table primary key constraint, nor is a primary key or a unique constraint required.

If your table truly has no unique columns, then use all of the columns as the id. Typically when this occurs the data is read-only, so even if the table allows duplicate rows with the same values, the objects will be the same anyway, so it does not matter that JPA thinks they are the same object. The issue with allowing updates and deletes is that there is no way to uniquely identify the object's row, so all of the matching rows will be updated or deleted.

If your object does not have an id, but its table does, this is fine. Make the object an Embeddable object, embeddable objects do not have ids. You will need a Entity that contains this Embeddable to persist and query it.