Id generation strategy in Spring Boot

When might we need @Id generation strategy in Spring Boot?

If we are using Spring Data JPA and have @Entity models in our application, each model needs an id field of some kind with the @Id annotation on it. Otherwise the IDE will complain that the entity has no primary key (even if you have a field named _id_), and Hibernate will refuse to start for the same reason.

IDE complaining as we have not defined a primary key

As you may notice, the errors are gone as soon as we add the @Id annotation from the jakarta.persistence package to the field, and we are not required to specify an ID generation strategy.

IDE does not display any more errors after marking the field as a primary key

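That’s because @Id alone only marks the primary key: without @GeneratedValue no value is generated at all, and the application has to assign the ID itself before saving. Once we add @GeneratedValue without naming a strategy, JPA applies GenerationType.AUTO. Either way, it is worth taking a look at the available options.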

Why do we care about ID generation?

Choosing the right ID generation strategy matters because it can have an impact on performance, scalability, and database compatibility, and each of these is (usually) important for our application.

What are the available strategies?

There are four classic strategies (Jakarta Persistence 3.1 also adds GenerationType.UUID, which is not covered here):

GenerationType.AUTO

@Entity
public class Blog {
    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    private Long id;
}

JPA lets the persistence provider (Hibernate by default in Spring Boot, if you don’t change it) decide the best strategy based on the database we are using.
This works well for most cases, but as the actual choice differs across databases, you might want to check the other options as well.

GenerationType.IDENTITY

@Entity
public class Blog {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;
}

Uses the auto-increment feature available in the database (for MySQL it would be an _AUTO_INCREMENT_ column, for SQL Server an _IDENTITY_ column, for PostgreSQL _SERIAL_ or an identity column).
This one is pretty interesting: the ID only exists after the row has been inserted, so Hibernate has to execute each insert right away and read back the generated key. So even though this strategy is quite simple and reliable, we might want to think twice, as it can cause performance issues in batch inserts (Hibernate cannot group these inserts into JDBC batches, and the round trips add up).
Before jumping on it, double-check that your database supports an auto-increment feature (e.g., SQLite, MySQL, MariaDB, PostgreSQL and recent Oracle versions support it).

GenerationType.SEQUENCE

@Entity
public class Blog {
    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "blogSeqGen")
    @SequenceGenerator(name = "blogSeqGen", sequenceName = "blog_sequence", allocationSize = 100)
    private Long id;
}

IDs are generated using a database sequence (in this case we define the sequence name in an additional _@SequenceGenerator_).
It is more efficient than IDENTITY for batch operations, since it supports pre-fetching (and can reduce database round trips during batch inserts). The number of pre-fetched IDs is defined by the allocationSize property, which is 50 by default (if required, _allocationSize = 1_ effectively disables preallocation and enables one-to-one generation behavior).
Not all databases support sequences (Oracle and PostgreSQL do; MySQL, for example, does not).

GenerationType.TABLE

@Entity
public class Blog {
    @Id
    @GeneratedValue(strategy = GenerationType.TABLE)
    private Long id;
}

Uses a separate table to store and manage ID values. Because of this, it works across all databases but is the slowest strategy due to table lookup overhead and potential locking issues.
This is the only type I have never needed to use in my experience, but it seems to best fit the scenario when sequences aren’t supported and we need portability.

What happens if I don’t choose a strategy myself?

If we add @GeneratedValue without naming a strategy, JPA defaults to AUTO, meaning that the persistence provider decides what to use, based on the database dialect (which is tied to the vendor).

This also means that the dialect must match your database, as Hibernate relies on it to decide how to map strategies.

Exactly why is GenerationType.SEQUENCE more efficient than GenerationType.IDENTITY for batch operations?

We find that it has to do with how the IDs are allocated and when inserts happen.

IDENTITY: The ID is generated by the DB at insert time. As JPA/Hibernate must insert the row immediately to get the ID, the inserts can’t be grouped together. Basically it’s a cycle of insert, read key, insert, read key…
SEQUENCE: IDs are fetched from a sequence before inserting rows, allowing Hibernate to pre-allocate a block of IDs using allocationSize (aka pooled sequence), hold the inserts back until flush, and send them as JDBC batches instead of one by one (batching itself is switched on with the hibernate.jdbc.batch_size property).
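To put rough numbers on the difference, here is a toy model that only counts database round trips. It is a sketch under simplifying assumptions, not Hibernate code: the class and method names are made up for illustration, every IDENTITY insert is counted as one round trip, and SEQUENCE is counted as one sequence call per block of allocationSize IDs plus one JDBC batch per batchSize rows.

```java
// Toy model that counts database round trips; illustrative only, not Hibernate code.
class RoundTripModel {

    // IDENTITY: the row must be inserted right away to learn its ID,
    // so every insert is its own round trip and nothing is batched.
    static long identityRoundTrips(long rows) {
        return rows;
    }

    // SEQUENCE: one sequence call per block of allocationSize IDs,
    // plus one JDBC batch per batchSize rows.
    static long sequenceRoundTrips(long rows, int allocationSize, int batchSize) {
        long sequenceCalls = ceilDiv(rows, allocationSize);
        long insertBatches = ceilDiv(rows, batchSize);
        return sequenceCalls + insertBatches;
    }

    // Integer division rounded up, for positive arguments.
    private static long ceilDiv(long a, long b) {
        return (a + b - 1) / b;
    }

    public static void main(String[] args) {
        System.out.println("IDENTITY, 1000 rows: " + identityRoundTrips(1_000));
        System.out.println("SEQUENCE, 1000 rows: " + sequenceRoundTrips(1_000, 50, 50));
    }
}
```

In this model, 1000 rows cost 1000 round trips with IDENTITY and 40 with SEQUENCE (20 sequence calls plus 20 batches). Real numbers depend on the driver, the optimizer and the flush behavior, so treat it as an order-of-magnitude picture only.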

How does pooled sequence work for GenerationType.SEQUENCE?

It turns out that Hibernate can apply further optimizations to SEQUENCE, called optimizers, that reduce the number of round trips to the DB to get new ID values. They determine how Hibernate handles the ID range reserved via allocationSize.

No pooling

When allocationSize = 1, no pooling happens, meaning that each time a new ID is needed, nextval() is called on the sequence in the DB: every insert costs an extra round trip.

Pooled-lo optimizer

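When allocationSize is greater than 1, Hibernate puts an optimizer in front of the sequence. In the example below, Hibernate fetches a single value from the database sequence, for example 10, and then calculates a block of IDs in memory, without touching the DB, using that value as the base: ID = hi * allocationSize + lo, where lo runs from 0 to allocationSize - 1 (so 10 * 50 = 500 up to 549). After the block is used up, the next value is fetched (11 in our case) and the process is repeated (550 up to 599). Strictly speaking, multiplying the fetched value like this is the hi/lo approach (Hibernate's hilo optimizer); pooled-lo expects the sequence itself to increment by allocationSize and treats the fetched value as the lowest ID of the block, so no multiplication is needed. Which optimizer you get by default depends on the Hibernate version and on the hibernate.id.optimizer.pooled.preferred setting (recent versions prefer pooled), so it is worth checking in your own setup.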

@SequenceGenerator(
    name = "blogSeqGen", 
    sequenceName = "blog_sequence", 
    allocationSize = 50
)
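The block arithmetic for the formula ID = hi * allocationSize + lo can be checked with a few lines of plain Java. This is only a sketch of the calculation described in the text, not Hibernate's own code, and the class and method names are hypothetical.

```java
// Toy calculation of the hi/lo formula; illustrative only, not Hibernate code.
class HiLoBlock {

    // Lowest ID of the block: hi * allocationSize + 0.
    static long firstId(long hi, int allocationSize) {
        return hi * allocationSize;
    }

    // Highest ID of the block: hi * allocationSize + (allocationSize - 1).
    static long lastId(long hi, int allocationSize) {
        return hi * allocationSize + (allocationSize - 1);
    }

    public static void main(String[] args) {
        int allocationSize = 50;
        for (long hi = 10; hi <= 11; hi++) {
            System.out.println("hi=" + hi + " -> "
                    + firstId(hi, allocationSize) + ".." + lastId(hi, allocationSize));
        }
    }
}
```

With allocationSize = 50, a fetched value of 10 gives the block 500..549 and the next value, 11, gives 550..599.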

One note from personal experience here: if your business logic somehow relies on ID values being continuous, you might need to set _allocationSize_ to 1. Otherwise, if the app crashes or restarts midway, the unused IDs in the current block are lost and you will have a gap between ID values.
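The gap is easy to reproduce in a small simulation. The sketch below assumes the hi * allocationSize block formula described earlier; DbSequence and BlockAllocator are hypothetical stand-ins for the database sequence and for the in-memory state that disappears when the app stops.

```java
import java.util.concurrent.atomic.AtomicLong;

// Toy simulation of an ID gap after a restart; illustrative only, not Hibernate code.
class GapAfterRestart {

    // Stands in for the database sequence; its state survives an app restart.
    static final class DbSequence {
        private final AtomicLong next;

        DbSequence(long start) {
            this.next = new AtomicLong(start);
        }

        long nextval() {
            return next.getAndIncrement();
        }
    }

    // In-memory block allocator; its state is lost when the app stops.
    static final class BlockAllocator {
        private final DbSequence sequence;
        private final int allocationSize;
        private long current = 0;
        private long blockEnd = 0; // exclusive; equal to current means "no IDs left"

        BlockAllocator(DbSequence sequence, int allocationSize) {
            this.sequence = sequence;
            this.allocationSize = allocationSize;
        }

        long nextId() {
            if (current == blockEnd) {
                // Block used up (or never fetched): ask the "database" for a new one.
                current = sequence.nextval() * allocationSize;
                blockEnd = current + allocationSize;
            }
            return current++;
        }
    }

    public static void main(String[] args) {
        DbSequence sequence = new DbSequence(10);

        BlockAllocator beforeCrash = new BlockAllocator(sequence, 50);
        long last = 0;
        for (int i = 0; i < 3; i++) {
            last = beforeCrash.nextId(); // 500, 501, 502
        }

        // "Crash": the old allocator is discarded, a fresh one starts a new block.
        BlockAllocator afterRestart = new BlockAllocator(sequence, 50);
        long first = afterRestart.nextId();

        System.out.println("last ID before crash: " + last);
        System.out.println("first ID after restart: " + first);
        System.out.println("IDs lost: " + (first - last - 1));
    }
}
```

Here three IDs are used (500 to 502), the restart begins a new block at 550, and the 47 IDs from 503 to 549 are never handed out.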

Pooled optimizer

@GenericGenerator(
    name = "blogSeqGen",
    strategy = "org.hibernate.id.enhanced.SequenceStyleGenerator",
    parameters = {
        @Parameter(name = "sequence_name", value = "blog_sequence"),
        @Parameter(name = "optimizer", value = "pooled"),
        @Parameter(name = "increment_size", value = "50")
    }
)

The explicit pooled optimizer also hands out IDs in blocks, but instead of multiplying a hi value in Hibernate’s memory, it relies on the database sequence itself being increased by allocationSize (increment_size in the snippet above) on every fetch.

For example, the first query to the database returns 1. The IDs of that block are then handed out from memory as needed, while the database advances its sequence by allocationSize, so the next query returns 1 + allocationSize (51, following the above example). No multiplication is needed: Hibernate just reads the sequence value and treats it as a block boundary (pooled reads it as the upper end of a block, pooled-lo as the lower end).

This works especially well if we have several instances and need to use the same sequence without overlapping values. However, be careful: for this optimizer to work, the sequence in the database must be defined to increment by the same allocationSize you define in your entity (INCREMENT BY 50 in this example), which you have to do yourself if Hibernate does not create the schema. If they don’t match, you will end up with gaps or collisions.
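The following sketch shows why the increment in the database and the allocationSize in the entity have to match when two instances share one sequence. It is a simplified model with hypothetical class names: each instance treats the fetched value as the first ID of its block, and boundary details of the real optimizers are ignored.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicLong;

// Toy model of two app instances sharing one sequence; illustrative only, not Hibernate code.
class PooledSharedSequence {

    // Database sequence with a configurable INCREMENT BY.
    static final class DbSequence {
        private final AtomicLong next;
        private final int incrementBy;

        DbSequence(long start, int incrementBy) {
            this.next = new AtomicLong(start);
            this.incrementBy = incrementBy;
        }

        long nextval() {
            return next.getAndAdd(incrementBy);
        }
    }

    // One application instance handing out IDs from its current block.
    static final class Instance {
        private final DbSequence sequence;
        private final int allocationSize;
        private long current;
        private int remaining = 0;

        Instance(DbSequence sequence, int allocationSize) {
            this.sequence = sequence;
            this.allocationSize = allocationSize;
        }

        long nextId() {
            if (remaining == 0) {
                // Fetched value is taken as the first ID of a new block.
                current = sequence.nextval();
                remaining = allocationSize;
            }
            remaining--;
            return current++;
        }
    }

    // Lets two instances generate IDs alternately and counts duplicate IDs.
    static int duplicates(int dbIncrement, int allocationSize, int idsPerInstance) {
        DbSequence sequence = new DbSequence(1, dbIncrement);
        Instance a = new Instance(sequence, allocationSize);
        Instance b = new Instance(sequence, allocationSize);
        Set<Long> seen = new HashSet<>();
        int duplicates = 0;
        for (int i = 0; i < idsPerInstance; i++) {
            if (!seen.add(a.nextId())) duplicates++;
            if (!seen.add(b.nextId())) duplicates++;
        }
        return duplicates;
    }

    public static void main(String[] args) {
        System.out.println("matching (50/50) duplicates: " + duplicates(50, 50, 120));
        System.out.println("mismatch (1/50) duplicates: " + duplicates(1, 50, 120));
    }
}
```

When the sequence increments by 50 and allocationSize is 50, the two instances never produce the same ID. When the sequence increments by 1 while the app still assumes blocks of 50, the blocks overlap and duplicates appear almost immediately.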

How is post-insertion fetching of ID in GenerationType.IDENTITY approach beneficial at all?

In fact, this design choice was deliberate, giving the following advantages:

**Simple and easy to apply.** For simple CRUD apps or other cases when you don’t want to think about ID generation logic.
**IDs are strictly ordered.** If you are interested in time-based tracking, event logs, auditing or satisfying some legacy system business requirements, then this may be the one, as IDs are assigned exactly in the order rows are inserted and there is no risk of ID gaps due to preallocation (unlike the one we may have in case of _SEQUENCE_, however we know there are non-trivial ways to mitigate it).
**State managed by database consistently.** As ID creation is handled atomically and transactionally, no need to worry about multiple app nodes coordinating ID ranges.

What happens if I select a strategy, but my database does not support it?

Interestingly, several things can happen depending on the strategy we have chosen.

AUTO is generally safe, as you would have guessed, but if Hibernate picks something that doesn’t match the actual DB capabilities, we will hit runtime errors. This can happen due to a misconfigured dialect, for example.
In case of IDENTITY, we would get an SQL syntax error during table creation or inserts, as the DB would not support auto-increment columns.
In case of SEQUENCE, the DB will reject the sequence statements and we get a runtime exception, unless Hibernate is configured to emulate sequences using a table.
As TABLE is DB-independent, it would almost always work, but if the generator table isn’t created properly, you will get table-not-found or concurrent-access errors.

If GenerationType.TABLE is so slow, why does it still exist?

The reasons seem to be partly historical and partly technical. TABLE was designed to ensure portability across databases that supported neither sequences nor true IDENTITY columns. And since it just uses a regular table to emulate a sequence, it’s a good fallback that works everywhere (in theory).

There are several cases where it can be especially useful:

For embedded/in-memory DBs (like H2), or legacy systems, where we might not want to modify the DB schema.
For multi-dialect apps or tools if target DB isn’t known in advance.

We can think of it like a spare tire.

How do I pick the right strategy?

Maybe start by answering the following questions:

Small projects with a simple database? IDENTITY might be fine.
Need high-performance bulk inserts? SEQUENCE is a go-to.
Unsure? AUTO is a safe default, but check what it maps to in your database.
Working with an unsupported database? Consider TABLE (but be aware of its downsides).

… and then go deeper by carefully considering the pros and cons above. There is no single answer, only different circumstances.

Catch you in the next blog ✨