Beware Hibernate’s caching when using database filters

by Landon | Apr 30, 2019

The stack I work in every day uses Hibernate and Spring Data JPA for its object/relational mapping framework. My company is hardly alone in using these tools to map data from a database into Java objects. They’re quite commonly used, and also quite powerful.

One of the nifty features of Hibernate is filtering. You can put a filter definition on one of your entity classes, and then use that filter to dynamically alter the SQL that is actually issued against the database. That way, you can leave out data you know you won’t want or need.

We’ve used this database filter mechanism in an interesting, object-oriented way. First, we define the filter on one of our entities. Note the @FilterDef annotation.

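@Entity
@Table(name = "...obfuscated...")
// @FilterDef declares the filter and its default SQL condition
@FilterDef(name = "effective", defaultCondition = "((current_date between start_date and end_date) or " +
                                                      "(start_date <= current_date and end_date is null))")
// @Filter attaches the declared filter to this entity; with no condition of its own, it uses the default above
@Filter(name = "effective")
public class Entity implements AuditableText, Serializable, Comparable<Entity> { ... }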

Then, create an abstract class that uses the JPA entity manager to reach the underlying Hibernate Session and enable or disable a filter.

import org.hibernate.Session;

import javax.inject.Inject;
import javax.persistence.EntityManager;

public abstract class MyAwesomeFilter {
    @Inject protected EntityManager em;

    public void enable() {
        // unwrap() is the standard JPA way to get at the provider's Session
        em.unwrap(Session.class).enableFilter(getFilterName());
    }

    public void disable() {
        em.unwrap(Session.class).disableFilter(getFilterName());
    }

    abstract String getFilterName();
}

You can see that this class is mostly a wrapper around two methods on the Hibernate Session.

Because it’s an abstract class, we need a concrete implementation tied to the filter we defined on our entity.

import org.springframework.context.annotation.Scope;

import javax.inject.Named;

@Named
@Scope("prototype")
public class EffectiveFilter extends MyAwesomeFilter {
    @Override
    String getFilterName() {
        return "effective";
    }
}

Now, when you open a transactional window into the database and start retrieving data, all you have to do in the call path is specify whether or not you want this filter enabled. 

Notice as well that because this is a named object, it can be injected via Spring’s DI framework into whatever class needs it.

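import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class AwesomeServiceClass {

    private final EffectiveFilter effective;
    private final EntityRepo repo;

    // Both collaborators are injected by Spring through the constructor
    public AwesomeServiceClass(EffectiveFilter effective, EntityRepo repo) {
        this.effective = effective;
        this.repo = repo;
    }

...

    // Public, because Spring's proxy-based @Transactional is not applied to private methods
    @Transactional
    public Entity getMyEntity(final int entityId) {
        log.trace("getMyEntity()");
        effective.enable();
        return repo.findOne(entityId);
    }
}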

This has caused us so much frustration

While it’s cool that you can flip a filter on or off at will, it’s also caused a ton of consternation for one simple reason: it’s very easy to lose track of whether or not the filter is turned on when a call is issued. Hibernate and other ORMs are supposed to let you think less about what SQL is executed against a database, but once you start down the filter path (especially if you can turn it on or off), your call path could be filled with places where the filter is turned on, turned off, and then turned on again.
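One way to make the filter’s state easier to follow is to tie it to a lexical scope, so it is always switched back off when a block ends. The following is only a sketch under assumptions: `Toggle`, `FilterScope`, and `RecordingToggle` are hypothetical names, and `Toggle` stands in for anything with `enable()` and `disable()` methods, such as the MyAwesomeFilter class above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for anything with enable()/disable(), such as MyAwesomeFilter. */
interface Toggle {
    void enable();
    void disable();
}

/** Enables a toggle on construction and disables it when the try block exits. */
final class FilterScope implements AutoCloseable {
    private final Toggle toggle;

    FilterScope(Toggle toggle) {
        this.toggle = toggle;
        toggle.enable();
    }

    @Override
    public void close() {
        toggle.disable();
    }
}

/** Fake toggle that records calls, used here only to show the ordering. */
final class RecordingToggle implements Toggle {
    final List<String> calls = new ArrayList<>();

    @Override public void enable()  { calls.add("enable"); }
    @Override public void disable() { calls.add("disable"); }
}

class FilterScopeDemo {
    public static void main(String[] args) {
        RecordingToggle effective = new RecordingToggle();
        try (FilterScope scope = new FilterScope(effective)) {
            effective.calls.add("query"); // a repository call would go here
        }
        System.out.println(effective.calls); // [enable, query, disable]
    }
}
```

This only makes the on/off state visible in the code. A nested scope would still switch the filter off for its outer scope when it closes, and scoping does nothing about data that was already loaded while the filter was on.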

Which leads us to … the Hibernate cache!

To minimize the number of round trips it has to make to the database, Hibernate (and several other ORMs … I know .NET’s Entity Framework does this too) will cache what it has already loaded inside the JVM. At a minimum, that means the first-level cache that lives for the duration of the session. The benefits of this are obvious: fewer round trips to a database means better performance.

But in a filtering context, this can give you nightmares.

What if, at the beginning of an execution path, you need data with the filter turned on, but later in the path, you need that same data but with the filter turned off? As I recently discovered (the hard way), you’re sort of hosed.

Hibernate, for all its advantages, will recognize that you’ve already retrieved the entity on the first go-round (with the filter on). Even though you’ve disabled the filter on the second go-round, Hibernate will hand back what is already in its cache rather than going back to the database to get the data that you actually want.

We ended up having to get the data we were after via a more direct route: an explicit SQL query attached to the entity’s repository interface. (In other words, we forced a trip to the database.)
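To make the failure mode concrete, here is a toy model in plain Java. It is not how Hibernate is implemented, and every name in it (`ToySession`, `Child`, `load`, `loadFresh`) is made up for illustration. It only mimics the behaviour described above: a per-session cache that remembers the first, filtered result, and a separate path that always goes back to the "database".

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** One child row; "effective" is what the toy filter checks. */
final class Child {
    final String name;
    final boolean effective;

    Child(String name, boolean effective) {
        this.name = name;
        this.effective = effective;
    }
}

/** Toy model only: a map-backed "database" plus a per-session cache keyed by parent id. */
final class ToySession {
    private final Map<Integer, List<Child>> database = new HashMap<>();
    private final Map<Integer, List<String>> cache = new HashMap<>();
    private boolean filterEnabled;

    void insert(int parentId, Child child) {
        database.computeIfAbsent(parentId, id -> new ArrayList<>()).add(child);
    }

    void enableFilter()  { filterEnabled = true; }
    void disableFilter() { filterEnabled = false; }

    /** Cached load: the filter is applied only on the first call for a given id. */
    List<String> load(int parentId) {
        return cache.computeIfAbsent(parentId, this::queryDatabase);
    }

    /** Uncached load: always re-runs the query with the current filter setting. */
    List<String> loadFresh(int parentId) {
        return queryDatabase(parentId);
    }

    private List<String> queryDatabase(int parentId) {
        List<String> names = new ArrayList<>();
        for (Child child : database.getOrDefault(parentId, new ArrayList<>())) {
            if (!filterEnabled || child.effective) {
                names.add(child.name);
            }
        }
        return names;
    }
}

class ToySessionDemo {
    public static void main(String[] args) {
        ToySession session = new ToySession();
        session.insert(1, new Child("current", true));
        session.insert(1, new Child("expired", false));

        session.enableFilter();
        System.out.println(session.load(1));      // [current]

        session.disableFilter();
        System.out.println(session.load(1));      // [current]  (stale: served from the cache)
        System.out.println(session.loadFresh(1)); // [current, expired]
    }
}
```

In this model, `loadFresh` plays the role of the explicit query: it skips the cache, so the result reflects whatever the filter setting is at the moment of the call.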