04/07/2019

I discovered an undocumented way to improve JPA performance

Auteur

Fabrizio C4JFabrizio U.

Inzicht vanuit ons deelbedrijf
C4J logo

When mapping a table column we use the classic @Entity on our POJO’s (Plain Old Java Object) and every column that is present in the database we have to map it with a variable. This is good practice, however from a performance point of view it can give issues.

Imagine we have 30 columns on a table and most queries only need 5 of them. Every time we use our entity to fetch those columns, all the other columns are also fetched. For small data sets the performance impact is neglectable. However, for big datasets it will put a strain on all kind of resources for nothing!

Or imagine we have a CLOB or BLOB as a column. This can contain huge amounts of data. If with every query they get fetched from the database, it is bad practice, since all this data is always pulled over the network while we do not need it.

JPA (Java Persistence API) offers 3 less popular ways to optimize these kind of situations:

  1. Enable lazy loading on column level
  2. Use the multiselect functionality
  3. Define smaller entities that are mapped to views instead of tables

The last option has always been available since JPA 2.0. When I was coding at a customer, another colleague notified me about this quite handy feature and I was surprised I couldn’t find much information about it on the internet.

That’s why I will discuss this feature and compare it with the other options above. I will share how to implement them, the advantages and disadvantages.

Setup

Oracle logo Java EE logo Red Hat logo
  • Oracle 12c
  • Java EE 7
  • Red Hat EAP 7.1 with Hibernate 5.1.10

Example Entity

Let’s take a random attachment file that is represented in the database and compare 3 ways to optimize JPA performance.

import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Version;
import javax.persistence.Lob;

@Entity
public class Attachment {

	@Id
	long id;

	@Version
	int version;
	
	String name;

	@Lob
	byte[] content;

	// Getters, Setters omitted for brevity
}

1. Enable lazy loading on column level

When in a web page you are creating a table with an overview of all files and corresponding names, you will also fetch automatically the content column. Even when at that moment you do not need it. In our example below, you will only need it when the user clicks on the download link.

Id Name Download
1 Order1.pdf Click
2 Order2.pdf Click

To only get the content column when the user clicks it, we need to do two things. First you need to put the @Basic annotation on the content column and using the fetch property like this:

import javax.persistence.Basic;
import javax.persistence.FetchType;

…

@Lob
@Basic(fetch = FetchType.LAZY)
byte[] content;

Afterwards, you need to configure Hibernate via Maven to enhance its bytecode for the usage of lazy initialization of columns:

<plugin>
	<groupId>org.hibernate.orm.tooling</groupId>
	<artifactId>hibernate-enhance-maven-plugin</artifactId>
	<version>5.1.10.Final</version>
	<executions>
		<execution>
			<configuration>
				<enableLazyInitialization>true</enableLazyInitialization>
			</configuration>
			<goals>
				<goal>enhance</goal>
			</goals>
		</execution>
	</executions>
</plugin>

Important to know is that JPA defines the fetch type lazy as a hint. It means that depending on the implementation (Hibernate, EclipseLink, TopLink, …) it might effectively be fetched lazily. In some cases, the implementation might decide to eager fetch it anyway. As a developer you cannot control this.

Also consider the fact that for every query the hint will be considered. If you have only one overview where you want to lazily fetch file properties, you are safe. However, if you have a second overview where you want to preview the content of the file too, in case it is an image for instance, you cannot eager fetch it, meaning the image content will always be fetched in the context of the N + 1 select problem. This will result in a considerable performance loss.

When executing a standard select the following SQL statements are generated:

select
        attachment0_.id as id1_0_,
        attachment0_.name as name3_0_,
        attachment0_.version as version4_0_ 
from
        Attachment attachment0_

Advantages
  • Easy to use: just one annotation
  • Entity managed like any other entity (reading, saving, …)
  • Works with schema generation
Disadvantages
  • No guarantee it will be fetched lazily
  • Risk to encounter the N + 1 problem
  • Original entity is changed by adding default behavior for lazy loading columns
  • Maven plugin needed

2. Use the multiselect functionality

Now we will build a multiselect query in combination with the criteria metamodel to create a complete type safe JPA query. There are two ways to do this, one with an extra POJO and another one with a JPA generic Tuple object, which resembles the older DAO (Data Access Object) way.

The POJO way

We create a new POJO that only contains the necessary columns without the @Entity annotation.

public class AttachmentSimple  {
	
	Long id;

	String name;

	byte[] content;

	// Getters, Setters omitted for brevity
}

Then we will build our multiselect query like this:

CriteriaQuery<AttachmentSimple> query = enityManager.getCriteriaBuilder().createQuery(AttachmentSimple.class);
		
Root<Attachment> attachment = query.from(Attachment.class);
		
query.multiselect(attachment.get(Attachment_.id), attachment.get(Attachment_.name));
		
enityManager.createQuery(query).getResultList().forEach(a -> System.out.println("Id = " + a.getId() + " " + " Name = " + a.getName()));

What we are saying here is that our select root is the full Attachment entity, but the final mapping is done to the AttachmentSimple POJO.

The Tuple way

The Tuple object represents a row in an abstract way. In fact it is like the old way before JPA, but now it is type-safe:

CriteriaQuery<Tuple> query = enityManager.getCriteriaBuilder().createTupleQuery();
		
Root<Attachment> attachment = query.from(Attachment.class);
		
query.multiselect(
attachment.get(Attachment_.id).alias(Attachment_.id.getName()), attachment.get(Attachment_.name).alias(Attachment_.name.getName())
);
		
enityManager.createQuery(query).getResultList().forEach(a -> System.out.println("Id = " + a.get(Attachment_.name.getName()) + " " + " Name = " + a.get(Attachment_.name.getName())));

To be able to use this, you need to use the alias function, in order to retrieve the needed columns later from the Tuple object.

Advantages
  • Fine grade selection of columns
  • No need to change the original entity
  • Works with schema generation
Disadvantages
  • Cumbersome to set up queries
  • POJO way
    • Extra POJO needed
  • Tuple way
    • Aliases needed
    • Only generic “old-school” get methods available on Tuple
  • POJO or Tuple objects are not entities and cannot be saved back

3. Define smaller entities that are mapped to views instead of tables

Now that I’ve described the two fairly documented features, it’s time to compare these with the new feature I’ve discovered recently.

Create an SQL script named attachmentView.sql in the same META-INF folder as your persistence.xml with following content:

CREATE OR REPLACE VIEW AttachmentView (id, name) AS SELECT id, name FROM Attachment;

Add in your persistence.xml following property:

<property name="javax.persistence.sql-load-script-source" value="META-INF/attachmentView.sql"/>

Create a new entity:

import javax.persistence.Entity;
import javax.persistence.Id;


@Entity
public class AttachmentView {

	@Id
	long id;
	
	String name;

// Getters, Setters omitted for brevity
}

When we then fetch our entities following output is generated:

    select
        attachment0_.id as id1_1_,
        attachment0_.name as name2_1_ 
    from
        AttachmentView attachment0_

Be aware that the creation scripts could contain errors since JPA will still want to drop and create the entity AttachmentView as a table because it is not aware that the script contains the creation or replacement of a view. You can neglect these errors, configure your log levels accordingly or build your own drop and create scripts together with the corresponding JPA schema generation properties.

Advantages
  • Most easy to use once the view is created
  • A view can contain complex database logic (functions, multiple joins, …) which is hided from the developer (separation of concern)
  • No need to change the original entity
  • Reading is always normal JPA style
  • Depending on how simple the view is built, the entity can be saved back
Disdvantages
  • Works with schema generation through extra script
  • Knowledge of SQL and database specific statements for views is needed
  • Extra entity needed

Test logic

import javax.annotation.PostConstruct;
import javax.ejb.Singleton;
import javax.ejb.Startup;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;
import javax.persistence.Tuple;
import javax.persistence.criteria.CriteriaQuery;
import javax.persistence.criteria.Root;

import data.model.dynamic.Attachment;
import data.model.dynamic.AttachmentSimple;
import data.model.dynamic.AttachmentView;
import data.model.dynamic.Attachment_;

@Startup
@Singleton
public class PerformanceComparisonLogic {
	
	@PersistenceContext
	protected EntityManager enityManager;
	
	@PostConstruct
	public void init() {
		
		normalSelect();
		multiSelectPOJO();
		multiSelectTuple();
		viewSelect();
	}
	
	public void normalSelect() {

		CriteriaQuery<Attachment> query = enityManager.getCriteriaBuilder().createQuery(Attachment.class);

		Root<Attachment> attachment = query.from(Attachment.class);

		query.select(attachment);

		enityManager.createQuery(query).getResultList().forEach(a -> System.out.println("Id = " + a.getId() + " " + " Name = " + a.getName()));
	}
	
	public void multiSelectPOJO() {
		
		CriteriaQuery<AttachmentSimple> query = enityManager.getCriteriaBuilder().createQuery(AttachmentSimple.class);
		
		Root<Attachment> attachment = query.from(Attachment.class);
		
		query.multiselect(attachment.get(Attachment_.id), attachment.get(Attachment_.name));
		
		enityManager.createQuery(query).getResultList().forEach(a -> System.out.println("Id = " + a.getId() + " " + " Name = " + a.getName()));
	}
	
	public void multiSelectTuple() {
		
		CriteriaQuery<Tuple> query = enityManager.getCriteriaBuilder().createTupleQuery();
		
		Root<Attachment> attachment = query.from(Attachment.class);
		
		query.multiselect(attachment.get(Attachment_.id).alias(Attachment_.id.getName()), attachment.get(Attachment_.name).alias(Attachment_.name.getName()));
		
		enityManager.createQuery(query).getResultList().forEach(a -> System.out.println("Id = " + a.get(Attachment_.name.getName()) + " " + " Name = " + a.get(Attachment_.name.getName())));
	}
	
	public void viewSelect() {
		
		CriteriaQuery<AttachmentView> query = enityManager.getCriteriaBuilder().createQuery(AttachmentView.class);

		Root<AttachmentView> attachment = query.from(AttachmentView.class);

		query.select(attachment);

		enityManager.createQuery(query).getResultList().forEach(a -> System.out.println("Id = " + a.getId() + " " + " Name = " + a.getName()));
	}
}

So what’s the conclusion?

There are several ways of improving performance. One of them is finetuning the selection of columns. JPA offers some ways to do it, and the most unknown is to map an entity to a view. This is a quite straightforward way once you know how to do it. The extra advantage is that, when needed, complex database functionality can be mapped to a view without changing your code.

Ontmoet C4J

Backend

C4J is uw partner voor al uw backend oplossingen. Reken op onze uitgebreide kennis en ervaring voor al uw applicaties en systemen.

Bezoek c4j.be