Hibernate and sets – use at your peril

For the past three weeks I’ve spent hours tuning the performance of one of the applications we use. Unfortunately, rather than being given time to come up with a proper reporting solution, I was tasked with fixing the existing one, which reports off our live transactional system. We use EJB3 on JBoss, so Hibernate it is, which I personally think is an excellent framework for getting your application up and running quickly. Hibernate have always stated that performance isn’t its goal, so I can’t blame Hibernate for anything I’ve found.

I tackled the problem in the usual way: add some SQL logging, see what’s really happening and look for the usual suspects (lazy loading, code structure etc.). Where I saw collections being loaded separately I added join fetches and reran the reports. This worked really well, but then I noticed that no matter how I wrote the JPQL, it always seemed to run separate queries for some of the relationships. That’s when the bulb flashed: they were defined as Sets!

An example of such a thing:

private Set taxComponents = new HashSet(1);

I understand why sets are used but as I’m in control of what gets put in I can safely use:

 private Collection taxComponents = new ArrayList(1);

So using sets is great, but it comes at a cost. Beware: it’s not just fetches. Deletes and inserts have a similar issue, in that the elements get deleted one by one and then reinserted to ensure uniqueness, so updating the set also carries a performance hit.
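Swapping the Set for a Collection moves the uniqueness guarantee out of the data structure and into the calling code. Here is a minimal sketch of that trade-off in plain Java, with no Hibernate involved. The TaxComponent class, its code field and the addIfAbsent helper are all hypothetical stand-ins, since the real entity isn't shown here.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class TaxComponentCollections {

    // Hypothetical stand-in for the mapped entity; the real class is not shown in the post.
    static final class TaxComponent {
        private final String code;

        TaxComponent(String code) {
            this.code = code;
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof TaxComponent
                    && code.equals(((TaxComponent) other).code);
        }

        @Override
        public int hashCode() {
            return Objects.hash(code);
        }
    }

    // With a Collection backed by an ArrayList, the caller has to enforce uniqueness.
    // Returns true only if the candidate was actually added.
    static boolean addIfAbsent(Collection<TaxComponent> components, TaxComponent candidate) {
        if (components.contains(candidate)) {
            return false;
        }
        return components.add(candidate);
    }

    public static void main(String[] args) {
        Set<TaxComponent> asSet = new HashSet<>();
        Collection<TaxComponent> unguarded = new ArrayList<>();
        Collection<TaxComponent> guarded = new ArrayList<>();

        for (String code : new String[] {"VAT", "VAT", "DUTY"}) {
            TaxComponent component = new TaxComponent(code);
            asSet.add(component);            // the set silently drops the duplicate
            unguarded.add(component);        // the list keeps it
            addIfAbsent(guarded, component); // the list plus an explicit check drops it
        }

        System.out.println("set size: " + asSet.size());             // 2
        System.out.println("unguarded list size: " + unguarded.size()); // 3
        System.out.println("guarded list size: " + guarded.size());     // 2
    }
}
```

The explicit check is a linear scan, which is fine for a handful of elements but is exactly the work a HashSet would otherwise do for you. That's what "being in control of what gets put in" amounts to in practice.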

The performance of the major report moved from 30 mins to 6 mins, still not lightning fast but a good improvement. I could of course move to native SQL but I didn’t want to have to rewrite a load of code to do that.

2 thoughts on “Hibernate and sets – use at your peril”

  1. “I understand why sets are used”

    I’m not sure you do. There are two reasons why you’d have a @OneToMany(mappedBy):

    You have a legacy Java codebase or bad architecture and you HAVE to iterate through a collection of TaxComponents, you can’t execute a query (call a DAO method) to access that data.

    The second reason is that you can do some fancy query tricks (batch fetching) with such a collection. As you are having problems with queries, well, let’s just say whoever created the @OneToMany probably didn’t think about that either.

    So remove the @OneToMany and the collection. It’s not doing you any good and it’s not necessary (mappedBy).

    See: http://in.relation.to/1395.lace

    • Hi Christian,

      Thanks for the comment, I’m humbled that someone such as yourself would take the time to respond. Like you say in your blog, mapping a collection is a feature, and what I’m pointing out in this post is that using this feature has a side effect which I’ve not seen documented and which can carry a massive performance overhead. I don’t see why, when using an ORM, I should lose the ability to treat objects as objects without having to call a DAO to get related objects.


