There’s Lazy Loading, And Then There’s Lazy Coding
Posted by Davy Brion on November 19th, 2009
I frequently hear people complaining about lazy loading. Not really about specific implementations of the lazy loading pattern, but about the pattern itself. A lot of people think it shouldn’t be used, or be possible at all, because it makes it too easy for developers to write code that will perform horribly when it has to deal with the volumes of data that can be expected in production environments. I could not disagree more. It is true that many developers make too much use of lazy loading in their code, which most definitely will cause performance problems sooner or later. However, I don’t think this is an inherent problem with lazy loading; it’s purely an example of lazy coding.
If you need data, then it is your responsibility to retrieve that data as efficiently as you can. If you know that your code is going to need a certain set of data to perform the functionality that it needs to provide, then you should retrieve that set of data in the best way possible. Often, this means loading that set of data with as few queries as you can. If you merely retrieve the first entity that you need, and then make use of the lazy loading features to automatically get the rest of the data without you having to write a specific query for it, then you probably shouldn’t be writing software for a living. If you’re taking such a shortcut out of laziness, then you’re probably the kind of developer who will take many more shortcuts in your code to ‘finish’ your task as quickly as possible. Frankly, I’d rather not work with people like that.
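To make the difference concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `orders` and `order_lines` tables, the `CountingDb` wrapper, and both `totals_*` functions are hypothetical names invented for this illustration; they are not taken from any particular ORM. The first function mimics what happens when you load the first set of entities and let lazy loading fetch the rest one by one. The second asks for everything it needs in a single query.

```python
import sqlite3


class CountingDb:
    """Thin wrapper that counts how many queries are sent to the database."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.query_count = 0

    def query(self, sql, params=()):
        self.query_count += 1
        return self.conn.execute(sql, params).fetchall()


def build_db(n_orders=100, lines_per_order=3):
    """Create a small in-memory database with hypothetical order data."""
    db = CountingDb()
    db.conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY)")
    db.conn.execute("CREATE TABLE order_lines (order_id INTEGER, amount INTEGER)")
    for order_id in range(1, n_orders + 1):
        db.conn.execute("INSERT INTO orders (id) VALUES (?)", (order_id,))
        for _ in range(lines_per_order):
            db.conn.execute(
                "INSERT INTO order_lines (order_id, amount) VALUES (?, ?)",
                (order_id, 10),
            )
    return db


def totals_lazy(db):
    """One query for the orders, then one more query per order (1 + N)."""
    totals = {}
    for (order_id,) in db.query("SELECT id FROM orders"):
        rows = db.query(
            "SELECT amount FROM order_lines WHERE order_id = ?", (order_id,)
        )
        totals[order_id] = sum(amount for (amount,) in rows)
    return totals


def totals_eager(db):
    """A single joined query that brings back everything the code needs."""
    rows = db.query(
        "SELECT o.id, l.amount FROM orders o "
        "LEFT JOIN order_lines l ON l.order_id = o.id"
    )
    totals = {}
    for order_id, amount in rows:
        # amount is None for an order without lines (LEFT JOIN)
        totals[order_id] = totals.get(order_id, 0) + (amount or 0)
    return totals
```

With the default 100 orders, `totals_lazy` sends 101 queries and `totals_eager` sends 1, and both return the same totals. On a real database, each of those extra queries would also pay for a network round trip.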
Most problems that are attributed to lazy loading are really manifestations of the laziness (or ignorance) of the developers who write code like that. They are not inherently related to what lazy loading is about. In many cases, lazy loading can indeed improve performance of your code, but as with so many concepts and practices, it all depends on the situation. If a certain subset of the data that you would retrieve will not be used in the majority of cases, then obviously lazily loading that subset of data instead of eagerly fetching it could be a big performance improvement.
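As a rough sketch of the case where lazy loading helps, consider an entity whose expensive data is rarely touched. The `Customer` class, the `order_history` property, and `fake_fetch` below are all made up for this example, and `functools.cached_property` merely stands in for whatever proxy mechanism your data access library provides.

```python
from functools import cached_property


class Customer:
    """Hypothetical entity whose order history is only needed occasionally."""

    def __init__(self, name, load_history):
        self.name = name
        self._load_history = load_history  # callable that does the expensive fetch

    @cached_property
    def order_history(self):
        # Runs on first access only; the result is then cached on the instance.
        return self._load_history(self.name)


calls = []


def fake_fetch(name):
    # Stand-in for a database query; records that it was invoked.
    calls.append(name)
    return [f"{name}-order-1", f"{name}-order-2"]


customers = [Customer(n, fake_fetch) for n in ("ann", "bob", "cid")]

# The common path only touches cheap fields, so no history is fetched.
names = [c.name for c in customers]

# The rare path pays for exactly the one history it needs.
history = customers[1].order_history
history_again = customers[1].order_history  # served from the cache
```

After this runs, `calls` contains a single entry for "bob": the two customers whose history was never read cost nothing extra.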
Here are some rules of thumb that I usually employ when it comes to lazy loading:
- If a set of data is always, or at least in the (large) majority of cases, needed by your code, then don’t use lazy loading; fetch the required data as efficiently as you can.
- If a set of data is not needed in the majority of cases, then use lazy loading instead of eager fetching.
- If the probability of a set of data being needed is about 50%, decide on a case-by-case basis. Keep the size of the set of data in mind, as well as the cost of fetching it separately versus the cost of fetching everything together.
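One way to reason about the 50% case is a back-of-the-envelope expected cost comparison. The model below is a simplification introduced here for illustration, not a formula from any library: the function names, the parameters, and the idea of expressing cost as a single number (milliseconds, say) are all assumptions.

```python
def expected_costs(p_needed, cost_together, cost_separate):
    """Return (eager, lazy) expected extra cost for an optional set of data.

    p_needed      -- estimated probability that the data is actually used
    cost_together -- extra cost of fetching it along with the main query
    cost_separate -- cost of fetching it later in its own round trip
    """
    eager = cost_together            # always paid, whether used or not
    lazy = p_needed * cost_separate  # paid only when the data is used
    return eager, lazy


def prefer_lazy(p_needed, cost_together, cost_separate):
    """True when lazy loading has the lower expected cost under this model."""
    eager, lazy = expected_costs(p_needed, cost_together, cost_separate)
    return lazy < eager


# Almost always needed: eager wins even though the separate fetch is rare to skip.
mostly_needed = prefer_lazy(0.9, cost_together=2.0, cost_separate=5.0)

# Rarely needed: lazy wins.
rarely_needed = prefer_lazy(0.1, cost_together=2.0, cost_separate=5.0)

# At 50% the answer flips depending on how expensive the joined fetch is.
coin_flip_cheap_join = prefer_lazy(0.5, cost_together=2.0, cost_separate=5.0)
coin_flip_costly_join = prefer_lazy(0.5, cost_together=4.0, cost_separate=5.0)
```

With these made-up numbers, the two 50% cases come out differently, which is exactly why that rule says to decide case by case. Real costs should come from measuring, not guessing.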
And that is really all there is to it. Keep those rules in mind, and you’ll hardly ever run into problems with lazy loading. Remember kids: don’t say no to lazy loading, say no to lazy coding instead.
November 19th, 2009 at 3:26 pm
I agree that lazy loading is not the problem, and that a lot of the perceived problem is caused by developers relying too much on it, either out of laziness (though one could argue a YAGNI or “optimize later” stance – leading to “death by a thousand cuts”) or ignorance.
But I think the root problem is that contextual span fetching is inherently a DRY violation, meaning that load spans and domain logic are defined separately, thus making them hard to maintain in sync as an application evolves.
I am thinking here of a load span being defined in the context of e.g. a particular service method. The domain logic would obviously go on the domain model classes, which we want to keep independent of any persistence concerns. When we make a change to a method on a domain class, requiring some new data, we need to understand which spans (there could be more than one) are affected and amend them as appropriate.
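A small sketch of that drift, with every name in it (`Order`, `load_order`, `invoice_total`, the `span` tuple, and the fake data) invented for illustration: the span lives in the service method, the logic lives in the domain class, and a change to the latter quietly turns into a lazy load.

```python
class Order:
    """Domain class; knows nothing about how its data gets loaded."""

    def __init__(self, order_id, loader, preloaded):
        self.id = order_id
        self._loader = loader         # fetches one association on demand
        self._data = dict(preloaded)  # associations loaded eagerly by the span
        self.lazy_loads = []          # names of associations fetched lazily

    def _get(self, name):
        if name not in self._data:
            # Not covered by the span: fall back to a lazy load.
            self.lazy_loads.append(name)
            self._data[name] = self._loader(self.id, name)
        return self._data[name]

    def total(self):
        # Version 2 of the domain logic: a discount was added later.
        subtotal = sum(self._get("lines"))
        return subtotal * (1 - self._get("discount"))


FAKE_DB = {"lines": [10, 20, 30], "discount": 0.5}


def fetch(order_id, name):
    # Stand-in for a query that loads one association of one order.
    return FAKE_DB[name]


def load_order(order_id, span):
    """Eagerly load the associations named in the span."""
    preloaded = {name: fetch(order_id, name) for name in span}
    return Order(order_id, fetch, preloaded)


def invoice_total(order_id):
    # Service method whose span was written for version 1 of total(),
    # which only needed the order lines.
    order = load_order(order_id, span=("lines",))
    return order.total(), order.lazy_loads
```

Calling `invoice_total(1)` still returns the right total, but it also reports that "discount" was loaded lazily. Nothing fails, so the only way to notice is to widen the span by hand to `("lines", "discount")` or to catch the extra query while profiling.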
In the end, the only cure is measuring performance and profiling, which is something one should do anyway.
Cheers!
/Anders
November 19th, 2009 at 6:45 pm
If you need data, then it is your responsibility to retrieve that data [struck out: as efficiently as you can] as efficiently as performance requirements dictate.
Otherwise, yes, lazy loading is a useful pattern, but like any pattern, it can be abused easily.
November 23rd, 2009 at 8:30 am
Totally true.
We were working with some hierarchical data and we started with eager loading for all the levels. As it turned out, in most cases we did not need all the data we were loading.
Then we changed everything to lazy loading, but as it turned out, we were sometimes firing too many queries for some of the lower levels of the hierarchy.
Finally, we had to go for a combination of lazy and eager loading, as you have mentioned, to hit the sweet spot.