Batching NHibernate’s DML Statements
Posted by Davy Brion on October 27th, 2008
An oft-forgotten feature of NHibernate is that of batching DML statements. If you need to create, update or delete a bunch of objects you can get NHibernate to send those statements in batches instead of one by one. Let’s give this a closer look.
I have an ‘entity’ with the following mapping:
<class name="CrudTest" table="CrudTest">
<id name="Id" column="Id" type="guid" >
<generator class="assigned" />
</id>
<property name="Description" column="Description" type="string" length="200" not-null="true" />
</class>
Nothing special here, just a Guid Id field and a string Description field.
First, let’s see how much time it takes to create 10000 records of this without using the batching feature. I use the following method to create a bunch of dummy objects:
private IEnumerable<CrudTest> CreateTestObjects(int count)
{
List<CrudTest> objects = new List<CrudTest>(count);
for (int i = 0; i < count; i++)
{
objects.Add(new CrudTest { Id = Guid.NewGuid(), Description = Guid.NewGuid().ToString() });
}
return objects;
}
Then, the code to persist these objects:
var testObjects = CreateTestObjects(10000);
var stopwatch = new Stopwatch();
stopwatch.Start();
using (ITransaction transaction = Session.BeginTransaction())
{
foreach (var testObject in testObjects)
{
Session.Save(testObject);
}
transaction.Commit();
}
stopwatch.Stop();
Without enabling the batching, this code took 23 seconds to run on my cheap MacBook. Now let’s enable the batching in the hibernate.cfg.xml file:
<property name="adonet.batch_size">5</property>
A batch size of 5 is still very small, but for this test it means that it only has to do 2000 trips to the database instead of the original 10000. The code above now runs in 5.5 seconds. Setting the batch size to 100 made it run in 1.8 seconds. Going from 23 to 1.8 seconds with a small configuration change is a pretty nice improvement with very little effort. Obviously, these aren’t real benchmarks so your results may vary but i think it does show that you can easily get some performance benefits from it.
You can get performance benefits like this whenever you need to create/update/delete a bunch of records simply by enabling this setting. Keep in mind that this batching of statements doesn’t apply to select queries… for that you need to use NHibernate’s MultiCriteria or MultiQuery features
Another thing to keep in mind is that for this test i used the ‘assigned’ Id generator… which means that the developer is responsible for providing the Id value for new objects. One of the consequences of this is that NHibernate does not have to go to the database to retrieve the Id values like it would have to do if you were using (for instance) Identity Id values. If you were using the Identity Id generator, this configuration setting would have no effect whatsoever for inserts, although the benefits would still apply to update and delete statements.
Note that this approach is good for regular applications, but it’s still not good enough if you need to process very large data sets (like import processes and things of that nature). Obviously, an ORM isn’t well suited for those purposes, but we will examine another NHibernate feature in a future post which makes it possible to use NHibernate in such bulk operations with a pretty low performance overhead.
October 27th, 2008 at 8:26 pm
[...] Note: this was originally posted on my own blog [...]
October 30th, 2008 at 9:28 pm
[...] Batching NHibernate’s DML Statements [...]
November 2nd, 2008 at 9:23 pm
[...] my previous post, i showed how you can configure NHibernate to batch create/update/delete statements and what kind [...]
January 27th, 2009 at 9:33 am
Hi, Davy. Is this technique working with oracle DB (10g)?
January 27th, 2009 at 9:38 am
I’m not sure, but i thought none of the batching features work on Oracle
May 21st, 2009 at 1:43 pm
[...] case it only needs to hit the database when the transaction is committed. If you’ve enabled batching of DML statements, you could severly reduce the number of times you need to hit the database in this [...]
September 9th, 2009 at 11:10 pm
Does batching work with SQLite? I ask because it doesn’t seem to work for me…
September 10th, 2009 at 7:18 am
@GordonS
SQLite runs entirely in memory, so there would hardly be any benefit to batching
September 11th, 2009 at 5:45 pm
>SQLite runs entirely in memory
It’s not true, it can keep data file in FS.
I double the question.
September 12th, 2009 at 1:08 pm
i checked the nhibernate code for this… the current SQLite ‘drivers’ (within the nhibernate code, not the actual ado.net providers) don’t support this
it only seems to be supported for Sql Server and Oracle (if you use the ODP provider instead of the default one in .NET)