The Inquisitive Coder – Davy Brion's Blog

Trying to walk that thin line between intelligence and ignorance

Build Your Own Data Access Layer: Enabling Bulk Inserts

Posted by Davy Brion on October 12th, 2009

Note: This post is part of a series. Be sure to read the introduction here.

I know i wrapped up the series already, but i just had to add the ability to do bulk inserts to this data layer so i figured i’d might as well post about it. Ayende already talked about how to enable the ability to batch inserts (or updates and deletes) here and here so i’m going to skip that part. I used the exact same trick and created a PublicSqlCommandSet class which wraps the hidden SqlCommandSet class. Again, if you have no idea what i’m talking about in that last sentence then you need to read Ayende’s 2 posts that i just linked to ;)

After that, adding the bulk insert feature to the DAL was as simple as creating this class:

    public class BulkInsertAction : DatabaseAction

    {

        public BulkInsertAction(SqlConnection connection, SqlTransaction transaction, MetaDataStore metaDataStore,

            EntityHydrater hydrater, SessionLevelCache sessionLevelCache)

            : base(connection, transaction, metaDataStore, hydrater, sessionLevelCache) {}

 

        public void Insert<TEntity>(IEnumerable<TEntity> entities, int batchSize, int commandTimeOut)

        {

            var tableInfo = MetaDataStore.GetTableInfoFor<TEntity>();

            var insertStatement = tableInfo.GetInsertStatementWithoutReturningTheIdentityValue();

 

            var sqlCommandSet = new PublicSqlCommandSet { CommandTimeout = commandTimeOut, Connection = GetConnection(), Transaction = GetTransaction() };

 

            foreach (var entity in entities)

            {

                var currentCommand = CreateCommand();

                currentCommand.CommandText = insertStatement;

 

                foreach (var parameterInfo in tableInfo.GetParametersForInsert(entity))

                {

                    currentCommand.CreateAndAddInputParameter(parameterInfo.DbType, "@" + parameterInfo.Name, parameterInfo.Value);

                }

 

                sqlCommandSet.Append(currentCommand);

 

                if (sqlCommandSet.CommandCount == batchSize)

                {

                    ExecuteCurrentBatch(sqlCommandSet);

                    sqlCommandSet = new PublicSqlCommandSet { CommandTimeout = commandTimeOut, Connection = GetConnection(), Transaction = GetTransaction() };

                }

            }

 

            if (sqlCommandSet.CommandCount > 0)

            {

                ExecuteCurrentBatch(sqlCommandSet);

            }

        }

 

        private void ExecuteCurrentBatch(PublicSqlCommandSet sqlCommandSet)

        {

            try

            {

                sqlCommandSet.ExecuteNonQuery();

            }

            finally

            {

                sqlCommandSet.Dispose();

            }

        }

    }

And then adding this to the Session class:

        public void BulkInsert<TEntity>(IEnumerable<TEntity> entities)

        {

            CreateAction<BulkInsertAction>().Insert(entities, 50, 200);

        }

Obviously, the method signature was also added to the ISession interface. The batch-size and commandtimeout parameters are currently hardcoded but they should come from some kind of configuration file.

All in all, pretty easy stuff :)

9 Responses to “Build Your Own Data Access Layer: Enabling Bulk Inserts”

  1. Reflective Perspective - Chris Alcock » The Morning Brew #453 Says:

    [...] Build Your Own Data Access Layer: Enabling Bulk Inserts – Davy Brion resumes his series on creating his own ORM / Data Access Layer from scratch with a look at enabling his framework to support bulk inserts of data [...]

  2. Tom Says:

    Some good posts.

    What would you suggest in regards to deep loading and using stored procedures, say you have a list of 100 of something each having a foreign key to another table … (using batch queries)

    We have to use stored procedures at our place and I am looking at building our own basic ORM.

    1. Get all of the list and the FK records
    2. Get all of the list and attempt to get unique FK records (so you bring back less from DB).

  3. Davy Brion Says:

    @Tom

    depends on the dataset in the referenced table… if it’s more likely that the number of needed records is somewhat similar to the number of records in the referring table then i’d just join the two in one query

    if the number of needed records in the referenced table will be much smaller than the number of records in the referring table, i’d go for two statements and try to bring back only the ones i really need in the second statement. your stored procedure could then still send back both resultsets… with a datareader you can process each resultset returned by a statement so that wouldn’t be an issue

    either way you do it, as long as you combine both operations in one stored procedure then you don’t really need to worry about the ‘overhead’ of a second query since you’re not really paying for the network latency of multiple remote calls

  4. Tom Says:

    So really you need some sort of fetching strategy (not sure how you would implement)? because as always.. ‘it depends’..

    My example query:

    Fixture (say 100 of them)
    Fixture.HomeTeam (type Team), Fixture.AwayTeam (type Team)

    Stored procedures would be as its for an ORM … FixtureListAll, FixtureGetByID, TeamGetByID

    Options I can see are either:

    1.

    FixtureList

    and then work out what unique TeamIDs there is and then do a batch call on
    TeamGetByID

    So 2 trips to DB

    OR

    2.

    FixtureList and deep load 1 lot of children underneath, so bring back the Teams also, but this would be for all entities so performance would be hit and miss

    So 1 trip to DB but bring back more than you need
    OR

    3.

    A fetching strategy and you create a single stored procedure like option 2 but only when you need it, so default to lazy load or something

    So 1 trip but because you programmed for this you are aware of what your bringing back

    Any advice/suggestions welcome

  5. Tom Says:

    Sorry when I say “for all entities” i mean for anything using the framework so like a default way of getting things from the DB

  6. Davy Brion Says:

    advice… umm, can you get a new job? :p

    the fetching strategy is going to take the most effort out of the 3 options, but it gives you the flexibility to optimize for each specific scenario that you encounter

  7. Tom Says:

    I assume you would just call something similar to .net tiers…?

    Fetch(entity or collection, levels, types to include, types to exclude)

    Fetch(fixtures, 1, typeof(Team), typeof(Venue))… you just call this in your service layer and default to lazy loading otherwise.. or define something on the domain entity to say what to types to get by default? I would be using ActiveRecord under the service layer.

    While this isn’t one trip it would still be less than several 100 potential

    public class Fixture
    {
    // This would be ignored as the deeploading options excldues this
    [DeepLoad]
    public Venue Venue
    {
    return this.venue;
    }

    [DeepLoad]
    public Team HomeTeam
    {
    get
    {
    return this.homeTeam;
    }
    }

    [DeepLoad]
    public Team AwayTeam
    {
    get
    {
    return this.awayTeam;
    }
    }
    }

  8. Tom Says:

    Any thoughts in regards to fetching strategys?

  9. Davy Brion Says:

    Sorry, haven’t really had the time to look into this

Leave a Reply

XHTML: You can use these tags: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>