Archive for April, 2009

Gotta Love Refactoring

Written on April 19th, 2009
Categories: Software Development

Behold the following two classes:

    public class FutureCriteriaBatch
    {
        private readonly List<ICriteria> criterias = new List<ICriteria>();
        private readonly IList<System.Type> resultCollectionGenericType = new List<System.Type>();

        private int index;
        private IList results;
        private readonly ISession session;

        public FutureCriteriaBatch(ISession session)
        {
            this.session = session;
        }

        public IList Results
        {
            get
            {
                if (results == null)
                {
                    var multiCriteria = session.CreateMultiCriteria();
                    for (int i = 0; i < criterias.Count; i++)
                    {
                        multiCriteria.Add(resultCollectionGenericType[i], criterias[i]);
                    }
                    results = multiCriteria.List();
                    ((SessionImpl)session).FutureCriteriaBatch = null;
                }
                return results;
            }
        }

        public void Add<T>(ICriteria criteria)
        {
            criterias.Add(criteria);
            resultCollectionGenericType.Add(typeof(T));
            index = criterias.Count - 1;
        }

        public void Add(ICriteria criteria)
        {
            Add<object>(criteria);
        }

        public IFutureValue<T> GetFutureValue<T>()
        {
            int currentIndex = index;
            return new FutureValue<T>(() => (IList<T>)Results[currentIndex]);
        }

        public IEnumerable<T> GetEnumerator<T>()
        {
            int currentIndex = index;
            return new DelayedEnumerator<T>(() => (IList<T>)Results[currentIndex]);
        }
    }

and

    public class FutureQueryBatch
    {
        private readonly List<IQuery> queries = new List<IQuery>();
        private readonly IList<System.Type> resultCollectionGenericType = new List<System.Type>();

        private int index;
        private IList results;
        private readonly ISession session;

        public FutureQueryBatch(ISession session)
        {
            this.session = session;
        }

        public IList Results
        {
            get
            {
                if (results == null)
                {
                    var multiQuery = session.CreateMultiQuery();
                    for (int i = 0; i < queries.Count; i++)
                    {
                        multiQuery.Add(resultCollectionGenericType[i], queries[i]);
                    }
                    results = multiQuery.List();
                    ((SessionImpl)session).FutureQueryBatch = null;
                }
                return results;
            }
        }

        public void Add<T>(IQuery query)
        {
            queries.Add(query);
            resultCollectionGenericType.Add(typeof(T));
            index = queries.Count - 1;
        }

        public void Add(IQuery query)
        {
            Add<object>(query);
        }

        public IFutureValue<T> GetFutureValue<T>()
        {
            int currentIndex = index;
            return new FutureValue<T>(() => (IList<T>)Results[currentIndex]);
        }

        public IEnumerable<T> GetEnumerator<T>()
        {
            int currentIndex = index;
            return new DelayedEnumerator<T>(() => (IList<T>)Results[currentIndex]);
        }
    }

They are almost exactly the same, which is obviously a problem. The root cause of this situation is that there is no common interface between IQuery and ICriteria, and there also isn't a common interface between IMultiQuery and IMultiCriteria. Introducing those interfaces would make a lot of things easier, but can't be done right now. Still, that shouldn't prevent us from striving to avoid duplicate code.

This is actually pretty easy to do... here's the new FutureBatch class:

    public abstract class FutureBatch<TQueryApproach, TMultiApproach>
    {
        private readonly List<TQueryApproach> queries = new List<TQueryApproach>();
        private readonly IList<System.Type> resultTypes = new List<System.Type>();
        private int index;
        private IList results;

        protected readonly SessionImpl session;

        protected FutureBatch(SessionImpl session)
        {
            this.session = session;
        }

        public IList Results
        {
            get
            {
                if (results == null)
                {
                    GetResults();
                }
                return results;
            }
        }

        public void Add<TResult>(TQueryApproach query)
        {
            queries.Add(query);
            resultTypes.Add(typeof(TResult));
            index = queries.Count - 1;
        }

        public void Add(TQueryApproach query)
        {
            Add<object>(query);
        }

        public IFutureValue<TResult> GetFutureValue<TResult>()
        {
            int currentIndex = index;
            return new FutureValue<TResult>(() => GetCurrentResult<TResult>(currentIndex));
        }

        public IEnumerable<TResult> GetEnumerator<TResult>()
        {
            int currentIndex = index;
            return new DelayedEnumerator<TResult>(() => GetCurrentResult<TResult>(currentIndex));
        }

        private void GetResults()
        {
            var multiApproach = CreateMultiApproach();
            for (int i = 0; i < queries.Count; i++)
            {
                AddTo(multiApproach, queries[i], resultTypes[i]);
            }
            results = GetResultsFrom(multiApproach);
            ClearCurrentFutureBatch();
        }

        private IList<TResult> GetCurrentResult<TResult>(int currentIndex)
        {
            return (IList<TResult>)Results[currentIndex];
        }

        protected abstract TMultiApproach CreateMultiApproach();
        protected abstract void AddTo(TMultiApproach multiApproach, TQueryApproach query, System.Type resultType);
        protected abstract IList GetResultsFrom(TMultiApproach multiApproach);
        protected abstract void ClearCurrentFutureBatch();
    }

And now the original 2 classes look like this:

    public class FutureCriteriaBatch : FutureBatch<ICriteria, IMultiCriteria>
    {
        public FutureCriteriaBatch(SessionImpl session) : base(session) {}

        protected override IMultiCriteria CreateMultiApproach()
        {
            return session.CreateMultiCriteria();
        }

        protected override void AddTo(IMultiCriteria multiApproach, ICriteria query, System.Type resultType)
        {
            multiApproach.Add(resultType, query);
        }

        protected override IList GetResultsFrom(IMultiCriteria multiApproach)
        {
            return multiApproach.List();
        }

        protected override void ClearCurrentFutureBatch()
        {
            session.FutureCriteriaBatch = null;
        }
    }

and

    public class FutureQueryBatch : FutureBatch<IQuery, IMultiQuery>
    {
        public FutureQueryBatch(SessionImpl session) : base(session) {}

        protected override IMultiQuery CreateMultiApproach()
        {
            return session.CreateMultiQuery();
        }

        protected override void AddTo(IMultiQuery multiApproach, IQuery query, System.Type resultType)
        {
            multiApproach.Add(resultType, query);
        }

        protected override IList GetResultsFrom(IMultiQuery multiApproach)
        {
            return multiApproach.List();
        }

        protected override void ClearCurrentFutureBatch()
        {
            session.FutureQueryBatch = null;
        }
    }

They are still very similar, but at least it's a big improvement over the original situation.
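To see the shape of this refactoring in isolation, here is a minimal, self-contained sketch of the same idea: a generic base class owns the shared algorithm and the subclasses only supply the type-specific steps. None of these types are NHibernate types. `Batch`, `UpperCaseBatch` and `TemplateMethodDemo` are hypothetical stand-ins where the "queries" are plain strings and the "multi" object is just a list.

```csharp
using System;
using System.Collections.Generic;

// The shared algorithm lives here; subclasses fill in the type-specific steps.
public abstract class Batch<TQuery, TMulti>
{
    private readonly List<TQuery> queries = new List<TQuery>();

    public void Add(TQuery query)
    {
        queries.Add(query);
    }

    public IList<string> Execute()
    {
        TMulti multi = CreateMulti();
        foreach (TQuery query in queries)
        {
            AddTo(multi, query);
        }
        return GetResultsFrom(multi);
    }

    protected abstract TMulti CreateMulti();
    protected abstract void AddTo(TMulti multi, TQuery query);
    protected abstract IList<string> GetResultsFrom(TMulti multi);
}

// Hypothetical subclass: "executing" a query just upper-cases it.
public class UpperCaseBatch : Batch<string, List<string>>
{
    protected override List<string> CreateMulti()
    {
        return new List<string>();
    }

    protected override void AddTo(List<string> multi, string query)
    {
        multi.Add(query.ToUpperInvariant());
    }

    protected override IList<string> GetResultsFrom(List<string> multi)
    {
        return multi;
    }
}

public static class TemplateMethodDemo
{
    public static void Main()
    {
        var batch = new UpperCaseBatch();
        batch.Add("a");
        batch.Add("b");
        Console.WriteLine(string.Join(",", batch.Execute())); // prints "A,B"
    }
}
```

The two type parameters are what make this work without a common interface: the base class never calls anything on `TQuery` or `TMulti` itself, it only hands them to the abstract methods.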

Put Performance Concerns Into Perspective

Written on April 18th, 2009
Categories: Performance

There is a reported performance issue with NHibernate that i wanted to look into. The reported issue was related to retrieving objects through a generically typed List or through an IList reference.

The following code simulates the issue:

    using System;
    using System.Collections;
    using System.Collections.Generic;
    using System.Diagnostics;

    class MyClass {}

    class Program
    {
        static void Main(string[] args)
        {
            // First case: typed Add through the List<MyClass> reference
            List<MyClass> list = new List<MyClass>();

            Stopwatch stopwatch = Stopwatch.StartNew();

            for (int i = 0; i < 100000; i++)
            {
                list.Add(new MyClass());
            }

            stopwatch.Stop();

            Console.WriteLine("Elapsed ms: " + stopwatch.ElapsedMilliseconds);

            // Second case: untyped Add through an IList reference
            IList iList = new List<MyClass>();

            stopwatch = Stopwatch.StartNew();

            for (int i = 0; i < 100000; i++)
            {
                iList.Add(new MyClass());
            }

            stopwatch.Stop();

            Console.WriteLine("Elapsed ms: " + stopwatch.ElapsedMilliseconds);

            Console.ReadLine();
        }
    }

The only difference between the two Add operations is that the first calls the typed Add method through the generically typed List reference, while the second calls the untyped Add method of the same generically typed List through an IList reference.
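The untyped path takes an `object`, so `List<T>` has to verify and cast the value to `T` before it can store it, which is presumably part of the extra cost. Here's a small sketch that makes that runtime check visible; `UntypedAddDemo` and `TryAddUntyped` are names made up for this example.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class UntypedAddDemo
{
    // Returns true when the value could be added through the non-generic IList view.
    public static bool TryAddUntyped(IList list, object value)
    {
        try
        {
            list.Add(value); // the untyped Add(object), reached through IList
            return true;
        }
        catch (ArgumentException)
        {
            // List<T> rejects values that are not compatible with T.
            return false;
        }
    }

    public static void Main()
    {
        IList names = new List<string>();
        Console.WriteLine(TryAddUntyped(names, "hello")); // True
        Console.WriteLine(TryAddUntyped(names, 42));      // False
        Console.WriteLine(names.Count);                   // 1
    }
}
```

The typed `Add(T)` never needs this check, because the compiler has already guaranteed the argument's type.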

On my slow MacBook, the first Add operation typically took between 10 and 20 ms. The second Add operation typically took almost twice as long as the first Add operation. As you can see, that is a very minor performance issue, and it actually is only consistently noticeable once you're dealing with 100000 elements. At 50000 elements, both operations typically take the same amount of time with only minor variations in performance on certain runs.

So yes, once you're dealing with a large enough set of elements, there is indeed a performance difference. But it's extremely minor and the extra cost of the Add operation is most definitely the least of your concerns if you're retrieving that many entity instances through an ORM. The extra amount of memory that needs to be used for those entities and the extra cost of pulling all of that data over the wire is what's really going to bite you, not the extra cost of the Add operation ;)

Educate Developers Instead Of Protecting Them

Written on April 15th, 2009
Categories: Opinions

A lot of companies have a couple of technical people who are responsible for laying down some technical guidelines, rules, or even reusable libraries or small (or large) frameworks. The idea is that all the developers within the company (or department) need to follow the same rules, use the same libraries, and pretty much write similar code compared to everyone else. I'm not going to get into the benefits/drawbacks of this, but there is one important aspect of this approach that i've seen in a few places which i'd like to comment on.

What you often see in this kind of situation is that the people who make up the rules (let's call them the 'technical leads' for the purpose of this post) often try to protect less-skilled developers from making mistakes or even assume that less-skilled developers won't be able to comprehend certain technical principles/aspects/patterns/approaches. Btw, i don't mean the term less-skilled developers in a derogatory way... but that is often how technical leads in places like these think of 'application developers', especially those who still have a lot to learn. This often leads to guidelines, or even APIs, which are heavily influenced by the perceived lack of skills/insight/knowledge of the less-skilled developers.

I'm sure you all have countless examples of situations where technical leads impose restrictions or guidelines based purely on their lack of faith in less-skilled developers. Hell, just browse through Microsoft's Framework Design Guidelines and you'll encounter quite a few more examples of how some of the .NET Framework's technical leads made choices or still recommend certain things based largely on having to deal with less-skilled developers who might not have the required skills to do everything the proper way. (Side note: i do like the FDG book and find it quite helpful in many situations, though the parts where they talk about less-skilled developers really do annoy me).

In the large majority of cases, trying to protect developers from themselves is just not worth it. For starters, hiding things from developers often requires more effort in the long run than simply trying to educate your developers as to how they should properly do their job. Sure, you can wrap anything that you deem as potentially easy-to-misuse, but you will need to think about alternative approaches whenever you have to deal with a situation in an application that falls outside of the boundaries you had in mind when you wrote your wrapper. And this is where it usually gets very messy, very quickly.

On the other hand, you could have invested some amount of effort into guiding and teaching those developers who you might consider as less-skilled. If you're going to hide things from less-skilled developers, how can you possibly hope that these developers will one day require less hand-holding and protection from themselves? The only way these developers are going to reach the level you'd like them to be on, is if somebody is going to help them get to that level. It might not always be easy, but it's usually far more effective in the long run than simply having to deal with their (perceived) lack of skill forever and ever.

Another huge downside to protecting developers from themselves is that it easily puts off the strong developers. Just imagine that you join one of these companies/departments where they do this. You know the ins and outs of certain libraries or frameworks, yet you are being restricted because the powers that be don't want your less-skilled coworkers to screw up. How likely are you to stick around for a long time in a place like this? How motivated could you be, knowing that you'll often be restricted on a technical level?

Basically, there are 2 huge downsides to this. The first is obviously that your less-skilled developers are never going to improve. You will be stuck with their perceived inabilities forever. The second is that it will either drive your competent developers to look for a better job, or if they stick around they might tune out, which could lead to a whole other host of problems. All of this can easily be avoided by simply showing your willingness to help educate your developers. They will benefit from it, and as a result, so will you.

NHibernate’s Future Queries And Their Fallback Behavior

Written on April 13th, 2009
Categories: NHibernate

I've blogged about NHibernate's Future queries a couple of times already. But as you know, NHibernate aims to offer you a way to write your code completely independent of the actual database you're using. So what happens if you run your code, which is using the Future and FutureValue features, on a database that doesn't support batched queries? Previously, this would fail with a NotSupportedException being thrown.

As of today (revision 4177, if you want to be specific), this is no longer the case. If you use the Future or FutureValue methods of either ICriteria or IQuery, and the database doesn't support batching queries, NHibernate will fall back to simply executing the queries immediately, as the following tests show:

        [Test]
        public void FutureOfCriteriaFallsBackToListImplementationWhenQueryBatchingIsNotSupported()
        {
            using (var session = sessions.OpenSession())
            {
                var results = session.CreateCriteria<Person>().Future<Person>();
                results.GetEnumerator().MoveNext();
            }
        }

        [Test]
        public void FutureValueOfCriteriaCanGetSingleEntityWhenQueryBatchingIsNotSupported()
        {
            int personId = CreatePerson();
 
            using (var session = sessions.OpenSession())
            {
                var futurePerson = session.CreateCriteria<Person>()
                    .Add(Restrictions.Eq("Id", personId))
                    .FutureValue<Person>();
                Assert.IsNotNull(futurePerson.Value);
            }
        }

There are more tests obviously, but you get the point. The interesting part about these tests is how i disabled query batching support. I only have SQL Server and MySQL running on this machine, and they both support query batching. I didn't really feel like installing a database that doesn't support it, so i just took advantage of NHibernate's extensibility. Since most of us run the NHibernate tests on SQL Server, i inherited from the SQL Server driver and made sure that it would report to NHibernate that it didn't support query batching:

    public class TestDriverThatDoesntSupportQueryBatching : SqlClientDriver
    {
        public override bool SupportsMultipleQueries
        {
            get { return false; }
        }
    }

Easy huh? Then i just inherited from the TestCase class we have in the NHibernate.Test project, which offers a virtual method where you can modify the NHibernate configuration for the current fixture:

        protected override void Configure(Configuration configuration)
        {
            configuration.Properties[Environment.ConnectionDriver] =
                "NHibernate.Test.NHSpecificTest.Futures.TestDriverThatDoesntSupportQueryBatching, NHibernate.Test";
            base.Configure(configuration);
        }

Now NHibernate thinks that query batching isn't supported, yet the above tests still work. Mission accomplished :)
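NHibernate's actual implementation isn't shown in this post, so the following is only a rough sketch of the fallback idea under my own assumptions: if batching is supported, a future value is queued and resolved with one simulated round trip when it's first read; if not, the query runs right away. `SketchSession`, `RoundTrips` and `FallbackDemo` are hypothetical names, and the "queries" are just delegates that return an int.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a session; not an NHibernate type.
public class SketchSession
{
    private readonly bool supportsMultipleQueries;
    private readonly List<Func<int>> queued = new List<Func<int>>();
    private readonly List<int> results = new List<int>();

    public int RoundTrips { get; private set; }

    public SketchSession(bool supportsMultipleQueries)
    {
        this.supportsMultipleQueries = supportsMultipleQueries;
    }

    public Lazy<int> FutureValue(Func<int> query)
    {
        if (!supportsMultipleQueries)
        {
            // Fallback: no batching available, so execute immediately.
            RoundTrips++;
            int immediate = query();
            return new Lazy<int>(() => immediate);
        }

        // Batching available: remember the query and resolve it on first access.
        int index = queued.Count;
        queued.Add(query);
        return new Lazy<int>(() => RunPending()[index]);
    }

    private List<int> RunPending()
    {
        if (results.Count < queued.Count)
        {
            // Simulates a single round trip that executes every query queued so far.
            RoundTrips++;
            while (results.Count < queued.Count)
            {
                results.Add(queued[results.Count]());
            }
        }
        return results;
    }
}

public static class FallbackDemo
{
    public static void Main()
    {
        var batching = new SketchSession(true);
        var a = batching.FutureValue(() => 1);
        var b = batching.FutureValue(() => 2);
        Console.WriteLine(batching.RoundTrips); // 0, nothing has been executed yet
        Console.WriteLine(a.Value + b.Value);   // 3
        Console.WriteLine(batching.RoundTrips); // 1, both queries shared one round trip

        var fallback = new SketchSession(false);
        fallback.FutureValue(() => 3);
        Console.WriteLine(fallback.RoundTrips); // 1, executed immediately
    }
}
```

The calling code is identical in both cases, which is the whole point: only the number of round trips changes.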

300 Posts

Written on April 13th, 2009
Categories: About The Blog

And there we go... the 300th post on this blog. I'm continuing the tradition, so here's an overview of the 10 most popular posts in between the 200th and 299th post:

  1. Tired Of Working With Big Visual Studio Solutions?

    This one was a bit of a fluke. It's a solution (no pun intended) to a problem that i tried at work, and it worked pretty well for us. When i wrote the post i considered it 'filler' material, but it turned out that quite a few people were happy with this.

  2. Genesis: Bridging The Gap Between Requirements And Code

    This one was a shameless plug for one of the products that my company is working on. I actually thought that this would get more reactions from readers but unfortunately, it didn't. It did get a lot of views so i guess that's something.

  3. What It Takes To Be A Great Technical Lead

    Just a list of the qualities/skills that a great technical lead needs to have IMO. And no, i don't think i live up to that list in case you're wondering :)

  4. Why On Earth Would A Developer Do This?

    This was about a piece of code i saw in one of the books every developer supposedly has to read. That piece of code is really a shame, and considering where the code came from (not to mention who wrote it), it only makes matters worse.

  5. Challenge: Do You Truly Understand This Code?

    I thought this one was a lot of fun. From the reactions i got, i guess most of you liked it too. I should do more of these in the future :)

  6. Do Not Litter Your Code With Null Checks

    This was just me ranting against a practice that i've seen far too often.

  7. Continuous Integration 101

    The basic rules to follow when you're a member of a team that's using Continuous Integration. I was tremendously frustrated with my coworkers when i wrote this post, and it shows. But it's still a good list :)

  8. Why Don't We Learn?

    I'm generally pretty unhappy with the state of software development as an art, science, craft, field or whatever way you want to look at it. I guess this post explains why.

  9. We All Write Bad Code

    We all try to write great, clean code but i think we should all be honest about the fact that we all have to write bad code once in a while. I still stand by my statement that anyone who claims he/she never writes bad code is either lying, ignorant or living in a fantasy world.

  10. Performance Rules Of Thumb

    This is one of my personal favorites. Performance is an often recurring theme on this blog (though i usually avoid micro-optimizations since they're usually pointless) and i just wanted to post a list of things to keep in mind at all times when you're writing code or designing something. Some people would consider those things premature optimizations but i have to disagree on that one. Most of this is just common sense and not following these rules is very likely to cause problems sooner or later in any real world project.

There you have it. The thing i found pretty interesting about this list is that it's completely different from the previous lists at 100 and 200 posts. In those lists, most of the posts were about specific technical subjects (either certain libraries or approaches or solutions or whatever) whereas in the posts of this list, there's not a lot of actual code to be found. I have no idea what i'm going to be writing about in the next 100 posts, though i'd guess it will be the same kind of mix between opinions and actual technical stuff. Suggestions (in general) are always welcome of course ;)