C#

Virtual Method Performance Penalty, Revisited

18 comments. Written on December 17th, 2010
Categories: C#, Performance

I wrote a post about a year ago which discussed a test of the performance difference between calling virtual methods and non-virtual methods. This morning, someone added the following comment to that post:

if you have 100 subclasses of class A, and they all override a method a, it will take a lot longer for it to figure out which version of a to call. Think of it as a switch statement with one case label versus a switch statement with 100 case labels. Since you’re just testing it with one method it’s not surprising that the cost is negligible.

My Bullshit-detector started beeping while reading that, so i just had to see if the number of subclasses indeed had an impact. I didn't go all the way up to 100 subclasses, but i went with 15. If there is indeed a performance penalty that grows with the number of subclasses in play, then surely i'd have to see some difference when using 15 subclasses over just 1, right?

In the original test, i had the following 2 classes:

    public class MyClass
    {
        public long someLong;

        public void IncreaseLong()
        {
            someLong++;
        }

        public virtual void VirtualIncreaseLong()
        {
            someLong++;
        }
    }

    public class MyDerivedClass : MyClass
    {
        public override void VirtualIncreaseLong()
        {
            someLong += 2;
        }
    }

Now, i wasn't quite sure whether the commenter meant having a bunch of classes that inherited directly from MyClass, or having a set of inheriting classes in a deep inheritance tree. Just to be sure, i tested both cases.

In the first case, i have classes like MyDerivedClass1, MyDerivedClass2, ... , MyDerivedClass15 that all inherit directly from MyClass. In the second case, MyDerivedClass1 inherits from MyClass, MyDerivedClass2 inherits from MyDerivedClass1, ... , and MyDerivedClass15 inherits from MyDerivedClass14.
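
To make the two layouts concrete, here is a scaled-down sketch with three subclasses per case instead of fifteen. The names FlatDerived1..3 and DeepDerived1..3, the amounts added in each override, and the HierarchyDemo helper are all made up for illustration (the classes in the actual test were named MyDerivedClass1 to MyDerivedClass15 in both cases):

```csharp
public class MyClass
{
    public long someLong;

    public void IncreaseLong()
    {
        someLong++;
    }

    public virtual void VirtualIncreaseLong()
    {
        someLong++;
    }
}

// Case 1 (hypothetical names): every subclass inherits directly from MyClass.
public class FlatDerived1 : MyClass
{
    public override void VirtualIncreaseLong() { someLong += 2; }
}

public class FlatDerived2 : MyClass
{
    public override void VirtualIncreaseLong() { someLong += 3; }
}

public class FlatDerived3 : MyClass
{
    public override void VirtualIncreaseLong() { someLong += 4; }
}

// Case 2 (hypothetical names): each subclass inherits from the previous one.
public class DeepDerived1 : MyClass
{
    public override void VirtualIncreaseLong() { someLong += 2; }
}

public class DeepDerived2 : DeepDerived1
{
    public override void VirtualIncreaseLong() { someLong += 3; }
}

public class DeepDerived3 : DeepDerived2
{
    public override void VirtualIncreaseLong() { someLong += 4; }
}

public static class HierarchyDemo
{
    // Calls the virtual method through a MyClass reference, exactly like the
    // test methods do, and returns the resulting counter value.
    public static long CallThroughBase(MyClass theObject)
    {
        theObject.VirtualIncreaseLong();
        return theObject.someLong;
    }
}
```

In both layouts the call site is identical: HierarchyDemo.CallThroughBase(new FlatDerived3()) and HierarchyDemo.CallThroughBase(new DeepDerived3()) each run the override declared on the object's own class.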

The code of the test is still largely the same as it was in the previous post, with just some minor modifications to make sure that more of the code to be executed has been JIT'ed prior to the actual test-run:

    using System;
    using System.Diagnostics;

    class Program
    {
        const int iterations = 1000000000;

        static void Main(string[] args)
        {
            var myObject = new MyClass();
            var myDerivedObject = new MyDerivedClass15();

            // we do this so there's no first-time performance cost while timing
            EnsureThatEverythingHasBeenJitted(myObject);
            EnsureThatEverythingHasBeenJitted(myDerivedObject);

            TestNormalIncreaseMethod(myObject, iterations);
            TestVirtualIncreaseMethod(myObject, iterations);

            TestNormalIncreaseMethod(myDerivedObject, iterations);
            TestVirtualIncreaseMethod(myDerivedObject, iterations);

            Console.ReadLine();
        }

        static void EnsureThatEverythingHasBeenJitted(MyClass theObject)
        {
            theObject.IncreaseLong();
            theObject.VirtualIncreaseLong();
            TestNormalIncreaseMethod(theObject, 1, false);
            TestVirtualIncreaseMethod(theObject, 1, false);
        }

        static void TestNormalIncreaseMethod(MyClass theObject, int numberOfTimes, bool printToConsole = true)
        {
            if (printToConsole) Console.WriteLine(string.Format("calling the IncreaseLong method of type {0} {1} times", theObject.GetType().Name, numberOfTimes));
            
            var stopwatch = Stopwatch.StartNew();
            for (var i = 0; i < numberOfTimes; i++)
            {
                theObject.IncreaseLong();
            }
            stopwatch.Stop();

            if (printToConsole) Console.WriteLine("Elapsed milliseconds: " + stopwatch.ElapsedMilliseconds);
        }

        static void TestVirtualIncreaseMethod(MyClass theObject, int numberOfTimes, bool printToConsole = true)
        {
            if (printToConsole) Console.WriteLine(string.Format("calling the VirtualIncreaseLong method of type {0} {1} times", theObject.GetType().Name, numberOfTimes));

            var stopwatch = Stopwatch.StartNew();
            for (var i = 0; i < numberOfTimes; i++)
            {
                theObject.VirtualIncreaseLong();
            }
            stopwatch.Stop();

            if (printToConsole) Console.WriteLine("Elapsed milliseconds: " + stopwatch.ElapsedMilliseconds);
        }
    }

In the first test (multiple direct subclasses of MyClass) i got the following result:

[Screenshot of the console output: fifteen subclasses]

(note: for this test, i used MyDerivedClass1 instead of MyDerivedClass15 as in the listed code)

In the second test (inheritance tree) i got the following result:

[Screenshot of the console output: fifteen nested subclasses]

As you can once again see, the difference is completely negligible. So here's what i propose: until someone actually demonstrates a clear-cut performance penalty that is even slightly relevant to real-world usage, we should just drop the whole "virtual methods are expensive!" thing.

What Do You Think Of This Hack?

14 comments. Written on November 16th, 2010
Categories: C#, Code Quality

I have a class which exposes a fluent interface to build something. Instances of this class contain some state based on the methods of the fluent interface you called and the arguments you passed to those methods. Now, that state is currently private but it's not private in the "oh my god we need to encapsulate this so nobody can read it!11!!!1" sense. In fact, i actually want to access that state from another class. The only 'problem' is that i don't want to add methods or properties to the class to expose this state, because it sort-of pollutes the fluent interface. I know, that's not really a huge issue but still, it'd be nice to keep the fluent interface clean and focused.

One way to expose this state without polluting the fluent interface is to create a separate interface which defines the methods/properties and then have the class implement that interface explicitly. That way you could only access those methods/properties when you cast the instance to the interface type. While there's nothing really wrong with doing that, i kinda have a bad feeling about that because it introduces an interface which is only there to support this little exercise in Intellectual Masturbation.
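
For reference, the explicit-interface option described above could look roughly like this. It is only a sketch: the interface name IExposeBuilderState and its State property are invented for the example and were not part of the real code.

```csharp
using System.Collections.Generic;

// Hypothetical interface that exists only to expose the builder's state.
public interface IExposeBuilderState
{
    IList<string> State { get; }
}

public class MyClassWhichUsesAFluentInterface : IExposeBuilderState
{
    private readonly List<string> someState = new List<string>();

    public MyClassWhichUsesAFluentInterface SomethingFluent(string blah)
    {
        someState.Add(blah);
        return this;
    }

    // Explicit implementation: this member does not show up on the class
    // itself, only on references typed as IExposeBuilderState.
    IList<string> IExposeBuilderState.State
    {
        get { return someState; }
    }
}
```

A consumer would then have to write something like ((IExposeBuilderState)builder).State to get at the list, which keeps the fluent interface clean at the cost of the extra type.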

Instead, i tried this:

    using System.Collections.Generic;

    public class MyClassWhichUsesAFluentInterface
    {
        private List<string> someState = new List<string>();

        public MyClassWhichUsesAFluentInterface SomethingFluent(string blah)
        {
            someState.Add(blah);
            return this;
        }

        // ...

        public static List<string> GetState(MyClassWhichUsesAFluentInterface constructedThingie)
        {
            return constructedThingie.someState;
        }
    }

Is it fancy? nope. Is it cool? nope. Does it work? yup. Is it simple? yup.

Good enough for me.

Using The Dynamic Keyword To Avoid Difficulties With Generics

15 comments. Written on July 16th, 2010
Categories: C#, Code Quality

A coworker was working on some kind of base EntityBuilder class to use in his tests.  One of the requirements of the EntityBuilder class was that it would need to automatically set the ID of an entity to a ‘real’ value (as in: not the default value of the type).  The EntityBuilder would use some kind of IdGenerator based on the type of the ID of the entity.   First of all, the example i’m gonna show is highly simplified and might not look like it makes much sense, but it’s only to illustrate some C# stuff with regards to generics and the dynamic keyword.  So bear with me, and just focus on the language details :)

Suppose you’ve got something like this:

    public abstract class Entity<TId>
    {
        public virtual TId Id { get; set; }
    }

    public interface IIdGenerator<TId>
    {
        TId GenerateId();
    }

    public class IntIdGenerator : IIdGenerator<int>
    {
        private static int lastIssuedId;

        public int GenerateId()
        {
            return ++lastIssuedId;
        }
    }

    public class GuidIdGenerator : IIdGenerator<Guid>
    {
        public Guid GenerateId()
        {
            return Guid.NewGuid();
        }
    }

The idea was to write the EntityBuilder somewhat along these lines:

    public abstract class TestEntityBuilder<TEntity, TId> where TEntity : Entity<TId>
    {
        public TEntity Build()
        {
            var entity = CreateEntityWithDefaultProperties();
            entity.Id = GetIdGeneratorFor(typeof(TId)).GenerateId();
            return entity;
        }

        protected abstract TEntity CreateEntityWithDefaultProperties();

        private IIdGenerator<TId> GetIdGeneratorFor(Type type)
        {
            if (type == typeof(int))
            {
                return new IntIdGenerator();
            }

            return new GuidIdGenerator();
        }
    }

Of course, that doesn’t even compile… you’ll get the following compiler errors:

error CS0266: Cannot implicitly convert type 'MyProject.IntIdGenerator' to 'MyProject.IIdGenerator<TId>'. An explicit conversion exists (are you missing a cast?)
error CS0266: Cannot implicitly convert type 'MyProject.GuidIdGenerator' to 'MyProject.IIdGenerator<TId>'. An explicit conversion exists (are you missing a cast?)

So, how exactly do you get this working with generics? That’s when he asked for my help, and i didn’t know the answer either… i’ve struggled with this exact problem in a few previous situations and i never really got a clean solution either.  But then i thought “wait, can’t we just avoid the problems with generics through the dynamic keyword?”

We changed the code to look like this:

    public abstract class TestEntityBuilder<TEntity, TId> where TEntity : Entity<TId>
    {
        public TEntity Build()
        {
            var entity = CreateEntityWithDefaultProperties();
            entity.Id = GetIdGeneratorFor(typeof(TId)).GenerateId();
            return entity;
        }

        protected abstract TEntity CreateEntityWithDefaultProperties();

        private dynamic GetIdGeneratorFor(Type type)
        {
            if (type == typeof(int))
            {
                return new IntIdGenerator();
            }

            return new GuidIdGenerator();
        }
    }

We just changed the return type of the GetIdGeneratorFor method to ‘dynamic’, and the call to the GenerateId method is now a dynamic call instead of a normal method call.  And it works.  No messing around with generics voodoo, no (direct) usage of reflection either.  Just clean code.
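
Put together, a compilable version of the dynamic-based builder might look like the sketch below. The Customer and Order entities and their builders are hypothetical and only there to exercise both id types; everything else follows the snippets above.

```csharp
using System;

public abstract class Entity<TId>
{
    public virtual TId Id { get; set; }
}

public interface IIdGenerator<TId>
{
    TId GenerateId();
}

public class IntIdGenerator : IIdGenerator<int>
{
    private static int lastIssuedId;

    public int GenerateId()
    {
        return ++lastIssuedId;
    }
}

public class GuidIdGenerator : IIdGenerator<Guid>
{
    public Guid GenerateId()
    {
        return Guid.NewGuid();
    }
}

public abstract class TestEntityBuilder<TEntity, TId> where TEntity : Entity<TId>
{
    public TEntity Build()
    {
        var entity = CreateEntityWithDefaultProperties();
        // GenerateId() is resolved at runtime because the receiver is dynamic;
        // the result is converted to TId when it is assigned.
        entity.Id = GetIdGeneratorFor(typeof(TId)).GenerateId();
        return entity;
    }

    protected abstract TEntity CreateEntityWithDefaultProperties();

    private dynamic GetIdGeneratorFor(Type type)
    {
        if (type == typeof(int))
        {
            return new IntIdGenerator();
        }

        return new GuidIdGenerator();
    }
}

// Hypothetical entities and builders, added only for this example.
public class Customer : Entity<int> { }

public class Order : Entity<Guid> { }

public class CustomerBuilder : TestEntityBuilder<Customer, int>
{
    protected override Customer CreateEntityWithDefaultProperties()
    {
        return new Customer();
    }
}

public class OrderBuilder : TestEntityBuilder<Order, Guid>
{
    protected override Order CreateEntityWithDefaultProperties()
    {
        return new Order();
    }
}
```

With these in place, new CustomerBuilder().Build() yields a Customer with a positive int id and new OrderBuilder().Build() yields an Order with a non-empty Guid. One trade-off to keep in mind: a typo in the dynamically called method name would only surface at runtime, not at compile time.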

I’ll probably use this trick a lot more times in the future when i run into the limitations of generics :)

Check Out QuickGenerate

1 comment. Written on June 30th, 2010
Categories: C#, Code Quality, Software Development, testing

One of several interesting things in Mark Meyers’ QuickNet project is the whole input generation thing that you need for property-based testing.  It turns out that those input generators are very usable for far more purposes than just property-based testing, so it’s evolved into its own library.  It can generate object instances of almost any kind, while you can still have fine-grained control over the generation if you want to.  You can use it for simple types, complex objects or even entire object graphs. I wish i had time to write a more in-depth post about this, but for now i’m just gonna point you guys in the right direction, and i hope that you’ll see the value in this :)

The announcement of the first release can be found here, and an example can be found here.  Here’s a little glimpse at the code of one of the examples:

[Image: QuickGenerate code example]

I think that piece of code is a nice illustration of how powerful and flexible this is :)

Is There A Good Reason To Hide Inherited Members?

22 comments. Written on January 17th, 2010
Categories: C#

A mistake that i used to see a lot, and sometimes still do, is that developers hide inherited members in derived types.  For those of you who don’t know what that means, check out the following class:

    public class MyClass
    {
        public void DoSomething()
        {
            Console.WriteLine("i'm doing something important");
        }
    }

As you can see, the DoSomething method does something important, so you as a developer should consider its behavior important as well.   In some cases, you might want to add something extra to this behavior in a derived class.  Some developers would do that like this:

    public class MyDerivedClass : MyClass
    {
        public void DoSomething()
        {
            base.DoSomething();
            Console.WriteLine("(i'm just better at it)");
        }
    }

When they compile this, they’ll get the following warning:

warning CS0108: 'HidingMethods.MyDerivedClass.DoSomething()' hides inherited member 'HidingMethods.MyClass.DoSomething()'. Use the new keyword if hiding was intended.

When this occurs, one of four things can happen:

  1. The developer doesn’t bother to read compiler warnings (an offense worthy of a bitchslap) and is not aware of a possible problem.
  2. The developer sees the warning and just adds the ‘new’ keyword to the method.
  3. The developer reads the documentation to figure out what this really means, and hopefully realizes his mistake.  If he does, he moves on to option 4.  If he doesn’t, he goes the option 2 route.
  4. The developer realizes his mistake and either makes the base method virtual and adds the override keyword to the method in the derived class or, when that’s not possible, renames the method or thinks of a different approach.

If you’re unlucky, you either end up with no modification or the method will now look like this:

        new public void DoSomething()
        {
            base.DoSomething();
            Console.WriteLine("(i'm just better at it)");
        }

Great, no more compiler warning! All is well in the world now, right?

Err… not really.

The DoSomething method of MyDerivedClass actually hides the original DoSomething method. 

The following code will always produce the expected behavior:

            new MyClass().DoSomething();
            new MyDerivedClass().DoSomething();

That is, when the DoSomething method is called on an instance of MyClass, it will obviously execute the original DoSomething method.  And when it is called on an instance of MyDerivedClass through a reference of type MyDerivedClass, it will call the ‘new’ DoSomething method.

The following code however, would not produce the expected behavior:

        static void DoIt(MyClass subject)
        {
            subject.DoSomething();
        }

        static void Main(string[] args)
        {
            DoIt(new MyClass());
            DoIt(new MyDerivedClass());

            Console.ReadLine();
        }

In this example, the DoSomething method is always called through a reference of type MyClass.  When the DoIt method receives an instance of MyClass, the original DoSomething method will obviously be executed.  What some (many?) people unfortunately aren’t aware of is that when an instance of MyDerivedClass is passed into the DoIt method, only the original DoSomething method is executed, not the ‘new’ one.  The reason is that a call to a non-virtual method is bound at compile time based on the declared type of the reference, so a method that hides an inherited member is only called through references of the type that hides it (or of types derived from that).  Doesn’t really sound like a fun situation to debug, right?
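
The difference between hiding and overriding is easy to see side by side. In the sketch below the methods return strings instead of writing to the console so the results can be compared directly; DoSomethingVirtual and the HidingDemo helper are hypothetical additions for the sake of the comparison.

```csharp
public class MyClass
{
    // Non-virtual: which method runs is decided by the declared type
    // of the reference the call is made through.
    public string DoSomething()
    {
        return "original";
    }

    // Virtual: which method runs is decided by the runtime type of the object.
    public virtual string DoSomethingVirtual()
    {
        return "original";
    }
}

public class MyDerivedClass : MyClass
{
    // Hides MyClass.DoSomething; only reachable through MyDerivedClass references.
    public new string DoSomething()
    {
        return "new";
    }

    // Overrides MyClass.DoSomethingVirtual; reachable through any reference.
    public override string DoSomethingVirtual()
    {
        return "override";
    }
}

public static class HidingDemo
{
    // Both calls go through a MyClass reference, like DoIt does.
    public static string CallThroughBase(MyClass subject)
    {
        return subject.DoSomething() + " / " + subject.DoSomethingVirtual();
    }
}
```

HidingDemo.CallThroughBase(new MyDerivedClass()) returns "original / override": the hidden method is skipped while the overridden one is honored. Calling new MyDerivedClass().DoSomething() directly returns "new".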

So anyways, back to my original question: is there any valid reason why you would want to do this? Or better yet, can you share a situation where you had to resort to hiding an inherited member and if so, are you happy with the solution or did you consider it a hack?  And are you aware that it is essentially a bug waiting to happen?

So far, i have never actually seen a valid reason for doing this.  The only reason i’ve seen so far for occurrences of hidden members was that the developers who wrote them simply didn’t know any better.  In some cases, either me or someone else wasted debugging time on this when a piece of code that used a reference of a base class suddenly didn’t do what was expected when it was passed an instance of a derived class which hid the original method.

I have read of only one valid reason to do this: a method (virtual or non-virtual) with the same signature is introduced, outside of your control, in a base class that you can’t modify (say, a class in the .NET framework or some other 3rd party assembly that you depend on), and you want to get rid of the compiler warning that you get when recompiling against the newer version of the assembly.  In that case it does make sense, though i’d still say it’s going to be a source of confusion sooner or later, and quite likely lead to a bug as well.  In that situation, i’d much rather rename the member in my class, even if it means that consumers of my code will be forced to deal with the breaking change.  After all, a simple rename will always be less work than having to debug an issue because of a hidden member.