Virtual Method Performance Penalty, Revisited

18 commentsWritten on December 17th, 2010 by
Categories: C#, Performance

I wrote a post about a year ago which discussed a test of the performance difference between calling virtual methods and non-virtual methods. This morning, someone added the following comment to that post:

if you have 100 subclasses of class A, and they all override a method a, it will take a lot longer for it to figure out which version of a to call. Think of it as a switch statement with one case label verses a switch statement with 100 case labels. Since you’re just testing it with one method it’s not surprising that the cost is negligible.

My Bullshit-detector started beeping while reading that, so i just had to see if the number of subclasses indeed had an impact. I didn't go all the way up to 100 subclasses, but i went with 15. If there is indeed a performance penalty that grows with the number of subclasses in play, then surely i'd have to see some difference when using 15 subclasses over just 1, right?

In the original test, i had the following 2 classes:

    public class MyClass
    {
        public long someLong;

        public void IncreaseLong()
        {
            someLong++;
        }

        public virtual void VirtualIncreaseLong()
        {
            someLong++;
        }
    }

    public class MyDerivedClass : MyClass
    {
        public override void VirtualIncreaseLong()
        {
            someLong += 2;
        }
    }

Now, i wasn't quite sure whether the commenter meant having a bunch of classes that inherited directly from MyClass, or having a set of inheriting classes in a deep inheritance tree. Just to be sure, i tested both cases.

In the first case, i have classes like MyDerivedClass1, MyDerivedClass2, ... , MyDerivedClass15 that all inherit directly from MyClass. In the second case, MyDerivedClass1 inherits from MyClass, MyDerivedClass2 inherits from MyDerivedClass1, ... , and MyDerivedClass15 inherits from MyDerivedClass14.

The code of the test is still largely the same as it was in the previous post, with just some minor modifications to make sure that more of the code to be executed has been JIT'ed prior to the actual test-run:

    class Program
    {
        const int iterations = 1000000000;

        static void Main(string[] args)
        {
            var myObject = new MyClass();
            var myDerivedObject = new MyDerivedClass15();

            // we do this so there's no first-time performance cost while timing
            EnsureThatEverythingHasBeenJitted(myObject);
            EnsureThatEverythingHasBeenJitted(myDerivedObject);

            TestNormalIncreaseMethod(myObject, iterations);
            TestVirtualIncreaseMethod(myObject, iterations);

            TestNormalIncreaseMethod(myDerivedObject, iterations);
            TestVirtualIncreaseMethod(myDerivedObject, iterations);

            Console.ReadLine();
        }

        static void EnsureThatEverythingHasBeenJitted(MyClass theObject)
        {
            theObject.IncreaseLong();
            theObject.VirtualIncreaseLong();
            TestNormalIncreaseMethod(theObject, 1, false);
            TestVirtualIncreaseMethod(theObject, 1, false);
        }

        static void TestNormalIncreaseMethod(MyClass theObject, int numberOfTimes, bool printToConsole = true)
        {
            if (printToConsole) Console.WriteLine(string.Format("calling the IncreaseLong method of type {0} {1} times", theObject.GetType().Name, numberOfTimes));
            
            var stopwatch = Stopwatch.StartNew();
            for (var i = 0; i < numberOfTimes; i++)
            {
                theObject.IncreaseLong();
            }
            stopwatch.Stop();

            if (printToConsole) Console.WriteLine("Elapsed milliseconds: " + stopwatch.ElapsedMilliseconds);
        }

        static void TestVirtualIncreaseMethod(MyClass theObject, int numberOfTimes, bool printToConsole = true)
        {
            if (printToConsole) Console.WriteLine(string.Format("calling the VirtualIncreaseLong method of type {0} {1} times", theObject.GetType().Name, numberOfTimes));

            var stopwatch = Stopwatch.StartNew();
            for (var i = 0; i < numberOfTimes; i++)
            {
                theObject.VirtualIncreaseLong();
            }
            stopwatch.Stop();

            if (printToConsole) Console.WriteLine("Elapsed milliseconds: " + stopwatch.ElapsedMilliseconds);
        }
    }

In the first test (multiple direct subclasses of MyClass) i got the following result:

fifteen subclasses

(note: for this test, i used MyDerivedClass1 instead of MyDerivedClass15 as in the listed code)

In the second test (inheritance tree) i got the following result:

fifteen nested subclasses

As you can once again see, the difference is completely negligible. So here's what i propose: until someone actually shows a case where a clear-cut performance penalty is shown that is even slightly relevant to real-world usage, we should just drop the whole "virtual methods are expensive!"-thing.

  • http://benpittoors.wordpress.com den Ben

    dynamic methods are expensive :p

  • Artur Dorochowicz

    FYI, I think the claim about slow virtual methods is actually true on Compact Framework. AFAIK it does not have vtables, so virtual method calls are resolved at call time by checking inheritance hierarchies each time.

  • Alex Simkin

    It is not that “virtual” calls are expensive, it is that the presence of “virtual” calls prevents compiler optimizer and jitter from doing certain optimization tricks.

  • http://davybrion.com Davy Brion

    @Alex

    i’d love to see an example which shows the performance penalty of not being able to use those optimisations

  • Alex Simkin

    public class Program
    {
    public bool ImAlwaysTrue()
    {
    return true;
    }

    public virtual bool IcanBeWhatever()
    {
    return true;
    }

    public bool ItTakesMeForeverToGetResult()
    {
    Thread.Sleep(1000);
    return true;
    }

    public bool M1()
    {
    return ImAlwaysTrue() || ItTakesMeForeverToGetResult();
    }

    public bool M2()
    {
    return IcanBeWhatever() || ItTakesMeForeverToGetResult();
    }

    public void Jit()
    {
    M1();
    M2();
    }

    static void Main(string[] args)
    {
    var p = new Program();
    p.Jit();
    var stopwatch1 = Stopwatch.StartNew(); for (var i = 0; i < 1000000000; i++) { p.M1(); } stopwatch1.Stop();
    var stopwatch2 = Stopwatch.StartNew(); for (var i = 0; i < 1000000000; i++) { p.M2(); } stopwatch2.Stop();
    Console.WriteLine(stopwatch1.ElapsedMilliseconds);
    Console.WriteLine(stopwatch2.ElapsedMilliseconds);
    }
    }
    }

    I can see difference when compiling and executing in Release mode. I see 459 vs 4126.

  • http://davybrion.com Davy Brion

    @Alex

    ah, now that’s a good example :)
    on my system it’s 771 vs 3393… i suppose this is purely due to the inlining of ImAlwaysTrue?

    i’m gonna play with this some more tomorrow… wanna see what the difference is when inlining isn’t possible

  • Alex Simkin

    @Davy

    Yes, virtual methods cannot be inlined.

    For code below I put Jitter output below each method:

    public int One()
    {
    return 1;
    }

    public virtual int Two()
    {
    return 2;
    }

    public int M1()
    {
    return One() + One() + One() + One() + One();
    }
    // mov eax, 5
    // ret

    public int M2()
    {
    return Two() + Two() + Two() + Two() + Two();
    }
    // push ebp
    // mov ebp, esp
    // push esi
    // push eax
    // mov esi, ecx
    // mov ecx, esi
    // mov eax, dword ptr [ecx]
    // mov eax, dword ptr [eax+28h]
    // call dword ptr [eax+10h]
    // mov ecx, esi
    // mov dword ptr [ebp-8], eax
    // — repeat 4 more times
    // — giving total of 37 instructions including 5 calls

  • tcmaster

    I’d say to 99% of applications this kind of performance penalty is nothing. On they other hand, illed design or implementation can easily take 100 times of that penalty. I think on this side, Java took the correct direction by taking all method as virtual by default.

  • kilfour

    I’d say if one needs this kind of performance C# is a bad choice of technology.
    The mere fact that Alex is able to write down all that registry malarkey, just screams ‘Ansi/C programmer’.
    Which is not bad thing, it is just not relevant for most C# apps being written today.

  • Alex Simkin

    @Tcmaster

    With getting popular “extract till you drop” [1] refactoring, one can kill performace significantly by making compiler optimization impossible. Imagine that One() and Two() are called in a loop.

    @Kilfour

    C# used for all kinds of programs including computation intensive, not just forms over data.

    [1] http://blog.objectmentor.com/articles/2009/09/11/one-thing-extract-till-you-drop

  • summerlight

    Indirect branch speculation mechanism in modern CPU should also be considered. In this scenario, speculation hit rate is almost 100% since only one virtual table is actually used for indirect branch. When it goes to more realistic scenario, speculation hit rate will be far more less and performance will drastically decreased.

  • Pingback: The Morning Brew - Chris Alcock » The Morning Brew #754

  • Pingback: Tweets that mention Virtual Method Performance Penalty, Revisited | The Inquisitive Coder – Davy Brion's Blog -- Topsy.com

  • Zaibot

    Don’t forget sealing your methods and classes will get alot of performance back! ;-)

  • Alex Simkin

    @Zaibot

    I do not see any difference. Care to share a test?

  • Mike Strobel

    I could very well be wrong, but I seem to remember reading that the C# compiler emits many calls as virtual even if the target method is not actually virtual. Have you checked the IL to see if your non-virtual test cases actually make non-virtual calls?

  • http://davybrion.com Davy Brion

    @Mike

    haven’t checked the IL but i’ve read that a couple of times as well

  • Alex Simkin

    @Davy

    C# always emits callvirt for any instance calls (virtual or non-virtual). Unlike Java bytecode, IL is never interpreted and always jitted. Jitter should generate correct instructions for different types of calls.