NDepend And Lack Of Cohesion Of Methods… Not to be trusted

9 commentsWritten on December 11th, 2009 by
Categories: NDepend

I already mentioned that i don’t trust the Lack Of Cohesion Of Methods (LOCM) metric from NDepend on the type level.  Let’s go over a small example, shall we?

Suppose you have a class like this:

    public class CohesiveClass

    {

        private List<int> integers;

 

        public CohesiveClass()

        {

            integers = new List<int>();

        }

 

        public void AddInteger()

        {

            integers.Add(integers.Count);   

        }

 

        public IEnumerable<int> GetEvenIntegers()

        {

            return integers.Where(i => i % 2 == 0);

        }

 

        public IEnumerable<int> GetOddIntegers()

        {

            return integers.Where(i => i % 2 > 0);

        }

    }

 

If you look at the NDepend LOCM documentation, you can find the following statement:

a class is utterly cohesive if all its methods use all its instance fields

Now, unless i’m missing a fundamental concept here, i would dare to claim that the class shown above is utterly cohesive.  After all, each method uses the 1 instance field of this class.

This is how NDepend is supposed to calculate the LOCM metric (taken from the documentation):

There are several LCOM metrics. The LCOM takes its values in the range [0-1]. The LCOM HS (HS stands for Henderson-Sellers) takes its values in the range [0-2]. A LCOM HS value highest than 1 should be considered alarming. Here are algorithms used by NDepend to compute LCOM metrics:

  • LCOM = 1 – (sum(MF)/M*F)
  • LCOM HS = (M – sum(MF)/F)(M-1)
Where:
  • M is the number of methods in class (both static and instance methods are counted, it includes also constructors, properties getters/setters, events add/remove methods).
  • F is the number of instance fields in the class.
  • MF is the number of methods of the class accessing a particular instance field.
  • Sum(MF) is the sum of MF over all instance fields of the class.

The underlying idea behind these formulas can be stated as follow: a class is utterly cohesive if all its methods use all its instance fields, which means that sum(MF)=M*F and then LCOM = 0 and LCOMHS = 0

I may suck at math, but i think that both LCOM and LCOM HS should be 0 for the code shown above.  NDepend however disagrees with me.  LCOM for this class, as calculated by NDepend is 0.33 and LCOM HS is 0.4.  In both cases NDepend considers these values to be bad for this class (as indicated by the fact that the IsBadLackOfCohesionOfMethods and IsBadLackOfCohesionOfMethods_HS values in TypesMetrics.xml are set to True).

Of course, it also reports that there are 3 fields in this type, instead of one (NFields is set to 3) and 2 static methods along with the 4 instance methods.  I’m sure you can agree with me that there are no static methods, and only one field.

Now, if you look at the compiled code in reflector, it suddenly makes sense:

reflector_output

Alright, so we do have 3 fields: my integers list, and 2 Func instances that the C# compiler generated for me.  And the 2 static methods that were listed by NDepend also show up, and they contain the code of my lambda expressions.

Now here is my question: if a tool calculates metrics of code, shouldn’t it calculate them based on the code that i wrote, instead of the code that the compiler generated?  Especially when it comes to metrics that are supposed to give you an idea on the quality of the code, i would really prefer it that my code was used for the calculation, without the compiler generated code.

Let’s try to make NDepend agree with me that this code is indeed utterly cohesive, shall we?

Here’s a modified version of the code:

    public class CohesiveClass

    {

        private List<int> integers;

 

        public CohesiveClass()

        {

            integers = new List<int>();

        }

 

        public void AddInteger()

        {

            integers.Add(integers.Count);   

        }

 

        public IEnumerable<int> GetEvenIntegers()

        {

            var evenInts = new List<int>();

 

            foreach (var i in integers)

            {

                if (i % 2 == 0)

                {

                    evenInts.Add(i);

                }

            }

 

            return evenInts;

        }

 

        public IEnumerable<int> GetOddIntegers()

        {

            var evenInts = new List<int>();

 

            foreach (var i in integers)

            {

                if (i % 2 > 0)

                {

                    evenInts.Add(i);

                }

            }

 

            return evenInts;

        }

    }

 

NDepend will now gladly agree with me that LCOM and LCOM HS are 0 and that the class is utterly cohesive.

I don’t know about you, but i’ll take version 1 over version 2 any day of the week.

Until this issue is fixed (and who knows how many other metrics show incorrect results based on compiler generated code), i’ll assess the quality of my code myself instead of depending on a tool, thank you very much.

  • http://benpittoors.wordpress.com den Ben

    if reflector knows the difference between written code and compiler generated code then NDepend should be able to easily do so also imo

    maybe it should differentiate between LCOM and LCOMCompiled because it still might be interesting to know what the compiler does to your code ;-)

    too bad NDepend is closed source… I can’t imagine this being hard to implement

  • http://davybrion.com Davy Brion

    reflector doesn’t know the difference… it just reads the generated intermediate language and displays it in C# as well as it can

  • http://benpittoors.wordpress.com den Ben

    so there is no way to distinguish compiler generated code from developer written code reading from the intermediate language?

  • http://benpittoors.wordpress.com den Ben

    not fail safe… but chances are that parsing the name for a double underscore would get you a long way already: ie b__0(Int32)

  • alwin

    A method/field with a in it would usually be a compiler generated method, since you can’t use those characters in C# (or VB I think). I mean the method name, not the signature which can have for generics.

  • Harry Steinhilber

    Another pretty good indicator would be the presence of the [CompilerGenerated] or [DebuggerHidden] attributes. Although I have seen [DebuggerHidden] used in hand written code, though rarely.

  • http://www.NDepend.com Patrick Smacchia

    Davy, interesting finds, we agree with your analysis,
    LCOM computation is indeed flawed when the compiler emits some members behind your back.
    this will be fixed in future releases.
    Thanks
    Patrick Smacchia
    NDepend Lead Dev

  • http://davybrion.com Davy Brion

    @Patrick

    excellent news, thanks

  • Pingback: NDepend 3 Preview | The Inquisitive Coder – Davy Brion's Blog