Do You Know What’s Really Going On?
Posted by Davy Brion on January 20th, 2010
I recently showed the following picture and asked you what it showed:
Matt Hidinger pretty much nailed it in the comments:
Well it appears to have something to do with Efferent Coupling at the Type Level (http://www.ndepend.com/Metrics.aspx#TypeCe) — simply meaning the number of types that a class directly depends on.
So it appears that you’re aiming for less than 20 direct type dependencies and highlighting classes with more than 35 deps.
I imagine this is a view of a particular namespace, but I’m not sure on that.
I mentioned in the comments that i’d write more about it soon, so here it is. That screenshot comes from a product (called Glance) that we initially developed for internal use, because a lot of the people at our company are, well, addicted to data. I guess we got tired of looking at numbers at some point, so we decided to spruce things up a little. Let’s walk through an example, shall we?
In my case, i’m interested in code metrics. Well, some of them at least. The following picture shows a high-level view of the status (or health) of some code metrics for a particular project:
Their names aren’t being shown correctly yet, but the metrics that are shown are the following: Association Between Classes, Cyclomatic Complexity, Depth Of Inheritance, Efferent Coupling and Number Of Lines Of Code. Notice their colors. They can range from green (which is good) to red (which is bad). Again, i’m looking at these metrics on a project level, so the colors that are shown here represent the average status of each specific metric for the entire project. In this particular image, you can see that the Cyclomatic Complexity color is starting to turn a bit yellowish. When you hover over the metric, you get to see the following:
As the description of this particular metric says, the goal is to get a value of 0.2, which would result in a status of +1. A value of 0.4 would be neutral, so that’s a status of 0, and anything greater than 0.6 would be -1. The color ranges from red to green based on the status (-1 = red, +1 = green). For this particular project, the average value of all types is 0.5649. This is sort of confusing, because that value is close to the 0.6 threshold, so you’d think the color should be close to red. But the average status is 0.3587, and that is what the color is based on. If the average value is close to red yet the average status is green, it means that there are probably some classes with a very high value (for this metric, that’s obviously very bad) and they have a heavy influence on the average value. The average status, however, is much less affected in this case, because the status of a single class can never go below -1 no matter how high its value is. At this point, we’re still not quite sure whether the color should be based on the average status or the average value, but that’s a minor detail which isn’t important for this post.
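To make the difference between the two averages concrete, here is a minimal Python sketch. It assumes the status is a straight linear interpolation between the thresholds quoted in the metric description (0.2, 0.4 and 0.6) and that it is clamped to the -1..+1 range. That mapping, the function name `status_for` and the sample values are assumptions for illustration only, not how Glance actually computes things.

```python
from statistics import mean


def status_for(value, target=0.2, neutral=0.4, worst=0.6):
    """Map a raw metric value to a status in [-1, +1]; lower values are better.

    Assumed mapping: linear between the thresholds, clamped at both ends.
    """
    if value <= neutral:
        status = (neutral - value) / (neutral - target)
    else:
        status = -(value - neutral) / (worst - neutral)
    return max(-1.0, min(1.0, status))


# Hypothetical per-class values: four healthy classes and one extreme outlier.
values = [0.2, 0.2, 0.2, 0.3, 1.9]

average_value = mean(values)                           # dragged up by the outlier
average_status = mean(status_for(v) for v in values)   # outlier counts as -1 at most
status_of_average = status_for(average_value)          # what the average value alone suggests

print(f"average value:     {average_value:.2f}")
print(f"average status:    {average_status:.2f}")
print(f"status of average: {status_of_average:.2f}")
```

With these made-up numbers the average value lands near the red threshold while the average status stays clearly positive, which is the same pattern as in the screenshot: one extreme class moves the average value a lot, but it can only contribute -1 to the average status.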
Anyway, when i click on the metric i get the following screen:
This shows a circle for each top-level namespace in the project. Each circle is colored based on the average status of the types within that namespace. As you can see, some of them are green, and some are yellow. What you don’t see in this screenshot is that those circles are all floating around the center circle (which represents the parent level of this view) in a 3D view. I rotated the view to make it as clear as possible where the circles are located. When you hover over one of the circles, you’ll again see the value and the status of the namespace that is represented by that circle.
Obviously, you can do this with each circle in the view:
This namespace isn’t even close to green, so we really ought to check out what’s going on. When i click on it, i drill down into the data again, and i get to see the namespaces and classes that are in the EMS.Silverlight.Common namespace.
The EMS.Silverlight.Common.Controls.Tooltip namespace seems to be in the worst shape, so i click on it again.
Oops. We have two classes that are pretty red. I guess we’ve identified a good candidate for refactoring.
To wrap up the screenshot bonanza i’ve got going on here, here’s a picture of the Lines Of Code metric in a large web application. It clearly shows that there are 4 parts that we really need to look at for large classes, while the rest of it doesn’t really warrant any attention at this time.
Now, as cool as those screenshots are (at least, i certainly think so) they aren’t really the coolest part of all this. First of all, you might be wondering “who on earth goes to all this trouble to develop a tool to look at code metrics like this?”. Well, as a company we are kinda “out there” but we’re definitely not that far “out there”. It would be ridiculous to develop something like this purely to look at code metrics.
Instead, we can use this for pretty much any kind of data. The tool doesn’t even know it’s showing code metrics. It is just showing data from a data warehouse where we’ve defined dimensions and hierarchies, like Assembly -> Namespace -> Type in the above example, to define the context for some Key Performance Indicators (KPIs). The actual code metrics come from nightly NDepend builds that we run for each of our projects. The data is imported into our warehouse, and then we can look at it with this tool thanks to the KPIs that we’ve defined.
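As a rough illustration of the idea, the following Python sketch rolls a per-type status up an Assembly -> Namespace -> Type hierarchy and lists the children of a selected member, worst first, which is roughly what clicking on a circle does. The rows, the names in them, and the `roll_up` and `drill_down` helpers are all hypothetical; the real tool reads from a data warehouse and its implementation is not shown in this post.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical fact rows: (assembly, namespace, type, status in [-1, +1]).
ROWS = [
    ("App.Core", "App.Core.Models", "Customer", 0.9),
    ("App.Core", "App.Core.Models", "Invoice", 0.7),
    ("App.Core", "App.Core.Services", "BillingService", -0.4),
    ("App.Web", "App.Web.Controls", "Tooltip", -0.8),
    ("App.Web", "App.Web.Controls", "Grid", 0.2),
]


def roll_up(rows, depth):
    """Average status per hierarchy member at `depth` (1 = assembly, 2 = namespace, 3 = type)."""
    groups = defaultdict(list)
    for *path, status in rows:
        groups[tuple(path[:depth])].append(status)
    return {key: mean(statuses) for key, statuses in groups.items()}


def drill_down(rows, parent):
    """Direct children of `parent` (a path tuple) with their average status, worst first."""
    below_parent = [row for row in rows if tuple(row[:len(parent)]) == parent]
    children = roll_up(below_parent, len(parent) + 1)
    return sorted(children.items(), key=lambda item: item[1])


# Start at the top level, then follow the worst member down to the types.
for path, status in drill_down(ROWS, ()):
    print(path, round(status, 2))
for path, status in drill_down(ROWS, ("App.Web", "App.Web.Controls")):
    print(path, round(status, 2))
```

Nothing in the sketch knows that the rows describe code. Swap the three levels for, say, customer, project and invoice, and the same roll-up applies.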
So basically, pretty much everything is possible as long as you can get the data in your warehouse and define some KPIs for that data. I only showed you code metrics in this post, but we use it for a lot more than that. Some examples include Holiday Acceptance (which manager is the fastest or slowest in approving holidays), Aging Invoices (which customers are always late with payments), Project Revenue (which projects do we make the most money on), Issues (who fixes bugs the fastest? who creates the most or the fewest bugs?), etc.
And you really can get all of that data from everywhere… in our case, we get a lot of data from our EMS system and obviously from Genesis as well. We also get data from NDepend (as you’ve seen in this post), from TeamCity (anything build-related), Subversion, whatever you want basically. Obviously, you’re not limited to looking at the data in the ‘circle view’. You can get traditional graphs as well. For instance:
This is a global view of all requested changes for all of our projects (both commercial projects for customers and our internal applications).
The following picture shows the average lead time for issues that we had in 2009 based on the type of issue:
Finally, an overview of the time we spend on global task types:
Note that we do very little up-front design, and our continuous/evolutionary design is obviously a part of our regular Development tasks. And i’m sure you can understand that this kind of data is very helpful when we need to make estimations for new projects. We have a very good view on the amount of overhead that we typically have for project management, for instance.
One of the biggest benefits (IMO anyway) is that with the way the data is visualized (especially with the circle view), everyone can easily look at the data and see where more attention is required. I don’t know a lot about finances but i know that red circles are bad. My boss doesn’t know anything about good code, but even he now has a pretty good view on the quality of our code and he actually likes coming up to some of our guys to say “wow, class X in your project really could use some attention huh?”.
And today i heard that one of our customers who’s already using this is apparently learning quite a lot about their business because of this tool. They also put a lot of data in their warehouse, and because of what Glance is showing them, they are learning where they need to improve their business processes before those things really become a problem. The process is actually pretty simple. If it’s green, you don’t really need to spend time on it. If it’s turning yellow or red, you probably want to investigate things and try to improve the situation before it really starts causing problems. In the future, we’re also going to add automated notifications to it. So once things start going bad, you’ll be notified of them automagically.
In short, you can find out what’s really going on in your organization and focus your efforts on where it truly makes sense to do so. And that, my friends, is what it’s all about.
January 20th, 2010 at 12:32 pm
Wow!
This application really seems helpful, and it looks absolutely amazing!
Congratulations for the team!!
January 20th, 2010 at 3:26 pm
Looks very nice.
One remark: red-green colors are about the worst colors to use as ends of a color scale for people who are color-blind. To me that second image looks like 5 identical green spheres.
January 20th, 2010 at 4:24 pm
@Filip
Good point.
Still, when you hover over a sphere you get the status indicator gauge, which is pretty obvious about the health of the KPI, even to people who are color-blind.
And in the drill-down mode the distance between a sphere and the center gives you the same information as the color does. The farther away from the center, the worse…
January 20th, 2010 at 7:11 pm
It’s really nice to see it in action like that. It’s hard to imagine a cooler application to work on during an internship…
January 21st, 2010 at 6:24 am
It’s awesome. BTW, what does placement of dots signify?
January 21st, 2010 at 9:35 am
@Dhananjay
In his post Davy is following a drill-down path from Assembly, over Namespace to Type (=Class).
The first horizontally aligned series of dots (or spheres, because they actually are 3D objects) represent the different KPIs. When you click on one of them you start drilling down a dimension hierarchy. The selected sphere becomes the central one and it will be surrounded by a bunch of spheres that represent Assemblies. When you click on such an Assembly, that one becomes the central one and it will be surrounded by the different Namespaces it contains. When you click on a Namespace that one becomes the central one… (can you see where this is going?)
When a sphere is located far away from the central one, that means it is in bad shape. When it is placed near the central one, it’s in good shape.
On a side note: We are also testing out an algorithm for determining correlations between different KPI values (based on Principal Component Analysis). There, the distance will mean the level of influence a surrounding sphere has on the central one (closer means a higher influence in that case).
January 21st, 2010 at 10:11 am
Awesome!
NDepend v3, fully integrated in VS and with real-time analysis and CQL rule checking, will be publicly released next week.
Davy, we’ll see how Glance will fit into all these.
January 21st, 2010 at 2:39 pm
@Filip
That second image looks like 5 similar green spheres to me too, and I’m not color blind.
I guess they really are 5 green spheres
January 21st, 2010 at 2:44 pm
@Zecc
they are shades of green, some a little bit more yellow than the others
January 21st, 2010 at 3:20 pm
This looks very promising and beautiful. But as Filip said, it is very frustrating for those of us who have some kind of colour blindness. I basically see two colours here, one red and the other which is what you call green and yellow. You really should make that configurable.
February 2nd, 2010 at 10:10 am
@Den
Thanks for clarifying this for me. I follow your point.