Code Quality

Three Strikes And You Refactor

8 commentsWritten on May 6th, 2012 by
Categories: Code Quality

The first time you do something, you just do it. The second time you do something similar, you wince at the duplication, but you do the duplicate thing anyway. The third time you do something similar, you refactor.

That's a quote from Don Roberts, found in Martin Fowler's Refactoring book (page 58). That quote didn't really make me think twice at the time I read it, but it got stuck in my head later on when I heard it from someone who's always been very influential to me. When this person told me that this was one of his favorite rules, I was still in my "no way dude, all code needs to be as clean as it can be" phase. But it's really grown on me over the last few years. Now, just to avoid any confusion: this post is about my interpretation of 'Three Strikes And You Refactor', not about what may have been originally intended or what some people think of it now.

The most important thing I remember when learning about TDD was:

  1. Write a failing test

  2. Do the simplest thing that could possibly work to make the test pass.

  3. Refactor.

With all the literature that's out there, as well as popular opinion on how important clean code is, it's tempting to go overboard with that 'refactor' step. It's tempting to go for the solution that is crispy clean. The solution that'll have your coworkers or peers go "ohhh, now that is nice". However, you really need to ask yourself the following question: "is it really worth it?". Wanting to make sure that every piece of code is perfect takes up a lot of time, and I've learned that it certainly isn't always worth it. Besides, it's quite easy to go off on a path of introducing new concepts and abstractions to your code that end up being totally unnecessary. If this increases complexity in parts that don't really matter, it ends up being a huge waste. Not only for the person who wrote it, but for every person who has to read it and comprehend it later on.

'Three Strikes And You Refactor' is a helpful rule of thumb here. Try to keep things as simple as possible and don't be afraid of a little duplication here or there. Once you find yourself (or a teammate) doing the same thing for a third time, it's worth cleaning up because that third time very often means that it wouldn't be the last time either. This way, you only introduce concepts and abstractions that are really needed, and enables you to avoid adding unnecessary cruft to the code.

What’s The Point Of Using WCF In A Web App?

38 commentsWritten on March 18th, 2012 by
Categories: Architecture, Code Quality, Performance, WCF

A very common approach of building web applications in .NET is to put most of the non-UI related code behind an internal WCF service layer. I used to be a fan of this approach as well, but these days I just don't see the benefit of that internal service layer anymore. The overhead that an internal WCF service layer adds to development, deployment and runtime performance just doesn't stack up favorably to the supposed benefits IMO. To be clear: I'm talking about WCF services that will only be used by the front end of your web application.

Let's talk about the overhead on development first. If you're using WCF services in your web app, you need proxies to access those services. Some people prefer to generate the proxies based on the WSDL of the services that will be used. In the worst case, this leads to regenerating proxies and all of the types that are defined in the WSDL every time you change a service contract or one of the types that are used by the services. If multiple people need to make changes to any of these concurrently, this easily leads to merging problems when people need to commit their changes. Another way is to share the same types on both sides (client & server), and implement your service proxies by inheriting from ClientBase and manually keeping the implementation of the proxies up to date with the definitions of their service contracts. This is better than regenerating a bunch of code all the time, but you're still writing a lot of redirection code for the purpose of, well, what exactly? Another possibility is to use dynamic proxies which automatically implement the service contracts but this increases the amount of infrastructure code you need to put in place and it's not always clear to everyone how exactly communication with the services happens. There's also a lot of WCF configuration for each service that you need to maintain, and it can quickly grow unwieldy.

Then there's the overhead on performance. I hope we can all agree that any operation that goes out of process is at least an order of magnitude slower than a similar operation that can be executed in process. First of all, there's the networking overhead (even if your services are hosted on the same machine as the web app) that you have to keep into account. Secondly, there is the cost of serializing and deserializing everything that is transferred between the client and the server. Even with the most efficient bindings and serializers, the cost of all of this quickly adds up on high-traffic web apps. That's not to say that WCF services are inherently slow. They can be very fast and efficient, but they'll never be as fast and efficient as executing that logic in process within the web app.

Finally, there's the extra overhead it introduces to the deployment phase:

  • more endpoints to set up and transfer artifacts to
  • more configuration
  • more monitoring of endpoints
  • more servers if you're not hosting the services and the web app on the same machine

Of course, people will argue that there a plenty of benefits to using a WCF service layer in a web app. The ones I hear about most often are the forced separation of business logic and UI logic and improved scalability and reliability. I really disagree that you need a physical separation of business and UI logic. I much prefer approaches where the separation is based on abstractions. A good example was recently posted by Ayende (here and here). And when it comes to scalability/reliability, a web app that isn't dependent on a WCF service layer is as easy (or even easier depending on your setup) to scale than one that is entirely dependent on WCF services. First of all, if you care about scalability/reliability your web app should already be prepared to run behind a load balancer. If you already have a load balancer in place, you can just add more web servers to your setup when needed. If you'd host the WCF services on the same machines that are hosting the web front end, you'd get less total throughput from one server than you would if that one server could just host a web app that fully executes in process (not including the database obviously). If you're hosting the WCF services on separate machines, you'd end up with more servers to handle the load and to achieve the reliability you need than you would with just being able to add more web servers to your setup. That also increases your licensing costs. And of course, it also means increased networking overhead on every service call, which also implies that the threads on your web servers will be blocked for longer periods while they wait for those service calls to return. Unless you're calling those services asynchronously, but most people simply don't. Also, if you have serious scalability and reliability requirements you're probably better off with asynchronous messaging solutions than with SOAP services.

WCF has its benefits (though I prefer Web API's or asynchronous messaging over SOAP services these days) and it has its use cases. I just don't think internal service layers for web apps is one of them.

What are the benefits that you think an internal WCF service layer brings to your web app? And what's your opinion on how they stack up versus the downsides?

DTO’s Should Transfer Data, Not Entities

10 commentsWritten on February 19th, 2012 by
Categories: Architecture, Code Quality

I've read a couple of posts recently where the authors were complaining about excessive mapping to DTO's and whether or not it's worth it. I've been a fan of DTO's for a long time, but I'm not a fan of how I frequently see them being used. More specifically, I very much dislike the all-too-common approach of creating a DTO for each entity in your domain model. In the worst cases, the DTO's actually reference each other and try too hard to mimic the structure of the entities. For example, an OrderDto which has a reference to a CustomerDto and a collection of OrderLineDto instances. In those cases, I absolutely agree that all of the mapping involved introduces way too much ceremony to the codebase without really offering any concrete benefits.

Of course, it really depends on the architecture of your system. If you're using services that are only used by the front-end of your system, then I'd still advise against making those entities available outside of the service boundary for reasons that I've discussed earlier. That doesn't mean that you should go the entity-mimicking-DTO-route though. In fact, entity-mimicking-DTO's introduce a few of the same downsides you'd get from exposing entities directly through your services. For an 'internal' service (i.e. only used by your application), I think it makes a lot more sense to use DTO's per service operation which are optimized for the use case that that service operation is meant to implement.

Essentially, an internal service's operations will be either queries or commands. When it comes to queries, why not just do the simplest thing possible? In most cases, it's sufficient to just return the data in the exact same form that the data will be displayed in. Quite often, that means a denormalized set of data where the data comes from more than one type of entity/table. There's no reason to send a bunch of entity-mimicking DTO's that reference each other to the client if you're going to use the data to populate a grid. Just send a list of DTO's which are already in the most optimal form. Populating those DTO's can then simply be done through straight SQL, a view, a stored procedure, a projection through your ORM, a map/reduce operation, or whatever else that makes sense. The point is that in most cases, you should just populate those DTO's as directly as you can instead of mapping to those DTO's. In this case, the DTO's offer a clear-cut benefit and don't really introduce tedious ceremony code in your codebase.

As for the commands (inserts, updates, anything that creates something or results in something happening), why even use DTO's in the first place? People will often argue that the ability to reuse the entity-mimicking-DTO's for these operations is a benefit. Personally, I don't really see the benefit. The mere presence of entity-mimicking-DTO's only encourages people to use them for the queries as well. Instead, I prefer to go with types that encapsulate all of the data relevant to the current command (the data to be inserted/updated, or whatever else the command needs). In a very simple scenario, this could be an InsertCustomerCommand type which simply has properties for the data that needs to be inserted. Nobody will be confusing these types with any other purpose than what their name communicates.

If you're building a web app where the server-side code does everything in process (i.e. without a WCF service), then you can in many cases just use the entities directly without a real drawback, though I'd recommend keeping an eye on unexpected select N+1 problems, since those will frequently come up when you're preparing your views. But even in this case, using DTO's that are optimized for the scenarios in which they're being used can really simplify the query-side of your system a lot.

There's nothing inherently wrong with DTO's if you use them to simply transfer data. If you're using them to transfer entities, you're robbing yourself of their biggest benefit while only making things more complex than they need to be.

The Non-Typical .NET Job

I recently referred to an interesting .NET job as a 'non-typical .NET job'. I hadn't used that term yet up until that point, so I thought that was rather interesting. But what exactly do I mean with 'non-typical .NET job'? It's pretty simple really: a job where you're using .NET technology without blindly following the guidelines, recommendations and software from Microsoft on how to develop software on the .NET platform. It basically means that you'll use whatever you think is most appropriate for what you're trying to do.

The biggest problem in the .NET world is that most companies that do .NET development just stick to what Microsoft tells them to use and how to use it. Many .NET developers largely focus on that, because they know all too well that it increases their odds of getting hired. And let's face it: Microsoft has a solution for practically everything. The only problem is that those solutions are rarely the best in what they're trying to solve. But hey, no manager gets fired for going with Microsoft, right?

The result is that there are too many companies and too many developers that focus only on what Microsoft offers. But there's a lot more to software development than what Microsoft offers, or even knows about. There are countless examples of Microsoft being late to whatever technical party is interesting at the time. And when they show up, they certainly don't always make a good impression.

If you're the kind of developer that likes to learn from what other software development communities are doing, odds are high that you're screwed. There is an interesting OSS community within the .NET world, and they frequently produce great solutions, quite often based on succes stories coming from other development communities. The problem is not that .NET developers don't have great solutions available to them. The problem is that the majority of them simply don't know about them only because there hasn't been any Microsoft hype about it, or that the devs who do know about it aren't allowed to use it because their managers are sceptical about it, most likely also because there's no Microsoft backing for the technology or architectural style that is being proposed.

I'm not advocating the avoidance of Microsoft products or solutions. By all means, use Microsoft products if they are indeed the best solution to your problem. But do be aware of the things that are getting attention outside of the Microsoft sphere and use them when it makes more sense to use them. That's the essence of the 'non-typical .NET job' and that's exactly what makes it interesting: using the right tool for the right job.

The Worst Code I Ever Wrote, And Why I’m Still Happy About It

11 commentsWritten on September 18th, 2011 by
Categories: Code Quality, Software Development

My last 2 years in high school, I was lucky enough to have 13 hours of programming-related classes every week. Those classes covered a few languages, but the most of it was spent on C/C++ (and nothing advanced either… mostly basic stuff) and Visual Basic. We thought we learned a lot, and I guess from a purely syntactical point of view that may have indeed been true. But we didn't really learn a lot about the differences between good code and bad code.

We mostly cared about getting our assignments working. When you completed an assignment, you moved on to the next one and none of the code of the previous assignment really mattered anymore. Maintainability or even readability were things that never even occurred to us. Maybe a few of the teachers tried to tell us about it at some point, but it certainly didn't stick.

In the final year, we could pick a project to work on for the majority of the year. There were no limits on the assignment, so most people picked something that interested them. I was pretty interested in manager/simulation games, so I figured I'd just build my own Formula 1 manager game. The player would get full control over a F1 team, which in my game meant:

  • picking the suppliers for tires, fuel and engines, all of which had varying levels of quality and costs associated with it
  • hiring a chief-designer, chief-mechanic and drivers, all of which had a specified talent level and negotiations took the results of the team's previous year into account, as well as the results of the potential hire's previous team, which influenced prices and negotiation tactics.
  • landing sponsors, some of which were more than happy to sponsor your team depending on last year's results, or wouldn't even answer your calls and they all had varying budgets as well

I came up with a pretty good formula to simulate realistic race results based on all these factors as well as a dose of luck, or bad luck. If you played for a few seasons, the decisions you made all added up nicely, and gradually. When the project was due, it all worked. Since I loved playing those kind of games, I tested it very extensively by playing it regularly. When issues came up, I'd fix them. When I thought of improvements, I implemented them. I delivered something that worked, and that was large enough in scope and complexity to be impressive for a school project.

But the codebase was truly atrocious. I wasn't using proper structures for either data or behavior and I ended up with a shitload of extensive multi-dimensional arrays. All of my types were in those arrays, as I just didn't know yet that they should've been types of their own. Of course, I listed all of the indices on pieces of paper so I'd know which data properties corresponded with each index. Behavior was all implemented in the UI layer. I was using VB5 at the time, which didn't really encourage me to seek out nicer ways to structure my code. I copy/pasted large gobs of code whenever I needed to reuse something. And of course, whenever I had to make a change, I needed to do it in multiple places. Quite a few places in most cases even. I was using flat files to store data, so I didn't have the opportunity to make a horrendous mess out of a SQL-based data layer but if I had used a database, my usage of it would've surely been epic.

I got a very good grade for the project, and while I was happy to see that all of my hard work was rewarded, it did kinda scare me that a codebase that was so brittle and so painful could actually work, and work well even. It was my first encounter with what is by far the biggest problem with software development to this day: if you keep adding code, sooner or later it might actually work. Luckily, this was just a school project. But can you imagine if this were a project done by a professional company? That codebase would've cost money to develop, and the company would actually have a working project, no matter how expensive it would've been to keep working on it. At the time, I couldn't imagine a company in such a situation that would say: "ah, it sucks… we need to start over". And I think we all know that very few companies would actually do that.

I love that horrendous codebase for the lessons it taught me so early on. I learned a lot about what you shouldn't do when you're developing something that's supposed to stick around for a while, and I'm glad I learned it without it costing a lot of money. Of course, that doesn't mean that every single thing I've written since is all roses and peaches, but it put me on the path of trying to continuously improve as a developer, not only in what I was able to do with code, but also as to how clear I could make it to others.

Note: I no longer have that codebase unfortunately. Really wish I did, if only because it'd be pretty funny going through it after all these years.