Microsoft And Open Source: Hoping For Better Collaboration

4 comments | Written on April 9th, 2012
Categories: Opinions

By now, you've probably all heard that Microsoft is moving to an open development model for ASP.NET MVC and some other ASP.NET projects. Even though the source code of ASP.NET MVC has always been available under an open source license, its development followed a closed development model. This meant that outside contributions weren't possible, nor were we able to follow the actual commits in the MVC source code repository. With the recent announcements, this is no longer the case and I think this is fantastic news. It finally enables collaboration between Microsoft employees and people outside of Microsoft on a strategically important Microsoft product. This is good for Microsoft as well as the open source .NET community.

I hope that this newfound appreciation for Open Source within Microsoft will lead to another huge improvement in collaborative development in the open source .NET community. While Microsoft is now open to accepting contributions from the community, it would be a tremendous step forward if Microsoft would also contribute to other prominent open source .NET projects in the future. In the past, we've seen numerous open source .NET projects become popular and widely used. Unfortunately, Microsoft responded to some of those projects by producing its own libraries and frameworks that basically do the same thing, except that most of them never quite matched the quality of the open source projects they were inspired by. If only all of the effort spent on duplicating existing libraries had been spent on improving what was already there, the entire .NET community would've been better for it.

I'd love to see a Microsoft that works with open source developers and encourages them, instead of trying to duplicate their efforts whenever it feels the need to provide its own library or framework for something that's already covered by a superior open source alternative. These duplicated projects only alienate people who at one point were passionate enough about the .NET platform to work on improving it for free, in their spare time. These are the people that Microsoft needs to cherish and nourish instead of competing with them. Microsoft has shown some interesting signs of better understanding of open source development and collaboration in the past year or so. Here's to hoping they take that critical next step as well.

What’s The Point Of Using WCF In A Web App?

38 comments | Written on March 18th, 2012
Categories: Architecture, Code Quality, Performance, WCF

A very common approach to building web applications in .NET is to put most of the non-UI related code behind an internal WCF service layer. I used to be a fan of this approach as well, but these days I just don't see the benefit of that internal service layer anymore. The overhead that an internal WCF service layer adds to development, deployment and runtime performance just doesn't stack up favorably against the supposed benefits IMO. To be clear: I'm talking about WCF services that will only be used by the front end of your web application.

Let's talk about the overhead on development first. If you're using WCF services in your web app, you need proxies to access those services. Some people prefer to generate the proxies based on the WSDL of the services that will be used. In the worst case, this leads to regenerating proxies and all of the types that are defined in the WSDL every time you change a service contract or one of the types that are used by the services. If multiple people need to make changes to any of these concurrently, this easily leads to merging problems when people need to commit their changes. Another way is to share the same types on both sides (client & server), and implement your service proxies by inheriting from ClientBase and manually keeping the implementation of the proxies up to date with the definitions of their service contracts. This is better than regenerating a bunch of code all the time, but you're still writing a lot of redirection code for the purpose of, well, what exactly? Another possibility is to use dynamic proxies which automatically implement the service contracts but this increases the amount of infrastructure code you need to put in place and it's not always clear to everyone how exactly communication with the services happens. There's also a lot of WCF configuration for each service that you need to maintain, and it can quickly grow unwieldy.

Then there's the overhead on performance. I hope we can all agree that any operation that goes out of process is at least an order of magnitude slower than a similar operation that can be executed in process. First of all, there's the networking overhead (even if your services are hosted on the same machine as the web app) that you have to take into account. Secondly, there is the cost of serializing and deserializing everything that is transferred between the client and the server. Even with the most efficient bindings and serializers, the cost of all of this quickly adds up on high-traffic web apps. That's not to say that WCF services are inherently slow. They can be very fast and efficient, but they'll never be as fast and efficient as executing that logic in process within the web app.
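To make the extra work concrete, here's a minimal sketch in Python rather than WCF. Everything in it is made up for illustration: `CustomerService`, `FakeChannel` and `CustomerServiceProxy` are hypothetical names, JSON stands in for whatever serializer a real binding would use, and the "channel" is just an in-memory object rather than a network hop. It only shows the shape of the work: the same call, once made directly and once forwarded through hand-written proxy code that has to serialize and deserialize on both sides.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Customer:
    id: int
    name: str


class CustomerService:
    """The actual logic, callable directly when it runs in process."""

    def get_customer(self, customer_id: int) -> Customer:
        return Customer(id=customer_id, name=f"Customer {customer_id}")


class FakeChannel:
    """Stands in for the service boundary: only bytes may cross it."""

    def __init__(self, service: CustomerService) -> None:
        self._service = service
        self.bytes_transferred = 0

    def send(self, request: bytes) -> bytes:
        self.bytes_transferred += len(request)
        payload = json.loads(request)  # server side deserializes the request
        result = self._service.get_customer(payload["customer_id"])
        response = json.dumps(asdict(result)).encode()  # server side serializes the response
        self.bytes_transferred += len(response)
        return response


class CustomerServiceProxy:
    """Hand-written redirection code: one forwarding method per operation."""

    def __init__(self, channel: FakeChannel) -> None:
        self._channel = channel

    def get_customer(self, customer_id: int) -> Customer:
        request = json.dumps({"customer_id": customer_id}).encode()  # client serializes
        response = self._channel.send(request)
        return Customer(**json.loads(response))  # client deserializes


service = CustomerService()
channel = FakeChannel(service)
proxy = CustomerServiceProxy(channel)

direct = service.get_customer(42)  # in process: a plain method call
remote = proxy.get_customer(42)    # same answer, after serializing and deserializing both ways
```

Both calls return an equal `Customer`, but the proxied one had to build a request, push bytes across the boundary and rebuild the object afterwards, and a real service adds networking on top of that.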

Finally, there's the extra overhead it introduces to the deployment phase:

  • more endpoints to set up and transfer artifacts to
  • more configuration
  • more monitoring of endpoints
  • more servers if you're not hosting the services and the web app on the same machine

Of course, people will argue that there are plenty of benefits to using a WCF service layer in a web app. The ones I hear about most often are the forced separation of business logic and UI logic, and improved scalability and reliability. I really disagree that you need a physical separation of business and UI logic. I much prefer approaches where the separation is based on abstractions. A good example was recently posted by Ayende (here and here).

And when it comes to scalability/reliability, a web app that isn't dependent on a WCF service layer is as easy to scale as (or even easier than, depending on your setup) one that is entirely dependent on WCF services. First of all, if you care about scalability/reliability your web app should already be prepared to run behind a load balancer. If you already have a load balancer in place, you can just add more web servers to your setup when needed. If you host the WCF services on the same machines that host the web front end, you get less total throughput from one server than you would if that server just hosted a web app that fully executes in process (not including the database, obviously). If you host the WCF services on separate machines, you end up needing more servers to handle the load and achieve the reliability you need than you would if you could simply add more web servers to your setup. That also increases your licensing costs. And of course, it also means increased networking overhead on every service call, which in turn means that the threads on your web servers will be blocked for longer periods while they wait for those service calls to return. Unless you're calling those services asynchronously, but most people simply don't. Also, if you have serious scalability and reliability requirements you're probably better off with asynchronous messaging solutions than with SOAP services.
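As a rough illustration of what separation based on abstractions can look like (this is my own minimal Python sketch, not the example from Ayende's posts, and `OrderQueries`, `InProcessOrderQueries` and `render_order_page` are hypothetical names), the UI code below depends only on an interface, while the business logic behind it runs in the same process:

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class OrderSummary:
    order_id: int
    total: float


class OrderQueries(Protocol):
    """The abstraction the UI layer depends on."""

    def summaries_for(self, customer_id: int) -> list[OrderSummary]: ...


class InProcessOrderQueries:
    """Business logic living in the same process as the web front end."""

    def __init__(self, orders: dict[int, list[OrderSummary]]) -> None:
        self._orders = orders

    def summaries_for(self, customer_id: int) -> list[OrderSummary]:
        return self._orders.get(customer_id, [])


def render_order_page(queries: OrderQueries, customer_id: int) -> str:
    """UI code: it only knows about the OrderQueries abstraction."""
    summaries = queries.summaries_for(customer_id)
    lines = [f"Order {s.order_id}: {s.total:.2f}" for s in summaries]
    return "\n".join(lines) or "No orders"


queries = InProcessOrderQueries({1: [OrderSummary(order_id=10, total=99.5)]})
page = render_order_page(queries, customer_id=1)
empty_page = render_order_page(queries, customer_id=2)
```

The UI function never touches storage or business rules directly, so the separation is still enforced by the code structure, just without a process boundary in between.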

WCF has its benefits (though I prefer Web APIs or asynchronous messaging over SOAP services these days) and it has its use cases. I just don't think internal service layers for web apps are one of them.

What are the benefits that you think an internal WCF service layer brings to your web app? And what's your opinion on how they stack up versus the downsides?

Thoughts On Learning New Things

8 comments | Written on March 11th, 2012
Categories: Opinions, work/career

Jef Claes published an interesting post about his preferred way of learning new things. There's one part in the post that I don't entirely agree with:

I like to believe learning should be a hands-on activity as well. Basically, stop consuming, start producing. Don't get me wrong, I do think there is value in reading blog posts (I might be slightly biased on this one), reading books and watching videos, but I find that this value is marginal compared to what you gain by actually doing it.

Hands-on activity (producing) is certainly a very important part of any learning process, but I wouldn't go as far as saying that the value of reading books/blogs (consuming) is marginal compared to that of producing. In fact, I believe their value to be pretty equal. I've seen too many people who start producing simple things, think they've got a pretty good grasp of the technology they're using, and then move on to producing more complex or bigger things without actually knowing enough of the technology to support those more complex or bigger scenarios. The results certainly aren't always pretty and I'm sure each and every one of you has seen this scenario unfold with at least one developer you know. Probably more than that even.

I think in a lot of cases, people start the producing phase perhaps a bit too early and then, in their enthusiasm of seeing things working, sort of skip the more boring consuming that could've benefitted them a lot. Once you've started producing, you need to keep consuming regularly. A tremendously valuable part of any learning experience is getting feedback and insight from minds that have more experience with a given subject than yours. If you're lucky, you can get this from your coworkers. If you're not that lucky, you'll need to find other sources, and books, blogs, videos, user group meetings, etc. can be a great way to fill that void. And even if you do get to learn a lot from your coworkers, it never hurts to learn more from the experiences of others outside of your immediate circle, if only because their situations and constraints will differ from yours as well.

The other very important part about consuming is really getting to know the technology you're trying to learn. I've always found it very important to at least get an idea of how things work internally within a technology that I'm using. You certainly don't need to know all of the implementation details but just having an idea of it can really help you avoid a lot of problems once you need to use a technology in a more advanced way than in your initial experiments. Most importantly, it should give you better insights as to whether you're using the technology properly, which unfortunately isn't always the same as getting something working. And as a bonus, you'll probably learn about features you won't immediately need but knowing that they're there can save quite a bit of time and effort later on. Just imagine the improvement of the signal-to-noise ratios that you'd see on mailing lists, forums, and Stack Overflow if everyone took the time to get a better grasp of the technologies they're using.

When I start with learning new libraries or frameworks, I usually start off by reading most (and often, all) of the official documentation of the technology before I even get into building something myself. If I want to learn a new programming language I'll look for the most recommended books for that language and buy one (or more, if I wasn't satisfied with the first one). I won't even start using the language until I've gone through the book. Once I feel like I've got a pretty good theoretical grasp of the technology, I start building something with it. I also start looking for good blogs on the technology and subscribe to them. I'll also start following influential people of the technology on Twitter. And I just continuously try to soak up as much knowledge as I can from people who're doing more impressive things with the technology than I am. At first, you might not understand everything they're talking about but after a while, things just start clicking and you're getting a really good grasp of things. None of this is a substitute for learning from producing, but it certainly is an incredible addition to it. And one that makes a world of difference, IMO.

Architectural Drivers

7 comments | Written on February 26th, 2012
Categories: Architecture

There are many different opinions and preferences when it comes to how we should deal with software architecture. My personal preference is to try not to make things more complex than they should be, while striving to maintain enough flexibility to be able to deal with future changes in both functional and non-functional requirements in a way that hopefully doesn't require extensive effort. I've always thought of good architecture essentially being something that enables developers to keep implementing new features over time at a cost similar to what it took to implement features in the early stages of development. Thus I prefer to try to find a balance between simplicity of the codebase, and making sure that the architecture scales nicely (from a code quality point of view) as the application grows in functionality. I also like to make sure that we can scale out (from a performance/throughput point of view) relatively easily if the need comes up.

These are just my general preferences though, and they aren't always relevant to every project or suitable for every situation. I'm now in the 10th year of my career and have seen a variety of architectural styles in the wild, and of course I've read and heard about many more since the subject has always interested me a lot. I've learned that it's important to keep the architectural drivers of a project in mind when thinking about what kind of architecture is suitable for a certain project or situation. Those drivers are often more important than whatever your personal preferences are. I've come up with a list of drivers that I think should influence the architecture used by teams or organizations.

Simplicity/Complexity of the application

A common mistake is to decide to use a fashionable architectural style in a project that doesn't really have enough complexity to warrant the use of said architectural style. For instance, there's no reason whatsoever to go the DDD route if your application is mostly a forms-over-data CRUD app. You won't really get any benefit from it, and you'll only increase the total cost of the required development effort. On the other hand, an application which will feature a lot of behavior in a complex domain will certainly benefit from using a DDD approach versus a data-driven approach that implements the behavior in a more procedural, transaction-script based approach. Be honest with your customer/employer and yourself and make an objective decision based on the simplicity or the complexity of the required solution that you're being paid to build. Don't let enthusiasm for a fashionable architectural style influence your decision.

Expected lifetime of the application

Some applications live for years and years, and some are only temporary solutions that will be phased out a year or two after having been put in production. Of course, some of those temporary solutions end up being used much longer than what was originally expected due to a variety of reasons but that certainly isn't always a given. Unless you're familiar enough with how things typically go at your client or your company to predict these things, your best bet is to just go with the initially expected lifetime of the application. Does a temporary solution require the best possible architecture out there, and the resulting impact that would have on the cost to build it? I'd say it most often doesn't. In these cases, there's nothing wrong with going for the approach that is simply good enough for what is required. Striving for perfection is expensive, and generally not worth it for short-term solutions. Conversely, if the solution is expected to last a long time, a solid architecture becomes more important and is worth the extra investment to (try to) get it right if that makes it possible to keep the costs of long-term development under control.

Strategic importance of the application

Some applications are meant to improve the core business of a company, while others play a more supportive role in something that isn't actually part of the company's core business. Some applications generate revenue (directly or by improving efficiency of revenue-generating activities) while others are meant to reduce costs, typically administrative in nature. Cost reduction is important, but it's not quite as important as revenue generation and that's a factor that should be taken into consideration. A strategically important application almost always warrants putting in the effort to come up with a good architecture because it's highly likely that the application will have to evolve in whichever direction the business evolves. That's not to say that the same kind of effort wouldn't be important for cost-reducing applications. But it just might be less important than you'd like to think it is. Again, good architecture increases the cost of the project, and the return on that investment in the big picture of the company should not be ignored.

Skill-level/discipline of team

Good code and good architecture require skill and discipline. A strong team is capable of letting good architecture grow organically and keeping it at a high quality level. For a mediocre team, letting it grow organically is too often a recipe for disaster. Ideally, we'd always be working with strong teams but as we all know, that simply isn't always possible. For mediocre teams, it often makes sense to put architecture in place that is more restrictive in nature and where everyone knows how the code should be structured in advance. The downside is that this typically reduces flexibility and individual creativity, and often introduces more ceremony and indirection than what many of us would ideally like to see. But it's likely that those downsides are offset by having everyone on the same page and taking some possibly difficult decisions away from people who might not be strong enough to make the right decisions. I know that sounds harsh, but that's a reality that many of us have to deal with.

Conformity/Continuity

A lot of IT departments or software development companies prefer to use the same architecture and frameworks/libraries for most (or even all) of their projects because it makes it easier to have people work on multiple projects. In this situation, it's easier to move people between projects or have them do maintenance work on a project they weren't initially involved in. While this often prevents going with the most suitable approach for every project, it does introduce a few benefits that are hard to argue with. People will need less time to get familiar with a code base. In-house training and sharing of knowledge get easier. Bringing in new people (especially for temporary assignments) also gets somewhat easier because you have a baseline of required skills/knowledge that should go a long way within the organization. A huge downside, however, is that it creates an environment that will quickly frustrate creative developers. Also, when new strong developers are brought in, they could quickly tune out once they realize they got there too 'late' to have any influence on how projects are developed. The interesting part about this driver is that it is influenced by the skill-level/discipline driver, while simultaneously influencing that driver going forward.

Non-functional requirements

This is the last driver on the list, but it is certainly not the least important. Obviously, architecture is greatly influenced by some important non-functional requirements. There are many factors that can have a profound influence on what the most suitable architecture of a system could be, and it's important to think about those as early as possible. Do you need to respond in real time? Do you need to support mobile devices? Will there be connectivity issues? Will you need to scale massively? What are the auditing requirements? Are you dependent on third-party services? Do you need to minimize resource consumption? What kind of accessibility do you have to take into account? The list goes on and on and many of these issues can end up becoming huge problems if you don't think about them ahead of time. Of course, you can't think of everything in advance, and many non-functional requirements can be introduced during the lifetime of the application instead of being known in advance. But the more you know in advance, the better you can prepare your software for them from the beginning.

Conclusion

There is a virtually endless variety of architectural choices and styles that can influence how you develop software. And unfortunately, there's a lot of dogma surrounding it as well. I hope this post made it clear that there is no definitive 'right' or 'wrong' when it comes to architecture, and that (as with pretty much everything else in this business) it all really depends :)

I'm sure you can think of more architectural drivers, and I'd love to hear about any that you think I've forgotten.

DTO’s Should Transfer Data, Not Entities

10 comments | Written on February 19th, 2012
Categories: Architecture, Code Quality

I've read a couple of posts recently where the authors were complaining about excessive mapping to DTO's and whether or not it's worth it. I've been a fan of DTO's for a long time, but I'm not a fan of how I frequently see them being used. More specifically, I very much dislike the all-too-common approach of creating a DTO for each entity in your domain model. In the worst cases, the DTO's actually reference each other and try too hard to mimic the structure of the entities. For example, an OrderDto which has a reference to a CustomerDto and a collection of OrderLineDto instances. In those cases, I absolutely agree that all of the mapping involved introduces way too much ceremony to the codebase without really offering any concrete benefits.
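For readers who haven't run into this pattern, here's a small sketch of what it tends to look like, written in Python for brevity. The entity and DTO types, and the `to_*_dto` mapping functions, are hypothetical and only modeled on the OrderDto/CustomerDto/OrderLineDto example above:

```python
from dataclasses import dataclass, field


# Entities of a hypothetical domain model.
@dataclass
class Customer:
    id: int
    name: str


@dataclass
class OrderLine:
    product: str
    quantity: int


@dataclass
class Order:
    id: int
    customer: Customer
    lines: list[OrderLine] = field(default_factory=list)


# DTO's that mirror the entities one-to-one, references included.
@dataclass
class CustomerDto:
    id: int
    name: str


@dataclass
class OrderLineDto:
    product: str
    quantity: int


@dataclass
class OrderDto:
    id: int
    customer: CustomerDto
    lines: list[OrderLineDto]


# The mapping ceremony: one function per entity, copying field by field.
def to_customer_dto(customer: Customer) -> CustomerDto:
    return CustomerDto(id=customer.id, name=customer.name)


def to_order_line_dto(line: OrderLine) -> OrderLineDto:
    return OrderLineDto(product=line.product, quantity=line.quantity)


def to_order_dto(order: Order) -> OrderDto:
    return OrderDto(
        id=order.id,
        customer=to_customer_dto(order.customer),
        lines=[to_order_line_dto(line) for line in order.lines],
    )


order = Order(id=1, customer=Customer(id=7, name="Ada"), lines=[OrderLine("Widget", 2)])
dto = to_order_dto(order)
```

Every type exists twice and every field is copied by hand, yet the DTO graph has exactly the same shape as the entity graph it was supposed to shield callers from.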

Of course, it really depends on the architecture of your system. If you're using services that are only used by the front-end of your system, then I'd still advise against making those entities available outside of the service boundary for reasons that I've discussed earlier. That doesn't mean that you should go the entity-mimicking-DTO-route though. In fact, entity-mimicking-DTO's introduce a few of the same downsides you'd get from exposing entities directly through your services. For an 'internal' service (i.e. only used by your application), I think it makes a lot more sense to use DTO's per service operation which are optimized for the use case that that service operation is meant to implement.

Essentially, an internal service's operations will be either queries or commands. When it comes to queries, why not just do the simplest thing possible? In most cases, it's sufficient to just return the data in the exact same form that the data will be displayed in. Quite often, that means a denormalized set of data where the data comes from more than one type of entity/table. There's no reason to send a bunch of entity-mimicking DTO's that reference each other to the client if you're going to use the data to populate a grid. Just send a list of DTO's which are already in the most optimal form. Populating those DTO's can then simply be done through straight SQL, a view, a stored procedure, a projection through your ORM, a map/reduce operation, or whatever else makes sense. The point is that in most cases, you should just populate those DTO's as directly as you can instead of mapping to those DTO's. In this case, the DTO's offer a clear-cut benefit and don't really introduce tedious ceremony code in your codebase.
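Here's a minimal sketch of the straight-SQL variant, using Python and an in-memory SQLite database so that it's self-contained. The schema, the `OrderGridRow` type and the `fetch_order_grid` function are all assumptions made up for this example:

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderGridRow:
    """Shaped for the grid that displays it, not for any single entity."""

    order_id: int
    customer_name: str
    line_count: int
    total: float


def fetch_order_grid(conn: sqlite3.Connection) -> list[OrderGridRow]:
    # One denormalized query; each result row goes straight into a DTO.
    rows = conn.execute(
        """
        SELECT o.id, c.name, COUNT(l.id),
               COALESCE(SUM(l.quantity * l.unit_price), 0)
        FROM orders o
        JOIN customers c ON c.id = o.customer_id
        LEFT JOIN order_lines l ON l.order_id = o.id
        GROUP BY o.id, c.name
        ORDER BY o.id
        """
    )
    return [OrderGridRow(*row) for row in rows]


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
    CREATE TABLE order_lines (
        id INTEGER PRIMARY KEY, order_id INTEGER,
        quantity INTEGER, unit_price REAL
    );
    INSERT INTO customers VALUES (1, 'Ada');
    INSERT INTO orders VALUES (10, 1);
    INSERT INTO order_lines VALUES (1, 10, 2, 5.0), (2, 10, 1, 2.5);
    """
)
grid = fetch_order_grid(conn)
```

No entities are loaded and nothing is mapped from one object graph to another: the query result is already in the form the grid needs.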

As for the commands (inserts, updates, anything that creates something or results in something happening), why even use DTO's in the first place? People will often argue that the ability to reuse the entity-mimicking-DTO's for these operations is a benefit. Personally, I don't really see the benefit. The mere presence of entity-mimicking-DTO's only encourages people to use them for the queries as well. Instead, I prefer to go with types that encapsulate all of the data relevant to the current command (the data to be inserted/updated, or whatever else the command needs). In a very simple scenario, this could be an InsertCustomerCommand type which simply has properties for the data that needs to be inserted. Nobody will mistake the purpose of these types for anything other than what their names communicate.
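A very simple version of such a command type could look like the Python sketch below. Only the name InsertCustomerCommand comes from the text above; its fields, the `CustomerStore` stand-in and the `handle_insert_customer` function are hypothetical:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InsertCustomerCommand:
    """Carries exactly the data this one operation needs, nothing more."""

    name: str
    email: str


class CustomerStore:
    """In-memory stand-in for whatever persistence the application uses."""

    def __init__(self) -> None:
        self._rows: dict[int, dict[str, str]] = {}
        self._next_id = 1

    def insert(self, name: str, email: str) -> int:
        new_id = self._next_id
        self._rows[new_id] = {"name": name, "email": email}
        self._next_id += 1
        return new_id

    def get(self, customer_id: int) -> dict[str, str]:
        return self._rows[customer_id]


def handle_insert_customer(command: InsertCustomerCommand, store: CustomerStore) -> int:
    """Validates the command and performs the insert; returns the new id."""
    if not command.name.strip():
        raise ValueError("name is required")
    return store.insert(command.name, command.email)


store = CustomerStore()
new_id = handle_insert_customer(
    InsertCustomerCommand(name="Ada", email="ada@example.com"), store
)
```

Because the type is named after the operation, there's little temptation to reuse it as a query result or to grow it into a general-purpose customer DTO.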

If you're building a web app where the server-side code does everything in process (i.e. without a WCF service), then you can in many cases just use the entities directly without a real drawback, though I'd recommend keeping an eye on unexpected select N+1 problems, since those will frequently come up when you're preparing your views. But even in this case, using DTO's that are optimized for the scenarios in which they're being used can really simplify the query-side of your system a lot.
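In case the select N+1 problem is unfamiliar, this Python sketch imitates it with plain SQLite instead of a real ORM. The schema, the `CountingConnection` wrapper and both functions are hypothetical and exist only to count queries:

```python
import sqlite3


class CountingConnection:
    """Wraps a sqlite3 connection and counts the queries sent through it."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        self.query_count = 0

    def query(self, sql: str, params: tuple = ()) -> list[tuple]:
        self.query_count += 1
        return self._conn.execute(sql, params).fetchall()


def build_db() -> CountingConnection:
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
        INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Linus');
        INSERT INTO orders VALUES (10, 1), (11, 2), (12, 3);
        """
    )
    return CountingConnection(conn)


def names_lazy(db: CountingConnection) -> list[tuple]:
    """Mimics lazy loading: one query for the orders, then one per order."""
    orders = db.query("SELECT id, customer_id FROM orders ORDER BY id")
    return [
        (order_id, db.query("SELECT name FROM customers WHERE id = ?", (customer_id,))[0][0])
        for order_id, customer_id in orders
    ]


def names_joined(db: CountingConnection) -> list[tuple]:
    """Fetches the same data with a single joined query."""
    return db.query(
        "SELECT o.id, c.name FROM orders o "
        "JOIN customers c ON c.id = o.customer_id ORDER BY o.id"
    )


lazy_db, joined_db = build_db(), build_db()
lazy_result = names_lazy(lazy_db)
joined_result = names_joined(joined_db)
```

Both functions return the same rows, but the lazy version issues one query for the orders plus one per order, while the joined version issues a single query no matter how many orders there are.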

There's nothing inherently wrong with DTO's if you use them to simply transfer data. If you're using them to transfer entities, you're robbing yourself of their biggest benefit while only making things more complex than they need to be.