Archive for September, 2011

The Worst Code I Ever Wrote, And Why I’m Still Happy About It

11 comments. Written on September 18th, 2011
Categories: Code Quality, Software Development

My last 2 years in high school, I was lucky enough to have 13 hours of programming-related classes every week. Those classes covered a few languages, but most of the time was spent on C/C++ (and nothing advanced either… mostly basic stuff) and Visual Basic. We thought we learned a lot, and I guess from a purely syntactical point of view that may have indeed been true. But we didn't really learn a lot about the differences between good code and bad code.

We mostly cared about getting our assignments working. When you completed an assignment, you moved on to the next one and none of the code of the previous assignment really mattered anymore. Maintainability or even readability were things that never even occurred to us. Maybe a few of the teachers tried to tell us about it at some point, but it certainly didn't stick.

In the final year, we could pick a project to work on for the majority of the year. There were no limits on the assignment, so most people picked something that interested them. I was pretty interested in manager/simulation games, so I figured I'd just build my own Formula 1 manager game. The player would get full control over an F1 team, which in my game meant:

  • picking the suppliers for tires, fuel and engines, all of which had varying levels of quality and associated costs
  • hiring a chief designer, a chief mechanic and drivers, each with a specified talent level. Negotiations took into account the team's results from the previous year as well as the results of the potential hire's previous team, both of which influenced prices and negotiation tactics
  • landing sponsors, all with varying budgets: depending on last year's results, some were more than happy to sponsor your team while others wouldn't even answer your calls

I came up with a pretty good formula to simulate realistic race results based on all these factors as well as a dose of luck, or bad luck. If you played for a few seasons, the decisions you made all added up nicely, and gradually. When the project was due, it all worked. Since I loved playing those kinds of games, I tested it very extensively by playing it regularly. When issues came up, I'd fix them. When I thought of improvements, I implemented them. I delivered something that worked, and that was large enough in scope and complexity to be impressive for a school project.

But the codebase was truly atrocious. I wasn't using proper structures for either data or behavior and I ended up with a shitload of extensive multi-dimensional arrays. All of my types were in those arrays, as I just didn't know yet that they should've been types of their own. Of course, I listed all of the indices on pieces of paper so I'd know which data properties corresponded with each index. Behavior was all implemented in the UI layer. I was using VB5 at the time, which didn't really encourage me to seek out nicer ways to structure my code. I copy/pasted large gobs of code whenever I needed to reuse something. And of course, whenever I had to make a change, I needed to do it in multiple places. Quite a few places in most cases even. I was using flat files to store data, so I didn't have the opportunity to make a horrendous mess out of a SQL-based data layer but if I had used a database, my usage of it would've surely been epic.
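To make the contrast concrete, here is a minimal sketch of the two styles. The original was written in VB5 and no longer exists, so this uses JavaScript, and every name, column and number in it is hypothetical.

```javascript
// Style described above: everything lives in one multi-dimensional array.
// Hypothetical column meanings, kept "on paper":
// 0 = talent, 1 = salary, 2 = points scored last season.
var driversTable = [
  [87, 1200000, 45],
  [62, 300000, 3]
];

function canAffordByIndex(budget, row) {
  // The reader has to remember that column 1 is the salary.
  return driversTable[row][1] <= budget;
}

// The same data with a type of its own: the names carry the meaning,
// and the behavior lives next to the data instead of in the UI layer.
function Driver(name, talent, salary, pointsLastSeason) {
  this.name = name;
  this.talent = talent;
  this.salary = salary;
  this.pointsLastSeason = pointsLastSeason;
}

Driver.prototype.isAffordableWith = function (budget) {
  return this.salary <= budget;
};

var drivers = [
  new Driver('Driver A', 87, 1200000, 45),
  new Driver('Driver B', 62, 300000, 3)
];
```

Both versions answer the same question, but only the second one can be read without the piece of paper.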

I got a very good grade for the project, and while I was happy to see that all of my hard work was rewarded, it did kinda scare me that a codebase that was so brittle and so painful could actually work, and work well even. It was my first encounter with what is by far the biggest problem with software development to this day: if you keep adding code, sooner or later it might actually work. Luckily, this was just a school project. But can you imagine if this were a project done by a professional company? That codebase would've cost money to develop, and the company would actually have a working project, no matter how expensive it would've been to keep working on it. At the time, I couldn't imagine a company in such a situation that would say: "ah, it sucks… we need to start over". And I think we all know that very few companies would actually do that.

I love that horrendous codebase for the lessons it taught me so early on. I learned a lot about what you shouldn't do when you're developing something that's supposed to stick around for a while, and I'm glad I learned it without it costing a lot of money. Of course, that doesn't mean that every single thing I've written since is all roses and peaches, but it put me on the path of trying to continuously improve as a developer, not only in what I was able to do with code, but also as to how clear I could make it to others.

Note: I no longer have that codebase unfortunately. Really wish I did, if only because it'd be pretty funny going through it after all these years.

Why I Prefer JavaScript Over CoffeeScript

28 comments. Written on September 11th, 2011
Categories: JavaScript, Opinions

CoffeeScript has gotten pretty popular in the past 6 months or so. If you're not familiar with it, be sure to check out the language's website. It's basically a language that compiles to JavaScript. So you can write your client-side code in CoffeeScript, but still have the browser execute the resulting JavaScript. Or if you're using Node.js, you can just use CoffeeScript instead of JavaScript if you prefer. It doesn't matter, it's all JavaScript in the end.

To the many people who dislike JavaScript, either because of its syntax or its many pitfalls (or both), CoffeeScript is obviously great. They can get in on all the fun and progress that's happening in the JavaScript world, without having to write JavaScript themselves. Purely from a syntactical point of view, it's hard to argue against the merits and benefits of the language. I'm fond of Ruby's syntax, so instinctively CoffeeScript looks much more appealing to me than JavaScript. Despite the nice syntax and protection from JavaScript's pitfalls, I still prefer to code in JavaScript.

For starters, if you're doing JavaScript development (whether client-side or server-side), you're likely to use a variety of scripts or libraries written by people other than you. While some libraries are documented very nicely, others aren't. In many cases, I still find it very valuable to be able to read the source code of a library I'm using to find what I want to know. I've noticed that I often do this when trying to find out how to do something that I can't find in the documentation, or when I'm troubleshooting something. If you're not familiar with commonly used JavaScript patterns, this becomes harder than it needs to be. Some subtleties might pass you by, or in some cases you might have outright difficulties comprehending the code. Of course, reading JavaScript code from seasoned JavaScript developers can also be a great learning experience about how to properly use the language. Yes, there's a lot of bad JavaScript available, but there's a lot of beautiful JavaScript around as well. I'd say the benefits you can get from reading the code of the libraries you use are directly dependent on your familiarity with JavaScript as a language, as well as its most commonly used patterns and idioms. If you try to avoid JavaScript by sticking with CoffeeScript, you sort of limit the potential benefits to be had from a tremendously valuable resource.

Another reason is that JavaScript is gradually becoming ubiquitous and that's all the more reason to maintain your familiarity with the language. You're unlikely to build a website without a nice dose of JavaScript, either used directly or through CoffeeScript. Server-side JavaScript usage is on the rise, and the growing popularity of Node.js is only going to increase the usage. True, you can use CoffeeScript with Node.js as well, but keep in mind that you're going to be reading a lot of other people's JavaScript code when you work with Node.js. That's not a slight at Node.js as a platform or development community btw, I very much enjoy working with it and am loving the learning experience. JavaScript is also being used in a few NoSQL databases, and in those cases it's harder to avoid it. Who knows where JavaScript is going to be used in the future? Will you always have the ability to 'compile' your CoffeeScript code to JavaScript in order to take advantage of whatever new environment it can be used in? Maybe you will, maybe you won't. Who knows? What you do know is that the vast majority of documentation and information you're going to find on the internet will be using JavaScript. Again, all the more reason to become proficient in JavaScript IMO.

Then there's the aspect of debugging and troubleshooting. While you can indeed 'compile' your CoffeeScript code to JavaScript, none of the debuggers available at the moment transform your code back to CoffeeScript at runtime if you're trying to debug something. Ok sure, we all prefer to avoid debugging by writing proper tests but the fact of the matter is this: if something goes wrong, I certainly want to understand the code that I can debug. And even more so, I want that code to be readable and not look like it was generated by a tool (and I'm not talking about a lousy developer). Or if I receive a stacktrace from a tester or (gasp) a user, I would like to be able to look at the code that is listed in the stacktrace without thinking "oh crap, what the hell is going on here? I'm not used to all this JavaScript!!".

Again, I have no issues with CoffeeScript as a language. I like its syntax and the ideas behind it. But for me, that just doesn't outweigh the downsides that I've listed here, so I'll be sticking with JavaScript for quite a while longer, methinks.

Reading: Kindle vs iPad

7 comments. Written on September 11th, 2011
Categories: Books, gadgets

I bought a Kindle about a year ago. I loved it from the start. Its screen is just about perfect for reading. Sunlight is no problem whatsoever. You can easily keep reading for hours and hours without your eyes getting tired. It's so light that you can hold it in pretty much every way imaginable without it becoming uncomfortable. The process of buying ebooks on Amazon and getting them delivered on the device automatically is just great. The process of buying ebooks from other stores and uploading them to your Kindle is smooth enough, provided that you buy the books in the Mobi format. The battery life still impresses me, even though it's almost a year old.

I also bought an iPad 2 as soon as it came out. I didn't buy it thinking it could replace my Kindle. I bought it because I thought I'd prefer using it over my MacBook for the times when I just want to consume web content, instead of actually working on something. If I want to work on something, I use the MacBook. For entertainment, I use the iPad more often because of its superior form factor and ease of use. After a while, you sort of get spoiled by the iPad and its interface. I now expect to be able to use my fingers when I see a small screen without a keyboard attached to it. The auto-rotating screen has also become a must-have to me. The fact that I can quickly switch between apps, without being visually distracted by them, is something that I absolutely love. And I expect everything to just happen instantly whenever I trigger something.

Simply put: the iPad has spoiled me so much, that I now sort of dislike using my Kindle. I know the Kindle screen is easier on the eyes for long reading sessions. I know that reading in daylight is much better on the Kindle than on the iPad. I also know that I really dislike navigating using the Kindle controls. I hate switching between portrait and landscape mode on the Kindle. I hate that everything I do makes me wait, even if it's only half a second. And I really don't like the fact that I've caught myself switching between both devices to check Twitter or email. So a while ago, I installed the Kindle app on my iPad and tried it for a while. I haven't used my Kindle since. That's not to say that I no longer think that the Kindle is a good device. In fact, for its price I think it's a great device. But it really just loses a lot of its attraction once you're used to an iPad.

For Amazon's sake, I hope the rumors about the Kindle tablet are all true. I think they're going to need to go in the tablet direction if they want the Kindle brand to remain relevant over the next couple of years. I do wonder what that's going to mean for the future of e-ink though.

Note: I actually wrote this post on my iPad, using iA Writer. It's the first time I'm writing a post on the iPad, and it's unlikely to be the last.

Repeated Failed Log-Ins: What’s Your Strategy?

11 comments. Written on September 10th, 2011
Categories: ASP.NET MVC, express.js, node.js, Patterns, Performance, security

I've only been using the server that's hosting this blog for a week or two, so I'm still keeping a close eye on it. I check usage graphs (CPU, disk I/O and network) a couple of times a day to verify whether things are still running smoothly. This morning, I saw a noticeable increase in CPU usage and network activity that lasted for about 11 hours. I logged into the machine, checked some logs and found out that someone had conducted a brute-force SSH attack that lasted 11 hours. It doesn't make much sense to try that on my server since my SSH daemon doesn't allow password authentication, and indeed there was no successful login during the attack so no harm done, right?

Even if such an attack is not successful, it does consume resources on the targeted server(s). And wasteful, unnecessary resource usage has always been a bit of a pet peeve of mine so I wanted to prevent this from happening again. For this particular scenario, it's pretty easy. I installed DenyHosts which routinely checks for repeated (configured at 5) failed log-in attempts, and adds the offending IP addresses to /etc/hosts.deny so every other attempted SSH connection from those IP addresses will be denied immediately. Each offending IP address will be purged from /etc/hosts.deny after 1 week. Then I added a firewall rule that prevents you from connecting through SSH more than 5 times in 60 seconds. If you go over 5 connections, it just starts dropping packets, and by the time the drop behavior for your IP address expires, you'll have been added to /etc/hosts.deny already. As I said, pretty easy in this scenario because there are great tools I can rely on.

But what would you do if you had to implement a strategy to deal with this yourself? The most interesting approach I've heard of is to add an incremental delay on each failed authentication attempt. If the user fails the authentication check, delay the response by 1 second. If the user fails the second time, delay the response by 2 seconds. A third failure means a delay of 3 seconds, and so on. This makes a brute-force or dictionary attack pretty much impractical, since the time needed for each additional guess keeps growing. The key, though, is that you can't block any of your request-handling threads because then you open yourself up to an easy DoS attack.

Implementing this for a web application built on Node.js and Express.js is incredibly easy (there's an ASP.NET MVC example later in this post btw). I took the authorization example of Express.js and made just a few minor changes. First of all, I added the delayAuthenticationResponse function:

function delayAuthenticationResponse(session, callback) {
  if (!session.attempts) {
    session.attempts = 1; 
  } else {
    session.attempts++;
  }

  setTimeout(callback, session.attempts * 1000);
}

This is the most important part of the implementation. Every time we get here, we increment the number of attempts for this user by one and store the number in the user's session. Side note: this is one of the few things you'd actually want to use a session for: session-related data. Then we schedule the callback to be executed after the number of attempts * 1000 milliseconds have passed. The important part to remember here is that Node's event loop is not blocked by this, so our ability to handle other requests is not impaired in any way. The only one who suffers here is the attacker. Note that in a real world implementation, you'd probably only want to start increasing the delay after 5 attempts or so, in order to not piss off users who're just having problems remembering their password.
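As a rough sketch of the "only start delaying after 5 attempts or so" idea, the delay calculation could be pulled into its own function. The names FREE_ATTEMPTS, MAX_DELAY_MS and delayForAttempts are assumptions made for this illustration, and the upper bound on the delay is an extra that the text above doesn't mention.

```javascript
// Hypothetical tuning constants, not part of the original Express example.
var FREE_ATTEMPTS = 5;     // failures tolerated before any delay kicks in
var MAX_DELAY_MS = 30000;  // assumed upper bound so the delay can't grow forever

// Returns the delay in milliseconds for a given number of failed attempts.
function delayForAttempts(attempts) {
  var penalized = Math.max(0, attempts - FREE_ATTEMPTS);
  return Math.min(penalized * 1000, MAX_DELAY_MS);
}

// Variation on the function above: count the failure, then schedule the
// callback without blocking the event loop.
function delayAuthenticationResponse(session, callback) {
  session.attempts = (session.attempts || 0) + 1;
  setTimeout(callback, delayForAttempts(session.attempts));
}
```

With these values, the first five failures get an immediate response, the sixth waits 1 second, the seventh 2 seconds, and so on up to the cap.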

Then I changed the authenticate function so that it receives a session as the first parameter, and uses our delayAuthenticationResponse function whenever something goes wrong:

function authenticate(session, name, pass, callback) {
  var user = users[name];

  if (!user) {
    return delayAuthenticationResponse(session, function() {
      callback(new Error('cannot find user'));
    });
  }

  if (user.pass === hash(pass, user.salt)) {
    delete session.attempts;
    return callback(null, user);
  }

  delayAuthenticationResponse(session, function() {
    callback(new Error('invalid password'));
  });
}

After that, it's just a matter of changing the function that is assigned to the login route:

app.post('/login', function(req, res){
  authenticate(req.session, req.body.username, req.body.password, function(err, user){
    if (user) {
      req.session.regenerate(function(){
        req.session.user = user;
        res.redirect('back');
      });
    } else {
      req.session.error = 'Authentication failed, please check your'
        + ' username and password.'
        + ' (use "tj" and "foobar")';
      res.redirect('back');
    }
  });
});

And there we go. This makes brute-forcing your way into this web application impractical, and I'm sure you can agree it was rather easy to implement. Of course, this is only because Node.js is inherently non-blocking. In an environment where non-blocking is the exception rather than the rule, you have to take a few more things into account when trying to implement this strategy.

For instance, ASP.NET MVC is a typical blocking web framework. There's a certain number of threads that are waiting to handle requests, and once they receive a request, they process that request in its entirety. That means that if your code has to wait on something, the request handling thread is blocked and can't handle any other requests. So obviously, if you'd like to implement this strategy for dealing with repeated failed log-ins, you really want to avoid doing something like this:

        [HttpPost]
        public ActionResult LogOn(LogOnModel model, string returnUrl)
        {
            if (ModelState.IsValid)
            {
                if (CredentialsAreValid(model.UserName, model.Password))
                {
                    FormsService.SignIn(model.UserName, model.RememberMe);
                    if (Url.IsLocalUrl(returnUrl))
                    {
                        return Redirect(returnUrl);
                    }
                    
                    return RedirectToAction("Index", "Home");
                }

                Session["attempts"] = Session["attempts"] == null ? 1 : (int)Session["attempts"] + 1;
                Thread.Sleep((int)Session["attempts"] * 1000);
                ModelState.AddModelError("", "The user name or password provided is incorrect.");
            }

            return View(model);
        }

(note: this is a slightly modified LogOn method from the default AccountController when selecting 'internet application' in the MVC project wizard)

While this looks like it does the same as the Node/Express example, it certainly doesn't. The experience for the attacker is the same, because each failed attempt increases the response time by an extra second. But on your server, the thread handling the request is blocked the whole time and is thus incapable of handling other requests while you're making the attacker wait.

Luckily, you can use ASP.NET MVC's asynchronous controllers to provide an asynchronous implementation of an action without blocking the request handling thread:

        [HttpPost]
        public void LogOnAsync(LogOnModel model, string returnUrl)
        {
            if (ModelState.IsValid)
            {
                if (CredentialsAreValid(model.UserName, model.Password))
                {
                    FormsService.SignIn(model.UserName, model.RememberMe);
                    AsyncManager.Parameters["returnUrl"] = returnUrl;
                }
                else
                {
                    Session["attempts"] = Session["attempts"] == null ? 1 : (int)Session["attempts"] + 1;
                    var timeout = (int)Session["attempts"] * 1000;
                    AsyncManager.OutstandingOperations.Increment();

                    var timer = new System.Timers.Timer(timeout) { AutoReset = false };
                    timer.Elapsed += (sender, e) =>
                    {
                        ModelState.AddModelError("", "The user name or password provided is incorrect.");
                        AsyncManager.Parameters["model"] = model;
                        timer.Dispose();
                        AsyncManager.OutstandingOperations.Decrement();
                    };
                    timer.Start();
                }
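            }
            else
            {
                // Invalid input: pass the model along so LogOnCompleted redisplays the form
                // instead of redirecting to the home page.
                AsyncManager.Parameters["model"] = model;
            }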
        }

        public ActionResult LogOnCompleted(LogOnModel model, string returnUrl)
        {
            if (!string.IsNullOrEmpty(returnUrl) && Url.IsLocalUrl(returnUrl))
            {
                return Redirect(returnUrl);
            }

            if (model == null)
            {
                return RedirectToAction("Index", "Home");
            }

            return View(model);
        }

Your controller has to inherit from AsyncController instead of Controller to make this work. Of course, it's much more complicated and requires more ceremony compared to the Node/Express approach, but then again, ASP.NET MVC isn't optimized for this kind of usage whereas Node/Express definitely is.

Either way, no matter what web framework you use, if you can add an incremental delay to the response of each failed log-in attempt without blocking a request-handling thread, you've added a very effective and low-cost protection against brute-force and dictionary attacks.
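One caveat with the examples above: they keep the counter in the session, so a client that throws away its session cookie between requests would start from zero every time. A possible variation is to key the counter on the user name instead. The sketch below assumes a single Node.js process and an in-memory store; failedAttempts, recordFailure, clearFailures and delayByUsername are hypothetical names, not part of Express.

```javascript
// Hypothetical in-memory store: user name -> number of consecutive failures.
// Assumes a single process; the counts are lost on restart.
var failedAttempts = new Map();

// Registers a failed attempt and returns the new failure count.
function recordFailure(name) {
  var attempts = (failedAttempts.get(name) || 0) + 1;
  failedAttempts.set(name, attempts);
  return attempts;
}

// Call this after a successful login to reset the counter.
function clearFailures(name) {
  failedAttempts.delete(name);
}

// Same incremental delay as before, counted per user name instead of per session.
function delayByUsername(name, callback) {
  setTimeout(callback, recordFailure(name) * 1000);
}
```

The trade-off is that anyone can now slow down log-ins for a given user name by deliberately failing a few times, so whether this fits depends on the application.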

Maintaining Bad Code Can Be A Great Experience

12 comments. Written on September 4th, 2011
Categories: Opinions, work/career

Nobody likes to maintain a bad codebase. And still, it's an experience that I'd recommend to every developer, particularly early on in your career. People who are relatively new to coding usually don't have a good view yet on what kind of code will cause pain in the long run, and what kind of code won't. And if you keep getting assigned to new projects, you're less likely to develop that view properly compared to someone who has to maintain an existing, problematic codebase.

We've all heard the most common ways of improving as a developer: read the right books, learn from more experienced people, and above all: practice, practice, practice! And all of that is indeed very important. But I've always felt that being exposed to bad code and having to deal with it can be a tremendously important and valuable learning experience as well. As valuable as repeated practice, reading and following advice from experienced and skilled developers is, none of it will stick with you as long as the pain you feel day in, day out from working with a bad codebase.

When you're working with a bad codebase, it's important to figure out where the pain comes from. Whenever you experience pain while trying to make a change, or while trying to implement a new feature, take a bit of extra time to look for the cause of the pain. Then try to think of a way that would avoid the pain entirely, or at least lessen it. And if you can, implement your solution in the codebase. If you can't, try the approach you thought of in a personal project. If you do this routinely, you're going to learn a lot about what works well and what doesn't. I'd dare say that you'll get a better view on this than people who never have to work with legacy code, since they rarely get exposed to the long term problems caused by the code they write.