JavaScript

Displaying Feed Items On A Web Page: My Solution

7 comments » Written on December 20th, 2011
Categories: express.js, JavaScript, node.js, Performance

A couple of days ago I asked you how you'd implement showing links from an RSS feed on a web page (in this case: my new company web site). These are my requirements for this:

  • It needs to be fast
  • The fewer requests that are impacted by retrieving the feed data, the better
  • If I publish a post, the links on the company website should contain the new link within 30 minutes
  • The simpler the solution, the better

I came up with a very simple solution, which satisfies these requirements better than any other solution I could think of, or heard of from other people. It is extremely fast, doesn't delay any requests, and doesn't require me to deploy anything but the company website. I'm building the site with Express on Node.js, which means I can take full advantage of the asynchronous nature of Node.js to implement this.

Let's go over the code. In the script that starts the Express server, I have the following:

var express = require('express'),
    app = module.exports = express.createServer(),
    NodePie = require('nodepie'),
    request = require('request'),
    recentFeedItems = null;

app.dynamicHelpers({
    getRecentFeedItems: function() {
        return recentFeedItems;
    }
});

// ... some extra configuration of Express that isn't relevant to this post

var processFeed = function(callback) {
    request('http://feeds.feedburner.com/davybrion', function(err, response, body) {
        if (!err && response.statusCode == 200) {
            var feed = new NodePie(body);
            feed.init();
            recentFeedItems = feed.getItems(0, 5);
            if (callback) callback();
        }
    }); 
};

setInterval(processFeed, 1800000); // process feed items every 30 minutes

processFeed(function() {
    app.listen(3000);
    console.log('Express started on port 3000');    
});

I'll discuss the code in just a moment, but first I want to show the view code that renders the links:

<ul>
<% getRecentFeedItems.forEach(function(item) { %>
    <li><time class="date"><%= item.getDate().getDate() + '/' + (item.getDate().getMonth() + 1) %></time><a href="<%= item.getPermalink() %>"><%= item.getTitle() %></a></li>
<% }); %>
</ul>

And that's all. This is the solution in its entirety!

If you're new to Node, this code probably requires some explanation. Let's start with this part:

app.dynamicHelpers({
    getRecentFeedItems: function() {
        return recentFeedItems;
    }
});

Here I'm adding a dynamic helper to the Express application. Express calls the getRecentFeedItems function before it renders a view and exposes the return value to that view under the same name, which is why the view code above can call forEach on getRecentFeedItems directly. It's important to know that the getRecentFeedItems function closes over the recentFeedItems variable declared above it. That means that if the value of the recentFeedItems variable changes at any point in time, the helper will return that new value the next time a view is rendered.

var processFeed = function(callback) {
    request('http://feeds.feedburner.com/davybrion', function(err, response, body) {
        if (!err && response.statusCode == 200) {
            var feed = new NodePie(body);
            feed.init();
            recentFeedItems = feed.getItems(0, 5);
            if (callback) callback();
        }
    }); 
};

This just creates a function that we can use later on. It retrieves the feed asynchronously. When the response arrives, we parse the feed using the NodePie library and store the 5 most recent items in the recentFeedItems variable. This function also closes over the recentFeedItems variable, so every time we assign a value to it, any subsequent call to the getRecentFeedItems function will return the value we just assigned: both functions refer to the same variable thanks to the magic of closures. Finally, if a callback was provided as a parameter, it is invoked. Note that this only happens when the feed was retrieved successfully.
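To make the closure behaviour concrete, here is a minimal standalone sketch (no Express or NodePie involved). The `snapshot` variable and the sample item titles are made up for the illustration. It contrasts copying a value once with reading the shared variable at call time:

```javascript
var recentFeedItems = null;

// Copies the value the variable holds right now (null).
var snapshot = recentFeedItems;

// Closes over the variable itself, so it reads whatever it holds at call time.
function getRecentFeedItems() {
    return recentFeedItems;
}

// Stand-in for the feed callback assigning freshly parsed items.
recentFeedItems = ['Post A', 'Post B'];

console.log(snapshot);               // null
console.log(getRecentFeedItems());   // [ 'Post A', 'Post B' ]
```

The copy stays at its old value, while the function sees the new one. That is the property the dynamic helper relies on.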

setInterval(processFeed, 1800000); // process feed items every 30 minutes

processFeed(function() {
    app.listen(3000);
    console.log('Express started on port 3000');    
});

The call to setInterval makes sure that the processFeed function is called every 30 minutes. After that, we call the processFeed function manually, and we pass in a callback where we start the Express server. This guarantees that the feed items will be in memory before the server starts processing requests.
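One consequence of this startup sequence is worth spelling out: since the callback only runs after a successful retrieval, a failed first fetch means the server never starts listening. If that is not what you want, a possible variation is to always invoke the callback and keep serving the previous (or an empty) list. The sketch below is only an illustration under that assumption. `createFeedCache` and `fetchItems` are hypothetical names, and `fetchItems(cb)` stands in for the request and NodePie steps shown earlier:

```javascript
// Hypothetical helper: fetchItems(cb) must call cb(err, items).
function createFeedCache(fetchItems) {
    var recentFeedItems = [];   // start empty so views can always iterate

    function refresh(callback) {
        fetchItems(function (err, items) {
            if (!err && items) {
                recentFeedItems = items;   // swap in the fresh list
            }
            // Always report back, so a failed first fetch cannot stall startup.
            if (callback) callback(err);
        });
    }

    return {
        refresh: refresh,
        getItems: function () { return recentFeedItems; }
    };
}

// Usage with a fake fetcher that fails once and then succeeds.
var attempts = 0;
var cache = createFeedCache(function (cb) {
    attempts += 1;
    if (attempts === 1) return cb(new Error('feed unavailable'));
    cb(null, ['Post A', 'Post B']);
});

cache.refresh(function (err) {
    // First attempt failed: the cache still holds the empty list.
    console.log('error:', err.message, 'items:', cache.getItems().length);
});
cache.refresh(function (err) {
    // Second attempt succeeded: the list has been replaced.
    console.log('error:', err, 'items:', cache.getItems().length);
});
```

The trade-off is that the guarantee from the original code (items are in memory before the first request) is given up in exchange for a server that always starts.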

What makes this solution so great is that we take full advantage of some of Node's benefits. Whenever we retrieve the RSS feed, Node.js will retrieve that data asynchronously. As soon as it has fired the request to get the RSS feed, it just goes to the next event in its eventloop, so no request is kept waiting while the data is being downloaded. Until the data from the RSS feed is returned, each request will just use the items that are stored in the recentFeedItems variable. Once the data has been returned, our callback is executed, which overwrites the value of the recentFeedItems variable. We don't need to do any locking here because the Node.js eventloop is single-threaded: while our callback is running, no other code that has access to the recentFeedItems variable can be executed anyway. And the actual parsing of the RSS feed is done by NodePie, which uses expat behind the scenes, which is supposedly the fastest C XML parser available.

Looking back on my initial requirements, I think this solution matches them very well.

Node.js For Dummies

14 comments » Written on December 18th, 2011
Categories: JavaScript, node.js

I'm sure you've all heard of Node.js by now. Its popularity is increasing rapidly, which means it's a good idea to be aware of what Node.js is and especially how it differs from more traditional technology stacks. In this post, I'll try to give an easy-to-understand overview of what makes Node.js different and make it clear that it's more than just server-side JavaScript. Note that this overview is highly simplified and only meant to help people understand how Node.js works. This is definitely not a completely accurate description of the lower-level details of Node.js.

Evented/Asynchronous I/O

In most technology stacks, API calls for I/O operations are synchronous. That is, the thread that executes the operation is blocked until the I/O operation has completed. Once it has completed, execution of your code proceeds. Of course, a lot of technology stacks have asynchronous variants of those operations available as well, but generally speaking, they aren't used as often as the synchronous variants. In Node.js, it's the other way around. Virtually all I/O operations are asynchronous and there are only a few synchronous implementations available (and you're generally discouraged from using them).

This means that whenever you do an I/O operation (file manipulation, network requests, database operations, etc.), Node.js initiates the I/O operation through a lower-level C/C++ layer which will perform the operation asynchronously. Once the operation has completed, Node.js will execute the callback function that you passed as a parameter to the I/O operation's function call. The important thing here is that while the I/O operation is being executed, Node.js doesn't have to wait for the operation to complete, and is able to focus entirely on processing other events. And those events can be anything: incoming network requests, executing callbacks from other operations that have completed, or invoking whatever function is assigned to a particular event.
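As a small illustration of that callback style, the sketch below lists a directory with `fs.readdir` from Node's standard library. The `order` array and the choice of the temp directory are just for the demonstration:

```javascript
var fs = require('fs');
var os = require('os');

var order = [];

// Ask for the contents of the temp directory; the call returns immediately.
fs.readdir(os.tmpdir(), function (err, entries) {
    order.push('callback');
    if (err) {
        console.error('readdir failed: ' + err.message);
        return;
    }
    console.log('found ' + entries.length + ' entries');
});

// This runs before the callback above, even though it appears later in the file.
order.push('after readdir call');
console.log('readdir was started, moving on');
```

The "moving on" line is printed first. The callback only runs once the operation has completed and the currently executing code has finished.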

Eventloop

The Node.js eventloop is what makes Node.js so interesting and powerful. Node.js basically just keeps reading from an event queue until that queue is empty. As it loops through the events to be processed, it invokes the JavaScript functions that have been assigned for those events. If any of those functions performs an I/O operation, Node.js will initiate the operation and then immediately move to the next event in the event queue. Once the I/O operation has completed, an event will be added to the event queue with a reference to your original callback. Once all preceding events have been processed, Node.js will get to the newly added event and invoke your callback. Because all I/O operations are asynchronous, this enables Node.js to maximize its efficiency as it processes events because it doesn't need to wait for slow I/O operations to complete.
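The queueing behaviour can be observed with nothing more than timers. In this sketch (the `record` helper and `log` array are made up for the example), two callbacks are queued with a delay of 0, yet neither can run until the code that is currently executing has finished:

```javascript
var log = [];

function record(message) {
    log.push(message);
    console.log(message);
}

// Two events are queued; their callbacks have to wait for the current code.
setTimeout(function () { record('timer 1'); }, 0);
setTimeout(function () { record('timer 2'); }, 0);

record('start');
record('end');
```

This prints start and end first, followed by timer 1 and timer 2, in the order in which the timers were queued.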

Single-threaded

One thing that people don't always realize is that the Node.js eventloop is single-threaded. This has some nice benefits but there's a huge drawback as well. The biggest benefit is that you don't need to worry about concurrent access to shared state. After all, there is never more than one thread executing your JavaScript functions. This means you don't have to write any locking code to protect shared state. The drawback of the single-threaded eventloop is that you need to be careful not to block it. If you're planning on doing heavy synchronous processing in your JavaScript code, you need to realize that no other events can be processed by Node.js until that synchronous block of code has completed. Obviously, since there's only one thread going through the eventloop, any delay you cause in your code can be very costly to overall throughput and performance. For now, it's best to execute synchronous processing routines as a child process, possibly even in a language that is more suitable for this than JavaScript. But it seems that future Node.js versions will provide a more integrated way to deal with this.
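The cost of blocking is easy to demonstrate. In the sketch below, a timer is asked to fire after 10 ms, but a synchronous busy loop (the hypothetical `busyWait` helper, standing in for heavy processing) keeps the only thread occupied for about 200 ms:

```javascript
var start = Date.now();
var timerDelay = null;

// Ask for a callback after 10 ms.
setTimeout(function () {
    timerDelay = Date.now() - start;
    console.log('timer fired after ' + timerDelay + ' ms');
}, 10);

// Stand-in for heavy synchronous work: spin until the given time has passed.
function busyWait(ms) {
    var end = Date.now() + ms;
    while (Date.now() < end) {
        // nothing else can run while this loop is spinning
    }
}

busyWait(200);
console.log('synchronous work done');
```

The timer callback cannot run until the busy loop has finished, so the reported delay is at least 200 ms rather than the requested 10 ms. Every incoming request would be held up in the same way.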

Why JavaScript?

JavaScript's support for closures and its treatment of functions as first-class objects make it ideally suited for the evented programming model that Node.js offers. Many people still think of JavaScript as a joke or a toy language, but it's a lot more powerful than many people think it is. Yes, it certainly has problems as well, but it's definitely worth learning. I do hope that this post has made it clear that there's a lot more to Node.js than simply being server-side JavaScript. What makes Node.js so interesting are the principles that I've tried to explain in this post. Those principles can be implemented with other languages as well, and could be made to work just as well as, or perhaps even better than, Node.js itself. But you'd be hard-pressed to find a language that's so ubiquitous, yet completely devoid of a pre-existing synchronous I/O infrastructure.

Anyways, I hope I succeeded at making it somewhat clear how Node.js works and why it's so different from most other technology stacks.

Why I Prefer JavaScript Over CoffeeScript

30 comments » Written on September 11th, 2011
Categories: JavaScript, Opinions

CoffeeScript has gotten pretty popular in the past 6 months or so. If you're not familiar with it, be sure to check out the language's website. It's basically a language that compiles to JavaScript. So you can write your client-side code in CoffeeScript, but still have the browser execute the resulting JavaScript. Or if you're using Node.js, you can just use CoffeeScript instead of JavaScript if you prefer. It doesn't matter, it's all JavaScript in the end.

To the many people who dislike JavaScript, either because of its syntax or its many pitfalls (or both), CoffeeScript is obviously great. They can get in on all the fun and progress that's happening in the JavaScript world, without having to write JavaScript themselves. Purely from a syntactical point of view, it's hard to argue against the merits and benefits of the language. I'm fond of Ruby's syntax, so instinctively CoffeeScript looks much more appealing to me than JavaScript. Despite the nice syntax and protection from JavaScript's pitfalls, I still prefer to code in JavaScript.

For starters, if you're doing JavaScript development (whether client-side or server-side), you're likely to use a variety of scripts or libraries written by people other than you. While some libraries are documented very nicely, others aren't. In many cases, I still find the ability to read the source code of a library I'm using to be a very valuable benefit. I've noticed that I often do this when trying to find out how to do something that I can't find in the documentation, or when I'm troubleshooting something. If you're not familiar with commonly used JavaScript patterns, this becomes harder than it needs to be. Some subtleties might pass you by, or in some cases you might have outright difficulties comprehending the code. Of course, reading JavaScript code from seasoned JavaScript developers can also be a great learning experience about how to properly use the language. Yes, there's a lot of bad JavaScript available, but there's a lot of beautiful JavaScript around as well. I'd say the benefits you can get from reading the code of the libraries you use are directly dependent on your familiarity with JavaScript as a language, as well as its most commonly used patterns and idioms. If you try to avoid JavaScript by sticking with CoffeeScript, you sort of limit the potential benefits to be had from a tremendously valuable resource.

Another reason is that JavaScript is gradually becoming ubiquitous and that's all the more reason to maintain your familiarity with the language. You're unlikely to build a website without a nice dose of JavaScript, either used directly or through CoffeeScript. Server-side JavaScript usage is on the rise, and the growing popularity of Node.js is only going to increase the usage. True, you can use CoffeeScript with Node.js as well, but keep in mind that you're going to be reading a lot of other people's JavaScript code when you work with Node.js. That's not a slight against Node.js as a platform or development community btw, I very much enjoy working with it and am loving the learning experience. JavaScript is also being used in a few NoSQL databases, and in those cases it's harder to avoid it. Who knows where JavaScript is going to be used in the future? Will you always have the ability to 'compile' your CoffeeScript code to JavaScript in order to take advantage of whatever new environment it can be used in? Maybe you will, maybe you won't. Who knows? What you do know is that the vast majority of documentation and information you're going to find on the internet will be using JavaScript. Again, all the more reason to become proficient in JavaScript IMO.

Then there's the aspect of debugging and troubleshooting. While you can indeed 'compile' your CoffeeScript code to JavaScript, none of the debuggers available at the moment transform your code back to CoffeeScript at runtime if you're trying to debug something. Ok sure, we all prefer to avoid debugging by writing proper tests but the fact of the matter is this: if something goes wrong, I certainly want to understand the code that I can debug. And even more so, I want that code to be readable and not look like it was generated by a tool (and I'm not talking about a lousy developer). Or if I receive a stacktrace from a tester or (gasp) a user, I would like to be able to look at the code that is listed in the stacktrace without thinking "oh crap, what the hell is going on here? I'm not used to all this JavaScript!!".

Again, I have no issues with CoffeeScript as a language. I like its syntax and the ideas behind it. But for me, that just doesn't outweigh the downsides that I've listed here, so I'll be sticking with JavaScript for quite a while longer, methinks.

Solving A Problem By Avoiding It

8 comments » Written on August 15th, 2011
Categories: JavaScript, node.js

I spent a little time this weekend working on something that would be of use to me: a simple utility on Node.js that calls JSLint on all JavaScript files within a given path (including its subdirectories), and only produces output if errors are found. It was pretty easy to write, and you can find it here in case you're interested.

There was one problem I ran into though. On Node, all I/O calls are non-blocking. Well, there are synchronous versions of some calls available, but you should use them as little as possible because they block the Node event-loop. One of the things I needed to do was to scan a given path recursively to find all JavaScript files in that path. And I just couldn't get it working the way I wanted to. The best I could do was a function that would invoke the callback once for every folder that was found. But I really wanted one that would invoke the callback only once, after the entire tree had been searched. I actually spent a few hours trying to get it working, and even looked for some good solutions in other projects. Most solutions I found also invoked the callback multiple times, others resorted to partially synchronous implementations. For a command-line tool like this, synchronous calls aren't a big deal but I'm trying to get better at typical Node programming, so I'm trying to avoid synchronous I/O.

So I tried yet another variation on my implementation, and it too didn't work properly. I got frustrated and said to myself "how hard can this be? I just want a list of files like 'find' would give me". Then it hit me that I could completely avoid the problem by just doing this:

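var exec = require('child_process').exec;

function getJsFilesRecursively(startPath, callback) {
    // Note: startPath is handed to the shell as-is, so it must not contain
    // spaces or shell metacharacters.
    exec('find ' + startPath, function(err, stdout) {
        if (err) { return callback(err); }

        var jsFiles = [];
        stdout.split('\n').forEach(function(f) {
            // skip anything under node_modules and keep only files ending in .js
            if (!/node_modules\//.test(f) && /\.js$/.test(f)) {
                jsFiles.push(f);
            }
        });

        callback(null, jsFiles);
    });
}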

No blocking I/O on the Node event-loop, and it took about 2 minutes to write. The only downside is that it's not cross-platform, because it uses the 'find' command, which every *NIX-based system has but Windows doesn't. So it will fail if you run it on Windows, but for now it'll do just fine, and at least it enabled me to move on to the other stuff I needed to implement to reach the goal I initially set out for this tool.

Of course, by now I'm completely obsessed with nailing the recursive asynchronous folder-walking function, so I will replace this with a proper version once I finally figure it out.
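For reference, one common pattern for this kind of function is to keep a counter of outstanding entries per directory and to invoke the callback only when that counter reaches zero. The sketch below is an illustration of that pattern, not the version that ended up in the tool. The `walk` function and its `pending` and `failed` variables are names chosen for this example. It uses `fs.lstat`, so symbolic links are not followed:

```javascript
var fs = require('fs');
var path = require('path');

// Calls callback(err, files) exactly once, after the whole tree was visited.
function walk(dir, callback) {
    var results = [];

    fs.readdir(dir, function (err, entries) {
        if (err) return callback(err);

        var pending = entries.length;   // entries still being processed
        var failed = false;             // makes sure an error is reported once
        if (pending === 0) return callback(null, results);

        // Called once per entry, with the .js files found for that entry.
        function done(err, files) {
            if (failed) return;
            if (err) {
                failed = true;
                return callback(err);
            }
            results = results.concat(files);
            pending -= 1;
            if (pending === 0) callback(null, results);
        }

        entries.forEach(function (name) {
            var fullPath = path.join(dir, name);
            fs.lstat(fullPath, function (err, stats) {
                if (err) return done(err);
                if (stats.isDirectory()) {
                    // Skip node_modules, like the find-based version does.
                    if (name === 'node_modules') return done(null, []);
                    walk(fullPath, done);   // recurse; done fires once for it
                } else {
                    done(null, /\.js$/.test(name) ? [fullPath] : []);
                }
            });
        });
    });
}
```

It would be called as `walk(startPath, function(err, jsFiles) { ... })`. The order of the returned files is not deterministic, since it depends on the order in which the individual I/O operations complete.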

Using Mongoose’s Setters To Get Calculated Properties

1 Comment » Written on August 1st, 2011
Categories: JavaScript, mongoose, node.js

Note: if you haven't seen anything about Mongoose yet, you might want to read my introductory post about it first.

I wanted to add an Invoice entity to my breakable toy project, and came up with this Mongoose schema:

var mongoose = require('mongoose'),
    Schema = mongoose.Schema,
    ObjectId = Schema.ObjectId;

var invoiceSchema = new Schema({
    companyId: { type: ObjectId, required: true },
    customerId: { type: ObjectId, required: true },
    invoiceNumber: { type: String, required: true, unique: true },
    date: { type: Date, required: true },
    dueDate: { type: Date, required: true }, 
    paid: { type: Boolean, required: true, default: false },
    activityId: { type: ObjectId, required: true },
    totalHours: { type: Number, required: true },
    hourlyRate: { type: Number, required: true },
    totalExcludingVat: { type: Number, required: true },
    vat: { type: Number, required: true }, 
    totalIncludingVat: { type: Number, required: true } 
});

Notice the totalExcludingVat, vat and totalIncludingVat properties. For the kind of work I do, the VAT percentage will always be the same. So I wanted the vat and totalIncludingVat properties to be calculated automatically whenever the totalExcludingVat property was set. Mongoose makes it possible to define custom getters and setters but according to the documentation, the purpose of a setter is to transform the value being set into something different for the underlying document. In my case, I want the value being set to remain the same, but I want the vat and totalIncludingVat properties to be calculated on the spot. Despite it probably not being the use case that was originally envisioned for custom setters, I added this to the Invoice type:

invoiceSchema.path('totalExcludingVat').set(function(value) {
    this.vat = value * 0.21;
    this.totalIncludingVat = value * 1.21;
    return value;
});

And this actually works pretty nicely:

describe('given an invoice', function() {

    var invoice = new Invoice();

    describe('when you set its totalExcludingVat property', function() {
        
        invoice.totalExcludingVat = 1000;

        it('should automatically set the vat property', function() {
            expect(invoice.vat).toEqual(210);
        });

        it('should automatically set the totalIncludingVat property', function() {
            expect(invoice.totalIncludingVat).toEqual(1210);
        });

    });
});

describe('given an invoice created with a totalExcludingVat value', function() {

    var invoice = new Invoice({ totalExcludingVat: 1000 });

    it('should contain the correct vat value', function() {
        expect(invoice.vat).toEqual(210);
    });

    it('should contain the correct totalIncludingVat property', function() {
        expect(invoice.totalIncludingVat).toEqual(1210);
    });
});

What I like about this approach is that even though I defined a 'setter', I didn't have to define a useless getter like I would have to do in C#. I'm also curious how Mongoose implements the setter behavior. Take a look at this line:

invoice.totalExcludingVat = 1000;

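In C#, that would call a compiler-generated set method. In JavaScript, a plain property assignment like this doesn't normally execute any function, unless the property has been defined as an accessor property with a setter (something ECMAScript 5 allows). As far as the syntax goes, it's exactly the same as doing this: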

invoice['totalExcludingVat'] = 1000;

Not sure how they got that working, but I'm certainly glad that they did.
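One mechanism the language itself offers for running code on a plain assignment is an ECMAScript 5 accessor property, defined with `Object.defineProperty`. The sketch below only illustrates that mechanism. Whether Mongoose is implemented exactly this way is an assumption, and the `Invoice` constructor and `_values` backing store here are hypothetical stand-ins, not Mongoose code:

```javascript
// Hypothetical stand-in for a model type; not Mongoose's actual implementation.
function Invoice() {
    this._values = {};   // backing store for the raw property values
}

Object.defineProperty(Invoice.prototype, 'totalExcludingVat', {
    enumerable: true,
    get: function () {
        return this._values.totalExcludingVat;
    },
    set: function (value) {
        // Runs on every assignment to invoice.totalExcludingVat.
        this._values.totalExcludingVat = value;
        this.vat = value * 0.21;
        this.totalIncludingVat = value * 1.21;
    }
});

var invoice = new Invoice();
invoice.totalExcludingVat = 1000;      // dot notation triggers the setter
invoice['totalExcludingVat'] = 1000;   // bracket notation triggers it as well

console.log(invoice.vat, invoice.totalIncludingVat);
```

Both assignment forms go through the same setter, which is consistent with the behavior seen in the tests above.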

And yes, I'll eventually look into how they actually made it work, but it's getting late and I need to get up for work in about 6 hours, so I think I'll skip getting into those details for now ;)