How Do You Pick Open Source Libraries?

22 comments. Written on January 16th, 2012.
Categories: Opinions, Software Development

I'm currently looking into which library to use for authentication in a breakable toy project. Even though it's just a breakable toy, I want to make as few compromises on technical quality as possible, because I want to maximize what I learn from it. That means I don't want to quickly put together something that just works. I want something that works, but that would also hold up in real-world scenarios, even though the project will at best only be used by me. So I'm going to be picky about any library I take a dependency on, just as I would if this were a project I was getting paid to work on.

So as I was browsing through a few possible alternatives for my authentication needs, I started thinking about my thought process when evaluating libraries/frameworks to use. I generally base my decision on the following items, listed in order of importance (to me):

  • How well does it work for my scenario? If a library satisfies all other items on this list, that certainly doesn't mean it's an automatic lock. How it works and the impact it has on my code is definitely the most important factor.
  • Popularity. I've noticed that I let the number of watchers/forks on sites like Github influence my opinion. If a project has many watchers and many forks, odds are high that there's a relatively large group of happy users as well as people involved with the project. It also increases the odds that the project will be around for a while. Of course, inactive Open Source projects often remain available as well, but if nobody's working on one, I'm not exactly tempted to take a dependency on it. Log4net is a notable exception to this, obviously. But when a project has a lot of people interested in it, or better yet, contributing to it, it's a good sign that you'll easily get help if needed, that it's only going to get better in the long run, and that it might get forked should the original developers stop working on it. As the author of an Open Source project that doesn't have a lot of watchers/forks (Agatha), I'm aware that my point of view on this is rather hypocritical but hey, it is what it is.
  • Code quality. I don't have the time to do an in-depth review of the code, and I'm sure most of us don't either. But I do like to glance over the code to get a general feel for its quality. I focus mostly on clarity and keep an eye open for sloppiness or downright WTFs. The questions I'm mostly trying to answer are: "is this code I'd like to try to improve or fix if I need to?" and "how easy would it be to debug this when I need to troubleshoot some non-obvious issues?".
  • Location of code and issue tracker. A lot of people will probably take issue with this, but I consider it a major plus if the project is on Github. Not just because of my personal preference for Github, but because they truly encourage people to collaborate and contribute to projects and make it very easy to do so. Also, the site is fast! I cringe when I have to look over issues of projects on Codeplex because it's just terribly slow. And the UI doesn't come close to that of Github either. I've heard that Bitbucket is pretty similar to Github, but I've never even looked for projects there. In any case: I want to be able to download the latest version of the code, or of a particular branch if I need to, at any time and as easily as possible. I also prefer an issue tracker which is fast, responsive and easy to search. It doesn't have to be Github, but those 2 requirements are important to me.
  • License. If it's GPL, I don't use it. Also, I check whether or not a commercial license needs to be purchased when you want to use the library/framework in production. Pay attention to dual-licensed projects because that Open Source license might not apply to commercial/production use!

I'd love to hear your thoughts on this. Did I miss any important factors? I just quickly put this post together so it's likely that I missed some good ones :)

  • http://twitter.com/vkornov Victor Kornov

    You’ve nailed it. Although I don’t pay much attention to “Location of code and issue tracker” as a lot of the time that’s a chore people simply hate to do. They simply stick to where they started it as most likely it still works for them.

    • http://davybrion.com Davy Brion

      true, but I do think it’s important… For instance: I think NHibernate kept their code on Sourceforge for too long (it was unbearably slow) and their JIRA was way too unreliable and slow. I was very glad to see that they’ve moved their repo to Github.  They still use JIRA (and it wouldn’t be a good idea to throw away all that history), but it’s now hosted by Atlassian. It doesn’t go down anymore, but it’s still quite slow.  I think that being on Sourceforge for so long, and having a slow/unreliable issue tracker did hurt the project in a few ways though, no matter how popular it still is (and rightfully so).

  • http://twitter.com/NotMyself Bobby Johnson

    I wonder if “available via nuget” will upset the “host on github” bias in the coming years.

    • http://davybrion.com Davy Brion

      It’s not really related. Github makes collaboration easier. Nuget makes installing/updating it easier.

  • Jason

    Funny you make the comment about GitHub. I’ve started to feel very much the same thing. (Even though I have a project at codeplex the source is at GitHub).

    I’ve found I tend to prefer projects that have a good suite of tests. Because if I have to modify the project I’d like to know that any assumptions made by the original developer aren’t lost by any tweaks the new guy makes.

  • http://twitter.com/DomRibaut Dominique Ribaut

    Nice post :-)
    All things being equal, I would prefer an “official” (OSS) library to a one-man work, for example using the Amazon client SDK to interface with S3. The API of the official library can even be a bit inferior.
    As for the location, SourceForge is a real turnoff and Github is a plus but not a requirement.
    Another thing that helps a lot, in my opinion, is a good “getting started” guide or examples. This gives a good feel of how the lib is meant to be used and can give good hints about where to look in the code/specs to find more advanced usages. I don’t think a 100% “read the tests” approach is enough :-)

  • http://twitter.com/Ucodia Lionel Ringenbach

    To get an idea of popularity I often type the name of the library on StackOverflow. This also gives a clue on what kind of issue I could bump into.

  • Moti

    Can you please explain (or reference) the GPL comment?

    • http://davybrion.com Davy Brion

      If I depend on a GPL licensed library, I can only release my code through a GPL compatible license, which I’m unwilling to do.

    • Luiz Felipe

      You cannot sell closed binary components that use it. I think you cannot even sell anything that uses it. It’s pretty useless for business; better to use a BSD or Apache license.

  • http://zsoldosp.blogspot.com/2010/10/evaluating-software-products.html Peter Zsoldos

    Davy, I would be quite interested in how much time you spend with the evaluation – I couldn’t figure that out from the post. Also, do you have any experience selecting/upgrading products/tools on a department/firm level (yes, I know that in many cases that would be a bad idea, but in some situations – e.g.: release/change management + audit app could be appropriate)? I’ve found lack of time a really limiting factor there (I’ve got a post with some observations about such a project linked in my profile)

    • http://davybrion.com Davy Brion

      could range from 30 minutes to a few hours at most… depends on how much choice there is, and how close the competition is I guess

      when it comes to making these choices on a department/firm level, other important factors come into play: 
      - availability of skills, or in case that is low, how long it takes to get people up to speed with it
      - availability of support, either commercial or through community/google/stackoverflow/etc
      - size of community (like the popularity thing but on a wider scale)
      - maturity of the library

      Last year we chose ASP.NET MVC3 and jQuery over other solutions, you can find more about the reasoning here: http://davybrion.com/blog/2011/03/why-were-going-with-html5-instead-of-silverlight/
      That wasn’t entirely a choice between open source and closed, but some of the items discussed (and listed in the spreadsheet) offer an idea why something like for example FubuMVC wasn’t selected

  • Daniel Marbach
    • http://davybrion.com Davy Brion

      oh, *love* that point about extension points! especially in case of .NET or Java, this can indeed be a very important factor

  • John

    I tend to prefer Google Code to github: the interface is cleaner, more responsive. Issues tracker is clean with a very responsive search.
    Github is trying too hard for fancy effects, and ends up confusing and more “in the way”, they’re slowly turning into another SourceForge (which is even more bloated and confusing).

  • Alireza

    I think being easy to use, or at least having good documentation, is a very important factor. Also, whether a library is used in well-known OSS projects and libraries has an enormous impact on users.

  • Anonymous Coward

    My criteria are slightly different: there should be few forks, or at least a few mainstream forks (many forks means there’s no variant  which is fit for many distinct use cases), there should be an active community built around it (i.e. active mailing list, with relevant messages), an easily usable mechanism for contributing should be in place, code should look good and exposed APIs should be friendly, and an Apache, Eclipse or similar license should be available for commercial use. I don’t care whether the code is on github, downloadable as an archive, in subversion at sourceforge or whatever. I don’t mind a slow sourceforge, since I only use released versions in projects, and these are mostly available as archives.

    I also tend to look for API documentation (a la Javadoc). If documentation is bad or outdated, and the library is huge, that’s a turnoff. For small libraries missing documentation is acceptable for me.

  • Meskena

    Good post!
    If you’re interested in code quality of many (167) opensource projects/frameworks you could take a look at nemo, SonarSource’s public SONAR instance: http://nemo.sonarsource.org/

  • http://www.facebook.com/andrea.endrizzi Andrea Endrizzi

    When I’m looking for a new library to adopt in my project, usually the first thing I look at is the date of the latest version … if it’s older than 2 years, I won’t take it into consideration. For example, log4java works fine, but its “stable” release is very old. For me it’s obsolete, and I’m looking forward. It’s probably not good practice, but I don’t want to start with a library that is no longer supported or developed, even if it works as intended.