Ruby

Performance Of NHibernate With Ruby Objects Compared To Traditional C# Objects

Written on October 19th, 2010
Categories: IronRuby, NHibernate, Performance, Ruby

I recently showed how you can use NHibernate to persist and query Ruby objects through IronRuby. We've continued the experiment (and have already made some big optimizations in the code based on the first results of these tests), and we recently had to decide whether the performance difference between using NHibernate with regular static C# code and using it with dynamic Ruby objects was acceptable. So we ran a set of tests and compared all of the numbers. Note that we don't claim that these benchmarks are scientifically correct in any way, but we do think they give us a good idea of what we can reasonably expect. I want to share the results with you, and would appreciate any feedback you guys have on this... particularly on whether we missed something obvious in our tests and whether we should trust these numbers. After all, we're not professional benchmarkers so our approach might very well just suck :)

We have a scenario which consists of 15 'actions'. For these actions, we use some tables from the Chinook database, basically just Artist/Album/Track/Genre/MediaType. The actions are the following:

  1. Retrieve single track without joins, and access each non-reference property
  2. Retrieve single track with joins, and access all properties, including references
  3. Retrieve single track without joins, and access all properties, including references (triggers lazy-loading)
  4. Create and persist object graph: one artist with two albums with 13 tracks each
  5. Retrieve created artist from nr 4, add a new album with another 13 tracks, change the title of the first album from nr 4, and remove the second album from nr 4 including its tracks
  6. Retrieve created artist from nr 4 and delete its entire graph
  7. Create a single track
  8. Retrieve single track from step 7 and update it
  9. Retrieve single track from step 7 and update the name of one of its referenced properties
  10. Retrieve single track from step 7 and change one of the reference properties so it references a different instance
  11. Delete the track from step 7
  12. Retrieve 100 tracks and access each non-reference property
  13. Retrieve 200 tracks and access each non-reference property
  14. Retrieve 100 tracks without joins and access all properties, including references (triggers lazy-loading)
  15. Retrieve 100 tracks with joins and access all properties, including references

Note: when i say we access reference properties to trigger lazy loading, i mean that we access a non-id property of the referenced object to make sure it indeed hits the database.

The scenario is run 500 times with regular C# objects, and 500 times with Ruby objects. We keep track of the average time of each action in the scenario, as well as the total duration of the scenario. Also, keep in mind that we ran these tests on a local database.
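To make the measuring approach concrete, here is a minimal sketch of such a harness in plain Ruby. It is not the code we actually used: the method names, the two placeholder actions and the run count are assumptions made up for illustration, and the real actions obviously talk to NHibernate instead of sorting arrays.

```ruby
# Hypothetical harness: names and structure are illustrative, not the original test code.
def average_action_times(actions, runs)
  totals = Hash.new(0.0)
  runs.times do
    actions.each do |name, action|
      started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
      action.call
      totals[name] += Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    end
  end
  # Convert the accumulated seconds to an average in milliseconds per action.
  totals.transform_values { |seconds| seconds * 1000.0 / runs }
end

# Relative penalty of the dynamic run compared to the static baseline, in percent.
def penalty_percent(static_ms, dynamic_ms)
  (dynamic_ms - static_ms) / static_ms * 100.0
end

# Placeholder actions standing in for the 15 actions of the scenario.
actions = {
  "sum"  => -> { (1..1_000).sum },
  "sort" => -> { (1..1_000).to_a.reverse.sort }
}

averages = average_action_times(actions, 50)
averages.each { |name, ms| puts format("%-5s %.4f ms", name, ms) }
```

Running the same set of actions once per implementation and feeding both averages to penalty_percent gives the kind of percentage difference discussed in this post.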

The following graph shows the average duration of each action in milliseconds on the Y axis, and the number of the action on the X axis:


Before i discuss these results, i'd also like to show the following graph which shows the average difference in milliseconds between the static and the dynamic execution of each action:

Two actions immediately stand out: the last two which both deal with fetching a set of items and accessing all of their properties. They're both about 6ms slower than their static counterparts, which is a performance penalty of 71% for action 14, and 87% for action 15. That deals with a part of code that we can't really optimize any more. Well, it probably is possible but we've already done a lot of work on that, and this is the best we can come up with so far.

Now, those 2 actions are things we avoid as much as possible in real code anyway, so maybe they aren't that big of an issue. The other 2 actions where there is a noticeable difference (though it actually means an increase in average execution time of 1.1ms using a local database) are the creation and persistence of an object graph (step 4), and the retrieval/modification/persistence of that same graph (step 5). Most other actions don't have a noticeable difference, and in some cases the dynamic version is actually faster than the static one, no doubt because NHibernate has in some cases less work to do when using the Map EntityMode (which we rely on for the dynamic stuff) compared to the Poco EntityMode.

We also wanted to see whether the performance difference would get worse when spreading the workload evenly over a set of threads, or even a 'pool' of IronRuby engines. I was pretty happy to see that it didn't really lead to a noticeable difference.

The following graph shows the average duration of the entire scenario in a couple of different situations:

I do have to mention that the numbers shown in this graph aren't averages, but the result from running the scenario once in each situation. We did however run the scenario in each situation more than once, and while we didn't list the averages, the numbers are representative of each test run... we didn't see any really noticeable differences over multiple runs. The percentage difference for each situation is shown in this graph:

As you can see, the performance penalty of the entire scenario in each situation varies between 15% and 26%.

Now, considering the fact that we prefer to avoid loading 'large' sets of data through NHibernate into entities (we prefer to use projections instead for that) we wanted to see what the difference would be for the entire duration of the scenario in each situation, without the final 4 actions. Basically, just the typical CRUD scenarios:

Now the difference varies between 6% and 15%.

Now, suppose that we have a compelling reason to actually go ahead with using this approach (we do actually, but i'm not gonna get into that here), do you think we can trust these numbers? Is there anything else we're missing? Are we complete idiots for testing the performance difference like this? Do you have any feedback whatsoever? Then please leave a comment :)

Why You Don’t Need Dependency Injection In Ruby

Written on October 11th, 2010
Categories: Code Quality, Dependency Injection, Opinions, Ruby

If you're new to Ruby and you're coming from static languages like C# or Java, you'll probably wonder why there isn't much interest in Dependency Injection in the Ruby community. The answer is quite simple: because you don't need it. Now, that's not to say that Dependency Injection isn't a valuable technique in your toolbox. In fact, if you're doing C# or Java i'd even go as far as saying it's absolutely necessary to use Dependency Injection in most of your code. Two of the biggest reasons (i know there are more, but let's focus on these for now) why Dependency Injection is important if you're using a static language are these:

  1. Highly increased testability because you can control the dependencies during automated tests
  2. Lowered coupling between classes which enables you to change implementations of dependencies at will (granted, not a lot of people actually do that often but it certainly is a real benefit)

In Ruby however, you don't really need dependency injection to achieve the 2 benefits mentioned above as i hope the following contrived example shows. Suppose we have the following 2 classes.

class Dependency
  def do_something_with(some_object)
    p some_object
  end
end

class SomeClass
  def work_your_magic_on(something)
    Dependency.new.do_something_with something
  end
end

This is no good. The work_your_magic_on method from SomeClass directly instantiates a new instance of the Dependency class. During automated tests, we could actually replace the implementation of the new method of the Dependency class to return a stub or a mock if we want to instead of an instance of the real thing. But we could never easily change the implementation of the dependency that SomeClass requires to function properly in real production code without screwing up everything else that also happens to depend on the Dependency class.

If you're coming from a static language, you'd probably be inclined to change SomeClass to this:

class SomeClass
  def initialize(dependency)
    @dependency = dependency
  end
  
  def work_your_magic_on(something)
    @dependency.do_something_with something
  end
end

Ahh, that's much better. The dependency is now injected in SomeClass' initializer method and we can very easily achieve the 2 benefits mentioned above by passing whatever we want to each instance of SomeClass, as long as it has a do_something_with method. The biggest downside however is that every consumer of SomeClass instances now needs to know about the dependencies that it requires to function properly. This becomes painful very quickly, because using Dependency Injection throughout your codebase leads to having to satisfy the dependencies of the dependencies of the dependencies of the dependencies of the class you actually need to use. Before long, you need a good Inversion Of Control Container to handle all of these dependencies for you. There's just one problem: there doesn't seem to be a widely used IOC Container in Ruby. Which in itself tells you that it's simply not needed in Ruby.
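To illustrate how that chain of dependencies grows, here is a small sketch. The Clock and AuditLog classes are hypothetical and only exist to give Dependency some dependencies of its own; the point is the last few lines, where the consumer has to assemble the whole graph by hand.

```ruby
# Hypothetical classes, only here to illustrate a chain of constructor arguments.
class Clock
  def now
    Time.now
  end
end

class AuditLog
  def initialize(clock)
    @clock = clock
  end

  def record(message)
    "#{@clock.now.year}: #{message}"
  end
end

class Dependency
  def initialize(audit_log)
    @audit_log = audit_log
  end

  def do_something_with(some_object)
    @audit_log.record(some_object.to_s)
  end
end

class SomeClass
  def initialize(dependency)
    @dependency = dependency
  end

  def work_your_magic_on(something)
    @dependency.do_something_with something
  end
end

# Every consumer of SomeClass now has to know about Dependency, AuditLog and Clock.
object = SomeClass.new(Dependency.new(AuditLog.new(Clock.new)))
result = object.work_your_magic_on("some important string")
puts result
```

With only three levels this is still manageable, but it shows why people reach for a container as soon as the graph gets deeper.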

There's a better way to modify SomeClass:

class SomeClass
  def work_your_magic_on(something)
    get_dependency.do_something_with something
  end
  
  private
  
  def get_dependency
    @dependency = Dependency.new if @dependency.nil?
    @dependency
  end
end

Now, the ALT.NET fanbois will still tell you how wrong this is because SomeClass still has a direct dependency on the Dependency class. I should know because i was one of them. And again, in C# or Java i'd definitely agree that this code is bad. Not so in Ruby however because i can easily replace the actual implementation of SomeClass' dependency in both automated tests and actual production code, without impacting anything else that uses the Dependency class.

Suppose we have the following test:

  def test_magic
    object = SomeClass.new
    object.work_your_magic_on "some important string"
    # TODO: assert that the string was passed to the dependency correctly
  end

Since i don't provide the dependency to the object that i'm testing, i can't really verify that the dependency was used correctly. I'm also not using any mocking framework since i want to show how the language itself takes away the need to inject your dependencies. Given the following spy class:

class DependencySpy
  @@passed_in_objects = []
  
  def self.passed_in_objects
    @@passed_in_objects
  end
  
  def do_something_with(some_object)
    @@passed_in_objects << some_object
  end
end

I could now write my test like this:

  def test_magic
    object = SomeClass.new
    def object.get_dependency
      DependencySpy.new
    end
    object.work_your_magic_on "some important string"
    assert DependencySpy.passed_in_objects.include? "some important string"
  end

And it'll pass. The 'trick' is that i simply change the implementation of the get_dependency method for the instance that i have. This doesn't change anything at the class level, merely at the instance level. Technically, i don't really change the implementation of get_dependency in SomeClass, i merely insert a method in this particular instance which will precede the one in SomeClass during Ruby's method lookup procedure.
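The following self-contained sketch shows that the inserted method really only exists on that one instance. FakeDependency and the returned strings are made up for the example, and get_dependency is written with `||=` for brevity; other than that it follows the classes shown earlier.

```ruby
class Dependency
  def do_something_with(some_object)
    "real: #{some_object}"
  end
end

# Hypothetical replacement, only used by the patched instance below.
class FakeDependency
  def do_something_with(some_object)
    "fake: #{some_object}"
  end
end

class SomeClass
  def work_your_magic_on(something)
    get_dependency.do_something_with something
  end

  private

  def get_dependency
    @dependency ||= Dependency.new
  end
end

patched = SomeClass.new
untouched = SomeClass.new

# Defines a singleton method on this one object; SomeClass itself is not modified.
def patched.get_dependency
  FakeDependency.new
end

patched_result = patched.work_your_magic_on("x")     # "fake: x"
untouched_result = untouched.work_your_magic_on("x") # "real: x"

puts patched_result
puts untouched_result
puts patched.singleton_methods.inspect   # [:get_dependency]
puts untouched.singleton_methods.inspect # []
```

Method lookup checks the singleton class of an object before the object's regular class, which is why the patched instance never reaches the get_dependency defined in SomeClass.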

Also, i would like to point out that this kind of testing isn't exactly something that i'd encourage you to do, though this technique can be useful if you really need to do interaction testing. But it's a good illustration of why you don't really need to do Dependency Injection in Ruby to write testable code.

Now you might be wondering, what's the difference between doing something like this and using a tool like TypeMock in .NET to basically achieve the same thing? Well, when writing code like this in C# and testing it with TypeMock, you achieve one of the benefits that you could have with using Dependency Injection: being able to control the dependencies. But you can't change the implementation of the dependency at runtime in normal production code. In Ruby, with the approach outlined above, i can still easily achieve that like this:

class SomeClass
  
  private
  
  def get_dependency
    @dependency = SomeOtherDependency.new if @dependency.nil?
    @dependency
  end
  
end

If this code is executed after the earlier definition of SomeClass, it will reopen SomeClass and replace the implementation of the get_dependency method for every instance of SomeClass: the ones that will be created later, as well as existing ones that haven't cached their dependency in @dependency yet. This effectively gives you the ability to change the implementation of a dependency at runtime in production code, without having to use Dependency Injection. Now, some Dependency Injection purists will still claim that this approach is bad because SomeClass knows which implementation of the dependency it uses. And my question to those people is: so what? I can easily change it in any situation i'd run into. You can also consider the presence of the actual type of the dependency as the default implementation to use, without having to force the requirement of this knowledge on consumers.
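Here is a runnable sketch of what reopening the class does to instances in different states. The returned strings and the variable names are assumptions for the example. Note the one caveat it demonstrates: an instance that already stored its dependency in @dependency keeps using that one.

```ruby
class Dependency
  def do_something_with(some_object)
    "default: #{some_object}"
  end
end

class SomeOtherDependency
  def do_something_with(some_object)
    "other: #{some_object}"
  end
end

class SomeClass
  def work_your_magic_on(something)
    get_dependency.do_something_with something
  end

  private

  def get_dependency
    @dependency = Dependency.new if @dependency.nil?
    @dependency
  end
end

unused_before = SomeClass.new            # created before the patch, never used
used_before = SomeClass.new
used_before.work_your_magic_on("x")      # caches a Dependency in @dependency

# Reopen SomeClass and replace get_dependency.
class SomeClass
  private

  def get_dependency
    @dependency = SomeOtherDependency.new if @dependency.nil?
    @dependency
  end
end

unused_result = unused_before.work_your_magic_on("x") # "other: x"
cached_result = used_before.work_your_magic_on("x")   # "default: x", already cached
new_result = SomeClass.new.work_your_magic_on("x")    # "other: x"

puts unused_result, cached_result, new_result
```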

There is still one situation where i would probably use Dependency Injection in Ruby though, and that is when you want to benefit from what i consider to be yet another great reason to use Dependency Injection in static languages:

  • Not having to know anything about the lifecycle of your dependencies

In this case, it's probably much easier to just inject a long-living dependency in an object with a shorter lifecycle. However, if the dependency is basically a singleton (which is still the default for many of the .NET IOC Containers), then i actually would consider implementing the singleton as a class with nothing but class methods (similar to static methods in static languages, but not quite since you can still change them whenever you want) and having my other classes that depend on it call those methods directly, or through helper methods that i can still change when i need to.
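A minimal sketch of that last idea, with hypothetical Settings and Greeter classes: the 'singleton' is just a class with class methods, the consumer reaches it through a private helper method, and the class method can still be replaced at runtime.

```ruby
# Hypothetical singleton-like class: nothing but class methods.
class Settings
  def self.fetch(key)
    { "greeting" => "hello" }.fetch(key)
  end
end

# Hypothetical consumer that goes through a helper method.
class Greeter
  def greet(name)
    "#{greeting}, #{name}"
  end

  private

  # The only place that knows where the greeting comes from.
  def greeting
    Settings.fetch("greeting")
  end
end

before = Greeter.new.greet("world") # "hello, world"

# Replace the class method at runtime; every caller sees the new behavior.
def Settings.fetch(key)
  "hi"
end

after = Greeter.new.greet("world")  # "hi, world"

puts before
puts after
```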

I'm sure many people will disagree with some of the points i try to make in this post, but until i get some actual real-world reasons that invalidate my points, i simply don't see the point in sticking to a set of rules and guidelines that were largely made up out of necessity to deal with shortcomings of static languages. That's not to say that dynamic languages don't have any shortcomings or drawbacks, but it does mean that the rules and guidelines of how to write good code are, well, simply different. And as such, it wouldn't be wise to blindly stick with rules that were made for a different way of programming. Question what you already know, because it might not be relevant to what you need to do now.

Monkey Patching FTW!

Written on October 4th, 2010
Categories: IronRuby, Ruby

Today, i had to get a CI build of an IronRuby project that a coworker and i have been working on up and running. We have a TeamCity server and i'm pretty familiar with it, at least as far as our .NET projects are concerned. But this is the first project we're using IronRuby on, and for now it's exclusively Ruby code that has to run on IronRuby. Our requirements of the CI build are very simple: check out the latest version of the code from Subversion, run the tests and make sure we can consult the test results from the TeamCity web interface. That's it. How hard can that be, right?

The thing is... we're not using an official IronRuby version. I basically get the latest code from IronRuby's GitHub repository from time to time, build it, and we use that. I've included all of the necessary files into our subversion repository so we can just refer to the correct IronRuby version with relative paths. And no, it's not because we're trying to be cool or hardcore, it's because we depend on a fix that has been implemented in IronRuby already but that isn't present in one of the releases. So my coworker sent me a link from the TeamCity documentation that mentioned that you could just use a Rakefile with IronRuby. Easy peasy! Well, except for the fact that it would require me to install the IronRuby build that we happen to be using on the build agents, and that i'd have to update it whenever i update the IronRuby binaries that we're using. Not exactly an approach i'd prefer.

So i was already thinking along the lines of "great, we're gonna have to write yet another custom test runner to report the test results back to TeamCity". We did it back when nobody cared about writing tests for Silverlight code, so i guess we could do it again. But i wasn't exactly looking forward to it. And then my coworker said "why not just monkey patch the test runner so it outputs the results in the format that TeamCity can understand?". And he was right. There's no reason whatsoever not to use a monkey patch to get out of this bind.

The final result is a pretty minimal amount of code that didn't take long to write which gets the results we need. Granted, i lost some time because at first i was monkey patching Test::Unit's console TestRunner only to find out that it's not really being used anymore if you're on Ruby 1.9... it's been replaced with MiniTest, which unfortunately (yet understandably) trades clean code for runtime performance. If Test::Unit's console TestRunner was used, the final result would've been less than 20 lines of code in total. Now, it's a bit more but it's still pretty minimal.

First of all, it's important to know the format that TeamCity can understand from your custom build output. You can find all you need to know about that here. Once you know the expected format, the solution is actually pretty easy: change the behavior of the testrunner at runtime so that it formats the output in a way that TeamCity can do something with it instead of its regular output. Turns out i could limit my monkey patch to just one of MiniTest's classes, that being the MiniTest::Unit class. First of all, we need to add some helper methods that we can use to take care of some of TeamCity's formatting requirements:

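  # Wraps a message in the service message syntax that TeamCity picks up from the build output
  def tc_output(string)
    tc_string = "##teamcity[#{string}]"
    puts tc_string
    tc_string
  end

  # Escapes the characters that have a special meaning inside a TeamCity service message.
  # The pipe is TeamCity's escape character, so it has to be escaped first.
  def tc_escape(string)
    string
      .gsub("|", "||")
      .gsub("'", "|'")
      .gsub("\n", "|n")
      .gsub("\r", "|r")
      .gsub("]", "|]")
  end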
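To see what the escaping does to an actual failure message, here is a standalone version you can run outside of MiniTest. tc_escape is copied from above; tc_message is a hypothetical variant of tc_output that only builds the string, and the sample message is made up.

```ruby
def tc_escape(string)
  string
    .gsub("|", "||")
    .gsub("'", "|'")
    .gsub("\n", "|n")
    .gsub("\r", "|r")
    .gsub("]", "|]")
end

# Hypothetical helper: same format as tc_output, but without printing.
def tc_message(string)
  "##teamcity[#{string}]"
end

raw = "expected [1] but it's |2|\nline two"
escaped = tc_escape(raw)
message = tc_message("testFailed name='test_magic' message='#{escaped}'")

puts escaped # expected [1|] but it|'s ||2|||nline two
puts message
```

The order of the gsub calls matters: if the pipe were escaped last, the pipes that the other replacements insert would get doubled as well.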

With those methods added to the MiniTest::Unit class, we can now modify the behavior of 2 methods of this class to get the result that we want and need. First up, is the puke method, and no, i'm not joking... the method is actually called 'puke':

  def puke(klass, method, error)
    error = case error
      when MiniTest::Skip then
        @skips += 1
        tc_output "testIgnored name='#{method}' message='test ignored'"
      when MiniTest::Assertion then
        @failures += 1
        trace = MiniTest::filter_backtrace(error.backtrace).join("\n")
        tc_output "testFailed name='#{method}' message='#{tc_escape(error.message)}' details='#{tc_escape(trace)}'"
      else
        @errors += 1
        trace = MiniTest::filter_backtrace(error.backtrace).join("\n")
        tc_output "testFailed name='#{method}' message='#{tc_escape(error.message)}' details='#{tc_escape(trace)}'"
    end
      
    error[0,1]  
  end

This method is called by MiniTest whenever a test has failed... ignoring (or skipping in the MiniTest terminology) a test is a 'failure' (and i can't really argue with that). And obviously, both assertion failures and runtime exceptions are considered to be test failures as well. In any of these 3 cases, the puke method is called and it is supposed to output something to the user to notify him/her of the problems. So i basically just took the existing code, and modified it so its output would be in the format that TeamCity can work with. Next up is the run_test_suites method, which is responsible for, you guessed it, running the tests in the various test suites.

  def run_test_suites(filter=/./)
    @test_count, @assertion_count = 0, 0
    old_sync, @@out.sync = @@out.sync, true if @@out.respond_to? :sync=
    TestCase.test_suites.each do |suite|
      tc_output "testSuiteStarted name='#{suite}'"
      suite.test_methods.grep(filter).each do |test|
        inst = suite.new test
        inst._assertions = 0
        tc_output "testStarted name='#{test}'"
        @start_time = Time.now
        result = inst.run(self)
        duration = "%f" % ((Time.now - @start_time)*1000)
        tc_output "testFinished name='#{test}' duration='#{duration}'"
        @test_count += 1
        @assertion_count += inst._assertions
      end
    end
    @@out.sync = old_sync if @@out.respond_to? :sync=
    [@test_count, @assertion_count]
  end

Again, i just took the existing code and changed its output so that TeamCity can work with it.
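If you've never done this before, the mechanics boil down to the sketch below. TinyRunner is a made-up stand-in for a third-party test runner (it is not MiniTest's API), so treat it purely as an illustration of reopening a class that is already loaded and replacing one of its methods, while keeping the old one around under an alias.

```ruby
# Stand-in for a third-party test runner; hypothetical, not MiniTest's real API.
class TinyRunner
  attr_reader :lines

  def initialize
    @lines = []
  end

  def report_failure(test_name, message)
    @lines << "FAIL #{test_name}: #{message}"
  end
end

# The monkey patch: reopen the class after it has been loaded.
class TinyRunner
  # Keep the original implementation reachable under another name.
  alias_method :plain_report_failure, :report_failure

  def report_failure(test_name, message)
    @lines << "##teamcity[testFailed name='#{test_name}' message='#{message}']"
  end
end

runner = TinyRunner.new
runner.report_failure("test_magic", "boom")
runner.plain_report_failure("test_magic", "boom")

puts runner.lines
```

As long as the file with the patch is required after the runner itself and before the tests run, everything that calls report_failure gets the TeamCity format.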

And the final result is this:

As you can see, build #2 didn't give you any feedback on the tests, even though they were being executed properly. Build #3 reported 2 failing tests, which were my temporary test cases to see how failed assertions or actual errors would be reported by TeamCity. Build #4 reports that all tests passed. In case you're interested, our 'build script' looks like this:

..\ironruby\bin\dotnet\ir -w tests\suite.rb

And that's it... pretty simple, no?

That just goes to show that while monkey patching is considered by a lot of people to be 'evil', it certainly has its benefits from time to time. I'm not saying you should use it as much as possible. But when it makes sense to do so, and if you're aware of the downsides and the pitfalls, then there's nothing wrong with it at all. Though it does require a language that treats you like an adult and expects you to know what you're doing ;)

Using NHibernate To Persist And Query Ruby Objects

Written on September 23rd, 2010
Categories: NHibernate, Ruby

As some of you already know, i've been experimenting with getting NHibernate and Ruby (through IronRuby) to play nice together. In this post, i'll go over what already works and how i got it working.

Suppose we have the following 2 NHibernate mappings:

  <class entity-name="Artist">
    <id name="id" column="ArtistId" type="int">
      <generator class="identity"/>
    </id>

    <property name="name" length="50" type="string" />

    <bag name="albums" cascade="all-delete-orphan" inverse="true" >
      <key column="ArtistId"/>
      <one-to-many class="Album" />
    </bag>
  </class>

  <class entity-name="Album">
    <id name="id" column="AlbumId" type="int">
      <generator class="identity"/>
    </id>

    <property name="title" length="50" type="string" not-null="true" />
    <many-to-one name="artist" column="ArtistId" not-null="true" class="Artist" />
  </class>

And suppose we have the following 2 classes:

class Artist
  attr_accessor :id, :name, :albums
  
  def initialize
    self.albums = System::Collections::ArrayList.new
  end
  
  def add_album(album)
    self.albums.add(album)
    album.artist = self
  end
  
  def remove_album(album)
    self.albums.remove(album)
    album.artist = nil
  end
end

class Album
  attr_accessor :id, :title, :artist
end

The only atypical thing about that Ruby code is the usage of System::Collections::ArrayList. That's something i haven't been able to work around yet: if you want to use collections, you'll need to use the .NET ones for now.

I'm relying on 2 things to get everything working. One is NHibernate's Map EntityMode, the other is my own Ruby magic which i'll cover later. The important thing to know is that the Map EntityMode basically works without classes, but with dictionaries. Instead of instances of entity classes, NHibernate will return or accept dictionaries where the keys correspond to property names and the values correspond to their respective property's value. Though the goal was that the developer need not use the dictionaries directly, as the above 2 Ruby classes show. I'll get into the details of the Ruby magic later on in this post, but for now it's important to know that there's an ObjectFactory class which takes care of transforming the dictionaries that i get from NHibernate to either real instances of entity classes, or proxies of them.
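To give you a rough idea of what turning such a dictionary into an entity instance involves, here is a deliberately naive sketch in plain Ruby. It is not the actual ObjectFactory: the HashToEntity module and its build method are hypothetical, it works on a regular Ruby Hash instead of the .NET dictionary that NHibernate hands out, and it ignores proxies and collections entirely.

```ruby
class Album
  attr_accessor :id, :title, :artist
end

# Hypothetical helper, not the ObjectFactory discussed in this post.
module HashToEntity
  # entity_class: the Ruby class to instantiate
  # attributes:   Hash of property name => property value
  def self.build(entity_class, attributes)
    entity = entity_class.new
    attributes.each do |name, value|
      setter = "#{name}="
      # Silently skip keys that the entity has no writer for.
      entity.public_send(setter, value) if entity.respond_to?(setter)
    end
    entity
  end
end

album = HashToEntity.build(Album, { "id" => 7, "title" => "Evil Empire", "unknown" => 1 })

puts album.id             # 7
puts album.title          # Evil Empire
puts album.artist.inspect # nil
```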

First, let's take a look at transitive persistence:

    using (var session = sessionFactory.OpenSession())
    {
        var artist = ruby.Artist.@new();
        artist.name = "Rage Against The Machine";

        var album1 = ruby.Album.@new();
        album1.title = "Rage Against The Machine";

        var album2 = ruby.Album.@new();
        album2.title = "Evil Empire";

        artist.add_album(album1);
        artist.add_album(album2);

        session.Save("Artist", artist);

        session.Flush();

        artistId = artist.id();
    }

The output of running that code is this:

NHibernate: INSERT INTO Artist (name) VALUES (@p0); select SCOPE_IDENTITY();@p0 = 'Rage Against The Machine' [Type: String (50)]
NHibernate: INSERT INTO Album (title, ArtistId) VALUES (@p0, @p1); select SCOPE_IDENTITY();@p0 = 'Rage Against The Machine' [Type: String (50)], @p1 = 355 [Type: Int32 (0)]
NHibernate: INSERT INTO Album (title, ArtistId) VALUES (@p0, @p1); select SCOPE_IDENTITY();@p0 = 'Evil Empire' [Type: String (50)], @p1 = 355 [Type: Int32 (0)]

As you can see, transitive persistence is working nicely, even with collections. Now let's see how we can retrieve that data from the database and into our Ruby objects. First i need to show the following 2 helper methods for displaying the data:

    private static void PrintArtistData(dynamic artist)
    {
        Console.WriteLine("Artist: " + artist.name());
        PrintAlbumData(artist.albums());
    }

    private static void PrintAlbumData(dynamic albums)
    {
        foreach (dynamic album in albums)
        {
            Console.WriteLine("\tAlbum: " + album.title());
        }
        Console.WriteLine();
    }

Now we can get the artist we just created with a simple call to session.Get:

    using (var session = sessionFactory.OpenSession())
    {
        dynamic artist = ruby.ObjectFactory.create_from_nhibernate_hash(session.Get("Artist", artistId));
        Console.WriteLine("display output from session.Get");
        PrintArtistData(artist);
    }

And here's the output of that in the console:

NHibernate: SELECT artist0_.ArtistId as ArtistId0_0_, artist0_.name as name0_0_ FROM Artist artist0_ WHERE artist0_.ArtistId=@p0;@p0 = 355 [Type: Int32 (0)]
display output from session.Get
Artist: Rage Against The Machine
NHibernate: SELECT albums0_.ArtistId as ArtistId1_, albums0_.AlbumId as AlbumId1_, albums0_.AlbumId as AlbumId1_0_, albums0_.title as title1_0_, albums0_.ArtistId as ArtistId1_0_ FROM Album albums0_ WHERE albums0_.ArtistId=@p0;@p0 = 355 [Type: Int32 (0)]
        Album: Rage Against The Machine
        Album: Evil Empire

As you can see, the lazy loading of the albums collection works just as you'd expect it to. Speaking of lazy-loading, we can do the same thing with a call to session.Load instead of session.Get:

    using (var session = sessionFactory.OpenSession())
    {
        dynamic artist = ruby.ObjectFactory.create_proxy_from_nhibernate_hash(session.Load("Artist", artistId), "Artist", artistId);
        Console.WriteLine("display output from session.Load");
        PrintArtistData(artist);
    }

As you may or may not know, session.Load returns a proxy of an entity instead of actually fetching it from the database immediately (unless the instance is already in the session cache, which my current ruby code can't handle yet). NHibernate doesn't hit the database until you access any of the properties of the entity outside of the identifier, which the output of this code clearly shows:

display output from session.Load
NHibernate: SELECT artist0_.ArtistId as ArtistId0_0_, artist0_.name as name0_0_ FROM Artist artist0_ WHERE artist0_.ArtistId=@p0;@p0 = 355 [Type: Int32 (0)]
Artist: Rage Against The Machine
NHibernate: SELECT albums0_.ArtistId as ArtistId1_, albums0_.AlbumId as AlbumId1_, albums0_.AlbumId as AlbumId1_0_, albums0_.title as title1_0_, albums0_.ArtistId as ArtistId1_0_ FROM Album albums0_ WHERE albums0_.ArtistId=@p0;@p0 = 355 [Type: Int32 (0)]
        Album: Rage Against The Machine
        Album: Evil Empire

Notice that the select statement is output right before we access the name of the artist, instead of immediately as in the previous example.
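The general idea behind such a proxy can be sketched in a few lines of Ruby. This LazyProxy class is hypothetical and much simpler than what happens behind create_proxy_from_nhibernate_hash: the block stands in for the database round trip, and the counter only exists to show when that round trip happens.

```ruby
# Simple stand-in for an entity class.
Artist = Struct.new(:id, :name)

# Hypothetical proxy: knows the identifier, loads everything else on first use.
class LazyProxy
  attr_reader :id

  def initialize(id, &loader)
    @id = id
    @loader = loader
    @target = nil
  end

  # Anything we don't know ourselves is forwarded to the real object,
  # which is loaded on first use.
  def method_missing(name, *args, &block)
    target = load_target
    return super unless target.respond_to?(name)
    target.public_send(name, *args, &block)
  end

  def respond_to_missing?(name, include_private = false)
    load_target.respond_to?(name, include_private)
  end

  private

  def load_target
    @target ||= @loader.call(@id)
  end
end

loads = 0
proxy = LazyProxy.new(355) do |id|
  loads += 1 # counts the simulated database hits
  Artist.new(id, "Rage Against The Machine")
end

proxy.id                 # the identifier is known up front, nothing is loaded
loads_after_id = loads
name = proxy.name        # first real property access triggers the load
proxy.name               # second access reuses the loaded object

puts loads_after_id # 0
puts name           # Rage Against The Machine
puts loads          # 1
```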

We've got lazy-loading covered, but what about eager loading? Well, take a look at the following code:

    using (var session = sessionFactory.OpenSession())
    {
        var artistHash = session.CreateCriteria("Artist")
            .Add(Restrictions.IdEq(artistId))
            .SetFetchMode("albums", FetchMode.Join)
            .List()[0];

        dynamic artist = ruby.ObjectFactory.create_from_nhibernate_hash(artistHash);
        Console.WriteLine("display output from session.CreateCriteria without any lazy loading");
        PrintArtistData(artist);
    }

This fetches our artist and immediately joins its albums in the same query. When we access the albums of the artist, it no longer needs to go to the database:

NHibernate: SELECT this_.ArtistId as ArtistId0_1_, this_.name as name0_1_, albums2_.ArtistId as ArtistId3_, albums2_.AlbumId as AlbumId3_, albums2_.AlbumId as AlbumId1_0_, albums2_.title as title1_0_, albums2_.ArtistId as ArtistId1_0_ FROM Artist this_ left outer join Album albums2_ on this_.ArtistId=albums2_.ArtistId WHERE this_.ArtistId = @p0;@p0 = 355 [Type: Int32 (0)]
display output from session.CreateCriteria without any lazy loading
Artist: Rage Against The Machine
        Album: Rage Against The Machine
        Album: Evil Empire

Obviously, if we omit setting the fetchmode of the albums association we get the same output as we would get from using session.Get:

    using (var session = sessionFactory.OpenSession())
    {
        var artistHash = session.CreateCriteria("Artist")
            .Add(Restrictions.IdEq(artistId))
            .List()[0];

        dynamic artist = ruby.ObjectFactory.create_from_nhibernate_hash(artistHash);
        Console.WriteLine("display output from session.CreateCriteria with lazy loading of albums");
        PrintArtistData(artist);
    }

Which results in the following output:

NHibernate: SELECT this_.ArtistId as ArtistId0_0_, this_.name as name0_0_ FROM Artist this_ WHERE this_.ArtistId = @p0;@p0 = 355 [Type: Int32 (0)]
display output from session.CreateCriteria with lazy loading of albums
Artist: Rage Against The Machine
NHibernate: SELECT albums0_.ArtistId as ArtistId1_, albums0_.AlbumId as AlbumId1_, albums0_.AlbumId as AlbumId1_0_, albums0_.title as title1_0_, albums0_.ArtistId as ArtistId1_0_ FROM Album albums0_ WHERE albums0_.ArtistId=@p0;@p0 = 355 [Type: Int32 (0)]
        Album: Rage Against The Machine
        Album: Evil Empire

Eager fetching also works in the other direction, when fetching albums with their artist included automatically:

    using (var session = sessionFactory.OpenSession())
    {
        var albumsList = session.CreateCriteria("Album")
            .CreateAlias("artist", "a", JoinType.InnerJoin)
            .SetMaxResults(5)
            .List();

        dynamic albums = ruby.ObjectFactory.create_multiple_from_nhibernate_list(albumsList);

        foreach (dynamic album in albums)
        {
            Console.WriteLine(string.Format("'{0}' by '{1}'", album.title(), album.artist().name()));
        }
    }

This results in the following output:

NHibernate: SELECT TOP (@p0) this_.AlbumId as AlbumId1_1_, this_.title as title1_1_, this_.ArtistId as ArtistId1_1_, a1_.ArtistId as ArtistId0_0_, a1_.name as name0_0_ FROM Album this_ inner join Artist a1_ on this_.ArtistId=a1_.ArtistId;@p0 = 5 [Type: Int32 (0)]
'For Those About To Rock We Salute You 2' by 'Accept'
'Balls to the Wall' by 'Accept'
'Restless and Wild' by 'Accept'
'Let There Be Rock' by 'AC/DC'
'Big Ones' by 'Aerosmith'

Finally, we'll retrieve our artist and modify some of its data:

    using (var session = sessionFactory.OpenSession())
    {
        dynamic artist = ruby.ObjectFactory.create_from_nhibernate_hash(session.Get("Artist", artistId));

        artist.name = "RATM";
        artist.albums()[1].title = "The Battle Of Los Angeles";

        artist.remove_album(artist.albums()[0]);

        dynamic newAlbum = ruby.Album.@new();
        newAlbum.title = "Renegades";
        artist.add_album(newAlbum);

        session.Flush();
    }

If we then run the following code again:

    using (var session = sessionFactory.OpenSession())
    {
        dynamic artist = ruby.ObjectFactory.create_from_nhibernate_hash(session.Get("Artist", artistId));
        Console.WriteLine("display output from session.Get");
        PrintArtistData(artist);
    }

We can see that the data has indeed been changed as it should:

NHibernate: SELECT artist0_.ArtistId as ArtistId0_0_, artist0_.name as name0_0_ FROM Artist artist0_ WHERE artist0_.ArtistId=@p0;@p0 = 355 [Type: Int32 (0)]
display output from session.Get
Artist: RATM
NHibernate: SELECT albums0_.ArtistId as ArtistId1_, albums0_.AlbumId as AlbumId1_, albums0_.AlbumId as AlbumId1_0_, albums0_.title as title1_0_, albums0_.ArtistId as ArtistId1_0_ FROM Album albums0_ WHERE albums0_.ArtistId=@p0;@p0 = 355 [Type: Int32 (0)]
        Album: The Battle Of Los Angeles
        Album: Renegades

Ok, so how does this all work? After all, NHibernate returns and expects dictionaries and as you can see in the code of the ruby classes, there are no dictionaries being used. The answer is actually pretty simple. NHibernate returns and expects dictionaries. I return and expect entity instances. Clearly, all we need to do is make sure that our entities pretend to be dictionaries and NHibernate will never need to know what on earth we're doing.

The first thing we need to do is to modify the implementation of the ruby classes that we have created for our entities. Obviously, i wouldn't want anyone to have to do that manually, so my ruby magic just does this at runtime. The only limit that is placed on the code you write in ruby is that within the entity classes, you can never touch the private instance fields of the attributes that you've defined. You always have to go through the accessors. Because of that limit, i can just replace all of the accessor methods with implementations that use the dictionary that NHibernate gives me as the backing store of the data instead of using instance fields. I also make sure that all equality checks are based on the underlying dictionary instead of the actual object. This passes everything but a straight-up reference check. Finally, we need to make sure that our objects can be cast to an IDictionary and that we implement the indexer property of the IDictionary interface because NHibernate will use that when we pass it transient instances to insert into the database.

First, let's take a look at the ObjectFactory class, which has a couple of class methods that we use from our .NET code to create entities based on the dictionaries that we get from NHibernate:

class ObjectFactory
  def self.create_from_nhibernate_hash(hashtable)
    entity_name = hashtable[NHibernator::TYPE_KEY_NAME.to_clr_string]
    entity = const_get(entity_name.to_sym).new
    entity.hydrate_from hashtable
    entity    
  end
  
  def self.create_proxy_from_nhibernate_hash(hashtable, entity_name, id)
    proxy = const_get("#{entity_name}Proxy".to_sym).new
    proxy.hydrate_from hashtable, id
    proxy
  end
  
  def self.create_multiple_from_nhibernate_list(list)
    entities = []
    # TODO: differentiate between proxies and normal entities in the list
    list.each { |hash| entities << create_from_nhibernate_hash(hash) }
    entities
  end
end

(as you can see from the TODO statement, this whole thing is still a work in progress)

Pretty simple stuff so far... We either create a new instance of the entity class, or of a proxy class for that entity type (i'll cover the creation of proxy classes soon). We then call its hydrate_from method, which is also added to each entity class dynamically. There's another (temporary) limitation here... i search for the class name constant in Object, which means that our current approach doesn't work when our entities have namespaces. That's not really a problem for this example, and it's easy to add later on when i actually need it. That's it for the ObjectFactory... the real magic is all contained in the NHibernator module. And no, i couldn't come up with a better name. Long-time readers should know by now that i absolutely suck at coming up with good names so that's why we ended up with the NHibernator module.
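For what it's worth, the namespace limitation could probably be lifted by walking the name one segment at a time. This is only a sketch in plain Ruby: the resolve_entity_class helper, the Music module and the MediaType class are made up for illustration, and it assumes that entity names would arrive separated by "::", which is not something the code above guarantees.

```ruby
# Hypothetical helper: resolves a (possibly namespaced) class name by
# looking up each segment inside the previous one, starting at Object.
def resolve_entity_class(entity_name)
  entity_name.to_s.split("::").inject(Object) do |scope, segment|
    scope.const_get(segment)
  end
end

# Made-up classes, only here so the lookup has something to find.
module Music
  class Artist; end
end

class MediaType; end

puts resolve_entity_class("Music::Artist")
puts resolve_entity_class("MediaType")
```

A name without a namespace still goes through the same path, so the existing entities would keep working.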

The NHibernator module does 2 things: it offers a method that you need to use when initializing your application so we can create the proxy classes based on NHibernate's metadata, and it also modifies the accessor methods and adds some new methods whenever it is mixed in to another class. I'm going to show the code of the NHibernator module in multiple steps to hopefully keep everything as clear as possible. First of all, i'm gonna show the declaration of a constant and a simple helper method that we're going to need:

  TYPE_KEY_NAME = "$type$"

  def self.each_writeable_accessor_of(klass, &block)
    setters = klass.public_instance_methods(true).select { |name| name =~ /\w=$/ }
    setters.each { |setter| yield setter }
  end  

The TYPE_KEY_NAME constant contains the string that NHibernate uses as the key in its dictionaries for the value which returns the current entity's type name. And the each_writeable_accessor_of method executes the given block for each writeable accessor that a class contains.
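To see what that helper actually picks up, here's a small standalone sketch in plain Ruby. The AccessorHelper module and the Genre class are hypothetical and only exist for this illustration; the helper body is the same idea as above, with a to_s added so it behaves the same whether method names come back as strings or symbols.

```ruby
# Same idea as the helper above, wrapped in a module so it runs on its own.
module AccessorHelper
  def self.each_writeable_accessor_of(klass)
    setters = klass.public_instance_methods(true).select { |name| name.to_s =~ /\w=$/ }
    setters.each { |setter| yield setter }
  end
end

# Hypothetical entity: two read/write attributes and one read-only attribute.
class Genre
  attr_accessor :name, :tracks
  attr_reader :id
end

found = []
AccessorHelper.each_writeable_accessor_of(Genre) { |setter| found << setter.to_s }
puts found.sort.inspect
```

The regular expression requires a word character right before the trailing equals sign, so operators such as == or []= are skipped, and the read-only id attribute is left alone because it has no setter.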

And this is how we initialize everything:

  def self.initialize(session_factory)
    all_class_metadata = session_factory.get_all_class_metadata
    
    all_class_metadata.keys.each do |key|
      metadata = all_class_metadata[key]
      realclass = Object::const_get(key)
      realclass.send :include, NHibernator
      create_proxy_class_for realclass, metadata.identifier_property_name
    end
  end
  
  def self.create_proxy_class_for(klass, identifier_name)
    proxyclass = Class.new(klass)
    Object::const_set "#{klass.name}Proxy", proxyclass
    
    proxyclass.class_eval do
      define_method identifier_name do
        @id
      end
      
      def hydrate_from(hashtable, id)
        @nhibernate_values = hashtable
        @id = id
      end
    end
    
    each_writeable_accessor_of(klass) do |setter|
      proxyclass.class_eval do
        define_method setter do |value|
          # execute the getter to force NH's lazy proxymap to fetch the data
          send setter.to_s.chop
          super value
        end
      end
    end
  end

The initialize class method takes an NHibernate ISessionFactory instance and retrieves the metadata of each mapped entity. Each mapped entity's class is sent the include message with the NHibernator module as a parameter, which mixes the functionality of the NHibernator module into each entity's class. I'll discuss this in the next part of the post. After we've mixed the module into the entity classes, we call the create_proxy_class_for method for each class. As you can see, creating the proxy classes is very easy stuff. Any proxy class that we create inherits from the class of the entity, and overrides the identifier's accessor method so that it immediately returns the identifier value. If we had kept the default implementation, it would access the dictionary that we got from NHibernate, which would cause a select statement for this proxy to be issued, which we obviously don't want. The setters are overridden as well: they call the getter first, so that NHibernate's lazy proxy map fetches its data before we write to it. Again, this is a work in progress and one limitation that this current proxy implementation has is that you'll get a reference to a dictionary instead of an entity when you access a reference-property of a proxy. That too will be easy to fix :)
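The proxy idea can be tried without NHibernate at all. The following is a rough sketch in plain Ruby, not the real thing: LazyMap is a made-up stand-in for NHibernate's lazy proxy map that simply counts reads (each read is where the real map could hit the database), and the Artist class here is a simplified, hypothetical entity.

```ruby
# Stand-in for the lazy proxy map: counts how often it is read.
class LazyMap
  attr_reader :loads

  def initialize(row)
    @row = row
    @loads = 0
  end

  def [](key)
    @loads += 1
    @row[key]
  end

  def []=(key, value)
    @row[key] = value
  end
end

# Simplified entity whose accessors all go through the backing map.
class Artist
  def artist_id; @values["artist_id"]; end
  def name; @values["name"]; end
  def name=(value); @values["name"] = value; end
end

# Proxy subclass built at runtime, in the spirit of create_proxy_class_for.
ArtistProxy = Class.new(Artist) do
  # answer from the id we were given, without touching the map
  define_method(:artist_id) { @id }

  def hydrate_from(values, id)
    @values = values
    @id = id
  end

  define_method(:name=) do |value|
    name          # read first, so the lazy map loads before we write
    super(value)  # define_method requires super's arguments to be explicit
  end
end

map = LazyMap.new({ "artist_id" => 355, "name" => "Rage Against The Machine" })
proxy = ArtistProxy.new
proxy.hydrate_from(map, 355)

proxy.artist_id
loads_after_id = map.loads
proxy.name
loads_after_name = map.loads
puts "reads after asking for the id: #{loads_after_id}"
puts "reads after asking for the name: #{loads_after_name}"
```

Asking the proxy for its identifier leaves the read counter untouched, while asking for the name is what triggers the first read.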

Next up, we need to cover what happens when the NHibernator module is mixed into an entity class. Ruby has a great hook method for that, which is this:

  def self.included(base)
    # everything you do within this method will be executed whenever
    # this module is included in a class... the base parameter is
    # the class that included the module
    
    # ...
  end

I'm doing quite a bit within that method and i want to cover each item in detail. So, the next couple of pieces of code are all part of the self.included(base) method implementation. The first thing we do when this module gets included in a class is this:

    each_writeable_accessor_of(base) do |setter|
      getter = setter.to_s.chop
      
      base.class_eval do
        undef_method getter
        undef_method setter
        
        define_method getter do
          return nil if @nhibernate_values.nil?
          value = @nhibernate_values[getter.to_clr_string]
          return value unless value.is_a? System::Collections::IEnumerable
          # TODO: cache the WrappedList instance
          WrappedList.new(value)
        end
        
        define_method setter do |value|
          @nhibernate_values = System::Collections::Hashtable.new if @nhibernate_values.nil?
          @nhibernate_values[setter.to_s.chop.to_clr_string] = value
        end
      end
    end

This is pretty simple, we're just getting rid of all of the original accessor methods and replacing them with our own implementations that use the dictionary we get from NHibernate as the backing store. Note that i will discuss the WrappedList class that you see in those getters soon. The setter methods will also instantiate a new Hashtable if we don't already have a dictionary. This is necessary for transient instances since NHibernate will treat them as IDictionary instances when we pass them to the session. Speaking of which, this is the next thing we do:

    base.send :include, System::Collections::IDictionary

This single line enables any piece of .NET code to cast our instances to an IDictionary reference. Note that we haven't even implemented any of the IDictionary interface's methods yet. We don't need to implement all of them anyway, just the ones that we know will be used.
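None of this really depends on .NET, so the combination of the hook, the replaced accessors and an indexer can be tried in plain Ruby as well. This is a minimal sketch under a few assumptions: a regular Ruby Hash stands in for the Hashtable, the DictionaryBacked module name is made up, and only the class's own accessors are replaced.

```ruby
module DictionaryBacked
  # Ruby calls this hook whenever the module is included in a class.
  def self.included(base)
    setters = base.public_instance_methods(false).select { |name| name.to_s =~ /\w=$/ }

    setters.each do |setter|
      getter = setter.to_s.chop

      base.class_eval do
        # drop the original accessors that use instance fields
        undef_method getter
        undef_method setter

        define_method(getter) do
          @values.nil? ? nil : @values[getter]
        end

        define_method(setter) do |value|
          @values ||= {}   # transient instance: create the store on first write
          @values[getter] = value
        end
      end
    end

    base.class_eval do
      # indexer that a dictionary-style caller would use
      define_method(:[]) { |key| send(key) }
      define_method(:[]=) { |key, value| send("#{key}=", value) }
    end
  end
end

class Album
  attr_accessor :title
  include DictionaryBacked   # has to come after the accessors are declared
end

album = Album.new
album.title = "Renegades"
puts album["title"]
puts album.instance_variable_get(:@title).inspect
```

The second line of output shows that the original instance field is never written anymore: the value only lives in the backing hash.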

Finally, we add all of the following methods to each class that included this module:

    base.class_eval do
      def nhibernate_values
        @nhibernate_values
      end

      def hydrate_from(hashtable)
        @nhibernate_values = hashtable
        referenced_entities = Hash.new

        hashtable.keys.each do |key|
          value = hashtable[key.to_clr_string]
                    
          if value.is_a? System::Collections::IDictionary
            if value.is_a? System::Collections::Hashtable
              referenced_entity = ObjectFactory.create_from_nhibernate_hash(value)
            else
              type = value.hibernate_lazy_initializer.entity_name
              id = value.hibernate_lazy_initializer.identifier
              referenced_entity = ObjectFactory.create_proxy_from_nhibernate_hash(value, type, id)
            end
          
            referenced_entities[key] = referenced_entity
          end
        end

        referenced_entities.keys.each { |key| send "#{key}=", referenced_entities[key] }
      end
      
      def Equals(other)
        self == other
      end

      def GetHashCode
        hash
      end
            
      def ==(other)
        return false if other.nil?
        return @nhibernate_values.Equals(other) if other.is_a? System::Collections::IDictionary
        return false unless other.respond_to? :nhibernate_values
        other.nhibernate_values.Equals(@nhibernate_values)        
      end
            
      def hash
        @nhibernate_values.GetHashCode()
      end
                                                
      def [](key)
        self.send key
      end 

      def []=(key, value)
        self.send "#{key}=", value
      end      
    end

I think that code speaks for itself, except for the Equals and GetHashCode methods... those are just there because i had some issues with IronRuby mapping calls to Equals or GetHashCode to their corresponding ruby alternatives (== and hash). I didn't get correct results with IronRuby 1.1 alpha 1, so i eventually upgraded to the latest IronRuby revision from GitHub to get the equality checks working correctly.
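The equality rules are easier to see in isolation. Here's a small sketch in plain Ruby, with a hypothetical Entity class and a regular Hash as the backing store. One assumption to be aware of: a Ruby Hash compares by content with ==, so the sketch uses equal? to compare the backing stores by identity instead.

```ruby
class Entity
  attr_reader :store

  def initialize(store)
    @store = store
  end

  def ==(other)
    return false if other.nil?
    # comparing against the raw dictionary itself
    return @store.equal?(other) if other.is_a?(Hash)
    return false unless other.respond_to?(:store)
    # two wrappers are equal when they wrap the very same dictionary
    other.store.equal?(@store)
  end
  alias eql? ==

  def hash
    # derive the hash from the identity of the backing dictionary
    @store.object_id.hash
  end
end

row = { "name" => "RATM" }
first = Entity.new(row)
second = Entity.new(row)
copy = Entity.new(row.dup)

puts first == second        # same backing dictionary
puts first.equal?(second)   # but not the same object
puts first == copy          # different dictionary, even with equal content
```

Two wrappers around the same dictionary compare as equal and produce the same hash, and only the straight-up reference check can tell them apart.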

Finally, i needed the following 2 helper classes to make the albums bag work correctly:

class WrappedList
  include System::Collections::IList
  
  def initialize(list)
    @list = list  
  end
  
  def each(&block)
    @list.each do |item|
      if item.respond_to? :nhibernate_values
        yield item
      else
        yield ObjectFactory.create_from_nhibernate_hash(item)
      end
    end
  end

  def GetEnumerator
    WrappedListEnumerator.new(self)
  end

  def add(item)
    @list.add item
  end
  
  def clear
    @list.clear
  end
  
  def contains(item)
    @list.contains item
  end
  
  def count
    @list.count
  end
  
  def remove(item)
    original_count = count
    if item.respond_to? :nhibernate_values
      @list.remove item.nhibernate_values
    else
      @list.remove item
    end
    original_count != count
  end
  
  def is_read_only
    @list.is_read_only
  end
  
  def index_of(item)
    @list.index_of item
  end
  
  def insert(index, item)
    @list.insert index, item
  end
  
  def remove_at(index)
    @list.remove_at index
  end
  
  def [](index)
    item = @list[index]
    return item if item.respond_to? :nhibernate_values
    ObjectFactory.create_from_nhibernate_hash(item)
  end
  
  def []=(index, item)
    @list[index] = item
  end
  
  def Equals(other)
    self == other
  end

  def GetHashCode
    hash
  end
            
  def ==(other)
    return false if other.nil?
    @list.Equals(other)
  end
        
  def hash
    @list.GetHashCode()
  end
end

class WrappedListEnumerator
  include System::Collections::IEnumerator
  
  def initialize(wrappedlist)
    @wrappedlist = wrappedlist
    reset
  end
  
  def reset
    @current_index = -1
  end
  
  def current
    @wrappedlist[@current_index]
  end
  
  def move_next
    @current_index += 1
    return false if @current_index >= @wrappedlist.count
    true
  end
end
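As a rough illustration of what the wrapper does, here is a trimmed-down sketch in plain Ruby. It is not the IronRuby version: Ruby's Enumerable replaces the IList and IEnumerator plumbing, LazyWrappedList and Track are hypothetical names, and the rows are made-up sample data.

```ruby
# Hypothetical entity that wraps a raw row.
class Track
  attr_reader :row

  def initialize(row)
    @row = row
  end

  def name
    @row["name"]
  end
end

class LazyWrappedList
  include Enumerable

  def initialize(rows)
    @rows = rows
  end

  # hand out entities, wrapping raw rows on the way out
  def each
    @rows.each { |item| yield wrap(item) }
  end

  def [](index)
    wrap(@rows[index])
  end

  def size
    @rows.size
  end

  # accepts an entity or a raw row, and reports whether anything was removed
  def remove(item)
    raw = item.respond_to?(:row) ? item.row : item
    !@rows.delete(raw).nil?
  end

  private

  def wrap(item)
    item.respond_to?(:row) ? item : Track.new(item)
  end
end

rows = [{ "name" => "Track A" }, { "name" => "Track B" }]
tracks = LazyWrappedList.new(rows)
puts tracks.map(&:name).inspect
removed = tracks.remove(tracks[0])
puts "removed: #{removed}, rows left: #{rows.size}"
```

The underlying list keeps holding raw rows the whole time. Entities are only created when something reads from the wrapper, and removing an entity unwraps it again before touching the list.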

And that's all there is to it. This is probably the longest blog post i've ever written, but the amount of code involved in getting this working really isn't that much. Granted, there are still limitations to this approach so some stuff will need to be added to it. I'm also not saying that this is actually a great idea or that you should start doing this from now on, but well, at least this is possible now :)

Mad Scientist At Work

2 comments. Written on September 8th, 2010
Categories: IronRuby, NHibernate, Ruby

Just wanted to show some code, and yes, it actually works :)

    using (var session = sessionFactory.OpenSession())
    {
        var artist = ruby.ObjectFactory.CreateArtist();
        artist.Name = "some name";
        session.Save("Artist", artist);
        session.Flush();
        // and this actually prints out the id of the artist (in this case, it's defined as an identity in sql server)
        Console.WriteLine(artist.ArtistId());
    }

and

    using (var session = sessionFactory.OpenSession())
    {
        var albumId = 1;
        var artistId = 1;

        var album = ruby.ObjectFactory.create_from_nhibernate_hash(session.Get("Album", albumId));
        // the following line does NOT issue a select statement
        Console.WriteLine(album.Artist().ArtistId());
        // but this one obviously will trigger the lazy loading of the Artist instance
        Console.WriteLine(album.Artist().Name());

        var realArtist = ruby.ObjectFactory.create_from_nhibernate_hash(session.Get("Artist", artistId));

        album.Title = album.Title() + " 2";
        realArtist.Name = realArtist.Name() + " 2";

        if (album.Artist() == realArtist && album.Artist().Name() == realArtist.Name())
        {
            // this actually persists the changes
            session.Flush();
        }
        else
        {
            throw new InvalidOperationException("The universe should collapse");
        }
    }

I can't get property syntax working properly (as in: without requiring the parentheses) when accessing mapped properties due to an issue with IronRuby that needs to be fixed first, but other than that i'm pretty happy with how this works :)

Full details will be posted once i've taken care of a few more things ;)