The Inquisitive Coder – Davy Brion's Blog

Trying to walk that thin line between intelligence and ignorance

Multilingual data and NHibernate

Posted by Davy Brion on June 28th, 2007

Some applications need multilingual application data. Not just the user interface, but the actual data. There are different ways of handling this. Some people provide extra columns in their tables for translated values. Obviously, that’s a bad solution. Other people create an extra translations table for each entity that has translatable columns. Better, but all those extra translation tables really clutter the data model. I prefer to store all translations in one table. This can be cumbersome to implement, but the following approach with NHibernate actually makes it pretty easy to work with.

First, create the following tables and sequences (I’m using Oracle btw…):

CREATE TABLE languages
(
  languageid NUMBER NOT NULL PRIMARY KEY,
  "name" VARCHAR2(50) NOT NULL,
  isoname VARCHAR2(2) NOT NULL UNIQUE,
  isdefault NUMBER(1,0) NOT NULL,
  version NUMBER NOT NULL
);

CREATE TABLE translations
(
  translationid NUMBER NOT NULL PRIMARY KEY
);

CREATE TABLE translationvalues
(
  translationvalueid NUMBER NOT NULL,
  translationid NUMBER NOT NULL,
  languageid NUMBER NOT NULL,
  "value" VARCHAR2(4000) NULL,
  version NUMBER NOT NULL
);

ALTER TABLE translationvalues
  ADD CONSTRAINT translationvalues_uc_1
  UNIQUE (translationid, languageid);

ALTER TABLE translationvalues
  ADD CONSTRAINT translationvalues_fk_1
  FOREIGN KEY (languageid) REFERENCES languages (languageid);

ALTER TABLE translationvalues
  ADD CONSTRAINT translationvalues_fk_2
  FOREIGN KEY (translationid) REFERENCES translations (translationid);

CREATE SEQUENCE sq_languages;
CREATE SEQUENCE sq_translations;
CREATE SEQUENCE sq_translationvalues;

The classes look like this:

translations class diagram

and these are the mappings:

  <class name="Language" table="languages">
    <id name="Id" column="LanguageId" type="Int32">
      <generator class="sequence">
        <param name="sequence">sq_languages</param>
      </generator>
    </id>

    <version column="version" name="Version"
	unsaved-value="negative" generated="never"/>

    <property name="Name" column="Name" type="String"/>
    <property name="IsoName" column="IsoName" type="String"/>
    <property name="IsDefault" column="IsDefault" type="Boolean"/>
  </class>

  <class name="Translation" table="translations">
    <id name="Id" column="translationid" type="Int32">
      <generator class="sequence">
        <param name="sequence">sq_translations</param>
      </generator>
    </id>

    <set name="TranslationValues" inverse="true" lazy="false"
	fetch="join" cascade="all-delete-orphan">
      <key column="TranslationId"/>
      <one-to-many class="TranslationValue"/>
    </set>
  </class>

  <class name="TranslationValue" table="translationvalues">
    <id name="Id" column="TranslationValueId" type="Int32">
      <generator class="sequence">
        <param name="sequence">sq_translationvalues</param>
      </generator>
    </id>

    <version column="version" name="Version"
	unsaved-value="negative" generated="never"/>

    <many-to-one name="Translation" class="Translation"
	column="TranslationId" not-null="true"/>

    <many-to-one name="Language" class="Language"
	column="LanguageId" not-null="true"/>

    <property name="Value" column="Value" type="String"/>
  </class>

Alright, now it’s time to actually use this stuff… Suppose you have a Product entity and the Name of the Product has to be a translatable field. The Name property of the Product class would be mapped like this:

    <many-to-one name="Name" class="Translation" column="NameTranslationId"/>

Creating a product with its translations is now as simple as this:

   IList languages = session.CreateCriteria(typeof(Language)).List();

   Product product = new Product();
   product.Name = new Translation();

   foreach (Language language in languages)
   {
      product.Name.AddTranslationValue(language, "Name in " + language.Name);
   }

   session.Save(product);
   session.Save(product.Name);
   session.Flush();

The Translation class has an indexer property which provides easy access to specific translation values so you could easily modify the translations like this:

  product.Name["en"] = "English Name";
  product.Name["nl"] = "Dutch Name";

This approach makes it really easy to work with multilingual data and NHibernate takes care of all the messy details for us :)

19 Responses to “Multilingual data and NHibernate”

  1. den ben Says:

    Pretty nice implementation! Good data model too. The key difference in (uhum..) most data model implementations I have had to work with in the past is the fact that there _is_ a table ‘translations’ (I would ommit the s at the end of the name though). Most people, even the ‘Most Valuable ™’ ones tend to forget there is such a thing and just use a duplicate key in the translationvalue table with languageid and some TRanslationID out of the blue.

    Your code would be even better if you didn’t have to instantiate product.Name yourself. I assume a product always has a name…

    This just leaves the question of who/what is responsible for instantiating this product.Name. EntityFactory?, Product Constructor?, …

  2. Davy Brion Says:

    Yup, definitely a valid question… I’d certainly avoid doing it in the default constructor because that’s the one NHibernate uses to create the persistent instances as well. So for every product you’d load from the database a new Translation object would be instantiated, only to have it overwritten with the instance that NHibernate retrieves from the database moments later. Obviously, the performance penalty of this is minimal but i always prefer to avoid unnecessary object instantiation.

    The coolest solution would be if your inversion of control container would simply inject a new translation object through setter dependency injection. That way, objects that are instantiated by NHibernate don’t unnecessarily create new translation instances and the developer wouldn’t have to worry about creating the translation instance either. And it would also be completely configurable without touching code.

  3. Simon Jones Says:

    Thanks! Some time ago I was looking for a good approach for multilingual NHibernate data but the project was put on hold. I’ve now come back to this and your approach is the best I’ve seen.

    However, I don’t have the luxury of being able to change the schema of my domain tables (changing nvarchars to int pointers). Since each of my domain tables uses a GUID as the PK, I’m thinking I can probably avoid the need for the TranslationID column.

    CREATE TABLE translations
    (
    objectid uniqueidentifier NOT NULL PRIMARY KEY,
    property nvarchar(100) NOT NULL
    );

    Just wondered if you have any thoughts on this before I embark on a POC?

  4. Davy Brion Says:

    in your situation i’d simply replace the TranslationID with a TranslationGUID column but keep everything else the same. The most important part is being able to define a many-to-one from your domain table to your translation table, and a one-to-many from the translation identifier (be it a numeric id or a GUID) to its values.

    I tried a couple of different approaches, but the Translations-TranslationValues table combination was by far the easiest one and the cleanest one to get working properly. So i’d definitely keep those 2 tables around.

  5. bryan Says:

    why the relationship: “many-to-one from your domain table to your translation table” and not a one-to-one? is this only so that records may share the same translation or is there something else i’m missing?

  6. Davy Brion Says:

    i haven’t really thought about using a one-to-one for it to be honest, i just always use many-to-ones to map foreign keys. Technically, it does make it possible to share the same translation in multiple places although i don’t really make sure that they are indeed shared instead of duplicated. That would definitely be an interesting approach though.

  7. Bergius Says:

    Any ideas regarding how to get a sorted collection of your products based on a specific translation of the names? I really don’t enjoy i18n… :)

  8. Davy Brion Says:

    well, the easiest solution that comes to mind is to provide a CurrentLanguage property setter on your Translation class, and a TranslatedValueInCurrentLanguage (or a shorter name ;) ) property getter. The TranslatedValueInCurrentLanguage would simply return the specific translation based on what you set the CurrentLanguage property to. You can then simply use DynamicComparer and use “TranslatedValueInCurrentLanguage” as the sort expression.

    I haven’t tried this out yet, but it should work :)

  9. Hassan Says:

    Hi,
    Nice Idea, I am wondering if you can do this without calling “session.Save(product.Name);”, is it possible to make mapping handle calling save for translation objects automatically?

    Thanks

  10. Davy Brion Says:

    yeah, setting the ‘cascade’ attribute to ‘all’ in Product’s many-to-one mapping to Translation should do the trick

  11. Peter Says:

    Good post, but wouldn’t it be preferable to have the translation part invisible to the domain layer? Product.Name should just be a string, loaded from a separate table depending on the requested locale. Why load all the translations into the domain object? It’s unlikely that any one request is going to need or use those other locales at the entity level (rather, language choice is a change at the user or application level). I’m guessing this is needed to workaround the cache in the ORM — Product1(fr-FR) having the same identity as Product1(en-US). Is that the justification?

    I, too, have implemented a single Text table with textid/iso as the key, but without the mapping table. In my case, we made the unique key by concatenation: type/id/property, and references it from the entity table. I’m puzzling through how to replicate this using NHibernate. Have you had a chance to use this in production? Is so, any thoughts on the implementation?

  12. Davy Brion Says:

    I haven’t used this approach in production, but it is based on some multi-lingual requirements at a previous client. On each screen where multilingual strings could be edited, they wanted to be able to edit each translation at the same time. So that’s the justification for loading all the translations into the domain object. In my POC the Translation class’ ToString() method was overridden to display the translation for the user’s current locale though to make it easy to work with for display purposes.

    I also tried to get it working without a mapping table but could not find a solution that worked… the mapping table just made it really easy to use standard nhibernate approaches, and from a data integrity point of view, i think it’s probably the most ‘correct’ approach.

    Looking back on my post now, i would rename my original ‘Translation’ class to MultiLingualString and have it hold a collection of Translation objects (which would be the ‘TranslationValue’ instances as they are called in the post).

  13. Duc H Tran Says:

    Nice approach, but how about searching on individual localized property? Said, for example, we have Product class with Name and Description localized, any idea of an efficient way when we wanted to search for products that have a specified term in Name or Description (but not both).

  14. Davy Brion Says:

    You could use NHibernate’s Criteria API for that… i don’t have the above solution with me so i can’t experiment with it, but you can definitely do what you said in an efficient way using Criterias. Or you could use HQL… I’d basically try a few different ways, see what SQL gets generated every time and then i’d pick the most efficient one.

  15. Peter Says:

    Thought you might be interested in this approach by way of comparison: http://www.nablasoft.com/Alkampfer/?p=41

    I think the best that can be achieved using NHibernate is somewhere in the middle. A single EntityText table, with an EntityResource mapping table from the Entity tables. I don’t see a way around having to delegate to the resource bag from the property accessors.

  16. Border Crossing Stats » Multilingual data and NHibernate Davy Brion’s Blog Says:

    [...] Find more about it all here [...]

  17. Geert V Says:

    Hi Davy,

    My current project uses NHibernate and since it is my first big project, I run against a lot of questions including the multilanguage aspect.
    I started with an approach similar to yours but after sitting together with a DB guy, he informed me that the single translation table might be a bad idea for scalability. Because all updates/inserts of new items that have translations lock that single table.

    Currently I’m using a translation table per entity to avoid this.
    What is your idea on performance?

    Greets,

    Geert

  18. Davy Brion Says:

    @Geert

    it’s a trade-off obviously, if you have a lot of translatable entities and a high number of concurrent edits then yeah, a single table probably won’t be the best option… a translation table per entity would avoid that problem but introduces more work. But if it’s necessary, then that is probably the better way to go in your scenario

  19. Tambet Matiisen Says:

    I would rename these tables to Phrases and Translations. This makes many-to-one relationship between domain table ja Phrases table natural – Product Name is a Phrase and a Phrase can have many Translations.

    But if you don’t share the translations for domain objects, then Phrases table becomes useless. It’s only there because of NHibernate. For example deleting your domain object wouldn’t cascade delete it’s translations. (Or it does in NHibernate? It certainly doesn’t at database level.)

Leave a Reply

XHTML: You can use these tags: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>