Multilingual data and NHibernate
Posted by Davy Brion on June 28th, 2007
Some applications need multilingual application data. Not just the user interface, but the actual data. There are different ways of handling this. Some people provide extra columns in their tables for translated values. Obviously, that’s a bad solution. Other people create an extra translations table for each entity that has translatable columns. Better, but all those extra translation tables really clutter the data model. I prefer to store all translations in one table. This can be cumbersome to implement, but the following approach with NHibernate actually makes it pretty easy to work with.
First, create the following tables and sequences (I’m using Oracle btw…):
CREATE TABLE languages ( languageid NUMBER NOT NULL PRIMARY KEY, "name" VARCHAR2(50) NOT NULL, isoname VARCHAR2(2) NOT NULL UNIQUE, isdefault NUMBER(1,0) NOT NULL, version NUMBER NOT NULL ); CREATE TABLE translations ( translationid NUMBER NOT NULL PRIMARY KEY ); CREATE TABLE translationvalues ( translationvalueid NUMBER NOT NULL, translationid NUMBER NOT NULL, languageid NUMBER NOT NULL, "value" VARCHAR2(4000) NULL, version NUMBER NOT NULL ); ALTER TABLE translationvalues ADD CONSTRAINT translationvalues_uc_1 UNIQUE (translationid, languageid); ALTER TABLE translationvalues ADD CONSTRAINT translationvalues_fk_1 FOREIGN KEY (languageid) REFERENCES languages (languageid); ALTER TABLE translationvalues ADD CONSTRAINT translationvalues_fk_2 FOREIGN KEY (translationid) REFERENCES translations (translationid); CREATE SEQUENCE sq_languages; CREATE SEQUENCE sq_translations; CREATE SEQUENCE sq_translationvalues;
The classes look like this:
and these are the mappings:
<class name="Language" table="languages">
<id name="Id" column="LanguageId" type="Int32">
<generator class="sequence">
<param name="sequence">sq_languages</param>
</generator>
</id>
<version column="version" name="Version"
unsaved-value="negative" generated="never"/>
<property name="Name" column="Name" type="String"/>
<property name="IsoName" column="IsoName" type="String"/>
<property name="IsDefault" column="IsDefault" type="Boolean"/>
</class>
<class name="Translation" table="translations">
<id name="Id" column="translationid" type="Int32">
<generator class="sequence">
<param name="sequence">sq_translations</param>
</generator>
</id>
<set name="TranslationValues" inverse="true" lazy="false"
fetch="join" cascade="all-delete-orphan">
<key column="TranslationId"/>
<one-to-many class="TranslationValue"/>
</set>
</class>
<class name="TranslationValue" table="translationvalues">
<id name="Id" column="TranslationValueId" type="Int32">
<generator class="sequence">
<param name="sequence">sq_translationvalues</param>
</generator>
</id>
<version column="version" name="Version"
unsaved-value="negative" generated="never"/>
<many-to-one name="Translation" class="Translation"
column="TranslationId" not-null="true"/>
<many-to-one name="Language" class="Language"
column="LanguageId" not-null="true"/>
<property name="Value" column="Value" type="String"/>
</class>
Alright, now it’s time to actually use this stuff… Suppose you have a Product entity and the Name of the Product has to be a translatable field. The Name property of the Product class would be mapped like this:
<many-to-one name="Name" class="Translation" column="NameTranslationId"/>
Creating a product with its translations is now as simple as this:
IList languages = session.CreateCriteria(typeof(Language)).List();
Product product = new Product();
product.Name = new Translation();
foreach (Language language in languages)
{
product.Name.AddTranslationValue(language, "Name in " + language.Name);
}
session.Save(product);
session.Save(product.Name);
session.Flush();
The Translation class has an indexer property which provides easy access to specific translation values so you could easily modify the translations like this:
product.Name["en"] = "English Name"; product.Name["nl"] = "Dutch Name";
This approach makes it really easy to work with multilingual data and NHibernate takes care of all the messy details for us


June 29th, 2007 at 9:44 am
Pretty nice implementation! Good data model too. The key difference in (uhum..) most data model implementations I have had to work with in the past is the fact that there _is_ a table ‘translations’ (I would ommit the s at the end of the name though). Most people, even the ‘Most Valuable ™’ ones tend to forget there is such a thing and just use a duplicate key in the translationvalue table with languageid and some TRanslationID out of the blue.
Your code would be even better if you didn’t have to instantiate product.Name yourself. I assume a product always has a name…
This just leaves the question of who/what is responsible for instantiating this product.Name. EntityFactory?, Product Constructor?, …
June 29th, 2007 at 10:22 am
Yup, definitely a valid question… I’d certainly avoid doing it in the default constructor because that’s the one NHibernate uses to create the persistent instances as well. So for every product you’d load from the database a new Translation object would be instantiated, only to have it overwritten with the instance that NHibernate retrieves from the database moments later. Obviously, the performance penalty of this is minimal but i always prefer to avoid unnecessary object instantiation.
The coolest solution would be if your inversion of control container would simply inject a new translation object through setter dependency injection. That way, objects that are instantiated by NHibernate don’t unnecessarily create new translation instances and the developer wouldn’t have to worry about creating the translation instance either. And it would also be completely configurable without touching code.
July 30th, 2007 at 2:48 pm
Thanks! Some time ago I was looking for a good approach for multilingual NHibernate data but the project was put on hold. I’ve now come back to this and your approach is the best I’ve seen.
However, I don’t have the luxury of being able to change the schema of my domain tables (changing nvarchars to int pointers). Since each of my domain tables uses a GUID as the PK, I’m thinking I can probably avoid the need for the TranslationID column.
CREATE TABLE translations
(
objectid uniqueidentifier NOT NULL PRIMARY KEY,
property nvarchar(100) NOT NULL
);
Just wondered if you have any thoughts on this before I embark on a POC?
July 30th, 2007 at 3:08 pm
in your situation i’d simply replace the TranslationID with a TranslationGUID column but keep everything else the same. The most important part is being able to define a many-to-one from your domain table to your translation table, and a one-to-many from the translation identifier (be it a numeric id or a GUID) to its values.
I tried a couple of different approaches, but the Translations-TranslationValues table combination was by far the easiest one and the cleanest one to get working properly. So i’d definitely keep those 2 tables around.
October 18th, 2007 at 4:55 pm
why the relationship: “many-to-one from your domain table to your translation table” and not a one-to-one? is this only so that records may share the same translation or is there something else i’m missing?
October 21st, 2007 at 5:47 pm
i haven’t really thought about using a one-to-one for it to be honest, i just always use many-to-ones to map foreign keys. Technically, it does make it possible to share the same translation in multiple places although i don’t really make sure that they are indeed shared instead of duplicated. That would definitely be an interesting approach though.
October 22nd, 2007 at 6:22 pm
Any ideas regarding how to get a sorted collection of your products based on a specific translation of the names? I really don’t enjoy i18n…
October 23rd, 2007 at 8:26 am
well, the easiest solution that comes to mind is to provide a CurrentLanguage property setter on your Translation class, and a TranslatedValueInCurrentLanguage (or a shorter name
) property getter. The TranslatedValueInCurrentLanguage would simply return the specific translation based on what you set the CurrentLanguage property to. You can then simply use DynamicComparer and use “TranslatedValueInCurrentLanguage” as the sort expression.
I haven’t tried this out yet, but it should work
December 9th, 2007 at 12:16 am
Hi,
Nice Idea, I am wondering if you can do this without calling “session.Save(product.Name);”, is it possible to make mapping handle calling save for translation objects automatically?
Thanks
December 9th, 2007 at 5:40 pm
yeah, setting the ‘cascade’ attribute to ‘all’ in Product’s many-to-one mapping to Translation should do the trick
January 27th, 2008 at 9:40 pm
Good post, but wouldn’t it be preferable to have the translation part invisible to the domain layer? Product.Name should just be a string, loaded from a separate table depending on the requested locale. Why load all the translations into the domain object? It’s unlikely that any one request is going to need or use those other locales at the entity level (rather, language choice is a change at the user or application level). I’m guessing this is needed to workaround the cache in the ORM — Product1(fr-FR) having the same identity as Product1(en-US). Is that the justification?
I, too, have implemented a single Text table with textid/iso as the key, but without the mapping table. In my case, we made the unique key by concatenation: type/id/property, and references it from the entity table. I’m puzzling through how to replicate this using NHibernate. Have you had a chance to use this in production? Is so, any thoughts on the implementation?
January 28th, 2008 at 10:47 am
I haven’t used this approach in production, but it is based on some multi-lingual requirements at a previous client. On each screen where multilingual strings could be edited, they wanted to be able to edit each translation at the same time. So that’s the justification for loading all the translations into the domain object. In my POC the Translation class’ ToString() method was overridden to display the translation for the user’s current locale though to make it easy to work with for display purposes.
I also tried to get it working without a mapping table but could not find a solution that worked… the mapping table just made it really easy to use standard nhibernate approaches, and from a data integrity point of view, i think it’s probably the most ‘correct’ approach.
Looking back on my post now, i would rename my original ‘Translation’ class to MultiLingualString and have it hold a collection of Translation objects (which would be the ‘TranslationValue’ instances as they are called in the post).
January 30th, 2008 at 4:04 pm
Nice approach, but how about searching on individual localized property? Said, for example, we have Product class with Name and Description localized, any idea of an efficient way when we wanted to search for products that have a specified term in Name or Description (but not both).
January 31st, 2008 at 8:36 am
You could use NHibernate’s Criteria API for that… i don’t have the above solution with me so i can’t experiment with it, but you can definitely do what you said in an efficient way using Criterias. Or you could use HQL… I’d basically try a few different ways, see what SQL gets generated every time and then i’d pick the most efficient one.
February 2nd, 2008 at 7:05 am
Thought you might be interested in this approach by way of comparison: http://www.nablasoft.com/Alkampfer/?p=41
I think the best that can be achieved using NHibernate is somewhere in the middle. A single EntityText table, with an EntityResource mapping table from the Entity tables. I don’t see a way around having to delegate to the resource bag from the property accessors.
March 11th, 2008 at 9:08 am
[...] Find more about it all here [...]
December 14th, 2009 at 11:51 am
Hi Davy,
My current project uses NHibernate and since it is my first big project, I run against a lot of questions including the multilanguage aspect.
I started with an approach similar to yours but after sitting together with a DB guy, he informed me that the single translation table might be a bad idea for scalability. Because all updates/inserts of new items that have translations lock that single table.
Currently I’m using a translation table per entity to avoid this.
What is your idea on performance?
Greets,
Geert
December 14th, 2009 at 12:12 pm
@Geert
it’s a trade-off obviously, if you have a lot of translatable entities and a high number of concurrent edits then yeah, a single table probably won’t be the best option… a translation table per entity would avoid that problem but introduces more work. But if it’s necessary, then that is probably the better way to go in your scenario
December 23rd, 2009 at 7:18 pm
I would rename these tables to Phrases and Translations. This makes many-to-one relationship between domain table ja Phrases table natural – Product Name is a Phrase and a Phrase can have many Translations.
But if you don’t share the translations for domain objects, then Phrases table becomes useless. It’s only there because of NHibernate. For example deleting your domain object wouldn’t cascade delete it’s translations. (Or it does in NHibernate? It certainly doesn’t at database level.)