Translating your DITA Project

A DITA-based project is usually stored either in a Content Management System (CMS) or in an open-source version control system such as Git or SVN. CMSs usually come with their own translation support, so this blog post is mostly for users who store and collaborate on their DITA project with Git or SVN.

Choosing a Translation Agency

Ideally your translation agency should be able to handle DITA content directly, without you needing to convert the DITA to some intermediary format. This means that you will have the full benefit of DITA reuse features to minimize translation costs.

As a very important rule, if you plan to translate your project, get in touch with a DITA-aware translation agency very early in the project's timeline. Reliable translation agencies that translate DITA content directly (for example WHP) usually need a preliminary discussion with you about how the project is structured, which terms must be skipped when translating, how units of measurement are translated, how content is reused, what taxonomy is used, and how screenshots in your DITA content are handled. The way you write your DITA content will therefore be influenced by this discussion.

If your translation agency does not handle DITA content directly, there are commercial tools, such as Fluenta, that can convert DITA to XLIFF: https://www.maxprograms.com/products/fluenta.html.

Optimizing Content for Translation

In general, there are three main principles to take into account when writing DITA content that will be translated at some point:
  1. Use a controlled vocabulary (usually the Simplified Technical English vocabulary).
  2. Avoid reusing inline elements other than product names. The following DITA Users List discussion describes the reasons for this: https://lists.oasis-open.org/archives/dita/201301/msg00029.html.
  3. Avoid profiling/filtering content at inline level. For the same reasons as (2).

General DITA Project Structure

Usually you keep one folder that contains all your DITA maps and topics in English, plus a separate folder for each other language that contains the equivalent DITA topics translated into that language. This article could be useful: https://www.maxprograms.com/articles/organize_files.html.
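With this kind of layout, it is easy to check whether a language folder is missing topics that exist in the English one. The following is only a minimal sketch: the folder names (en, de) and the file extensions are assumptions for illustration, and your project may be organized differently.

```python
from pathlib import Path

# Assumption: topics and maps use these extensions in your project.
DITA_SUFFIXES = {".dita", ".ditamap"}


def relative_dita_files(root: Path) -> set[str]:
    """Return DITA topic/map paths under root, relative to root."""
    return {
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_file() and p.suffix in DITA_SUFFIXES
    }


def missing_translations(source_dir: Path, target_dir: Path) -> list[str]:
    """List files present in the source folder but absent from the target folder."""
    return sorted(relative_dita_files(source_dir) - relative_dita_files(target_dir))
```

For example, missing_translations(Path("en"), Path("de")) would list the English topics that have no German counterpart yet, assuming both folders use the same relative paths.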

General Translation Workflow when the Translation Agency Accepts DITA Documents

When translating DITA content, the most common process involves these steps:
  1. You create your content in the primary language using a DITA authoring tool (such as Oxygen XML Editor).
  2. Before each release, you gather all the DITA topics that have been changed and need to be translated. The Oxygen Translation Package Builder plugin might be handy for this.
  3. You send a copy of the relevant DITA files to the translation agency (also known as a "localisation service provider").
  4. You receive the translated DITA content back from the translation agency and integrate it into each language-specific project folder.
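If the project lives in Git, step 2 can also be approximated from the command line. The sketch below is not how the Translation Package Builder plugin works; it simply asks Git which files were added or modified since a release tag and keeps the DITA files from the source-language folder. The en/ prefix and the tag name are hypothetical.

```python
import subprocess
from pathlib import PurePosixPath

# Assumption: topics and maps use these extensions in your project.
DITA_SUFFIXES = {".dita", ".ditamap"}


def select_dita_paths(changed_paths, source_prefix="en/"):
    """Keep only DITA topics/maps located under the source-language folder."""
    return sorted(
        path
        for path in changed_paths
        if path.startswith(source_prefix)
        and PurePosixPath(path).suffix in DITA_SUFFIXES
    )


def changed_since(tag, repo_dir="."):
    """Ask Git for files added or modified between `tag` and HEAD."""
    result = subprocess.run(
        ["git", "diff", "--name-only", "--diff-filter=AM", f"{tag}..HEAD"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()
```

Calling select_dita_paths(changed_since("release-1.0")) would then give the list of topics to send for translation, where release-1.0 stands for whatever tag marks your previous release.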

Translation Workflow when the Translation Agency Accepts XLIFF Files

XLIFF (XML Localization Interchange File Format) is an XML-based format created to standardize the way data is passed between tools during a localization process. If your translation agency accepts this format, the translation workflow usually has these steps:
  1. At various milestones (for example, when a new version is released), you generate XLIFF files for each language you translate to.
  2. You send the XLIFF file to the translation service provider.
  3. Once the XLIFF returns from translation, you generate a translated version of your map and topics from the XLIFF file.
Important: The Fluenta DITA Translation add-on can help with all of these steps.
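Before generating translated topics in step 3, it can be worth checking that the returned file has no untranslated segments. The sketch below assumes the file follows XLIFF 1.2 (trans-unit elements with source and target children); the files produced by your tool may use a different XLIFF version or structure, so treat it as an illustration only.

```python
import xml.etree.ElementTree as ET

# Assumption: the file uses the XLIFF 1.2 namespace.
XLIFF_NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}


def untranslated_units(xliff_text):
    """Return the ids of trans-units whose <target> is missing or empty."""
    root = ET.fromstring(xliff_text)
    missing = []
    for unit in root.iterfind(".//x:trans-unit", XLIFF_NS):
        target = unit.find("x:target", XLIFF_NS)
        # itertext() also collects text nested inside inline markup.
        if target is None or not "".join(target.itertext()).strip():
            missing.append(unit.get("id"))
    return missing
```

An empty list means every segment has some target text; it says nothing about the quality of the translation.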

Publishing your Translated Content

All your translated DITA maps and topics should have the xml:lang attribute set to the appropriate value on the root element. Besides the actual translated content, the published output may contain various static text (such as the word Table followed by the table number, Figure followed by the figure number, or Note appearing before the content of each DITA <note>). The DITA Open Toolkit includes support for various languages for HTML-based and PDF-based output. You can also add support for other languages: http://www.dita-ot.org/dev/topics/plugin-addgeneratedtext.html#ariaid-title1. There is also a specific topic that describes how to add a new language to the Oxygen-specific WebHelp Responsive output: https://www.oxygenxml.com/doc/ug-editor/topics/localize-webhelp-responsive.html.
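A translated file that comes back without the right xml:lang value is easy to miss, so a small check before publishing can help. This is a minimal sketch using Python's standard XML parser; it does not validate against the DITA DTDs, and the language codes shown are only examples.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def root_language(dita_text):
    """Return the xml:lang value on the root element, or None if it is absent."""
    return ET.fromstring(dita_text).get(XML_LANG)


def has_expected_language(dita_text, expected):
    """Compare the root xml:lang with the expected code, ignoring case."""
    actual = root_language(dita_text)
    return actual is not None and actual.lower() == expected.lower()
```

For instance, has_expected_language(text, "de-DE") could be run over every file in the German folder, flagging any topic or map where it returns False.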


So who is responsible when a bad translation, such as a set of mistranslated steps, causes damage to a client? From my discussions with translation service providers, translation agencies do not assume any liability for incorrectly translated content. Usually a company that needs to translate its DITA content into multiple languages has regional headquarters in various countries, and somebody from each regional headquarters is responsible for reviewing and accepting the translated content.

This concludes my DITA translation overview. As we do not translate the Oxygen User's Manual into other languages, our internal knowledge of translating DITA content is quite limited, so any feedback on this small article is welcome.