By Peter Fournier
“Plan early, plan thoroughly”

Planning for translation

Joann Hackos, Grande Dame of DITA, claimed in 2011 that just switching to DITA could save 20% on translation costs. A competent Content Management System could easily save another 20% (Joann Hackos’ Keynote on DITA and Translation Management).

If you need translation or localization, DITA seems like a slam dunk decision. Might as well get started with DITA in English and develop the translation part later, yes?

No, don’t do that. You really must plan to include translation in your DITA workflow from the start, or you risk failing to deliver the cost savings promised at the start of the DITA migration project.


Basic issues

The translation and localization industry has migrated to its own XML standards. There are several:

  • XLIFF (XML Localization Interchange File Format)
  • TMX (Translation Memory eXchange)
  • TBX (TermBase eXchange)
  • SRX (Segmentation Rules eXchange)
  • ITS (Internationalization Tag Set)
  • and no doubt others …

All of these standards can play well with DITA, but they can also play very badly, depending on the tools, workflow, CMS or CCMS, and outside suppliers you use in your implementation of DITA plus translation.

This diagram shows some of the complexities involved in an integrated workflow.

Without translation the diagram looks more like this:

So, is this complexity of translation good news or bad news? It’s excellent news with proper planning. What are the steps required to achieve excellent results?


Step One: It’s been done before …

DITA plus translation has been done before. Seek out others who have already implemented the transition. You will find that there are many ways to approach the problem. Some of the ways will be similar to your situation, others won’t be, but the more people you talk to the more ideas you’ll have to work with.


Step Two: Talk to translation companies

Before talking with a translation company, you need to prepare a sample of your content, in DITA, that represents the full scope of the content you will want translated. To keep all conversations on track you likely want to use the same sample you use for testing Stilo Migrate and OptimizeR.

Talk to several companies. You will find each one has different procedures, tools, and content specialties. You are looking for the company with the best fit with DITA and your content, and with the capacity for the expected translation load. To minimize the technology load at your end you will most likely want to deal with a company that can handle DITA as the input to its translation workflow.

However, some translation companies require XLIFF files as input to the process. If XLIFF is required, ask the company about handling the conversion from DITA to XLIFF and back again. If you must generate XLIFF internally, there are tools, like Fluenta, that may handle the conversion reliably in both directions.
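To make the DITA to XLIFF idea concrete, here is a minimal sketch of the outbound half only, using Python's standard library. It is not how Fluenta or any translation supplier does it. The function name dita_to_xliff, the choice of XLIFF 1.2, and the decision to extract only title, shortdesc, and p elements are my assumptions for illustration. Inline markup is flattened to plain text, which a real tool would preserve.

```python
import xml.etree.ElementTree as ET

# Namespace of XLIFF 1.2 (the version is an assumption; ask your supplier).
XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"

# Hypothetical choice of which DITA elements count as translatable units.
TRANSLATABLE = ("title", "shortdesc", "p")


def dita_to_xliff(dita_xml, source_lang="en", target_lang="fr",
                  original="topic.dita"):
    """Wrap the text of selected DITA elements in XLIFF trans-units."""
    topic = ET.fromstring(dita_xml)
    ET.register_namespace("", XLIFF_NS)

    xliff = ET.Element(f"{{{XLIFF_NS}}}xliff", {"version": "1.2"})
    file_el = ET.SubElement(xliff, f"{{{XLIFF_NS}}}file", {
        "original": original,
        "source-language": source_lang,
        "target-language": target_lang,
        "datatype": "xml",
    })
    body = ET.SubElement(file_el, f"{{{XLIFF_NS}}}body")

    unit_id = 0
    for element in topic.iter():
        if element.tag not in TRANSLATABLE:
            continue
        # Flatten inline markup and normalize whitespace (a simplification).
        text = " ".join("".join(element.itertext()).split())
        if not text:
            continue
        unit_id += 1
        unit = ET.SubElement(body, f"{{{XLIFF_NS}}}trans-unit",
                             {"id": str(unit_id)})
        ET.SubElement(unit, f"{{{XLIFF_NS}}}source").text = text

    return ET.tostring(xliff, encoding="unicode")
```

Even a toy like this raises the right questions for a supplier: which elements become segments, what happens to inline tags, and how the translated text finds its way back into the DITA source.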

One of the features you will want to explore with these suppliers is “translation memory”. Modern translation software stores previously translated text, so content that has not changed does not need to be translated again. However, this memory can be complicated. Can I send the DITA files for an 800-page book to the supplier and expect them to manage the translation memory? Some CCMSs and CMSs can handle sending just the changed files. Does that fit in with the translation supplier’s workflow?
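If you are working from the file system rather than a CCMS, "sending just the changed files" can be approximated with a hash manifest. The sketch below is one possible approach under my own assumptions, not a description of any CCMS or translation tool. The function names and the JSON manifest file are hypothetical.

```python
import hashlib
import json
from pathlib import Path

DITA_SUFFIXES = {".dita", ".ditamap"}


def hash_topics(root):
    """Map each DITA file (path relative to root) to a SHA-256 digest."""
    hashes = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in DITA_SUFFIXES:
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            hashes[path.relative_to(root).as_posix()] = digest
    return hashes


def changed_since(root, manifest):
    """List files that are new or differ from the last saved manifest."""
    manifest = Path(manifest)
    previous = json.loads(manifest.read_text()) if manifest.exists() else {}
    current = hash_topics(root)
    return [name for name, digest in current.items()
            if previous.get(name) != digest]


def save_manifest(root, manifest):
    """Record the current state, for example right after a hand-off."""
    Path(manifest).write_text(json.dumps(hash_topics(root), indent=2))
```

Call changed_since before each hand-off and save_manifest after it. Note that this works at file level, while translation memory works at segment level, so the supplier's tools may still find reusable text inside a "changed" file.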


Step Three: Talk to Component Content Management System (CCMS) and Content Management System (CMS) companies

If you are already using a CCMS or CMS you may find the system is already optimized for efficient translation from a DITA source.

If you are in the market for a CMS or CCMS, pay close attention to how the system will help you manage translation. Some systems can deal directly with translation companies over the internet.

In the pilot phase of a migration to DITA plus translation you likely don’t need a CMS or CCMS right away; you can use the file system instead. Using the file system in the early stages has two main advantages. The first is that it’s easier to conceptualize how all the pieces fit together before introducing a database system — it aids in learning. This will be a benefit when selecting a CCMS or CMS. Second, it makes interaction with outside suppliers easier; ZIP a folder with your content and send it out for review and discussion.
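The "ZIP a folder and send it out" step needs nothing more than the standard library. This is a small sketch; the function name package_for_review and the idea of a separate output folder are my assumptions.

```python
import shutil
from pathlib import Path


def package_for_review(content_dir, out_dir):
    """Zip content_dir (including its top-level folder name) into out_dir.

    out_dir should sit outside content_dir so the archive does not
    end up inside itself.
    """
    content_dir = Path(content_dir)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    archive = shutil.make_archive(
        str(out_dir / content_dir.name),  # archive path without ".zip"
        "zip",
        root_dir=content_dir.parent,
        base_dir=content_dir.name,
    )
    return Path(archive)
```

Keeping the top-level folder name inside the archive means relative links between maps and topics still resolve when the supplier unpacks it.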

Here’s a typical folder structure for DITA in the filesystem:


Step Four: Translation companies have important things to say about writing

Translation may be necessary but is never easy. Ask your translation supplier(s) for advice on how to write English content that makes good translation possible. Grammar varies dramatically from one language to the next. For example, in French tables are feminine but floors are masculine. In English tables and floors have no gender. That’s a trivial example. Apparently Finnish has 15 noun cases. This thread, Product names and reuse: a very serious anti-pattern when translating documents, on the OASIS site gives an excellent summary of some of the problems related to DITA reuse. In other words, choices made during authoring, even at the most basic level, can make translation more expensive. Be sure to explore this issue with potential translation suppliers.


Step Five: Translation companies have important things to say about DITA

Reuse, especially the more advanced reuse options in DITA, can cause problems and expense during translation. CONREF, DITAVAL, CONKEYREF, HREF all have special caveats when interacting with translation companies. It’s important to understand these limitations and opportunities before committing to a flavor of DITA or a specific translation partner.
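Before that conversation, it helps to know how much reuse your sample actually contains. The sketch below counts the reuse-related attributes in each topic file. It is an illustration under my own assumptions: the function name reuse_inventory is hypothetical, only .dita files are scanned, and DITAVAL filtering is not covered because it lives in separate files rather than in topic attributes.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from pathlib import Path

# Attributes named in the text above (DITAVAL is handled elsewhere).
REUSE_ATTRIBUTES = ("conref", "conkeyref", "href")


def reuse_inventory(root):
    """Count reuse attributes per topic file under root."""
    report = {}
    for path in sorted(Path(root).rglob("*.dita")):
        counts = Counter()
        for element in ET.parse(path).iter():
            for attr in REUSE_ATTRIBUTES:
                if attr in element.attrib:
                    counts[attr] += 1
        if counts:
            report[path.relative_to(root).as_posix()] = counts
    return report
```

A report like this gives you concrete numbers to put in front of a supplier, for example "this topic pulls in twelve conrefs", instead of a general question about whether they support reuse.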


Step Six: Start small

Back in step two I recommended developing a representative sample, in DITA, of your content. Use and refine that sample in all your dealings with suppliers. Having all the conversations based on the same sample will make the final choices much easier.

Starting small also has the benefit of making mistakes easier to fix. Just as Stilo Migrate lets you iteratively approach the perfect migration from, say, MS Word to DITA, a good sample of content lets you manage an iterative approach to a complete DITA plus translation system in your company.



Converting from a legacy authoring platform to DITA is one thing; converting to DITA plus translation is entirely different. Suddenly you are no longer optimizing for internal requirements alone but for internal and external requirements simultaneously. The more planning you can do for this combined optimization, the better. It may add six months to the planning phase of a transition to DITA but will pay off big time in avoided costs and maximized efficiency.


Not recommendations but good sources …

The following are the best sources of general information on translation I’ve found this week. Please don’t take these links or other links above as recommendations! They’re just good information I’ve found.

How To Translate DITA Projects [Step-By-Step Guide]

Translating your DITA Project


About the Author

Peter Fournier has extensive experience in the BNR/Nortel documentation space. In the late 80’s and early 90’s he studied the feasibility of moving the Technical Documentation to SGML. He later developed, with his team, an advanced online help system for Network Management and other software produced by Nortel. The core of the online help software was based on SGML principles of containerization but only had five or six base elements, and a lot of attributes. It was engineered to be compatible with SGML so the group had no trouble generating valid XML when the draft standard appeared in late 1996 or early 1997. In 2005 he discovered, with great joy, DITA XML. He introduced DITA to JDSU (now Lumentum) in 2008 and served as DITA manager and technical prime until 2018. Between 2010 and 2014 he also found time to get a startup going and developed software to assist groups of 1 to 20 people to get into DITA and manage all the background complexity, including publication. As of 2021 he’s back in the DITA space and loving the Stilo philosophy of making highly complex transformation software easily accessible to customers.