Getting great DITA conversion results

Although conversion is only a small part of your DITA adoption project, it’s the part that causes even smart people to break out in hives. How does one go from a regular, flat document to a series of DITA topics and maps with the correct markup? How do you do that without losing important information? What’s the best way to do it? What tools do you use? How do you start?

Learn the basics

My first recommendation is that, no matter which conversion method you end up choosing, you get some basic DITA training first. You should understand what a topic is, the difference between topic types, how a DITA map connects content together, and the basics of attributes.

Even if you’re not doing the conversion yourself, you should have enough knowledge to recognize good results from bad. For example, you should know that if all your topics are the wrong types, the conversion needs to be re-done. One warning sign is procedural information in a concept or generic topic with an ordered list (<ol>). That content belongs in a task topic, which has step and step-related elements you can leverage.
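The warning sign above can be checked mechanically. Here is a minimal sketch in Python, assuming each topic is available as a well-formed XML string whose root element is the topic type; the function name and the set of "non-task" types are my own illustrative choices, not part of any conversion tool.

```python
import xml.etree.ElementTree as ET

# Assumption: these root elements should not normally carry numbered procedures.
NON_TASK_TYPES = {"concept", "topic"}


def looks_like_misplaced_task(topic_xml: str) -> bool:
    """Return True if a concept or generic topic contains an ordered list."""
    root = ET.fromstring(topic_xml)
    if root.tag not in NON_TASK_TYPES:
        return False
    # iter() walks the whole tree, so nested <ol> elements are found too.
    return next(root.iter("ol"), None) is not None
```

A hit is only a prompt for human review: some ordered lists in concept topics are legitimate, so treat the result as a list of topics to look at, not a verdict.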

Visualize the results

There’s no way around it: your current content, now likely stored as chapters, books, and documents, will become hundreds or thousands of topics, combined using maps. The sheer number of objects that result from a conversion project often surprises people.

It’s important to visualize the end result of converting an entire document set: you will have thousands or tens of thousands of files, plus graphics. However, you also need to visualize what an individual “topic” should be. Is it a chapter? Half a chapter? A few lines?

I tell my clients that it should be a “digestible” amount of content: enough for users to get their business goal completed (learn about X, perform Y, look up information on Z) but not so long that it is too big a bite and gives them heartburn. The length really does depend on the business goal, but on average, if you were to print out a topic to PDF, it would be about ¾ of a page long. Of course, some topics are going to be shorter and some longer, but this average at least gives you an idea of what a “topic” might be.

If your conversion is giving you longer and fewer topics, then you have a problem with chunking. You need to either re-write into discrete topics, each with an appropriate title, or re-do your conversion to chunk at the appropriate heading level.
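One rough way to spot a chunking problem is to count words per converted topic. The sketch below assumes topics are held as XML strings keyed by file name; the 400-word default is my own stand-in for "about ¾ of a page" and should be tuned to your layout.

```python
import xml.etree.ElementTree as ET


def word_count(topic_xml: str) -> int:
    """Count whitespace-separated words in all text of a topic, title included."""
    root = ET.fromstring(topic_xml)
    return len(" ".join(root.itertext()).split())


def oversized_topics(topics: dict[str, str], max_words: int = 400) -> list[str]:
    """Return the names of topics whose word count exceeds max_words (an assumed limit)."""
    return sorted(name for name, xml in topics.items() if word_count(xml) > max_words)
```

If most of a trial conversion ends up in the returned list, that suggests the conversion is chunking at too high a heading level.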

Clean up the content

Converting content that is minimal and matches the DITA architecture already can save you lots of time in post-conversion clean up. Apply minimalism. Remove extra words. Remove sentences that repeat titles or captions. Streamline everything. If you have 40 conditions in your FrameMaker book, it’s time to do a purge.

In general, a clean conversion means that your content maps nicely to the DITA elements without abusing those elements. This means your content already matches the DITA architecture.

This mapping is most evident in tasks, which have a very specific set of elements. For example, if you have a step command followed immediately by a result, it will not convert cleanly.

  1. Log in as an administrator. The administrator interface displays your dashboard, updated every 30 seconds.

If you convert this as is, you will get the markup:

<step><cmd>Log in as an administrator. The administrator interface displays your dashboard, updated every 30 seconds.</cmd></step>

Simply by adding a line break directly following the command, you can get a clean DITA conversion:

  1. Log in as an administrator.
    The administrator interface displays your dashboard, updated every 30 seconds.

<step><cmd>Log in as an administrator.</cmd>
<stepresult>The administrator interface displays your dashboard, updated every 30 seconds.</stepresult></step>

If you think that’s not a big or important change, then think again. Having the correct markup around the appropriate piece of content is what can distinguish DITA adoptions that succeed from those that fail to provide the agility and power that having content in XML makes possible.

Having the result in a <stepresult> tag means that you can programmatically hide all step results for mobile output, for example. Or you can format that non-essential information differently. It also means that you can possibly re-use this step or topic in another place, even if the step result is different in that other context.
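To make the "programmatically hide" idea concrete, here is a small sketch that removes every <stepresult> from a task before publishing. It is an illustration only: a real publishing pipeline would more likely do this in its stylesheets or with filtering, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def strip_step_results(task_xml: str) -> str:
    """Return the task markup with every <stepresult> element removed."""
    root = ET.fromstring(task_xml)
    # ElementTree has no parent pointers, so visit each <step> and remove
    # its direct <stepresult> children. list() snapshots the steps first.
    for step in list(root.iter("step")):
        for result in step.findall("stepresult"):
            step.remove(result)
    return ET.tostring(root, encoding="unicode")
```

This only works because the result sits in its own element. With the unclean markup, where the command and result share one <cmd>, there is nothing for a script to select.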

The cleaner, more minimal, and more task-based your source content is, the easier the conversion will be. As an added bonus, you will also end up with better quality DITA content.

Develop a content strategy

You should develop some sort of content strategy prior to beginning the bulk of your conversion.

Among other things, a content strategy defines the elements and attributes you will use. This helps inform your conversion.

For an example not at all at random, consider the short description. It’s a powerful little element, though mostly valuable in HTML output, and it’s also the one piece of content that rarely exists before a move to DITA. Some companies decide never to use short description elements while others decide to always use them (it’s a bit like a yes or no question). If you don’t have an end-to-end strategy in place, you won’t know whether you should add a short description to every topic as part of your conversion.

I can tell you that adding an element to every single topic after conversion, and then filling it with content, is a painfully time-consuming job. It’s much faster to add the element programmatically during conversion, even if you have to go back and fill in some content later.
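As an illustration of doing this work programmatically, the sketch below inserts a placeholder <shortdesc> into any topic that lacks one. The function name and placeholder text are hypothetical, and it assumes the topic arrives as a well-formed XML string.

```python
import xml.etree.ElementTree as ET


def ensure_shortdesc(topic_xml: str, placeholder: str = "TODO: write short description") -> str:
    """Insert a placeholder <shortdesc> if the topic has neither <shortdesc> nor <abstract>."""
    root = ET.fromstring(topic_xml)
    if root.find("shortdesc") is None and root.find("abstract") is None:
        # <shortdesc> follows <title>, or <titlealts> when that element is present.
        anchor = root.find("titlealts")
        if anchor is None:
            anchor = root.find("title")
        position = list(root).index(anchor) + 1 if anchor is not None else 0
        shortdesc = ET.Element("shortdesc")
        shortdesc.text = placeholder
        root.insert(position, shortdesc)
    return ET.tostring(root, encoding="unicode")
```

A distinctive placeholder string makes it easy to search for the topics that still need a real short description later.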

A content strategy helps you define what you need, so that your conversion leaves you with the DITA building blocks already in place.

Pick a conversion method

You have some options when it comes down to actually performing a conversion from unstructured (or other-structured) content to XML.

  1. In house: Usually using FrameMaker conversion tables, this is a way for you to completely control your conversion. You won’t benefit from time-saving scripts or built-in best practices that other options can provide you with. Errors and omissions can have a serious impact on your project budget and timelines, not to mention quality. The person doing the conversion ought to have a very good working knowledge of DITA architecture, but they can learn FrameMaker conversion tables on the fly.
  2. Consultant: A consultant works with you to identify the strengths and weaknesses of your content first so your conversion is of the highest quality. Consultants will also help you implement your content strategy and apply best practices. You won’t need expert knowledge of DITA but you will still have input and control over the results. Consultants can often meet very tight deadlines, when no one on the team has the time to convert large amounts of content.
  3. Conversion specialist company: These are companies that make a business out of converting content. They can convert custom XML to DITA and convert huge amounts of content in a short time, and have a powerful engine that can be customized to whatever you need. Although budget and pace are usually out of your hands, they get good results even from really difficult conversion projects.
  4. Stilo’s Migrate: In a class of its own, Stilo’s Migrate self-service format lets you leverage the time-saving tools and functionality of experts (like converting images to SVG on the fly) while still having full control over the pace, cost, and details of your conversion. It’s flexible enough to adapt to what you need and powerful enough to make the process fast and reliable. Remember that implementing best practices is still up to you, so the people using Migrate ought to have a very good knowledge of DITA architecture to ensure quality results. Alternatively, you can bounce your Migrate conversion results off a consultant to make sure you’re heading in the right direction.

The method you choose is going to depend on your timelines, budget, in-house expertise, volume of content, and comfort level. You can also mix and match them, getting help with your tough content and doing the easy ones yourself.

Perform a trial

Whichever conversion method you select, make sure you do a trial run with real content. Like taking a car for a test drive before you buy it, a trial run helps you make modifications to your conversion early on, raise content strategy questions you may not have considered, validate the quality of your pre-converted content, and validate metrics, budget, and timelines. If you need to modify your budget or choose another conversion method, this is your big chance.

There are very few drawbacks to performing a trial conversion—it does add some time to your schedule though, so factor that in.

The best content for a trial is content you’re confident in that also shows some of the complexity of your document set (conditions, text insets, index markers, etc.).

Hint: Stilo offers a free trial conversion that you should really take advantage of.


CMS/DITA North America 2017 | April 24-26, San Diego, CA

Stilo is pleased to once again be one of the premier sponsors of the Content Management Strategies/DITA North America Conference in San Diego, California on April 24-26, 2017.

The conference serves a community of people who believe that international standards, structured content, reuse capabilities, and multiple media delivery are the directions of the future. Meet publications professionals who have implemented content management strategies and the OASIS DITA standard in their organizations. Hear from key tool developers who are actively supporting the standards-based community.

Find out more and register to attend

Add our presentation and technology test kitchen sessions to your schedule

Monday, April 24 | 10.00 – 10.40 am
Making sense of an authoring free-for-all
Patrick Baker, VP Development & Professional Services

Full DITA, Lightweight DITA, Markdown, HTML5 … is it all getting a little confusing? Can these alternative authoring approaches form part of a unified structured content strategy, bringing together technical writing teams and SMEs with no knowledge of XML or DITA?

During the course of this presentation we will consider the relative merits of the alternative approaches, reflect upon some use case examples and show how collaboration between the different groups might be possible.

Find out more

Tuesday, April 25 | 11.20 – 12.00 noon
Technology Test Kitchen | Low-cost, web-based DITA authoring for SMEs
Patrick Baker, VP Development & Professional Services

This session will provide participants with a hands-on introduction to AuthorBridge. AuthorBridge is a low-cost, web-based XML authoring tool developed for use by SMEs with no knowledge of DITA or its complexities. It has been developed in close co-operation with IBM to complement existing XML editing tools and provide a ‘Word-like’ authoring experience for non-ID professionals.

In addition to providing an attractive user interface, AuthorBridge provides a Guided + Fluid authoring experience for SMEs, free from the frustrating constraints generally associated with XML authoring tools.

Participants will be shown how to edit and review existing DITA topics, and to quickly create new ones. No knowledge of DITA or XML is required. All participants will receive a free 30-day trial of AuthorBridge, which they may continue to use following the conference.

Find out more

STC Summit 2017 | May 7-10, Washington, DC

Stilo is pleased to once again be one of the sponsors of the Society for Technical Communication’s annual summit, the premier technical communication event.

The 2017 STC Summit takes place May 7-10 in Washington, DC and officially begins Sunday evening with an opening keynote speaker and the Welcome Reception in the Expo Hall.

Stop by our booth in the expo hall and ask us for a demo of Migrate, our cloud XML conversion service, which enables technical authoring teams to convert content from source formats including FrameMaker and Word to DITA XML. You can also ask for a demo of AuthorBridge, our web-based XML editor, which provides subject matter experts with a Guided & Fluid authoring experience without requiring any knowledge of XML or DITA.

Find out more and register

Stilo announces availability of AuthorBridge v2.0 with free 30-day trial

Providing a Guided & Fluid DITA web authoring experience for SMEs from just US$100 per user, per annum

Stilo is pleased to announce a major new release of AuthorBridge, a web authoring tool for subject matter experts that has been developed in close co-operation with the central ID tools team at IBM.

With a beautifully designed user interface, improved CALS table handling, advanced copy & paste support, and a new review comments feature, AuthorBridge v2.0 can be used by SMEs to easily create and edit DITA topics in a collaborative environment with technical authors.

In addition, AuthorBridge provides high levels of user guidance for SMEs with no knowledge of DITA or XML, and through its unique architecture, it delivers a free-flowing ‘Word-like’ authoring experience, not constrained by the structures of conventional XML editors.

AuthorBridge can be deployed in the cloud or on-premise, using simple file systems, shared repositories such as Git, or integrated with component content management systems (CCMS).

Sign up for a free 30-day trial

Watch a recorded live demo (registration required)

AuthorBridge v2.0 demo recording now available!

If you missed our live demo of the latest major new release of AuthorBridge – version 2.0, a web authoring tool for SMEs that has been developed in close co-operation with the central ID tools team at IBM – you can access the recording here.

AuthorBridge v2.0 features

  • A new beautifully designed user interface
  • Improved CALS table handling
  • Advanced copy & paste support
  • New review comments feature
  • Pricing from just US$100 per user, per annum

Find out more about AuthorBridge v2.0

Request a free 30-day trial