Friday, November 27, 2015

DITA Reuse Strategies (Short Tutorial describing all DITA Reuse possibilities)


Introduction

This short tutorial is based on a presentation called DITA Reuse Strategies that I gave at DITA Europe 2015. Its main purpose is to explore the numerous possibilities for reusing content within the DITA standard.

First of all, I think the main reasons we want to reuse content in technical documentation are:
  • Consistent explanations for the same situations.
  • Less content to translate.
  • Decreased time spent writing content.
  • Obtain different publications from common content.
I would like to start by saying that technical documentation writers have two very important roles:
  • Record knowledge about using tools and processes.
  • Spread knowledge to reach large audiences.
As a software engineer, having a product user's manual which is rich in examples and answers to frequently asked questions saves me time. Instead of individually explaining various application behaviors to end users, I can give links to the manual or, better yet, our end users find that content by themselves. No company has enough human resources to help each end user individually.

We'll start with a top down approach to reuse. Complete small examples for most of the reuse situations described below can be found here: https://www.oxygenxml.com/forum/files/dita_reuse_strategies_samples.zip.

Version Control and Reuse

Version control allows you to reuse content tagged at a certain point in time in order to produce older versions of your publications. So whether you use an open source version control system like SVN or Git or a commercial CMS, you should always be able to produce older bug-fix versions of your documentation. You can think of version control as content reuse along the time axis.

Converting XML content to various output formats

XML in itself is perfect for reuse because:
  • XML is an intermediary format. We don't do XML for the pleasure of it. We do it because we want to obtain multiple outputs from it, and it has enough content and structure inside to allow for that. Some call this single source publishing, but it can just as easily be called content reuse.
  • XML contains the necessary content.
  • XML contains the necessary structure.
  • XML is a standard. So you have a choice between open source and commercial tools.
  • XML is a standard for defining standards, among them DITA, the most versatile standard XML vocabulary when it comes to reuse.
Whatever output you obtain from the XML, there is one constant: the XML format which contains all your data carries more semantic meaning than any of the published outputs.

You can read more about the selling points of using XML in this older blog post: http://blog.oxygenxml.com/2015/09/a-short-story-of-reuse.html.

Create larger publications from existing ones

You can merge multiple existing DITA Maps into various new publications.

The only danger here is defining keys with the same name but different values in the merged publications. Fortunately, DITA 1.3 comes to the rescue with the new key scopes support, which allows keys with the same name to resolve to a different value in each scope:
<!DOCTYPE map PUBLIC "-//OASIS//DTD DITA Map//EN" "map.dtd">
<map>
    <title>Vegetables Soup</title>
    <topicref href="carrots/carrots.ditamap" format="ditamap" keyscope="ks1"/>
    <topicref href="potatoes/potatoes.ditamap" format="ditamap" keyscope="ks2"/>
</map>
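
From the parent map, or from a topic that sits outside both scopes, a key defined inside a scope can still be addressed by prefixing it with the scope name, as DITA 1.3 specifies. A minimal sketch, assuming each of the two sub-maps defines a key named intro (a hypothetical name, not taken from the samples):

```xml
<!-- In a topic referenced directly from the "Vegetables Soup" root map. -->
<!-- "intro" is a hypothetical key assumed to be defined in both sub-maps. -->
<p>See <xref keyref="ks1.intro"/> for carrots and
  <xref keyref="ks2.intro"/> for potatoes.</p>
```

Inside each sub-map the unqualified name still works and resolves to that scope's own definition.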

Even if you have a single root map, you can keep related sections or chapters in different DITA Maps. Besides adding more logical structure to your content, you never know when you'll reuse those sub-maps in different publications.

Reuse content for similar products

This is the most common case for successful reuse: you have multiple similar products which share common functionality, so the technical documentation for each of those products also shares common content. This is usually done in two ways. In the following sections I will use the term root map to refer to the DITA Map which actually gets published.

1. Use multiple Root Maps.

Each root map is published to obtain the output for a certain product type. As major benefits you can:
  • Reuse entire topics.
  • Define variable product names.
  • Remap links and reused content using keys.

Publication maps for phone models X1000 and X2000 use almost identical content, except for the Bluetooth chapter, which appears in only one of them.
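
As a rough sketch of this approach, the two root maps could look like the ones below. The file names and topic paths are hypothetical, and I am assuming the Bluetooth chapter belongs to the X2000. Each map defines its own value for the product key and references the shared topics:

```xml
<!-- X1000.ditamap (hypothetical file name) -->
<map>
  <title>X1000 User Guide</title>
  <!-- Variable product name, resolved wherever topics use keyref="product" -->
  <keydef keys="product">
    <topicmeta><keywords><keyword>X1000</keyword></keywords></topicmeta>
  </keydef>
  <topicref href="topics/getting_started.dita"/>
  <topicref href="topics/calls.dita"/>
</map>

<!-- X2000.ditamap (hypothetical file name) -->
<map>
  <title>X2000 User Guide</title>
  <keydef keys="product">
    <topicmeta><keywords><keyword>X2000</keyword></keywords></topicmeta>
  </keydef>
  <!-- Same shared topics as the X1000 map -->
  <topicref href="topics/getting_started.dita"/>
  <topicref href="topics/calls.dita"/>
  <!-- Chapter assumed to exist only for this model -->
  <topicref href="topics/bluetooth.dita"/>
</map>
```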

2. Use a single Root Map.

You have a single publication root map which gets published for various products using profiling filters applied to it. These filters can be applied either at topic level or at element level. The product name is variable and depends on the applied filters.
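
A minimal sketch of the single root map approach, assuming the product profiling attribute is used and using hypothetical file names. The map profiles both a chapter and the product name key, a topic profiles a single paragraph, and a DITAVAL filter file selects what gets published:

```xml
<!-- phone.ditamap (hypothetical): the single root map -->
<map>
  <title>Phone User Guide</title>
  <!-- Two definitions of the same key; filtering leaves only one of them -->
  <keydef keys="product" product="X1000">
    <topicmeta><keywords><keyword>X1000</keyword></keywords></topicmeta>
  </keydef>
  <keydef keys="product" product="X2000">
    <topicmeta><keywords><keyword>X2000</keyword></keywords></topicmeta>
  </keydef>
  <topicref href="topics/calls.dita"/>
  <!-- Topic-level profiling -->
  <topicref href="topics/bluetooth.dita" product="X2000"/>
</map>

<!-- Element-level profiling inside a topic -->
<p product="X1000">This paragraph is published only for the X1000.</p>

<!-- X1000.ditaval (hypothetical): filter applied when publishing for the X1000 -->
<val>
  <prop att="product" val="X1000" action="include"/>
  <prop att="product" val="X2000" action="exclude"/>
</val>
```

Publishing the same map with a second filter file that swaps the two actions would produce the X2000 output.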

Reuse fragments of content

Until now we have regarded the topic as an indivisible unit in our project. But there are many times when it becomes useful to reuse smaller elements in various places throughout the publication.

Content References

Content references are the original and probably the most widely used reuse mechanism in the DITA specification. They allow reusing elements from a topic in various other topics throughout the publication.

Small example of content referencing

Reusable Component from topic reusables.dita:

  <dd id="CPU">
    <ul id="ul_lym_bqd_x4">
      <li>Minimum - <tm tmtype="tm">Intel Pentium III</tm>/<tm tmtype="tm">AMD Athlon</tm>
        class processor, 1 <term>GHz</term>.</li>
      <li>Recommended - Dual Core class processor.</li>
    </ul>
  </dd>

Content reference:

<dd conref="path/to/reusables.dita#topicID/CPU"/>

You can read more about how content references can be inserted in Oxygen here: https://www.oxygenxml.com/doc/versions/17.1/ug-editor/#topics/eppo-create-conref.html.

Content Key References

When compared to direct content references, content key references are done with indirect addressing. You first need to define a key for the topic which contains the reused content and make the content key reference using that key.

Small example of content key referencing

Reusable Component from topic reusables.dita:

  <dd id="CPU">
    <ul id="ul_lym_bqd_x4">
      <li>Minimum - <tm tmtype="tm">Intel Pentium III</tm>/<tm tmtype="tm">AMD Athlon</tm>
        class processor, 1 <term>GHz</term>.</li>
      <li>Recommended - Dual Core class processor.</li>
    </ul>
  </dd>

Key definition in DITA Map:

<keydef keys="reusable.install" href="reusables/reusables.dita"/>

Content key reference:

<dd conkeyref="reusable.install/CPU"/>

You can read more about how content key references can be inserted in Oxygen here: https://www.oxygenxml.com/doc/versions/17.1/ug-editor/#topics/eppo-create-conkeyref.html

Content Reference Ranges

Instead of reusing a series of consecutive elements (for example steps or list items) one by one, you can reuse an entire range of sibling elements. For this to work, both the initial and the final elements need to have IDs defined on them.

Small example of content key reference with ranges

Reusable steps from task reusable_steps.dita:

  <steps>
      <step id="washing">
        <cmd>Wash the vegetables thoroughly.</cmd>
      </step>
      …..
      <step id="peeling">
        <cmd>Pass the peeler gently over the vegetable.</cmd>
      </step>
    </steps>

Key definition in DITA Map:

 <keydef keys="reusable_steps" href="reusable_steps.dita"/>

Content key reference range:

    <steps>
      <step conkeyref="reusable_steps/washing" conrefend="default.dita#default/peeling">
        <cmd/>
      </step>
    </steps>

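Note that when conkeyref and conrefend are used together, the file and topic part of conrefend (default.dita#default above) is only a placeholder: the target is resolved through the key, and only the element ID (peeling) is taken from conrefend.

The usual dialog from Oxygen used to insert reusable content can also be used to select the range of elements to insert: https://www.oxygenxml.com/doc/versions/17.1/ug-editor/#topics/insert-dita-content-reference.html.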

Content Reuse Tips and Tricks

I tried to compile below a set of best practices to follow when reusing content:

  • Keep all your reused content in special topics located in special folders. Technical writers need to know that they are editing content which is potentially used in multiple contexts.
  • Keep a description for each reused element. A topic can hold multiple reusable elements and act as a dictionary of reused components: in a two-column table, each cell in the first column contains a reused element and the matching cell in the second column contains a short description of it. The description is metadata. It is not meant for the published output; it just tells technical writers how that particular element should be reused.
  • Use conkeyrefs instead of conrefs. Because conrefs use relative paths, they tend to break when you move topics around, whereas a conkeyref only needs its key definition updated.
  • When using conkeyrefs, create a special map with key definitions. This keeps the reused content and its keys separate from the live content.
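
The special map with key definitions could be sketched like this. The map file names are hypothetical; the two key definitions are the ones used in the earlier examples:

```xml
<!-- reuse_keys.ditamap (hypothetical): holds only key definitions -->
<map>
  <title>Reusable components</title>
  <keydef keys="reusable.install" href="reusables/reusables.dita"/>
  <keydef keys="reusable_steps" href="reusable_steps.dita"/>
</map>

<!-- Root map (hypothetical): pulls in all the keys with a single reference -->
<map>
  <title>User Guide</title>
  <mapref href="reuse_keys.ditamap"/>
  <topicref href="topics/installation.dita"/>
</map>
```

If a reusable topic is moved, only the href in the keys map changes and the conkeyrefs in the live content stay untouched.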

Pushing Content

Besides the techniques we've seen so far for pulling reused content in multiple places you can also push content to a certain specified place inside an existing topic.

So why push content?

Imagine you have an existing publication "Cooking Book" containing a task with a couple of steps for peeling vegetables. At some point you create the DITA Map for a larger publication called "Cooking Book for Pros" which reuses the entire original publication by referencing the original publication's DITA Map. But you need to add extra steps to the original task when the larger publication gets printed.

Pushing Content to an existing sequence of steps

Sequence of steps from the original task:

     <steps>
      ...........
      <step id="peeler_handling">
        <cmd>Pass the peeler gently over the vegetable.</cmd>
      </step>
    </steps>

Key definition in DITA Map for the task which will push the content:

<keydef href="stepsPusher.dita" keys="peeling"/>

Content key reference push done from the "stepsPusher.dita" task:

        <steps>
            <step conaction="mark" conkeyref="peeling/peeler_handling">
                <cmd/>
            </step>
            <step conaction="pushafter">
                <cmd>Read the instructions.</cmd>
            </step>
        </steps>

So the only purpose of the "stepsPusher.dita" task, which is referenced with a resource-only processing role and thus does not appear at all in the output, is to modify the content of the original task which gets published.

How do we push content in Oxygen? First, define an ID on the element which will be the target of the push. The conref push mechanism allows you to replace this target element or to insert an element before or after it. Next, create the topic which pushes the content and, in it, the step which will be pushed. Then right-click inside this step and choose Reuse->Push Current Element....
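
For completeness, here is a sketch of the other two push actions defined by the DITA specification, reusing the peeling key and the peeler_handling target from the example above. The step texts are made up for illustration, and the two steps elements are alternatives, not meant to sit in the same task:

```xml
<!-- Alternative 1, pushbefore: the pushed step comes first and is
     immediately followed by the marker that identifies the target. -->
<steps>
  <step conaction="pushbefore">
    <cmd>Check that the peeler is clean.</cmd>
  </step>
  <step conaction="mark" conkeyref="peeling/peeler_handling">
    <cmd/>
  </step>
</steps>

<!-- Alternative 2, pushreplace: no marker is needed; the reference
     sits on the pushed step, which replaces the target. -->
<steps>
  <step conaction="pushreplace" conkeyref="peeling/peeler_handling">
    <cmd>Pass the peeler firmly over the vegetable.</cmd>
  </step>
</steps>
```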

Key References (Variables)

You can reuse simple variables like the product name, the executable name, and so on by defining keywords in the DITA Map and then using keyrefs in topics to reuse those text fragments.

Reusing keywords

Defining the reused keyword in the DITA Map:

<!-- product name -->
  <keydef keys="product" product="editor">
    <topicmeta>
      <keywords>
        <keyword>Oxygen XML Editor</keyword>
      </keywords>
    </topicmeta>
  </keydef>

Reusing the keyword in a topic:

<title>Installation Options for <ph keyref="product"/></title>

In Oxygen you can create key definitions in the DITA Map by right clicking in the DITA Maps Manager and choosing Append Child->Key definition with keyword.... After this, in the topic you can use Oxygen's regular Reuse Content action to insert the keyref.

DITA 1.3 Contributions to Reuse

DITA 1.3 takes content reuse to an entirely new level, allowing you to:
  • Reuse a topic with variable content depending on context (key scopes).
  • Reuse the same content profiled in various ways in the same publication (branch filtering).

Reuse with Key Scopes

Using DITA 1.3 key scopes you can reuse a topic in multiple places in the DITA Map with slightly different content.

Reuse using key scopes

Let's say you write a topic about Windows installation for your software product:
<!DOCTYPE topic PUBLIC "-//OASIS//DTD DITA Topic//EN" "topic.dtd">
<topic id="installation">
  <title><ph keyref="osName"/> Installation</title>
  <body>
    <p>
      <ol id="ol_g5h_st4_zt">
        <li>Download the executable.</li>
        <li>Run the executable by double clicking it.</li>
        <li>Follow steps described in the installation wizard.</li>
      </ol>
    </p>
  </body>
</topic>
and at some point you realise that exactly the same steps need to be followed for the Linux installation. The only difference is the name of the operating system. You use a keyref to refer to the operating system name, but with DITA 1.2 alone the key resolves to a single value.

Using keyscopes in the DITA Map you can define multiple values for your key depending on the context:

 <topicgroup keyscope="windows">
  <keydef keys="osName">
   <topicmeta>
    <keywords>
     <keyword>Windows</keyword>
    </keywords>
   </topicmeta>
  </keydef>
  <topicref href="installation.dita"/>
 </topicgroup>
 <topicgroup keyscope="linux">
  <keydef keys="osName">
   <topicmeta>
    <keywords>
     <keyword>Linux</keyword>
    </keywords>
   </topicmeta>
  </keydef>
  <topicref href="installation.dita"/>
 </topicgroup>

You can find a more detailed example and download samples for reuse based on key scopes in this blog post: http://blog.oxygenxml.com/2015/08/dita-13-key-scopes-next-generation-of.html.

Reuse with Branch Filtering

With branch filtering you can combine two profiles of the same DITA Map in a larger publication.

Creating a Phones Catalogues publication

If you already have a DITA Map from which you can obtain publications for various mobile phone versions based on the profiling filters applied to it, you can use branch filtering to create a larger publication which incorporates the publications for all mobile phone versions:

….................
  <topicref href="phoneDetails.ditamap" format="ditamap">
   <ditavalref href="ditaval/X1000Branch.ditaval">
    <ditavalmeta><dvrResourceSuffix>1</dvrResourceSuffix></ditavalmeta>
   </ditavalref>
  </topicref>
….......................
  <topicref href="phoneDetails.ditamap" format="ditamap">
   <ditavalref href="ditaval/X2000Branch.ditaval">
    <ditavalmeta><dvrResourceSuffix>2</dvrResourceSuffix></ditavalmeta>
   </ditavalref>
  </topicref>
…...................

You can find a more detailed example and download samples for reuse based on branch filtering in this blog post: http://blog.oxygenxml.com/2015/09/dita-13-branch-filtering-next.html

Reuse non-DITA resources

Besides DITA topics you can reuse other resources in your DITA project:
  • Reuse images either referenced directly or via a key reference.
  • Reuse other linked resources (like videos, PDFs and so on).

As binary resources are not embedded in the DITA topics, they are naturally reused by being kept in separate files and linked when necessary.

You can reuse images and link to other resources either via direct references or via indirect key references. What to choose may depend on how many times you refer to a certain image or binary resource. If you refer to it only once or twice you can use direct referencing.
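
The two ways of referring to an image can be sketched as follows. The logo key and the image path are hypothetical:

```xml
<!-- In the DITA Map: key definition for the image -->
<keydef keys="logo" href="images/logo.png" format="png"/>

<!-- In a topic: direct reference, relative to the topic's location -->
<image href="../images/logo.png"/>

<!-- In a topic: indirect reference through the key -->
<image keyref="logo"/>
```

With the key reference, replacing or moving the image means changing one href in the map instead of every topic that uses it.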

If you have problems getting images to appear the same size when published to PDF and XHTML-based outputs you should make sure they do not have the dots-per-inch information saved inside them: https://www.oxygenxml.com/doc/versions/17.1/ug-editor/#topics/stretched-images-pdf-output.html.

Conclusions

The DITA standard provides quite a large toolbox for reuse scenarios.

Besides the tips spread throughout this tutorial, here is some additional advice:
  • Know a little bit about all these possibilities (at least know that they exist). You never know when one of them might come in handy.
  • For any given potential reuse situation you may find out that you can use multiple reuse strategies. So at a given time you could reuse a piece of simple text either via direct conrefs, indirect conkeyrefs or keyword keyrefs. Choosing one of the strategies will depend on the situation. For example if you plan in the future to also have inline elements in the reused text, you should go with either conref or conkeyref. If you reuse that content only in one or two places you can go with conref. But if you reuse it extensively you can define a key and use conkeyref.
  • Try to keep the reused content separately, in special folders. Writers will know that when they are editing resources from these special folders they might modify content which is potentially used in multiple places.
  • If you plan to translate your content to other languages, try not to reuse inline elements (other than product names and constants which do not change when translated). Translators usually need to translate entire block-level elements in order to get a good flow in the translated content. The DITA 1.3 specification contains quite a useful recommendation for this: https://www.oxygenxml.com/dita/1.3/specs/index.html#non-normative/elementsMerged.html.