Friday, April 15, 2016

DITA Open Toolkit Frequently Asked Questions (FAQ)

Below I have tried to put together a set of frequently asked questions and useful links about the DITA Open Toolkit.

What is the DITA Open Toolkit?

The DITA Open Toolkit is a publishing tool used to convert DITA content into various output formats. Its official web site, containing download links and documentation, can be found here:

How do I use the DITA Open Toolkit?

You can download, install and build output from DITA content using the command line.
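As a rough illustration of a command-line build, the sketch below assembles the arguments for the `dita` command that ships with DITA Open Toolkit 2.x. It assumes the toolkit's `bin` folder is on your PATH; the map name, output folder and helper function name are hypothetical.

```python
def build_dita_command(input_map, transtype, output_dir, params=None):
    """Return the argument list for a single `dita` invocation.

    -i is the input map, -f the output format (transtype) and -o the
    output folder. Extra parameters are passed as -Dname=value.
    """
    command = ["dita", "-i", input_map, "-f", transtype, "-o", output_dir]
    for name, value in sorted((params or {}).items()):
        command.append(f"-D{name}={value}")
    return command


if __name__ == "__main__":
    # Hypothetical map and output folder; print the command instead of running it.
    print(" ".join(build_dita_command("userguide.ditamap", "pdf", "out")))
```

The list can be handed to `subprocess.run` once the toolkit is installed. Building it as a list rather than a single string avoids quoting problems with paths that contain spaces.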

Besides this, some applications come bundled with the DITA Open Toolkit. For example, Oxygen XML Editor comes bundled with both DITA Open Toolkit 1.8 and 2.x. Oxygen provides a visual way to run the bundled DITA Open Toolkit through a concept called transformation scenarios.

What version of DITA Open Toolkit should I use?

Most DITA implementations are probably still using DITA Open Toolkit 1.8.5. This usually happens because companies have output customizations which have not yet been updated to work with the latest DITA Open Toolkit releases.

But if you do not have legacy plugins or customizations, you should use, whenever possible, the most recent stable version of the DITA Open Toolkit available on the official download page.

What outputs can I obtain using the DITA Open Toolkit?

The entire set of output formats available by default is listed here: The DITA Open Toolkit can also be enhanced by installing plugins that provide additional output formats.

What is the general architecture of the DITA Open Toolkit?

The DITA Open Toolkit is quite a large mixture of Ant build scripts, Java libraries and XSLT stylesheets. It has a pipeline-based architecture which uses plugins to publish DITA content to various output formats. Most of the customizations you may want to make, whether to add new publishing capabilities or to adjust existing ones, can be made without modifying its internal core.

What is a DITA Open Toolkit plugin?

A DITA Open Toolkit plugin can either provide a new publishing format, customize an existing publishing stage or provide a DITA specialization vocabulary. The plugin can use one of the numerous extension points available in the DITA Open Toolkit:

Once you have created a plugin you can install it in the DITA Open Toolkit either by manual installation or using the new automated installation procedure.

How do I customize the HTML-based outputs?

There are a number of parameters which can be set to customize the HTML-based outputs: For example you can specify your own CSS stylesheet to be used with the generated HTML output.

You can also create a plugin to customize the HTML outputs by adding a custom XSLT stylesheet:
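As a minimal sketch of what such a plugin's descriptor might contain, the snippet below generates a `plugin.xml` that registers a custom stylesheet on the `dita.xsl.xhtml` extension point. The plugin ID, the stylesheet path and the helper function are hypothetical, and the descriptor is only an assumed starting point, not a complete plugin.

```python
import xml.etree.ElementTree as ET


def build_plugin_descriptor(plugin_id, xsl_path):
    """Return plugin.xml content that plugs one XSLT file into the XHTML output."""
    plugin = ET.Element("plugin", {"id": plugin_id})
    # The feature element binds the stylesheet to an extension point.
    ET.SubElement(plugin, "feature", {"extension": "dita.xsl.xhtml", "file": xsl_path})
    return ET.tostring(plugin, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical plugin ID and stylesheet location.
    print(build_plugin_descriptor("com.example.html-custom", "xsl/custom.xsl"))
```

The resulting file would sit at the root of the plugin folder, next to the `xsl` folder holding the stylesheet, before the plugin is installed in the toolkit.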

How do I customize the PDF output?

The PDF output is obtained by transforming the original DITA content to XSL-FO and then generating the PDF with an XSL-FO processor. The XSL-FO processor bundled and used by default is Apache FOP, but you can also separately install and use commercial PDF processors like Antenna House or RenderX XEP.

You can customize the PDF output either using a PDF customization folder or by creating a PDF customization plugin.

There are a number of other solutions for obtaining PDF from DITA:

Tuesday, April 05, 2016

DITA Usage Survey

A week or two ago I opened a survey about various ways in which people are using DITA. The survey was taken by more than 50 distinct DITA users and I think it indicated quite clearly some trends in the industry. As I said from the beginning, I'm releasing the entire set of results, including individual responses to questions:

I will try below to sum up some of the results:

Git is on a roll

Git has overtaken Subversion as the most popular open source version control solution used in DITA projects. Although most users who responded seem to use open source solutions for version control, a solid portion of them use commercial CMSs, probably specialized in DITA content. I suspect that people with small to medium projects prefer open source solutions because they are more affordable for a small group of writers.

PDF is still the most popular output format

Most of the participants identified PDF as being their primary output format. Most of them output both to PDF and XHTML but the choice of PDF as the primary output format looks very clear.

Indirect addressing is becoming the main way of reusing content

Plain content references are still used more than content key references, but key references are heavily used as well, so it seems that indirect ways of addressing content are winning this game.

DITA 1.3 features

Besides the use of key scopes and branch filtering (which comes as no surprise), it would seem that the troubleshooting topic and the use of SVG embedded directly inside DITA content are strong needs that DITA 1.3 fulfills.

Popular image formats

The fact that PNG is the most popular image format comes as no surprise. But SVG coming in as a close second points to an increasing trend of using vector images in technical documentation. Besides the benefit of being a vector format which does not lose information when scaled, SVG offers the unique capability of translating various parts of the image.

Major DITA frustrations

It would seem there are two major DITA frustrations:
  • PDF customization difficulties. In my opinion this takes the cake in this category. Customizations of the standard PDF output are hard: they require knowledge of XSLT, XSL-FO and the PDF plugin architecture. But alternatives do exist:
  • "DITA is perceived as too complex for casual users." This quote says it all: the entry level is high. There are also complaints about linking, filtering and reuse. All these come from DITA's flexibility and from the fact that each new version adds new elements and ways of working with content. And although DITA can be specialized and reduced as a vocabulary, I suspect not many people are doing that.

That's all I wanted to cover in this post, so go ahead and enjoy the survey results. As usual, any comments are welcome.

Thursday, March 03, 2016

Implementing your own Style Guide

Let's say you are a team of tech writers collaborating on a DITA-based project and doing things your way. Maybe you have various best practices about which elements to use and when to use them, or maybe you want to impose a set of controlled values for certain attributes. So at some point you gather on an internal server a set of HTML resources which explain how various DITA elements should be used. This blog post will attempt to show you how these best practices can be shared with your team so that they are readily available when editing DITA content in Oxygen.

Custom "Style Guide" toolbar button

As you have your style guide HTML resources on a server, you can add a custom toolbar button which will appear on the DITA toolbar when editing DITA topics in the Author editing mode. When you press that toolbar button, a web browser opens up and shows you the style guide main page. Here are some steps about how to do this:

  1. In the Oxygen Preferences->Document Type Association page edit the DITA framework. Instead of editing the DITA framework directly you may choose to extend it in order to share the extension more easily:
  2. In the Author tab go to the Actions tab, where there is an action with the ID styleguide. If you edit the action, you will see that it invokes an operation with a parameter called resourcePath. You can edit that parameter to point to your internal (or public) server where the WebHelp output is stored. You should also set an icon for it; you can use /images/BrowseReferenceManual16.png (a default icon which comes with Oxygen). Save your changes in that dialog.
  3. In the Author tab there is a Toolbar sub-tab in which you can add the styleguide action to the toolbar in the place where you want it. Press OK a couple of times in the dialogs and then the action should become available on the toolbar for each topic.

Link to Style Guide for each element in the content completion window

When you press the ENTER key in the Author editing mode, you get a list of available elements. For each element there is documentation available. That documentation can be customized; for example, you could add links from each element to a specific section in your style guide. This topic should tell you more about how this can be achieved:

Impose controlled attribute values

For certain attributes (for example profiling attributes, @outputclass attributes) you may want to impose a set of controlled attribute values. This blog post will tell you how:
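Independently of how Oxygen enforces them, the idea of a controlled value set can be sketched in a few lines. The snippet below is a plain-Python stand-in for illustration only: it flags attribute values that fall outside an allowed list. The attribute names, the allowed values and the function name are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical controlled values; a real project would define its own.
ALLOWED = {
    "audience": {"novice", "expert"},
    "platform": {"windows", "linux"},
}


def find_uncontrolled_values(topic_xml, allowed=ALLOWED):
    """Return (element, attribute, value) triples for values outside the allowed set."""
    problems = []
    for element in ET.fromstring(topic_xml).iter():
        for attr, permitted in allowed.items():
            # Profiling attributes may hold several space-separated values.
            for value in element.get(attr, "").split():
                if value not in permitted:
                    problems.append((element.tag, attr, value))
    return problems
```

Splitting on whitespace matters because a profiling attribute such as audience can carry several values at once, and each one has to be checked separately.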

Show validation errors or warnings when guidelines are breached

If possible, some of your rules can be converted to Schematron, allowing the application to signal to the writer when a rule is not obeyed. You can also add quick fixes to show writers various ways to rectify the problem. This blog post should give you more details about this:
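To give a feel for the kind of rule that lends itself to automation, here is a small sketch written in plain Python rather than Schematron. The rule itself (a list should contain at least two items) is a hypothetical style guide entry, and the function name is made up for the example.

```python
import xml.etree.ElementTree as ET


def check_single_item_lists(topic_xml):
    """Return a warning for every ul/ol element with fewer than two li children."""
    warnings = []
    for element in ET.fromstring(topic_xml).iter():
        if element.tag in ("ul", "ol"):
            item_count = len(element.findall("li"))
            if item_count < 2:
                warnings.append(
                    f"<{element.tag}> has {item_count} item(s); consider a paragraph instead"
                )
    return warnings
```

A Schematron version of the same rule would have the advantage of being reported directly in the editor while the writer is typing, which this standalone check cannot do.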

Bringing all of this together

There is an Oxygen XML GitHub project called DIM which attempts to approach most of these aspects in a unified manner:

Wednesday, March 02, 2016

Resources for learning DITA with Oxygen

From time to time we get requests from beginners, or from users migrating from other tools, who want to start using Oxygen with DITA and need a set of useful resources.

Resources for editing DITA with Oxygen:

We have a getting started section in our user's manual: and a larger section on DITA authoring:

There are two past webinars about DITA support in Oxygen; you can watch them on our Oxygen YouTube channel:

And we have a list of videos, some of them DITA-related here:

Resources for learning DITA:

If you want to start learning about DITA in general there is a web site called Learning DITA.

The DITA 1.3 standard specification can be found here:

There are also a number of good books like DITA For Practitioners and the DITA Style Guide.

Resources for customizing the DITA output formats

Usually customizing the XHTML based outputs means creating your custom CSS selectors. If you generate WebHelp output using Oxygen, we have a section explaining basic WebHelp customizations:

For PDF-based outputs, if you publish via the DITA Open Toolkit, we have a section in our user's manual about PDF customizations: There is also a PDF plugin generator created by Jarno Elovirta which can be used to customize the PDF layout. The DITA For Print book also covers quite a lot of customization possibilities. There are also a number of alternatives for obtaining PDF from DITA:

DITA Trivia

There are a number of blogs on which you can read various DITA-related articles:


The DITA Users List is probably the first place where you can register and ask for help with DITA-related issues. There is also a Google Groups DITA Users List. Most DITA users register sooner or later on one of these groups.
There is also a DITA Awareness Group on LinkedIn.

Friday, December 18, 2015

Migrating to a Structured Standards-based Documentation Solution

Potential clients come to this world of structured content authoring from two main sources:
  1. They are starting fresh and, after a little bit of comparing structured with unstructured editing and open with closed solutions, and after some soul searching, they come to regard structured authoring with a specific XML standard in general (and usually DITA in particular) as the possible solution for them.
  2. They are migrating from a previous unstructured or structured solution.
I think people in this second category start thinking about structured writing when they start encountering certain limitations with their current approach. These limitations they experience with their current system could be:
  • The need to reuse more content.

    With structured XML authoring in general and with DITA in particular you have so many ways of reusing content. In a previous blog post I tried to come up with an overview about all the ways in which you can reuse content using DITA:

  • Produce multiple outputs from the same content using complex profiling conditions which are not supported in the current workflow.
  • Stop thinking about how the content is styled.

    You may want to focus more on the actual content and on semantically tagging it than on the way in which it will be presented in a certain output format.

  • Publish to more output formats than the current editing solution allows.

    Using a widely adopted open source standard like DITA for documentation also means having access to a variety of commercial and open source tools to generate various output formats from it. For example for obtaining the PDF you have about 5-6 distinct possible solutions: And for Mobile Friendly WebHelp you have 3-4 possible solutions:

  • Enforce certain internal rules on the documents.

    It's hard to impose best practices in unstructured documents. But with structured XML content, you can use Schematron to easily cover this aspect and even to provide quick fixes for your authors:

  • Benefit from the advice and help of a larger community of writers and developers.

    When you are using a closed source solution, you may have only one forum and a couple of people willing to help. When you have a larger community you will be able to reach out with a single email to lots of people, and somebody may want to help you.

  • Share documentation between different companies.

    If a larger company which uses structured writing takes over a smaller one, the smaller company will need to adopt structured writing as well.

  • Own your content.

    Some editing solutions are closed source, so you are forced to use a single tool because no other tools are able to read that format. Then you need to ask yourself the question: "Is this content actually mine?"

  • Problems with your current tool vendor.

    If the format is closed source and the tool vendor is not responsive to your needs, you need to somehow move your content over to a market with multiple tool vendors available, because competition also means lower prices and better customer support.

Switching to structured content writing also has its problems. And I think the main ones are these:
  • The people. We are all reluctant to change, and there is a learning curve. Writers might need to re-learn how to structure and write their documentation. Besides the technical aspects, they will need to learn to divide content into small modules and to reuse parts in multiple places. Writers may not be willing to do this. We are usually very reluctant to change tools if we do not see instant benefits from doing so.
  • Effort to convert the currently available content to structured content. You can choose manual conversion, automated conversion or, in most cases, a mixture of the two. Conversion will never be perfect; you will still need to go through the entire content and re-structure it, taking into account module-based editing.
  • Customize the obtained output format. You may get various outputs from your content out of the box, but you will always need to customize them to adhere to company standards. If you are using the DITA Open Toolkit for publishing, you will need basic XSLT development skills to customize the PDF and CSS skills to customize the XHTML-based output.
  • Money. You need to spend more money to get new tools, possibly a new CMS. That said, I consider that for starters, for a pilot project, DITA does not need to be expensive. Here's how we're using DITA internally for our user's manual:
  • Sometimes you might need to control the styling of your output so closely that it would be impossible to separate the styling information from the content.

So can we draw a conclusion from all this?

Well, maybe not everybody interested in structured authoring will succeed in converting to it. But I think that one thing will hold true in most cases:

Once you convert to structured content, you will never go back.

Tuesday, December 15, 2015

Sharing New Custom File Templates for a Specific Vocabulary

The support Oxygen provides for editing DITA topics comes with quite an extensive set of new file templates used to create new DITA topic types. If you have a team of writers, you may want to filter out certain new file templates or add your custom new file templates, then share these custom templates with your team members.

This blog post will attempt to give you some clear steps for sharing a custom set of new file templates with your team.

All the original DITA new topic templates are located in the folder: OXYGEN_INSTALL_DIR\frameworks\dita\templates.

Instead of making changes directly in that folder, or copying the entire DITA framework configuration folder (like OXYGEN_INSTALL_DIR\frameworks\dita), modifying it and distributing it, you can choose to extend the DITA framework and distribute the extension. This way you will benefit from new functionality added to the base framework by newer Oxygen versions and still keep your customizations.

The steps below describe how an extension of the DITA framework which adds a custom set of new file templates can be constructed and shared:
  1. Somewhere on your disk, in a place where you have full write access, create a folder structure like: custom_frameworks/dita-extension.
  2. In that new folder structure create another folder custom_frameworks/dita-extension/templates which will contain all your custom new topic templates.
  3. In the Document Type Association / Locations preferences page add in your Additional frameworks directories list the path to your custom_frameworks folder.
  4. In the Document Type Association preferences page select the DITA document type configuration and use the Extend button to create an extension for it.
  5. Give a custom name to the extension, for example DITA - Custom and then change its Storage to external, then save it to a path like: path/to/.../custom_frameworks/dita-extension/dita-extension.framework.
  6. Make changes to the extension, go to the Templates tab, remove all previous entries from it and add a new entry pointing to your custom templates folder: ${frameworkDir}/templates.
  7. Click OK to close the dialog and then either OK or Apply to save the preferences changes.

After you perform the steps above you will have in the dita-extension folder a fully functioning framework extension which can be shared with others.
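The folder layout from the first two steps can also be created with a short script, which may be handy if several writers need the same starting point. This is only a sketch: the helper name is made up, and the remaining steps still have to be done in the Oxygen preferences.

```python
from pathlib import Path


def create_extension_skeleton(base_dir):
    """Create custom_frameworks/dita-extension/templates under base_dir."""
    templates_dir = Path(base_dir) / "custom_frameworks" / "dita-extension" / "templates"
    # parents=True creates the intermediate folders; exist_ok=True makes reruns harmless.
    templates_dir.mkdir(parents=True, exist_ok=True)
    return templates_dir


if __name__ == "__main__":
    # Creates the folders under the current working directory.
    print(create_extension_skeleton("."))
```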

The framework can then be shared with others in several ways:
  • Copy it to their [OXYGEN_DIR]/frameworks directory.
  • Create somewhere on disk a custom_frameworks folder, copy the framework there and then from the Document Type Association / Locations preferences page add in your Additional frameworks directories list the path to the custom_frameworks folder.
  • Distribute the framework along with a project.

    Follow these steps:
    1. On your local drive, create a directory with full write access, containing the project files and a custom_frameworks folder containing your dita-extension framework.
    2. Start the application, go to the Project view and create a project. Save it in the newly created directory.
    3. In the Document Type Association / Locations preferences page, select Project Options at the bottom of the page.
    4. Add in the additional framework directories list an entry like ${pd}/custom_frameworks.
    5. Add other resources to your project, for example you can have all your DITA content located inside the project folder.
    6. You can then share the new project directory with other users. For example, you can commit it to your version control system and have them update their working copy. When they open the customized project file in the Project view, the new document type becomes available in the list of Document Types.
  • Deploy the framework/document type configuration as an add-on.

After your team members install the framework, they can check in the Document Type Association preferences page, in the list of Document Types, that the framework is present and that it appears before the bundled DITA framework (meaning that it has higher priority).

You can use the framework extension mechanism to customize lots of aspects of the DITA functionality in Oxygen. For example you can remove various elements from the content completion list:

Tuesday, December 08, 2015

DITA Map Validate and Check for Completeness Overview

Validate and Check for Completeness is an action available on the DITA Maps Manager view toolbar which can be used to perform thorough checks on the entire DITA Map structure and set of referenced topics. We made this action available a couple of years ago and since then, based on your direct feedback, we have kept adding checks and functionality to it. We have always taken care to optimize the processing speed so that projects containing thousands of resources can be validated in 10-15 seconds.

In this blog post I will try to list all the checks that the action performs to ensure that your DITA content is valid and proper:
  • Validate each DITA resource directly or indirectly referenced from your DITA Map with its associated DTD or XML Schema and report any errors which may arise.
  • Validate each DITA resource with an additional Schematron resource which you can provide. Schematron is quite handy when it comes to enforcing internal rules on the DITA content and we use it quite a lot for checking our user's manual.
  • Batch validate referenced DITA resources. This setting validates each DITA resource according to the validation scenario associated with it in Oxygen. This will decrease the validation speed quite a bit but if you have DITA 1.3 resources which are Relax NG based you should check it in order to validate each resource according to the Relax NG Schema.
  • Use specific DITAVAL or profiling condition filters when performing the validation. From a single published DITA Map you may get multiple publications based on the profiling filters applied. Because these filters are used to remove entire topics or parts of topics, you may have links and conrefs which become invalid when certain filters are applied on the map. So it makes sense to validate your DITA project by applying all profiling filters you would apply when publishing it in order to be aware of these potential broken references.
  • Report profiling attributes or values which are not valid according to the Subject Scheme Map associated with your project. You can read more about controlling profiling attributes and values here:
  • Identify possible conflicts in profile attribute values. When the profiling attributes of a topic contain values that are not found in parent topic profiling attributes, the content of the topic is overshadowed when generating profiled output.
  • Check the existence of non-DITA referenced resources. You will get reports if links to local images or other resources are broken. You can also decide to verify the existence of remote links. For example if you have links to various external web sites, you might be interested in seeing if those remote servers are still there.
  • Report links to topics not referenced in DITA maps. Checks that every topic targeted by a link is also referenced in the DITA map. Otherwise you may get working links to topics which are not included in the table of contents.
  • Check for duplicate topic IDs within the DITA map context. By default the topic ID can be used in the WebHelp output for context sensitive help. Also, certain CMSs require each topic ID to be unique in the entire DITA Map.
  • Report elements with the same ID placed in the same DITA Topic according to the specification.
  • Report missing domains attribute which may indicate an improper DITA specialization.
  • Report invalid class attribute values according to the specification.
  • Report invalid key names according to the specification.
  • Report references to missing keys or links which refer to keys which have no target resource defined on them.
  • Report problems when elements referenced using DITA content reference range are not siblings or are not properly sequenced.
  • Report links which have no target set on them either via href or keyref.
  • Report non-portable absolute references to DITA resources.
  • Report when links contain invalidly encoded characters or Windows-like path separators.
  • Report when resources are referenced with incorrect path capitalization.
  • Report a mismatch between the referenced resource type and its format attribute.
  • Report a mismatch between the referenced resource type and its type attribute.
  • Report topic references in a DITA Map pointing to non-topic elements in the target topics.
  • Report invalid content references and content key references, references to non-existing resources, to non-existing IDs, report when the source element is not a specialization of the target element.
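To make one of these checks concrete, here is a minimal sketch of detecting elements which share the same ID inside a single topic. It is not how Oxygen implements the check: it treats the whole file as one topic, ignores nested topics, and the function name is made up for the example.

```python
from collections import Counter
import xml.etree.ElementTree as ET


def find_duplicate_ids(topic_xml):
    """Return the sorted ID values that appear on more than one element."""
    ids = [
        element.get("id")
        for element in ET.fromstring(topic_xml).iter()
        if element.get("id")
    ]
    return sorted(value for value, count in Counter(ids).items() if count > 1)
```

Running a check like this over every topic referenced from a map is cheap, which hints at why even large projects can be validated quickly.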

I think I covered most of the checks that this validation does.

Are there other checks you would like to see in a future version? Would you like to see this validation available as a separate process which could be run on a server?