Friday, December 18, 2015

Migrating to a Structured Standards-based Documentation Solution

Share to Facebook Share to Twitter Email This Share on Google Plus Share on Tumblr
Potential clients come to this world of structured content authoring from two main sources:
  1. They are starting fresh and after a little bit of comparing between structured and unstructured editing, between opened and closed solutions and some soul searching they come to regard structured authoring with a specific XML standard in general (and usually DITA in particular) as the possible solution for them.
  2. They are migrating from a previous unstructured or structured solution.
I think people in this second category start thinking about structured writing when they start encountering certain limitations with their current approach. These limitations they experience with their current system could be:
  • The need to reuse more content.

    With structured XML authoring in general and with DITA in particular you have so many ways of reusing content. In a previous blog post I tried to come up with an overview about all the ways in which you can reuse content using DITA:

  • Produce multiple outputs from the same content using some complex profiling conditions which are not supported in the current work flow.
  • Stop thinking about how the content is styled.

    You may want to focus more on the actual content and on semantically tagging it than on the way in which it will be presented in a certain output format.

  • Publish to more output formats than the current editing solution allows.

    Using a widely adopted open source standard like DITA for documentation also means having access to a variety of commercial and open source tools to generate various output formats from it. For example for obtaining the PDF you have about 5-6 distinct possible solutions: And for Mobile Friendly WebHelp you have 3-4 possible solutions:

  • Enforce certain internal rules on the documents.

    It's hard to impose best practices in unstructured documents. But with structured XML content, you can use Schematron to easily cover this aspect and even to provide quick fixes for your authors:

  • Benefit of advice and help from a larger community of writers and developers.

    When you are using a closed source solution, you may have only one forum and a couple of people willing to help. When you have a larger community you will be able to reach out with a single email to lots of people, and somebody may want to help you.

  • Share documentation between different companies.

    If a larger company which uses structured writing takes over a smaller one, the smaller company will need to adopt structured writing as well.

  • Own your content.

    Some editing solutions are closed source, you are forced to use a single tool because there are no other tools being to read that format. Then you need to ask yourself the question: "Is this content actually mine?"

  • Problems with your current tool vendor.

    If the format is closed source and the tool vendor is not responsive to your needs, you need to somehow move your content over to a market with multiple tool vendors available because competition also means smaller prices and better customer support.

Switching to structured content writing also has its problems. And I think the main ones are these:
  • The people. The fact that we all are reluctant to change. The learning curve. Writers might need to re-learn how to structure and write their documentation. Besides the technical aspects they will need to learn to divide content in small modules and to reuse parts in multiple places. Writers may not be willing to do this. We usually are very reluctant to change tools if we do not see instant benefits deriving from it.
  • Effort to convert the current available content to structured content. You can either choose manual conversion or automated conversion or in most cases a mixture of the two. Conversion will never be perfect, you will still need to go through the entire content and re-structure it taking into account module-based editing.
  • Customize the obtained output format. You may get out of the box various outputs from your content but you will always need to customize it to adhere to company standards. If you are using the DITA Open Toolkit for publishing you will need basic XSLT development skills to customize the PDF and CSS skills to customize the XHTML based output.
  • Money. You need to spend more money to get new tools, possibly a new CMS. Although I consider that starters, for a pilot project DITA does not need to be expensive. Here's how we're using DITA internally for our user's manual:
  • Sometimes you might need to control the styling of your obtained output so much and it would be impossible to separate the styling information from the content.

So can we draw a conclusion from all this?

Well, maybe not everybody interested in structured authoring will succeed to convert to it. But I think that one thing will hold true in most cases:

Once you convert to structured content, you will never go back.

Tuesday, December 15, 2015

Sharing New Custom File Templates for a Specific Vocabulary

Share to Facebook Share to Twitter Email This Share on Google Plus Share on Tumblr

The support Oxygen provides for editing DITA topics comes with quite an extensive set of new file templates used to create new DITA topic types. If you have a team of writers, you may want to filter out certain new file templates or add your custom new file templates, then share these custom templates with your team members.

This blog post will attempt to give you some clear steps for sharing a custom set of new file templates with your team.

All the original DITA new topic templates are located in the folder: OXYGEN_INSTALL_DIR\frameworks\dita\templates.

Instead of making changes directly to that folder, copying the entire DITA framework configuration folder (like OXYGEN_INSTALL_DIR\frameworks\dita), modifying and distributing it you can choose to extend the DITA framework and distribute the extension. In this way, you will benefit of new functionality added to the base framework by newer Oxygen versions and still use your customizations.

The steps below describe how an extension of the DITA framework which adds a custom set of new file templates can be constructed and shared:
  1. Create somewhere on your disk, in a place where you have full write access a folder structure like: custom_frameworks/dita-extension.
  2. In that new folder structure create another folder custom_frameworks/dita-extension/templates which will contain all your custom new topic templates.
  3. In the Document Type Association / Locations preferences page add in your Additional frameworks directories list the path to your custom_frameworks folder. Click OK or Apply in the Preferences dialog to save your changes.
  4. In the Document Type Association preferences page select the DITA document type configuration and use the Extend button to create an extension for it.
  5. Give a custom name to the extension, for example DITA - Custom and then change its Storage to external, then save it to a path like: path/to/.../custom_frameworks/dita-extension/dita-extension.framework.
  6. Make changes to the extension, go to the Templates tab, remove all previous entries from it and add a new entry pointing to your custom templates folder: ${frameworkDir}/templates.
  7. Click OK to close the dialog and then either OK or Apply to save the preferences changes.

After you perform the steps above you will have in the dita-extension folder a fully functioning framework extension which can be shared with others.

The framework can then be shared with others in several ways:
  • Copy it to their [OXYGEN_DIR]/frameworks directory.
  • Create somewhere on disk a custom_frameworks folder, copy the framework there and then from the Document Type Association / Locations preferences page add in your Additional frameworks directories list the path to the custom_frameworks folder.
  • Distribute the framework along with a project.

    Follow these steps:
    1. On your local drive, create a directory with full write access, containing the project files and a custom_frameworks folder containing your dita-extension framework.
    2. Start the application, go to the Project view and create a project. Save it in the newly created directory.
    3. In the Document Type Association / Locations preferences page, select Project Options at the bottom of the page.
    4. Add in the additional framework directories list an entry like ${pd}/custom_frameworks.
    5. Add other resources to your project, for example you can have all your DITA content located inside the project folder.
    6. You can then share the new project directory with other users. For example you can commit it to your version control system and have they update their working copy. When they open the customized project file in the Project view, the new document type becomes available in the list of Document Types.
  • Deploy the framework/document type configuration as an add-on.

After your team members install the framework they can check in Document Type Association preferences page in the list of Document Types to see if the framework is present and if it appears before the bundled DITA framework (meaning that it has higher priority).

You can use the framework extension mechanism to customize lots of aspects of the DITA functionality in Oxygen. For example you can remove various elements from the content completion list:

Tuesday, December 08, 2015

DITA Map Validate and Check for Completeness Overview

Share to Facebook Share to Twitter Email This Share on Google Plus Share on Tumblr

The Validate and Check For Completeness is an action available on the DITA Maps Manager view toolbar and it can be used to perform thorough checks on the entire DITA Map structure and set of referenced topics. We've made this action available to you a couple of years ago and during these years, based on your direct feedback we kept adding additional checks and functionality to it. We always took care to optimize the processing speed in order to allow for validating projects containing thousands of resources in 10-15 seconds.

In this blog post I will try to make a list of all the checks that the action does in order to ensure you that your DITA content is valid and proper:
  • Validate each DITA resource directly or indirectly referenced from your DITA Map with its associated DTD or XML Schema and report any errors which may arise.
  • Validate each DITA resource with an additional Schematron resource which you can provide. Schematron is quite handy when it comes to enforcing internal rules on the DITA content and we use it quite a lot for checking our user's manual.
  • Batch validate referenced DITA resources. This setting validates each DITA resource according to the validation scenario associated with it in Oxygen. This will decrease the validation speed quite a bit but if you have DITA 1.3 resources which are Relax NG based you should check it in order to validate each resource according to the Relax NG Schema.
  • Use specific DITAVAL or profiling condition filters when performing the validation. From a single published DITA Map you may get multiple publications based on the profiling filters applied. Because these filters are used to remove entire topics or parts of topics, you may have links and conrefs which become invalid when certain filters are applied on the map. So it makes sense to validate your DITA project by applying all profiling filters you would apply when publishing it in order to be aware of these potential broken references.
  • Report profiling attributes or values which are not valid according to the Subject Scheme Map associated with your project. You can read more about controlling profiling attributes and values here:
  • Identify possible conflicts in profile attribute values. When the profiling attributes of a topic contain values that are not found in parent topic profiling attributes, the content of the topic is overshadowed when generating profiled output.
  • Check the existence of non-DITA referenced resources. You will get reports if links to local images or other resources are broken. You can also decide to verify the existence of remote links. For example if you have links to various external web sites, you might be interested in seeing if those remote servers are still there.
  • Report links to topics not referenced in DITA maps. Checks that all referenced topics are linked in the DITA map. Otherwise you may get working links to topics which are not included in the table of contents.
  • Check for duplicate topic IDs within the DITA map context. By default the topic ID can be used in the WebHelp output for context sensitive help. Also certain CMSs require that a certain topic ID would be unique in the entire DITA Map.
  • Report elements with the same ID placed in the same DITA Topic according to the specification.
  • Report missing domains attribute which may indicate an improper DITA specialization.
  • Report invalid class attribute values according to the specification.
  • Report invalid key names according to the specification.
  • Report references to missing keys or links which refer to keys which have no target resource defined on them.
  • Report problems when elements referenced using DITA content reference range are not siblings or are not properly sequenced.
  • Report links which have no target set on them either via href or keyref.
  • Report non-portable absolute references to DITA resources.
  • Report when links contain invalid encoded characters or Windows-like path separators.
  • Report when resources are referenced with incorrect path capitalization.
  • Report a mismatch between the referenced resource type and its format attribute.
  • Report a mismatch between the referenced resource type and its type attribute.
  • Report topic references in a DITA Map pointing to non-topic elements in the target topics.
  • Report invalid content references and content key references, references to non-existing resources, to non-existing IDs, report when the source element is not a specialization of the target element.

I think I covered most of the checks that this validation does.

Are there other checks you would like to see in a future version? Would you like to see this validation available as a separate process which could be run on a server?