Tuesday, December 08, 2015

DITA Map Validate and Check for Completeness Overview

Share to Facebook Share to Twitter Email This Share on Google Plus Share on Tumblr

The Validate and Check For Completeness is an action available on the DITA Maps Manager view toolbar and it can be used to perform thorough checks on the entire DITA Map structure and set of referenced topics. We've made this action available to you a couple of years ago and during these years, based on your direct feedback we kept adding additional checks and functionality to it. We always took care to optimize the processing speed in order to allow for validating projects containing thousands of resources in 10-15 seconds.

In this blog post I will try to make a list of all the checks that the action does in order to ensure you that your DITA content is valid and proper:
  • Validate each DITA resource directly or indirectly referenced from your DITA Map with its associated DTD or XML Schema and report any errors which may arise.
  • Validate each DITA resource with an additional Schematron resource which you can provide. Schematron is quite handy when it comes to enforcing internal rules on the DITA content and we use it quite a lot for checking our user's manual.
  • Batch validate referenced DITA resources. This setting validates each DITA resource according to the validation scenario associated with it in Oxygen. This will decrease the validation speed quite a bit but if you have DITA 1.3 resources which are Relax NG based you should check it in order to validate each resource according to the Relax NG Schema.
  • Use specific DITAVAL or profiling condition filters when performing the validation. From a single published DITA Map you may get multiple publications based on the profiling filters applied. Because these filters are used to remove entire topics or parts of topics, you may have links and conrefs which become invalid when certain filters are applied on the map. So it makes sense to validate your DITA project by applying all profiling filters you would apply when publishing it in order to be aware of these potential broken references.
  • Report profiling attributes or values which are not valid according to the Subject Scheme Map associated with your project. You can read more about controlling profiling attributes and values here:http://blog.oxygenxml.com/2015/07/controlled-attribute-values-for-your.html.
  • Identify possible conflicts in profile attribute values. When the profiling attributes of a topic contain values that are not found in parent topic profiling attributes, the content of the topic is overshadowed when generating profiled output.
  • Check the existence of non-DITA referenced resources. You will get reports if links to local images or other resources are broken. You can also decide to verify the existence of remote links. For example if you have links to various external web sites, you might be interested in seeing if those remote servers are still there.
  • Report links to topics not referenced in DITA maps. Checks that all referenced topics are linked in the DITA map. Otherwise you may get working links to topics which are not included in the table of contents.
  • Check for duplicate topic IDs within the DITA map context. By default the topic ID can be used in the WebHelp output for context sensitive help. Also certain CMSs require that a certain topic ID would be unique in the entire DITA Map.
  • Report elements with the same ID placed in the same DITA Topic according to the specification.
  • Report missing domains attribute which may indicate an improper DITA specialization.
  • Report invalid class attribute values according to the specification.
  • Report invalid key names according to the specification.
  • Report references to missing keys or links which refer to keys which have no target resource defined on them.
  • Report problems when elements referenced using DITA content reference range are not siblings or are not properly sequenced.
  • Report links which have no target set on them either via href or keyref.
  • Report non-portable absolute references to DITA resources.
  • Report when links contain invalid encoded characters or Windows-like path separators.
  • Report when resources are referenced with incorrect path capitalization.
  • Report a mismatch between the referenced resource type and its format attribute.
  • Report a mismatch between the referenced resource type and its type attribute.
  • Report topic references in a DITA Map pointing to non-topic elements in the target topics.
  • Report invalid content references and content key references, references to non-existing resources, to non-existing IDs, report when the source element is not a specialization of the target element.

I think I covered most of the checks that this validation does.

Are there other checks you would like to see in a future version? Would you like to see this validation available as a separate process which could be run on a server?

6 comments:

  1. Hi Radu,

    thank you for posting this. I really like to perform this validation via CLI as a post commit hook to run in a CI server like Travis or Jenkins. Great post!

    ReplyDelete
    Replies
    1. Thanks Stefan.
      Internally we run in an automated build script the validation and check for completeness every night on our user's manual.
      And I also think it would be a good idea to expose this type of scripted validation to our users.

      Delete
  2. Harald Hermann7:25 PM

    Thanks a lot. We also would like to invoke this validation from CI Jenkins. Is it available already?

    ReplyDelete
    Replies
    1. Not yet, we'll try to look into this after we release Oxygen 18.0, I will try to update this blog post when the feature is implemented.

      Delete
  3. I would like to second (third?) the request for a CLI validation and completeness check. It would be invaluable for validating files converted to DITA from other formats.

    ReplyDelete
    Replies
    1. We just released Oxygen 18.1 which allows using a special scripting license for the validation:
      https://oxygenxml.com/oxygen_scripting.html

      Delete