Friday, January 25, 2019

Migrating Various Document Formats to DITA


Most companies do not start new DITA-based projects from scratch. They already have content written in various other formats and somehow they need that content converted to DITA. In this blog post, I will offer some conversion advice depending on the format of your current project.

Migrating DocBook content to DITA

Because DocBook content is XML, migrating it to DITA is quite straightforward:
  1. First, convert the DocBook document to a single large DITA composite file using the predefined transformation scenario bundled with Oxygen called DocBook to DITA.
  2. There is a utility XSLT stylesheet on the Oxygen XML GitHub account that can convert a DITA composite to a DITA map with separate DITA topic files: https://github.com/oxygenxml/old-userguide-docbookbased/blob/master/split-DITA-topic.xsl
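To make the second step more concrete, here is a minimal Python sketch of what splitting a composite involves. It is not the Oxygen stylesheet linked above, only an illustration: the function name split_composite is hypothetical, each topic is assumed to carry an id attribute, and DOCTYPE declarations are left out for brevity.

```python
import xml.etree.ElementTree as ET

# Topic element names handled by this sketch (specializations are ignored).
TOPIC_TAGS = {"topic", "concept", "task", "reference"}


def split_composite(composite_xml):
    """Return (map_xml, {filename: topic_xml}) for a DITA composite string."""
    root = ET.fromstring(composite_xml)
    topics = {}
    map_el = ET.Element("map")

    def visit(topic, parent_ref):
        # Assumption: every topic has an id, which becomes its file name.
        filename = topic.get("id") + ".dita"
        ref = ET.SubElement(parent_ref, "topicref", href=filename)
        # Detach nested topics so each output file holds a single topic;
        # the nesting is preserved as nested topicrefs in the map.
        for child in [c for c in topic if c.tag in TOPIC_TAGS]:
            topic.remove(child)
            visit(child, ref)
        topics[filename] = ET.tostring(topic, encoding="unicode")

    for top in [c for c in root if c.tag in TOPIC_TAGS]:
        visit(top, map_el)
    return ET.tostring(map_el, encoding="unicode"), topics
```

A real conversion would also need to write the files to disk, add the proper DOCTYPE declarations, and rewrite cross references that pointed inside the composite.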

Migrating Microsoft Word content to DITA

The Oxygen XML User Manual has a detailed topic enumerating the possibilities to convert Microsoft Word content to DITA: https://www.oxygenxml.com/doc/versions/20.1/ug-editor/topics/ooxml-to-dita.html?hl=migrate%2Cdita

Migrating Excel content to DITA

You can use Oxygen's Smart Paste functionality to copy content from an Excel spreadsheet and paste it inside an opened DITA topic. Alternatively, the Oxygen Resources Converter add-on can batch convert Excel files to DITA: https://github.com/oxygenxml/oxygen-resources-converter

Migrating LibreOffice content to DITA

LibreOffice documents can be saved in Word format, and once you do that, you can convert the Word content to DITA as described above. Alternatively, you can save the LibreOffice documents to DocBook and then apply the DocBook to DITA conversion technique described above.
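If you have many LibreOffice documents, saving each one by hand gets tedious. As a sketch, assuming the soffice executable is on your PATH, the Word export can be scripted with LibreOffice's headless mode. The helper names below are hypothetical.

```python
import shutil
import subprocess


def soffice_command(source, out_dir):
    """Build the LibreOffice command line that saves `source` as .docx."""
    return ["soffice", "--headless", "--convert-to", "docx",
            "--outdir", out_dir, source]


def save_as_word(source, out_dir):
    # Assumption: LibreOffice is installed and `soffice` is on the PATH.
    if shutil.which("soffice") is None:
        raise RuntimeError("soffice not found on PATH")
    subprocess.run(soffice_command(source, out_dir), check=True)
```

The resulting .docx files can then go through the Word to DITA route described above.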

Migrating Google Docs to DITA

You have three possibilities to convert Google Docs to DITA using Oxygen:
  • Copy the content from Google Docs and paste it into a DITA topic opened in Oxygen's Author visual editing mode. This should convert the pasted content to DITA.
  • Save the Google document as OpenDocument Format (ODF), save the ODF document as DocBook with LibreOffice, then apply the DocBook to DITA transformation scenario shipped with Oxygen.
  • Save the Google document as HTML then use the Oxygen batch converter add-on to convert it to DITA: https://github.com/oxygenxml/oxygen-resources-converter

Migrating Markdown content to DITA

The DITA Open Toolkit publishing engine bundled with Oxygen allows you to reference Markdown files directly in a DITA map and either publish them directly or export the Markdown files to DITA one by one: https://www.oxygenxml.com/doc/versions/20.1/ug-editor/topics/markdown-dita-x-dita2.html. If you want to convert multiple Markdown documents at once, you can use the Oxygen Resources Converter add-on: https://github.com/oxygenxml/oxygen-resources-converter
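Referencing Markdown from a DITA map comes down to a topicref with format="markdown". The sketch below generates such a map for a list of Markdown files. The function name and the file names in the usage example are hypothetical, and the DOCTYPE declaration is omitted.

```python
import xml.etree.ElementTree as ET


def build_markdown_map(title, markdown_paths):
    """Return a DITA map (as a string) that references Markdown files."""
    map_el = ET.Element("map")
    ET.SubElement(map_el, "title").text = title
    for path in markdown_paths:
        # format="markdown" tells the publishing engine to parse the
        # target as Markdown instead of DITA XML.
        ET.SubElement(map_el, "topicref", href=path, format="markdown")
    return ET.tostring(map_el, encoding="unicode")


print(build_markdown_map("User Guide", ["intro.md", "install.md"]))
```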

Migrating HTML content to DITA

Using Oxygen's Smart Paste functionality, you can open the HTML documents in a web browser, then copy the contents and paste them into a DITA topic opened in Oxygen's Author visual editing mode. If you want to convert multiple HTML files, you can use the Oxygen Resources Converter add-on: https://github.com/oxygenxml/oxygen-resources-converter

Migrating unstructured FrameMaker to DITA

There is a FrameMaker plugin that can be used for this type of conversion: http://leximation.com/tools/info/fm2dita.php

Migrating MadCap content to DITA

Some recent MadCap versions seem to have facilities to export content directly to DITA. Otherwise, you will need to convert XHTML content to DITA with a custom XSLT stylesheet to preserve variable references.
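To illustrate what preserving variable references means, here is a small Python sketch (not the XSLT stylesheet itself) that turns MadCap variable elements into DITA ph elements with a keyref. The namespace URI, the MadCap:variable element with its name attribute, and the choice to reuse the variable name as the DITA key are all assumptions that you should check against your own project.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace URI used by MadCap Flare topic files.
MADCAP_NS = "http://www.madcapsoftware.com/Schemas/MadCap.xsd"


def variables_to_keyrefs(xhtml):
    """Parse the content and replace variable references with <ph keyref>."""
    root = ET.fromstring(xhtml)
    for el in list(root.iter("{%s}variable" % MADCAP_NS)):
        name = el.get("name")
        el.attrib.clear()
        # Renaming in place keeps the surrounding text (el.tail) intact.
        el.tag = "ph"
        el.set("keyref", name)
    return root
```

The rest of the XHTML markup would still need to be mapped to DITA elements, and the keys would have to be defined in a DITA map.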

Migrating other formats to DITA

You may find third-party applications (like Pandoc) that can convert your content to HTML or to an XML format like DocBook. Once you have HTML or DocBook content, you can convert it to DITA using the advice above.
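As an example of this route, the sketch below wraps a Pandoc call that converts a source file to DocBook (or HTML). It assumes pandoc is installed and on your PATH. The helper names and the sample file name are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def pandoc_command(source, target_format="docbook"):
    """Build a pandoc command line writing next to the source file."""
    source = Path(source)
    suffix = ".html" if target_format == "html" else ".xml"
    return ["pandoc", str(source), "--to", target_format,
            "--standalone", "--output", str(source.with_suffix(suffix))]


def convert(source, target_format="docbook"):
    # Assumption: pandoc is installed; the input format is guessed by
    # pandoc from the file extension.
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc not found on PATH")
    subprocess.run(pandoc_command(source, target_format), check=True)
```

For instance, convert("notes.rst") would produce notes.xml, which can then go through the DocBook to DITA steps.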