Migrating Various Document Formats to DITA
Most companies do not start new DITA-based projects from scratch. They already have content written in various other formats and somehow they need that content converted to DITA. In this blog post, I will offer some conversion advice depending on the format of your current project.
Migrating DocBook Content to DITA
You can migrate one or multiple DocBook documents to DITA using the Oxygen Batch Documents Converter add-on: https://www.oxygenxml.com/doc/ug-editor/topics/batch-converter-addon.html.
The DocBook to DITA conversion includes an option named "Create DITA maps from DocBook documents containing multiple sections". When this option is selected, each section in your DocBook document is split into an individual DITA topic, and the topics are referenced in a DITA map.
Migrating Microsoft Word Content to DITA
The Oxygen XML User Manual has a detailed topic enumerating the possibilities to convert Microsoft Word content to DITA: https://www.oxygenxml.com/doc/ug-editor/topics/ooxml-to-dita.html.
Migrating Excel Content to DITA
You can use Oxygen's Smart Paste functionality to copy content from an Excel spreadsheet and paste it into an open DITA topic. Alternatively, the Oxygen Batch Documents Converter add-on can batch convert Excel files to DITA: https://www.oxygenxml.com/doc/ug-editor/topics/batch-converter-addon.html.
Migrating LibreOffice Content to DITA
LibreOffice documents can be saved in Word format, and once you do that, you can convert the Word content to DITA as described above. Alternatively, you can save the LibreOffice documents to DocBook and then apply the DocBook to DITA conversion technique described above.
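If you have many LibreOffice documents, saving each one as Word by hand gets tedious. The sketch below builds a headless LibreOffice command line that does the same thing for a list of files. It assumes the LibreOffice launcher is on your PATH under the name soffice (on some systems it is libreoffice instead), and the file and folder names are hypothetical.

```python
import subprocess
from pathlib import Path


def soffice_to_docx_command(documents, out_dir, soffice="soffice"):
    """Build a headless LibreOffice command that saves documents as .docx.

    The launcher name "soffice" is an assumption; adjust it for your system.
    """
    docs = [Path(d).as_posix() for d in documents]
    if not docs:
        raise ValueError("no documents to convert")
    return [
        soffice,
        "--headless",            # run without opening the GUI
        "--convert-to", "docx",  # target format: Word
        "--outdir", Path(out_dir).as_posix(),
        *docs,
    ]


def convert(documents, out_dir):
    """Run the conversion; raises CalledProcessError if LibreOffice fails."""
    subprocess.run(soffice_to_docx_command(documents, out_dir), check=True)


if __name__ == "__main__":
    # Hypothetical file names, printed rather than executed.
    print(" ".join(soffice_to_docx_command(["guide.odt", "faq.odt"], "word")))
```

The resulting .docx files can then go through the Word to DITA route described above.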
Migrating Google Docs to DITA
You have several options:
- Copy the content from Google Docs and paste it into a DITA topic opened in Oxygen's Author visual editing mode. The pasted content should be converted to DITA.
- Save the Google document in OpenDocument Format (ODF), save the ODF document as DocBook with LibreOffice, then apply the DocBook to DITA transformation scenario shipped in Oxygen to convert DocBook to DITA.
- Save the Google document as HTML then use the Oxygen batch converter add-on to convert it to DITA: https://www.oxygenxml.com/doc/ug-editor/topics/batch-converter-addon.html.
Migrating Markdown Content to DITA
The DITA Open Toolkit publishing engine bundled with Oxygen allows you to reference Markdown files directly in a DITA map and either publish them directly or export the Markdown files to DITA one by one: https://www.oxygenxml.com/doc/ug-editor/topics/markdown-dita-2.html. If you want to convert multiple Markdown documents at once, you can use the Oxygen Batch Documents Converter add-on: https://www.oxygenxml.com/doc/ug-editor/topics/batch-converter-addon.html.
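To illustrate what referencing Markdown files from a DITA map looks like, here is a minimal sketch that generates such a map from a list of file names. The helper name, map title, and file names are hypothetical, and it assumes the format="markdown" attribute value is what your DITA-OT Markdown support expects, so check the linked topic before relying on it.

```python
from pathlib import Path
from xml.sax.saxutils import escape, quoteattr

MAP_DOCTYPE = '<!DOCTYPE map PUBLIC "-//OASIS//DTD DITA Map//EN" "map.dtd">'


def build_markdown_map(title, markdown_files):
    """Return DITA map markup with one topicref per Markdown file.

    Paths are written as given, so they should be relative to the
    location where the map will be saved.
    """
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        MAP_DOCTYPE,
        "<map>",
        f"  <title>{escape(title)}</title>",
    ]
    for md in markdown_files:
        href = quoteattr(Path(md).as_posix())
        # format="markdown" is assumed to mark the target as Markdown.
        lines.append(f'  <topicref href={href} format="markdown"/>')
    lines.append("</map>")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical file names.
    print(build_markdown_map("User Guide", ["intro.md", "setup/install.md"]))
```

Saving the output as a .ditamap file next to the Markdown files gives you a map you can open in Oxygen and publish or convert from.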
Migrating HTML Content to DITA
Using Oxygen's Smart Paste functionality, you can open the HTML documents in a web browser, then copy the content and paste it into a DITA topic opened in Oxygen's Author visual editing mode. If you want to convert multiple HTML files, you can use the Oxygen Batch Documents Converter add-on: https://www.oxygenxml.com/doc/ug-editor/topics/batch-converter-addon.html.
Migrating Unstructured FrameMaker to DITA
There is a detailed blog post enumerating the possibilities to convert Unstructured FrameMaker content to DITA: Migrating Unstructured Adobe FrameMaker Content to DITA.
Migrating MadCap Content to DITA
This open source project contains a stylesheet that attempts to convert a Flare project to DITA XML, along with instructions for using it. Alternatively, some recent MadCap versions appear to be able to export content directly to DITA.
Migrating Confluence Content to DITA
To convert Confluence content to DITA, you can use the Oxygen Batch Documents Converter add-on: https://www.oxygenxml.com/doc/ug-editor/topics/batch-converter-addon.html.
You first need to export the content to HTML. To do this, log in to your Confluence account and navigate to the space that you want to export. Then go to Space Settings→Export space and choose to export it as HTML. Back in Oxygen, use the Confluence to DITA action (available once the add-on is installed) to convert the exported index.html file into a DITA map with topics.
Migrating AsciiDoc to DITA
The Asciidoctor third-party application can be used to convert AsciiDoc content to DocBook. You can then convert the DocBook content to DITA using the DocBook to DITA conversion described above.
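The AsciiDoc to DocBook step can be scripted. The sketch below builds an Asciidoctor command line that uses its DocBook backend. It assumes the asciidoctor executable is installed and on your PATH; the function names and file names are hypothetical.

```python
import subprocess
from pathlib import Path


def asciidoctor_docbook_command(source, out_dir):
    """Build an Asciidoctor command that writes DocBook next to out_dir."""
    source = Path(source)
    target = Path(out_dir) / (source.stem + ".xml")
    return [
        "asciidoctor",
        "-b", "docbook",          # select the DocBook backend
        "-o", target.as_posix(),  # output file
        source.as_posix(),
    ]


def convert(source, out_dir):
    """Run Asciidoctor; raises CalledProcessError if the conversion fails."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(asciidoctor_docbook_command(source, out_dir), check=True)


if __name__ == "__main__":
    # Hypothetical file name, printed rather than executed.
    print(" ".join(asciidoctor_docbook_command("guide/intro.adoc", "docbook")))
```

The generated .xml files are the input for the DocBook to DITA conversion.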
Migrating reStructuredText to DITA
The Pandoc third-party application can be used to convert reStructuredText content to DocBook or HTML. Then, you can convert the DocBook or HTML content to DITA using the Oxygen Batch Documents Converter add-on.
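As a rough illustration of the Pandoc step, the following sketch builds a Pandoc command line that converts one reStructuredText file to DocBook or HTML. It assumes pandoc is installed and on your PATH; the helper names and file names are hypothetical.

```python
import subprocess
from pathlib import Path

# Output file extension for each supported Pandoc target format.
EXTENSIONS = {"docbook": ".xml", "html": ".html"}


def pandoc_rst_command(source, target_format="docbook"):
    """Build a Pandoc command converting an .rst file to DocBook or HTML."""
    if target_format not in EXTENSIONS:
        raise ValueError(f"unsupported target format: {target_format}")
    source = Path(source)
    target = source.with_suffix(EXTENSIONS[target_format])
    return [
        "pandoc",
        "-f", "rst",          # input format: reStructuredText
        "-t", target_format,  # output format
        "-s",                 # standalone document, not a fragment
        "-o", target.as_posix(),
        source.as_posix(),
    ]


def convert(source, target_format="docbook"):
    """Run Pandoc; raises CalledProcessError if the conversion fails."""
    subprocess.run(pandoc_rst_command(source, target_format), check=True)


if __name__ == "__main__":
    # Hypothetical file name, printed rather than executed.
    print(" ".join(pandoc_rst_command("docs/index.rst")))
```

The output is written next to the source file, so a folder of converted files can be handed to the Batch Documents Converter in one go.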
Migrating LaTeX to DITA
You can use a third-party application (like Pandoc) to convert LaTeX content to Word or HTML. Afterwards, use the Oxygen Batch Documents Converter add-on to convert the result to DITA: https://www.oxygenxml.com/doc/ug-editor/topics/batch-converter-addon.html.
Migrating Other Formats to DITA
You may find third-party applications (like Pandoc) that can convert your content to HTML or to an XML format like DocBook. Once you have HTML or DocBook content, you can convert it to DITA using the advice above.