Deleting Elements in a Refactoring Operation
Recently, a writer wanted to remove the index from their DITA book. This required the following:
- Removing the <indexlist> element from the map:
  <backmatter>
    <booklists>
      <indexlist/>
    </booklists>
  </backmatter>
- Removing topic-level <indexterm> elements from topic prologs:
  <topic id="feature_A">
    <title>About Feature A</title>
    <prolog>
      <metadata>
        <keywords>
          <indexterm>feature A</indexterm>
        </keywords>
      </metadata>
    </prolog>
- Removing inline <indexterm> elements from topic content:
  <p>This is about <indexterm>feature B</indexterm>feature B.</p>
Oxygen provides a "Delete element" refactoring operation. However, it does precisely what it says: it deletes the specified elements and leaves everything else in place, including the now-empty parent elements:
<topic id="feature_A">
  <title>About Feature A</title>
  <prolog>
    <metadata>
      <keywords>
      </keywords>
    </metadata>
  </prolog>
I decided to create an XSLT refactoring operation that does the following:
- Deletes the specified elements
- Deletes any containing (ancestor) elements that became empty as a result
- Updates whitespace/newline formatting around deleted elements as needed
- Serves as an easily customizable template for other element deletion uses
Fortunately, as described in Custom Refactoring Operations, Oxygen allows us to package up customized XSLT refactoring operations in an easy-to-use way. For the XML descriptor file, put this content into a remove-index.xml file:
<?xml version="1.0" encoding="UTF-8"?>
<refactoringOperationDescriptor
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns="http://www.oxygenxml.com/ns/xmlRefactoring" id="remove-index"
    name="Remove index from a DITA book">
  <description>Remove index terms and backmatter index from a DITA book.</description>
  <script type="XSLT" href="remove-index.xsl"/>
  <category>DITA</category>
</refactoringOperationDescriptor>
For the XSLT file itself, put this content into a remove-index.xsl file:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs"
    version="2.0">

  <!-- elements to delete -->
  <xsl:variable name="elements-to-delete" select="('indexterm', 'indexlist')"/>

  <!-- delete up to (and including) these elements, if they become empty -->
  <xsl:variable name="delete-up-to" select="('prolog', 'backmatter')"/>

  <!-- baseline identity transform -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- remove elements-to-delete -->
  <xsl:template match="*[name() = $elements-to-delete]"/>

  <!-- remove whitespace/newlines before elements-to-delete -->
  <xsl:template match="text()
      [following-sibling::*[1]
        [name() = $elements-to-delete]]
      [matches(., '^\s*\n\s*$')]"/>

  <!-- remove elements that contain our to-be-deleted elements,
       but only if they become empty -->
  <xsl:template match="*[ancestor-or-self::*[name() = $delete-up-to]]
      [descendant::*[name() = $elements-to-delete]]">
    <!-- apply templates to this element's contents and see what we get -->
    <xsl:variable name="contents" as="node()*">
      <xsl:apply-templates select="node()"/>
    </xsl:variable>
    <!-- if children elements remain, copy this element (and its preceding whitespace/newlines)
         and put its attributes and contents inside -->
    <xsl:if test="$contents[self::*]">
      <!-- xsl:copy-of and a plain xsl:copy are used here because the select
           attribute of xsl:copy is only available in XSLT 3.0 -->
      <xsl:copy-of select="preceding-sibling::node()[1][self::text()][matches(., '^\s*\n\s*$')]"/>
      <xsl:copy>
        <!-- keep the element's own attributes, which the contents variable does not include -->
        <xsl:apply-templates select="@*"/>
        <xsl:sequence select="$contents"/>
      </xsl:copy>
    </xsl:if>
  </xsl:template>

  <!-- remove whitespace/newlines before containing elements
       (we re-add whitespace/newlines above, if needed) -->
  <xsl:template match="text()
      [following-sibling::*[1]
        [ancestor-or-self::*[name() = $delete-up-to]]
        [descendant::*[name() = $elements-to-delete]]]
      [matches(., '^\s*\n\s*$')]"/>

</xsl:stylesheet>
At the beginning of the refactoring operation, two XSLT variables are defined:
- elements-to-delete: the element names to delete, regardless of their contents
- delete-up-to: the highest-level containing element names to delete, if they become empty
The refactoring operation works as follows:
- The elements-to-delete elements are always deleted.
  - Any whitespace/newline text() nodes directly preceding them are also deleted.
- Any elements that (1) contain an elements-to-delete element as a descendant, (2) are contained by or are themselves a delete-up-to element, and (3) become empty due to the element deletion, are deleted.
  - To determine if a "containing" element becomes empty due to the deletion, <xsl:apply-templates> is called, then the results are checked to see if any elements remain. This is what allows the deletion to continue dynamically up through the containing elements.
- To conditionally keep the whitespace/newline text() node directly preceding a "containing" element, two templates work together:
  - A standalone unconditional template always deletes the whitespace/newline text() node preceding a containing element, whether it will be kept or not.
  - Inside the template that conditionally keeps containing elements, that same preceding text() node is re-included if the containing element is kept.
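One consequence of the "become empty" check is worth spelling out. The test $contents[self::*] looks only for surviving child elements, so a containing element that still holds text but no elements would also be removed. That is harmless for element-only containers such as <prolog> and <backmatter>. If the delete-up-to elements could hold mixed content, one possible adjustment (an assumption on my part, not something the stylesheet above does) is to count non-whitespace text as remaining content too:

```xml
<!-- Hypothetical variant of the emptiness test: keep the container if any
     child element OR any non-whitespace text survives the deletion -->
<xsl:if test="$contents[self::* or self::text()[normalize-space()]]">
  <!-- same body as in the stylesheet: re-add the preceding whitespace,
       then copy the element and put $contents inside -->
</xsl:if>
```

Whitespace-only text nodes still count as "nothing left", so indentation alone does not keep a container alive.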
The following example shows a <prolog> element that disappears completely because it does not contain anything other than an <indexterm> element:
| Before refactoring | After refactoring |
|---|---|
| | |
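As a sketch of such a case, here is the topic from the beginning of this article run through the stylesheet. The <body> content and closing tags are filled in for illustration, and the result is what the templates should produce when traced by hand:

```xml
<!-- Before refactoring -->
<topic id="feature_A">
  <title>About Feature A</title>
  <prolog>
    <metadata>
      <keywords>
        <indexterm>feature A</indexterm>
      </keywords>
    </metadata>
  </prolog>
  <body>
    <p>This is about <indexterm>feature B</indexterm>feature B.</p>
  </body>
</topic>

<!-- After refactoring: <keywords>, <metadata>, and <prolog> each became
     empty in turn, so all three are gone along with their indentation -->
<topic id="feature_A">
  <title>About Feature A</title>
  <body>
    <p>This is about feature B.</p>
  </body>
</topic>
```

The text before the inline <indexterm> survives because it is not a whitespace-and-newline-only node.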
The following example shows a <prolog> element that is partially kept because it also contains a <resourceid> element:
| Before refactoring | After refactoring |
|---|---|
| | |
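A sketch of this second case, again traced by hand. The <resourceid> element and its id value are hypothetical, chosen only to give the <prolog> something to keep:

```xml
<!-- Before refactoring -->
<topic id="feature_A">
  <title>About Feature A</title>
  <prolog>
    <metadata>
      <keywords>
        <indexterm>feature A</indexterm>
      </keywords>
    </metadata>
    <resourceid id="feature_A"/>
  </prolog>
</topic>

<!-- After refactoring: <metadata> became empty and was removed, but
     <prolog> still has a child element, so it is copied back with the
     whitespace that preceded it -->
<topic id="feature_A">
  <title>About Feature A</title>
  <prolog>
    <resourceid id="feature_A"/>
  </prolog>
</topic>
```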
This same refactoring code can be adapted to other use cases by editing the elements-to-delete and delete-up-to variables as needed.
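For instance, a sketch of one such adaptation, assuming you wanted to strip <othermeta> elements from topics and maps and drop any <prolog> or <topicmeta> element left empty as a result. The element choices are illustrative and not part of the original operation:

```xml
<!-- Hypothetical adaptation: only these two variables change -->

<!-- elements to delete -->
<xsl:variable name="elements-to-delete" select="('othermeta')"/>

<!-- delete up to (and including) these elements, if they become empty -->
<xsl:variable name="delete-up-to" select="('prolog', 'topicmeta')"/>
```

An adapted copy would presumably also need its own descriptor file with a different id and name. Element-only containers like these are the safest fit, because the stylesheet decides whether a container is empty by looking for remaining child elements.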