Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


How to remove namespaces from XML using XSLT

I have a 150 MB XML file (it is sometimes even larger), and I need to remove all the namespaces from it. The code runs in Visual Basic 6.0, so I'm using DOM to load the XML. I was skeptical at first, but loading works fine.

I am trying the following XSLT, but it also removes all the attributes. I want to keep all the attributes and elements and remove only the namespaces. Apparently this happens because I have xsl:element but nothing equivalent for attributes. How can I include the attributes?

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" omit-xml-declaration="yes" version="1.0" encoding="UTF-8" />
    <xsl:template match="*">
        <xsl:element name="{local-name()}">
            <xsl:apply-templates select="@* | node()"/>
        </xsl:element>
    </xsl:template>
</xsl:stylesheet>

1 Reply


Your XSLT also removes the attributes because it has no template that copies them. <xsl:template match="*"> matches only elements, not attributes (or text, comments or processing instructions). The attributes selected by @* therefore fall through to the built-in template rule, which outputs only their string value as text.

Below is a stylesheet that removes all namespace definitions from the processed document but copies all other nodes and values: elements, attributes, comments, text and processing instructions. Note two things:

  1. Copying the attributes as they are is not enough to remove all namespaces. An attribute can belong to a namespace even when its containing element does not. Attributes therefore have to be recreated, just like elements. This is done with the <xsl:attribute> element.
  2. A well-formed XML document cannot contain an element that has two or more attributes with the same expanded name, but an element can have several attributes with the same local name if they are in different namespaces. This means that removing the namespace from attribute names causes data loss if an element has at least two attributes with the same local name: only one of them survives, and the others are dropped (or overwritten).
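
To make point 2 concrete, here is a small hypothetical input. The element name, the prefixes and the namespace URIs are invented for illustration and are not taken from the question's data.

```xml
<!-- Hypothetical input: a:id and b:id have different expanded names,
     so this element is well-formed, but both share the local name "id" -->
<item xmlns:a="urn:example:a" xmlns:b="urn:example:b"
      a:id="1" b:id="2" name="x"/>

<!-- After the namespaces are stripped, both attributes would be named "id".
     The result element can hold only one of them; which value survives
     depends on the order in which the processor visits the attributes. -->
```

If your data can contain such attributes, check for them before relying on a stylesheet that strips namespaces.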

...and the code:

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output indent="yes" method="xml" encoding="utf-8" omit-xml-declaration="yes"/>

    <!-- Stylesheet to remove all namespaces from a document -->
    <!-- NOTE: this will lead to attribute name clash, if an element contains
        two attributes with same local name but different namespace prefix -->
    <!-- Nodes that cannot have a namespace are copied as such -->

    <!-- template to copy elements -->
    <xsl:template match="*">
        <xsl:element name="{local-name()}">
            <xsl:apply-templates select="@* | node()"/>
        </xsl:element>
    </xsl:template>

    <!-- template to copy attributes -->
    <xsl:template match="@*">
        <xsl:attribute name="{local-name()}">
            <xsl:value-of select="."/>
        </xsl:attribute>
    </xsl:template>

    <!-- template to copy the rest of the nodes -->
    <xsl:template match="comment() | text() | processing-instruction()">
        <xsl:copy/>
    </xsl:template>

</xsl:stylesheet>

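You could also use <xsl:template match="node()"> instead of the last template, but then you need the priority attribute to keep elements from matching it. The patterns node() and * both have a default priority of -0.5, so without an explicit priority the two templates conflict for element nodes.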
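
A minimal sketch of that variant follows. It assumes the catch-all template gets priority="-1"; that value is my own choice, and any value below -0.5 should behave the same (raising the priority of the match="*" template would also work).

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output indent="yes" method="xml" encoding="utf-8" omit-xml-declaration="yes"/>

    <!-- elements: recreated without a namespace (default priority -0.5) -->
    <xsl:template match="*">
        <xsl:element name="{local-name()}">
            <xsl:apply-templates select="@* | node()"/>
        </xsl:element>
    </xsl:template>

    <!-- attributes: recreated without a namespace -->
    <xsl:template match="@*">
        <xsl:attribute name="{local-name()}">
            <xsl:value-of select="."/>
        </xsl:attribute>
    </xsl:template>

    <!-- catch-all for text, comments and processing instructions;
         the priority is an assumed value, set below -0.5 so that
         elements still go to the match="*" template above -->
    <xsl:template match="node()" priority="-1">
        <xsl:copy/>
    </xsl:template>

</xsl:stylesheet>
```

The explicit list comment() | text() | processing-instruction() in the original stylesheet avoids the priority question altogether, which is one reason to prefer it.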

