Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

xml - XSLT Ignore duplicate elements across multiple files

I recently asked a question about how to ignore duplicate elements and got some good responses about using 'preceding' and the Muenchian Method. However, I was wondering whether it is possible to do this across multiple files, using an index XML file.

Index.xml

<?xml-stylesheet type="text/xsl" href="merge2.xsl"?>
<list>
    <entry name="File1.xml" />
    <entry name="File2.xml" />
</list>

Example of XML file

<Main>
    <Records>
        <Record>
            <Description>A</Description>
        </Record>
        <Record>
            <Description>A</Description>
        </Record>
        <Record>
            <Description>B</Description>
        </Record>
        <Record>
            <Description>C</Description>
        </Record>
    </Records>
    <Records>
        <Record>
            <Description>B</Description>
        </Record>
        <Record>
            <Description>A</Description>
        </Record>
        <Record>
            <Description>C</Description>
        </Record>
        <Record>
            <Description>C</Description>
        </Record>
    </Records>
</Main>

Merge2.xsl

  <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" >
  <xsl:output method="xml" indent="yes" />
  <xsl:key name="Record-by-Description" match="Record" use="Description"/>

  <xsl:template match="@* | node()">
    <xsl:apply-templates select="@* | node()"/>
  </xsl:template>

  <xsl:template match="Main">
    <table>
      <tr>
        <th>Type</th>
        <th>Count</th>
      </tr>
      <xsl:apply-templates select="Records"/>
    </table>
  </xsl:template>

  <xsl:template match="Records">
    <xsl:apply-templates select="Record[generate-id() = generate-id(key('Record-by-Description', Description)[1])]" mode="group"/>
  </xsl:template>

  <xsl:template match="Record" mode="group">
    <tr>
      <td>
        <xsl:value-of select="Description"/>
      </td>
      <td>
        <xsl:value-of select="count(key('Record-by-Description', Description))"/>
      </td>
    </tr>
  </xsl:template>

</xsl:stylesheet>

This works fine on one file and gives me the desired result: one table in which each unique item is displayed once, together with its count. However, I have been unable to produce the desired result when going through index.xml for multiple files.

I have tried using a separate template that targets index.xml and applies the 'Main' template to the different XML files, and I have also tried using a for-each to cycle through the different files.

Before being introduced to the Muenchian Method, I was using for-each with 'preceding' to check for duplicate nodes. However, 'preceding' only seems to search back through the current document, and I have been unable to find information on using it across multiple documents.

Is it possible with either of these methods to be able to search through multiple documents for duplicated element text?

Many thanks for any help.

1 Reply

Keys are built per document: key() only returns nodes from the document that contains the context node. Muenchian grouping applied directly with a key therefore cannot identify and remove duplicates that are spread over more than one document.
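To see this per-document scoping in practice, the following is a minimal sketch (not part of the original stylesheet) that takes index.xml as input, moves the context node into each listed file in turn, and prints how many Record elements with the Description 'A' the key finds there. The literal 'A' and the text output format are illustrative assumptions; the file layout is the one shown in the question.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:key name="Record-by-Description" match="Record" use="Description"/>

  <!-- Input document: index.xml -->
  <xsl:template match="/">
    <xsl:for-each select="list/entry">
      <!-- Keep the attribute node itself so that document() resolves the
           file name relative to index.xml -->
      <xsl:variable name="file" select="@name"/>
      <!-- Make the root of the listed file the context node; key() now
           searches that file only -->
      <xsl:for-each select="document($file)">
        <xsl:value-of select="$file"/>
        <xsl:text>: </xsl:text>
        <xsl:value-of select="count(key('Record-by-Description', 'A'))"/>
        <xsl:text>&#10;</xsl:text>
      </xsl:for-each>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Each output line reports the count for one file only. The key never adds the counts of the two files together, which is why the grouping needs a single merged document.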

You could however first merge the two documents into one and then apply the Muenchian grouping to the merged document.

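If you want to merge and group in one stylesheet, note that in XSLT 1.0 the merged tree held in a variable is a result tree fragment, so you need exsl:node-set or a similar extension function to convert it to a node-set before applying templates to it: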

  <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
    xmlns:exsl="http://exslt.org/common" exclude-result-prefixes="exsl">

  <xsl:output method="xml" indent="yes" />
  <xsl:key name="Record-by-Description" match="Record" use="Description"/>

  <xsl:template match="/">
    <xsl:variable name="merged-rtf">
      <Main>
        <xsl:copy-of select="document(list/entry/@name)/Main/Records"/>
      </Main>
    </xsl:variable>
    <xsl:apply-templates select="exsl:node-set($merged-rtf)/Main"/>
  </xsl:template>

  <xsl:template match="@* | node()">
    <xsl:apply-templates select="@* | node()"/>
  </xsl:template>

  <xsl:template match="Main">
    <table>
      <tr>
        <th>Type</th>
        <th>Count</th>
      </tr>
      <xsl:apply-templates select="Records"/>
    </table>
  </xsl:template>

  <xsl:template match="Records">
    <xsl:apply-templates select="Record[generate-id() = generate-id(key('Record-by-Description', Description)[1])]" mode="group"/>
  </xsl:template>

  <xsl:template match="Record" mode="group">
    <tr>
      <td>
        <xsl:value-of select="Description"/>
      </td>
      <td>
        <xsl:value-of select="count(key('Record-by-Description', Description))"/>
      </td>
    </tr>
  </xsl:template>

</xsl:stylesheet>

You would now pass your index.xml as the main input document to the stylesheet.

If you want to run this transformation in Internet Explorer, you have two options. Either replace exsl:node-set with Microsoft's node-set extension function (written ms:node-set here, with the prefix bound to the namespace urn:schemas-microsoft-com:xslt), or use the approach described at http://dpcarlisle.blogspot.de/2007/05/exslt-node-set-function.html to make sure the exsl:node-set function is implemented.
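A further option, not taken from the linked post, is to let the stylesheet pick whichever node-set function the processor offers by testing with function-available(). The sketch below shows only the stylesheet header and the root template; it assumes the key declaration and the remaining templates (Main, Records, Record) are carried over unchanged from the stylesheet above. The msxsl prefix is an arbitrary choice, and the error message text is illustrative.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    xmlns:msxsl="urn:schemas-microsoft-com:xslt"
    exclude-result-prefixes="exsl msxsl">

  <xsl:output method="xml" indent="yes"/>
  <xsl:key name="Record-by-Description" match="Record" use="Description"/>

  <xsl:template match="/">
    <!-- Merge the Records of all files listed in index.xml -->
    <xsl:variable name="merged-rtf">
      <Main>
        <xsl:copy-of select="document(list/entry/@name)/Main/Records"/>
      </Main>
    </xsl:variable>
    <!-- Only the branch whose test succeeds is evaluated, so the
         unavailable function is never called -->
    <xsl:choose>
      <xsl:when test="function-available('exsl:node-set')">
        <xsl:apply-templates select="exsl:node-set($merged-rtf)/Main"/>
      </xsl:when>
      <xsl:when test="function-available('msxsl:node-set')">
        <xsl:apply-templates select="msxsl:node-set($merged-rtf)/Main"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:message terminate="yes">No node-set extension function is available.</xsl:message>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- The templates for Main, Records and Record (mode="group") and the
       catch-all template go here, unchanged from the stylesheet above -->

</xsl:stylesheet>
```

This keeps a single stylesheet usable with both kinds of processor, at the cost of a slightly longer root template.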

