I have a fairly nested XML file that I'd like to transform with an XSL template to something a little simpler to make bulk loading the data into SQL more efficient. I wanted to do it in C++ (Codeblocks with gcc) but I'm having a bit of trouble just being able to load the document with any of the libraries I've come across, including MSXML. If anyone has any experience using MSXML in Codeblocks with gcc let me know!
I have a stylesheet that transforms the XML in Excel VBA with a DOMDocument but I don't want to depend on Excel. I figured the next best thing would be a VBScript.
The data are one or two text values that are held in <DATAVALUE>
nodes, descendants of 100 <LOCATION>
nodes. The first child of each <LOCATION>
node, called <LOCATIONNAME>
, holds a unique name for each <LOCATION>
node (i.e; NAME1
-NAME100
). The third and fourth children of the <LOCATION>
node (if there is a fourth child) are <DATA>
nodes, each holding a <DATAVALUE>
node. The file can have upwards of 1 million <SAMPLE>
nodes. Here is the XML:
<?xml version="1.0" encoding="utf-8"?>
<MYImportFile xmlns="urn:ohLookHEREaNamespacedeclaration">
<HEADERVERSION>1.10</HEADERVERSION>
<MESSAGE>Import</MESSAGE>
<MYBED>QUEEN</MYBED>
<SOURCE>SPRING </SOURCE>
<USERID>MMOUSE</USERID>
<DATETIME>2019-11-25T12:31:00</DATETIME>
<SAMPLE TYPE="No" APPLE="false">
<SAMPLEID>0000565</SAMPLEID>
<SAMPLECATEGORY>CLASS5</SAMPLECATEGORY>
<LOCATION APPLE="false">
<LOCATIONNAME>NAME1</LOCATIONNAME>
<READBY>MMOUSE</READBY>
<TIME>12:31:00</TIME>
<DATA>
<DATAVALUE>aaaa</DATAVALUE>
</DATA>
<DATA>
<DATAVALUE>bbbb</DATAVALUE>
</DATA>
</LOCATION>
'''''''''''''''''there are 100 LOCATION entries''''''''''''''''''''''''
<LOCATION APPLE="false">
<LOCATIONNAME>NAME100</LOCATIONNAME>
<READBY>MMOUSE</READBY>
<TIME>12:31:00</TIME>
<DATA>
<DATAVALUE>zzzz</DATAVALUE>
</DATA>
</LOCATION>
</SAMPLE>
'''''''''''''''''repeat for however many SAMPLES there are''''''''''''''''''''''
</MYImportFile>
I want to point something out so it's a little more clear what's going on. In the transformed xml document, one of the things I need to account for is when there is only one <DATA>
node in a <LOCATION>
. This is done by copying the first <DATAVALUE>
node into a second <DATAVALUE>
node in the new document. For example, the <DATAVALUE>
, "zzzz"
that appears twice in the transformed sheet only appears in the initial XML once. Here is what I want the transformed XML to look like:
<?xml version="1.0" encoding="UTF-8"?>
<MYImportFile>
<SAMPLE>
<SAMPLEID>0000565</SAMPLEID>
<NAME1_1>aaaa</NAME1_1>
<NAME1_2>bbbb</NAME1_2>
<NAME2_1>cccc</NAME2_1>
<NAME2_2>dddd</NAME2_2>
'''''''''''''''''there are 100 LOCATION entries transformed to NAME1-NAME100''''''''''''''''''''''''
<NAME100_1>zzzz</NAME100_1>
<NAME100_2>zzzz</NAME100_2>
</SAMPLE>
'''''''''''''''''repeat for however many SAMPLES there are''''''''''''''''''''''
</MYImportFile>
My StyleSheet (that works with VBA code):
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:b="urn:ohLookHEREaNamespacedeclaration" exclude-result-prefixes="b">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="/b:MYImportFile">
<MYImportFile>
<xsl:for-each select="b:SAMPLE">
<SAMPLE>
<SAMPLEID>
<xsl:value-of select="b:SAMPLEID"/>
</SAMPLEID>
<NAME1_1>
<xsl:value-of select="b:LOCATION/b:LOCATIONNAME[text() = 'NAME1']/../b:DATA[1]/b:DATAVALUE"/>
</NAME1_1>
<xsl:choose>
<xsl:when test="b:LOCATION/b:LOCATIONNAME[text() = 'NAME1']/../b:DATA[2]/b:DATAVALUE">
<NAME1_2>
<xsl:value-of select="b:LOCATION/b:LOCATIONNAME[text() = 'NAME1']/../b:DATA[2]/b:DATAVALUE"/>
</NAME1_2>
</xsl:when>
<xsl:otherwise>
<NAME1_2>
<xsl:value-of select="b:LOCATION/b:LOCATIONNAME[text() = 'NAME1']/../b:DATA[1]/b:DATAVALUE"/>
</NAME1_2>
</xsl:otherwise>
</xsl:choose>
'''''''''''''''''''there are 100 NAME entires to recieve the 100 locations
</SAMPLE>
</xsl:for-each>
</MYImportFile>
</xsl:template>
</xsl:stylesheet>
My Script:
Option Explicit
Const strInputFile = "C:PathfileName.xml"
Const strTemplateFile = "C:PathconvFileName.xsl"
Const strOutputFile = "C:Path
ewFileName.xml"
Dim objXMLDoc : Set objXMLDoc = WScript.CreateObject("Msxml2.DOMDocument")
objXMLDoc.async = False
objXMLDoc.loadXML(strInputFile)
objXMLDoc.SetProperty "SelectionNamespaces", "xmlns='urn:myNamespace'"
Dim objXSLDoc : Set objXSLDoc = WScript.CreateObject("Msxml2.DOMDocument")
objXSLDoc.async = False
objXSLDoc.loadXML(strTemplateFile)
Dim objNewXMLDoc : Set objNewXMLDoc = WScript.CreateObject("Msxml2.DOMDocument")
objXMLDoc.transformNodeToObject objXSLDoc, objNewXMLDoc
objNewXMLDoc.save strOutputFile
The error:
Line: 19
Char: 1
Error: The stylesheet does not contain a document element. The
stylesheet may be empty, or it may not be a well-formed XML document.
Code: 80004005
Source: msxml3.dll
I'm guessing either my script isn't quite right or there's a setting I'm missing, causing mismatching objects and libraries, because my VBA macro transforms the xml with that stylesheet. Anyone have any ideas? Suggestions to make this thing run?
See Question&Answers more detail:
os