Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


ruby - How to combine two XML files with Nokogiri

I am trying to combine two separate, but related, XML files with Nokogiri. I want to combine each "Product" with its "ProductPricing" when their "ItemNumber" values are the same.

I loaded the documents, but I have no idea how to combine the two.

Product File:

<Products>
  <Product>
    <Name>36-In. Homeowner Bent Single-Bit Axe Handle</Name>
    <ProductTypeId>0</ProductTypeId>
    <Description>This single bit curved grip axe handle is made for 3 to 5 pound axes. A good quality replacement handle made of American hickory with a natural wax finish. Hardwood handles do not conduct electricity and American Hickory is known for its strength, elasticity and ability to absorb shock. These handles provide exceptional value and economy for homeowners and other occasional use applications. Each Link handle comes with the required wedges, rivets, or epoxy needed for proper application of the tool head.</Description>
    <ActiveFlag>Y</ActiveFlag>
    <ImageFile>100024.jpg</ImageFile>
    <ItemNumber>100024</ItemNumber>
    <ProductVariants>
      <ProductVariant>
        <Sku>100024</Sku>
        <ColorName></ColorName>
        <SizeName></SizeName>
        <SequenceNo>0</SequenceNo>
        <BackOrderableFlag>N</BackOrderableFlag>
        <InventoryLevel>0</InventoryLevel>
        <ColorCode></ColorCode>
        <SizeCode></SizeCode>
        <TaxableFlag>Y</TaxableFlag>
        <VariantPromoGroupCode></VariantPromoGroupCode>
        <PricingGroupCode></PricingGroupCode>
        <StartDate xsi:nil="true"></StartDate>
        <EndDate xsi:nil="true"></EndDate>
        <ActiveFlag>Y</ActiveFlag>
      </ProductVariant>
    </ProductVariants>
  </Product>
</Products>

Product Pricing Fields:

<ProductPricing>
  <ItemNumber>100024</ItemNumber>
  <AcquisitionCost>8.52</AcquisitionCost>
  <MemberCost>10.7</MemberCost>
  <Price>14.99</Price>
  <SalePrice xsi:nil="true"></SalePrice>
  <SaleCode>0</SaleCode>
</ProductPricing>

I am looking to generate a file like this:

<Products>
  <Product>
    <Name>36-In. Homeowner Bent Single-Bit Axe Handle</Name>
    <ProductTypeId>0</ProductTypeId>
    <Description>This single bit curved grip axe handle is made for 3 to 5 pound axes. A good quality replacement handle made of American hickory with a natural wax finish. Hardwood handles do not conduct electricity and American Hickory is known for its strength, elasticity and ability to absorb shock. These handles provide exceptional value and economy for homeowners and other occasional use applications. Each Link handle comes with the required wedges, rivets, or epoxy needed for proper application of the tool head.</Description>
    <ActiveFlag>Y</ActiveFlag>
    <ImageFile>100024.jpg</ImageFile>
    <ItemNumber>100024</ItemNumber>
    <ProductVariants>
      <ProductVariant>
        <Sku>100024</Sku>
        <ColorName></ColorName>
        <SizeName></SizeName>
        <SequenceNo>0</SequenceNo>
        <BackOrderableFlag>N</BackOrderableFlag>
        <InventoryLevel>0</InventoryLevel>
        <ColorCode></ColorCode>
        <SizeCode></SizeCode>
        <TaxableFlag>Y</TaxableFlag>
        <VariantPromoGroupCode></VariantPromoGroupCode>
        <PricingGroupCode></PricingGroupCode>
        <StartDate xsi:nil="true"></StartDate>
        <EndDate xsi:nil="true"></EndDate>
        <ActiveFlag>Y</ActiveFlag>
      </ProductVariant>
    </ProductVariants>
  </Product>
  <ProductPricing>
    <ItemNumber>100024</ItemNumber>
    <AcquisitionCost>8.52</AcquisitionCost>
    <MemberCost>10.7</MemberCost>
    <Price>14.99</Price>
    <SalePrice xsi:nil="true"></SalePrice>
    <SaleCode>0</SaleCode>
  </ProductPricing>
</Products>

Here is the code I have so far:

require 'csv'
require 'nokogiri'

xml = File.read('lateApril-product-pricing.xml')
xml2 = File.read('lateApril-master-date')

doc = Nokogiri::XML(xml)
doc2 = Nokogiri::XML(xml2)

pricing_data = []
item_number = []

doc.xpath('//ProductsPricing/ProductPricing').each do |file|

  itemNumber = file.xpath('./ItemNumber').first.text
  variant_Price = file.xpath('./Price').first.text

  pricing_data << [ itemNumber, variant_Price ]

  item_number << [ itemNumber ]
end 

puts item_number # This prints all the item numbers, but I have no idea how to loop through them and combine them with the Product XML

doc2.xpath('//Products/Product').each do |file|
  itemNumber = file.xpath('./ItemNumber').first.text # Not sure how to write the condition here, since the pricing fields are not available inside this loop
end 


1 Reply


Try this on:

require 'nokogiri'

doc1 = Nokogiri::XML(<<EOT)
<Products>
  <Product>
    <Name>36-In. Homeowner Bent Single-Bit Axe Handle</Name>
  </Product>
</Products>
EOT

doc2 = Nokogiri::XML(<<EOT)
<ProductPricing>
  <ItemNumber>100024</ItemNumber>
</ProductPricing>
EOT

doc1.at('Product').add_next_sibling(doc2.at('ProductPricing'))

Which results in:

puts doc1.to_xml

# >> <?xml version="1.0"?>
# >> <Products>
# >>   <Product>
# >>     <Name>36-In. Homeowner Bent Single-Bit Axe Handle</Name>
# >>   </Product><ProductPricing>
# >>   <ItemNumber>100024</ItemNumber>
# >> </ProductPricing>
# >> </Products>

Please, when you ask, strip the example input and the expected output down to the bare minimum. Anything beyond that wastes space, eye-time and brain CPU.

This is untested code, but it is where I'd start if I were going to merge two files containing multiple <ItemNumber> nodes:

require 'nokogiri'

doc1 = Nokogiri::XML(<<EOT)
<Products>
  <Product>
    <Name>36-In. Homeowner Bent Single-Bit Axe Handle</Name>
    <ItemNumber>100024</ItemNumber>
  </Product>
</Products>
EOT

doc2 = Nokogiri::XML(<<EOT)
<ProductPricing>
  <ItemNumber>100024</ItemNumber>
</ProductPricing>
EOT

# build a hash mapping each item number in doc1 to its <Product> node
doc1_products_by_item_numbers = doc1.search('Product').map { |product|
  # elements expose their content via #text; #value is only defined on attributes
  item_number = product.at('ItemNumber').text
  [
    item_number,
    product
  ]
}.to_h

# build a hash mapping each item number in doc2 to its <ProductPricing> node
doc2_products_by_item_numbers = doc2.search('ProductPricing').map { |pricing|
  item_number = pricing.at('ItemNumber').text
  [
    item_number,
    pricing
  ]
}.to_h

# append doc2 entries to doc1 after each product based on item numbers,
# skipping products that have no matching pricing entry
doc1_products_by_item_numbers.each { |item_number, product|
  pricing = doc2_products_by_item_numbers[item_number]
  product.add_next_sibling(pricing) if pricing
}
