Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - How to deal with xmlns values while parsing an XML file?

I have the following toy example of an XML file, and thousands more like it. I am having difficulty parsing it.

Look at the second line of the file; every one of my original files contains this text. When I delete i:type="Record" xmlns="http://schemas.datacontract.org/Storage" from that line (keeping the rest of it), the code below returns the accelx and accely values.

How can I parse the file with the original text intact?

<?xml version="1.0" encoding="utf-8"?>
<ArrayOfRecord xmlns:i="http://www.w3.org/2001/XMLSchema-instance" i:type="Record" xmlns="http://schemas.datacontract.org/Storage">
  <AvailableCharts>
    <Accelerometer>true</Accelerometer>
    <Velocity>false</Velocity>
  </AvailableCharts>
  <Trics>
    <Trick>
      <EndOffset>PT2M21.835S</EndOffset>
      <Values>
        <TrickValue>
          <Acceleration>26.505801694441629</Acceleration>
          <Rotation>0.023379150593228679</Rotation>
        </TrickValue>
      </Values>
    </Trick>
  </Trics>
  <Values>
    <SensorValue>
      <accelx>-3.593643144</accelx>
      <accely>7.316485176</accely>
    </SensorValue>
    <SensorValue>
      <accelx>0.31103436</accelx>
      <accely>7.70408184</accely>
    </SensorValue>
  </Values>
</ArrayOfRecord>

Code to parse the data:

import lxml.etree as etree

tree = etree.parse(r"C:\test\del.xml")
root = tree.getroot()

# Finds the elements only after the default namespace is deleted from the root tag.
val_of_interest = root.findall('./Values/SensorValue')

for sensor_val in val_of_interest:
    print(sensor_val.find('accelx').text)
    print(sensor_val.find('accely').text)

I asked a related question here: How to extract data from xml file that is deep down the tag

Thanks


1 Reply

The confusion is caused by the following default namespace (a namespace declared without a prefix):

xmlns="http://schemas.datacontract.org/Storage"

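Note that descendant elements without a prefix implicitly inherit the default namespace from their ancestor. That is why the unprefixed path './Values/SensorValue' matches nothing: it names elements in no namespace, while the elements in your file belong to the default one. To reference an element in a namespace, map a prefix to the namespace URI and use that prefix in your XPath expression: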

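# Map an arbitrary prefix ('d' here) to the default namespace URI.
ns = {'d': 'http://schemas.datacontract.org/Storage'}
val_of_interest = root.findall('./d:Values/d:SensorValue', ns)

for sensor_val in val_of_interest:
    # Lookups on child elements need the prefix and the mapping as well.
    print(sensor_val.find('d:accelx', ns).text)
    print(sensor_val.find('d:accely', ns).text)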
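
If you want to try this without the original files, here is a minimal, self-contained sketch. It makes a few assumptions that are not in the question: it uses the standard library's xml.etree.ElementTree instead of lxml (its find/findall accept the same prefix mapping), it parses a trimmed copy of the sample from an inline bytes literal rather than from disk, and the names SAMPLE, NS_URI and readings are only illustrative. It also shows the equivalent '{uri}tag' spelling, which needs no prefix mapping.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample file from the question (assumption: inline bytes
# instead of a file on disk, so the sketch runs anywhere).
SAMPLE = b"""<?xml version="1.0" encoding="utf-8"?>
<ArrayOfRecord xmlns:i="http://www.w3.org/2001/XMLSchema-instance" i:type="Record" xmlns="http://schemas.datacontract.org/Storage">
  <Values>
    <SensorValue>
      <accelx>-3.593643144</accelx>
      <accely>7.316485176</accely>
    </SensorValue>
    <SensorValue>
      <accelx>0.31103436</accelx>
      <accely>7.70408184</accely>
    </SensorValue>
  </Values>
</ArrayOfRecord>"""

NS_URI = "http://schemas.datacontract.org/Storage"
ns = {"d": NS_URI}  # 'd' is an arbitrary prefix chosen for the lookup

root = ET.fromstring(SAMPLE)

# The parser stores each tag together with its namespace as '{uri}localname'.
root_tag = root.tag

# An unprefixed path looks for elements in no namespace, so it finds nothing.
unqualified = root.findall("./Values/SensorValue")

# Option 1: prefix plus mapping, as in the answer above.
by_prefix = root.findall("./d:Values/d:SensorValue", ns)

# Option 2: spell out the namespace in '{uri}tag' form; no mapping needed.
by_clark = root.findall("./{%s}Values/{%s}SensorValue" % (NS_URI, NS_URI))

# Collect the (accelx, accely) pairs as floats.
readings = [
    (float(s.find("d:accelx", ns).text), float(s.find("d:accely", ns).text))
    for s in by_prefix
]

print(root_tag)
print(len(unqualified), len(by_prefix), len(by_clark))
print(readings)
```

Both spellings select the same two SensorValue elements. The prefix mapping is usually easier to read when a path has several steps.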
