Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - How to test if an item exists in an XML file with Beautiful Soup?

I have an XML file that looks like this (with many more entries in practice):

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<tns:FlxPtn xmlns:ts2c="http://interop.covea.fr/Covea-Flx-TypesS2C-009" xmlns:tns="http://interop.covea.fr/Covea-App-PolRntVie-024" xmlns:rf2c="http://interop.covea.fr/Covea-Referentiel" xmlns:cov="http://interop.covea.fr/Covea-FlxPtn-002" xmlns:fs2c="http://interop.covea.fr/Covea-Flx-EneFncS2C-007" xsi:schemaLocation="http://interop.covea.fr/Covea-Flx-TypesS2C-009 Covea-Flx-TypesS2C-009.xsd http://interop.covea.fr/Covea-App-PolRntVie-024 S2C_XSD_VIGERIRVES_V10.0_024_MMA_124.xsd http://interop.covea.fr/Covea-App-PolRntVie-024 S2C_XSD_VIGERIRVES_V10.0_024.xsd http://interop.covea.fr/Covea-Referentiel Covea-Referentiel.xsd http://interop.covea.fr/Covea-FlxPtn-002 Covea-FlxPtn-002.xsd http://interop.covea.fr/Covea-Flx-EneFncS2C-007 Covea-Flx-EneFncS2C-007.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <cov:DonEneTch>
    <cov:IdVrsEne>002</cov:IdVrsEne>
    <cov:IdFlx>1V400220191231VIGERIRVESMMA1</cov:IdFlx>
    <cov:TsCraFlx>2020-01-13 10.02.13.000000</cov:TsCraFlx>
    <cov:IdEmtFlx>MMA</cov:IdEmtFlx>
    <cov:IdRctFlx>MMA</cov:IdRctFlx>
    <cov:TyFlx>NOTIF</cov:TyFlx>
    <cov:TyTrtFlx>1</cov:TyTrtFlx>
    <cov:AquFlx>0</cov:AquFlx>
    <cov:EvrExu>PRODUCTION</cov:EvrExu>
    <cov:IdApnEmt>MMA</cov:IdApnEmt>
    <cov:AcnApl>VIGERIRVES</cov:AcnApl>
    <cov:IdVrsFlx>124</cov:IdVrsFlx>
    <cov:IdTrtEmt>Rentes Vie</cov:IdTrtEmt>
    <cov:IdUtl>TTBATCH</cov:IdUtl>
    <cov:VrsCbl></cov:VrsCbl>
    <cov:ChpLbr>1V400220191231VIGERIRVESMMA1377900</cov:ChpLbr>
  </cov:DonEneTch>
  <tns:DonMet>
    <tns:DonEneFnc>
      <fs2c:CodSocJur>1V4002</fs2c:CodSocJur>
      <fs2c:DatArr>20191231</fs2c:DatArr>
      <fs2c:TypFiColl>VIGERIRVES</fs2c:TypFiColl>
      <fs2c:TimStmCreFic>2020-01-13 10.02.13.000000</fs2c:TimStmCreFic>
      <fs2c:CodEns>MMA</fs2c:CodEns>
    </tns:DonEneFnc>

    <tns:PolRntVie>
      <tns:NumEnr>20191290</tns:NumEnr>
      <tns:NumPol>050000111901</tns:NumPol>
    </tns:PolRntVie>
    <tns:PolRntVie>
      <tns:NumEnr>20191291</tns:NumEnr>
      <tns:NumPol>050000112002</tns:NumPol>
      <tns:PMVie>4385.15</tns:PMVie>
    </tns:PolRntVie>

What I would like to extract is the information in the last part: "NumEnr", "NumPol", and "PMVie" (if it exists). I was helped to build the following code:

from bs4 import BeautifulSoup
import csv
libname = "C:/Users/a61787/Documents/"

with open(libname+'Test.xml') as f_input:
    soup = BeautifulSoup(f_input, "lxml")

with open(libname+'b.csv', 'w', newline='', encoding='utf-8') as f_output:
    csv_output = csv.writer(f_output)
    csv_output.writerow(['Numenr', 'NumPol', 'PMVie'])
    
    for tns in soup.find_all("tns:polrntvie"):
        csv_output.writerow(tns.find(entry).text for entry in ['tns:numenr', 'tns:numpol', 'tns:pmvie'])

It works fine when all the items are present, but when some items are missing it raises the following error: AttributeError: 'NoneType' object has no attribute 'text'.

So I'm looking for a way to test whether each desired item is present. If an item is missing, I would like an empty string in the corresponding column of the CSV file.

I'm not familiar with XML files. Any help would be appreciated.

Question from: https://stackoverflow.com/questions/66054180/how-to-test-if-an-item-exists-in-xml-file-with-beautiful-soup


1 Reply

Your problem is that the find call returns None if there are no matching elements. Instead of trying to combine everything in a single line...

csv_output.writerow(tns.find(entry).text for entry in ['tns:numenr', 'tns:numpol', 'tns:pmvie'])

...consider splitting that up like this so that you can explicitly check for None and deal with it appropriately:

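# Look up each field; find() yields None for any element that is missing.
res = (
    tns.find(entry)
    for entry in ['tns:numenr', 'tns:numpol', 'tns:pmvie']
)
# Write the element's text, or an empty string where find() returned None.
csv_output.writerow(x.text if x is not None else '' for x in res)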

(This inserts an empty field ('') for cases in which there was no matching element.)
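
For reference, here is a minimal, self-contained sketch of the same idea that can be run without the original file. It makes several assumptions that are not part of the question: the inline SAMPLE_XML string is a shortened stand-in for the real file, the helper names text_or_empty and extract_rows are hypothetical, the "html.parser" backend is used instead of "lxml" so that only bs4 needs to be installed, and the CSV is written to an in-memory buffer rather than to disk.

```python
import csv
import io

from bs4 import BeautifulSoup

# Shortened, hypothetical stand-in for the question's file: the first record
# has no <tns:PMVie> element, the second one does.
SAMPLE_XML = """\
<tns:DonMet>
  <tns:PolRntVie>
    <tns:NumEnr>20191290</tns:NumEnr>
    <tns:NumPol>050000111901</tns:NumPol>
  </tns:PolRntVie>
  <tns:PolRntVie>
    <tns:NumEnr>20191291</tns:NumEnr>
    <tns:NumPol>050000112002</tns:NumPol>
    <tns:PMVie>4385.15</tns:PMVie>
  </tns:PolRntVie>
</tns:DonMet>
"""

# Tag names are lowercase because the HTML-style parsers lowercase them.
FIELDS = ["tns:numenr", "tns:numpol", "tns:pmvie"]


def text_or_empty(parent, name):
    """Return the text of the first matching element, or '' if none exists."""
    element = parent.find(name)  # find() returns None when nothing matches
    return element.text if element is not None else ""


def extract_rows(xml_text):
    """Build one row per <tns:PolRntVie> record, padding missing fields."""
    # Assumption: "html.parser" stands in for the question's "lxml" parser.
    soup = BeautifulSoup(xml_text, "html.parser")
    return [
        [text_or_empty(record, name) for name in FIELDS]
        for record in soup.find_all("tns:polrntvie")
    ]


rows = extract_rows(SAMPLE_XML)

# Write to an in-memory buffer here; a real script would open a file instead.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["NumEnr", "NumPol", "PMVie"])
writer.writerows(rows)
print(buffer.getvalue())
```

Moving the None check into a small helper keeps the writerow call short and makes it easy to substitute a different default value later.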
