There's probably a better way to do what you actually want to do. In particular, there's a good chance your real XML has a single `<locations>` tag that all the `<location>` tags go underneath, so there's no reason to search for the last `<location>` tag at all…
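If that guess is right, the merge could be as simple as the sketch below. Everything about the layout here is an assumption: the `<locations>` container, the made-up sample documents, and the `append_to_locations` helper name are only for illustration, since I haven't seen your real files.

```python
import xml.etree.ElementTree as ET

def append_to_locations(master_root, child_root):
    """Append child_root under the single <locations> element (assumed layout)."""
    if master_root.tag == 'locations':
        container = master_root
    else:
        container = master_root.find('.//locations')
    if container is None:
        raise ValueError('no <locations> element in the master document')
    container.append(child_root)

# Made-up sample documents standing in for the real files.
master_root = ET.fromstring(
    '<data><locations><location id="1"/></locations></data>')
child_root = ET.fromstring('<location id="2"/>')

append_to_locations(master_root, child_root)
print([loc.get('id') for loc in master_root.iter('location')])
```

With a known container there is nothing to search for: you look it up once and append to it.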
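But here's how you'd do what you asked:

    import os
    import xml.etree.ElementTree as ET

    os.chdir('c:/Users/ME/Documents/XML_Parasing_Python/')
    origname = 'vs_original_M.xml'
    master = ET.parse(origname)
    for path in os.listdir('.'):
        # Skip the master file by name; merge every other .xml file.
        if path != origname and os.path.splitext(path)[-1] == '.xml':
            child = ET.parse(path)
            root = child.getroot()
            # Find the parent of the last node with the same tag as this root.
            last_location_parent = master.find('.//*[{}][last()]'.format(root.tag))
            last_location_parent.append(root)
    # Note: on a rerun, master.xml in this directory would be merged as well.
    master.write('master.xml')
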
Most of this is pretty simple. You have to find the parent of the last location node, then you can `append` another node to it.
The only tricky bit there is the XPath expression in the `find`, so let me break it down for you (but you will have to read the docs to really understand it!):
`.//` means "descendants of the current node". (Technically you should be able to just use `//` for "descendants of the root", but there are bugs in earlier versions of etree, so it's safer this way.)
`*` means "with any tag name".
`[location]` means "with a child location tag". (Of course I'm filling in the child's root tag using the `format` method. If you know that all of your children have `location` as the root, you can hardcode the tag name, and move the `find` out of the loop as well.)
`[last()]` means "the last one".
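So, putting it all together: this is the last descendant of the root, with any tag name, that has a child `location` tag. (Strictly speaking, `last()` is evaluated among siblings and `find` returns the first match in document order, so this reading assumes all the `location` tags sit under one parent.)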
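To see the expression at work, here is a small sketch against a made-up in-memory document. The element names (`root`, `header`, `group`) and the `name` attributes are assumptions for illustration, not your real data. It also shows one case worth checking for: if the `location` tags are direct children of the root, `.//*` never considers the root itself, so `find` returns `None`.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: two <location> tags under one <group> parent.
doc = ET.fromstring(
    '<root>'
    '<header/>'
    '<group><location name="a"/><location name="b"/></group>'
    '</root>')

parent = doc.find('.//*[location][last()]')
print(parent.tag)

# Appending to that parent puts the new node after the existing ones.
parent.append(ET.Element('location', name='c'))
print([loc.get('name') for loc in parent])

# Flat layout: the root is the only element with a <location> child,
# and the root is not one of its own descendants, so nothing matches.
flat = ET.fromstring('<root><location name="a"/></root>')
print(flat.find('.//*[location][last()]'))
```

If your files might have the flat layout, check the result of `find` for `None` before calling `append` on it.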
If you don't understand XPath, you can always iterate things manually to get the same effect, but it's going to be longer, and easier to introduce subtle bugs, so it's really worth learning XPath.
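For comparison, a manual version might look roughly like this. The helper name `last_parent_of` and the sample document are made up, and it is not an exact translation of the XPath: it walks the whole tree in document order, includes the root itself, and keeps the last element that has a direct child with the given tag.

```python
import xml.etree.ElementTree as ET

def last_parent_of(root, tag):
    """Return the last element (in document order) with a direct child `tag`."""
    found = None
    for elem in root.iter():          # iter() yields root first, then descendants
        if elem.find(tag) is not None:
            found = elem              # keep overwriting; the last hit wins
    return found

# Hypothetical sample document.
doc = ET.fromstring(
    '<root>'
    '<group><location name="a"/><location name="b"/></group>'
    '</root>')

print(last_parent_of(doc, 'location').tag)
print(last_parent_of(doc, 'missing'))
```

It's only a few lines, but notice how many small decisions (include the root or not, first hit or last hit) are hiding in there. That is where the subtle bugs come from.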
I changed a bunch of other things in your program. Let me explain:
There's no reason to do `if foo: return True` `else: return False`; you can just do `return foo`. But that means your whole function is just `return HART_filename.endswith('.xml')`, so you don't even really need a function. And it's better to use path functions like `os.path.splitext` than string functions on paths.
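Side by side, the two forms look something like this. The function names `is_xml_verbose` and `is_xml` and the sample file names (other than `vs_original_M.xml`) are just placeholders for the comparison.

```python
import os.path

def is_xml_verbose(filename):
    # The long-winded form: the condition is already a bool.
    if filename.endswith('.xml'):
        return True
    else:
        return False

def is_xml(filename):
    # Same idea, returned directly, using a path function for the extension.
    return os.path.splitext(filename)[-1] == '.xml'

for name in ['vs_original_M.xml', 'notes.txt', 'archive.xml.bak']:
    print(name, is_xml_verbose(name), is_xml(name))
```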
If you do `for number in range(1, xml_list_length)`, you don't need `number = 1` at the start and `number += 1` in the loop; the `for` statement already does that for you.
But you don't want to start at 1 anyway; Python lists are indexed starting at 0. If you're using that to skip over `vs_original_M.xml`, that only works if you get lucky; the order in which `listdir` returns things is unspecified and arbitrary. The only way to skip a file with a certain name is to check its name.
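Here's a quick sketch of why skipping by position is fragile. The two listings are made up to stand in for what `listdir` might return on different runs, and `files_to_merge` is a hypothetical helper name.

```python
import os.path

def files_to_merge(names, master_name):
    """Keep the .xml files, skipping the master file by name, not by position."""
    return [name for name in names
            if name != master_name and os.path.splitext(name)[-1] == '.xml']

# Same directory contents, two possible orderings.
listing_a = ['vs_original_M.xml', 'a.xml', 'b.xml']
listing_b = ['b.xml', 'vs_original_M.xml', 'a.xml']

print(files_to_merge(listing_a, 'vs_original_M.xml'))
print(files_to_merge(listing_b, 'vs_original_M.xml'))

# Skipping "the first entry" only works for listing_a.
print(listing_b[1:])
```

The name check gives the right files for both orderings; the slice drops `b.xml` and keeps the master file when the order changes.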
You almost never want to loop over `range(len(foo))`. If you just need the elements of `foo`, just do `for element in foo`. If you need the index for each element as well, do `for index, element in enumerate(foo)`.
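A small sketch of both loop forms, using a made-up list of file names:

```python
paths = ['a.xml', 'b.xml', 'c.xml']  # hypothetical file names

# Only the elements are needed: loop over the list directly.
for path in paths:
    print(path)

# Index and element together: enumerate counts from 0 by default.
for index, path in enumerate(paths):
    print(index, path)

# If you really want to count from 1, say so explicitly.
for number, path in enumerate(paths, start=1):
    print(number, path)
```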
Finally, you should almost never check `if foo == True`. In Python, many things are "truthy" besides just `True` (the number `74`, the string `"hello"`, etc.), and you can just use `if foo` to check whether `foo` is truthy. Only use `== True` if you explicitly want to make sure it fails for other truthy values; if you just want to check the result of a boolean function like `is_xml` or `endswith` or the `==` operator, just check it directly.
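To make the difference concrete, here is a short sketch that runs a handful of arbitrary sample values through both checks:

```python
values = [True, 74, 'hello', [], 0, '']  # arbitrary sample values

# What `if value:` accepts.
truthy = [value for value in values if value]

# What `if value == True:` accepts.
equal_to_true = [value for value in values if value == True]

print(truthy)
print(equal_to_true)
```

Of these samples, `74` and `'hello'` pass the plain truthiness check but fail the comparison with `True`.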