The Python XML parser provides one of the easiest ways to read and extract useful information from an XML file. In this short tutorial we will see how to parse an XML file, and how to create and modify XML documents, using the Python ElementTree XML API.
The ElementTree API is part of the Python standard library, so no extra installation is needed to extract, parse and transform XML data.
So let’s get started with the Python XML parser using ElementTree:
Example 1
Creating XML file
First we are going to create a new XML file with an element and a sub-element.
# Import required library
import xml.etree.ElementTree as xml

def createXML(filename):
    # Start with the root element
    root = xml.Element("users")
    children1 = xml.Element("user")
    root.append(children1)

    tree = xml.ElementTree(root)
    with open(filename, "wb") as fh:
        tree.write(fh)

if __name__ == "__main__":
    createXML("testXML.xml")
Once we run the above program, a new file named “testXML.xml” is created in the current working directory.
Its contents look something like:
<users><user /></users>
Please note that while writing the file we opened it in ‘wb’ mode, i.e. binary write mode. This is needed because tree.write() emits encoded bytes by default.
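As a minimal sketch of an alternative, write() also accepts a file name directly, along with an explicit encoding and an XML declaration. The helper name write_with_declaration and the temporary output directory are assumptions made for illustration; they are not part of the program above.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def write_with_declaration(path):
    # Hypothetical helper: build the same <users><user /></users> tree
    root = ET.Element("users")
    ET.SubElement(root, "user")
    tree = ET.ElementTree(root)
    # Passing a path lets ElementTree open and close the file itself
    tree.write(path, encoding="utf-8", xml_declaration=True)


# Write into a temporary directory so no existing file is overwritten
out_path = os.path.join(tempfile.mkdtemp(), "testXML.xml")
write_with_declaration(out_path)

# Read the raw bytes back to inspect what was written
with open(out_path, "rb") as fh:
    written = fh.read()
print(written)
```

With this form there is no file handle to manage, and the declaration line records the encoding for whoever reads the file later.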
Adding values to XML elements
Let’s give some values to the XML elements in our above program:
# Import required library
import xml.etree.ElementTree as xml

def createXML(filename):
    # Start with the root element
    root = xml.Element("users")
    children1 = xml.Element("user")
    root.append(children1)

    # Add sub-elements to <user> and give them text values
    userId1 = xml.SubElement(children1, "Id")
    userId1.text = "hello"

    userName1 = xml.SubElement(children1, "Name")
    userName1.text = "Rajesh"

    tree = xml.ElementTree(root)
    with open(filename, "wb") as fh:
        tree.write(fh)

if __name__ == "__main__":
    createXML("testXML.xml")
After running the above program, we’ll see that the new elements are added with their values, something like this (indented here for readability):
<users>
    <user>
        <Id>hello</Id>
        <Name>Rajesh</Name>
    </user>
</users>
The above output looks OK.
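ElementTree does not insert line breaks or indentation on its own. As a rough sketch, assuming Python 3.9 or later (where ET.indent() is available), the same tree can be indented in place before it is serialized. The variable name pretty is illustrative only.

```python
import xml.etree.ElementTree as ET

# Rebuild the small <users> tree from the example above
root = ET.Element("users")
user = ET.SubElement(root, "user")
ET.SubElement(user, "Id").text = "hello"
ET.SubElement(user, "Name").text = "Rajesh"

# Insert whitespace so that each nesting level is indented by two spaces
# (assumes Python 3.9+; ET.indent() does not exist in older versions)
ET.indent(root, space="  ")

# Serialize to a str instead of bytes so it prints cleanly
pretty = ET.tostring(root, encoding="unicode")
print(pretty)
```

Note that ET.indent() modifies the tree itself, so anything written afterwards with tree.write() carries the same indentation.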
Now let’s start editing files:
Editing XML data
Let’s read some data from an existing file and modify it.
newdata.xml
<users>
    <user>
        <id>1a</id>
        <name>Rajesh</name>
        <salary>NA</salary>
    </user>
    <user>
        <id>2b</id>
        <name>TutorialsPoint</name>
        <salary>NA</salary>
    </user>
    <user>
        <id>3c</id>
        <name>Others</name>
        <salary>NA</salary>
    </user>
</users>
Above is our current XML file. Let’s try to update the salary of each user:
# Import required library
import xml.etree.ElementTree as ET

def updateET(filename):
    # Parse the existing file and get its root element
    tree = ET.ElementTree(file=filename)
    root = tree.getroot()

    # iter() visits every <salary> element anywhere below the root
    for salary in root.iter('salary'):
        salary.text = '500000'

    # Write the modified tree back to the same file
    with open(filename, "wb") as fh:
        tree.write(fh)

if __name__ == "__main__":
    updateET("newdata.xml")
Output
So we see that the salary of every user has changed from ‘NA’ to ‘500000’.
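Often only one record should change rather than all of them. The following is a minimal sketch of that case, using findall() and findtext() to locate a user by id. The helper name update_salary and the shortened inline sample are assumptions made for illustration; the sample is parsed from a string so that no file on disk is touched.

```python
import xml.etree.ElementTree as ET

# Shortened, illustrative copy of newdata.xml
SAMPLE = """<users>
  <user><id>1a</id><name>Rajesh</name><salary>NA</salary></user>
  <user><id>2b</id><name>TutorialsPoint</name><salary>NA</salary></user>
</users>"""


def update_salary(root, user_id, new_salary):
    # Hypothetical helper: change the salary of a single user only
    for user in root.findall("user"):
        if user.findtext("id") == user_id:
            user.find("salary").text = new_salary
            return True
    # No user with that id was found
    return False


root = ET.fromstring(SAMPLE)
found = update_salary(root, "2b", "500000")
salaries = [u.findtext("salary") for u in root.findall("user")]
print(found, salaries)  # True ['NA', '500000']
```

Returning a boolean lets the caller tell a successful update apart from an id that does not exist in the document.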
Example: Python XML Parser
Now let’s write another program which will parse XML data present in the file and print the data.
# Import required library
# (xml.etree.cElementTree was removed in Python 3.9; use ElementTree)
import xml.etree.ElementTree as ET

def parseXML(file_name):
    # Parse XML with ElementTree
    tree = ET.ElementTree(file=file_name)
    root = tree.getroot()
    print(root)
    print("tag=%s, attrib=%s" % (root.tag, root.attrib))

    # Get the information via the children.
    # getchildren() was removed in Python 3.9; iterating over an
    # element yields its direct children instead.
    print("-" * 25)
    print("Iterating over child elements")
    print("-" * 25)
    for user in root:
        for user_child in user:
            print("%s=%s" % (user_child.tag, user_child.text))

if __name__ == "__main__":
    parseXML("newdata.xml")
Output (the memory address on the first line will differ from run to run)
<Element 'users' at 0x0551A5A0>
tag=users, attrib={}
-------------------------
Iterating over child elements
-------------------------
id=1a
name=Rajesh
salary=500000
id=2b
name=TutorialsPoint
salary=500000
id=3c
name=Others
salary=500000
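Printing is fine for a quick look, but the parsed values are usually needed as data. As a small sketch, each user element can be turned into a dictionary keyed by tag name. The helper name users_to_dicts and the shortened inline sample are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened, illustrative copy of the updated newdata.xml
SAMPLE = """<users>
  <user><id>1a</id><name>Rajesh</name><salary>500000</salary></user>
  <user><id>2b</id><name>TutorialsPoint</name><salary>500000</salary></user>
</users>"""


def users_to_dicts(root):
    # Hypothetical helper: one dict per <user>, mapping child tag -> text
    return [{child.tag: child.text for child in user} for user in root]


records = users_to_dicts(ET.fromstring(SAMPLE))
for record in records:
    print(record)
```

All values come back as strings, so a field such as salary still has to be converted with int() before doing arithmetic on it.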