How to get specific nodes in an XML file in Python?

Using the xml.etree.ElementTree module from Python's standard xml library, you can get any node you want from an XML file. To extract a given node, you need to know how to write an XPath expression that selects it. Note that ElementTree supports only a limited subset of XPath. You can learn more about XPath here: https://www.w3schools.com/xml/xml_xpath.asp.

Example

Assume you have an XML file with the following structure:

<bookstore>
    <book category="cooking">
        <title lang="en">Everyday Italian</title>
        <author>Giada De Laurentiis</author>
        <year>2005</year>
        <price>30.00</price>
    </book>
    <book category="children">
        <title lang="en">Harry Potter</title>
        <author>J K. Rowling</author>
        <year>2005</year>
        <price>29.99</price>
    </book>
</bookstore>

To extract all title nodes whose lang attribute is en, you can use the following code:

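import xml.etree.ElementTree as ET

# Parse the file and get the root <bookstore> element
tree = ET.parse("my_file.xml")
root = tree.getroot()

# Use a relative path (".//"); ElementTree rejects absolute paths such as "//title"
for node in root.findall(".//title[@lang='en']"):
    # <title> has no child elements, so print its own text
    print(node.text)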
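
To try this kind of query without creating a file, here is a minimal, self-contained sketch. It assumes the sample document is held in a string (the XML variable is an illustration, not part of the original example) and parses it with ET.fromstring. The two extra queries, selecting titles by the book's category attribute and reading each book's price, are added only to show other expressions in the XPath subset that ElementTree supports.

```python
import xml.etree.ElementTree as ET

# Inline copy of the sample document so the sketch runs without my_file.xml
# (XML is a hypothetical variable name used only for this illustration)
XML = """<bookstore>
    <book category="cooking">
        <title lang="en">Everyday Italian</title>
        <author>Giada De Laurentiis</author>
        <year>2005</year>
        <price>30.00</price>
    </book>
    <book category="children">
        <title lang="en">Harry Potter</title>
        <author>J K. Rowling</author>
        <year>2005</year>
        <price>29.99</price>
    </book>
</bookstore>"""

# fromstring returns the root element (<bookstore>) directly
root = ET.fromstring(XML)

# All <title> elements anywhere below the root with lang="en"
english_titles = [t.text for t in root.findall(".//title[@lang='en']")]

# Titles of books whose category attribute is "children"
children_titles = [t.text for t in root.findall("./book[@category='children']/title")]

# Map each book's title to its price, converting the text to a float
prices = {b.findtext("title"): float(b.findtext("price")) for b in root.iter("book")}

print(english_titles)
print(children_titles)
print(prices)
```

If you need XPath features outside this subset, such as functions or axes, a third-party library with fuller XPath support would be required; that is beyond what the standard library module offers.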