Convert HTML source code to JSON Object using Python
Last Updated : 03 Mar, 2021

In this post, we will see how to convert HTML source code into a JSON object. JSON is easy to transfer and is supported by most modern programming languages. JavaScript, for example, can read JSON and parse it into a JavaScript object easily, and JavaScript can in turn be used to build the HTML for your web pages.

We will use the xmltojson module in this post. The parse function of this module takes the HTML as input and returns the parsed JSON string. Note that, as the module name suggests, it parses XML, so the HTML should be well-formed (every tag closed) for the conversion to succeed.

Syntax: xmltojson.parse(xml_input, xml_attribs=True, item_depth=0, item_callback)

Parameters:
xml_input: can be either a file or a string.
xml_attribs: attributes are included if set to True and ignored if set to False.
item_depth: the depth of children at which the item_callback function is called.
item_callback: a callback function.

Environment Setup:
Install the required modules:
    pip install xmltojson
    pip install requests

Steps:

Import the libraries.
Python3
    import xmltojson
    import json
    import requests

Fetch the HTML code and save it into a file.
Python3
    # Sample URL to fetch the html page
    url = "https://geeksforgeeks-example.surge.sh"

    # Headers to mimic the browser
    headers = {
        'User-Agent': (
            'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 '
            '(KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'
        )
    }

    # Get the page through get() method
    html_response = requests.get(url=url, headers=headers)

    # Save the page content as sample.html
    with open("sample.html", "w") as html_file:
        html_file.write(html_response.text)

Use the parse function to convert this HTML into JSON. Open the HTML file and pass its contents to the parse function of the xmltojson module.
Python3
    # Read the saved HTML back from disk
    with open("sample.html", "r") as html_file:
        html = html_file.read()

    # Convert the HTML to a JSON string
    json_ = xmltojson.parse(html)

The json_ variable contains a JSON string that we can print or write to a file. Because it is already a string, decode it with json.loads() before passing it to json.dump(); dumping the string directly would write a quoted, escaped string rather than a JSON object.
Python3
    with open("data.json", "w") as file:
        json.dump(json.loads(json_), file)

Print the output.
Python3
    print(json_)

Complete Code:
Python3
    import xmltojson
    import json
    import requests

    # Sample URL to fetch the html page
    url = "https://geeksforgeeks-example.surge.sh"

    # Headers to mimic the browser
    headers = {
        'User-Agent': (
            'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 '
            '(KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'
        )
    }

    # Get the page through get() method
    html_response = requests.get(url=url, headers=headers)

    # Save the page content as sample.html
    with open("sample.html", "w") as html_file:
        html_file.write(html_response.text)

    # Read the saved HTML back and convert it to a JSON string
    with open("sample.html", "r") as html_file:
        html = html_file.read()
    json_ = xmltojson.parse(html)

    # Decode the JSON string first so the file holds a JSON object
    with open("data.json", "w") as file:
        json.dump(json.loads(json_), file)

    print(json_)

Output:
{"html": {"@lang": "en", "head": {"title": "Document"}, "body": {"div": {"h1": "Geeks For Geeks", "p": "Welcome to the world of programming geeks!", "input": [{"@type": "text", "@placeholder": "Enter your name"}, {"@type": "button", "@value": "submit"}]}}}}

Author: mukulbindal170299
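Once the JSON string is available, it can be loaded back into Python objects and navigated like any nested dictionary. The following is a minimal sketch that uses only the standard json module; the string is copied from the Output shown earlier, and the variable names (data, div, heading, lang, input_types) are illustrative choices rather than part of the original code. Keys that start with "@" hold the HTML attributes, and repeated tags such as input appear as a list.

```python
import json

# JSON string copied from the Output shown in the article
json_ = (
    '{"html": {"@lang": "en", "head": {"title": "Document"}, '
    '"body": {"div": {"h1": "Geeks For Geeks", '
    '"p": "Welcome to the world of programming geeks!", '
    '"input": [{"@type": "text", "@placeholder": "Enter your name"}, '
    '{"@type": "button", "@value": "submit"}]}}}}'
)

# Decode the JSON string into nested dicts and lists
data = json.loads(json_)

# Walk down to the <div> inside <body>
div = data["html"]["body"]["div"]

heading = div["h1"]            # text content of <h1>
lang = data["html"]["@lang"]   # attributes are stored under "@name" keys
input_types = [item["@type"] for item in div["input"]]  # repeated tags form a list

print(heading)
print(lang)
print(input_types)
```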
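To see where the "@" keys and the list of input entries in the output come from, the conversion can be approximated with the standard library alone. The sketch below is not how xmltojson is implemented; it is a simplified stand-in built on xml.etree.ElementTree. The helper name element_to_dict, the "#text" key for mixed content, and the sample_html markup (a guess at what the sample page might look like) are all assumptions made for illustration.

```python
import json
import xml.etree.ElementTree as ET


def element_to_dict(element):
    """Hypothetical helper: map an element to a dict, a string, or None."""
    # Attributes become keys prefixed with "@"
    node = {f"@{name}": value for name, value in element.attrib.items()}

    for child in element:
        value = element_to_dict(child)
        if child.tag in node:
            # A repeated tag is collected into a list
            if not isinstance(node[child.tag], list):
                node[child.tag] = [node[child.tag]]
            node[child.tag].append(value)
        else:
            node[child.tag] = value

    text = (element.text or "").strip()
    if not node:
        # No attributes and no children: the element is just its text
        return text or None
    if text:
        # Text alongside attributes or children (assumed key name)
        node["#text"] = text
    return node


# Assumed, well-formed markup resembling the sample page
sample_html = """<html lang="en">
  <head><title>Document</title></head>
  <body>
    <div>
      <h1>Geeks For Geeks</h1>
      <p>Welcome to the world of programming geeks!</p>
      <input type="text" placeholder="Enter your name" />
      <input type="button" value="submit" />
    </div>
  </body>
</html>"""

root = ET.fromstring(sample_html)
result = {root.tag: element_to_dict(root)}
sketch_json = json.dumps(result)
print(sketch_json)
```

Because ElementTree is an XML parser, this sketch also fails on markup that is not well-formed, such as an unclosed input tag, which is why the assumed sample closes every element.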