Python is used extensively for web programming. When we browse a website, we use its web address, also known as a URL (Uniform Resource Locator). Python has built-in modules that can handle calls to a URL and parse the result that comes back from visiting it. In this article we look at one such package, urllib, and at the functions in it that help fetch and process the result from a URL.
Installing urllib
urllib is part of the Python 3 standard library, so it does not need to be installed separately. There is nothing to install with pip: the package is available as soon as Python itself is installed and can be imported directly.
Opening a URL
The urllib.request.urlopen function visits a URL and fetches its content into the Python environment. Calling read() on the returned response gives the content as bytes.
Example
import urllib.request

# Open the URL and print the raw response body (bytes)
address = urllib.request.urlopen('https://www.tutorialspoint.com/')
print(address.read())
Output
Running the above code gives us the following result −
b'<!DOCTYPE html>\r\n<!--[if IE 8]><html class="ie ie8"> <![endif]-->\r\n<!--[if IE 9]><html class…….. …………… ………………. new Date());\r\ngtag(\'config\', \'UA-232293-6\');\r\n</script>\r\n</body>\r\n</html>\r\n'
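The bytes returned by read() usually need to be decoded before they can be used as text. The following is a minimal sketch rather than a definitive implementation: fetch_text is a hypothetical helper name, and the 10-second timeout and the UTF-8 fallback are assumptions. The demo uses a data: URL, which urlopen can handle without a network connection, so the snippet runs offline.

```python
import urllib.error
import urllib.request


def fetch_text(url, timeout=10):
    """Fetch a URL and return its body decoded to str (hypothetical helper)."""
    # The with-block closes the response even if read() raises.
    with urllib.request.urlopen(url, timeout=timeout) as response:
        raw = response.read()
        # Use the charset declared in the Content-Type header if present;
        # falling back to UTF-8 is an assumption.
        charset = response.headers.get_content_charset() or "utf-8"
    return raw.decode(charset, errors="replace")


# Offline demo: a data: URL carries its own content.
demo_text = fetch_text("data:text/plain;charset=utf-8,hello%20urllib")
print(demo_text)

# For a real site, network failures surface as urllib.error.URLError:
# try:
#     html = fetch_text("https://www.tutorialspoint.com/")
# except urllib.error.URLError as err:
#     print("request failed:", err.reason)
```

Opening the response in a with-block and passing a timeout are common precautions, so that a slow or unreachable server does not leave the program hanging or the connection open.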
urllib.parse
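The urllib.parse module builds and breaks apart URLs. Its urlencode function turns a dictionary of values into a query string, which we can pass to the search option of a site. Because the encoded data is given as the second argument to Request, it is sent as the body of a POST request and must be bytes. We then open the request, check the response object, and print the entire response body.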
Example
import urllib.request
import urllib.parse

url = 'https://tutorialspoint.com'
values = {'q': 'python'}
data = urllib.parse.urlencode(values)
data = data.encode('utf-8')  # data should be bytes
print(data)
req = urllib.request.Request(url, data)
resp = urllib.request.urlopen(req)
print(resp)
respData = resp.read()
print(respData)
Output
Running the above code gives us the following result −
b'q=python'
<http.client.HTTPResponse object at 0x00000195BF706850>
b'<!DOCTYPE html>\r\n<!--[if IE 8]><html class="ie ie8"> <![endif]………… ………………… \r\n</script>\r\n</body>\r\n</html>\r\n'
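Whether an encoded query string travels in the URL or in the request body depends on how the Request is built. The sketch below illustrates the difference without sending anything over the network. The endpoint https://example.com/search and the page parameter are placeholders chosen for illustration, not a real search API.

```python
import urllib.parse
import urllib.request

# Placeholder endpoint for illustration only (an assumption, not a real search API).
base_url = "https://example.com/search"
params = {"q": "python urllib", "page": 2}

# urlencode percent-encodes values and joins them with '&'; spaces become '+'.
query = urllib.parse.urlencode(params)
print(query)  # q=python+urllib&page=2

# GET: the query string is appended to the URL and no body is sent.
get_req = urllib.request.Request(base_url + "?" + query)
print(get_req.get_method(), get_req.full_url)

# POST: the same string, encoded to bytes, becomes the request body.
post_req = urllib.request.Request(base_url, data=query.encode("utf-8"))
print(post_req.get_method(), post_req.full_url)
```

Neither request is opened here; passing either object to urllib.request.urlopen would actually send it.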
urllib.parse.urlsplit
urlsplit takes a URL and splits it into parts that can be used for further processing. For example, to judge programmatically whether a URL uses HTTPS (that is, whether the connection is secured with SSL/TLS), we apply urlsplit and inspect the scheme value. In the example below we examine the different parts of the supplied URL.
Example
import urllib.parse

url = 'https://tutorialspoint.com/python'
value = urllib.parse.urlsplit(url)
print(value)
Output
Running the above code gives us the following result −
SplitResult(scheme='https', netloc='tutorialspoint.com', path='/python', query='', fragment='')
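The fields of a SplitResult can also be read individually by name. The sketch below shows the scheme check described above together with two related functions from urllib.parse, parse_qs and urlunsplit. The helper name is_https and the query string and fragment added to the URL are assumptions made for illustration.

```python
import urllib.parse


def is_https(url):
    """Return True when the URL's scheme is https (hypothetical helper)."""
    return urllib.parse.urlsplit(url).scheme == "https"


# Hypothetical URL with a query string and fragment added for illustration.
full_url = "https://tutorialspoint.com/python?q=urllib#intro"
parts = urllib.parse.urlsplit(full_url)

print(parts.scheme)    # https
print(parts.hostname)  # tutorialspoint.com
print(parts.path)      # /python
print(parts.fragment)  # intro

# parse_qs turns the query string into a dict mapping each key to a list of values.
print(urllib.parse.parse_qs(parts.query))  # {'q': ['urllib']}

# urlunsplit reassembles the five parts into a URL string.
print(urllib.parse.urlunsplit(parts) == full_url)  # True

print(is_https(full_url))                            # True
print(is_https("http://tutorialspoint.com/python"))  # False
```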