11.Python Selenium Guide - Managing Cookies _ ScrapeOps
In this comprehensive guide, we will delve into the world of Selenium and explore practical strategies for
handling cookies. We'll be covering:
Introduction
Understanding Cookies
Selenium and Cookies
Cookie Handling in Production
Conclusion
More Cool Articles
https://scrapeops.io/selenium-web-scraping-playbook/python-selenium-managing-cookies/ 1/17
3/31/24, 9:32 PM Python Selenium Guide - Managing Cookies | ScrapeOps
The code below shows how all these methods can be used:
#add a cookie
driver.add_cookie({"name": "our new cookie!", "value": "some random text!"})
#add another cookie
driver.add_cookie({"name": "another_cookie", "value": "we added a second cookie!"})
Understanding Cookies
Let's delve into the fundamentals of cookies to unravel their purpose, functionality, and the implications
they hold for both users and developers.
In the section above, we learned that all cookies have a name and a value . There are a number of
other fields that make up a cookie. Let's take a look at some of the information that makes up a cookie:
domain is the website you were at when the cookie was created
name is the name of a cookie
value is the value that the cookie is storing
httpOnly is a boolean. If httpOnly is set to true, client-side scripts cannot access this data
path is the portion of the domain that can access the cookie. If the path is / , this means that it is
on the root path (this cookie can be accessed from the entire domain)
sameSite is also used to restrict access to a cookie. sameSite can be set to None , Lax , or
Strict
secure is another boolean. If secure is set to True , this cookie can only be sent to the server
over an encrypted HTTPS connection
The cookies we created earlier used the attributes listed above. While cookies can contain more
information, there are a couple more attributes we need to pay attention to:
session is another boolean that tells the site whether or not our cookie is used to store a session
Types of Cookies
Cookies serve various purposes and can be categorized based on their functions, lifespan, and origin.
Here are some common types of cookies:
Session Cookies are stored and used to manage a user session. They are removed when you close the
browser. These cookies are used specifically to store information about your browsing session, such as
the pages you visited.
Persistent Cookies are stored on a device even after the browser is closed. These cookies typically store
things like login information and user preferences on a site.
As we did earlier, we can retrieve cookies using Selenium. You can also view cookies stored from right
inside your browser.
While not necessarily relevant to scraping, this is important for anyone who cares about their data when
browsing the web.
Once you've selected to manage your cookies, you get a pop-up with a list of sites that can access
your data
There are many reasons to manage our cookies in Selenium. You can use them for managing
credentials and for managing how your automated browser is tracked. Any type of stored browser
information typically gets stored in cookies.
With proper cookie management, we can stay logged in to a site, or we can clear all information from a
site and log out much more quickly and effectively. We can also remove cookies to make it more
difficult to track our browser.
While we use Selenium primarily for scraping, there are many developers who use it to automate testing
on their sites.
When using Selenium for automated site tests, the ability to quickly manage cookies makes the login
and logout process much more efficient (just like it does for us when scraping). Imagine writing unit
tests for a social media site and having to hard code user authentication every time you need to do
something different!
Cookie management is a critical skill when dealing with the following scenarios:
Retrieving cookies
The process of retrieving cookies involves obtaining information about the cookies currently stored for a
specific domain in the browser. This can include fetching all cookies or getting details about a particular
cookie by its name.
#open chrome
driver = webdriver.Chrome()
Open Chrome and navigate to the page with webdriver.Chrome() and driver.get()
Create a variable, cookies , using driver.get_cookies()
Print cookies to the terminal so we can see them
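The steps above can be sketched as follows. This is a minimal sketch rather than the article's original listing: the URL is borrowed from the quotes.toscrape.com site mentioned later in this guide, and the helper name cookie_names is hypothetical.

```python
def cookie_names(cookies):
    """Return just the names from a list of Selenium cookie dictionaries."""
    return [cookie["name"] for cookie in cookies]


def main():
    # Imported here so cookie_names() can be reused without launching a browser
    from selenium import webdriver

    # open chrome
    driver = webdriver.Chrome()
    try:
        # navigate to the page (this URL is an assumption for illustration)
        driver.get("https://quotes.toscrape.com")
        # get_cookies() returns a list of dictionaries, one per cookie
        cookies = driver.get_cookies()
        # print the cookies to the terminal so we can see them
        print(cookies)
        print(cookie_names(cookies))
    finally:
        # always shut the browser down, even if something above fails
        driver.quit()
```

Call main() to run it against a real Chrome install. To fetch a single cookie by name, Selenium also provides driver.get_cookie(name), which returns the cookie dictionary or None if no such cookie exists.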
Adding cookies
Adding cookies is the action of setting new cookies in the browser for a specific domain.
#open chrome
driver = webdriver.Chrome()
Open a browser and navigate to the url with webdriver.Chrome() and driver.get()
Add a cookie with the name of new_cookie and the value of added_cookie
Get our list of cookies with driver.get_cookies()
Print the list so we can see the new cookie we just added
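A minimal sketch of these steps, assuming the quotes.toscrape.com URL used elsewhere in this guide. The helper name add_cookie_and_list is hypothetical.

```python
def add_cookie_and_list(driver, name, value):
    """Add a cookie to the current page's domain and return the updated list."""
    driver.add_cookie({"name": name, "value": value})
    return driver.get_cookies()


def main():
    from selenium import webdriver  # imported lazily; needs Chrome installed

    # open chrome
    driver = webdriver.Chrome()
    try:
        # add_cookie() only works once a page on the target domain is loaded
        driver.get("https://quotes.toscrape.com")  # URL is an assumption
        cookies = add_cookie_and_list(driver, "new_cookie", "added_cookie")
        # print the list so we can see the cookie we just added
        print(cookies)
    finally:
        driver.quit()
```

Call main() to try it in a real browser. Selenium fills in fields such as domain and path from the current page when they are left out of the dictionary.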
Deleting cookies
The deletion of cookies involves removing either a specific cookie or all cookies associated with a
particular domain from the browser. This action helps manage the state and session information stored
in cookies.
#open chrome
driver = webdriver.Chrome()
Open a browser and navigate to the url with webdriver.Chrome() and driver.get()
Add a cookie with driver.add_cookie({"name": "new_cookie", "value": "added_cookie"})
Create a cookies variable and print our cookies list to the terminal
Delete the cookie with driver.delete_cookie("new_cookie")
Print the newly emptied cookies to the terminal
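The steps above could look roughly like this. It is a sketch under the same assumptions as before: the URL is illustrative and the helper name delete_and_list is hypothetical.

```python
def delete_and_list(driver, name):
    """Delete the named cookie and return whatever cookies remain."""
    driver.delete_cookie(name)
    return driver.get_cookies()


def main():
    from selenium import webdriver  # imported lazily; needs Chrome installed

    # open chrome
    driver = webdriver.Chrome()
    try:
        driver.get("https://quotes.toscrape.com")  # URL is an assumption
        driver.add_cookie({"name": "new_cookie", "value": "added_cookie"})
        # the list printed here includes new_cookie
        print(driver.get_cookies())
        # after deletion, new_cookie no longer appears in the list
        print(delete_and_list(driver, "new_cookie"))
    finally:
        driver.quit()
```

Call main() to run it against a real browser.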
Modifying cookies encompasses altering the values or properties of existing cookies, such as changing
the key-value pairs.
This process involves retrieving a specific cookie, creating a new cookie with the desired modifications,
and then adding the modified cookie back to the browser.
#open chrome
driver = webdriver.Chrome()
Open an instance of Chrome and navigate to the page with webdriver.Chrome() and
driver.get()
Add a cookie with driver.add_cookie()
Print the cookies list
Iterate through our cookies list and find the cookie with the name we're searching for
Once we've found our cookie in the list, we modify the value
Remove the original cookie with driver.delete_cookie() and replace it with the modified cookie
with driver.add_cookie()
Print the cookies to the terminal so we can view our modified cookie
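The find, modify, delete, and re-add sequence described above can be sketched like this. The function name modify_cookie_value and the URL are assumptions made for illustration.

```python
def modify_cookie_value(driver, name, new_value):
    """Replace the value of the cookie called `name`; return True if found."""
    for cookie in driver.get_cookies():
        if cookie["name"] == name:
            modified = dict(cookie)       # copy so the original dict is untouched
            modified["value"] = new_value
            driver.delete_cookie(name)    # remove the original cookie
            driver.add_cookie(modified)   # add the modified copy back
            return True
    return False


def main():
    from selenium import webdriver  # imported lazily; needs Chrome installed

    driver = webdriver.Chrome()
    try:
        driver.get("https://quotes.toscrape.com")  # URL is an assumption
        driver.add_cookie({"name": "new_cookie", "value": "added_cookie"})
        print(driver.get_cookies())
        modify_cookie_value(driver, "new_cookie", "modified_cookie")
        print(driver.get_cookies())
    finally:
        driver.quit()
```

Selenium has no dedicated "update" call, which is why the sketch deletes the old cookie and adds a modified copy in its place.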
In the example above, when we changed the value of the cookie, we could have changed other attributes
(keys) and their values.
To change the expiry date, instead of: cookie["value"] = "modified_cookie" , we could use
cookie["expiry"] = some_new_timestamp (Selenium stores the expiry as a Unix timestamp in seconds)
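As a small sketch of changing the expiry: Selenium's cookie dictionaries keep the expiration under the key expiry, as an integer number of seconds since the Unix epoch. The helper name with_expiry and its now parameter are hypothetical, added only to make the example easy to check.

```python
import time


def with_expiry(cookie, seconds_from_now, now=None):
    """Return a copy of `cookie` that expires `seconds_from_now` seconds after `now`."""
    if now is None:
        now = time.time()
    updated = dict(cookie)
    # expiry must be an integer number of seconds since the Unix epoch
    updated["expiry"] = int(now + seconds_from_now)
    return updated
```

The returned dictionary still has to be written back to the browser, for example by calling driver.delete_cookie() with the cookie's name and then driver.add_cookie() with the updated copy.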
If we wish to filter cookies based on certain data, we iterate through our list and look for a certain
attribute to target. Take a look at the snippet below that we used in one of the previous sections of this
article:
We iterate through the list and use if cookie["name"] == "new_cookie" to find our target. If we wanted
all the cookies from quotes.toscrape.com, we could do this:
We've now modified this snippet to take all cookies from quotes.toscrape.com and switch httpOnly to
True .
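The filtering idea can be sketched with two small helpers that work on the list returned by driver.get_cookies(). The function names are hypothetical, and the subdomain matching rule is an assumption about what "all the cookies from quotes.toscrape.com" should include.

```python
def cookies_for_domain(cookies, domain):
    """Return the cookies whose domain is `domain` or one of its subdomains."""
    matches = []
    for cookie in cookies:
        # browsers often store the domain with a leading dot, e.g. ".example.com"
        cookie_domain = cookie.get("domain", "").lstrip(".")
        if cookie_domain == domain or cookie_domain.endswith("." + domain):
            matches.append(cookie)
    return matches


def mark_http_only(cookies):
    """Return copies of the cookies with httpOnly switched to True."""
    return [{**cookie, "httpOnly": True} for cookie in cookies]
```

For example, mark_http_only(cookies_for_domain(driver.get_cookies(), "quotes.toscrape.com")) produces the modified copies. Each one still has to be deleted and re-added through the driver before the browser sees the change.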
In a normal browser, the steps for viewing cookies differ from one browser to the next. Selenium's
built-in cookie methods give you a universal way to create, read, update, and delete your cookies no
matter which browser Selenium is controlling. To remove all cookies, we can simply use the
delete_all_cookies() method. Note that this deletes cookies only; it does not clear the browser cache.
When managing cookies across multiple domains, we can simply filter through our cookie list and use
key-value pairs to target a specific domain. If we want only cookies from a certain site, we can use
get_cookies() to retrieve all our cookies and iterate through it with a for loop to target our cookies by
domain name.
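One way to organize cookies from several sites is to group them by their domain key. Keep in mind that get_cookies() returns the cookies visible to the page that is currently loaded, so a script would typically visit each site in turn and combine the lists. The helper name group_by_domain is hypothetical.

```python
from collections import defaultdict


def group_by_domain(cookies):
    """Group a list of cookie dictionaries into {domain: [cookies]}."""
    grouped = defaultdict(list)
    for cookie in cookies:
        # cookies without a domain key are collected under the empty string
        grouped[cookie.get("domain", "")].append(cookie)
    return dict(grouped)
```

With the cookies grouped this way, grouped["quotes.toscrape.com"] gives the list for one site without another pass through the full list.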
In order to properly handle any exceptions thrown, it is best to use try to attempt to add a cookie into
the browser and an except block (Python's equivalent of catch in other languages) to handle any
exceptions that may get thrown.
This combination of try and except allows our script to continue execution should any errors occur. If
you are experiencing any performance issues, it is always a good idea to clear your cookies.
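A minimal sketch of that pattern follows. WebDriverException is the base class of Selenium's exceptions; the function name safe_add_cookie is hypothetical, and the import fallback is only there so the sketch can be loaded on a machine without Selenium.

```python
try:
    from selenium.common.exceptions import WebDriverException
except ImportError:
    # Fallback so this sketch can still be loaded where Selenium is missing
    WebDriverException = Exception


def safe_add_cookie(driver, cookie):
    """Try to add a cookie; report failure instead of crashing the script."""
    try:
        driver.add_cookie(cookie)
        return True
    except WebDriverException as error:
        # e.g. the cookie's domain does not match the page currently loaded
        print(f"Could not add cookie {cookie.get('name')!r}: {error}")
        return False
```

Returning a boolean lets the calling code decide whether to retry, skip the cookie, or stop.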
If you wish to log in to certain sites effectively and efficiently, you may want to save your cookies to a
file. This way, you can come back and read them any time you need to.
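Saving to a file could be done with Python's json module, as in the sketch below. The function names and any file name you pass in (for example cookies.json) are assumptions, not part of the original article.

```python
import json


def save_cookies(driver, path):
    """Write the browser's current cookies to `path` as JSON."""
    with open(path, "w", encoding="utf-8") as file:
        json.dump(driver.get_cookies(), file, indent=2)


def load_cookies(driver, path):
    """Read cookies from `path`, add each to the browser, and return the count."""
    with open(path, encoding="utf-8") as file:
        cookies = json.load(file)
    for cookie in cookies:
        driver.add_cookie(cookie)
    return len(cookies)
```

Before calling load_cookies(), the driver should already have loaded a page on the site the cookies belong to, since add_cookie() only accepts cookies for the current domain.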
#open chrome
driver = webdriver.Chrome()
Open Chrome and navigate to the page with webdriver.Chrome() and driver.get()
Next we find the username and password objects using their ID
We fill these objects with the username and password variables using the send_keys() method
Next we use XPATH to find and click() the login button
After signing in, we retrieve all of our cookies with driver.get_cookies() and save them as a
variable
Next, we close the browser with driver.close()
We then open up a new browser and head back to the page
We add our cookies from the previous session to the current session with driver.add_cookie()
We sleep() for 5 seconds just so you can see that we are not logged in yet
We then use driver.refresh() and you can see that we're logged in
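The flow above can be sketched in two functions. Everything site-specific here is a placeholder: the login URL, the element IDs, and the XPath of the login button are passed in as parameters because the original listing is not available, and the function names are hypothetical.

```python
import time


def log_in(driver, url, username, password, username_id, password_id, button_xpath):
    """Fill in a login form and return the cookies present after signing in."""
    from selenium.webdriver.common.by import By  # lazy import; needs Selenium

    driver.get(url)
    # find the username and password fields by ID and type into them
    driver.find_element(By.ID, username_id).send_keys(username)
    driver.find_element(By.ID, password_id).send_keys(password)
    # find the login button with XPath and click it
    driver.find_element(By.XPATH, button_xpath).click()
    # a real script may need to wait for the post-login page before this call
    return driver.get_cookies()


def restore_session(driver, url, cookies, pause=5):
    """Load `url`, add saved cookies, wait, then refresh to pick them up."""
    driver.get(url)
    for cookie in cookies:
        driver.add_cookie(cookie)
    time.sleep(pause)   # pause so you can see the page is still logged out
    driver.refresh()    # after the refresh, the saved session should apply
```

A typical run would call log_in() with one driver, close that browser, create a new driver, and pass the returned cookies to restore_session().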
Now let's use cookie management to handle a SPA (Single Page Application). Believe it or not, Github is
a single page application. Let's use cookies to interact with Github!
driver = webdriver.Chrome()
Github uses two-factor authentication (2FA). When running the script above, make sure to finish logging
in with your 2FA before you input y to tell your script that you've logged in.
The script pauses while you enter your 2FA code. Inputting y lets the script know that you've finished
entering your 2FA and that it is ready to continue its execution.
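The pause itself needs nothing more than Python's input(). Here is a minimal sketch in which the function name, the prompt text, and the read parameter are all hypothetical and not taken from the article's script:

```python
def wait_for_confirmation(prompt="Finished logging in? (y/n): ", read=input):
    """Block until the user answers y; return the number of prompts shown."""
    attempts = 0
    while True:
        attempts += 1
        # normalise the answer so " Y " and "y" are treated the same
        answer = read(prompt).strip().lower()
        if answer == "y":
            return attempts
```

In a script, this would be called right after opening the login page. Once it returns, the browser is logged in and driver.get_cookies() can capture the session cookies.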
Conclusion
You've reached the end of this article. You should have a solid grasp on adding, deleting, and even
modifying cookies. You now know how to save them as variables and use them in future sessions.
You've even logged into LinkedIn and Github using cookies! Now that you've harnessed the power of
cookies in Selenium, go use your knowledge to scrape something cool!
Want to know more about scraping in Python? Check out the ScrapeOps Python Playbook
To know more about Selenium, take a look at the Selenium Documentation
Python Pyppeteer
Rotating Proxies in Python