Pyppeteer Tutorial: Guide to Puppeteer in Python (with Examples)

Puppeteer is a popular browser automation library maintained by Google, but it officially supports only JavaScript (Node.js). To bring the same capabilities to Python, community contributors created Pyppeteer, a port of Puppeteer that mirrors the Puppeteer API in Python.

Overview

What is Pyppeteer?

Pyppeteer is an unofficial Python port of Puppeteer, a Node.js library, that automates Chromium-based browsers through the DevTools Protocol.

Why Use Pyppeteer?

Automate UI testing for modern web applications
Scrape JavaScript-heavy websites dynamically
Generate PDFs or screenshots of webpages programmatically
Simulate user interactions (clicks, typing, navigation)
Test website performance across devices and networks
Control headless browsers in CI/CD pipelines
Monitor uptime and visual changes in web pages

This guide explains what Pyppeteer is, its use cases, how it differs from Puppeteer, best practices, and more.

What is Pyppeteer?

Pyppeteer is a Python port of the Puppeteer automation tool. It lets you automate web applications from Python while keeping its core functionality close to Puppeteer's. Pyppeteer supports the Chromium and Chrome browsers for test automation, and it is open-source software distributed under the MIT license.

Use cases of Pyppeteer

Here are some of the use cases of Pyppeteer:

1. Web Scraping Dynamic Content

Extract data from JavaScript-heavy websites that don’t render content on the server side.
Example: Scraping product listings, news articles, or job postings.

2. Automated Form Submission

Fill and submit forms for tasks like testing, mass registrations, or bots.
Example: Auto-fill contact forms, job applications, or surveys.

3. PDF Generation from Web Pages

Convert webpages (including charts, tables, and styles) into clean PDFs.
Example: Invoice generation, report exports, or receipts.

4. UI Testing and Monitoring

Run end-to-end tests for frontend apps by simulating user interactions.
Example: Test if login flow, search functionality, or checkout process works.

5. Website Screenshot Automation

Capture screenshots of full pages, specific elements, or responsive views.
Example: Visual regression testing, content archiving, or preview generation.

6. Social Media or SEO Bot Tasks

Automate posting, monitoring, or capturing metadata such as the title and description.
Example: Auto-post blog links or validate Open Graph tags.

7. Emulating Devices and Networks

Simulate different screen sizes, devices, or slow network conditions.
Example: Test mobile responsiveness or app behavior on 3G.

8. Capturing Performance Metrics

Measure page load times, resource usage, and script performance.
Example: Benchmark web performance across deployments.

9. Captcha Pre-checks (Non-breaking Only)

Detect presence of CAPTCHA and log/report (not for solving).
Example: Monitor website defenses or redirects.

10. Browser Fingerprint Emulation

Customize user-agent, viewport, geolocation, timezone, etc.
Example: Test personalization, localization, or bot detection mechanisms.

Differences between Puppeteer and Pyppeteer

Fundamentally, both are built on the same architecture. However, because Pyppeteer is a Python port of the JavaScript Puppeteer library, language-level differences mean that a few things change.

A few of these differences are listed below:

Puppeteer uses an object for passing parameters to functions. Pyppeteer functions accept both a dictionary and keyword arguments for options.

Example:

browser = await launch({'headless': True})

browser = await launch(headless=True)

Puppeteer uses $ in the names of its element selector methods. In Python, however, $ is not valid in an identifier, so Pyppeteer uses J as the shorthand instead, as listed below:

Puppeteer	Pyppeteer	Pyppeteer shorthand
Page.$()	Page.querySelector()	Page.J()
Page.$$()	Page.querySelectorAll()	Page.JJ()
Page.$x()	Page.xpath()	Page.Jx()
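For illustration, the pairs in the table above can be exercised side by side. This is a minimal sketch: the helper name count_matches and the selectors passed to it are made up for the example, and it assumes page is an already opened Pyppeteer page.

```python
import asyncio


async def count_matches(page, css_selector, xpath_expr):
    """Call each long-form selector method and its J shorthand on one page."""
    # querySelector() / J() return the first matching element, or None
    first_long = await page.querySelector(css_selector)
    first_short = await page.J(css_selector)

    # querySelectorAll() / JJ() return a list of all matching elements
    all_long = await page.querySelectorAll(css_selector)
    all_short = await page.JJ(css_selector)

    # xpath() / Jx() return a list of elements matching an XPath expression
    by_xpath_long = await page.xpath(xpath_expr)
    by_xpath_short = await page.Jx(xpath_expr)

    return {
        'first_found': first_long is not None and first_short is not None,
        'css_count': (len(all_long), len(all_short)),
        'xpath_count': (len(by_xpath_long), len(by_xpath_short)),
    }


async def main():
    # Imported here so count_matches() can be reused without Pyppeteer installed
    from pyppeteer import launch

    browser = await launch(headless=True)
    try:
        page = await browser.newPage()
        await page.goto('https://www.google.com/')
        print(await count_matches(page, 'a', '//a'))
    finally:
        await browser.close()


# To try it against a real browser: asyncio.run(main())
```

Each long-form call and its shorthand should return equivalent results, so the two numbers in every tuple are expected to match.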

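Puppeteer's evaluate() takes a JavaScript function or a string containing a JavaScript expression. Pyppeteer's evaluate() takes a string that contains either a JavaScript function or an expression. Pyppeteer tries to detect which of the two it was given, and passing force_expr=True makes it treat the string as an expression.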

Example:

content = await page.evaluate('document.body.textContent', force_expr=True)
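The two string forms can be compared in a short sketch. The helper name read_page_info is hypothetical, and page is assumed to be an open Pyppeteer page.

```python
import asyncio


async def read_page_info(page):
    """Run JavaScript in the page using both string forms evaluate() accepts."""
    # A JavaScript function given as a string; extra arguments are passed to it
    total = await page.evaluate('(a, b) => a + b', 2, 3)

    # A plain expression; force_expr=True stops Pyppeteer treating it as a function
    title = await page.evaluate('document.title', force_expr=True)

    return total, title


async def main():
    # Imported here so read_page_info() can be reused without Pyppeteer installed
    from pyppeteer import launch

    browser = await launch(headless=True)
    try:
        page = await browser.newPage()
        await page.goto('https://www.google.com/')
        print(await read_page_info(page))
    finally:
        await browser.close()


# To try it against a real browser: asyncio.run(main())
```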

With the differences between Pyppeteer and Puppeteer covered, the following sections show how to use Pyppeteer with examples.

Note: The Pyppeteer project is no longer actively maintained, and its latest version does not work smoothly. The Pyppeteer documentation itself recommends using Playwright for Python instead, which is well maintained and has a similar API.

How to set up Pyppeteer?

This section covers installing and setting up Pyppeteer with Python.

Pre-requisites to install Pyppeteer

Basic understanding of Python
Download and install Python 3.8 or higher

Install Pyppeteer

Navigate to the desired folder (for example, PyppeteerDemo)
Open a terminal and run the command below

pip install pyppeteer

Once the installation succeeds, you are ready to start automating.

How to perform various Actions on Pyppeteer

It helps to know which actions Pyppeteer can perform. If you are already familiar with Puppeteer, the learning curve is minimal. The sections below walk through the different actions in Pyppeteer.

Launching the browser with Pyppeteer

To launch the browser you need to create the browser instance first.

browser = await launch()

launch() returns a browser instance. From this browser object you can open as many pages (tabs) as you need.

Pyppeteer Example:

import asyncio

from pyppeteer import launch


async def scraper():
    # Launch a visible (non-headless) browser window
    browser = await launch({"headless": False})
    # Open a new page (tab) in that browser
    page = await browser.newPage()
    await page.goto('https://www.google.com/')
    await browser.close()


asyncio.run(scraper())

In the above code, a browser instance is created first, and then newPage() is called to open a new page. Once the page is created, you can perform actions such as navigating to a URL (for example, https://www.google.com/).

Opening specific versions of Chrome/Chromium browser with Pyppeteer

Pyppeteer is flexible and customizable: you can specify which browser executable to launch. For example, if you have already installed a specific version of Chrome, you can pass its path to the launch() function as shown below.

browser = await launch(headless=False, executablePath='C:\\Program Files\\Google\\Chrome\\Application\\chrome.exe')

Typing Text on the Web Page

Suppose you want to search for something on the page. You can use the page.type() method to enter the text.

Example:

await page.type("#mySearch", "BrowserStack")

In the above call, #mySearch is the locator and BrowserStack is the search text.

Clicking the Button on the Webpage

Pyppeteer makes clicking buttons simple: the page.click() method performs the click action. Its required argument is the locator of the element, which can be any valid CSS selector.

Example:

await page.click('#mybtn')
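Typing and clicking are usually combined into one flow. The sketch below is one way to do that; the helper name search is hypothetical, and the selectors are the placeholder ones used in the examples above.

```python
import asyncio


async def search(page, input_selector, button_selector, text):
    """Type text into a search box, then click the search button."""
    # Wait for the input to exist before interacting with it
    await page.waitForSelector(input_selector)
    # 'delay' is optional: milliseconds to wait between key presses
    await page.type(input_selector, text, {'delay': 50})
    await page.click(button_selector)


async def main():
    # Imported here so search() can be reused without Pyppeteer installed
    from pyppeteer import launch

    browser = await launch(headless=False)
    try:
        page = await browser.newPage()
        # Placeholder URL: replace with a page that has these elements
        await page.goto('https://www.google.com/')
        await search(page, '#mySearch', '#mybtn', 'BrowserStack')
    finally:
        await browser.close()


# To try it against a real browser: asyncio.run(main())
```

Note that both page.type() and page.click() are coroutines, so each call must be awaited.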

Printing PDF files with Pyppeteer

Pyppeteer allows you to save a web page as a PDF, so instead of taking a screenshot you can save the whole page in PDF format. The page.pdf() function is used for this purpose.

Example:

await page.pdf({'path': 'python_print.pdf'})
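A slightly fuller sketch of PDF generation follows. The helper name save_pdf and the chosen options are illustrative assumptions. PDF generation is expected to work only when the browser runs in headless mode.

```python
import asyncio


async def save_pdf(page, url, path):
    """Open a URL and save the rendered page as an A4 PDF."""
    # Wait until network activity has settled so late content is included
    await page.goto(url, {'waitUntil': 'networkidle0'})
    await page.pdf({
        'path': path,
        'format': 'A4',
        'printBackground': True,  # include CSS background colors and images
    })
    return path


async def main():
    # Imported here so save_pdf() can be reused without Pyppeteer installed
    from pyppeteer import launch

    browser = await launch(headless=True)
    try:
        page = await browser.newPage()
        print(await save_pdf(page, 'https://www.google.com/', 'python_print.pdf'))
    finally:
        await browser.close()


# To try it against a real browser: asyncio.run(main())
```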

Switching Tabs with Pyppeteer

Pyppeteer takes a different approach to switching tabs. As mentioned earlier, once you have the browser instance you can create as many pages as you want. For example:

browser = await launch({"headless": False})
page1 = await browser.newPage()
page2 = await browser.newPage()
await page1.goto('https://www.google.com/')
await page2.goto('https://www.browserstack.com/')

In the above example, two pages are created: page1 represents the first tab and page2 represents the second. To act on a tab, call methods on the corresponding page object.

Example: Clicking on the search button on the second page

await page2.click('#browserstackSearchButton')
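When the page objects were not kept in variables, the open tabs can be looked up from the browser. The sketch below assumes a hypothetical helper named switch_to_tab; it uses browser.pages() to list the open tabs and bringToFront() to focus one of them.

```python
import asyncio


async def switch_to_tab(browser, index):
    """Return the tab at the given position and bring it to the foreground."""
    # pages() returns a list with one Page object per open tab
    pages = await browser.pages()
    target = pages[index]
    await target.bringToFront()
    return target


async def main():
    # Imported here so switch_to_tab() can be reused without Pyppeteer installed
    from pyppeteer import launch

    browser = await launch(headless=False)
    try:
        await browser.newPage()
        await browser.newPage()
        # -1 selects the last tab in the list
        tab = await switch_to_tab(browser, -1)
        await tab.goto('https://www.browserstack.com/')
    finally:
        await browser.close()


# To try it against a real browser: asyncio.run(main())
```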

Managing Cookies with Pyppeteer

Pyppeteer provides the capability to manage cookies: you can set, read, and delete them. The available methods are listed below.

Example:

await page.cookies() : Get the cookies for the current page

await page.setCookie(*cookies) : Set one or more cookies, each given as a dictionary

await page.deleteCookie(*cookies) : Delete one or more cookies, each given as a dictionary that contains at least the cookie name
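The three cookie methods can be combined into a small round trip. This is a sketch under assumptions: the helper name cookie_round_trip and the cookie name and value are invented, and the page is assumed to have already navigated to an http(s) URL so the cookie can be associated with it.

```python
import asyncio


async def cookie_round_trip(page, name, value):
    """Set a cookie, confirm it is readable, delete it, and confirm it is gone."""
    # Cookies are passed as dictionaries
    await page.setCookie({'name': name, 'value': value})
    found = any(c['name'] == name for c in await page.cookies())

    # deleteCookie() also takes dictionaries; 'name' identifies the cookie
    await page.deleteCookie({'name': name})
    still_there = any(c['name'] == name for c in await page.cookies())

    return found, still_there


async def main():
    # Imported here so cookie_round_trip() can be reused without Pyppeteer installed
    from pyppeteer import launch

    browser = await launch(headless=True)
    try:
        page = await browser.newPage()
        await page.goto('https://www.google.com/')
        print(await cookie_round_trip(page, 'demo_cookie', '123'))
    finally:
        await browser.close()


# To try it against a real browser: asyncio.run(main())
```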

Handling iFrames with Pyppeteer

Although iframes are an older technique for dividing a page, Pyppeteer supports actions inside them.

For example, to click a specific element inside an iframe, you can follow the approach below.

Example:

# iframe_1 and iframe_button are variables holding CSS selectors
iframe_element = await page.querySelector(iframe_1)
# contentFrame() returns the frame that the iframe element contains
iframe = await iframe_element.contentFrame()
button = await iframe.querySelector(iframe_button)
await button.click()

Handling Alerts and Pop-ups with Pyppeteer

Alerts, confirms, and prompts are JavaScript dialogs raised by the browser. Pyppeteer emits a dialog event for each one, which you handle by registering a callback with page.on(). Because the callback is a regular function, the dialog's coroutine methods are scheduled with asyncio.ensure_future(). Below are examples of handling browser pop-ups and alert messages.

Example: Handle Confirm Dialog Box

def handle_confirm_dialog_box(dialog):
    # test_message is the text to enter if the dialog is a prompt
    asyncio.ensure_future(dialog.accept(test_message))

page.on('dialog', handle_confirm_dialog_box)

Example: Handle Dismiss Dialog Box

def handle_dismiss_dialog_box(dialog):
    asyncio.ensure_future(dialog.dismiss())

page.on('dialog', handle_dismiss_dialog_box)

Similarly, you can handle different types of dialogs.
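The two handlers can also be folded into one configurable callback. The sketch below is an assumption-laden illustration: make_dialog_handler is a hypothetical helper, and it relies on the dialog object exposing type and message attributes along with accept() and dismiss().

```python
import asyncio


def make_dialog_handler(accept=True, prompt_text=''):
    """Build a 'dialog' event callback that logs the dialog and answers it."""
    def handler(dialog):
        # Log what the page asked, for example "confirm: Are you sure?"
        print(f'{dialog.type}: {dialog.message}')
        if accept:
            # prompt_text is only used when the dialog is a prompt
            asyncio.ensure_future(dialog.accept(prompt_text))
        else:
            asyncio.ensure_future(dialog.dismiss())
    return handler


async def main():
    # Imported here so the handler factory can be reused without Pyppeteer installed
    from pyppeteer import launch

    browser = await launch(headless=True)
    try:
        page = await browser.newPage()
        page.on('dialog', make_dialog_handler(accept=False))
        # Trigger a confirm dialog from the page itself
        answer = await page.evaluate('() => confirm("Proceed?")')
        print(answer)
    finally:
        await browser.close()


# To try it against a real browser: asyncio.run(main())
```

Register the handler before the action that triggers the dialog, otherwise the page may block waiting for a response.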

Handling Dynamic Content with Pyppeteer

Dynamic web elements are standard in modern web applications. For example, when you navigate to a page, it may load only the content or elements for the current viewport. As you scroll down, additional content is added.

In such scenarios, if you navigate to the web page and perform an action immediately, it raises an exception with an element-not-found error. One solution is to scroll down until the web page elements are visible.

This scenario can be handled in Pyppeteer by calling the browser's scrollIntoView() function from within the page.

Example:

Scroll until the element is visible

async def scroll_to_element(page, selector):
    # Run scrollIntoView() in the page on the first element matching the selector
    await page.evaluateHandle(
        '''async (selector) => {
            const element = document.querySelector(selector);
            if (element) {
                element.scrollIntoView();
            }
        }''',
        selector
    )
    return selector

Once the element is found, perform the action

# button_footer is a variable holding the CSS selector of the target element
elem_button_footer = await scroll_to_element(page, button_footer)
await page.click(elem_button_footer)

Parallel Execution with Pyppeteer

Pyppeteer does not provide a built-in way to run tests in parallel, so by itself it cannot run multiple tests at the same time. However, this can be achieved with a third-party pytest plugin called pytest-parallel.

Once you install pytest-parallel, you can use the following commands to run the tests in parallel

pytest --workers 2: Run tests in 2 workers
pytest --workers auto: Choose the number of workers automatically based on the CPU cores
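Separately from running whole tests in parallel, a single script can drive several pages concurrently, because Pyppeteer is built on asyncio. The sketch below illustrates that idea with asyncio.gather(); the helper names fetch_title and fetch_titles are hypothetical, and this is concurrency within one process, not a replacement for a parallel test runner.

```python
import asyncio


async def fetch_title(browser, url):
    """Open a URL in its own tab and return the page title."""
    page = await browser.newPage()
    try:
        await page.goto(url)
        return await page.title()
    finally:
        # Close the tab even if navigation fails
        await page.close()


async def fetch_titles(browser, urls):
    """Load all URLs concurrently; results come back in the order of urls."""
    return await asyncio.gather(*(fetch_title(browser, u) for u in urls))


async def main():
    # Imported here so the helpers can be reused without Pyppeteer installed
    from pyppeteer import launch

    browser = await launch(headless=True)
    try:
        urls = ['https://www.google.com/', 'https://www.browserstack.com/']
        print(await fetch_titles(browser, urls))
    finally:
        await browser.close()


# To try it against a real browser: asyncio.run(main())
```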

Web Scraping with Pyppeteer

Pyppeteer and Python are a good combination for scraping web pages. Web scraping supports industry research, especially in the retail segment, where companies analyze competitors' products, SKUs, pricing, discounts, and so on.

Below is a simple example of web scraping with Pyppeteer.

The steps are:
Navigate to the web page
Get all the product cards
For each product, get the name and price

Example:

async def scrape_it(page, url) -> list:
    # Navigate to the page to scrape, for example 'https://someretailwebsite.com'
    await page.goto(url)
    # Each matching element is treated as one product card
    rows = await page.querySelectorAll('#product_cards')
    scraping_data_arr = []
    for row in rows:
        name = await row.querySelector('p.name')
        price = await row.querySelector('div.price-value')
        # Read the text of each element inside the page
        nameText = await page.evaluate('(element) => element.textContent', name)
        priceValue = await page.evaluate('(element) => element.textContent', price)
        scraping_data_dict = {
            'product name': nameText,
            'product price': priceValue
        }
        scraping_data_arr.append(scraping_data_dict)
    return scraping_data_arr

Cross Browser Testing with Pyppeteer

As discussed above, Pyppeteer is a port of the JavaScript Puppeteer library, so its browser support matches Puppeteer's. Puppeteer is intended for testing in the Chromium and Chrome browsers only.

Although Puppeteer has experimental support for Firefox, using it is not recommended. Because of these limitations, cross-browser testing cannot be achieved with Pyppeteer. The best alternative is Playwright for Python, which supports multiple browsers without any hassle.

Best Practices for using Pyppeteer

Here are some of the best practices for using Pyppeteer:

1. Use waitForSelector() Instead of Fixed Timeouts

Avoid await page.waitFor(5000).
Use await page.waitForSelector('#element') to wait until an element actually appears.

2. Run in Headless Mode for Performance

Use headless mode (headless=True) for faster execution unless debugging.
When debugging, switch to headless=False and enable slowMo.

3. Close Browser Instances Properly

Always close the browser using await browser.close() to avoid memory leaks.
Use try/finally to ensure cleanup even when an error occurs.

4. Isolate Browser Contexts: Use browser.createIncognitoBrowserContext() for independent sessions (good for multi-user automation or testing).

5. Throttle Requests for Scraping: Respect server load; add random delays between requests to mimic human behavior.

6. Intercept and Block Unnecessary Requests: Use page.setRequestInterception(True) to block images, fonts, and ads for faster performance.

7. Use Page Evaluation Wisely: Prefer page.querySelectorEval() and page.evaluate() for in-page computations rather than sending raw data back to Python.

8. Handle Exceptions Gracefully: Use try/except blocks and meaningful error messages to make debugging easier.

9. Leverage Async/Await Correctly: Ensure all async operations are awaited properly to avoid race conditions or unpredictable behavior.

10. Keep Dependencies Updated: Pyppeteer depends on Chromium and internal protocols, so stay updated to avoid breakages caused by browser protocol changes.
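Several of these practices can be combined in one place. The sketch below is illustrative only: the helper names (filter_request, polite_delay, visit), the blocked resource types, and the timeout are assumptions chosen for the example.

```python
import asyncio
import random

# Resource types to skip; adjust to what the target page actually needs
BLOCKED_TYPES = {'image', 'font', 'media'}


async def filter_request(request):
    """Abort requests for heavy resources and let everything else through."""
    if request.resourceType in BLOCKED_TYPES:
        await request.abort()
    else:
        await request.continue_()


async def polite_delay(min_s=1.0, max_s=3.0):
    """Sleep for a random interval to throttle requests; returns the delay used."""
    delay = random.uniform(min_s, max_s)
    await asyncio.sleep(delay)
    return delay


async def visit(url, selector):
    """Load a page in an isolated context and return the text of one element."""
    # Imported here so the helpers above can be reused without Pyppeteer installed
    from pyppeteer import launch

    browser = await launch(headless=True)
    try:
        # An incognito context keeps cookies and cache separate from other sessions
        context = await browser.createIncognitoBrowserContext()
        page = await context.newPage()

        await page.setRequestInterception(True)
        page.on('request', lambda req: asyncio.ensure_future(filter_request(req)))

        await page.goto(url)
        # Wait for the element instead of sleeping for a fixed time
        await page.waitForSelector(selector, {'timeout': 10000})
        # Compute the result inside the page and return only the text
        text = await page.querySelectorEval(selector, '(el) => el.textContent')
        await polite_delay()
        return text
    finally:
        # Runs even if any step above raises
        await browser.close()


# To try it against a real browser: asyncio.run(visit('https://www.google.com/', 'body'))
```

Wrapping the whole flow in try/finally means the browser is closed whether the navigation, the wait, or the evaluation fails.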

How to run Pyppeteer Tests on Real Devices with BrowserStack Automate

Pyppeteer is in a maintenance phase and is no longer actively developed. Its latest version cannot download the latest browser versions, and because active development has stopped, there is little support for cloud execution.

Although cloud execution can be achieved through workarounds, stability and reliability cannot be guaranteed.

Note: If you need cloud execution, you can migrate your project to Puppeteer with JavaScript. If the programming language is a barrier, consider switching to Playwright for Python.

BrowserStack Automate supports both Puppeteer with JavaScript and Playwright for Python. You can learn more about integrating Playwright with BrowserStack and test your application on 3500+ real devices in the cloud.

Conclusion

Puppeteer is a popular tool for test automation, but it supports no programming language other than JavaScript. To overcome this, Pyppeteer was created as a Python port of Puppeteer.

The Pyppeteer project is no longer actively maintained. In automation, cross-browser testing, cloud test execution, and parallel execution are crucial to achieving ROI, and Pyppeteer lacks all three.

Unless you test your application on multiple device and browser combinations, you cannot predict how it will behave in production, which increases risk. As recommended by the Pyppeteer team, consider switching to Playwright for Python, which provides all of these features without any hassle.

Additionally, BrowserStack Automate supports Playwright, which helps you achieve good ROI from automation and release your code with confidence.
