Description
scrapy_playwright tries to imitate a web browser, so it downloads all resources (images, scripts, stylesheets, etc.). Given that this information is downloaded, is it possible to grab the payload from specific requests shown in the network tab, specifically from the fetch/XHR tab?
For example (minimal reproducible code):
import scrapy
from scrapy_playwright.page import PageCoroutine


class testSpider(scrapy.Spider):
    name = 'test'

    def start_requests(self):
        yield scrapy.Request(
            url="http://quotes.toscrape.com/scroll",
            cookies={"foo": "bar", "asdf": "qwerty"},
            meta={
                "playwright": True,
                "playwright_page_coroutines": [
                    PageCoroutine("wait_for_selector", "div.quote"),
                    PageCoroutine("evaluate", "window.scrollBy(0, document.body.scrollHeight)"),
                    PageCoroutine("wait_for_selector", "div.quote:nth-child(11)"),  # 10 per page
                    PageCoroutine("screenshot", path="scroll.png", full_page=True),
                ],
            },
        )

    def parse(self, response):
        pass
Produces the following output:
2022-02-21 11:50:39 [scrapy-playwright] DEBUG: [Context=default] Request: <GET http://quotes.toscrape.com/api/quotes?page=2> (resource type: xhr, referrer: http://quotes.toscrape.com/scroll)
2022-02-21 11:50:39 [scrapy-playwright] DEBUG: [Context=default] Request: <GET http://quotes.toscrape.com/api/quotes?page=3> (resource type: xhr, referrer: http://quotes.toscrape.com/scroll)
2022-02-21 11:50:39 [scrapy-playwright] DEBUG: [Context=default] Request: <GET http://quotes.toscrape.com/api/quotes?page=4> (resource type: xhr, referrer: http://quotes.toscrape.com/scroll)
...
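One way this could be done (a sketch, not a confirmed scrapy-playwright recipe) is to attach a handler to Playwright's page "request" event and record the query parameters of every request whose resource type is xhr. The names record_xhr and xhr_params below are invented for illustration; resource_type and url are real attributes of Playwright's Request object:

```python
from urllib.parse import parse_qs, urlsplit

# Collected query parameters for each XHR request the page makes.
# record_xhr / xhr_params are illustrative names, not library API.
xhr_params = []

async def record_xhr(request):
    # "request" is expected to be a playwright Request object;
    # resource_type and url are attributes of that class.
    if request.resource_type == "xhr":
        xhr_params.append(parse_qs(urlsplit(request.url).query))
```

With access to the underlying page object, this handler could be registered with Playwright's page.on("request", record_xhr) before the scroll happens; how to get at the page object depends on the scrapy-playwright version in use.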
This is very similar to Firefox, which lets you copy the URL and its parameters. The aim is to store the URL parameters in a list each time Playwright downloads a specific request URL. With a more complex website I would have to find specific request URLs and grab their parameters.
Something like:
if response.meta['resource_type'] == 'xhr':
    print(parameters(response.meta['resource_type_urls']))
This is a pseudo-example to express what I want to get; parameters
would be a function that grabs the URL parameters.
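Such a parameters function (the name is taken from the pseudo-example above, not from any library) can be written with the standard library alone:

```python
from urllib.parse import parse_qs, urlsplit

def parameters(url):
    """Return the query-string parameters of a URL as a dict of lists."""
    return parse_qs(urlsplit(url).query)

# parameters("http://quotes.toscrape.com/api/quotes?page=2")
# returns {"page": ["2"]}
```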
Or perhaps it works like this:
if response.meta['resource_type'] == 'xhr':
    print(response.meta['parameters'])
However, saving this into response.meta
will likely bloat the results if I have a large number of URLs per resource type, since the URL parameters are fairly large dicts.
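One way to keep the stored data small would be to filter down to the specific endpoint of interest and keep only its parameters, discarding every other resource URL. The prefix below is an assumption based on the log output earlier in this issue, and wanted_params is an invented helper name:

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical endpoint prefix, taken from the XHR URLs in the debug log.
WANTED = "http://quotes.toscrape.com/api/quotes"

def wanted_params(urls):
    # Keep query parameters only for URLs under the wanted endpoint,
    # instead of storing every resource URL the page downloaded.
    return [parse_qs(urlsplit(u).query) for u in urls if u.startswith(WANTED)]
```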
I'm convinced this data is available, since it is downloaded; I just do not know how to get it.