Python - Web Scraping Videos - Stack Overflow

The document discusses techniques for web scraping videos, specifically using Python libraries like BeautifulSoup and requests to extract video URLs from websites. It provides example code for downloading a specific episode of 'Bob's Burgers' and highlights the importance of locating the correct HTML tags, such as <video> and <source>, to retrieve the video file. Additionally, it addresses challenges faced when scraping video content and offers solutions for successfully downloading the videos.

Uploaded by

Louie Lu


Web Scraping Videos

Asked 4 years ago Modified 1 year, 8 months ago Viewed 15k times

I'm attempting a proof of concept by downloading a TV episode of Bob's Burgers at
https://www.watchcartoononline.com/bobs-burgers-season-9-episode-3-tweentrepreneurs.

I cannot figure out how to extract the video URL from this website. I used the Chrome and Firefox
web developer tools and found that the video is in an iframe, but extracting src URLs with
BeautifulSoup by searching for iframes returns links that have nothing to do with the video.
Where are the references to the mp4 or flv files (which I can see in Developer Tools, even though
clicking them is forbidden)?

Any insight into how to do video web scraping with BeautifulSoup and requests would
be appreciated.

Here is some code if needed. A lot of tutorials say to use 'a' tags, but I didn't get any 'a'
tags back.

import requests
from bs4 import BeautifulSoup

r = requests.get("https://www.watchcartoononline.com/bobs-burgers-season-9-episode-5-live-and-let-fly")
soup = BeautifulSoup(r.content, 'html.parser')
links = soup.find_all('iframe')
for link in links:
    print(link['src'])

python video screen-scraping

edited Nov 7, 2018 at 20:04 by petezurich
asked Nov 7, 2018 at 19:37 by user192085

Get the video source from the <video> tag. I found it to be this one in your example:
cdn.cizgifilmlerizle.com/cizgi/… Then you can use Python requests with the stream=True parameter like this
– Lucas Wieloch Nov 7, 2018 at 19:41

Possible duplicate of Is there a way to download a video from a webpage with python? – Lucas Wieloch
Nov 7, 2018 at 19:41


2 Answers

import requests

url = "https://disk19.cizgifilmlerizle.com/cizgi/bobs.burgers.s09e03.mp4?st=_EEVz36ktZOv7ZxlTaXZfg&e=1541637622"

def download_file(url, filename):
    # NOTE the stream=True parameter
    r = requests.get(url, stream=True)
    with open(filename, 'wb') as f:
        for chunk in r.iter_content(chunk_size=1024):
            if chunk:  # filter out keep-alive new chunks
                f.write(chunk)
                # f.flush() commented by recommendation from J.F.Sebastian
    return filename

download_file(url, "bobs.burgers.s09e03.mp4")

This code will download this particular episode onto your computer. The video URL is in the
<source> tag, which is nested inside the <video> tag.

answered Nov 7, 2018 at 20:53 by Dimitriy Kruglikov

This did save a file named after your function, but it was invalid and only 162 bytes. Why didn't
BeautifulSoup find the video and source tags? I couldn't even locate the URL containing the mp4
extension with bs4 or by simply searching the requests response text/content. – user192085 Nov 8, 2018
at 20:50
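
A file of only 162 bytes, as described in the comment above, may well be a short error page saved under an .mp4 name rather than video data. As a minimal sketch, one could inspect the response metadata before writing anything to disk. The helper name looks_like_video and the min_bytes threshold are hypothetical and not part of any answer here; the accepted content types are an assumption about what a video host would send.

```python
def looks_like_video(status_code, headers, min_bytes=1024):
    """Return True if the response metadata is consistent with a video file.

    Hypothetical helper: `headers` is any mapping of header names to values,
    and `min_bytes` is an arbitrary illustrative threshold.
    """
    # Anything other than 200 OK (for example 403 Forbidden) is not a download.
    if status_code != 200:
        return False

    # An HTML error page usually announces itself as text/html.
    content_type = headers.get("Content-Type", "").lower()
    if not content_type.startswith(("video/", "application/octet-stream")):
        return False

    # If the server reports a length, reject implausibly small bodies.
    length = headers.get("Content-Length")
    if length is not None and int(length) < min_bytes:
        return False

    return True


ok = looks_like_video(200, {"Content-Type": "video/mp4", "Content-Length": "52428800"})
tiny_html = looks_like_video(200, {"Content-Type": "text/html", "Content-Length": "162"})
forbidden = looks_like_video(403, {"Content-Type": "video/mp4"})
```

With requests, the same check could be applied as looks_like_video(r.status_code, r.headers) right after requests.get(url, stream=True) and before opening the output file.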

Background Information
(scroll all the way down for your answer)

This is only easy when the website you're trying to get the video from states the video URL
explicitly in its HTML. Say you want to get a .mp4 file from the site of your choice by
referencing the .mp4 URL. Take this site, for instance:
https://4anime.to/yakunara-mug-cup-mo-episode-01-1?id=45314. If we look for <video> in
Inspect Element, there will be a src containing the .mp4 URL.

Now, if we were to try to grab the .mp4 URL from this website like this:

import requests
from bs4 import BeautifulSoup

html_url = "https://4anime.to/yakunara-mug-cup-mo-episode-01-1?id=45314"
html_response = requests.get(html_url)
soup = BeautifulSoup(html_response.text, 'html.parser')

for mp4 in soup.find_all('video'):
    mp4 = mp4['src']
    print(mp4)

We would get a KeyError: 'src'. This happens because the <video> tag itself has no src
attribute; the actual video URL is stored in the nested <source> tag, which we can see if we
print out the elements returned by soup.find_all('video'):

import requests
from bs4 import BeautifulSoup

html_url = "https://4anime.to/yakunara-mug-cup-mo-episode-01-1?id=45314"
html_response = requests.get(html_url)
soup = BeautifulSoup(html_response.text, 'html.parser')

# Print each <video> element in full, including its child tags
for mp4 in soup.find_all('video'):
    print(mp4)

The output:

<video class="video-js vjs-default-skin vjs-big-play-centered" controls="" data-setup="{}" height="264" id="example_video_1" poster="" preload="none" width="640">
<source src="https://mountainoservo0002.animecdn.com/Yakunara-Mug-Cup-mo/Yakunara-Mug-Cup-mo-Episode-01.1-1080p.mp4" type="video/mp4"/>
</video>
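
To see the difference between the two tags without touching the network, here is a small offline sketch. It swaps BeautifulSoup for the standard library's html.parser so it runs with no third-party packages. The sample markup, the example.com URL, and the names VideoSourceParser and find_video_urls are made up for illustration.

```python
from html.parser import HTMLParser

# Hypothetical markup shaped like the output above; the URL is a placeholder.
SAMPLE_HTML = """
<video class="video-js" controls="" id="example_video_1" preload="none">
<source src="https://example.com/media/episode-01.mp4" type="video/mp4"/>
</video>
"""


class VideoSourceParser(HTMLParser):
    """Record the src attribute (or None) of every <video> and <source> tag."""

    def __init__(self):
        super().__init__()
        self.video_srcs = []
        self.source_srcs = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <source .../> are also routed here.
        src = dict(attrs).get("src")
        if tag == "video":
            self.video_srcs.append(src)
        elif tag == "source":
            self.source_srcs.append(src)


def find_video_urls(html):
    """Return src values from <video> tags first, then from <source> tags."""
    parser = VideoSourceParser()
    parser.feed(html)
    candidates = parser.video_srcs + parser.source_srcs
    return [src for src in candidates if src]


urls = find_video_urls(SAMPLE_HTML)
filename = urls[0].split("/")[-1]
```

Here the <video> tag contributes nothing because it has no src, which is the same situation that raises the KeyError; the only URL found comes from <source>. In BeautifulSoup, tag.get('src') returns None in that case instead of raising.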

So to download the .mp4, we use the <source> element and take the src from that
instead.

import requests
import shutil  # - - This module helps to transfer information from one file to another
from bs4 import BeautifulSoup  # - - We could honestly do this without soup

# - - Get the URL of the site you want to scrape
html_url = "https://4anime.to/yakunara-mug-cup-mo-episode-01-1?id=45314"
html_response = requests.get(html_url)
soup = BeautifulSoup(html_response.text, 'html.parser')

# - - Get the .mp4 URL and the filename
for vid in soup.find_all('source'):
    url = vid['src']
    filename = vid['src'].split('/')[-1]

# - - Get the video
response = requests.get(url, stream=True)

# - - Make sure the status is OK
if response.status_code == 200:
    # - - Have the raw stream decode any gzip/deflate content encoding
    response.raw.decode_content = True

    with open(filename, 'wb') as f:
        # - - Copy what's in response.raw and transfer it into the file
        shutil.copyfileobj(response.raw, f)

(You could obviously simplify this by copying the source's src manually and using that as
the URL, without going through html_url at all. I just wanted to show that you can
reference the .mp4, aka the source's src, programmatically.)

Once again, not every site is this clear-cut. For this site in particular, we're fortunate that it is
this manageable. Other sites you try to scrape a video from might require you to go from
Elements (in Inspect Element) to Network. There you'd have to find the snippets of embedded
links and download them all to make up the full video. That is not always easy, but the
video for the site you requested is.
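
As a rough sketch of the "snippets" case: one common form it takes is a playlist file that lists the video segments in order. The example below assumes such a playlist in HLS (.m3u8) style; nothing above confirms that any particular site uses it. The playlist text, the example.com URLs, and the helper names segment_urls and join_segments are all hypothetical.

```python
import io
from urllib.parse import urljoin


def segment_urls(playlist_text, playlist_url):
    """Return absolute segment URLs from a simple media playlist.

    Assumes one URI per line, with '#' lines being tags or comments.
    """
    urls = []
    for line in playlist_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Relative segment names are resolved against the playlist's own URL.
        urls.append(urljoin(playlist_url, line))
    return urls


def join_segments(chunks, out_file):
    """Write the byte chunks to out_file in order; return total bytes written."""
    total = 0
    for chunk in chunks:
        out_file.write(chunk)
        total += len(chunk)
    return total


# Hypothetical playlist with two segments.
SAMPLE_PLAYLIST = """#EXTM3U
#EXT-X-TARGETDURATION:10
#EXTINF:10.0,
seg-000.ts
#EXTINF:10.0,
seg-001.ts
#EXT-X-ENDLIST
"""

urls = segment_urls(SAMPLE_PLAYLIST, "https://example.com/video/index.m3u8")

# Stand-in for downloaded segment bodies, joined into one in-memory file.
buffer = io.BytesIO()
written = join_segments([b"first-", b"second"], buffer)
```

In practice each URL would be fetched (for instance with requests.get) and its content passed to join_segments with a real file opened in 'wb' mode. Whether the joined file plays as-is depends on the segment format.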

YOUR ANSWER
Go to Inspect Element, click on Chromecast Player (2. Player) located at the top of the
video to view the HTML attributes, and finally click on the embed, which should look like this:

pid=437035&h=25424730eed390d0bb4634fa93a2e96c&t=1618011716&embed=cizgi

Once you've done that, click play, make sure Inspect Element is open, click the video to view
its attributes (or press Ctrl+F to search for <video>), and copy the src, which should be:

https://cdn.cizgifilmlerizle.com/cizgi/bobs.burgers.s09e05.mp4?st=f9OWlOq1e-2M9eUVvhZa8A&e=1618019876

Now we can download it with Python.

import requests
import shutil  # - - This module helps to transfer information from one file to another

url = "https://cdn.cizgifilmlerizle.com/cizgi/bobs.burgers.s09e05.mp4?st=f9OWlOq1e-2M9eUVvhZa8A&e=1618019876"

response = requests.get(url, stream=True)

if response.status_code == 200:
    # - - Have the raw stream decode any gzip/deflate content encoding
    response.raw.decode_content = True

    with open('bobs-burgers.mp4', 'wb') as f:
        # - - Take the data from response.raw and transfer it to the file
        shutil.copyfileobj(response.raw, f)
    print('downloaded file')
else:
    print('Download failed')
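
One possible reason a copied URL like this stops working later: the e= query parameter looks like a Unix timestamp, and it may be an expiry time for the link. Nothing in this thread confirms that, so treat it as an assumption. The sketch below only decodes the number; the helper name link_expiry is hypothetical.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit


def link_expiry(url, param="e"):
    """Interpret the given query parameter as a Unix timestamp (UTC).

    Returns None when the parameter is missing or not a plain integer.
    Treating `e` as an expiry time is an assumption, not documented behavior.
    """
    query = parse_qs(urlsplit(url).query)
    values = query.get(param)
    if not values or not values[0].isdigit():
        return None
    return datetime.fromtimestamp(int(values[0]), tz=timezone.utc)


url = ("https://cdn.cizgifilmlerizle.com/cizgi/bobs.burgers.s09e05.mp4"
       "?st=f9OWlOq1e-2M9eUVvhZa8A&e=1618019876")
expiry = link_expiry(url)
print(expiry)
```

If the value does turn out to be an expiry, comparing it with datetime.now(timezone.utc) before downloading would tell you whether a fresh src needs to be copied from the page first.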

answered Apr 9, 2021 at 23:50 by theletter_zee
