Open Facebook Crawler Based On Python

The document outlines requirements for developing an open Facebook crawler in Python. It specifies details like the crawler needing a GUI to input login credentials and a page URL, crawling posts and comments from the given page, and saving the extracted data to an Excel file with fields like brand, post ID, date, content, reactions.

Uploaded by

Asim Anayat

OFFICIAL (CLOSED) \ NON-SENSITIVE

Open Facebook Crawler based on Python


Requirements and Deliverables: Implement and deliver an Open Facebook Crawler
in Python that automatically collects open Facebook posts and comments from a
given Open Facebook page, according to two specifications: 1) the Open Facebook
Crawler GUI and 2) the Open Facebook Python Crawler. The crawler must be able
to run on any Windows notebook.

The deliverables include the Python source code, installers for the necessary
Python libraries, and user instructions for setting up and running the crawler
on a Windows notebook.

Specifications of the Open Facebook Crawler GUI


• A Graphical User Interface (GUI) will let users enter their personal
Facebook information (User_ID and Password) and the Open Facebook Page URL
to configure the crawl, as shown in Figure 1.

• A SAVE button saves this information. It creates an Excel file in the same
folder as the application and stores the username, password, and page URL in
that file.

• A START button starts the crawler.

• A STOP button provides a hard stop for the crawler.

• The status of the crawler will be shown as either "Scraper is Running" or
"Scraper is not Running".

• The status of the crawl will be updated every 5-10 seconds with the total
number of posts and comments crawled so far.

• An incorrect User_ID and/or Password will invoke a warning prompt asking
the user to check and re-enter their personal Facebook information, as shown
in Figure 2.


• An incorrect Open Facebook page URL will also invoke a warning prompt asking
the user to check and re-enter the URL, as shown in Figure 2.
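The two warning prompts in the list above could be driven by a small validation step that runs before the crawl starts. The following is a minimal sketch, not the required implementation: the function names, the accepted host names, and the warning texts are assumptions made for illustration. It only checks the form of the input; whether the credentials are actually correct can only be known after a login attempt.

```python
from typing import List, Optional
from urllib.parse import urlparse

# Assumed set of acceptable host names for a Facebook page URL.
FACEBOOK_HOSTS = {"facebook.com", "www.facebook.com", "m.facebook.com"}


def page_name_from_url(url: str) -> Optional[str]:
    """Return the page name from a URL such as https://www.facebook.com/cnn,
    or None if the URL does not look like a Facebook page URL."""
    parsed = urlparse(url.strip())
    if parsed.scheme not in ("http", "https"):
        return None
    if parsed.netloc.lower() not in FACEBOOK_HOSTS:
        return None
    segments = [s for s in parsed.path.split("/") if s]
    if len(segments) != 1:
        return None
    return segments[0]


def validate_form(user_id: str, password: str, page_url: str) -> List[str]:
    """Return the warning messages to show; an empty list means the form is usable."""
    warnings = []
    if not user_id.strip() or not password:
        warnings.append("Please check and re-enter your User_ID and Password.")
    if page_name_from_url(page_url) is None:
        warnings.append("Please check and re-enter the Open Facebook Page URL.")
    return warnings
```

A GUI could call `validate_form` when START is pressed and show one prompt per returned message.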

Figure 1: GUI interface for configuring Open Facebook Crawl

Figure 2: Prompt dialog boxes shown when information is not correctly entered
in the GUI for configuring the Open Facebook Crawl
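The START, STOP, and status requirements of the GUI specification could be backed by a small shared object that the crawler thread updates and the GUI reads. This is a hedged sketch: `CrawlStatus`, `crawl`, and the `(posts, comments)` batch format are hypothetical names and shapes, and the stop shown here is cooperative (it takes effect between batches) rather than a forced termination.

```python
import threading


class CrawlStatus:
    """Thread-safe counters plus a stop flag shared by the GUI and the crawler."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._running = False
        self.posts = 0
        self.comments = 0

    def start(self) -> None:
        with self._lock:
            self._stop.clear()
            self._running = True

    def request_stop(self) -> None:
        # Called by the STOP button.
        self._stop.set()

    def should_stop(self) -> bool:
        return self._stop.is_set()

    def finish(self) -> None:
        with self._lock:
            self._running = False

    def record(self, posts: int = 0, comments: int = 0) -> None:
        with self._lock:
            self.posts += posts
            self.comments += comments

    def summary(self) -> str:
        with self._lock:
            state = "Scraper is Running" if self._running else "Scraper is not Running"
            return f"{state} | posts: {self.posts} | comments: {self.comments}"


def crawl(status: CrawlStatus, batches) -> None:
    """Consume (posts, comments) batches until done or a stop is requested."""
    status.start()
    try:
        for posts, comments in batches:
            if status.should_stop():
                break
            status.record(posts, comments)
    finally:
        status.finish()
```

In a tkinter GUI, the status label could be refreshed by re-scheduling a callback with the widget's `after` method every 5000 to 10000 ms and setting the label text to `status.summary()`.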

Specifications of the Open Facebook Python Crawler



• The Open Facebook Python crawler must be able to crawl all Posts and
Comments found on the given Facebook page.

• The following items are to be extracted from all the posts and comments
found on the Open Facebook Page and placed in an output Excel file, as shown
in Figure 3 and Table 1:

Facebook Posts: Brand, Post ID, Date, Content, No. Likes, No. Shares,
No. Comments
Facebook Comments: User Names and Comments, Post ID

Table 1: Typical items to be crawled
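The items in Table 1 map naturally onto two record types, one for posts and one for comments. The sketch below is an assumption about how they might be represented before being written out: the class names, field names, and column order are illustrative, and the provided sample output file remains the authority on the real layout. The rows are plain lists, so they could be handed to whichever Excel library is chosen.

```python
from dataclasses import astuple, dataclass
from typing import List, Sequence

# Column headers taken from the item list in the specification.
POST_HEADER = ["Brand", "Post ID", "Date", "Content",
               "No. Likes", "No. Shares", "No. Comments"]
COMMENT_HEADER = ["User Name", "Comment", "Post ID"]


@dataclass
class Post:
    brand: str
    post_id: str
    date: str
    content: str
    likes: int
    shares: int
    comments: int


@dataclass
class Comment:
    user_name: str
    text: str
    post_id: str  # links the comment back to its post


def to_rows(header: Sequence[str], records: Sequence) -> List[list]:
    """Return a header row followed by one row per record, in field order."""
    return [list(header)] + [list(astuple(r)) for r in records]
```

Keeping the Post ID on every comment row is what allows comments to be matched to their post in the output file.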

Figure 3: A typical Open Facebook Post and all the items to be crawled


How to get post ID


To get the post ID, click on the post's date. The post ID will then be visible in the URL.
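Once the post URL is known, the ID can be pulled out of it programmatically. The sketch below assumes two URL shapes, a path containing `/posts/<id>` and a query string containing `story_fbid=<id>`. These shapes and the function name are assumptions for illustration; Facebook's URL formats vary and change, so real URLs should be checked against them.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse


def post_id_from_url(url: str) -> Optional[str]:
    """Extract the post ID from an assumed post URL shape, or return None."""
    parsed = urlparse(url)
    query = parse_qs(parsed.query)
    # Assumed shape 1: .../permalink.php?story_fbid=<id>&id=<page id>
    if "story_fbid" in query:
        return query["story_fbid"][0]
    # Assumed shape 2: .../<page>/posts/<id>
    match = re.search(r"/posts/([^/?#]+)", parsed.path)
    return match.group(1) if match else None
```

Returning `None` for unrecognised URLs lets the caller log the URL instead of writing a wrong ID to the output file.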

Location of the output Excel file


The output file should be saved in the same folder as the application.
The output Excel file should be named after the crawled Facebook page.
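The naming rule above can be sketched as a small helper. It assumes that "the name of the crawled Facebook page" means the page name as it appears in the URL (for example `ShopeeSingapore`) and that the file uses the `.xlsx` extension. Both are assumptions, as is the function name.

```python
from pathlib import Path
from urllib.parse import urlparse


def output_path(page_url: str, app_dir: Path) -> Path:
    """Build <app_dir>/<page name>.xlsx from the page URL."""
    # First path segment of the URL is taken as the page name (assumption).
    page_name = urlparse(page_url).path.strip("/").split("/")[0]
    if not page_name:
        raise ValueError("page URL has no page name")
    return app_dir / f"{page_name}.xlsx"
```

For a script-based application, `Path(__file__).resolve().parent` is one way to obtain the application folder to pass as `app_dir`.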

How the output file should be formatted

A sample output file is provided; follow its format.

Meaning of open Facebook pages

Open pages are those that are published and visible on Facebook. If a page is
unpublished and cannot be accessed, show an error (as discussed above).

Note

• Company Facebook pages may have, for example, 300 posts and perhaps 3,000
comments. The tool must be able to scrape as many of the posts and comments
as possible.
• Scrape data in a way that does not cause Facebook to block the account
because of the scraping activity.
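One common way to keep a crawl gentle is to pause for a randomized interval between page loads. The sketch below shows that pacing idea only. The function names and the default delays are illustrative assumptions, and pacing by itself is not a guarantee that an account will not be blocked.

```python
import random
import time
from typing import Callable, Iterable, Iterator, Optional


def next_delay(base_seconds: float, jitter_seconds: float,
               rng: random.Random) -> float:
    """Return a delay of base_seconds plus a random extra up to jitter_seconds."""
    return base_seconds + rng.uniform(0.0, jitter_seconds)


def paced(items: Iterable, base_seconds: float = 3.0,
          jitter_seconds: float = 2.0,
          rng: Optional[random.Random] = None,
          sleep: Callable[[float], None] = time.sleep) -> Iterator:
    """Yield items one by one, sleeping a randomized interval between them."""
    rng = rng or random.Random()
    for index, item in enumerate(items):
        if index > 0:  # no wait before the first item
            sleep(next_delay(base_seconds, jitter_seconds, rng))
        yield item
```

A crawler loop could then be written as `for post_url in paced(post_urls): ...`; the `sleep` parameter exists so the pacing can be tested without real waiting.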

Final Delivery
As discussed, please follow the specifications and the example output file, and
deliver the following: 1) the source code of the Python Facebook posts and
comments scraper, 2) a user guide for installing and running the code on the
Windows platform, and 3) installers for the necessary Python libraries, so that
we can run it on our end. Thanks.

Example lists of Typical Open Facebook Pages for Potential Crawling


The following are some examples of open Facebook pages:

No  Open Facebook Page
1   https://www.facebook.com/ShopeeSingapore
2   https://www.facebook.com/LazadaSingapore
3   https://www.facebook.com/cnn
4   https://www.facebook.com/ChannelNewsAsia
5   https://www.facebook.com/lovebonito
6   https://www.facebook.com/SonySingapore
7   https://www.facebook.com/GrabFoodSG
