Open Facebook Crawler Based On Python
This will include the Python source code, installers for the necessary Python
libraries, and user instructions for setting up and running the crawler on a
Windows notebook.
A SAVE button to save the entered information. Create an Excel file in the
same folder as the application, and save the username, password, and page URL
in that file.
The status of the crawl will be updated every 5-10 seconds, showing the total
number of posts and comments crawled so far.
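The periodic status update could be sketched as follows. This is only an illustration, not a required design: the class name CrawlStatus, the record() method, and the 5-second default interval are assumptions chosen to match the 5-10 second requirement above.

```python
import time


class CrawlStatus:
    """Tracks crawl counts and reports them at most once per interval."""

    def __init__(self, interval_seconds=5.0, clock=time.monotonic):
        # interval_seconds is an assumed default within the 5-10 s range.
        self.interval_seconds = interval_seconds
        self._clock = clock  # injectable so the timing can be tested
        self._last_report = None
        self.posts = 0
        self.comments = 0

    def record(self, posts=0, comments=0):
        """Add newly crawled items; return a status line if one is due, else None."""
        self.posts += posts
        self.comments += comments
        now = self._clock()
        due = (
            self._last_report is None
            or now - self._last_report >= self.interval_seconds
        )
        if due:
            self._last_report = now
            return f"Crawled {self.posts} posts and {self.comments} comments"
        return None
```

The crawler would call record() after each batch of posts or comments and show the returned text in the application window whenever it is not None.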
OFFICIAL (CLOSED) \ NON-SENSITIVE
An incorrect open Facebook page URL will also invoke a warning prompt asking
the user to check and re-enter the URL, as shown in Figure 2.
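A first-pass check on the URL could look like the sketch below. The function name extract_page_name and the list of accepted host names are assumptions. The check only confirms that the URL has the shape of a Facebook page address; whether the page is actually published can only be known once the crawler tries to open it.

```python
from urllib.parse import urlparse

# Assumed set of host names accepted as Facebook page addresses.
FACEBOOK_HOSTS = {"facebook.com", "www.facebook.com", "m.facebook.com"}


def extract_page_name(url):
    """Return the page name from a Facebook page URL, or None if the URL looks wrong."""
    parsed = urlparse(url.strip())
    if parsed.scheme not in ("http", "https"):
        return None
    if (parsed.hostname or "") not in FACEBOOK_HOSTS:
        return None
    # Expect exactly one path segment, e.g. /cnn or /cnn/
    segments = [s for s in parsed.path.split("/") if s]
    if len(segments) != 1:
        return None
    return segments[0]
```

When the function returns None, the application would show the warning prompt instead of starting the crawl.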
The Open Facebook Python crawler must be able to crawl the following
information from open Facebook pages: all posts and comments found on the
page.
Facebook Posts: Brand, Post ID, Date, Content, No. of Likes, No. of Shares,
No. of Comments
Facebook Comments: User Name, Comment, Post ID
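The fields listed above could be held in two simple record types, sketched below. The class and attribute names, and the choice of str or int for each field, are assumptions; the sample output file remains the reference for the actual column names and order.

```python
from dataclasses import dataclass, asdict


@dataclass
class Post:
    """One Facebook post, with the fields listed in the specification."""
    brand: str
    post_id: str
    date: str      # kept as text; the date format is not specified
    content: str
    likes: int
    shares: int
    comments: int  # number of comments on the post


@dataclass
class Comment:
    """One comment; post_id links it back to its post."""
    user_name: str
    comment: str
    post_id: str


# Placeholder values for illustration only, not real crawled data.
example = Post("ExampleBrand", "123", "2021-01-01", "Hello", 10, 2, 1)
row = asdict(example)  # dict with one key per output column
```

Using post_id in both records lets the comments be matched to their posts when the output file is written.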
Figure 3: A typical Open Facebook Post and all the items to be crawled
The output file should be saved in the same folder as the application.
The output Excel file should be named after the crawled Facebook page.
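Building the output path could be sketched as follows. The function name output_path, the .xlsx extension, and the fallback name "output" are assumptions; the character replacement is there because Windows does not allow certain characters in file names.

```python
import re
from pathlib import Path


def output_path(page_name, app_dir):
    """Return the Excel output path for a page, inside the application folder."""
    # Replace characters that Windows forbids in file names.
    safe = re.sub(r'[<>:"/\\|?*]', "_", page_name).strip(" .")
    if not safe:
        safe = "output"  # assumed fallback when nothing usable remains
    return Path(app_dir) / f"{safe}.xlsx"
```

For example, a page named cnn with the application placed in C:/crawler would give an output file called cnn.xlsx in that folder.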
How the output file should be formatted
A sample output file is provided. Follow its format.
Meaning of open Facebook pages
Open pages are those that are published and visible on Facebook. If a page is
unpublished and cannot be accessed, show an error (as discussed above).
Note
A company Facebook page might have, say, 300 posts and 3,000 comments. You
must ensure your tool is able to scrape as many of the posts and comments
as possible.
Scrape data in a way that does not cause Facebook to block the account
because of scraping activity.
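One common way to keep the crawl gentle is to wait a random interval between page actions. The sketch below shows the idea only; the function name polite_pause and the 2-6 second range are assumptions, and no delay setting can guarantee that an account will not be blocked.

```python
import random
import time


def polite_pause(min_seconds=2.0, max_seconds=6.0, sleep=time.sleep):
    """Sleep for a random time in [min_seconds, max_seconds] and return the delay."""
    # The default range is an assumed starting point, to be tuned in practice.
    delay = random.uniform(min_seconds, max_seconds)
    sleep(delay)  # injectable so the function can be tested without waiting
    return delay
```

The crawler would call polite_pause() after each scroll or page load, so that requests are not sent at a fixed, machine-like rate.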
Final Delivery
As discussed, please follow the specifications and the example output file,
and deliver the following: 1) source code of the Python Facebook posts and
comments scraper, 2) a user guide for installing and running the code on the
Windows platform, 3) installers for the necessary Python libraries, so that
we can run it on our end. Thanks.
1 https://www.facebook.com/ShopeeSingapore
2 https://www.facebook.com/LazadaSingapore
3 https://www.facebook.com/cnn
4 https://www.facebook.com/ChannelNewsAsia
5 https://www.facebook.com/lovebonito
6 https://www.facebook.com/SonySingapore
7 https://www.facebook.com/GrabFoodSG