GitHub is the most popular platform for hosting Git repositories, where millions of developers manage their projects and code. In this article, we will fetch the top 10 most-starred Lua repositories from GitHub's monthly trending page.
Since the task is to scrape GitHub's repository listing, we will mainly use the Requests and BeautifulSoup libraries: Requests downloads the page, and BeautifulSoup parses its HTML.
We will store the result in a text file and then display it. Each row shows a repository's position (by stars), followed by the owner's name and the repository name.
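The ranking step can be sketched on its own, without any network access. This is only an illustration under stated assumptions: the helper name `rank_repos` and the `sample` list are hypothetical, and the sketch assumes each repository link is a path of the form `/owner/repo`.

```python
def rank_repos(paths, limit=10):
    """Turn '/owner/repo' paths into (position, owner, repo) tuples."""
    rows = []
    for path in paths:
        parts = path.strip("/").split("/")
        if len(parts) != 2:  # ignore anything that is not owner/repo
            continue
        rows.append((len(rows) + 1, parts[0], parts[1]))
        if len(rows) == limit:  # stop once we have the requested number of rows
            break
    return rows

# Hypothetical sample input; a real run would collect these paths from the page.
sample = ["/skywind3000/z.lua", "/Kong/kong", "/login", "/Gawen/WireHub"]
for position, owner, repo in rank_repos(sample):
    print(position, owner, repo)
```

Positions are assigned in the order the paths arrive, so the sketch relies on the page already listing repositories in ranked order.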
Below is the code to implement it:
import requests
from bs4 import BeautifulSoup

# Download the monthly trending page for Lua and parse it
r = requests.get('https://github.com/trending/lua?since=monthly')
bs = BeautifulSoup(r.text, 'lxml')
lista_repo = bs.find_all('ol', class_='repo-list')

# First pass: write the anchor tags of every repository entry to the file
f1 = open('starred-repos.txt', 'w')
for lr in lista_repo:
    aux = lr.find_all('div', class_='d-inline-block col-9 mb-1')
    for ld in aux:
        rank = ld.find_all('a')
        f1.write(str(rank))
        f1.write('\n')
f1.close()

# Second pass: read the file back and keep the href of each anchor line
f1 = open('starred-repos.txt', 'r')
texto = []
for x in f1:
    if x.startswith('[<a'):
        na = x.split('"')
        texto.append(na[1])
f1.close()

# Overwrite the file with a table of the top 10 entries
f1 = open('starred-repos.txt', 'w')
f1.write('{}\t {}\t\t {}\t\n\n'.format('Position ', 'Name ', 'Repositories '))
for i in range(min(10, len(texto))):
    tex = texto[i].split('/')  # href has the form /owner/repo
    name = tex[1]
    repos = tex[2]
    f1.write('{}- \t {}\t\t {}'.format(i + 1, name, repos))
    f1.write('\n')
f1.close()

# Display the result
f1 = open('starred-repos.txt', 'r')
print(f1.read())
f1.close()
Output
Position    Name             Repositories

1-          skywind3000      z.lua
2-          Kong             kong
3-          Gawen            WireHub
4-          PapyElGringo     material-awesome
5-          koreader         koreader
6-          stijnwop         guidanceSteering
7-          Courseplay       courseplay
8-          Tencent          LuaPanda
9-          ntop             ntopng
10-         awesomeWM        awesome
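Tab-separated columns can drift out of alignment when names differ in length. As a possible alternative, the sketch below pads each column to a fixed width using format specifiers. The helper name `format_table` and the column widths are assumptions chosen for illustration, not part of the original script.

```python
def format_table(rows):
    """Render (position, name, repository) rows as fixed-width text columns."""
    # Size the name column to fit the longest name (or the header itself)
    name_width = max([len("Name")] + [len(name) for _, name, _ in rows])
    lines = [f"{'Position':<10}{'Name':<{name_width + 2}}Repository"]
    for position, name, repo in rows:
        lines.append(f"{position:<10}{name:<{name_width + 2}}{repo}")
    return "\n".join(lines)

# Two rows taken from the output above
rows = [(1, "skywind3000", "z.lua"), (2, "Kong", "kong")]
print(format_table(rows))
```

Because the name column is sized from the data, the repository column starts at the same offset on every line regardless of how long the owner names are.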