Extract Author's Information from a GeeksforGeeks Article using Python

Last Updated : 23 Jul, 2025

In this article, we will write a Python script that extracts author information from a GeeksforGeeks article.

Modules needed

  • bs4: Beautiful Soup (bs4) is a Python library for pulling data out of HTML and XML files. It does not come built in with Python. To install it, type the command below in the terminal.
pip install bs4
  • requests: Requests lets you send HTTP/1.1 requests with very little code. It also does not come built in with Python. To install it, type the command below in the terminal.
pip install requests

Approach:

  • Import the modules
  • Write a getdata() function that makes a GET request to a URL with requests
  • Initialize the article title
  • Build the article URL and pass it into getdata()
  • Parse the returned HTML with Beautiful Soup
  • Find the required details and filter them

Stepwise execution of scripts:

Step 1: Import all dependencies

Python
# import module
import requests
from bs4 import BeautifulSoup

 
Step 2: Create a function that fetches a URL

Python3
# make a GET request to the given URL
# and return the HTML of the page as text
def getdata(url):
    r = requests.get(url)
    return r.text

Step 3: Append the article name to the base URL, pass the URL into the getdata() function, and parse the returned HTML with Beautiful Soup

Python3
# name of the article as it appears in its URL
article = "optparse-module-in-python"

# build the full article URL
url = "https://www.geeksforgeeks.org/"+article

# pass the url into the getdata function
# and parse the returned HTML
htmldata = getdata(url)
soup = BeautifulSoup(htmldata, 'html.parser')

# display the parsed html code
print(soup)

Output: The full HTML of the article page is printed.
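
The URL in this step is built by plain string concatenation. As a rough alternative, the sketch below uses the standard library's urllib.parse to join the base address and the article name. The helper name article_url is hypothetical and only serves the example.

```python
from urllib.parse import quote, urljoin

# base address used in the article
BASE_URL = "https://www.geeksforgeeks.org/"


def article_url(slug):
    """Return the article address for a slug (hypothetical helper)."""
    # strip stray slashes and percent-encode unsafe characters
    return urljoin(BASE_URL, quote(slug.strip("/")))


print(article_url("optparse-module-in-python"))
# https://www.geeksforgeeks.org/optparse-module-in-python
```

This keeps the result stable even if the article name is supplied with a leading or trailing slash.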

Step 4: Extract the author's name from the HTML document.

Python
# loop over the children of the author_handle div
# and keep the text of the last one as the author name
for i in soup.find('div', class_="author_handle"):
    Author = i.get_text()
print(Author)

Output: 

kumar_satyam
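
Note that soup.find() returns None when no matching tag exists, in which case the loop above raises a TypeError. The sketch below shows one way to guard against that. It runs on a small hand-written HTML string rather than the live page, so the markup in sample_html is an assumption about the page structure, and extract_author is a hypothetical helper name.

```python
from bs4 import BeautifulSoup

# hypothetical markup standing in for the relevant part of an article page
sample_html = '<div class="author_handle"><a href="#">kumar_satyam</a></div>'


def extract_author(html):
    """Return the author handle, or None when the div is absent."""
    soup = BeautifulSoup(html, "html.parser")
    handle_div = soup.find("div", class_="author_handle")
    if handle_div is None:
        return None
    # strip=True trims whitespace around the text
    return handle_div.get_text(strip=True)


print(extract_author(sample_html))          # kumar_satyam
print(extract_author("<p>no author</p>"))   # None
```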

Step 5: Build the profile URL from the author name and fetch its HTML.

Python3
# build the profile URL from the author name
profile = 'https://auth.geeksforgeeks.org/?to=https://auth.geeksforgeeks.org/profile.php'+Author+'/profile'

# pass the url into the getdata function
# and parse the returned HTML
htmldata = getdata(profile)
soup = BeautifulSoup(htmldata, 'html.parser')

Step 6: Extract the author's information.

Python3
# the author's display name sits in a div with these classes
name = soup.find(
    'div', class_='mdl-cell mdl-cell--9-col mdl-cell--12-col-phone textBold medText').get_text()

# collect the text of every remaining profile detail
author_info = []
for item in soup.find_all('div', class_='mdl-cell mdl-cell--9-col mdl-cell--12-col-phone textBold'):
    author_info.append(item.get_text())

print("Author name :", name)
print("Author information  :")
print(author_info)

Output:

Author name : Satyam Kumar 
Author information  : 
['LNMI patna', '\nhttps://www.linkedin.com/in/satyam-kumar-174273101/'] 
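
The scraped strings can carry stray whitespace, such as the leading newline in the second entry above. A minimal post-processing sketch, using the values from that output, might look like this. The variable names cleaned and links are illustrative and not part of the original script.

```python
# values copied from the output above
author_info = ['LNMI patna',
               '\nhttps://www.linkedin.com/in/satyam-kumar-174273101/']

# strip surrounding whitespace and drop entries that end up empty
cleaned = [text.strip() for text in author_info if text.strip()]

# separate links from plain-text details
links = [text for text in cleaned if text.startswith("http")]

print(cleaned)
# ['LNMI patna', 'https://www.linkedin.com/in/satyam-kumar-174273101/']
print(links)
# ['https://www.linkedin.com/in/satyam-kumar-174273101/']
```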
 

Complete code:

Python3
# import module
import requests
from bs4 import BeautifulSoup

# make a GET request to the given URL
# and return the HTML of the page as text


def getdata(url):
    r = requests.get(url)
    return r.text


# input article by geek
article = "optparse-module-in-python"

# build the full article URL
url = "https://www.geeksforgeeks.org/"+article


# pass the url
# into getdata function
htmldata = getdata(url)
soup = BeautifulSoup(htmldata, 'html.parser')

# loop over the children of the author_handle div
# and keep the text of the last one as the author name
for i in soup.find('div', class_="author_handle"):
    Author = i.get_text()

# build the profile URL from the author name
profile = 'https://auth.geeksforgeeks.org/?to=https://auth.geeksforgeeks.org/profile.php'+Author+'/profile'

# pass the url
# into getdata function
htmldata = getdata(profile)
soup = BeautifulSoup(htmldata, 'html.parser')

# traverse information of author
name = soup.find(
    'div', class_='mdl-cell mdl-cell--9-col mdl-cell--12-col-phone textBold medText').get_text()


author_info = []
for item in soup.find_all('div', class_='mdl-cell mdl-cell--9-col mdl-cell--12-col-phone textBold'):
    author_info.append(item.get_text())

print("Author name :", name)
print("Author information  :")
print(author_info)

Output:

Author name : Satyam Kumar 
Author information  : 
['LNMI patna', '\nhttps://www.linkedin.com/in/satyam-kumar-174273101/'] 
 

