Extract Data From JustDial using Selenium
Last Updated: 24 Oct, 2021
Let us see how to extract data from Justdial using Selenium and Python. Justdial is a company that provides local search for services in India by phone, on its website and through its mobile apps. In this article we will extract the following data for each listing:
- Phone number
- Name
- Address
We can then save the data in a CSV file.
Approach:
- Import the following modules: webdriver from selenium, ChromeDriverManager, pandas, time and os.
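- Call driver.get() with the URL of the page you want to extract information from.
- Locate every listing by its class name, 'store-details': driver.find_elements(By.CLASS_NAME, 'store-details') in Selenium 4, or driver.find_elements_by_class_name('store-details') in Selenium 3.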
- Instantiate empty lists to store the values.
- Iterate over storeDetails and fetch the individual details that are required.
- Create a user-defined function strings_to_num() to convert each code extracted from a class name into the digit or symbol it stands for.
- Display the details and save them as a CSV file according to the requirements.
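The conversion step in the list above depends on how the page encodes phone numbers: the program reads each character from the class attribute of an element rather than from visible text. As a minimal sketch of that step in isolation, the snippet below decodes a list of class strings without a browser. The names decode_phone and CODE_TO_CHAR and the sample strings with their "mobilesv icon-" prefix are assumptions made for illustration; only the code-to-character pairs are taken from strings_to_num() in the program below.

```python
# Code-to-character pairs copied from the article's strings_to_num() function.
CODE_TO_CHAR = {
    "dc": "+", "fe": "(", "hg": ")", "ba": "-",
    "acb": "0", "yz": "1", "wx": "2", "vu": "3", "ts": "4",
    "rq": "5", "po": "6", "nm": "7", "lk": "8", "ji": "9",
}


def decode_phone(class_attributes):
    """Turn a sequence of class attribute strings into a phone number string."""
    chars = []
    for attr in class_attributes:
        # Keep everything after the first hyphen, e.g. "mobilesv icon-ji" -> "ji".
        _, _, code = attr.partition("-")
        # Codes missing from the table contribute nothing to the result.
        chars.append(CODE_TO_CHAR.get(code, ""))
    return "".join(chars)


# Made-up class strings; the "mobilesv icon-" prefix is an assumption.
sample = ["mobilesv icon-dc", "mobilesv icon-ji", "mobilesv icon-yz", "mobilesv icon-acb"]
print(decode_phone(sample))
```

Unlike strings_to_num(), which returns the string "nothing" for an unknown code, this sketch skips unknown codes so that they do not end up inside the joined phone number. Either behaviour may be reasonable depending on how you want to detect a changed page layout.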
# importing the modules
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from webdriver_manager.chrome import ChromeDriverManager
import pandas as pd
import time
import os

# start Chrome with the driver binary downloaded by webdriver_manager
# (Selenium 4 syntax; Selenium 3 accepted the driver path as the first positional argument)
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))

# driver.get() navigates to the page at the given URL
driver.get("https://www.justdial.com/Delhi/Ceiling-Tile-Dealers-Armstrong/nct-11271379")

# the user-defined function: maps a class-name code to the character it stands for
def strings_to_num(argument):
    switcher = {
        'dc': '+',
        'fe': '(',
        'hg': ')',
        'ba': '-',
        'acb': '0',
        'yz': '1',
        'wx': '2',
        'vu': '3',
        'ts': '4',
        'rq': '5',
        'po': '6',
        'nm': '7',
        'lk': '8',
        'ji': '9'
    }
    # unknown codes fall back to the string "nothing"
    return switcher.get(argument, "nothing")

# fetching all the store details
# (find_elements_by_class_name() was removed in Selenium 4; use find_elements(By.CLASS_NAME, ...))
storeDetails = driver.find_elements(By.CLASS_NAME, 'store-details')

# instantiating empty lists
nameList = []
addressList = []
numbersList = []

# iterating the storeDetails
for store in storeDetails:
    # fetching the name, address and contact for each entry
    name = store.find_element(By.CLASS_NAME, 'lng_cont_name').text
    address = store.find_element(By.CLASS_NAME, 'cont_sw_addr').text
    contactList = store.find_elements(By.CLASS_NAME, 'mobilesv')
    myList = []
    for contact in contactList:
        # the code for one character is the part of the class attribute after the hyphen
        myString = contact.get_attribute('class').split("-")[1]
        myList.append(strings_to_num(myString))
    nameList.append(name)
    addressList.append(address)
    numbersList.append("".join(myList))

# initialize data of lists
data = {'Company Name': nameList,
        'Address': addressList,
        'Phone': numbersList}

# Create DataFrame
df = pd.DataFrame(data)
print(df)

# Save data as .csv (mode='a' appends to the file; header=False omits the column names)
df.to_csv('demo1.csv', mode='a', header=False)
Output: the script prints the DataFrame and appends the same rows to demo1.csv.
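Because the program saves with mode='a' and header=False, every run appends rows to demo1.csv and the file never receives column names. One possible way to keep the append behaviour while still writing a header once is sketched below. It uses the standard-library csv module instead of pandas so that it runs without third-party packages. The function name append_rows and the placeholder row are assumptions for illustration, not real Justdial data.

```python
import csv
import os
import tempfile


def append_rows(path, rows, fieldnames):
    """Append dict rows to a CSV file, writing the header only if the file is new or empty."""
    needs_header = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        if needs_header:
            writer.writeheader()
        writer.writerows(rows)


# Placeholder data for illustration only.
FIELDS = ["Company Name", "Address", "Phone"]
rows = [{"Company Name": "Example Store", "Address": "Example Street, Delhi", "Phone": "+910"}]

# Write to a temporary directory so the demo leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo1.csv")
    append_rows(demo_path, rows, FIELDS)
    append_rows(demo_path, rows, FIELDS)  # second call appends without repeating the header
    with open(demo_path, newline="", encoding="utf-8") as handle:
        lines = handle.read().splitlines()

print(lines[0])
print(len(lines))
```

With pandas, a similar effect can likely be achieved by passing a boolean to the header parameter of to_csv(), for example header=not os.path.exists('demo1.csv'), which would also put the otherwise unused os import to work.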