Scraping Your Mailbox with Python & Basics in Parsing
Index
● Installing Jupyter Notebook
● Authentication (OAuth2 & app passwords)
● Accessing your mailbox (IMAP)
1. Installing Jupyter Notebook
New users are strongly encouraged to install Anaconda, which conveniently installs both
Python and Jupyter Notebook. Use the following installation steps:
Download Anaconda by clicking on ‘Download for Mac’
Once the installation is complete, Anaconda Navigator will appear in your Launchpad
Open Anaconda Navigator and click “Launch” under the Jupyter Notebook tile, as shown in
the screenshot. The following then happens:
A Local Web Server Starts
Anaconda will spin up a local server (usually at http://localhost:8888). This server is
what powers Jupyter in your browser.
Your Browser Automatically Opens
A new tab will open in your default web browser, showing the Jupyter Notebook
interface. Don’t worry—your files are not online. Everything is running locally on your
computer.
You Land in the File Browser
The first screen shows a file browser, rooted in the directory where the server started
(usually your user folder). You can navigate through folders or create new ones here.
You Can Now Create or Open Notebooks
Click “New” → Python 3 (or whichever kernel is installed) to start a new
notebook. Or click on an existing .ipynb file to open it.
Running Code Inside the Notebook
Each notebook is made up of cells. You can write and run Python code inside these
cells. Results will appear right below each one.
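As a small illustration of the cell workflow, something like the following could be typed into a first cell and run with Shift+Enter. The function name `greet` is made up for this example.

```python
# A first notebook cell: define a small helper and call it.
def greet(name):
    return f"Hello, {name}!"


message = greet("Jupyter")
print(message)  # the printed text appears directly below the cell
```

Variables defined in one cell (here `message`) stay available to cells that are run later in the same notebook session.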
2. Authentication (OAuth2 & App Passwords)
An app password is required when accessing your Gmail account from Python scripts or
third-party apps that don’t support 2-Step Verification directly. As a prerequisite,
2-Step Verification must be enabled on your Google account.
Step 1: Go to Your Google Account (https://myaccount.google.com/)
Step 2: Enable 2-Step Verification (If not already done)
● Click on Security in the left sidebar
● Under “Signing in to Google”, find 2-Step Verification and click on it
● Follow the prompts to set it up using your phone number or Google
Authenticator app
Step 3: Access App Passwords
● Once 2-Step Verification is enabled, go back to the Security section
● Under “Signing in to Google”, click App passwords
Step 4: Generate an App Password
● Under ‘Your app password’, enter a custom name, e.g. ‘Python Script’, and
click on ‘Create’
Step 5: Copy the Generated Password
● Google will show you a 16-character password
● You won’t see this password again, so store it securely or generate a new
one later
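One way to avoid typing the app password directly into a notebook is sketched below. It assumes the password is either stored in an environment variable or entered at a hidden prompt. The variable name `GMAIL_APP_PASSWORD` and both function names are hypothetical choices for this illustration. Google typically displays the password in groups separated by spaces, so the sketch strips whitespace before checking the length.

```python
import getpass
import os


def normalize_app_password(raw):
    """Strip whitespace from a pasted app password and check its length."""
    cleaned = "".join(raw.split())
    if len(cleaned) != 16:
        raise ValueError("expected a 16-character app password")
    return cleaned


def read_app_password(env_var="GMAIL_APP_PASSWORD"):
    """Read the password from an environment variable, else prompt for it.

    The environment variable name is an assumption made for this example.
    """
    raw = os.environ.get(env_var) or getpass.getpass("Gmail app password: ")
    return normalize_app_password(raw)
```

Calling `read_app_password()` in a notebook cell would then either pick up the stored value or show a prompt that hides what is typed, so the password does not end up saved in the .ipynb file.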
3. Accessing your mailbox (IMAP)
Before running the script to scrape your mailbox, install the required third-party
libraries by running the command below in a Jupyter notebook cell. The other modules used
here (re, imaplib, email, datetime) ship with Python and need no installation.
pip install numpy pandas tqdm pytz
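To confirm that the installation worked, a quick check along the following lines can be run in the next cell. This is only a sketch: the helper name `missing_packages` is made up here, and it only tests whether each package can be located, not which version is installed.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


required = ["numpy", "pandas", "tqdm", "pytz"]
missing = missing_packages(required)
if missing:
    print("Still missing:", ", ".join(missing))
else:
    print("All required libraries are installed.")
```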
This section explains the code used to connect to your Gmail inbox using Python
and retrieve emails securely.
# Import the necessary libraries
import numpy as np
import pandas as pd
import re
import imaplib, email
from datetime import datetime, timedelta
from tqdm import tqdm
import pytz
These libraries handle different parts of the process:
● numpy, pandas: For any data processing or tabular handling.
● re: Used for pattern matching (e.g., extracting info from email text).
● imaplib, email: To connect to the mailbox and parse messages.
● datetime, timedelta: For working with date and time ranges.
● tqdm: Adds a progress bar (useful when processing many emails).
● pytz: Helps with timezone conversions (e.g., UTC to IST).
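To make the roles of `email` and `re` more concrete, here is a minimal sketch that parses a made-up raw message. The message text, the addresses, and the order-number pattern are all invented for illustration; real messages fetched from a mailbox arrive as bytes and are usually more complex.

```python
import email
import re
from email.utils import parseaddr

# A made-up raw message standing in for one fetched from the mailbox.
raw_message = (
    "From: Example Shop <orders@example.com>\r\n"
    "To: me@example.com\r\n"
    "Subject: Your order #12345 has shipped\r\n"
    "\r\n"
    "Thanks for shopping with us.\r\n"
)

# Turn the raw text into a Message object with header access.
msg = email.message_from_string(raw_message)

# parseaddr splits a header like 'Name <addr>' into (name, addr).
sender_name, sender_address = parseaddr(msg["From"])

# A regular expression pulls the order number out of the subject line.
match = re.search(r"#(\d+)", msg["Subject"])
order_number = match.group(1) if match else None

print(sender_name, sender_address, order_number)
```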
# Define the timezone
utc = pytz.utc
ist = pytz.timezone('Asia/Kolkata')
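The two timezone objects defined above might be used as in the sketch below, which converts a UTC timestamp to IST. The helper name `to_ist` and the sample date are assumptions for this example, as is the choice to treat datetimes without timezone information as UTC.

```python
from datetime import datetime

import pytz

utc = pytz.utc
ist = pytz.timezone('Asia/Kolkata')


def to_ist(dt):
    """Convert a datetime to IST, treating naive values as UTC."""
    if dt.tzinfo is None:
        dt = utc.localize(dt)  # attach UTC to a naive datetime
    return dt.astimezone(ist)


sent_utc = datetime(2024, 1, 1, 18, 45)  # naive value, assumed to be UTC
sent_ist = to_ist(sent_utc)
print(sent_ist.strftime('%Y-%m-%d %H:%M'))
```

Because IST is 5 hours 30 minutes ahead of UTC, an evening UTC timestamp like this one lands on the following calendar day in IST, which matters when filtering emails by date.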
# Enter your Gmail address and app password
user = 'your_email@gmail.com'  # placeholder: replace with your Gmail address
password = 'App Password'  # paste your Gmail app password
imap_url = 'imap.gmail.com'
# Connect to Gmail
my_mail = imaplib.IMAP4_SSL(imap_url) # connect to gmail
my_mail.login(user, password) # sign in with your credentials
my_mail.select('Inbox') # select the folder that you want to retrieve
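The connection steps above return status information that is easy to overlook. The sketch below wraps them in a function that checks the response from `select()` and reports how many messages the folder holds. The names `message_count` and `open_folder` are made up for this example, and opening the folder with `readonly=True` is a cautious choice rather than something the original script does.

```python
import imaplib


def message_count(select_response):
    """Extract the message count from the (status, data) pair returned by select()."""
    status, data = select_response
    if status != 'OK':
        raise RuntimeError(f"could not open folder: {data!r}")
    return int(data[0])


def open_folder(user, password, host='imap.gmail.com', folder='Inbox'):
    """Log in, open a folder read-only, and return (connection, message count)."""
    conn = imaplib.IMAP4_SSL(host)
    try:
        conn.login(user, password)  # raises imaplib.IMAP4.error on bad credentials
        count = message_count(conn.select(folder, readonly=True))
    except (imaplib.IMAP4.error, RuntimeError):
        conn.logout()  # close the session before re-raising the error
        raise
    return conn, count
```

With valid credentials, `conn, count = open_folder(user, password)` would give a connection equivalent to `my_mail` plus the number of messages in the folder. Calling `conn.logout()` when finished closes the session cleanly.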