Python Module-3 Notes (21EC646)_final


PYTHON PROGRAMMING

Module -3
Regular Expression (Pattern Matching) & Reading and Writing Files

Syllabus:
Pattern Matching:
• Pattern Matching with Regular Expressions,
• Finding Patterns of Text Without Regular Expressions,
• Finding Patterns of Text with Regular Expressions,
• More Pattern Matching with Regular Expressions,
• The findall() Method,
• Character Classes,
• Making Your Own Character Classes,
• The Caret and Dollar Sign Characters,
• The Wildcard Character,
• Review of Regex Symbols.

Reading and Writing Files:


• Files and File Paths,
• The os.path Module,
• The File Reading/Writing Process,
• Saving Variables with the shelve Module,
• Saving Variables with the pprint.pformat() Function
Python Programming

PATTERN MATCHING

Pattern matching is the process of finding specific text (a pattern) within a string or text file.

There are two main types:


1. Without regular expressions
2. With regular expressions

1. Without regular expressions

Example 1: Searching for a String (word) in given paragraph (text file)

Let's search for the word "Dhoni" in the given paragraph using Python without using regular
expressions. Instead, we'll use basic string tools such as the find() method and the in operator.

Python Program:
# Paragraph
text = """Mahendra Singh Dhoni, commonly known as MS Dhoni, is one of the most
successful captains in the history of Indian cricket. Known for his calm demeanor and
exceptional leadership skills, Dhoni has led India to numerous victories, including the
ICC World T20 in 2007, the ICC Cricket World Cup in 2011, and the ICC Champions
Trophy in 2013."""

# Word to search for


pattern = "Dhoni"

# Check if the word is in the paragraph using the `in` operator
if pattern in text:
    print(f'The word {pattern} is found in the paragraph.')
else:
    print(f'The word {pattern} is not found in the paragraph.')

Note: Here if and in are keywords; text and pattern are variables.

Explanation:

1. text is a variable that holds the text data (a multi-line string).
2. pattern is a variable that holds the string or word to be searched for, "Dhoni".
3. in operator: We use the in operator to check whether the pattern exists in the
text. If it does, the expression evaluates to True and the body of the if
statement is executed; otherwise the else branch runs.
4. Print the result: Based on the outcome of the in operator, the program prints the result.
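
The in operator only reports whether the pattern is present. As a minimal sketch (the sample sentence below is a made-up stand-in for the paragraph), the string methods find() and count() can also report where the word first appears and how many times it occurs:

```python
# Hypothetical sample text; any string would work here.
text = "Dhoni led India to the 2011 World Cup. Dhoni is known for his calm."
pattern = "Dhoni"

first = text.find(pattern)      # index of the first occurrence, or -1 if absent
missing = text.find("Kohli")    # this word is not in the text
count = text.count(pattern)     # number of non-overlapping occurrences

print(first)
print(missing)
print(count)
```

find() returns -1 rather than raising an error when the word is absent, so the result should be compared with -1 before it is used as an index.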

Prof. Sujay Gejji ECE, SGBIT, Belagavi



Example 2: Finding Phone Number from a Given Text (Imp_as per syllabus)

Aim:
To find phone numbers with the format (XXX-XXX-XXXX) within a given text.
The phone number format consists of 12 characters: the first 3 are digits, followed by a hyphen,
then 3 digits, another hyphen, and finally 4 digits.

Program:
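def PhoneNumber(text):
    # A valid number has exactly 12 characters: XXX-XXX-XXXX
    if len(text) != 12:
        return False
    # First three characters must be digits
    for i in range(0, 3):
        if not text[i].isdecimal():
            return False
    # Fourth character must be a hyphen
    if text[3] != '-':
        return False
    # Next three characters must be digits
    for i in range(4, 7):
        if not text[i].isdecimal():
            return False
    # Eighth character must be a hyphen
    if text[7] != '-':
        return False
    # Last four characters must be digits
    for i in range(8, 12):
        if not text[i].isdecimal():
            return False
    return True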

# Function call
text = '415-555-4242'
Z = PhoneNumber(text)
print(Z) # Output: True

text = 'Dhoni-123'
Z = PhoneNumber(text)
print(Z) # Output: False

Explanation
1. Function Definition: PhoneNumber
2. Check if the string length is 12 characters. If not, return False.
3. Verify the first 3 characters are digits. If not, return False.
4. Check if the 4th character is a hyphen ('-'). If not, return False.
5. Verify characters 5 to 7 are digits. If not, return False.
6. Check if the 8th character is a hyphen ('-'). If not, return False.
7. Verify characters 9 to 12 are digits. If not, return False.
8. Output: If all are True, return True.

Finding Phone Numbers in a Larger Text


• Loop Through the String: Iterate through message, check each 12-character chunk.
• Check Each Chunk: If a chunk matches the phone number pattern, print it.
• Completion: Print "Done" after the loop finishes.


message = 'Call me at 415-555-1011 tomorrow. 415-555-9999 is my office.'

for i in range(len(message)):
    chunk = message[i:i+12]   # take the next 12-character chunk
    if PhoneNumber(chunk):
        print(f'Phone number found: {chunk}')
print('Done')

Limitations of Previous Approach:


The PhoneNumber() function is long and only matches one specific pattern.
It doesn't work for other formats like 415.555.4242 or (415) 555-4242 or extensions like
415-555-4242 x99.

Regular Expressions (Regex) (Finding Patterns using Regular Expressions)

Introduction to Regular Expressions (Regex):


o A regular expression (regex) is a sequence of characters that defines a search pattern
in text.
o Regex is a powerful tool to describe text patterns.

Syntax of Regex

1. Characters:

o . : Matches any character except a newline.


o \d : Matches any digit (0-9).
o \D : Matches any non-digit character.
o \w : Matches any word character (alphanumeric plus underscore).
o \W : Matches any non-word character.
o \s : Matches any whitespace character (spaces, tabs, line breaks).
o \S : Matches any non-whitespace character.

2. Anchors:
o ^ : Matches the start of a string.
o $ : Matches the end of a string.

3. Quantifiers:
o * : Matches 0 or more repetitions.
o + : Matches 1 or more repetitions.
o ? : Matches 0 or 1 repetition.
o {n} : Matches exactly n repetitions.
o {n,} : Matches n or more repetitions.
o {n,m} : Matches between n and m repetitions.
4. Character Classes:
o [abc] : Matches any single character a, b, or c.
o [^abc] : Matches any single character except a, b, or c.
o [a-z] : Matches any single character from a to z.


5. Groups and Alternations:


o (abc) : Matches the exact sequence abc.
o a|b : Matches either a or b.

6. Escaping:
o \ : Escape character, used to match special characters literally (e.g., \., \*).

Example: \d matches any digit (0-9).

The regex \d\d\d-\d\d\d-\d\d\d\d matches a phone number pattern like 123-456-7890.

Using Regex in Python:


Steps :
• Import re module.
• Store the string in the variable
• Define the pattern (word to be searched)
• Compile the pattern using re.compile()
• Search for the required pattern in the given text using search method
• Retrieve the found pattern (matched object) using group() method
• Check whether pattern is found or not using if else statement

Program:

import re

#store the string in variable


str= "My phone number is : 234-567-8989"

#define the pattern


pattern=r"\d\d\d-\d\d\d-\d\d\d\d"
#or pattern=r"\d{3}-\d{3}-\d{4}"

#compile the pattern


compl=re.compile(pattern)

#search the pattern using search method


mo=compl.search(str)

#retrieve the matched text from the match object using group()
mo_g = mo.group() if mo else "No match"

#print result
print(mo_g) # Output: 234-567-8989
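
The program above mentions \d{3}-\d{3}-\d{4} as an alternative pattern. A small sketch (the variable names here are illustrative, not from the notes) that checks both spellings find the same text:

```python
import re

text = "My phone number is : 234-567-8989"

# Same pattern written two ways: repeated \d versus the {n} quantifier.
long_form = re.compile(r"\d\d\d-\d\d\d-\d\d\d\d")
short_form = re.compile(r"\d{3}-\d{3}-\d{4}")

mo_long = long_form.search(text)
mo_short = short_form.search(text)

# search() returns None when nothing matches, so guard before calling group().
result_long = mo_long.group() if mo_long else "No match"
result_short = mo_short.group() if mo_short else "No match"

print(result_long)
print(result_short)
```

The {n} form is usually easier to read and to change once a pattern grows beyond a few digits.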


Parenthesis ( ) [Grouping of pattern using Parentheses]

(regex = short form of regular expression)

o Parentheses () are used in a regex to define groups.
o Parentheses let us specify sub-patterns within the larger pattern so that
specific parts of the match can be obtained separately.

1. Creating Groups:
o Putting parentheses () around parts of the regex pattern creates groups.
o Example: (\d\d\d)-(\d\d\d-\d\d\d\d) creates two groups:
(\d\d\d) => group 1 for the area code
(\d\d\d-\d\d\d\d) => group 2 for the main number.

2. Accessing Groups:
o After performing a search with the search() method, use the group() method on the
match object to retrieve specific groups:
▪ mo.group(0) (equivalent to mo.group()) retrieves the entire matched text.
▪ mo.group(1) retrieves the text matched by the first group.
▪ mo.group(2) retrieves the text matched by the second group.

Program:

import re

#store the string in variable


str= "My phone number is : 233-567-8989"

#define the pattern


pattern=r"(\d\d\d)-(\d\d\d-\d\d\d\d)"

#compile the pattern


compl=re.compile(pattern)

#search the pattern using search method


mo=compl.search(str)

#retrieve the groups from the match object using group()
entire_no = mo.group(0) if mo else "no match"
area_code = mo.group(1) if mo else "no match"
main_number = mo.group(2) if mo else "no match"

#print result
print(f"Area code : {area_code}\nMain number : {main_number}\nEntire number : {entire_no}")


Pipe Character (|): Matching Multiple Groups using Pipe


o The | character is used to match one of several alternative expressions in the text or string.
o It works like an OR operation.
o Syntax: r'Ramesh|Mahesh'

It will match either 'Ramesh' or 'Mahesh' in the string.

o If both 'Ramesh' and 'Mahesh' are in the string, whichever occurs first is matched.

import re
Str1='Ramesh Mahesh Suresh Kalmesh'
Str2='Mahesh Ramesh Suresh Kalmesh'

pattern = r'Ramesh|Mahesh'
comp= re.compile(pattern)

mo1 = comp.search(Str1)
mo1_g = mo1.group() if mo1 else "No match"
print(mo1_g) # Output: Ramesh

mo2 = comp.search(Str2)
mo2_g = mo2.group() if mo2 else "No match"
print(mo2_g) # Output: Mahesh

Pipe with Parentheses: Matching Patterns with a Common Prefix


• To match patterns with a common prefix, we have to use parentheses.
• Example:

'Batman', 'Batmobile', 'Batcopter', and 'Batbat'

Pattern= r'Bat(man|mobile|copter|bat)'.

• This way, we can specify the common prefix 'Bat' only once.

Program:

import re

# Store the string with common prefix


Str = 'Batman, Batmobile, Batcopter, and Batbat'

# Define the r pattern with common prefix


pattern = r'Bat(man|mobile|copter|bat)'

# Compile the regex pattern
comp = re.compile(pattern)

# Use the search method to search for the pattern
mo1 = comp.search(Str)

# Retrieve the full matched text and the part inside parentheses
mo1_g = mo1.group() # Full match
mo1_g1 = mo1.group(1) # Part inside parentheses (group 1)

# Print the results
print(mo1_g) # Output: Batman (the first full match)
print(mo1_g1) # Output: man (the part of the match inside parentheses)

Question Mark (?) [Optional Matching (zero or one)]


• The question mark (?) character is used for optional matching in a regex.
• A group is created using parentheses. A group followed by ? is optional.
• The optional group may be absent, or it may appear exactly once; it cannot appear multiple
times.
• Example:
• Pattern: r'Bat(wo)?man'
Here (wo) is the group and it is optional, so the pattern matches both 'Batman' (without 'wo')
and 'Batwoman' (with 'wo').
• Str1 = 'The Adventures of Batman' ; Output: Batman
• Str2 = 'The Adventures of Batwoman' ; Output: Batwoman
• Str3 = 'The Adventures of Batwowoman' ; Output: no match

import re

# Store the string with possible matches


Str1 = 'The Adventures of Batman'
Str2= 'The Adventures of Batwoman'
Str3= 'The Adventures of Batwowoman'

# Define the pattern with optional matching (?)


pattern = r'Bat(wo)?man'

# Compile the regex pattern


comp = re.compile(pattern)

# Use the search method to search for the pattern


mo1 = comp.search(Str1)
mo1_g = mo1.group() if mo1 else "no match"


print(mo1_g) # Output: Batman

mo2 = comp.search(Str2)
mo2_g = mo2.group() if mo2 else "no match"
print(mo2_g) # Output: Batwoman

mo3 = comp.search(Str3)
mo3_g = mo3.group() if mo3 else "no match"
print(mo3_g) # Output: no match

Example -2
import re

# Store the string in a variable


String1 = 'My number is 415-555-4242.'
String2= 'My number is 555-4242.'

# Store the regex pattern in a variable


pattern = r'(\d{3}-)?\d{3}-\d{4}'

# Compile the regex pattern


compl = re.compile(pattern)

# For string 1: use the search method to search for the pattern
mo1 = compl.search(String1)
mo1_g = mo1.group() if mo1 else "no match"
print(mo1_g) # Output: 415-555-4242

# For string 2: the optional area code is absent
mo2 = compl.search(String2)
mo2_g = mo2.group() if mo2 else "no match"
print(mo2_g) # Output: 555-4242

Star(*) (Matching zero or More)


• The star (*) is used to match zero or more instances.
• Here group can appear any number of times, or not at all.

Example: r'Bat(wo)*man' can match:


• 'Batman' (zero instances of 'wo')
• 'Batwoman' (one instance of 'wo')
• 'Batwowowowoman' (multiple instances of 'wo')


import re

# Store the string with possible matches


Str1 = 'The Adventures of Batman'
Str2 = 'The Adventures of Batwoman'
Str3 = 'The Adventures of Batwowowowoman'

# Define the pattern with zero or more matching (*)


pattern = r'Bat(wo)*man'

# Compile the regex pattern


comp = re.compile(pattern)

# Use the search method to search for the pattern
mo1 = comp.search(Str1)
mo1_g = mo1.group() if mo1 else "no match"
print(mo1_g) # Output: Batman

mo2 = comp.search(Str2)
mo2_g = mo2.group() if mo2 else "no match"
print(mo2_g) # Output: Batwoman

mo3 = comp.search(Str3)
mo3_g = mo3.group() if mo3 else "no match"
print(mo3_g) # Output: Batwowowowoman

Difference Between * and ? in Regex


• * (star): Matches zero or more instances of the preceding group.
o Example: r'Bat(wo)*man' can match 'Batman', 'Batwoman', 'Batwowowowoman', etc.

• ? (question mark): Matches zero or one instance of the preceding group.


o Example: r'Bat(wo)?man' can match 'Batman' or 'Batwoman', but not 'Batwowowowoman'.
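
The difference can be checked side by side. This is a small sketch; the helper name matched() is introduced only for this illustration:

```python
import re

star = re.compile(r'Bat(wo)*man')      # zero or more 'wo'
optional = re.compile(r'Bat(wo)?man')  # zero or one 'wo'

def matched(regex, s):
    """Return the matched text, or None when the regex finds nothing."""
    mo = regex.search(s)
    return mo.group() if mo else None

words = ['Batman', 'Batwoman', 'Batwowowoman']
star_results = [matched(star, w) for w in words]
optional_results = [matched(optional, w) for w in words]

print(star_results)      # every word matches
print(optional_results)  # the last entry is None
```

Only the third word separates the two patterns: it repeats 'wo' more than once, which * accepts and ? rejects.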

Plus (+) (Matching One or More)


• + (plus): Matches one or more instances of the preceding group.
• The group having a plus must appear at least once.
• If we need to match an actual plus sign character, then prefix the plus sign with a backslash to escape it:
\+.

import re

# Store the string with possible matches


Str1 = 'The Adventures of Batman'
Str2 = 'The Adventures of Batwoman'
Str3 = 'The Adventures of Batwowowowoman'

# Define the pattern with (+)


pattern = r'Bat(wo)+man'

# Compile the regex pattern
comp = re.compile(pattern)

# Use the search method to search for the pattern
mo1 = comp.search(Str1)
mo1_g = mo1.group() if mo1 else 'No match'
print(mo1_g) # Output: No match

mo2 = comp.search(Str2)
mo2_g = mo2.group() if mo2 else 'No match'
print(mo2_g) # Output: Batwoman

mo3 = comp.search(Str3)
mo3_g = mo3.group() if mo3 else 'No match'
print(mo3_g) # Output: Batwowowowoman

Matching Specific Repetitions with Curly Brackets


1. Curly Brackets for Specific Repetitions:
o To repeat a group a specific number of times, use curly brackets {}.
o Example: (Ha){3} matches exactly three repetitions of 'Ha'.

2. Curly Brackets with Ranges:


o Range with a minimum and maximum:
▪ Example: (Ha){3,5} ; matches 3 to 5 repetitions of 'Ha'.
o Omitting the first number:
▪ (Ha){,5} matches zero to 5 repetitions.
o Omitting the second number:
▪ (Ha){3,} matches 3 or more repetitions.

3. Equivalent Patterns:
o (Ha){3} is the same as (Ha)(Ha)(Ha).
o (Ha){3,5} is the same as ((Ha)(Ha)(Ha))|((Ha)(Ha)(Ha)(Ha))|((Ha)(Ha)(Ha)(Ha)(Ha)).
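
The first equivalence can be checked with a short sketch (the sample strings are made up for illustration). Both patterns find the same text, although the written-out form creates three groups instead of one:

```python
import re

short_form = re.compile(r'(Ha){3}')
long_form = re.compile(r'(Ha)(Ha)(Ha)')

samples = ['HaHaHa', 'HaHa', 'xHaHaHaHa']
same = []
for s in samples:
    a = short_form.search(s)
    b = long_form.search(s)
    text_a = a.group() if a else None
    text_b = b.group() if b else None
    same.append(text_a == text_b)   # compare only the overall matched text
    print(s, text_a, text_b)
```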

import re

# Example strings

Str = 'HaHaHaHaHaHa'

# Define the regex pattern with specific repetitions


pattern_specific = r'(Ha){3}'
pattern_range = r'(Ha){3,5}'
pattern_no_1st = r'(Ha){,5}'
pattern_no_last = r'(Ha){3,}'

# Compile the pattern


comp_specific = re.compile(pattern_specific)
comp_range = re.compile(pattern_range)
comp_no_first= re.compile(pattern_no_1st)
comp_no_last= re.compile(pattern_no_last)


# Search (Str contains six repetitions of 'Ha')
mo1 = comp_specific.search(Str)
mo1_g = mo1.group()
print(mo1_g) # Output: HaHaHa

mo2 = comp_range.search(Str)
mo2_g = mo2.group()
print(mo2_g) # Output: HaHaHaHaHa

mo3 = comp_no_first.search(Str)
mo3_g = mo3.group()
print(mo3_g) # Output: HaHaHaHaHa

mo4 = comp_no_last.search(Str)
mo4_g = mo4.group()
print(mo4_g) # Output: HaHaHaHaHaHa

Greedy and Nongreedy Matching


1. Greedy Matching:
o By default, Python's regular expressions are greedy.
o They match the longest string possible in ambiguous situations.
o Example: for the string 'HaHaHaHaHaHaHa', the pattern (Ha){3,5} matches
'HaHaHaHaHa' (five repetitions), even though three repetitions would also satisfy it.

2. Nongreedy Matching:
o We can use the ? after the curly bracket to make the regex nongreedy.
o It matches the shortest string possible.
o Example: (Ha){3,5}? Here it matches 'HaHaHa'

3. Two Meanings of ?:
o Represents a nongreedy match when used after a quantifier.
o Represents an optional group when used directly after a group.

import re

str='HaHaHaHaHa'

# Define the greedy regex pattern


greedy_pattern = r'(Ha){3,5}'

# Compile the greedy regex pattern


greedy= re.compile(greedy_pattern)

# Use the search method to search for the pattern


mo1 = greedy.search(str)
mo1_g = mo1.group()
print(mo1_g) # Output: HaHaHaHaHa


# Define the nongreedy regex pattern


nongreedy_pattern = r'(Ha){3,5}?'

# Compile the nongreedy regex pattern


nongreedy = re.compile(nongreedy_pattern)

# Use the search method to search for the pattern


mo2 = nongreedy.search(str)
mo2_g = mo2.group()
print(mo2_g) # Output: HaHaHa

The findall() Method

• The findall() method returns all the matches found in a string.
• The search() method returns only the first match out of multiple matches, but findall()
returns all matches.

Key Points:
1. Without Groups:
▪ If the regex pattern has no groups, findall() returns a list of strings.
▪ The list contains the matched strings from the text.

2. With Groups:
o If the pattern contains two or more groups (denoted by parentheses), findall() returns a list of tuples.
o Each tuple contains the matched strings for each group.
o With exactly one group, findall() returns a list of the strings matched by that group.

Without Groups
import re

# Store the string in a variable


Str = 'Cell: 415-555-9999 Work: 212-555-0000'

# Store the regex pattern in a variable


pattern = r'\d\d\d-\d\d\d-\d\d\d\d'

# Define a regex pattern using the pattern variable


comp = re.compile(pattern)

# Use search method to find the first match


mo = comp.search(Str)
mo_g = mo.group()
print(mo_g) # Output: 415-555-9999

# Use findall method to find all matches


matches = comp.findall(Str)
print(matches) # Output: ['415-555-9999', '212-555-0000']


With Groups:

import re

# Store the string in a variable


Str = 'Cell: 415-555-9999 Work: 212-555-0000'

# Store the regex pattern in a variable


pattern = r'(\d\d\d)-(\d\d\d)-(\d\d\d\d)'

# Define a regex pattern using the pattern variable


comp = re.compile(pattern)

# Use findall method to find all matches


matches_with_groups = comp.findall(Str)
print(matches_with_groups)

# Output: [('415', '555', '9999'), ('212', '555', '0000')]

Character Classes
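• Character classes simplify regular expressions by using shorthand for common groups of
characters.
• Shorthand codes for common character classes:
\d : any digit from 0 to 9
\D : any character that is not a digit
\w : any letter, digit, or the underscore ("word" characters)
\W : any character that is not a letter, digit, or underscore
\s : any space, tab, or newline ("whitespace" characters)
\S : any character that is not a space, tab, or newline

• Character classes make regular expressions more concise. For example, [0-5] matches only
the digits 0 to 5, which is shorter than typing (0|1|2|3|4|5).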


Example:

To use character classes to find all instances of a number followed by a word in a string:

import re

# Define the string to search


Str = ('12 drummers, 11 pipers, 10 lords, 9 ladies, 8 maids, 7 swans, '
       '6 geese, 5 rings, 4 birds, 3 hens, 2 doves, 1 partridge')

# Define the regex pattern using shorthand character classes


pattern = r'\d+\s\w+'

# Compile the regex pattern


comp = re.compile(pattern)

# Use findall method to find all matches


matches = comp.findall(Str)

# Print the results


print(matches)
# Output: ['12 drummers', '11 pipers', '10 lords', '9 ladies', '8 maids', '7 swans',
# '6 geese', '5 rings', '4 birds', '3 hens', '2 doves', '1 partridge']

In this example, the regular expression \d+\s\w+ matches text with one or more numeric digits
(\d+), followed by a whitespace character (\s), followed by one or more word characters (\w+). The
findall() method returns all matching strings of the regex pattern in a list.

Creating our own character classes


• Basic Character Class:
• To define a character class, enclose the characters inside square brackets.
• For example, [aeiouAEIOU] matches any vowel, both lowercase and uppercase.

• Including Ranges:
• We can include ranges of characters using a hyphen.
• For example, [a-zA-Z0-9] matches all lowercase letters, uppercase letters, and digits.

• Special Characters in Character Classes:
• Inside square brackets, special characters like ., *, ?, or () are treated as literal characters.
• We do not need to escape them with a backslash.
• For example, [0-5.] matches the digits 0 to 5 and a period; writing [0-5\.] is not required.

• Negative Character Class:
• Placing a caret ^ just after the opening bracket of a character class creates a negative
character class.
• This matches all characters that are not in the defined character class.


• For example, [^aeiouAEIOU] matches every character that isn't a vowel.

Program:

import re

# Store the string in a variable


text = 'India won the twenty twenty world cup'

# Define pattern
Pattern1 = r'[aeiouAEIOU]'
Pattern2 = r'[a-zA-Z0-9]'

# Compile the pattern


Comp1= re.compile(Pattern1)
Comp2= re.compile(Pattern2)

#for pattern1
Mo1 = Comp1.findall(text)
print(Mo1) # Output: ['I', 'i', 'a', 'o', 'e', 'e', 'e', 'o', 'u']

#for pattern2
Mo2 = Comp2.findall(text)
print(Mo2)
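# Output: ['I', 'n', 'd', 'i', 'a', 'w', 'o', 'n', 't', 'h', 'e', 't', 'w', 'e', 'n', 't', 'y',
# 't', 'w', 'e', 'n', 't', 'y', 'w', 'o', 'r', 'l', 'd', 'c', 'u', 'p'] (spaces are not matched)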
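
The program above covers the vowel class and the range class. A brief sketch of the two remaining cases, the negative class and an unescaped period inside brackets (the sample strings are made up for illustration):

```python
import re

# Negative class: everything that is NOT a vowel, including spaces and digits.
non_vowels = re.findall(r'[^aeiouAEIOU]', 'cup 20')

# Inside brackets the period is literal, so no backslash is needed.
digits_and_dots = re.findall(r'[0-5.]', 'v3.14 and 9.5')

print(non_vowels)
print(digits_and_dots)
```

Note that the negative class also matches the space and the digits, since it accepts any character that is not listed.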

Caret (^) symbols

• It is used at the beginning of a regex pattern.


• It indicates that the match must occur at the start of the text.
• Example: r'^Hello' matches strings that start with 'Hello'.

import re
Str1='Hello world!'
Str2='He said hello.'

pattern = r'^Hello'
comp = re.compile(pattern)

# text starting with the pattern
mo1 = comp.search(Str1)
mo1_g = mo1.group() if mo1 else 'No match'
print(mo1_g) # Output: Hello

# text not starting with the pattern
mo2 = comp.search(Str2)
mo2_g = mo2.group() if mo2 else 'No match'
print(mo2_g) # Output: No match


Dollar ($) Sign Symbol:

• It is used at the end of a regex pattern.


• It indicates that the match must occur at the end of the text.
• Example: r'Hello$' matches strings or text that ends with 'Hello'.

import re
Str1='Hello world!'
Str2='He said Hello'

pattern = r'Hello$'
comp = re.compile(pattern)

mo1 = comp.search(Str1)
mo1_g = mo1.group() if mo1 else 'None'
print(mo1_g) # Output: None ('Hello world!' does not end with 'Hello')

mo2 = comp.search(Str2)
mo2_g = mo2.group() if mo2 else 'None'
print(mo2_g) # Output: Hello

Combining Caret (^) and Dollar ($) :

• Used together, they indicate that the entire string must match the pattern.
• Example: r'^\d+$' matches strings that consist entirely of numeric characters.

import re

Str1='1234567890'
Str2='12345xyz67890'

pattern = r'^\d+$'
comp = re.compile(pattern)

mo1 = comp.search(Str1)
mo1_g = mo1.group() if mo1 else 'No match'
print(mo1_g) # Output: 1234567890

mo2 = comp.search(Str2)
mo2_g = mo2.group() if mo2 else 'No match'
print(mo2_g) # Output: No match

▪ These symbols are essential for defining precise patterns in text matching using regular
expressions.
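
One common use of the two anchors together is input validation. The sketch below wraps the ^\d+$ pattern in a helper; the function name is_all_digits is hypothetical and used only for this illustration:

```python
import re

digits_only = re.compile(r'^\d+$')

def is_all_digits(s):
    """Return True when the whole string consists of one or more digits."""
    return digits_only.search(s) is not None

checks = [is_all_digits('1234567890'),
          is_all_digits('12345xyz67890'),
          is_all_digits('')]
print(checks)
```

The empty string is rejected because + requires at least one digit.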


The Wildcard Character (.) [DOT character ]

The dot (.) is a wildcard that matches any character except a newline.

import re

# Store the string in a variable


text = 'The cat in the hat sat on the flat mat.'

# Define pattern to match any character followed by 'at'


pattern = r'.at'

# Compile the pattern


comp = re.compile(pattern)

# Use findall method to find all matches


mo = comp.findall(text)
print(mo) # Output: ['cat', 'hat', 'sat', 'lat', 'mat']

Dot-Star (.*) [Matches Everything]


▪ The dot character means "any single character except the newline," and the star character means
"zero or more of the preceding character."
▪ Together, (.*) matches zero or more of any character except a newline.
▪ It is used to match any and all text.

Example:
import re

# Store the string in a variable


text = 'First Name: Virat Last Name: Kohli'

# Define the pattern with two dot-star groups
pattern = r'First Name: (.*) Last Name: (.*)'

# Compile the pattern


comp = re.compile(pattern)

# Use search method to find the match


mo = comp.search(text)
mo_g1 = mo.group(1) if mo else 'No match'
mo_g2 = mo.group(2) if mo else 'No match'

print(mo_g1) # Output: 'Virat'


print(mo_g2) # Output: 'Kohli'
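
Dot-star is greedy like the other quantifiers, so it grabs as much text as it can. A short sketch (the sample string is illustrative) comparing it with the nongreedy form .*? described earlier:

```python
import re

text = '<To serve man> for dinner.>'

# Greedy: runs to the LAST '>' in the string.
greedy = re.search(r'<.*>', text).group()

# Nongreedy: stops at the FIRST '>' that completes the match.
nongreedy = re.search(r'<.*?>', text).group()

print(greedy)
print(nongreedy)
```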


Matching Newlines with the Dot Character [re.DOTALL]

The dot-star will match everything except a newline. By passing re.DOTALL as the second
argument to re.compile(), we can make the dot character to match all characters, including the
newline character

Program
import re

# Store the string in a variable


text = 'Serve the public trust.\nProtect the innocent.\nUphold the law.'

# Define pattern (does not match newlines)


pattern1 = r'.*'

# Compile the pattern


comp1 = re.compile(pattern1)

# Use search method to find the match


mo1 = comp1.search(text)
mo_g= mo1.group() if mo1 else 'No match'
print(mo_g) # Output: 'Serve the public trust.'

# Define pattern with re.DOTALL (matches newlines)


pattern2 = r'.*'

# Compile the pattern with re.DOTALL


comp2 = re.compile(pattern2, re.DOTALL)

# Use search method to find the match


mo2 = comp2.search(text)
mo2_g = mo2.group() if mo2 else 'No match'
print(mo2_g) # Output: 'Serve the public trust.\nProtect the innocent.\nUphold the law.'

Case-Insensitive Matching (re.IGNORECASE or re.I)


o Regular expressions normally match text exactly as specified in terms of casing.
o To match text regardless of uppercase or lowercase letters, use re.IGNORECASE or re.I as
the second argument to re.compile().

Program:
import re

# Define text with different casings


text1 = 'RoboCop is part man, part machine, all cop.'
text2 = 'ROBOCOP protects the innocent.'
text3 = 'Al, why does your programming book talk about robocop so much?'

# Define pattern for case-insensitive matching


pattern = r'robocop'


# Compile the pattern with re.IGNORECASE


robocop = re.compile(pattern, re.IGNORECASE)

# Perform searches and print results


mo1 = robocop.search(text1)
mo2 = robocop.search(text2)
mo3 = robocop.search(text3)

result1 = mo1.group() if mo1 else 'No match'


result2 = mo2.group() if mo2 else 'No match'
result3 = mo3.group() if mo3 else 'No match'

print(result1) # Output: 'RoboCop'


print(result2) # Output: 'ROBOCOP'
print(result3) # Output: 'robocop'

sub() Method [To replace obtained pattern with new string]


The sub() method is used to find text patterns and replace them with new text.
The sub() method takes two arguments:
▪ The replacement string.
▪ The string or text in which the pattern is searched.
It returns a new string with the replacements applied.

1. Example:
o Replace all instances of "Agent [name]" with "CENSORED"

import re

# text or string
text = 'Agent Virat gave the secret documents to Agent Dhoni.'

# Define the pattern


pattern = r'Agent \w+'

comp = re.compile(pattern)

# Use the sub() method to replace matched patterns


result = comp.sub('CENSORED', text)

print(result)
# Output: 'CENSORED gave the secret documents to CENSORED.'


Using Matched Text in Substitution:


• We can use the matched text in the replacement by using \1, \2, \3, etc., to refer to groups in
the pattern.

import re

# text or string
text = 'Agent Virat gave the secret documents to Agent Dhoni.'

# Define the pattern


pattern = r'Agent (\w)\w*'

comp = re.compile(pattern)

# Use the sub() method to replace matched patterns;
# \1 stands for the text of group 1 (the first letter of the name)
result = comp.sub(r'\1****', text)

print(result)
# Output: V**** gave the secret documents to D****.

▪ The program above censors the agent names, showing only the first letter and replacing the rest with asterisks.

Managing Complex Regexes [re.VERBOSE]

▪ Regular expressions can get complicated when dealing with complex text patterns.
▪ To make them more readable, we can use "verbose mode" with re.VERBOSE, which allows
for whitespace and comments inside the regex string.

Benefits of Verbose Mode:


• Readability: Spread the regex over multiple lines.
• Comments: Add comments to explain parts of the regex.
• Whitespace Ignored: Spaces and newlines within the regex are ignored.

Program:
import re
# String or Text
text = 'Call me at (123) 456-7890 or 123-456-7890 ext. 1234.'

# Define the regex pattern with comments and spread over multiple lines
pattern = r'''(
    (\d{3} | \(\d{3}\))?               # area code
    (\s | - | \.)?                     # separator
    \d{3}                              # first 3 digits
    (\s | - | \.)                      # separator
    \d{4}                              # last 4 digits
    (\s*(ext | x | ext\.)\s*\d{2,5})?  # extension
)'''


# Compile the pattern with re.VERBOSE


comp = re.compile(pattern, re.VERBOSE)

# Find matches in the text


matches = comp.findall(text)

# Print matches (each match is a tuple of groups; index 0 is the full phone number)
for match in matches:
    print(match)

Combining re.IGNORECASE, re.DOTALL, and re.VERBOSE


▪ We can use re.VERBOSE for comments and readability and re.IGNORECASE for case
insensitivity at the same time by combining the flags with the bitwise OR operator (|).
▪ This allows us to include multiple options in the re.compile() function.

Steps to Combine Flags:

1. Combine Flags: Use the bitwise OR operator (|) to combine re.IGNORECASE, re.DOTALL, and
re.VERBOSE.
2. Compile with Combined Flags: Pass the combined flags as the second argument to
re.compile().

Program

import re

# Text to search
text = "Hi\nHI\nHi"

# Define a pattern with verbose mode for readability


pattern = r'''
Hi # match 'Hi'
'''

# Compile the pattern with re.IGNORECASE, re.DOTALL, and re.VERBOSE


comp = re.compile(pattern, re.IGNORECASE | re.DOTALL | re.VERBOSE)

# Find all matches in the text


matches = comp.findall(text)

# Print matches
print(matches) # Output: ['Hi', 'HI', 'Hi']


Project: Phone Number and Email Address Extractor


Aim: To Automatically find and extract all phone numbers and email addresses from text on the
clipboard.
1. To Copy text from the clipboard.
2. To use regular expressions to identify phone numbers and email addresses from text.
3. To Paste the found information back to the clipboard.

Program:
import pyperclip
import re

# Define the pattern for phone numbers


phone_pattern = r'''(
    (\d{3}|\(\d{3}\))?               # area code (optional)
    (\s|-|\.)?                       # separator (optional)
    (\d{3})                          # first 3 digits
    (\s|-|\.)                        # separator
    (\d{4})                          # last 4 digits
    (\s*(ext|x|ext\.)\s*(\d{2,5}))?  # extension (optional)
)'''

# Define the pattern for email addresses
email_pattern = r'''(
    [a-zA-Z0-9._%+-]+                # username
    @                                # @ symbol
    [a-zA-Z0-9.-]+                   # domain name
    (\.[a-zA-Z]{2,4})                # dot-something
)'''

# Compile the regex patterns
phone_comp = re.compile(phone_pattern, re.VERBOSE)
email_comp = re.compile(email_pattern, re.VERBOSE)

# Get the text from the clipboard
text = str(pyperclip.paste())

# Find all matches for phone numbers and email addresses
matches = []

# findall() returns one tuple per match: index 0 is the whole match,
# 1 = area code, 3 = first 3 digits, 5 = last 4 digits, 8 = extension digits
for groups in phone_comp.findall(text):
    phone_num = '-'.join([groups[1], groups[3], groups[5]])
    if groups[8] != '':
        phone_num += ' x' + groups[8]
    matches.append(phone_num)

for groups in email_comp.findall(text):
    matches.append(groups[0])

# Join the matches into a single string and copy it to the clipboard
if matches:
    pyperclip.copy('\n'.join(matches))
    print('Copied to clipboard:')
    print('\n'.join(matches))
else:
    print('No phone numbers or email addresses found.')


Reading and Writing Files


▪ Variables can be used to store data/results. But if we want to retain our data/results even after our
program has finished, we need to save them to a file.
▪ We can think of a file's contents as a single string value.
▪ Let us see how to use Python to create, read, and save files on the hard drive.

Files and File Paths


• Filename and Path:
o A file has a filename (e.g., project.docx) and a path (e.g., C:\Users\asweigart\Documents).
o The extension (e.g., .docx) tells the file type.

• Path Structure:
o Folders (or directories) can contain files and other folders.
o Example: for C:\Users\asweigart\Documents\project.docx, the file project.docx is in Documents, which is in asweigart, which is in Users.
o The root folder: on Windows it is C:\ (the C: drive); on Linux and OS X it is /.
o Path Separators: Windows uses backslashes (\), while OS X and Linux use forward slashes (/).

os.path.join()
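os.path.join() builds a path string from individual folder and file names, using the correct
separator for the operating system on which the program runs.

• Creating Paths:
os.path.join('usr', 'bin', 'spam')
▪ It returns usr\bin\spam on Windows.
▪ It returns usr/bin/spam on OS X/Linux.
▪ Useful for constructing file paths programmatically.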
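
As a small illustration of building paths programmatically, the sketch below joins several filenames onto one folder. The filenames are hypothetical and chosen only for the example.

```python
import os

# Hypothetical filenames, used only for illustration.
file_names = ['accounts.txt', 'details.csv', 'invite.docx']

# Join each filename onto the same folder using this system's separator.
folder = os.path.join('usr', 'bin')
full_paths = [os.path.join(folder, name) for name in file_names]

for path in full_paths:
    print(path)
```

The same script prints backslash-separated paths on Windows and forward-slash paths on OS X/Linux, without any change to the code.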

• Current Working Directory (CWD):


o Each program has a CWD, which is the base path for relative paths.
os.getcwd() : It returns the CWD as a string.
os.chdir() : It changes the CWD to the given path.
▪ Example: os.chdir('C:\\Windows\\System32') changes the CWD to C:\Windows\System32.
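
The effect of changing the CWD can be seen in a short sketch. It assumes nothing about the folders on your machine: a temporary folder is created with the standard tempfile module, and the name notes.txt is hypothetical.

```python
import os
import tempfile

original_cwd = os.getcwd()            # remember where we started

with tempfile.TemporaryDirectory() as temp_dir:
    os.chdir(temp_dir)                # make the temporary folder the CWD
    changed_cwd = os.getcwd()
    # A relative name is now resolved against the new CWD.
    resolved = os.path.abspath('notes.txt')
    os.chdir(original_cwd)            # go back before the folder is deleted

print(changed_cwd)
print(resolved)
```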

Absolute vs. Relative Paths:


There are two ways to specify a file path.
• Absolute path: It always begins with the root folder.
• Relative path: It is relative to the program's current working directory.

• Special Path Names:


o . refers to the current directory.
o .. refers to the parent directory.

• Creating New Folders with os.makedirs()

o It creates a new folder hierarchy: any intermediate folders that do not yet exist are created as well.
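
A minimal sketch of os.makedirs() is given below. The folder names are hypothetical, and the hierarchy is created inside a temporary folder so that nothing is left behind on disk.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    # 'delicious', 'walnut' and 'waffles' are hypothetical folder names.
    target = os.path.join(base, 'delicious', 'walnut', 'waffles')
    os.makedirs(target)                    # creates all three levels at once
    created = os.path.isdir(target)

    # exist_ok=True avoids FileExistsError when the folder already exists.
    os.makedirs(target, exist_ok=True)

print(created)
```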


The os.path Module

Overview:
o The os.path module is part of Python's os module.
o It offers functions for working with file paths and filenames.
o It ensures compatibility across different operating systems.
o Importing the module: import os makes os.path available.

Handling Absolute and Relative Paths:


• The os.path module provides functions to obtain the absolute path of a relative path and to
check whether a given path is an absolute path.

os.path.abspath(path):
o It converts a relative path to an absolute path.
▪ Example:
import os
os.path.abspath('.')
Output: 'C:\\Python34' (when the current working directory is C:\Python34).

os.path.isabs(path):
• It checks whether the given path is absolute.
• It returns True if the given path is an absolute path and False if it is a relative path.
▪ Example:
os.path.isabs('.')
Output: False.

Relative Path Conversion:


o os.path.relpath(path, start):
▪ It gives a relative path from start to path.
▪ Example: os.path.relpath('C:\\Windows', 'C:\\')
▪ Output: 'Windows'.
▪ Useful for navigating between directories.
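
The three functions can be tried together in a way that does not depend on the operating system. In this sketch the subfolder names data and raw are hypothetical; the folders do not need to exist, because these functions only work on the path strings.

```python
import os

here = os.path.abspath('.')        # absolute path of the CWD
print(os.path.isabs('.'))          # '.' is relative
print(os.path.isabs(here))         # the converted path is absolute

# Relative path from the CWD to a hypothetical subfolder data/raw.
target = os.path.join(here, 'data', 'raw')
rel = os.path.relpath(target, here)
print(rel)
```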

Extracting Directory and File Names:


o os.path.basename(path):
▪ It returns the last component of the path (filename).
▪ Example: os.path.basename('C:\\Windows\\System32\\calc.exe')
▪ Output: 'calc.exe'.

o os.path.dirname(path):
▪ It returns everything before the last component of the path (directory).
▪ Example: os.path.dirname('C:\\Windows\\System32\\calc.exe')
▪ Output: 'C:\\Windows\\System32'.

o os.path.split(path):
▪ It returns both the directory and filename as a tuple.
▪ Example: os.path.split('C:\\Windows\\System32\\calc.exe')
▪ Output: ('C:\\Windows\\System32', 'calc.exe').


Splitting Path Components:


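o path.split(os.path.sep):
▪ Here path is a string, and split() is the ordinary string method. It splits the path at every separator and creates a list of path components.
▪ Example: 'C:\\Windows\\System32\\calc.exe'.split(os.path.sep)
▪ Output (on Windows, where os.path.sep is '\\'): ['C:', 'Windows', 'System32', 'calc.exe'].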
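
The Windows-style examples above give the shown results only when run on Windows. To reproduce them on any operating system, one option (an assumption of this sketch, not part of the notes above) is the standard ntpath module, which is the Windows version of os.path.

```python
import ntpath   # the Windows flavour of os.path, importable on any OS

calc_path = 'C:\\Windows\\System32\\calc.exe'

base_name = ntpath.basename(calc_path)      # last component
dir_name = ntpath.dirname(calc_path)        # everything before it
pair = ntpath.split(calc_path)              # (dir_name, base_name)
parts = calc_path.split(ntpath.sep)         # every component of the path

print(base_name, dir_name, pair, parts, sep='\n')
```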

File Sizes and Folder Contents:


o os.path.getsize(path):
▪ It returns the size of the file in bytes.
▪ Example: os.path.getsize('C:\\Windows\\System32\\calc.exe')
▪ Output: 776192.

o os.listdir(path):
▪ It returns a list of filenames in the directory specified by path.
▪ Example: os.listdir('C:\\Windows\\System32')
▪ Output: list of filenames in that directory.

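To find the total size of all the files in a directory, use os.path.getsize() and os.listdir()
together: loop over the names returned by os.listdir(), join each name to the directory path
with os.path.join(), and add up the sizes returned by os.path.getsize().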
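
One possible way to combine the two functions is sketched below. The function name total_file_size and the two small files are made up for the demonstration, which runs inside a temporary folder.

```python
import os
import tempfile

def total_file_size(folder):
    """Return the combined size, in bytes, of the files directly in folder."""
    total = 0
    for name in os.listdir(folder):
        full_path = os.path.join(folder, name)   # listdir() gives names only
        if os.path.isfile(full_path):            # skip subfolders
            total += os.path.getsize(full_path)
    return total

# Demonstration on a temporary folder holding two small files.
with tempfile.TemporaryDirectory() as folder:
    with open(os.path.join(folder, 'a.txt'), 'w') as f:
        f.write('hello')                         # 5 bytes
    with open(os.path.join(folder, 'b.txt'), 'w') as f:
        f.write('python!')                       # 7 bytes
    size = total_file_size(folder)

print(size)
```

Note that os.listdir() returns bare filenames, so each name has to be joined to the folder path before it is passed to os.path.getsize().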

Checking Path Validity:


o os.path.exists(path):
▪ It checks if the path exists.
▪ Example: os.path.exists('C:\\Windows')
▪ Output: True.

o os.path.isfile(path):
▪ It checks if the path is a file.
▪ Example: os.path.isfile('C:\\Windows\\System32\\calc.exe')
▪ Output: True.

o os.path.isdir(path):
▪ It checks if the path is a directory.
▪ Example: os.path.isdir('C:\\Windows\\System32')
▪ Output: True.
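
The three checks can also be tried on paths that are guaranteed to exist (or not exist) by creating them in a temporary folder. The filenames in this sketch are hypothetical.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    file_path = os.path.join(folder, 'report.txt')      # hypothetical name
    with open(file_path, 'w') as f:
        f.write('data')
    missing = os.path.join(folder, 'no_such_file.txt')  # never created

    checks = {
        'folder is dir': os.path.isdir(folder),
        'file is file': os.path.isfile(file_path),
        'file is dir': os.path.isdir(file_path),
        'missing exists': os.path.exists(missing),
    }

for label, result in checks.items():
    print(label, result)
```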

Reading and Writing Files in Python


File Operations
A file operation takes place in the following order:
o Open the file
o Read or write (perform the operation)
o Close the file

Opening Files with the open() function


• The built-in open() function is used to open a file.
• The syntax of the open() function:
f1 = open("filename", "mode")

Here:
filename: Name of the file to be opened. The filename can be given with or without its
path. The path is optional if the file is in the current working directory.

f1: When we open a file using the open() function, Python returns a file object. This file
object is stored in the variable f1, which is also called the file handle. A file object
represents a file on our computer; it is simply another type of value in Python, much like
a list or a dictionary. The file object can then be used to read the contents of the file
or to perform other operations on it.

mode: The purpose for which the file is opened (reading, writing, etc.).

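The modes used in these notes:
"r" - Read - opens the file for reading
"w" - Write - overwrites any existing content
"a" - Append - adds to the end of the file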

Example: Opening a file:

# If the file is in the current folder (current working directory)
f = open("Virat.txt", "r")    # read mode

# Specifying the full path (if the file is not in the current directory)
# The r prefix makes a raw string, so the backslashes are kept as written.
f = open(r"D:\sujay\Virat.txt", "r")

Here Virat.txt is a text file.


Reading Files
• Consider a text file named "myfile.txt", written in Notepad and stored in the current directory.

Reading a file using the read() function.


read()
This function reads the entire content of the file (if no argument is given).
# Open file in current folder
f1 = open("myfile.txt", "r")
d1 = f1.read()
print(d1)
f1.close()

Output: the complete text of myfile.txt.

• In this case, the entire file is read at once and stored as a single string in the variable d1.
So, this method is suitable when the file is small.

Reading a file using the readlines() function.


readlines()
# Open file in current folder
f1 = open("myfile.txt", "r")
d1 = f1.readlines()
print(d1)
f1.close()

Output:
['Hello python \n', 'Welcome python\n', 'Hello India \n', 'How are you \n']


Note: readlines() method returns a list of string values from the file, one string for each line of text
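
The difference between read() and readlines() can be checked with the sketch below. It also uses the with statement, which is a common alternative to calling close() yourself: the file is closed automatically when the block ends. The file contents are a made-up stand-in for myfile.txt, written to a temporary folder.

```python
import os
import tempfile

# Build a small stand-in for myfile.txt inside a temporary folder.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'myfile.txt')
    with open(path, 'w') as f:
        f.write('Hello python\nWelcome python\nHello India\n')

    with open(path, 'r') as f:
        whole_text = f.read()          # the entire file as one string

    with open(path, 'r') as f:
        lines = f.readlines()          # list of strings, one per line

    with open(path, 'r') as f:
        # Looping over the file object reads it one line at a time.
        stripped = [line.rstrip('\n') for line in f]

print(lines)
print(stripped)
```

Reading line by line in a loop is useful for large files, because the whole file does not have to be held in memory at once.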

Writing Files
• We can use the write() method to write data into a file.
• write() can be used in two modes: mode "w" or mode "a".
"a" - Append - adds to the end of the file
"w" - Write - overwrites any existing content
• In both modes, if the file does not exist, a new file (with the given name) is created.
Example 1: To append
Append mode adds text to the end of the existing file. Nothing is overwritten.
# Open the file "myfile.txt" and append content to the file:
f = open("myfile.txt", "a")
f.write("Hello Virat \n")
f.write("Hello Dhoni")
f.close()

# Now open and read the file after the appending:
f = open("myfile.txt", "r")
print(f.read())
f.close()

• This data is written at the end of the already existing content.
• First, "Hello Virat " is written to the file. Because of the \n character, the cursor
moves to the next line.
• So "Hello Dhoni" is written on the next line.
• Without the \n character, the two strings would be written one after the other on the same line.

Example 2: To write
Write mode overwrites the already existing data.
# Open the file "myfile.txt" and overwrite the content:
f = open("myfile.txt", "w")
f.write("Hello Pandya")
f.close()
# Open and read the file after the writing:
f = open("myfile.txt", "r")
print(f.read())
f.close()
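
The two modes can be compared side by side in one short sketch. It works on a file in a temporary folder, so an existing myfile.txt is not touched, and it uses the with statement so that each file is closed automatically.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'myfile.txt')

    with open(path, 'w') as f:          # file does not exist yet: created
        f.write('Hello Pandya\n')

    with open(path, 'a') as f:          # append: old content is kept
        f.write('Hello Virat\n')
    with open(path, 'r') as f:
        after_append = f.read()

    with open(path, 'w') as f:          # write: old content is discarded
        f.write('Hello Dhoni\n')
    with open(path, 'r') as f:
        after_write = f.read()

print(repr(after_append))
print(repr(after_write))
```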


Saving Variables with the shelve Module (In binary files)


• We can save variables of Python programs in binary shelf files using the shelve module.
• Binary shelf files are used to store data in a binary format.
• Example: After running a program and configuring some settings, we can save these
settings into a binary shelf file. Later, when we run the program again, we can load the
settings from the file instead of configuring them again.
• The shelve module allows us to implement Save and Open features in our programs.

Open shelf file and store data


import shelve

# Open a shelf file


shelf_file = shelve.open('mydata')

# Store a list in the shelf file, with key ‘cats’


cats = ['Zophie', 'Pooka', 'Simon']
shelf_file['cats'] = cats

# Close the shelf file.


shelf_file.close()

To retrieve data
shelf_file = shelve.open('mydata')
retrieved_cats = shelf_file['cats'] # Retrieve the list using the key 'cats'
print(retrieved_cats) # Output: ['Zophie', 'Pooka', 'Simon']
shelf_file.close()

To convert to list
• Just like dictionaries, shelf values have keys() and values() methods.
• These methods return list-like values, not true lists.
• To get true lists, pass the returned values to the list() function.

# Reopen the shelf file (it was closed after retrieving the data)
shelf_file = shelve.open('mydata')

# Get keys and values, convert to lists
keys_list = list(shelf_file.keys())
values_list = list(shelf_file.values())

print("Keys:", keys_list)
print("Values:", values_list)

# Close the shelf file
shelf_file.close()
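
The save and load steps can also be written with the with statement, which closes the shelf automatically. The sketch below is one possible arrangement: it stores the shelf in a temporary folder, and the extra key count is added only for illustration.

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as folder:
    shelf_path = os.path.join(folder, 'mydata')

    # Save: the shelf is closed automatically at the end of the block.
    with shelve.open(shelf_path) as shelf_file:
        shelf_file['cats'] = ['Zophie', 'Pooka', 'Simon']
        shelf_file['count'] = 3          # hypothetical second value

    # Load: reopen the same shelf and read the values back.
    with shelve.open(shelf_path) as shelf_file:
        keys_list = sorted(shelf_file.keys())
        cats = shelf_file['cats']

print(keys_list)
print(cats)
```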

Saving Variables with the pprint.pformat() Function


• The pprint.pformat() function from the pprint module allows us to convert complex
data structures into a formatted string, which is easy to read.

Program:
import pprint

# Step 1: Create the data structure (a list of dictionaries)
cats = [{'name': 'Zophie', 'desc': 'chubby'}, {'name': 'Pooka', 'desc': 'fluffy'}]

# Step 2: Convert to a formatted string
formatted_cats = pprint.pformat(cats)

# Step 3: Write to a .py file
with open('myCats.py', 'w') as fileObj:
    fileObj.write('cats = ' + formatted_cats + '\n')

# Step 4: Import and use the saved data
import myCats

# Step 5: Access the data
# pformat() sorts dictionary keys by default, so 'desc' comes before 'name'.
print(myCats.cats)
print(myCats.cats[0])          # {'desc': 'chubby', 'name': 'Zophie'}
print(myCats.cats[0]['name'])  # Zophie

Comparison with the shelve Module:


• Text vs Binary: The .py file is text-based, while shelve files are binary.
• Accessibility: Text files can be easily read and edited, while shelve files are more suited
for programmatic access.
• Data Types: Only basic data types (integers, floats, strings, lists, dictionaries) can be saved
as text. Complex objects like file handles cannot be saved in a text format.
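
That pformat() returns a string written in valid Python syntax can be checked without creating a .py file. The sketch below uses ast.literal_eval() from the standard library to turn the string back into a value; this is an alternative shown only for illustration, not the import-based method described above.

```python
import ast
import pprint

cats = [{'name': 'Zophie', 'desc': 'chubby'}, {'name': 'Pooka', 'desc': 'fluffy'}]

# pformat() returns a string, it does not print anything itself.
formatted_cats = pprint.pformat(cats)
print(type(formatted_cats))
print(formatted_cats)

# literal_eval() safely evaluates a string containing only Python literals.
restored = ast.literal_eval(formatted_cats)
print(restored == cats)
```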


Project: Generating Random Quiz Files


1. Import Random Module: Import random to use its functions for shuffling and sampling.
2. Data Storage: Store the quiz data in a dictionary named capitals where the keys are states and
the values are their capitals.
3. Generate Quiz Files: Use a loop to generate 35 different quizzes.
o File Creation: Create a new text file for each quiz and its corresponding answer key.
o Header Writing: Write a standard header to each quiz file for student details.
o Shuffle Questions: Shuffle the order of states to randomize questions.
4. Generate Questions: For each quiz, loop through 50 states to create questions and multiple-choice
options.
o Correct and Wrong Answers: Get the correct answer and three random wrong answers.
o Randomize Options: Combine and shuffle the answer options.
o Write Questions: Write the question and answer options to the quiz file.
o Write Answer Key: Write the correct answer to the answer key file.
5. Close Files: Close the quiz and answer key files after writing to them.

This program will create 35 unique quizzes and their corresponding answer keys, each with
randomized questions and answer choices.

Program :
import random

# The quiz data: keys are states and values are their capitals.
capitals = {
'Alabama': 'Montgomery', 'Alaska': 'Juneau', 'Arizona': 'Phoenix',
'Arkansas': 'Little Rock', 'California': 'Sacramento', 'Colorado': 'Denver',
'Connecticut': 'Hartford', 'Delaware': 'Dover', 'Florida': 'Tallahassee',
'Georgia': 'Atlanta', 'Hawaii': 'Honolulu', 'Idaho': 'Boise',
'Illinois': 'Springfield', 'Indiana': 'Indianapolis', 'Iowa': 'Des Moines',
'Kansas': 'Topeka', 'Kentucky': 'Frankfort', 'Louisiana': 'Baton Rouge',
'Maine': 'Augusta', 'Maryland': 'Annapolis', 'Massachusetts': 'Boston',
'Michigan': 'Lansing', 'Minnesota': 'Saint Paul', 'Mississippi': 'Jackson',
'Missouri': 'Jefferson City', 'Montana': 'Helena', 'Nebraska': 'Lincoln',
'Nevada': 'Carson City', 'New Hampshire': 'Concord', 'New Jersey': 'Trenton',


'New Mexico': 'Santa Fe', 'New York': 'Albany', 'North Carolina': 'Raleigh',
'North Dakota': 'Bismarck', 'Ohio': 'Columbus', 'Oklahoma': 'Oklahoma City',
'Oregon': 'Salem', 'Pennsylvania': 'Harrisburg', 'Rhode Island': 'Providence',
'South Carolina': 'Columbia', 'South Dakota': 'Pierre', 'Tennessee': 'Nashville',
'Texas': 'Austin', 'Utah': 'Salt Lake City', 'Vermont': 'Montpelier',
'Virginia': 'Richmond', 'Washington': 'Olympia', 'West Virginia': 'Charleston',
'Wisconsin': 'Madison', 'Wyoming': 'Cheyenne'
}

# Generate 35 quiz files.
for quizNum in range(35):
    # Create the quiz and answer key files.
    quizFile = open(f'capitalsquiz{quizNum + 1}.txt', 'w')
    answerKeyFile = open(f'capitalsquiz_answers{quizNum + 1}.txt', 'w')

    # Write out the header for the quiz.
    quizFile.write('Name:\n\nDate:\n\nPeriod:\n\n')
    quizFile.write((' ' * 20) + f'State Capitals Quiz (Form {quizNum + 1})\n\n')

    # Shuffle the order of the states.
    states = list(capitals.keys())
    random.shuffle(states)

    # Loop through all 50 states, making a question for each.
    for questionNum in range(50):
        # Get right and wrong answers.
        correctAnswer = capitals[states[questionNum]]
        wrongAnswers = list(capitals.values())
        wrongAnswers.remove(correctAnswer)
        wrongAnswers = random.sample(wrongAnswers, 3)
        answerOptions = wrongAnswers + [correctAnswer]
        random.shuffle(answerOptions)

        # Write the question and the answer options to the quiz file.
        quizFile.write(f'{questionNum + 1}. What is the capital of {states[questionNum]}?\n')
        for i in range(4):
            quizFile.write(f' {"ABCD"[i]}. {answerOptions[i]}\n')
        quizFile.write('\n')

        # Write the answer key to a file.
        answerKeyFile.write(f'{questionNum + 1}. {"ABCD"[answerOptions.index(correctAnswer)]}\n')

    quizFile.close()
    answerKeyFile.close()

Program 2 (with Indian states and their capitals)


import random

# The quiz data: keys are states and values are their capitals.
capitals = {
    'Andhra Pradesh': 'Amaravati', 'Arunachal Pradesh': 'Itanagar', 'Assam': 'Dispur',
    'Bihar': 'Patna', 'Chhattisgarh': 'Raipur', 'Goa': 'Panaji', 'Gujarat': 'Gandhinagar',
    'Haryana': 'Chandigarh', 'Himachal Pradesh': 'Shimla', 'Jharkhand': 'Ranchi',
    'Karnataka': 'Bengaluru', 'Kerala': 'Thiruvananthapuram', 'Madhya Pradesh': 'Bhopal',
    'Maharashtra': 'Mumbai', 'Manipur': 'Imphal', 'Meghalaya': 'Shillong',
    'Mizoram': 'Aizawl', 'Nagaland': 'Kohima', 'Odisha': 'Bhubaneswar',
    'Punjab': 'Chandigarh', 'Rajasthan': 'Jaipur', 'Sikkim': 'Gangtok',
    'Tamil Nadu': 'Chennai', 'Telangana': 'Hyderabad', 'Tripura': 'Agartala',
    'Uttar Pradesh': 'Lucknow', 'Uttarakhand': 'Dehradun', 'West Bengal': 'Kolkata',
    'Andaman and Nicobar Islands': 'Port Blair', 'Chandigarh (Union Territory)': 'Chandigarh',
    'Dadra and Nagar Haveli and Daman and Diu': 'Daman', 'Lakshadweep': 'Kavaratti',
    'Delhi': 'New Delhi', 'Puducherry': 'Puducherry', 'Jammu and Kashmir': 'Srinagar',
    'Ladakh': 'Leh'
}

# Generate 35 quiz files.
for quizNum in range(35):
    # Create the quiz and answer key files.
    quizFile = open(f'capitalsquiz{quizNum + 1}.txt', 'w')
    answerKeyFile = open(f'capitalsquiz_answers{quizNum + 1}.txt', 'w')

    # Write out the header for the quiz.
    quizFile.write('Name:\n\nDate:\n\nPeriod:\n\n')
    quizFile.write((' ' * 20) + f'State Capitals Quiz (Form {quizNum + 1})\n\n')

    # Shuffle the order of the states.
    states = list(capitals.keys())
    random.shuffle(states)

    # Loop through all the states, making a question for each.
    for questionNum in range(len(states)):
        # Get right and wrong answers.
        correctAnswer = capitals[states[questionNum]]
        # Several states share a capital (for example Chandigarh), so the wrong
        # answers are drawn from the distinct capitals other than the correct one.
        wrongAnswers = sorted(set(capitals.values()) - {correctAnswer})
        wrongAnswers = random.sample(wrongAnswers, 3)
        answerOptions = wrongAnswers + [correctAnswer]
        random.shuffle(answerOptions)

        # Write the question and the answer options to the quiz file.
        quizFile.write(f'{questionNum + 1}. What is the capital of {states[questionNum]}?\n')
        for i in range(4):
            quizFile.write(f' {"ABCD"[i]}. {answerOptions[i]}\n')
        quizFile.write('\n')

        # Write the answer key to a file.
        answerKeyFile.write(f'{questionNum + 1}. {"ABCD"[answerOptions.index(correctAnswer)]}\n')

    quizFile.close()
    answerKeyFile.close()
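
The step that builds the four answer options can be tested on its own, without writing any files. The sketch below is one possible way to do this: the function name make_options, the rng parameter and the small sample dictionary are all made up for the illustration. It draws the wrong answers from the distinct capitals, so a capital shared by several states cannot appear twice among the options.

```python
import random

def make_options(capitals, state, rng=random):
    """Return (shuffled list of 4 options, letter of the correct one)."""
    correct = capitals[state]
    # Distinct capitals other than the correct one.
    wrong_pool = sorted(set(capitals.values()) - {correct})
    options = rng.sample(wrong_pool, 3) + [correct]
    rng.shuffle(options)
    return options, 'ABCD'[options.index(correct)]

# Small hypothetical data set in which two states share a capital.
sample_capitals = {
    'Haryana': 'Chandigarh', 'Punjab': 'Chandigarh', 'Bihar': 'Patna',
    'Goa': 'Panaji', 'Assam': 'Dispur', 'Sikkim': 'Gangtok',
}

# A seeded generator makes the shuffle repeatable while testing.
options, letter = make_options(sample_capitals, 'Punjab', random.Random(0))
print(options, letter)
```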
