Unit 2 Python
File Handling
Python has several functions for creating, reading, updating, and deleting
files.
The key function for working with files in Python is the open() function. It
takes two parameters: a file name and a mode. There are four different modes
for opening a file:
"r" - Read - Default value. Opens a file for reading, error if the file does not exist
"a" - Append - Opens a file for appending, creates the file if it does not exist
"w" - Write - Opens a file for writing, creates the file if it does not exist, and erases the existing content if it does
"x" - Create - Creates the specified file, raises an error if the file already exists
In addition, you can specify whether the file should be handled in text or binary mode:
"t" - Text - Default value. Text mode
"b" - Binary - Binary mode (e.g. images)
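As a rough illustration of the modes listed above, the sketch below writes to, appends to, and then tries to re-create a scratch file inside a temporary directory. The file name modes_demo.txt is made up for this example and is not part of the original notes.

```python
import os
import tempfile

# Work in a throwaway directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "modes_demo.txt")  # hypothetical file name

    # "w" creates the file (or empties it if it already exists).
    with open(path, "w") as f:
        f.write("first line\n")

    # "a" keeps the existing content and adds to the end.
    with open(path, "a") as f:
        f.write("second line\n")

    # "x" refuses to overwrite: the file exists, so open() raises.
    try:
        open(path, "x")
        created = True
    except FileExistsError:
        created = False

    # "rt" (text mode) returns str, "rb" (binary mode) returns bytes.
    with open(path, "rt") as f:
        text = f.read()
    with open(path, "rb") as f:
        raw = f.read()

print(text, end="")
print("created with 'x':", created)
print("binary read returns:", type(raw).__name__)
```

The with statement used here closes each file automatically when its block ends.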
Syntax
To open a file for reading it is enough to specify the name of the file:
f = open("demofile.txt")
The code above is the same as:
f = open("demofile.txt", "rt")
Because "r" for read, and "t" for text are the default values, you do not need to
specify them.
Note: Make sure the file exists, or else you will get an error.
Open a File on the Server
Assume we have the following file, located in the same folder as the Python script:
demofile.txt
The open() function returns a file object, which has a read() method for reading
the content of the file:
f = open("demofile.txt", "r")
print(f.read())
If the file is located in a different location, you will have to specify the file path,
like this:
f = open("D:\\myfiles\\welcome.txt", "r")
print(f.read())
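Backslashes in Windows paths are easy to get wrong, because Python treats a backslash inside an ordinary string as the start of an escape sequence. The sketch below compares three ways of spelling the path from the example above. It only builds the path strings and does not open any file; using PureWindowsPath is a choice made for this illustration so that the comparison gives the same result on any operating system.

```python
from pathlib import PureWindowsPath

# Three spellings of the same Windows path.
escaped = "D:\\myfiles\\welcome.txt"   # every backslash doubled
raw = r"D:\myfiles\welcome.txt"        # raw string: backslashes taken literally
built = PureWindowsPath("D:/") / "myfiles" / "welcome.txt"  # joined piece by piece

print(escaped == raw)
print(str(built) == raw)
```

Forward slashes, as in 'C:/Users/Admin/Desktop/python.txt', are also accepted by open() on Windows.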
Example
Return the first 5 characters of the file:
f = open("demofile.txt", "r")
print(f.read(5))
Read Lines
You can return one line by using the readline() method:
Example
Read one line of the file:
f = open("demofile.txt", "r")
print(f.readline())
By calling readline() two times, you can read the first two lines:
Example
Read two lines of the file:
f = open("demofile.txt", "r")
print(f.readline())
print(f.readline())
By looping through the lines of the file, you can read the whole file, line by line:
Example
Loop through the file line by line:
f = open("demofile.txt", "r")
for x in f:
    print(x)
Close Files
It is a good practice to always close the file when you are done with it.
Example
Close the file when you are finished with it:
f = open("demofile.txt", "r")
print(f.readline())
f.close()
Note: You should always close your files. In some cases, due to buffering,
changes made to a file may not show until you close the file.
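One common way to make sure a file is always closed is the with statement, which closes the file when its block ends, even if an error occurs while reading. The sketch below first creates a small scratch file so that it can run on its own; the file content is made up for the example.

```python
import os
import tempfile

# Create a small scratch file so the example is self-contained.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write("first line of the scratch file\n")

# The with block closes the file on exit, so no explicit close() is needed.
with open(path, "r") as f:
    first_line = f.readline()
    open_inside = not f.closed   # the file is still open inside the block

closed_after = f.closed          # and closed once the block has ended
print(first_line, end="")
print(open_inside, closed_after)

os.remove(path)  # clean up the scratch file
```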
Local machine
file_path = 'C:/Users/Admin/Desktop/python.txt'
try:
    with open(file_path, 'r') as file:
        contents = file.read()
        print(contents)
except FileNotFoundError:
    print(f"Error: File not found at {file_path}")
except Exception as e:
    print(f"An error occurred: {e}")
Google Colab
Upload the file, copy its path, assign the path to file_path, and run:
try:
    with open(file_path, 'r') as file:
        contents = file.read()
        print(contents)
except FileNotFoundError:
    print(f"Error: File not found at {file_path}")
except Exception as e:
    print(f"An error occurred: {e}")
Notice that the open() function takes two input parameters: the file path (or the file name
if the file is in the current working directory) and the file access mode. There are many
modes for opening a file:
● open('path','r'): opens an existing file in read mode
● open('path','w'): opens or creates a text file in write mode, erasing any existing content
● open('path','a'): opens or creates a file in append mode
● open('path','r+'): opens an existing file for both reading and writing
● open('path','w+'): opens or creates a file for both reading and writing, erasing any existing content
● open('path','a+'): opens or creates a file for both reading and appending
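The three "+" modes all allow reading and writing, but they treat existing content differently. The sketch below is one way to see the difference on a scratch file; the file name plus_modes.txt is made up for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "plus_modes.txt")  # hypothetical file name
    with open(path, "w") as f:
        f.write("abc")

    # "r+": the file must exist; writing starts at the beginning and overwrites.
    with open(path, "r+") as f:
        f.write("X")
        f.seek(0)               # go back to the start before reading
        after_r_plus = f.read()

    # "a+": writes always go to the end; the existing content is kept.
    with open(path, "a+") as f:
        f.write("Z")
        f.seek(0)
        after_a_plus = f.read()

    # "w+": the file is emptied as soon as it is opened.
    with open(path, "w+") as f:
        after_w_plus = f.read()

print(repr(after_r_plus), repr(after_a_plus), repr(after_w_plus))
```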
After opening the file in read mode, you can use the following methods to
access or examine the information stored in the file:
● .read(): reads the complete contents of the file. If a number n is specified, it
reads at most n characters instead (n bytes for a file opened in binary mode).
● .readline(): reads a single line from the file. If a number n is specified, it
reads at most n characters of that line. It is usually used in loops.
● .readlines(): reads all remaining lines of the file and returns them as a list of
strings. It does not print anything by itself.
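The three methods can be compared side by side. The sketch below writes a three-line scratch file (its content is made up for the example) and then reads it back in pieces; each call continues from where the previous one stopped.

```python
import os
import tempfile

# Build a three-line scratch file to read from.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write("alpha\nbeta\ngamma\n")

with open(path, "r") as f:
    first_three = f.read(3)       # at most 3 characters
    rest_of_line = f.readline()   # continues to the end of the current line
    remaining = f.readlines()     # every remaining line, as a list of strings

with open(path, "r") as f:
    everything = f.read()         # no argument: the whole file

os.remove(path)  # clean up the scratch file
print(repr(first_three), repr(rest_of_line), remaining)
```

Note that readline() and readlines() keep the newline character at the end of each line.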
import numpy as nm
import matplotlib.pyplot as mtp
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import confusion_matrix
from sklearn.metrics import accuracy_score
from sklearn.metrics import classification_report
from google.colab import files
uploaded = files.upload()
import io
df = pd.read_excel(io.BytesIO(uploaded['des.xlsx']))
df.shape
print(df['column-name'].tolist())
Reading a File
Here are a few ways to read a file in Python:
Note: Make sure to replace the example paths with the actual path to your file.
In this method, we first import the Pandas module and then use Pandas to
read our Excel file. Pandas can perform many other operations on the data
once it is loaded.
# import pandas lib as pd
import pandas as pd
# read by default 1st sheet of an excel file
dataframe1 = pd.read_excel('C:\\Users\\Admin\\Desktop\\python1.xlsx')
print(dataframe1)
You can read an Excel file in Python using the openpyxl library, which
supports .xlsx files (and the related .xlsm, .xltx and .xltm formats) but not
the older .xls format:
import openpyxl
# Load workbook
wb = openpyxl.load_workbook('C:\\Users\\Admin\\Desktop\\python1.xlsx')
# Select active sheet (optional)
sheet = wb.active
# Access cell values
cell_value = sheet['A1'].value
print(cell_value)
# Close workbook when done
wb.close()
To open and manipulate Excel files, you can also use the xlrd library, which
supports .xls files (not .xlsx):
import xlrd
# Open Excel file
workbook = xlrd.open_workbook('path/to/your/excel_file.xls')
# Access sheet by index or name
sheet = workbook.sheet_by_index(0) # Access first sheet
# OR
# sheet = workbook.sheet_by_name('Sheet1') # Access sheet by name
# Read cell value
cell_value = sheet.cell_value(0, 0) # Access cell A1
print(cell_value)
# Release the resources held by the workbook when done
workbook.release_resources()
import pandas as pd
# Read Excel file into DataFrame
df = pd.read_excel('path/to/your/excel_file.xlsx')
# Display DataFrame
print(df.head())
Pandas’ read_excel() function can handle both .xls and .xlsx formats, provided
a matching engine is installed (xlrd for .xls, openpyxl for .xlsx).
If you prefer working with CSV format data extracted from an Excel file, you
can convert an Excel file into CSV format using Pandas:
import pandas as pd
df = pd.read_excel('path/to/your/excel_file.xlsx')
# Export DataFrame to CSV
df.to_csv('output_file.csv', index=False)
This converts the Excel data into a CSV file named output_file.csv in the
current directory.
import openpyxl
# Load workbook
wb = openpyxl.load_workbook('C:\\Users\\Admin\\Desktop\\python2.xlsx')
# Select active sheet (optional)
sheet = wb.active
# Access cell values
cell_value = sheet['A3'].value
print(cell_value)
# Close workbook when done
wb.close()
Output:
rose
In [8]:
# import pandas lib as pd
import pandas as pd
# read by default 1st sheet of an excel file
dataframe1 = pd.read_excel('C:\\Users\\Admin\\Desktop\\python2.xlsx')
print(dataframe1)
Output:
     flower  colour
0   jasmine   white
1      rose    pink
2  marygold  yellow
In [7]:
import pandas as pd
# Read Excel file into DataFrame
df = pd.read_excel('C:\\Users\\Admin\\Desktop\\python2.xlsx')
# Display DataFrame
print(df.head())
Output:
     flower  colour
0   jasmine   white
1      rose    pink
2  marygold  yellow
In [15]:
import pandas as pd
# Read Excel file into DataFrame
df = pd.read_excel('C:\\Users\\Admin\\Desktop\\python2.xlsx')
# Export DataFrame to CSV
df.to_csv('C:\\Users\\Admin\\Desktop\\output_file.csv', index=False)
In [16]:
from openpyxl import Workbook
workbook = Workbook()
workbook.save(filename='C:\\Users\\Admin\\Desktop\\python4.xlsx')
In [23]:
# import openpyxl module
import openpyxl
wb = openpyxl.load_workbook('C:\\Users\\Admin\\Desktop\\python4.xlsx')
# Select the active sheet; append() below needs it
sheet = wb.active
data = (
    (1, 2, 3),
    (4, 5, 6)
)
# Add each tuple as a new row at the end of the sheet
for row in data:
    sheet.append(row)
wb.save('C:\\Users\\Admin\\Desktop\\python4.xlsx')
After creating an empty file, the next step is to add some data to it using
Python. First select the active sheet, then use the cell() method to select a
particular cell by passing its row and column numbers (both counted from 1)
as parameters. Cells can also be written to by name, such as sheet['A1'].
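The two ways of writing to a cell described above might look like the following sketch. The values reuse the flower and colour data shown earlier, while the file name cells_demo.xlsx and the temporary folder are assumptions made for this example.

```python
import os
import tempfile

import openpyxl

wb = openpyxl.Workbook()
sheet = wb.active

# cell() addresses a cell by 1-based row and column numbers.
sheet.cell(row=1, column=1).value = "flower"
sheet.cell(row=1, column=2, value="colour")  # the value can also be passed directly

# The same kind of write, this time using cell names.
sheet["A2"] = "jasmine"
sheet["B2"] = "white"

# Save to a scratch location (hypothetical file name).
path = os.path.join(tempfile.mkdtemp(), "cells_demo.xlsx")
wb.save(path)

# Load the file again to check what was written.
check = openpyxl.load_workbook(path).active
print(check["A1"].value, check.cell(row=2, column=2).value)
```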
We can also use the append() method to add several values at once as a new
row at the end of the sheet.
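A minimal sketch of append(), assuming a brand-new workbook that is kept in memory and never saved. Each call adds one row below the last used row; the rows reuse the flower and colour data shown earlier.

```python
import openpyxl

wb = openpyxl.Workbook()
sheet = wb.active

# Each append() call writes one new row below the last used row.
sheet.append(["flower", "colour"])
for row in [("jasmine", "white"), ("rose", "pink")]:
    sheet.append(row)

# Collect the cell values row by row to see the result.
rows = list(sheet.iter_rows(values_only=True))
print(rows)
print(sheet.max_row)
```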
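Arithmetic operations can be performed by writing a spreadsheet formula, as a
string beginning with "=", into a cell.
Example: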
import openpyxl
wb = openpyxl.Workbook()
sheet = wb.active
sheet['A1'] = 200
sheet['A2'] = 300
sheet['A3'] = 400
sheet['A4'] = 500
sheet['A5'] = 600
sheet['A7'] = '=SUM(A1:A5)'
wb.save("sum.xlsx")
The height of a row and the width of a column can be set through the sheet's
row_dimensions and column_dimensions attributes.
Example:
import openpyxl
wb = openpyxl.Workbook()
sheet = wb.active
sheet.row_dimensions[1].height = 70
sheet.column_dimensions['B'].width = 20
wb.save('sample.xlsx')
Merging Cells
A rectangular area of cells can be merged into a single cell with the
merge_cells() sheet method. The argument to merge_cells() is a single string
of the top-left and bottom-right cells of the rectangular area to be merged.
Example:
import openpyxl
wb = openpyxl.Workbook()
sheet = wb.active
sheet.merge_cells('A2:D4')
sheet.merge_cells('C6:D6')
wb.save('sample.xlsx')