Python Module- 4herrewHRW

This document covers reading and writing files in Python, detailing how to manage file paths, create directories, and handle both plaintext and binary files. It explains the process of opening, reading, and writing to files, as well as using the shelve module for persistent data storage. Additionally, it introduces the concept of Excel spreadsheets and the openpyxl module for handling Excel files.

Uploaded by

ktvinyas00
Python Module- 4

Reading and Writing Files

• Variables are a fine way to store data while your program is
running, but if you want your data to persist even after your
program has finished, you need to save it to a file.
• You can think of a file’s contents as a single string value,
potentially gigabytes in size.
• This module shows how to use Python to create, read, and save
files on the hard drive.

23 December 2024 2
Files and File Paths
A file has two key properties: a filename (usually written as one word) and a
path. The path specifies the location of a file on the computer. In the example
below, C:\Users\asweigart\Documents is the path and project.docx is the filename.

C:\Users\asweigart\Documents\project.docx

Backslash on Windows and Forward Slash on OS X and Linux
• On Windows, paths are written using backslashes (\) as the separator
between folder names.
• OS X and Linux, however, use the forward slash (/) as their path separator.
• If you want your programs to work on all operating systems, your Python
scripts must handle both cases.
• Fortunately, this is simple to do with the os.path.join() function.
• If you pass it the string values of individual file and folder names in your
path, os.path.join() will return a string with a file path using the correct
path separators.

For example, the following code joins names from a list of filenames to the
end of a folder’s name:

import os
myFiles = ['accounts.txt', 'details.csv', 'invite.docx']
for filename in myFiles:
    print(os.path.join('C:\\Users\\asweigart', filename))

Output (on Windows):
C:\Users\asweigart\accounts.txt
C:\Users\asweigart\details.csv
C:\Users\asweigart\invite.docx

The Current Working Directory
• Every program that runs on your computer has a current working directory, or
cwd. Any filenames or paths that do not begin with the root folder are assumed
to be under the current working directory.
• You can get the current working directory as a string value with the os.getcwd()
function and change it with os.chdir().
• For example:

import os
print(os.getcwd())
os.chdir('C:\\Windows\\System32')
print(os.getcwd())

Output:
C:\Users\Admin\spyder-py3
C:\Windows\System32

Python will raise a FileNotFoundError if you
try to change to a directory that
does not exist.
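
The error can be caught so that a program keeps running when a folder is missing. The following is a minimal sketch, not part of the original slides: try_chdir() is a hypothetical helper, and the folder name is assumed not to exist on your machine.

```python
import os


def try_chdir(path):
    """Try to change the working directory; return True on success."""
    try:
        os.chdir(path)
    except FileNotFoundError as err:
        # Raised when the target folder does not exist.
        print('Could not change directory:', err)
        return False
    return True


original_cwd = os.getcwd()

# Hypothetical folder name, assumed not to exist on this machine.
missing = os.path.join(original_cwd, 'this_folder_should_not_exist_12345')
changed = try_chdir(missing)

print(changed)
print(os.getcwd() == original_cwd)  # the cwd is unchanged after a failed chdir
```

Because os.chdir() failed, the current working directory stays where it was.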

Absolute vs. Relative Paths
There are two ways to specify a file path.
• An absolute path, which always begins with the
root folder
• A relative path, which is relative to the program’s
current working directory
• There are also the dot (.) and dot-dot (..) folders.
These are not real folders but special names that can
be used in a path.
• A single period (“dot”) for a folder name is shorthand
for “this directory.”
• Two periods (“dot-dot”) mean “the parent folder.”

Figure: an example of some folders and files. When the current working directory is set
to C:\bacon, the relative paths for the other folders and files are as shown
in the figure.
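
The os.path module can convert between the two kinds of path. The sketch below is an illustration under assumptions: it uses whatever folder the program happens to run in, and the eggs/spam.txt path is hypothetical and does not need to exist.

```python
import os

cwd = os.getcwd()

# '.' is shorthand for the current working directory.
dot_path = os.path.abspath('.')
# '..' is shorthand for the parent folder.
parent_path = os.path.abspath('..')

print(os.path.isabs(cwd))   # the cwd is reported as an absolute path
print(os.path.isabs('.'))   # '.' on its own is a relative path
print(dot_path)
print(parent_path)

# relpath() expresses one path relative to another folder.
eggs = os.path.join(cwd, 'eggs', 'spam.txt')  # hypothetical file path
rel_eggs = os.path.relpath(eggs, cwd)
print(rel_eggs)
```

os.path.abspath() turns a relative path into an absolute one, and os.path.relpath() goes the other way.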

Creating New Folders with os.makedirs()
Your programs can create new folders (directories) with the os.makedirs() function.
It also creates any intermediate folders that are needed so that the full path exists.
Enter the following into the interactive shell:

import os
os.makedirs('C:\\delicious\\walnut\\waffles')
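
Calling os.makedirs() a second time on a folder that already exists raises FileExistsError unless exist_ok=True is passed. The sketch below shows this inside a temporary folder (an assumption made here so that the example leaves nothing behind on disk).

```python
import os
import tempfile

# Work inside a throwaway temporary folder so nothing is left behind.
with tempfile.TemporaryDirectory() as base:
    target = os.path.join(base, 'delicious', 'walnut', 'waffles')

    os.makedirs(target)  # creates all three nested folders at once
    created = os.path.isdir(target)

    # A second plain os.makedirs(target) would raise FileExistsError;
    # exist_ok=True makes the call safe to repeat.
    os.makedirs(target, exist_ok=True)
    still_there = os.path.isdir(target)

print(created, still_there)
```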

The File Reading/Writing Process
• Once you are comfortable working with folders and relative
paths, you’ll be able to specify the location of files to read and
write.
• Plaintext files contain only basic text characters and do not
include font, size, or color information.
• Text files with the .txt extension or Python script files with
the .py extension are examples of plaintext files.
• These can be opened with Windows’s Notepad or OS X’s
TextEdit application.
• Your programs can easily read the contents of plaintext files and
treat them as an ordinary string value.

• Binary files are all other file types, such as word processing
documents, PDFs, images, spreadsheets, and executable
programs.
• If you open a binary file in Notepad or TextEdit, it will look like
scrambled nonsense, as in the figure.
• Since every different type of binary file must be handled in its
own way, there are many modules that make working with
binary files easier.

Three steps to reading or writing files
There are three steps to reading or writing files in Python.
• Call the open() function to return a File object.
• Call the read() or write() method on the File object.
• Close the file by calling the close() method on the File
object.
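
The three steps above can be sketched as follows. This is an illustration, not the slides' own example: the file lives in a temporary folder, and the reading half uses a with block, a common alternative in which Python calls close() for you.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    # Hypothetical file path, used only for this illustration.
    path = os.path.join(folder, 'hello.txt')

    # Step 1: open, step 2: write, step 3: close.
    hello_file = open(path, 'w')
    hello_file.write('Hello world!')
    hello_file.close()

    # The same three steps for reading; the with block closes the
    # file automatically when the indented code finishes.
    with open(path) as hello_file:
        content = hello_file.read()

print(content)
print(hello_file.closed)
```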

Opening Files with the open() Function
• To open a file with the open() function, you pass it a string
path indicating the file you want to open; it can be either an
absolute or relative path.
• The open() function returns a File object.
Try it by creating a text file named hello.txt using Notepad or TextEdit. Type Hello
world! as the content of this text file and save it in your user home folder.

Reading the Contents of Files
Now that you have a File object, you can start reading from it. If you want to read the
entire contents of a file as a string value, use the File object’s read() method.

helloFile = open('C:\\Users\\TEST\\first.txt', 'r')
helloread = helloFile.read()
print(helloread)
helloFile.close()

Contents of first.txt:
BMS Institute
COMPUTER SCIENCE AND ENGINEERING

Output:
BMS Institute
COMPUTER SCIENCE AND ENGINEERING

The readlines() Method
You can use the readlines() method to get a list of string values from the file, one string
for each line of text.

helloFile = open('C:\\Users\\TEST\\second.txt', 'r')
helloread = helloFile.readlines()
print(helloread)
helloFile.close()

Output (second.txt contains four lines of text):
["When, in disgrace with fortune and men's eyes,\n", 'I all alone beweep my outcast state,\n', 'And trouble deaf heaven with my bootless cries,\n', 'And look upon myself and curse my fate,']

Note that each of the string values ends with a newline character, \n, except for the
last line of the file. A list of strings is often easier to work with than a single
large string value.

Writing to Files
• Python allows you to write content to a file in a way similar
to how the print() function “writes” strings to the screen.
• There are two ways to write data to a file:
• Write mode
• Append mode

Write mode:
• Write mode will overwrite the existing file and start from
scratch, just like when you overwrite a variable’s value with
a new value.
• Pass 'w' as the second argument to open() to open the file in
write mode

Example:

baconFile = open('C:\\Users\\TEST\\bacon.txt', 'w')
baconFile.write('MY NAME IS SUDARSANAN\n')
baconFile.close()

readname = open('C:\\Users\\TEST\\bacon.txt', 'r')
helloread = readname.read()
print(helloread)
readname.close()

Output:
MY NAME IS SUDARSANAN

Note:
If the filename passed to open() does not exist, both write and append mode will
create a new, blank file.

Example 2: write new data to the bacon.txt file once again. Because the file is
opened in write mode, the previous content is replaced.

baconFile = open('C:\\Users\\TEST\\bacon.txt', 'w')
baconFile.write('WHAT IS YOUR NAME\n')
baconFile.close()

readname = open('C:\\Users\\TEST\\bacon.txt', 'r')
helloread = readname.read()
print(helloread)
readname.close()

Output:
WHAT IS YOUR NAME


Append mode:
• Append mode will append text to the end of the existing file. You can think of this as
appending to a list in a variable, rather than overwriting the variable altogether.
• Pass 'a' as the second argument to open() to open the file in append mode.

baconFile = open('C:\\Users\\TEST\\bacon.txt', 'a')
baconFile.write('MY NAME IS SUDARSANAN\n')
baconFile.close()

readname = open('C:\\Users\\TEST\\bacon.txt', 'r')
helloread = readname.read()
print(helloread)
readname.close()

Output:
WHAT IS YOUR NAME
MY NAME IS SUDARSANAN

Saving Variables with the shelve Module
• You can save variables in your Python programs to binary shelf
files using the shelve module.
• The shelve module in Python’s standard library is a simple yet
effective tool for persistent data storage when a
relational database solution is not required.
• The shelf object defined in this module is a dictionary-like object
that is persistently stored in a disk file.
• Only strings can be used as keys in this special
dictionary object.
• After running code on Windows that stores data in a shelf named mydata, you
will see three new files in the current working directory: mydata.bak,
mydata.dat, and mydata.dir.

Example: store data in a shelf object

import shelve
s = shelve.open("test")
s['name'] = "Ajay"
s['age'] = 23
s['marks'] = 75
s.close()

On Windows, this will create the files test.dir, test.bak, and test.dat in the
current directory and store the key-value data in them in binary form.

To access the value of a particular key in a shelf:
import shelve
s = shelve.open('test')
print(s['age'])
s['age'] = 25
print(s.get('age'))
s.close()

Output:
23
25

The Shelf object has the following methods available, among others: close(),
sync(), get(), keys(), values(), items(), pop(), and update().

The items(), keys() and values()
methods return view objects.

To remove a key-value pair from shelf
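
One way to do this is with the del statement or the pop() method, both of which a shelf supports just as a dictionary does. The sketch below is an illustration under assumptions: the shelf is created in a temporary folder, and the keys mirror the earlier test example.

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as folder:
    # Hypothetical shelf stored in a temporary folder.
    shelf = shelve.open(os.path.join(folder, 'test'))
    shelf['name'] = 'Ajay'
    shelf['age'] = 23
    shelf['marks'] = 75

    del shelf['marks']              # remove a pair and discard its value
    removed_age = shelf.pop('age')  # remove a pair and get its value back

    remaining_keys = sorted(shelf.keys())
    shelf.close()

print(removed_age)
print(remaining_keys)
```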

To merge the items of another dictionary
into a shelf, use the update() method.
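
A minimal sketch of update(), assuming a shelf created in a temporary folder and a hypothetical dictionary named extra:

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as folder:
    # Hypothetical shelf stored in a temporary folder.
    shelf = shelve.open(os.path.join(folder, 'test'))
    shelf['name'] = 'Ajay'
    shelf['age'] = 23

    extra = {'age': 25, 'marks': 75}
    # The existing key 'age' is overwritten and the new key 'marks' is added.
    shelf.update(extra)

    merged = dict(shelf)  # copy the shelf's contents into a plain dictionary
    shelf.close()

print(merged)
```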

Example:
import shelve
shelfFile = shelve.open('mydata')
cats = ['Zophie', 'Pooka', 'Simon']
shelfFile['cats'] = cats
shelfFile.close()

shelfFile = shelve.open('mydata')
print(shelfFile['cats'])
shelfFile.close()

Output:
['Zophie', 'Pooka', 'Simon']

Accessing keys and values using the list() function
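
Since keys(), values(), and items() return view objects, passing them to list() gives ordinary lists. The sketch below assumes a shelf in a temporary folder with two hypothetical keys. The order in which a shelf returns its keys is not guaranteed, so the example sorts them before printing.

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as folder:
    # Hypothetical shelf stored in a temporary folder.
    shelf = shelve.open(os.path.join(folder, 'mydata'))
    shelf['cats'] = ['Zophie', 'Pooka', 'Simon']
    shelf['dogs'] = ['Rex']

    keys = list(shelf.keys())      # list of the shelf's keys
    values = list(shelf.values())  # list of the stored values
    items = list(shelf.items())    # list of (key, value) tuples
    shelf.close()

print(sorted(keys))
print(sorted(items))
```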

Saving Variables with the pprint.pformat() Function
• The pprint.pprint() function “pretty prints” the contents of a list or
dictionary to the screen, while the pprint.pformat() function returns this
same text as a string instead of printing it.
• Not only is this string formatted to be easy to read, but it is also syntactically
correct Python code.
• Say you have a dictionary stored in a variable and you want to save this variable
and its contents for future use.
• Using pprint.pformat() will give you a string that you can write to a .py file.
import pprint
cats = [{'name': 'Zophie', 'desc': 'chubby'}, {'name': 'Pooka', 'desc': 'fluffy'}]
pprint.pformat(cats)
fileObj = open('myCats.py', 'w')
fileObj.write('cats = ' + pprint.pformat(cats) + '\n')
fileObj.close()

Contents of myCats.py:
cats = [{'desc': 'chubby', 'name': 'Zophie'}, {'desc': 'fluffy', 'name': 'Pooka'}]

And since Python scripts are themselves just text files with the .py file extension,
your Python programs can even generate other Python programs. You can then
import these files into scripts.

import myCats
print(myCats.cats)
print(myCats.cats[0])
print(myCats.cats[0]['name'])

Output:
[{'desc': 'chubby', 'name': 'Zophie'}, {'desc': 'fluffy', 'name': 'Pooka'}]
{'desc': 'chubby', 'name': 'Zophie'}
Zophie

Project: Multiclipboard
• Say you have the boring task of filling out many forms in a web
page or software with several text fields.
• The clipboard saves you from typing the same text over and over
again. But only one thing can be on the clipboard at a time.
• If you have several different pieces of text that you need to copy
and paste, you have to keep highlighting and copying the same
few things over and over again.

• The program will save each piece of clipboard text under a
keyword.
• For example, when you run py mcb.pyw save spam, the
current contents of the clipboard will be saved with the
keyword spam.
• This text can later be loaded to the clipboard again by running
py mcb.pyw spam.
• If the user forgets what keywords they have, they can run py
mcb.pyw list to copy a list of all keywords to the clipboard.

Here’s what the program does:
• The command line argument for the keyword is checked.
• If the argument is save, then the clipboard contents are saved to the
keyword.
• If the argument is list, then all the keywords are copied to the
clipboard.
• Otherwise, the text for the keyword is copied to the clipboard.
This means the code will need to do the following:
• Read the command line arguments from sys.argv.
• Read and write to the clipboard.
• Save and load to a shelf file.

Step 1: Comments and Shelf Setup

Copying and pasting will require the pyperclip module, and reading the command line
arguments will require the sys module. The shelve module will also come in handy:
whenever the user wants to save a new piece of clipboard text, you’ll save it to a shelf file.
Then, when the user wants to paste the text back to their clipboard, you’ll open the shelf file
and load it back into your program. The shelf file will be named with the prefix mcb.

Step 2: Save Clipboard Content with a Keyword
The program does different things depending on whether the user wants to save text
to a keyword, load text into the clipboard, or list all the existing keywords.

Step 3: List Keywords and Load a Keyword’s Content
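
The logic of the three steps can be sketched as below. This is an illustration under stated assumptions, not the original mcb.pyw listing: run_mcb(), fake_paste(), and fake_copy() are hypothetical names, a plain dictionary stands in for the shelf file, and a stand-in clipboard is used so that the sketch runs without pyperclip installed.

```python
def run_mcb(argv, shelf, paste, copy):
    """Handle one run of the multiclipboard.

    argv  - command line arguments, laid out like sys.argv
    shelf - dictionary-like store (a shelf or a plain dict)
    paste - function that returns the clipboard text
    copy  - function that places text on the clipboard
    """
    if len(argv) == 3 and argv[1].lower() == 'save':
        # Save the clipboard content under the given keyword.
        shelf[argv[2]] = paste()
    elif len(argv) == 2:
        if argv[1].lower() == 'list':
            # Copy the list of all keywords to the clipboard.
            copy(str(list(shelf.keys())))
        elif argv[1] in shelf:
            # Copy the text saved under the keyword to the clipboard.
            copy(shelf[argv[1]])


# Stand-in clipboard (an assumption made so the sketch is self-contained).
clipboard = {'text': 'hello from the clipboard'}


def fake_paste():
    return clipboard['text']


def fake_copy(text):
    clipboard['text'] = text


store = {}  # plain dict standing in for the shelf file

run_mcb(['mcb.pyw', 'save', 'spam'], store, fake_paste, fake_copy)
clipboard['text'] = 'something else'

run_mcb(['mcb.pyw', 'spam'], store, fake_paste, fake_copy)
loaded_text = clipboard['text']
print(loaded_text)

run_mcb(['mcb.pyw', 'list'], store, fake_paste, fake_copy)
keyword_list = clipboard['text']
print(keyword_list)
```

In a real script you would pass sys.argv, a shelf opened with shelve.open('mcb'), pyperclip.paste, and pyperclip.copy, and close the shelf at the end.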

WORKING WITH EXCEL SPREADSHEETS
Excel Documents
• An Excel spreadsheet document is called a workbook.
• A single workbook is saved in a file with
the .xlsx extension.
• Each workbook can contain multiple sheets (also
called worksheets).
• The sheet the user is currently viewing (or last viewed
before closing Excel) is called the active sheet.
• Each sheet has columns (addressed by letters starting
at A) and rows (addressed by numbers starting at 1).
• A box at a particular column and row is called a cell.
Each cell can contain a number or text value.
• The grid of cells with data makes up a sheet.
INSTALLING THE OPENPYXL MODULE
• Install the module by running pip install --user -U openpyxl from the command line.
• Test the installation by running import openpyxl in the interactive shell.
READING EXCEL DOCUMENTS
>>> import openpyxl
>>> wb = openpyxl.load_workbook('example.xlsx')
>>> type(wb)
<class 'openpyxl.workbook.workbook.Workbook'>
>>> import openpyxl
>>> wb = openpyxl.load_workbook('example.xlsx')
>>> wb.sheetnames # The workbook's sheets' names.
['Sheet1', 'Sheet2', 'Sheet3']
>>> sheet = wb['Sheet3'] # Get a sheet from the workbook.
>>> sheet
<Worksheet "Sheet3">
>>> type(sheet)
<class 'openpyxl.worksheet.worksheet.Worksheet'>
>>> sheet.title # Get the sheet's title as a string.
'Sheet3'
>>> anotherSheet = wb.active # Get the active sheet.
>>> anotherSheet
<Worksheet "Sheet1">
Getting Cells from the Sheets
>>> import openpyxl
>>> wb = openpyxl.load_workbook('example.xlsx')
>>> sheet = wb['Sheet1'] # Get a sheet from the workbook.
>>> sheet['A1'] # Get a cell from the sheet.
<Cell 'Sheet1'.A1>
>>> sheet['A1'].value # Get the value from the cell.
datetime.datetime(2015, 4, 5, 13, 34, 2)
>>> c = sheet['B1'] # Get another cell from the sheet.
>>> c.value
'Apples'
>>> # Get the row, column letter, and value from the cell.
>>> # (In recent openpyxl versions, c.column is the integer 2.)
>>> 'Row %s, Column %s is %s' % (c.row, c.column_letter, c.value)
'Row 1, Column B is Apples'
>>> 'Cell %s is %s' % (c.coordinate, c.value)
'Cell B1 is Apples'
>>> sheet['C1'].value
73
>>> sheet.cell(row=1, column=2)
<Cell 'Sheet1'.B1>
>>> sheet.cell(row=1, column=2).value
'Apples'
>>> for i in range(1, 8, 2): # Go through every other row:
...     print(i, sheet.cell(row=i, column=2).value)
...
1 Apples
3 Pears
5 Apples
7 Strawberries
>>> import openpyxl
>>> wb = openpyxl.load_workbook('example.xlsx')
>>> sheet = wb['Sheet1']
>>> sheet.max_row # Get the highest row number.
7
>>> sheet.max_column # Get the highest column number.
3
Converting Between Column Letters and Numbers
>>> import openpyxl
>>> from openpyxl.utils import get_column_letter, column_index_from_string
>>> get_column_letter(1) # Translate column 1 to a letter.
'A'
>>> get_column_letter(2)
'B'
>>> get_column_letter(27)
'AA'
>>> get_column_letter(900)
'AHP'
>>> wb = openpyxl.load_workbook('example.xlsx')
>>> sheet = wb['Sheet1']
>>> get_column_letter(sheet.max_column)
'C'
>>> column_index_from_string('A') # Get A's number.
1
>>> column_index_from_string('AA')
27
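
To see the arithmetic behind this conversion, here is a pure Python sketch. It is not openpyxl's implementation: column_letter() and column_index() are hypothetical helpers that treat column names as a base-26 numbering with the digits A to Z and no zero.

```python
def column_letter(index):
    """Convert a 1-based column number to its letter name (1 -> 'A')."""
    if index < 1:
        raise ValueError('column numbers start at 1')
    letters = ''
    while index > 0:
        # Subtract 1 first because there is no "zero" letter.
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord('A') + remainder) + letters
    return letters


def column_index(letters):
    """Convert a column letter name to its 1-based number ('A' -> 1)."""
    index = 0
    for ch in letters.upper():
        index = index * 26 + (ord(ch) - ord('A') + 1)
    return index


print(column_letter(1))
print(column_letter(27))
print(column_letter(900))
print(column_index('AA'))
```

These helpers give the same results as the interactive shell session above for the values shown there.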
Getting Rows and Columns from the Sheets
>>> import openpyxl
>>> wb = openpyxl.load_workbook('example.xlsx')
>>> sheet = wb['Sheet1']
>>> tuple(sheet['A1':'C3']) # Get all cells from A1 to C3.
((<Cell 'Sheet1'.A1>, <Cell 'Sheet1'.B1>, <Cell 'Sheet1'.C1>), (<Cell 'Sheet1'.A2>, <Cell 'Sheet1'.B2>, <Cell 'Sheet1'.C2>), (<Cell 'Sheet1'.A3>, <Cell 'Sheet1'.B3>, <Cell 'Sheet1'.C3>))
>>> for rowOfCellObjects in sheet['A1':'C3']:
...     for cellObj in rowOfCellObjects:
...         print(cellObj.coordinate, cellObj.value)
...     print('--- END OF ROW ---')
...
A1 2015-04-05 13:34:02
B1 Apples
C1 73
--- END OF ROW ---
A2 2015-04-05 03:41:23
B2 Cherries
C2 85
--- END OF ROW ---
A3 2015-04-06 12:46:51
B3 Pears
C3 14
--- END OF ROW ---
>>> import openpyxl
>>> wb = openpyxl.load_workbook('example.xlsx')
>>> sheet = wb.active
>>> list(sheet.columns)[1] # Get second column's cells.
(<Cell 'Sheet1'.B1>, <Cell 'Sheet1'.B2>, <Cell 'Sheet1'.B3>, <Cell 'Sheet1'.B4>, <Cell 'Sheet1'.B5>, <Cell 'Sheet1'.B6>, <Cell 'Sheet1'.B7>)
>>> for cellObj in list(sheet.columns)[1]:
...     print(cellObj.value)
...
Apples
Cherries
Pears
Oranges
Apples
Bananas
Strawberries
Workbooks, Sheets, Cells
• Import the openpyxl module.
• Call the openpyxl.load_workbook() function.
• Get a Workbook object.
• Use the active or sheetnames attributes.
• Get a Worksheet object.
• Use indexing or the cell() sheet method
with row and column keyword arguments.
• Get a Cell object.
• Read the Cell object’s value attribute.
WORKING WITH CSV FILES AND JSON DATA
THE CSV MODULE
reader Objects
>>> import csv
>>> exampleFile = open('example.csv')
>>> exampleReader = csv.reader(exampleFile)
>>> exampleData = list(exampleReader)
>>> exampleData
[['4/5/2015 13:34', 'Apples', '73'], ['4/5/2015 3:41', 'Cherries', '85'],
['4/6/2015 12:46', 'Pears', '14'], ['4/8/2015 8:59', 'Oranges', '52'],
['4/10/2015 2:07', 'Apples', '152'], ['4/10/2015 18:10', 'Bananas', '23'],
['4/10/2015 2:40', 'Strawberries', '98']]
>>> exampleData[0][0]
'4/5/2015 13:34'
>>> exampleData[0][1]
'Apples'
>>> exampleData[0][2]
'73'
>>> exampleData[1][1]
'Cherries'
>>> exampleData[6][1]
'Strawberries'
Reading Data from reader Objects in a for Loop
>>> import csv
>>> exampleFile = open('example.csv')
>>> exampleReader = csv.reader(exampleFile)
>>> for row in exampleReader:
...     print('Row #' + str(exampleReader.line_num) + ' ' + str(row))
...
Row #1 ['4/5/2015 13:34', 'Apples', '73']
Row #2 ['4/5/2015 3:41', 'Cherries', '85']
Row #3 ['4/6/2015 12:46', 'Pears', '14']
Row #4 ['4/8/2015 8:59', 'Oranges', '52']
Row #5 ['4/10/2015 2:07', 'Apples', '152']
Row #6 ['4/10/2015 18:10', 'Bananas', '23']
Row #7 ['4/10/2015 2:40', 'Strawberries', '98']
writer Objects
>>> import csv
>>> outputFile = open('output.csv', 'w', newline='')
>>> outputWriter = csv.writer(outputFile)
>>> outputWriter.writerow(['spam', 'eggs', 'bacon', 'ham'])
21
>>> outputWriter.writerow(['Hello, world!', 'eggs', 'bacon', 'ham'])
32
>>> outputWriter.writerow([1, 2, 3.141592, 4])
16
>>> outputFile.close()

Contents of output.csv:
spam,eggs,bacon,ham
"Hello, world!",eggs,bacon,ham
1,2,3.141592,4
The delimiter and lineterminator Keyword Arguments
Passing delimiter='\t' separates the cells with tab characters instead of commas,
and lineterminator='\n\n' makes the rows double-spaced.
>>> import csv
>>> csvFile = open('example.tsv', 'w', newline='')
>>> csvWriter = csv.writer(csvFile, delimiter='\t', lineterminator='\n\n')
>>> csvWriter.writerow(['apples', 'oranges', 'grapes'])
23
>>> csvWriter.writerow(['eggs', 'bacon', 'ham'])
16
>>> csvWriter.writerow(['spam', 'spam', 'spam', 'spam', 'spam', 'spam'])
31
>>> csvFile.close()
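
To inspect exactly what these keyword arguments write, the sketch below sends the rows to an in-memory io.StringIO buffer instead of a file on disk. The buffer is an assumption made for illustration; the writer calls are the same as above.

```python
import csv
import io

# In-memory text buffer standing in for the example.tsv file.
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter='\t', lineterminator='\n\n')

# writerow() returns the number of characters it wrote.
count = writer.writerow(['apples', 'oranges', 'grapes'])
writer.writerow(['eggs', 'bacon', 'ham'])

text = buffer.getvalue()
print(count)
print(repr(text))  # repr() makes the tabs and newlines visible
```

Each cell is followed by a tab character (\t) and each row by two newline characters, which is why the rows appear double-spaced when the file is opened.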
DictReader and DictWriter CSV Objects
• For CSV files that contain header rows, it’s often
more convenient to work with
the DictReader and DictWriter objects, rather
than the reader and writer objects.
• The reader and writer objects read and write to
CSV file rows by using lists.
• The DictReader and DictWriter CSV objects
perform the same functions but use dictionaries
instead, and they use the first row of the CSV file
as the keys of these dictionaries.
>>> import csv
>>> exampleFile = open('exampleWithHeader.csv')
>>> exampleDictReader = csv.DictReader(exampleFile)
>>> for row in exampleDictReader:
...     print(row['Timestamp'], row['Fruit'], row['Quantity'])
...
4/5/2015 13:34 Apples 73
4/5/2015 3:41 Cherries 85
4/6/2015 12:46 Pears 14
4/8/2015 8:59 Oranges 52
4/10/2015 2:07 Apples 152
4/10/2015 18:10 Bananas 23
4/10/2015 2:40 Strawberries 98

If the CSV file has no header row, pass the field names as a second argument to
DictReader():
>>> import csv
>>> exampleFile = open('example.csv')
>>> exampleDictReader = csv.DictReader(exampleFile, ['time', 'name', 'amount'])
>>> for row in exampleDictReader:
...     print(row['time'], row['name'], row['amount'])
...
4/5/2015 13:34 Apples 73
4/5/2015 3:41 Cherries 85
4/6/2015 12:46 Pears 14
4/8/2015 8:59 Oranges 52
4/10/2015 2:07 Apples 152
4/10/2015 18:10 Bananas 23
4/10/2015 2:40 Strawberries 98
>>> import csv
>>> outputFile = open('output.csv', 'w', newline='')
>>> outputDictWriter = csv.DictWriter(outputFile, ['Name', 'Pet', 'Phone'])
>>> outputDictWriter.writeheader()
>>> outputDictWriter.writerow({'Name': 'Alice', 'Pet': 'cat', 'Phone': '555-1234'})
20
>>> outputDictWriter.writerow({'Name': 'Bob', 'Phone': '555-9999'})
15
>>> outputDictWriter.writerow({'Phone': '555-5555', 'Name': 'Carol', 'Pet': 'dog'})
20
>>> outputFile.close()

Contents of output.csv:
Name,Pet,Phone
Alice,cat,555-1234
Bob,,555-9999
Carol,dog,555-5555
PROJECT: REMOVING THE HEADER FROM CSV FILES
• The program will need to open every file with
the .csv extension in the current working
directory, read in the contents of the CSV file,
and rewrite the contents without the first row
to a file of the same name.
• This will replace the old contents of the CSV
file with the new, headless contents.
At a high level, the program must do the
following:
1. Find all the CSV files in the current working
directory.
2. Read in the full contents of each file.
3. Write out the contents, skipping the first line,
to a new CSV file.
• At the code level, this means the program will
need to do the following:
1. Loop over a list of files from os.listdir(), skipping
the non-CSV files.
2. Create a CSV reader object and read in the
contents of the file, using
the line_num attribute to figure out which line
to skip.
3. Create a CSV writer object and write out the
read-in data to the new file.
Step 1: Loop Through Each CSV File
#! python3
# removeCsvHeader.py - Removes the header from all CSV files in the current
# working directory.
import csv, os
os.makedirs('headerRemoved', exist_ok=True)
# Loop through every file in the current working directory.
for csvFilename in os.listdir('.'):
    if not csvFilename.endswith('.csv'):
        continue # skip non-csv files
    print('Removing header from ' + csvFilename + '...')
    # TODO: Read the CSV file in (skipping first row).
    # TODO: Write out the CSV file.
Step 2: Read in the CSV File
    # This code runs inside the for loop from Step 1.
    # Read the CSV file in (skipping first row).
    csvRows = []
    csvFileObj = open(csvFilename)
    readerObj = csv.reader(csvFileObj)
    for row in readerObj:
        if readerObj.line_num == 1:
            continue # skip first row
        csvRows.append(row)
    csvFileObj.close()
Step 3: Write Out the CSV File Without the First Row
# Loop through every file in the current working directory.
for csvFilename in os.listdir('.'):
    if not csvFilename.endswith('.csv'):
        continue # skip non-CSV files
    --snip--
    # Write out the CSV file.
    csvFileObj = open(os.path.join('headerRemoved', csvFilename), 'w', newline='')
    csvWriter = csv.writer(csvFileObj)
    for row in csvRows:
        csvWriter.writerow(row)
    csvFileObj.close()

Output:
Removing header from NAICS_data_1048.csv...
Removing header from NAICS_data_1218.csv...
--snip--
Removing header from NAICS_data_9834.csv...
Removing header from NAICS_data_9986.csv...
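
The read and write steps can also be written more compactly. The sketch below is a variant, not the project's own code: remove_header() is a hypothetical helper that skips the first row with next() instead of checking line_num, and it is demonstrated on a small CSV file created in a temporary folder.

```python
import csv
import os
import tempfile


def remove_header(src_path, dst_path):
    """Copy a CSV file to dst_path without its first row; return the row count."""
    with open(src_path, newline='') as src:
        reader = csv.reader(src)
        next(reader, None)  # skip the header; None avoids an error on empty files
        rows = list(reader)
    with open(dst_path, 'w', newline='') as dst:
        csv.writer(dst).writerows(rows)
    return len(rows)


with tempfile.TemporaryDirectory() as folder:
    # Hypothetical input file with one header row and one data row.
    src = os.path.join(folder, 'example.csv')
    with open(src, 'w', newline='') as f:
        f.write('Timestamp,Fruit,Quantity\r\n4/5/2015 13:34,Apples,73\r\n')

    out_dir = os.path.join(folder, 'headerRemoved')
    os.makedirs(out_dir, exist_ok=True)
    dst = os.path.join(out_dir, 'example.csv')

    row_count = remove_header(src, dst)
    with open(dst, newline='') as f:
        result = f.read()

print(row_count)
print(repr(result))
```

The with blocks close each file automatically, so no explicit close() calls are needed.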
Ideas for Similar Programs
• Compare data between different rows in a
CSV file or between multiple CSV files.
• Copy specific data from a CSV file to an Excel
file, or vice versa.
• Check for invalid data or formatting mistakes
in CSV files and alert the user to these errors.
• Read data from a CSV file as input for your
Python programs.
JSON AND APIS
• JavaScript Object Notation (JSON) is a popular way to
format data as a single human-readable string.
• JSON is the native way that JavaScript
programs write their data structures and
usually resembles what
Python’s pprint() function would produce.
• Ex: {"name": "Zophie", "isCat": true,
"miceCaught": 0, "napsTaken": 37.5,
"felineIQ": null}
• JSON is useful to know, because many websites
offer JSON content as a way for programs to
interact with the website.
• This is known as providing an application
programming interface (API).
• Accessing an API is the same as accessing any
other web page via a URL.
• The difference is that the data returned by an API
is formatted (with JSON, for example) for
machines.
Using APIs, you could write programs that do the following:
• Scrape raw data from websites. (Accessing APIs is
often more convenient than downloading web
pages and parsing HTML with Beautiful Soup.)
• Automatically download new posts from one of
your social network accounts and post them to
another account. For example, you could take
your Tumblr posts and post them to Facebook.
• Create a “movie encyclopedia” for your personal
movie collection by pulling data from IMDb,
Rotten Tomatoes, and Wikipedia and putting it
into a single text file on your computer.
THE JSON MODULE
• Python’s json module handles all the details of
translating between a string with JSON data and
Python values with
the json.loads() and json.dumps() functions.
• JSON can’t store every kind of Python value. It can
contain values of only the following data types: strings,
integers, floats, Booleans, lists, dictionaries,
and NoneType.
• JSON cannot represent Python-specific objects, such
as File objects, CSV reader or writer objects, Regex
objects, or Selenium WebElement objects.
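
The sketch below illustrates both points. The dictionary and its keys are made up for illustration: values of the supported types survive a round trip through json.dumps() and json.loads(), while a Regex object makes json.dumps() raise a TypeError.

```python
import json
import re

# Hypothetical data that uses only types JSON can represent.
supported = {'name': 'Zophie', 'isCat': True, 'miceCaught': 0,
             'napsTaken': 37.5, 'toys': ['ball', 'string'], 'felineIQ': None}

# Convert to a JSON string and back again.
round_trip = json.loads(json.dumps(supported))
print(round_trip == supported)

# A compiled Regex object is Python-specific, so it cannot be stored as JSON.
try:
    json.dumps({'pattern': re.compile(r'\d+')})
    error_message = None
except TypeError as err:
    error_message = str(err)

print(error_message)
```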
Reading JSON with the loads() Function
>>> stringOfJsonData = '{"name": "Zophie", "isCat": true, "miceCaught": 0, "felineIQ": null}'
>>> import json
>>> jsonDataAsPythonValue = json.loads(stringOfJsonData)
>>> jsonDataAsPythonValue
{'name': 'Zophie', 'isCat': True, 'miceCaught': 0, 'felineIQ': None}
Writing JSON with the dumps() Function
>>> pythonValue = {'isCat': True, 'miceCaught': 0, 'name': 'Zophie', 'felineIQ': None}
>>> import json
>>> stringOfJsonData = json.dumps(pythonValue)
>>> stringOfJsonData
'{"isCat": true, "miceCaught": 0, "name": "Zophie", "felineIQ": null}'
