
Data File Handling - 1

The document provides an overview of data file handling in Python, detailing the types of files (text and binary), methods for opening, reading, writing, and closing files, as well as the use of the 'with' statement for resource management. It also covers binary file operations, including serialization with the pickle module, and introduces CSV file handling, highlighting its advantages and the built-in CSV module for reading and writing tabular data. Key concepts such as file modes, file pointer manipulation, and error handling are also discussed.

Uploaded by

aarushck911

Data File Handling

PREPARED BY: VIJITH P
PGT CS
SILVER HILLS PUBLIC SCHOOL
Data File ?

 A file is a stream of bytes comprising relevant data.


 It is used to store data permanently.
 Data objects in a file may be in the form of character sequences, or more complex
objects such as lists, dictionaries, or objects of user defined types.
 Data files are files that store and preserve data related to a particular application.
 There are mainly two types of files used in Python:
 Text files.
 Binary files.
Text File

 Text file
 A text file is a human-readable file that consists of a sequence of characters stored in
the form of ASCII or Unicode.
 Each line in the text file is terminated by an End Of Line (EOL) character that may vary
across operating systems.
 A text file is usually recognized by its name having the .txt extension.
 CSV file
 In a comma separated file the data values in each line are separated by commas.
 A delimited text file makes use of a delimiter to separate the contents in each line.
 Similarly in Tab separated file, the data values in each line are separated by tabs.
Binary File

 The data in a binary file is stored as machine-readable data objects, in a sequence of
binary digits (0s and 1s) without any specific delimiters.
 For example, in a text file the number -12345 will be stored as a sequence of six bytes (one per character), whereas
in a binary file it may be stored as an integer object requiring 16, 32, or 64 bits depending on
the size of the integer object.

 Unlike text files, binary files do not require a comma, space, or end of line character.

 Binary files can represent a wide range of data types, including numbers, images, audio, video,
and executable code. They store data in its raw, binary form, without any need for
additional characters to separate or delineate different pieces of information.
Opening and Closing Files

 For opening a file, Python provides the built-in function open().


 Syntax:
 f=open(filename,accessmode)
 f – the file handle or file object.
 filename – the name by which the file is stored on the computer.
 accessmode – the mode in which we need to access the file.
 Modes for opening a file:
 r: read mode (the default); the file must already exist
 w: write mode; creates the file, or erases its contents if it already exists
 a: append mode; writes at the end of the file
 r+, w+: both read and write mode (w+ erases existing contents, r+ does not)
 a+: both read and append mode.
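The modes listed above can be tried out with a short sketch. The file name notes.txt and the temporary folder are assumptions made only for this illustration.

```python
import os
import tempfile

# Hypothetical file inside a temporary folder (not from the slides)
path = os.path.join(tempfile.mkdtemp(), "notes.txt")

try:
    f = open(path, "r")          # 'r' needs an existing file
except FileNotFoundError:
    missing = True               # the file has not been created yet
else:
    missing = False
    f.close()

f = open(path, "w")              # 'w' creates the file
f.write("first line\n")
f.close()

f = open(path, "r+")             # 'r+' reads and writes, keeping old data
f.write("F")                     # overwrites the first character only
f.close()

f = open(path, "r")
content = f.read()
f.close()
print(content)
```

Opening the same file again with 'w' or 'w+' would erase the line written here, which is why 'r+' or 'a' is the safer choice for existing data.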
File object

 File objects: Access to the file is made using its handle.


 It enables us to perform different types of file operations on the file such as writing
into the file, reading from a file and appending data to the file.

 Closing Files:
 Once the necessary operations on a file have been carried out, it should be closed
using a call to the function close().
 Syntax:
 fileobject.close()
READING FROM A FILE

 read() : Reads the entire data from the file. It starts at the current position of the file pointer (the beginning of a
newly opened file) and reads to the end of the file, returning the data as a string.
 read(n) : Reads n characters from the file, starting from the current position of the file pointer.
If the file holds fewer than 'n' characters, it reads until the end
of the file.
 readline() : The readline() function reads one line of the file, including its end-of-line character, and returns it in the form
of a string. For a specified number n, readline(n) reads at most n characters. However, it
does not read more than one line, even if 'n' exceeds the length of the line.
 readlines() : Reads all lines from the file and returns them as a list of strings,
each ending with the new line character.
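A minimal sketch of the four reading calls described above. The file poem.txt and its three lines are made up for the example.

```python
import os
import tempfile

# Hypothetical sample file created only for this demonstration
path = os.path.join(tempfile.mkdtemp(), "poem.txt")
f = open(path, "w")
f.write("Line one\nLine two\nLine three\n")
f.close()

f = open(path, "r")
whole = f.read()             # the entire file as one string
f.close()

f = open(path, "r")
first_four = f.read(4)       # the first 4 characters
rest_of_line = f.readline()  # continues from the file pointer to the end of line 1
short = f.readline(3)        # at most 3 characters of line 2
f.close()

f = open(path, "r")
lines = f.readlines()        # list of lines, each ending with '\n'
f.close()

print(lines)
```

Each call continues from wherever the previous call left the file pointer, which is why the file is reopened before read() and readlines().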
WRITING TO FILE

 write(string): This method takes a string as parameter and writes it to the text file.
It does not add an end-of-line character, so we have to add '\n' to the end of the string to finish a line. '\n' is
a single character in Python, although on Windows it is stored in a text file as two bytes. As the argument to the function has to
be a string, a numeric value must be converted to a string (for example with str()) before it is stored.
 Syntax: fileobject.write(string)
 The write() method actually writes the data onto a buffer. When the close() method is
executed, the contents of this buffer are moved to the file located on
permanent storage.
 Program1,2.
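As an illustration of write(), the sketch below stores a string and a number. The file name and the values are assumptions for the example, not the programs referred to on the slide.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "result.txt")  # hypothetical file

marks = 87
f = open(path, "w")
f.write("Name: Asha\n")                 # '\n' must be added by us
f.write("Marks: " + str(marks) + "\n")  # numbers are converted with str()
f.close()                               # the buffer is written to disk here

f = open(path, "r")
saved = f.read()
f.close()
print(saved)
```

Passing the integer directly, as in f.write(marks), would raise a TypeError because write() accepts only strings in text mode.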
WRITING TO FILE (continue)

 writelines(): This method writes a sequence of strings, such as a list or
tuple, into a file. It does not add end-of-line characters between the strings.
 Syntax: fileobject.writelines(sequence)
 program
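A small sketch of writelines(), assuming a hypothetical file colours.txt. Note that each string carries its own '\n'.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "colours.txt")  # hypothetical file

colours = ["red\n", "green\n", "blue\n"]   # newline included in every item
f = open(path, "w")
f.writelines(colours)          # a list of strings
f.writelines(("cyan", "-ish")) # a tuple works too; no newline is inserted
f.close()

f = open(path, "r")
saved = f.read()
f.close()
print(saved)
```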
Use of with statement

 The with statement is used together with the open() function to open a file. We can also
use this statement to group file operation statements within a block. Using
with ensures that the file is closed and the resources allocated to the file object are released
automatically once the block ends, even if an error occurs.

 Syntax:

 with open("file name","mode") as fileobject:

    file manipulation statements

 program4
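The effect of the with statement can be checked through the closed attribute of the file object. This is a minimal sketch; the file name is an assumption.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "hello.txt")  # hypothetical file

with open(path, "w") as f:
    f.write("Hello\n")
    inside = f.closed      # still open inside the block

after = f.closed           # closed automatically, no close() call needed

with open(path, "r") as f:
    text = f.read()

print(inside, after, text)
```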
Appending file

 Append means – ‘to add to’; if we need to add more data to a file
which already has some data in it, we will be appending data.
 Syntax:
<file object>=open(<file name>,'a')
 program5
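A short sketch of appending, using a made-up file days.txt. The first open() creates the file and the second one adds to it without disturbing the existing line.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "days.txt")  # hypothetical file

f = open(path, "w")
f.write("Monday\n")
f.close()

f = open(path, "a")        # existing data is kept
f.write("Tuesday\n")       # written after the last line
f.close()

f = open(path, "r")
days = f.readlines()
f.close()
print(days)
```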
Absolute and Relative Path

 The two most important attributes of a file are the file name and path.
 The path identifies a file’s location on the computer.
 A file path can be specified in two ways
 An absolute path, which always starts at the root folder.
 A relative path, which is relative to the current working directory of the program.
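The difference between the two kinds of path can be seen with the os module. The folder data and the file marks.txt are hypothetical names; nothing is created on disk.

```python
import os

relative = os.path.join("data", "marks.txt")   # hypothetical relative path
absolute = os.path.abspath(relative)           # resolved against the current directory

print("Working directory:", os.getcwd())
print("Relative path    :", relative, os.path.isabs(relative))
print("Absolute path    :", absolute, os.path.isabs(absolute))
```

The absolute form printed here changes with the folder the program is run from, while the relative form stays the same.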
Representation
Representation 2
Representation 3
The flush() Function

 In general, the data written to a file is temporarily stored in a file buffer and
transferred from the buffer to the file on disk when the buffer fills up or the close() function is invoked.
 The flush() function can be used to force the content of Python's buffer to be written to
the file without waiting for the file to be closed.
 This makes the content in the buffer available in the file on disk
right away.
 Syntax: <file_object>.flush()
 Program (Eg:)
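One way to observe flush() is to read the file through a second file object while the first is still open. This is a sketch with an assumed file name; whether the first read sees anything depends on the buffer, so only the read after flush() is relied upon.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "log.txt")  # hypothetical file

f = open(path, "w")
f.write("pending data")

reader = open(path, "r")
before = reader.read()     # usually empty: the text is still in the buffer
reader.close()

f.flush()                  # force the buffer onto the disk

reader = open(path, "r")
after = reader.read()      # the text is now in the file
reader.close()

f.close()
print(repr(before), repr(after))
```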
Random Access Using seek() and tell()
 Accessing and Manipulating Location of File pointer – Random Access
 Python provides two functions that help you manipulate the position of file-pointer and thus you
can read/write from the desired location of the file. The functions are:
 tell() – returns the current position of the file pointer.
 seek() – for changing the position of the file pointer to a desired location.

 The seek() function : The seek() function is used to change the position of the file pointer by
placing it at a specific position in the opened file. seek() can be used in two ways.
 Absolute Positioning : It gives the actual position at which the file pointer has to be
placed.
 Syntax: <file-handle>.seek(file_location)
 Eg: f.seek(20) – This statement moves the file pointer to the 20th byte in the file no matter where
it currently is.
Random Access Using seek() and tell() (continue)

 Relative Positioning : seek() takes two arguments, offset (the number of bytes by which the file pointer is moved)
and from-what (the reference position from which the file pointer is displaced forward or
backward). from-what takes one of three values: 0 - beginning, 1 - current position, 2 -
end of the file.
 Syntax: <file-handle>.seek(off-set, from-what)

In seek(), the first argument is the position to set the file pointer and the second argument is the reference point.
Note: The beginning (0) is the default reference point. The reference
points (current and end) are only used in binary files.
The tell() function:

The tell() function returns the current position of the file pointer in the file.
Syntax: <file_handle>.tell()
program
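The following sketch moves the file pointer with all three reference points and checks it with tell(). It uses a binary file because the current and end reference points are meant for binary files; the file name and contents are assumptions.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "letters.dat")  # hypothetical file

f = open(path, "wb")
f.write(b"0123456789ABCDEF")     # 16 bytes, positions 0 to 15
f.close()

f = open(path, "rb")
start = f.tell()                 # pointer at the beginning

f.seek(5)                        # absolute: 5 bytes from the beginning
a = f.read(3)                    # reads positions 5, 6, 7
pos_after = f.tell()             # pointer has advanced past what was read

f.seek(2, 1)                     # relative: 2 bytes forward from the current position
b = f.read(1)

f.seek(-3, 2)                    # relative: 3 bytes back from the end
c = f.read()
f.close()

print(start, a, pos_after, b, c)
```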
Standard File Streams

 We use file objects to work with data files; similarly, input/output from standard I/O
devices is performed using standard I/O stream objects.
 In order to work with the standard I/O streams, we need to import the sys module.
 The standard streams available in Python are:
 Standard input stream.
 Standard output stream.
 Standard error stream.
 The methods which are available for I/O operations on them are:
 read() - for reading data from the keyboard; read(n) reads at most n characters.
 write() - for writing data on the console, i.e. the monitor.
Standard File Streams

 The three standard streams are described as follows:


 sys.stdin: The stream that reads from standard input, typically the keyboard.
 sys.stdout: Data written to sys.stdout typically appears on your screen.
 sys.stderr: It is similar to sys.stdout as it also prints directly to the console, but it
is meant for exceptions and error messages along with debugging comments.
 program
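A sketch of the three streams in use. The function greet() is a hypothetical helper written for this example; by default it uses sys.stdin, sys.stdout and sys.stderr, and the demonstration below it substitutes in-memory streams so that it runs without a keyboard.

```python
import io
import sys

def greet(in_stream=sys.stdin, out_stream=sys.stdout, err_stream=sys.stderr):
    """Ask for a name; greet on the output stream or complain on the error stream."""
    out_stream.write("Enter your name: ")
    out_stream.flush()
    name = in_stream.readline().strip()
    if name:
        out_stream.write("Hello, " + name + "\n")
    else:
        err_stream.write("Error: no name entered\n")
    return name

# Demonstration with in-memory streams standing in for keyboard and screen
fake_out = io.StringIO()
fake_err = io.StringIO()
greet(io.StringIO("Asha\n"), fake_out, fake_err)
greet(io.StringIO("\n"), fake_out, fake_err)
```

Calling greet() with no arguments reads from the keyboard and writes to the console.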
Working with Binary Files

 If you need to write and read non-simple objects like dictionaries, tuples, lists or
nested lists to and from files while keeping their structure as it is, the better
choice is to use binary files.
 For this purpose objects are often serialized and then stored in binary files.
 The pickle module is used for serializing and de-serializing any Python object
structure.
 Serialization is the process of transforming data or an object in memory into a stream
of bytes. This stream of bytes can then be stored in a binary file on disk or in a
database.
 The serialization process is also called pickling.
 While reading the contents of the file, the reverse process takes place: the byte stream is
converted back into an object hierarchy. This is known as de-serialization or unpickling.
Working with Binary Files

 The pickle module can be used to store any kind of object in a binary file as it allows us
to store Python objects with their structure.
 The following steps are to be taken for performing reading and writing operations on
a binary file.
 1. Import the pickle module using the import pickle statement.
 2. Open the binary file with the required access mode.
 3. Process the binary file by writing/reading objects using the pickle module's methods.
 4. Once done, close the file.
Working with Binary Files

 Writing onto a Binary File – Pickling


 In order to write an object onto a binary file opened in write mode, we use the dump()
function of the pickle module.
 Syntax: pickle.dump(<object-to-be-written>,<filehandle-of-open-file>)
#programs-2

 Reading from a Binary File – unpickling

 For reading data from a file, we have to use load() function of pickle module as it would
then unpickle the data coming from the file.
 Syntax: <object>=pickle.load(<file handle>)
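A minimal sketch of pickling and unpickling one object, following the four steps given earlier. The dictionary and the file name student.dat are assumptions for the example.

```python
import os
import pickle                                   # step 1: import the module
import tempfile

path = os.path.join(tempfile.mkdtemp(), "student.dat")  # hypothetical file

student = {"roll": 1, "name": "Asha", "marks": [78, 92, 85]}

f = open(path, "wb")                            # step 2: open in binary write mode
pickle.dump(student, f)                         # step 3: pickling
f.close()                                       # step 4: close

f = open(path, "rb")                            # binary read mode
loaded = pickle.load(f)                         # unpickling
f.close()

print(loaded)
```

The loaded object is a new dictionary with the same structure, including the nested list.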
Working with Binary Files

 The pickle.load() function raises EOFError (a runtime exception) when
you reach the end of the file while reading from it.
 You can handle this in one of the following ways:
 use try and except blocks, or use the with statement.
 Syntax:
<filehandle>=open(<file_name>,<readmode>)
try:
    <object>=pickle.load(<file handle>)
    #other statements
except EOFError:
    <filehandle>.close()
Working with Binary Files

 Syntax:
 with open(<filename>,<mode>) as <file handle>:
    # use pickle.load here in this with block
    # perform the file manipulation tasks in this with block
You need not close the file explicitly with the with statement; it is closed
automatically, even if an exception such as EOFError occurs inside the block.
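Both techniques can be combined to read every record of a binary file. The function read_all() and the sample records are hypothetical; the loop stops when pickle.load() raises EOFError, and the with statement closes the file.

```python
import os
import pickle
import tempfile

def read_all(path):
    """Return a list of every pickled object stored in the file."""
    records = []
    with open(path, "rb") as f:
        try:
            while True:                      # keep loading until the file ends
                records.append(pickle.load(f))
        except EOFError:
            pass                             # end of file reached: nothing more to read
    return records

# Hypothetical sample data
path = os.path.join(tempfile.mkdtemp(), "items.dat")
with open(path, "wb") as f:
    pickle.dump(["pen", 10], f)
    pickle.dump(["book", 45], f)
    pickle.dump(["bag", 350], f)

items = read_all(path)
print(items)
```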
Working with Binary Files

 Appending Records in a binary file.


 Appending records to a binary file is similar to writing; the only thing you have to ensure is that
you open the file in append mode (ab).
 A file opened in append mode will retain the previous records and append the new records
written in the file.
 The same dump() function is used to append data to a binary file.
 Program
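A sketch of appending a record, assuming a hypothetical file students.dat that already holds one record.

```python
import os
import pickle
import tempfile

path = os.path.join(tempfile.mkdtemp(), "students.dat")  # hypothetical file

with open(path, "wb") as f:                 # create the file with one record
    pickle.dump({"roll": 1, "name": "Asha"}, f)

with open(path, "ab") as f:                 # append mode keeps the old record
    pickle.dump({"roll": 2, "name": "Ravi"}, f)

records = []
with open(path, "rb") as f:
    try:
        while True:
            records.append(pickle.load(f))
    except EOFError:
        pass                                # all records have been read

print(records)
```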
Working with Binary Files

 Searching in a file
 Searching in binary file is done in the following way.
 1. Open the file in read mode
 2. Read the file contents record by record
 3. In every read record, look for the desired search key
 4. If found, process as desired
 5. If not found, read the next record and look for the desired search-key
 6. If the search key is not found in any of the records, report that no such value was found in the file.

 Program text book
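The six steps above can be sketched as a function. This is not the textbook program; search() and the record layout (a dictionary with a roll key) are assumptions made for the illustration.

```python
import os
import pickle
import tempfile

def search(path, roll):
    """Return the record with the given roll number, or None if it is absent."""
    with open(path, "rb") as f:              # step 1: open in read mode
        try:
            while True:
                record = pickle.load(f)      # step 2: read record by record
                if record["roll"] == roll:   # step 3: compare with the search key
                    return record            # step 4: found
                # step 5: otherwise the loop reads the next record
        except EOFError:
            return None                      # step 6: key not in any record

# Hypothetical sample data
path = os.path.join(tempfile.mkdtemp(), "students.dat")
with open(path, "wb") as f:
    for rec in ({"roll": 1, "name": "Asha"}, {"roll": 2, "name": "Ravi"}):
        pickle.dump(rec, f)

found = search(path, 2)
not_found = search(path, 7)
print(found if found else "No such value found in the file")
```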


Working with Binary Files

 Updating in a binary file


 Updating an object means changing its value(s) and storing it again. Updating is done in the
following way:
 1. Locate the record to be updated by searching for it.
 2. Make changes in the loaded record in memory.
 3. Write it back onto the file at the exact location of the old record.

The first two steps can be performed easily, but for the third step you need to know the
exact file location of the record in order to write the updated data into the file.
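One possible way to find the exact location is to call tell() just before each record is loaded and seek() back to that position for writing. The sketch below assumes the updated record pickles to the same number of bytes as the old one (true here because only a small integer changes); update_marks() is a hypothetical helper, and it refuses to overwrite when the size differs because that would damage the records that follow.

```python
import os
import pickle
import tempfile

def update_marks(path, roll, new_marks):
    """Overwrite the marks of one record in place; return True if it was found."""
    with open(path, "rb+") as f:
        while True:
            pos = f.tell()                    # where this record starts
            try:
                rec = pickle.load(f)          # step 1: locate the record
            except EOFError:
                return False
            if rec["roll"] == roll:
                old_size = f.tell() - pos
                rec["marks"] = new_marks      # step 2: change it in memory
                data = pickle.dumps(rec)
                if len(data) != old_size:     # assumption check: same size only
                    raise ValueError("record size changed; rewrite the whole file instead")
                f.seek(pos)                   # step 3: back to the old location
                f.write(data)
                return True

# Hypothetical sample data
path = os.path.join(tempfile.mkdtemp(), "students.dat")
with open(path, "wb") as f:
    for rec in ({"roll": 1, "marks": 70}, {"roll": 2, "marks": 78}, {"roll": 3, "marks": 64}):
        pickle.dump(rec, f)

updated = update_marks(path, 2, 85)

records = []
with open(path, "rb") as f:
    try:
        while True:
            records.append(pickle.load(f))
    except EOFError:
        pass
print(updated, records)
```

When the new record may be longer or shorter, a simpler alternative is to load all records into a list, change the one required, and write the whole list back.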
Working with Binary Files

 Implementation of wb+,rb+ and ab+ modes in Binary files


 Use of ‘wb+’ (Write and Read) in a binary file

 The 'wb+' mode enables a binary file to be opened in write as well as read mode. It means that,
after writing the content to the file, you need not re-open it in read mode to access the
records. When writing to the file is over, you must use the seek(0) function to bring the file
pointer back to the beginning of the file. After that, reading can be done just as in read
mode. Note that 'wb+', like 'wb', erases any existing contents of the file.

 Program
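A minimal sketch of the 'wb+' mode with assumed sample data: two records are written and then read back through the same file object after seek(0).

```python
import os
import pickle
import tempfile

path = os.path.join(tempfile.mkdtemp(), "stock.dat")  # hypothetical file

with open(path, "wb+") as f:
    pickle.dump(["pen", 10], f)      # writing
    pickle.dump(["book", 45], f)
    end_position = f.tell()          # pointer is at the end after writing
    f.seek(0)                        # bring the pointer back to the beginning
    first = pickle.load(f)           # reading without re-opening the file
    second = pickle.load(f)

print(first, second)
```

Without the seek(0) call, pickle.load() would start at the end of the file and raise EOFError.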
INTRODUCTION TO CSV

 CSV means Comma Separated Values.


 It is a human-readable format which is extensively used to store tabular data from a
spreadsheet or database.
 A CSV file stores tabular data (numbers and text) in plain text.
 A CSV file is a delimited text file that uses commas to separate values.
 Data in the CSV format can be imported to and exported from programs that store data in
tables, such as Microsoft Excel or OpenOffice Calc.
 Each line of the CSV file is called a data record.
 Each record consists of fields separated by commas.
INTRODUCTION TO CSV

 Why use CSV?


 Storing huge and exponentially growing data set.
 Processing data having complex structure (structured, unstructured, semi-structured).
 Bringing huge amount of data to computation unit becomes a bottleneck.
INTRODUCTION TO CSV

 Advantages of using CSV files.


 CSV is faster to handle.
 CSV is smaller in size.
 CSV is easy to generate and import onto a spreadsheet or database.
 CSV is human readable and easy to edit manually.
 CSV is simple to implement and parse.
 CSV is supported by almost all applications that work with tabular data.
CSV FILE HANDLING IN PYTHON

 For working with CSV files in Python, there is a built-in module called csv; it is used to read
and write tabular data in CSV format.
 There are two basic operations that can be carried out on a CSV file.
 Reading a CSV file.
 Writing to a CSV file.
 Reading from a CSV file.
 Reading from a CSV file is done using the reader object.
 We use the open() function to open a CSV file; it returns a file object.
 This file object is passed to the csv.reader() function, which creates a special type of object (the reader) to access the
CSV file.
 The reader object is an iterable that gives us access to each line of the CSV file as a list of
fields.
 You can also use next() directly on it to read the next line of the CSV file.
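A sketch of reading with csv.reader(). The file marks.csv and its contents are made up for the example; the file is opened with newline="" as the csv module's documentation recommends.

```python
import csv
import os
import tempfile

# Hypothetical CSV file created only for this demonstration
path = os.path.join(tempfile.mkdtemp(), "marks.csv")
with open(path, "w", newline="") as f:
    f.write("roll,name,marks\n1,Asha,78\n2,Ravi,91\n")

with open(path, "r", newline="") as f:
    reader = csv.reader(f)          # the reader object
    header = next(reader)           # next() reads one line: the heading row
    rows = [row for row in reader]  # the remaining lines, each a list of fields

print(header)
print(rows)
```

Every field comes back as a string, so numeric fields such as marks need int() or float() before calculations.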
CSV FILE HANDLING IN PYTHON

 Writing to a CSV File


 The csv.writer() function is used to write contents to a CSV file.
 It takes a file object and returns a writer object that converts the user's data
into delimited strings.
 These strings are written to the CSV file using the writerow()
method.
 The writerow() method allows us to write a list of fields to the file. The fields
can be strings or numbers or both.
 There is no need to add an EOL character because writerow() does it for you as necessary.
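A sketch of writing with csv.writer() and writerow(), using an assumed file name and sample rows. The file is read back with csv.reader() to confirm what was stored.

```python
import csv
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "marks.csv")  # hypothetical file

with open(path, "w", newline="") as f:
    writer = csv.writer(f)                      # the writer object
    writer.writerow(["roll", "name", "marks"])  # heading row, all strings
    writer.writerow([1, "Asha", 78])            # numbers and strings together
    writer.writerow([2, "Ravi", 91])            # no '\n' needed

with open(path, "r", newline="") as f:
    rows = list(csv.reader(f))

print(rows)
```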
