
Chapter 4
Importing/Exporting Data between CSV Files/MySQL and Pandas
4.1 Introduction

DataFrames are capable of storing any type of data in 2D
tabular form.

Most files that we use to store data, such as spreadsheet
files or database tables, also store the data in 2D tabular
formats.

Since a DataFrame holds data in a similar way, you can
transfer data from a dataframe to such data files or from
such files into a dataframe.

In this chapter we will learn how to transfer data among
.CSV files, dataframes and database tables.

(.CSV, short for Comma Separated Values, is a format that
stores data in comma-separated form.)
4.2 Transferring Data between .csv Files and DataFrames


"The acronym CSV is short for Comma-Separated Values.
The CSV format refers to tabular data that has been saved
as plain text where the data is separated by commas."

For example, the data of a table will be stored in CSV
format as shown below.
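(The table itself appeared as an image in the original slides. As a
hypothetical illustration, a table with columns Rollno, First_Name and
Last_Name would be stored in a .csv file as plain text like this:

Rollno,First_Name,Last_Name
1,Sarah,Kapur
2,Aman,Verma
3,Ravi,Singh

The names after the first row are illustrative, not from the source.)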
Advantages of CSV format

A simple, compact and ubiquitous format for data storage.

A common format for data interchange.

It can be opened in popular spreadsheet packages like
MS-Excel, Calc etc.

Nearly all spreadsheets and databases support
import/export to CSV format.
4.2.1 Loading Data From CSV to DataFrames

Python's Pandas library offers two functions:

read_csv( ) - This function helps you bring data from a
CSV file into a dataframe.

to_csv( ) - This function helps you write a dataframe's
data to a CSV file.

We can create a CSV file by saving the data of an MS-Excel
file in CSV format using the Save As command from the File
tab/menu and selecting Save As Type as CSV Format.
4.2.1A Reading From a CSV File to Dataframe
We can use the read_csv( ) function to read data from a
CSV file into a dataframe as per the following syntax:

<DF> = pandas.read_csv( <filepath> )

e.g.
import pandas as pd
df = pd.read_csv("c:\\data\\sample.csv")
print(df)

NOTE: read_csv( ) has taken the first row of the CSV file as the
column names for the dataframe.
4.2.1B Reading CSV File and Specifying Own Column Names
We may have a CSV file that does not have a top row
containing column headers.

Now if you read such a file by just giving the filepath, it
will take the top row as the column headers. But the top
row (1,Sarah,Kapur) is data, not column headings.

In such a situation, we can specify our own column headings
in read_csv( ) using the names argument.

df2 = pd.read_csv("c:\\data\\mydata.csv",
      names=["Rollno", "First_Name", "Last_Name"])

And now when you print df2, Python will show the data under
these column headings.

If we want the first row not to be used as the header, and at the
same time we don't want to specify column headings but rather go
with the default column headings which go like 0, 1, 2, 3...,
then simply give the argument header = None in read_csv( ):

df3 = pd.read_csv("c:\\data\\mydata.csv", header = None)
Now comes a situation where the first row of the CSV file stores
some column headings but you don't want to use them.

For this situation, we need to give two arguments along with the
file path: one for column headings, i.e. names = <column headings
sequence>, and another, skiprows = <n>.

df5 = pd.read_csv("c:\\data\\mydata.csv",
      names=["Rollno", "Name", "Marks"], skiprows = 1)
4.2.1C Reading Specified number of Rows from CSV File
Giving the argument nrows = <n> in read_csv( ) will read the
specified number of rows from the CSV file.

df6 = pd.read_csv("c:\\data\\mydata.csv",
      names=["Rollno", "Name", "Surname"], nrows=3)
print(df6)

Using the nrows argument you can extract top rows, and by
combining it with skiprows or the head( ) and tail( ) functions
you can extract bottom and middle rows too, as sketched below.
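Here is a minimal sketch of this idea, assuming a hypothetical
mydata.csv with at least six data rows:

import pandas as pd

cols = ["Rollno", "Name", "Surname"]

# First 2 data rows of the file
top = pd.read_csv("c:\\data\\mydata.csv", names=cols, nrows=2)

# Skip the first 2 rows, then read the next 2 (the "middle" rows)
middle = pd.read_csv("c:\\data\\mydata.csv", names=cols,
                     skiprows=2, nrows=2)

# Read everything, then keep only the last 2 rows with tail( )
bottom = pd.read_csv("c:\\data\\mydata.csv", names=cols).tail(2)

print(top, middle, bottom, sep="\n")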
4.2.1D Reading from CSV files having Separator Different from Comma
Some CSV files are created so that their separator character is
different from a comma, such as a semicolon (;) or a pipe
symbol (|).

To read data from such CSV files, you need to specify an additional
argument, sep = <separator character>.

If you skip this argument then the default separator character
(comma) is assumed. An example follows below.
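For instance, if a hypothetical file mydata2.csv used the pipe
symbol as its separator (lines like 1|Sarah|Kapur), it could be
read like this:

import pandas as pd

# Read a pipe-separated file by passing sep="|"
df4 = pd.read_csv("c:\\data\\mydata2.csv", sep="|",
                  names=["Rollno", "Name", "Surname"])
print(df4)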
4.2.2 Storing DataFrame's Data to CSV File
Sometimes we have data available in a dataframe and we want to
save that data in a CSV file. For this purpose, Python Pandas
provides the to_csv( ) function, which saves the data of a
dataframe in a CSV file:

<DF>.to_csv( <filepath> )
or <DF>.to_csv( <filepath>, sep = <separator_character> )

The separator character must be a one-character string only.

When no separator is mentioned, to_csv( ) will take the default
separator, comma.

Also, if a file exists with the same name at the given location,
it will be overwritten.

Open the file and you will find the data of dataframe df7 in it.

Let us save the same dataframe's data in another file, namely
new2.csv, but with the '|' character as the separator:

df7.to_csv("c:\\data\\new2.csv", sep = "|")
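Putting this together, here is a minimal sketch; the contents of
df7 are hypothetical, since the original slides showed them only
as an image:

import pandas as pd

# Hypothetical data standing in for the df7 shown in the slides.
df7 = pd.DataFrame({"Rollno": [1, 2, 3],
                    "Name": ["Sarah", "Aman", "Ravi"],
                    "Surname": ["Kapur", "Verma", "Singh"]})

df7.to_csv("c:\\data\\new2.csv", sep="|")  # pipe-separated output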
4.2.2A Handling NaN Values with to_csv( )
Sometimes your dataframe has some missing values. If your
dataframe doesn't have missing values, then execute the following
statements on the dataframe to insert some (np.nan comes from the
numpy library, imported as np):

import numpy as np
df7.loc[3, "Name"] = np.nan
df7.loc[0, "Surname"] = np.nan

Now, if you store this dataframe in a CSV file by giving the
following command:

df7.to_csv("c:\\data\\new3.csv", sep = "|")

then, by default, the missing/NaN values are stored as empty
strings in the CSV file.

You can specify your own string to be written for missing/NaN
values by giving the argument na_rep = <string>.

The following statement will write NULL in place of NaN values in
the CSV file:

df7.to_csv("c:\\data\\new3.csv", sep = "|", na_rep = "NULL")
4.3 Transferring Data between DataFrames and MySQL
An SQL database is a relational database having data in tables
called relations.

It uses a special type of query language, Structured Query
Language (SQL), to query and manipulate data and to communicate
with the database.

There are many SQL databases available, such as MySQL, SQL
Server, SQLite etc.

We will learn to import/export data from a MySQL database to a
dataframe in a Python program and vice versa.

Installing the mysql connector or pymysql packages

In order to connect with MySQL from within a Python program, you
must have the mysql connector package (mysql-connector-python)
or pymysql installed.

For this, we can open the command shell and go to the Python
installation folder with the following command:

C:\WINDOWS\system32>cd <Python folder path>

And then type the following command at the prompt to install the
mysql-connector-python or pymysql package:

C:\<path>>pip install mysql-connector-python

C:\<path>>pip install pymysql

Once you have the mysql-connector-python or pymysql package
installed, you can import/export data from a MySQL database into
your Python program.

4.3.1 Bringing Data from MySQL Database into a DataFrame

There are five main steps that must be followed in order to
create a database connectivity application.

Step 1: Start Python and import the packages required for database
programming.

Step 2: Open a connection to the database.

Step 3: Execute the SQL command and fetch rows into a dataframe.

Step 4: Process as desired.

Step 5: Close the connection.
Step 1: Import Required Libraries
import pandas as pd
import mysql.connector as sqltor

Step 2: Open a Connection to MySQL Database


The connect( ) function establishes a connection to a MySQL
database:

<connection-object> = mysql.connector.connect( host = <host-name>,
      user = <username>, passwd = <password> [, database = <database>] )

e.g.

import mysql.connector as sqltor
mycon = sqltor.connect( host = "localhost", user = "root",
      passwd = "MyPass", database = "test" )

Here mycon is the connection object returned by connect( ); user
and passwd must be the login id and password of your MySQL
installation, and database must be the name of an existing
database.
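It is good practice to check that the connection succeeded before
going further; mysql.connector connection objects provide an
is_connected( ) method for this:

# Verify that the connection to MySQL was established.
if mycon.is_connected():
    print("Successfully connected to the database")
else:
    print("Connection failed")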
Step 3: Execute SQL command and fetch rows into a Dataframe
Once the connection to the SQL database is established, read data
from a table into a dataframe using the read_sql( ) function:

<DF> = pandas.read_sql( "<SQL statement>", <connection object> )

e.g.
df = pd.read_sql("SELECT * FROM Student;", mycon)

Here, make sure that the SQL statement given inside the read_sql( )
function:
(i) must end with a semicolon, and
(ii) should be enclosed in quotes.
The full code for all five steps is sketched below.
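A minimal end-to-end sketch of the five steps, assuming the
database test and table Student used in the examples above:

# Step 1: import the required packages
import pandas as pd
import mysql.connector as sqltor

# Step 2: open a connection to the database
mycon = sqltor.connect(host="localhost", user="root",
                       passwd="MyPass", database="test")

# Step 3: execute an SQL command and fetch rows into a dataframe
df = pd.read_sql("SELECT * FROM Student;", mycon)

# Step 4: process as desired
print(df)

# Step 5: close the connection
mycon.close()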
4.3.2 Framing Flexible SQL Queries with User Data

Sometimes you may need to run queries which are based on some
parameters or values that are provided from outside.

Such queries are called parameterised queries.

To execute parameterised queries in a mysql.connector connection,
you need to form an SQL query string that includes the values of
the parameters.

String Templates with % formatting

In this style, string formatting uses this general form: f % v
f - a template string
v - the value or values to be formatted

If multiple values are to be formatted, v must be a tuple. For
this, you can write the SQL query in a string but use a %s code
in place of each value to be provided as a parameter.

e.g. "select * from student where marks > %s"

The above string is an incomplete string; to complete it, you must
provide a tuple of values prefixed with %.

e.g. If you want to provide the value 70 for the %s placeholder,
then the query will be:

"select * from student where marks > %s" % (70,)

Here the quoted string is the template f and (70,) is the value v.
Now you can store this query string in a variable and then execute
that variable through the read_sql( ) function, e.g.

sname = input("Which student's record do you want to see? Enter name: ")

qrystr = "select * from student where name = '%s';" % (sname,)

df1 = pd.read_sql(qrystr, <connection>)
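A minimal runnable sketch of such a parameterised query, assuming
the connection details and student table from the earlier examples:

import pandas as pd
import mysql.connector as sqltor

mycon = sqltor.connect(host="localhost", user="root",
                       passwd="MyPass", database="test")

# Build the query string from user input using % formatting
sname = input("Which student's record do you want to see? Enter name: ")
qrystr = "select * from student where name = '%s';" % (sname,)

df1 = pd.read_sql(qrystr, mycon)
print(df1)
mycon.close()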


4.4 Exporting a DataFrame's Data as a Table in MySQL Database

For writing onto a MySQL database, we shall connect to the MySQL
database using the pymysql library together with the
create_engine( ) function of the sqlalchemy library. There are
some reasons behind this:

mysql.connector does not support writing onto a MySQL database
using to_sql( ).

For writing onto MySQL version 5.5 or higher, to_sql( ) requires a
connection with strong ORM support, which is created through
sqlalchemy's create_engine( ) function.

The pymysql library ensures that data moves smoothly from Python
to MySQL.

For exporting a dataframe onto a MySQL table, make sure to install
these two libraries using pip install <library name>, and follow
the steps given below:

i) Import the pandas, pymysql and sqlalchemy libraries.
ii) Establish a connection to the MySQL database.
iii) Write the dataframe's data onto a MySQL table.
Step 1: Import required libraries by issuing commands as:

import pandas as pd
import pymysql
from sqlalchemy import create_engine  # only create_engine is needed

Step 2: Establish connection to database using create_engine( )

<db engine> = create_engine("mysql+pymysql://<user>:<password>@localhost/<MySQL database>")
<connection name> = <db engine>.connect( )

e.g.
engine = create_engine("mysql+pymysql://root:MyPass@localhost/School")
con = engine.connect( )

Step 3: Write dataframe's data onto MySQL table using to_sql( )

In this step, you can now write a dataframe in the form of a table
by using to_sql( ):

<df>.to_sql( <tablename>, <connection> [, index = True]
      [, if_exists = "append" | "replace" | "fail"] )
Consider a dataframe tDf.

Let us create a table Topper1 in the MySQL database namely test
using the dataframe tDf, as sketched below.
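A minimal sketch of the export; the contents of tDf are
hypothetical, since the original slides showed them only as an
image:

import pandas as pd
from sqlalchemy import create_engine  # pymysql must also be installed

# Hypothetical data standing in for the tDf shown in the slides.
tDf = pd.DataFrame({"Rollno": [1, 2, 3],
                    "Name": ["Sarah", "Aman", "Ravi"],
                    "Marks": [98, 96, 95]})

engine = create_engine("mysql+pymysql://root:MyPass@localhost/test")
con = engine.connect()

# Create/replace table Topper1 from the dataframe; don't write the index.
tDf.to_sql("Topper1", con, if_exists="replace", index=False)
con.close()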
