Notes for collecting data from CSV files in Scikit Learn.

Working With CSV Files

CSV files contain comma-separated values. These files are widely used in Machine
Learning.

We mostly use the Pandas Python library for manipulating CSV files.

Reading a CSV file stored locally

import pandas as pd
df = pd.read_csv('file_name')
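As a minimal sketch of this call (using io.StringIO to stand in for a file on disk, with made-up column names, since read_csv accepts both a path and a file-like object):

```python
import pandas as pd
from io import StringIO

# StringIO stands in for a local file; a real path string works the same way.
csv_data = StringIO("a,b\n1,2\n3,4")
df = pd.read_csv(csv_data)
```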

Opening/Reading a CSV file from a URL

import requests
from io import StringIO

url = 'url_here'
headers = {"User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:66.0) Gecko/20100101 Firefox/66.0"}
req = requests.get(url, headers=headers)
data = StringIO(req.text)

df = pd.read_csv(data)
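The key step above is wrapping the response text in StringIO so read_csv can treat it like a file. A minimal offline sketch of that step (with a hypothetical string standing in for req.text, so no network is needed):

```python
import pandas as pd
from io import StringIO

# Hypothetical response body; in practice this would be req.text from requests.get().
response_text = "name,score\nA,1\nB,2"
df = pd.read_csv(StringIO(response_text))
```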

Sep parameter
By default read_csv reads comma-separated data, which is what CSV files contain.
But when we have to deal with TSV files we have to set sep='\t', as they are
tab-separated. E.g.:

df = pd.read_csv('file_name', sep='\t')
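A small self-contained sketch of this (StringIO standing in for a .tsv file, column names made up):

```python
import pandas as pd
from io import StringIO

# Tab-separated data; without sep="\t" pandas would read each line as one column.
tsv_data = StringIO("a\tb\n1\t2\n3\t4")
df = pd.read_csv(tsv_data, sep="\t")
```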


Header row missing / names parameter

When we don't have a heading row for our data, we can explicitly supply column
names using the names parameter of read_csv().

df = pd.read_csv('file_name', sep='\t', names=['', '', '', ''])
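Sketching this with headerless in-memory data (the column names here are hypothetical):

```python
import pandas as pd
from io import StringIO

# No header line in the data; names= supplies the column labels.
data = StringIO("1,2\n3,4")
df = pd.read_csv(data, names=["col_a", "col_b"])
```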

index_col parameter
When our data already contains an index column, pandas still auto-generates a
default integer index, giving us two. We can use our own column as the index
instead with this parameter.

df = pd.read_csv('file_name', index_col='our_index_col_name')
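A minimal sketch, assuming a hypothetical 'id' column that should serve as the index:

```python
import pandas as pd
from io import StringIO

data = StringIO("id,value\n10,a\n20,b")
# 'id' becomes the index instead of the auto-generated 0, 1, 2, ...
df = pd.read_csv(data, index_col="id")
```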

Header parameter
If the real header row is not the first line of the file (so data or junk gets
treated as the header by mistake), we can point at the correct row with the
header parameter of the read_csv method (0-indexed).
pd.read_csv('filename', header=1)
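Sketching this with a made-up junk first line, so the real header sits on row index 1:

```python
import pandas as pd
from io import StringIO

# First line is junk; the real header is the second line (index 1).
data = StringIO("some,junk,line\na,b,c\n1,2,3")
df = pd.read_csv(data, header=1)
```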

usecols parameter
If we want to take/consider only some columns of our data, we can use this
parameter and specify which columns should be read.
pd.read_csv('file_name', usecols=['col1_name', 'col2_name', ...])
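A small sketch with hypothetical columns a, b, c where only a and c are wanted:

```python
import pandas as pd
from io import StringIO

data = StringIO("a,b,c\n1,2,3\n4,5,6")
# Only the listed columns are loaded; 'b' is dropped at read time.
df = pd.read_csv(data, usecols=["a", "c"])
```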

skiprows parameter
If we want to skip specific rows while importing data, we use this in pd.read_csv
as skiprows=[row numbers separated by commas].
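Sketching this with made-up data (note the row numbers are 0-indexed file lines, so the header is line 0):

```python
import pandas as pd
from io import StringIO

data = StringIO("a,b\n1,2\n3,4\n5,6")
# Skip file line 1, i.e. the first data row "1,2" (the header is line 0).
df = pd.read_csv(data, skiprows=[1])
```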

nrows parameter
Used in pd.read_csv if we only want to import a limited number of rows from large data:
pd.read_csv('filename', nrows=100)
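A quick sketch with in-memory data, reading only the first two rows:

```python
import pandas as pd
from io import StringIO

data = StringIO("a,b\n1,2\n3,4\n5,6")
# Only the first 2 data rows are read; the rest of the file is ignored.
df = pd.read_csv(data, nrows=2)
```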

Encoding parameter
Some data sets don't have UTF-8 encoding, which makes read_csv raise an error.
In that case we can specify encoding='correct encoding of that data set'.
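Sketching this with bytes deliberately encoded as Latin-1 (BytesIO stands in for a file saved in that encoding; the names are made up):

```python
import pandas as pd
from io import BytesIO

# Bytes in Latin-1; reading these as UTF-8 would fail or garble the accents.
raw = "name,city\nJosé,Málaga\n".encode("latin-1")
df = pd.read_csv(BytesIO(raw), encoding="latin-1")
```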
Skip bad lines
Some data sets have rows with an uneven number of fields, which pandas cannot
parse. To avoid the error we use error_bad_lines=False (in pandas 1.3+ this was
replaced by on_bad_lines='skip'). This skips the offending row.
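A sketch using the newer spelling (on_bad_lines='skip', available from pandas 1.3 onward), with a made-up malformed row:

```python
import pandas as pd
from io import StringIO

# The second data row has an extra field; on_bad_lines="skip" drops it.
data = StringIO("a,b\n1,2\n3,4,5\n6,7")
df = pd.read_csv(data, on_bad_lines="skip")
```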

dtype parameter
We can change the data type for a specific column with dtype={'col_name': data_type}.
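Sketching this with a hypothetical 'id' column kept as strings instead of the inferred integers:

```python
import pandas as pd
from io import StringIO

data = StringIO("id,amount\n1,2.5\n2,3.5")
# Force 'id' to be read as strings rather than inferred as integers.
df = pd.read_csv(data, dtype={"id": str})
```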

Handling dates
In data sets, date and time are stored as strings, so we cannot use Python
datetime operations on them directly. To change that we can use
parse_dates=[names of the columns you want stored as dates, separated by commas].
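A small sketch with a hypothetical 'day' column parsed into a proper datetime dtype:

```python
import pandas as pd
from io import StringIO

data = StringIO("day,sales\n2021-01-01,5\n2021-01-02,7")
# Without parse_dates, 'day' would stay an object (string) column.
df = pd.read_csv(data, parse_dates=["day"])
```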

Converters
Sometimes we want to transform values while the data set is being loaded; we use
this for that. E.g. turning "Royal Challengers Bangalore" into "RCB".
converters={'col_name_where_we_want_to_apply': func_for_transformation}
E.g.:

def rename(name):
    if name == "Royal Challengers Bangalore":
        return "RCB"
    else:
        return name
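Putting the pieces together as a runnable sketch (the 'team' column and its values are made up for illustration):

```python
import pandas as pd
from io import StringIO

def rename(name):
    # Hypothetical transformation applied to each value as it is loaded.
    return "RCB" if name == "Royal Challengers Bangalore" else name

data = StringIO("team,runs\nRoyal Challengers Bangalore,180\nOther,150")
df = pd.read_csv(data, converters={"team": rename})
```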

na_values parameter
We can specify which values should be considered NA values in the data set, e.g.
hyphens should be considered NA values: na_values=['-']. Multiple values are
separated by commas.
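Sketching the hyphen example with made-up data:

```python
import pandas as pd
from io import StringIO

# "-" marks a missing score; na_values turns it into NaN on load.
data = StringIO("name,score\nA,-\nB,7")
df = pd.read_csv(data, na_values=["-"])
```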
Loading data in chunks
We can use chunksize=5000 (for example). While using chunksize we have to run our
operations in a loop, so that they apply to every chunk.
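A minimal sketch of that loop, with tiny in-memory data and chunks of 4 rows (the column and operation are made up; the point is that each chunk is a DataFrame processed in turn):

```python
import pandas as pd
from io import StringIO

data = StringIO("a\n" + "\n".join(str(i) for i in range(10)))
total = 0
# read_csv with chunksize returns an iterator of DataFrames, not one DataFrame.
for chunk in pd.read_csv(data, chunksize=4):
    total += chunk["a"].sum()
```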
