2.1 Importing Python Data

The document is a cheat sheet for importing various data types in Python, including Excel, pickled files, HDF5, SAS, Matlab, Stata, and relational databases using libraries like pandas, NumPy, and SQLAlchemy. It provides code snippets for reading data from different file formats and accessing their contents. Additionally, it includes tips for navigating the filesystem and using context managers for file operations.

Python For Data Science Cheat Sheet
Importing Data
Learn Python for Data Science Interactively

Importing Data in Python
Most of the time, you'll use either NumPy or pandas to import your data:

>>> import numpy as np
>>> import pandas as pd

Help
>>> np.info(np.ndarray.dtype)
>>> help(pd.read_csv)

Excel Spreadsheets
>>> file = 'urbanpop.xlsx'
>>> data = pd.ExcelFile(file)
>>> df_sheet2 = data.parse('1960-1966',
                           skiprows=[0],
                           names=['Country',
                                  'AAM: War(2002)'])
>>> df_sheet1 = data.parse(0,
                           usecols=[0],
                           skiprows=[0],
                           names=['Country'])

To access the sheet names, use the sheet_names attribute:
>>> data.sheet_names

Pickled Files
>>> import pickle
>>> with open('pickled_fruit.pkl', 'rb') as file:
        pickled_data = pickle.load(file)

HDF5 Files
>>> import h5py
>>> filename = 'H-H1_LOSC_4_v1-815411200-4096.hdf5'
>>> data = h5py.File(filename, 'r')

SAS Files
>>> from sas7bdat import SAS7BDAT
>>> with SAS7BDAT('urbanpop.sas7bdat') as file:
        df_sas = file.to_data_frame()

Matlab Files
>>> import scipy.io
>>> filename = 'workspace.mat'
>>> mat = scipy.io.loadmat(filename)
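The pickle pattern above assumes `pickled_fruit.pkl` already exists on disk. A minimal self-contained round trip looks like this (the dictionary contents and filename are illustrative, not from the original data):

```python
import os
import pickle
import tempfile

# A small dictionary to pickle (contents are illustrative).
fruit = {"apples": 3, "bananas": 5}
path = os.path.join(tempfile.mkdtemp(), "pickled_fruit.pkl")

# Write the object out in binary mode...
with open(path, "wb") as file:
    pickle.dump(fruit, file)

# ...and load it back, mirroring the cheat sheet's pattern.
with open(path, "rb") as file:
    pickled_data = pickle.load(file)

print(pickled_data)  # {'apples': 3, 'bananas': 5}
```

Note that pickle files must be opened in binary mode (`'rb'`/`'wb'`), not text mode.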
Text Files
Plain Text Files
>>> filename = 'huck_finn.txt'
>>> file = open(filename, mode='r')   Open the file for reading
>>> text = file.read()                Read a file's contents
>>> print(file.closed)                Check whether file is closed
>>> file.close()                      Close file
>>> print(text)

Using the context manager with
>>> with open('huck_finn.txt', 'r') as file:
        print(file.readline())        Read a single line
        print(file.readline())
        print(file.readline())

Stata Files
>>> data = pd.read_stata('urbanpop.dta')

Relational Databases
>>> from sqlalchemy import create_engine
>>> engine = create_engine('sqlite:///Northwind.sqlite')

Use the table_names() method to fetch a list of table names:
>>> table_names = engine.table_names()

Querying Relational Databases
>>> con = engine.connect()
>>> rs = con.execute("SELECT * FROM Orders")
>>> df = pd.DataFrame(rs.fetchall())
>>> df.columns = rs.keys()
>>> con.close()

Using the context manager with
>>> with engine.connect() as con:
        rs = con.execute("SELECT OrderID FROM Orders")
        df = pd.DataFrame(rs.fetchmany(size=5))
        df.columns = rs.keys()

Exploring Dictionaries
Accessing Elements with Functions
>>> print(mat.keys())        Print dictionary keys
>>> for key in data.keys():  Print dictionary keys
        print(key)
    meta
    quality
    strain
>>> pickled_data.values()    Return dictionary values
>>> print(mat.items())       Returns items in list format of (key, value) tuple pairs

Accessing Data Items with Keys
>>> for key in data['meta'].keys():   Explore the HDF5 structure
        print(key)
    Description
    DescriptionURL
    Detector
    Duration
    GPSstart
    Observatory
    Type
    UTCstart
>>> print(data['meta']['Description'][()])   Retrieve the value for a key
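The relational-database pattern above assumes a `Northwind.sqlite` file on disk and SQLAlchemy installed. For SQLite, `pd.read_sql_query` also accepts a standard-library `sqlite3` connection, so a self-contained sketch (table name and rows are illustrative) is:

```python
import sqlite3

import pandas as pd

# A throwaway in-memory database standing in for Northwind
# (the table name and rows are illustrative).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Orders (OrderID INTEGER, Customer TEXT)")
con.executemany("INSERT INTO Orders VALUES (?, ?)",
                [(1, "Alfreds"), (2, "Ana")])
con.commit()

# The same query pattern as above, without needing SQLAlchemy.
df = pd.read_sql_query("SELECT * FROM Orders", con)
con.close()

print(df.shape)  # (2, 2)
```

For non-SQLite databases, pandas still expects a SQLAlchemy engine or connection string.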

Table Data: Flat Files
Importing Flat Files with numpy
Files with one data type
>>> filename = 'mnist.txt'
>>> data = np.loadtxt(filename,
                      delimiter=',',   String used to separate values
                      skiprows=2,      Skip the first 2 lines
                      usecols=[0,2],   Read the 1st and 3rd column
                      dtype=str)       The type of the resulting array

Files with mixed data types
>>> filename = 'titanic.csv'
>>> data = np.genfromtxt(filename,
                         delimiter=',',
                         names=True,   Look for column header
                         dtype=None)
>>> data_array = np.recfromcsv(filename)

Importing Flat Files with pandas
>>> filename = 'winequality-red.csv'
>>> data = pd.read_csv(filename,
                       nrows=5,        Number of rows of file to read
                       header=None,    Row number to use as col names
                       sep='\t',       Delimiter to use
                       comment='#',    Character marking comments
                       na_values=[""]) String to recognize as NA/NaN

Querying relational databases with pandas
>>> df = pd.read_sql_query("SELECT * FROM Orders", engine)

Exploring Your Data
NumPy Arrays
>>> data_array.dtype    Data type of array elements
>>> data_array.shape    Array dimensions
>>> len(data_array)     Length of array

pandas DataFrames
>>> df.head()                Return first DataFrame rows
>>> df.tail()                Return last DataFrame rows
>>> df.index                 Describe index
>>> df.columns               Describe DataFrame columns
>>> df.info()                Info on DataFrame
>>> data_array = data.values Convert a DataFrame to a NumPy array

Navigating Your FileSystem
Magic Commands
!ls       List directory contents of files and directories
%cd ..    Change current working directory
%pwd      Return the current working directory path

os Library
>>> import os
>>> path = "/usr/tmp"
>>> wd = os.getcwd()          Store the name of current directory in a string
>>> os.listdir(wd)            Output contents of the directory in a list
>>> os.chdir(path)            Change current working directory
>>> os.rename("test1.txt",    Rename a file
              "test2.txt")
>>> os.remove("test1.txt")    Delete an existing file
>>> os.mkdir("newdir")        Create a new directory
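The flat-file snippets above assume files such as `mnist.txt` sit on disk. `np.loadtxt` also reads from any file-like object, so the same keyword arguments can be tried on inline data (the contents below are illustrative):

```python
import io

import numpy as np

# An inline "flat file" standing in for mnist.txt (contents illustrative):
# two comment lines, then comma-separated rows.
flat_file = io.StringIO("# header line\n# units\n1,2,3\n4,5,6\n")

# Same keywords as the loadtxt call above: skiprows drops the two
# leading lines, usecols keeps the 1st and 3rd columns as strings.
data = np.loadtxt(flat_file,
                  delimiter=",",
                  skiprows=2,
                  usecols=[0, 2],
                  dtype=str)

print(data.shape)  # (2, 2)
```

Wrapping sample text in `io.StringIO` like this is a quick way to test delimiter and column settings before pointing the call at a real file.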
