Just Give Me The Codes Lecture 2: Data Importation: Goals: Import Data Into Jupyterlab View The Dataset

The document provides instructions for importing data into JupyterLab and viewing a dataset. It outlines 11 steps to import the data, view the first and last entries, check for missing values and data types, re-import the data with missing values addressed, take a random sample of the data with a set seed for repeatability, and suggests further reading on random seeds and where to find additional datasets.

Uploaded by

DragosCavescu

Just Give me the Codes

Lecture 2: Data Importation

Goals:
• Import data into JupyterLab
• View the dataset
Step 1: Extract data from the internet

• Open the ipynb file in the ‘ipynb for Entry Level Python’ folder
• Go to Step 1 and run the script (press Shift & Enter together)

• Data imported! Well done!
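
The Step 1 script itself is not reproduced in these slides. As a rough sketch only, an import of this kind might look like the following with pandas, assuming the data is a CSV file. The inline text and the column names (id, age, income, score) are hypothetical stand-ins so the example runs without an internet connection.

```python
import io

import pandas as pd

# Hypothetical stand-in for the downloaded file. The notebook's actual
# source and column names are not shown in the slides.
csv_text = """id,age,income,score
1,34,52000,7.5
2,45,61000,8.1
3,29,48000,6.9
"""

# pd.read_csv also accepts a URL string in place of a file-like object,
# which is how a dataset can be read straight from the internet.
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)  # (number of rows, number of columns)
```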

Steps 2-3: View the first 5 and last 5 entries

• Follow Steps 2 & 3 to view the first 5 and last 5 entries of the dataset
• The Step 3 output shows a period (full stop) instead of a numerical value for variables 4 & 5
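
A minimal sketch of viewing the first and last entries, assuming the notebook uses the pandas head() and tail() methods (the slides do not show the code). The small dataset and the names var1 to var5 are made up for illustration; the last row carries a period in variables 4 and 5 to mimic what the Step 3 output shows.

```python
import io

import pandas as pd

# Hypothetical data: the final row uses "." where a number is missing.
csv_text = """id,var1,var2,var3,var4,var5
1,10,20,30,1.5,2.5
2,11,21,31,1.6,2.6
3,12,22,32,1.7,2.7
4,13,23,33,1.8,2.8
5,14,24,34,1.9,2.9
6,15,25,35,2.0,3.0
7,16,26,36,.,.
"""
df = pd.read_csv(io.StringIO(csv_text))

first_five = df.head()  # first 5 rows by default
last_five = df.tail()   # last 5 rows by default

print(first_five)
print(last_five)  # the period shows up as text in var4 and var5
```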
Steps 4-5: Look for missing values & determine data type

• Follow Steps 4 & 5 to return the sum of NaNs in each column and to determine the data type of each variable
• Note: the functions will be explained in subsequent lectures; for now we are only correcting the data types so that the descriptive analysis in the next lecture is accurate
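
One way to count NaNs and inspect data types in pandas is isna().sum() together with the dtypes attribute; whether the notebook uses exactly these calls is an assumption. The hypothetical data below shows why this check matters: a period used as a placeholder is not counted as missing, and it stops the column from being read as numeric.

```python
import io

import pandas as pd

# Hypothetical data with "." standing in for a missing number.
csv_text = """id,var1,var4
1,10,1.5
2,11,1.6
3,12,.
"""
df = pd.read_csv(io.StringIO(csv_text))

missing_per_column = df.isna().sum()  # number of NaNs in each column
column_types = df.dtypes              # data type of each column

print(missing_per_column)  # var4 reports 0: "." is not treated as NaN
print(column_types)        # var4 is read as text, not as a number
```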
Steps 6-7: Re-import & re-examine dataset

• Follow Steps 6 & 7 to create a missing values list and to re-examine the last 5 entries
• Take note: be creative when creating a missing values list, because a placeholder that makes sense to the person entering the data may not be obvious to the data analyst (e.g., consider adding a comma to the list).
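
A sketch of re-importing with a missing values list, assuming the list is passed to pandas through the na_values parameter of read_csv. The list contents and the data are illustrative: the period matches the placeholder seen earlier, and the comma follows the slide's suggestion.

```python
import io

import pandas as pd

# Hypothetical data with "." standing in for missing numbers.
csv_text = """id,var4,var5
1,1.5,2.5
2,1.6,2.6
3,1.7,2.7
4,1.8,2.8
5,1.9,2.9
6,.,.
"""

# Placeholders to treat as missing, in addition to pandas' defaults.
missing_values = [".", ","]

df = pd.read_csv(io.StringIO(csv_text), na_values=missing_values)

print(df.tail())  # the last row now shows NaN instead of "."
```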
Steps 8-9: Check data types and missing values

• Follow Steps 8-9 to re-examine the data types and the sum of NaNs in each column, respectively
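
To illustrate what the re-check is looking for, the sketch below compares an import without and with a missing values list. The data, the column name var4, and the use of na_values are assumptions for illustration, not the notebook's actual code.

```python
import io

import pandas as pd

# Hypothetical data with "." standing in for a missing number.
csv_text = """id,var4
1,1.5
2,1.6
3,.
"""

raw = pd.read_csv(io.StringIO(csv_text))
cleaned = pd.read_csv(io.StringIO(csv_text), na_values=["."])

# Before: no NaNs are detected and var4 is stored as text.
print(raw.dtypes)
print(raw.isna().sum())

# After: var4 is numeric and the placeholder is counted as one NaN.
print(cleaned.dtypes)
print(cleaned.isna().sum())
```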
Step 10: Random selection of dataset

• Follow Step 10
• Note: your random sample will differ on every run unless you set a seed
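
A minimal sketch of drawing a random selection of rows, assuming the pandas sample() method is used. The DataFrame here is a made-up stand-in for the imported dataset.

```python
import pandas as pd

# Hypothetical stand-in for the imported dataset: 100 rows.
df = pd.DataFrame({"id": range(1, 101)})

# Draw 5 rows at random, without replacement. No seed is set, so the
# rows returned will generally change each time this is run.
sample = df.sample(n=5)

print(sample)
```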
Step 11: Set a seed for repeatable results

• By setting a seed (36), you ensure the results can be repeated later on (i.e., a seed of 36 will always give you the same results unless you change the value of n).
• Change the value of n for your convenience (e.g., n=20)
• A seed of 21 will give you different results from a seed of 36
• Play around with the seed values! ☺
• Refer to the end-of-lecture links for more information on random seeds
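
The slides do not show how the seed is set. One common approach in pandas, assumed here, is the random_state parameter of sample(). The DataFrame is a hypothetical stand-in for the imported dataset.

```python
import pandas as pd

# Hypothetical stand-in for the imported dataset: 100 rows.
df = pd.DataFrame({"id": range(1, 101)})

# The same seed and the same n give the same rows every time.
sample_a = df.sample(n=20, random_state=36)
sample_b = df.sample(n=20, random_state=36)

# A different seed draws a different random selection.
sample_c = df.sample(n=20, random_state=21)

print(sample_a.equals(sample_b))  # the two seed-36 samples match
print(sample_a.equals(sample_c))  # compare seed 36 against seed 21
```

Changing n while keeping the seed also changes the selection, which is why the slide ties repeatability to both the seed and the value of n.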
End of Lecture 2

• Well done! You have officially gained skills in data importation!

• Where to go from here? Lecture 3 of course! But things to consider:
  ◦ Read up on random.seed
  ◦ Commence Assignment 1
• A great place to start:
  ◦ Information on random.seed
    https://pynative.com/python-random-seed/

  ◦ Repositories where you can find data in csv format

    https://archive.ics.uci.edu/ml/index.php
    https://databank.worldbank.org/home.aspx
