Data Science Lab 3
Department of Computing
Class: BSCS-6A
Date: 20-2-2020
Introduction
The purpose of this lab is to get familiar with data science in Python. In this lab we explore
data in Python using examples. I encourage you to type all Python commands on your own
machine.
Tools/Software Requirement
Python, Jupyter Notebook
For instance, we want a list of all females who are not graduates and got a
loan. Boolean indexing can help here. You can use the following code:
["Gender","Education","Loan_Status"]]
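One way to write such a selection in full is sketched here on a small made-up DataFrame. The rows, the extra ApplicantIncome column, and the category labels "Female", "Not Graduate" and "Y" are assumptions for illustration; check how your own dataset codes these values before reusing the conditions.

```python
import pandas as pd

# Hypothetical sample data; the labels "Female", "Not Graduate" and "Y"
# are assumed codings, not taken from the lab dataset.
data = pd.DataFrame({
    "Gender": ["Male", "Female", "Female", "Female"],
    "Education": ["Graduate", "Not Graduate", "Graduate", "Not Graduate"],
    "Loan_Status": ["Y", "Y", "Y", "N"],
    "ApplicantIncome": [5849, 3036, 4166, 2500],
})

# Each condition goes in its own parentheses; & combines them element-wise.
mask = (
    (data["Gender"] == "Female")
    & (data["Education"] == "Not Graduate")
    & (data["Loan_Status"] == "Y")
)

# .loc takes the Boolean row mask first, then the list of columns to keep.
selected = data.loc[mask, ["Gender", "Education", "Loan_Status"]]
print(selected)
```

Note that Python's `and` does not work on Series; the element-wise operators `&`, `|` and `~` are needed for Boolean indexing.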
Bahria University, Islamabad Campus
Department of Computer Science
Data types of dataframe
data.dtypes
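As a quick illustration of what `dtypes` reports, the sketch below builds a small DataFrame with made-up columns (Name, Age and Height are hypothetical) and prints the type of each column. The exact type names can vary with the pandas version and platform.

```python
import pandas as pd

# Hypothetical data, used only to show one dtype per column.
data = pd.DataFrame({
    "Name": ["Ali", "Sara"],   # text column
    "Age": [21, 22],           # integer column
    "Height": [1.70, 1.65],    # floating-point column
})

# dtypes is an attribute (no parentheses) that returns a Series
# indexed by column name.
print(data.dtypes)
```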
Would return:
0 True
1 True
2 False
3 False
4 False
Name: Height, dtype: bool
This returns a new Series of True/False values though. To actually filter the data, we
need to use this Series to mask our original DataFrame:
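A minimal sketch of the masking step follows. The DataFrame, the height values and the threshold of 170 are all assumptions, chosen only so that the mask matches the True/False pattern shown above.

```python
import pandas as pd

# Hypothetical data; values are picked so the first two rows pass the test.
df = pd.DataFrame({
    "Name": ["A", "B", "C", "D", "E"],
    "Height": [180, 175, 160, 165, 150],
})

# Step 1: the comparison yields a Boolean Series, one value per row.
mask = df["Height"] > 170  # assumed threshold

# Step 2: indexing the DataFrame with that Series keeps only the True rows.
filtered = df[mask]
print(filtered)
```

The two steps are often written as one expression, `df[df["Height"] > 170]`, which gives the same result.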
Sometimes a CSV file has null values, which are later displayed as NaN in the
DataFrame. The pandas dropna() method allows the user to analyze and drop
rows or columns with null values in different ways.
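The sketch below shows a few of the ways `dropna()` can be called. The DataFrame and its column names (Item, Weight, Size) are made up for illustration; `axis`, `how` and `subset` are standard `dropna()` parameters.

```python
import numpy as np
import pandas as pd

# Hypothetical data with missing values in two columns.
df = pd.DataFrame({
    "Item": ["A", "B", "C"],
    "Weight": [9.3, np.nan, 17.5],
    "Size": ["Medium", "Small", None],
})

# Default: drop every row that contains at least one null value.
rows_dropped = df.dropna()

# axis=1: drop every column that contains at least one null value.
cols_dropped = df.dropna(axis=1)

# subset: only look at the listed columns when deciding which rows to drop.
subset_dropped = df.dropna(subset=["Weight"])

# how="all": drop a row only if all of its values are null.
all_null_dropped = df.dropna(how="all")

print(rows_dropped)
print(cols_dropped)
print(subset_dropped)
```

`dropna()` returns a new DataFrame and leaves `df` unchanged unless the result is assigned back.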
LAB TASKS
Task 1
Create a dataframe, named with your own name, with at least 6 attributes, insert data, and
filter it using method chaining. Display the dataframe and the filter results.
(For example, you can create a dataframe of a car with its specifications.)
Task 2
Bigmart-sales dataset
Retail is another industry which extensively uses analytics to optimize business processes.
Tasks like product placement, inventory management, customized offers, product bundling,
etc. are being smartly handled using data science techniques. As the name suggests, this
data comprises transaction records of a sales store. The data has 8523 rows and 12
variables.
Deliverables: Submit your Python files as a zip archive, along with the lab journal, before the next lab.