Ex. No: 1 Date: Download, Install and Explore the Features of NumPy, SciPy, Jupyter, Statsmodels and Pandas Packages

The document provides a comprehensive guide on installing Anaconda, creating environments, and running Jupyter Notebook, along with practical examples of using NumPy and Pandas for data manipulation and analysis. It covers creating and manipulating arrays with NumPy, as well as creating and managing DataFrames with Pandas, including reading data from files and performing descriptive analytics. Additionally, it discusses data visualization techniques using libraries like Seaborn and Matplotlib, specifically in the context of datasets related to diabetes and the Iris dataset.

Uploaded by

rramaniravi18


EX. No: 1
Date:
DOWNLOAD, INSTALL AND EXPLORE THE FEATURES OF NUMPY, SCIPY, JUPYTER, STATSMODELS AND PANDAS PACKAGES

How to Install Anaconda & Run Jupyter Notebook

Instructions to Install Anaconda and Run Jupyter Notebook

• Download & Install Anaconda Distribution
• Create Anaconda Environment
• Install and Run Jupyter Notebook

Download & Install Anaconda Distribution

Follow the step-by-step instructions below to install the Anaconda distribution.

Download Anaconda Distribution

Go to https://anaconda.com/ and select Anaconda Individual Edition to download the latest version of Anaconda. This downloads the .exe file to the Windows Downloads folder.

Install Anaconda

Double-click the .exe file to start the Anaconda installation. Follow the installer screens and complete the installation.

This finishes the installation of the Anaconda distribution. Now let's see how to create an environment and install Jupyter Notebook.

Create Anaconda Environment from Navigator

A conda environment is a directory that contains a specific collection of conda packages that you have installed. For example, you may have one environment with NumPy 1.7 and its dependencies, and another environment with NumPy 1.6 for legacy testing. https://conda.io/docs/using/envs.html
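
Because an environment is just a directory, a running Python session can report which one it belongs to. The sketch below is a minimal illustration, not part of the original exercise; the helper name describe_environment is hypothetical, and it assumes that conda sets the CONDA_DEFAULT_ENV variable when an environment is activated.

```python
import os
import sys

def describe_environment():
    """Collect a few facts about the interpreter that is running this code."""
    return {
        # Directory of the active environment (sys.prefix is standard Python).
        "prefix": sys.prefix,
        # Set by `conda activate`; None when no conda environment is active.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
        # Version of the interpreter installed in this environment.
        "python": sys.version.split()[0],
    }

info = describe_environment()
for key, value in info.items():
    print(f"{key}: {value}")
```

Running the same cell from two different environments should print two different prefix directories.
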
Open Anaconda Navigator

Open Anaconda Navigator from the Windows Start menu or by searching for it.

Anaconda Navigator is a UI application where you can manage Anaconda packages, environments, etc.

Create an Environment to Run Jupyter Notebook

Creating an environment before you proceed is optional but recommended. It gives complete segregation of the package installs for the different projects you work on. If you already have an environment, you can use it too.

Select the + Create icon at the bottom of the screen to create an Anaconda environment.

Install and Run Jupyter Notebook

Once you have created the Anaconda environment, go back to the Home page of Anaconda Navigator and install Jupyter Notebook from the applications on the right panel.

It takes a few seconds to install Jupyter into your environment. Once the install completes, you can open Jupyter from the same screen or through Anaconda Navigator -> Environments -> your environment (mine is pandas-tutorial) -> Open With Jupyter Notebook.

This opens Jupyter Notebook in the default browser.

Now select New -> PythonX, enter your Python statements in a cell, and select Run. In Jupyter, a cell can hold one or more statements, and you can run each cell independently as long as it does not depend on previous cells.

This completes installing Anaconda and running Jupyter Notebook.
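
A quick way to confirm that the notebook sees the packages named in this exercise is to import each one and print its version. This is a minimal sketch under the assumption that each package exposes a __version__ attribute; the helper name package_versions is hypothetical, and a package that is not installed is reported rather than raising an error.

```python
import importlib

def package_versions(names):
    """Return {package name: version string, or None if the import fails}."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
            versions[name] = getattr(module, "__version__", "unknown")
        except ImportError:
            versions[name] = None
    return versions

versions = package_versions(["numpy", "scipy", "pandas", "statsmodels"])
for name, version in versions.items():
    print(f"{name}: {version if version else 'not installed'}")
```

Any package reported as not installed can be added to the environment from Anaconda Navigator before continuing.
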
WORKING WITH NUMPY ARRAYS

(i) Create a NumPy ndarray Object

Program
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
print(arr)
print(type(arr))

Output

[1 2 3 4 5]
<class 'numpy.ndarray'>

2-D Arrays

Program
import numpy as np
arr = np.array([[1, 2, 3], [4, 5, 6]])
print(arr)

Output

[[1 2 3]
 [4 5 6]]

3-D Arrays

Program
import numpy as np
arr = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])
print(arr)

Output

[[[1 2 3]
  [4 5 6]]

 [[1 2 3]
  [4 5 6]]]

(iii) Check Number of Dimensions

Program
import numpy as np
a = np.array(42)
b = np.array([1, 2, 3, 4, 5])
c = np.array([[1, 2, 3], [4, 5, 6]])
d = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])

print(a.ndim)
print(b.ndim)
print(c.ndim)
print(d.ndim)

Output
0
1
2
3

Access Array Elements

Program

import numpy as np
arr = np.array([1, 2, 3, 4])
print(arr[0])

Output
1

Program
import numpy as np
arr = np.array([1, 2, 3, 4])
print(arr[2] + arr[3])

Output

7

(iv) Slicing Arrays

Program
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[1:5])

Output

[2 3 4 5]
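
The slice above uses only a start and a stop index. As a supplementary sketch (not part of the original exercise), the following shows open-ended slices, negative indices and a step on the same array, and illustrates that a basic slice is a view on the original data rather than a copy.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

# A slice is written start:stop:step; the stop index is excluded.
print(arr[1:5])     # [2 3 4 5]
print(arr[4:])      # [5 6 7]
print(arr[:4])      # [1 2 3 4]
print(arr[-3:-1])   # [5 6]
print(arr[1:6:2])   # [2 4 6]

# A basic slice is a view on the same data, so writing through it changes arr.
view = arr[1:3]
view[0] = 20
print(arr)          # [ 1 20  3  4  5  6  7]
```

Call arr[1:3].copy() when an independent array is needed.
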
NumPy Array Shape

Program

import numpy as np
arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
print(arr.shape)

Output
(2, 4)

(v) Reshaping Arrays

Program
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
newarr = arr.reshape(4, 3)
print(newarr)

Output

[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]
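
Reshaping only works when the new shape holds exactly the same number of elements. The sketch below is an added illustration using the same 12 values; it shows that one dimension can be given as -1 so that NumPy infers it, and that an incompatible shape raises a ValueError.

```python
import numpy as np

arr = np.arange(1, 13)          # the same 12 values, 1 through 12

# -1 asks NumPy to work out that dimension from the total size.
two_cols = arr.reshape(-1, 2)   # 12 elements / 2 columns -> 6 rows
print(two_cols.shape)           # (6, 2)

cube = arr.reshape(2, 3, -1)    # 12 / (2 * 3) -> last axis has length 2
print(cube.shape)               # (2, 3, 2)

# A shape whose product is not 12 is rejected.
try:
    arr.reshape(5, 3)
except ValueError as err:
    print("cannot reshape:", err)

# reshape(-1) flattens back to one dimension.
print(cube.reshape(-1))
```
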

Iterating Arrays

Program
import numpy as np
arr = np.array([1, 2, 3])
for x in arr:
    print(x)

Output
1
2
3

Joining NumPy Arrays

Program
import numpy as np
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
arr = np.concatenate((arr1, arr2))
print(arr)

Output
[1 2 3 4 5 6]

Splitting NumPy Arrays

Program
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6])
newarr = np.array_split(arr, 3)
print(newarr)

Output

[array([1, 2]), array([3, 4]), array([5, 6])]
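
The six-element array above divides evenly into three parts. As a supplementary sketch, the example below uses a seven-element array to illustrate how np.array_split handles a length that does not divide evenly, in contrast to np.split, which requires equal parts.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

# array_split tolerates a length that does not divide evenly:
# the leading chunks receive the extra elements.
parts = np.array_split(arr, 3)
print([p.tolist() for p in parts])   # [[1, 2, 3], [4, 5], [6, 7]]

# np.split insists on equal-sized chunks and raises otherwise.
try:
    np.split(arr, 3)
except ValueError as err:
    print("np.split failed:", err)
```
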

Searching Arrays

Program
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 4, 4])
x = np.where(arr == 4)
print(x)

Output
(array([3, 5, 6]),)

Sorting Arrays

Program
import numpy as np
arr = np.array([3, 2, 0, 1])
print(np.sort(arr))

Output
[0 1 2 3]
WORKING WITH PANDAS DATA FRAMES

Create a simple Pandas DataFrame:

Program
import pandas as pd
data = {
    "calories": [420, 380, 390],
    "duration": [50, 40, 45]
}
df = pd.DataFrame(data)
print(df)

Output

   calories  duration
0       420        50
1       380        40
2       390        45

Locate Row

Program
print(df.loc[0])

Output
calories    420
duration     50
Name: 0, dtype: int64

Note: This example returns a Pandas Series.

Use a list of indexes:

Program

print(df.loc[[0, 1]])

Output
   calories  duration
0       420        50
1       380        40
Note: When using [], the result is a Pandas DataFrame.
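
The two notes above can be checked directly by inspecting the type and shape of each result. This is a small added sketch that rebuilds the same calories/duration DataFrame so it can run on its own.

```python
import pandas as pd

df = pd.DataFrame({"calories": [420, 380, 390], "duration": [50, 40, 45]})

row = df.loc[0]        # single label -> one row as a Series
rows = df.loc[[0, 1]]  # list of labels -> a DataFrame
one = df.loc[[0]]      # one-element list -> still a DataFrame

print(type(row).__name__, row.shape)    # Series (2,)
print(type(rows).__name__, rows.shape)  # DataFrame (2, 2)
print(type(one).__name__, one.shape)    # DataFrame (1, 2)
```
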

Named Indexes

Program
import pandas as pd
data = {
    "calories": [420, 380, 390],
    "duration": [50, 40, 45]
}
df = pd.DataFrame(data, index=["day1", "day2", "day3"])
print(df)

Output
      calories  duration
day1       420        50
day2       380        40
day3       390        45

Locate Named Indexes

Program
print(df.loc["day2"])

Output
calories    380
duration     40
Name: day2, dtype: int64

Load Files Into a DataFrame

Program

import pandas as pd
df = pd.read_csv('data.csv')
print(df)

Output

     Duration  Pulse  Maxpulse  Calories
0          60    110       130     409.1
1          60    117       145     479.0
2          60    103       135     340.0
3          45    109       175     282.4
4          45    117       148     406.0
..        ...    ...       ...       ...
164        60    105       140     290.8
165        60    110       145     300.4
166        60    115       145     310.2
167        75    120       150     320.4
168        75    125       150     330.4

[169 rows x 4 columns]

Check the number of maximum returned rows:

Program
import pandas as pd
print(pd.options.display.max_rows)

On my system the number is 60, which means that if the DataFrame contains more than 60 rows, the print(df) statement returns only the headers and the first and last 5 rows.

Increase the maximum number of rows to display the entire DataFrame:

import pandas as pd
pd.options.display.max_rows = 9999
df = pd.read_csv('data.csv')
print(df)
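
Changing pd.options.display.max_rows affects every later print in the session. As an alternative sketch, pd.option_context changes the limit only inside a with block. The 100-row DataFrame below is a hypothetical stand-in for data.csv so that the example runs without the file.

```python
import pandas as pd

# Hypothetical stand-in for data.csv with 100 rows.
df = pd.DataFrame({"Duration": range(100), "Pulse": range(100, 200)})

before = pd.get_option("display.max_rows")

# A small limit truncates the printed frame to its head and tail.
with pd.option_context("display.max_rows", 10):
    short_text = repr(df)

# None removes the limit, so every row is printed.
with pd.option_context("display.max_rows", None):
    full_text = repr(df)

print(len(short_text.splitlines()), "lines when truncated")
print(len(full_text.splitlines()), "lines in full")

# The global option is unchanged once the blocks exit.
print(pd.get_option("display.max_rows") == before)   # True
```
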

Viewing the Data

Program
import pandas as pd
df = pd.read_csv('data.csv')
print(df.head(4))

Output
   Duration  Pulse  Maxpulse  Calories
0        60    110       130     409.1
1        60    117       145     479.0
2        60    103       135     340.0
3        45    109       175     282.4

Print the last 5 rows of the DataFrame:

print(df.tail())

Print information about the DataFrame:

print(df.info())

Output
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 169 entries, 0 to 168
Data columns (total 4 columns):
 #   Column    Non-Null Count  Dtype
 0   Duration  169 non-null    int64
 1   Pulse     169 non-null    int64
 2   Maxpulse  169 non-null    int64
 3   Calories  164 non-null    float64
dtypes: float64(1), int64(3)
memory usage: 5.4 KB
None
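
In the info() output above, Calories has 164 non-null values out of 169 entries, so 5 values are missing. The sketch below illustrates how such gaps can be counted and located; the five-row DataFrame is a hypothetical miniature, not the exercise data.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the exercise data with two gaps in Calories.
df = pd.DataFrame({
    "Duration": [60, 60, 45, 45, 30],
    "Calories": [409.1, np.nan, 340.0, np.nan, 282.4],
})

# Non-null count per column, the same figure that info() reports.
print(df.count())

# Missing values per column: total rows minus the non-null count.
missing = df.isna().sum()
print(missing)

# Rows where Calories is missing.
print(df[df["Calories"].isna()])
```
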

READING DATA FROM TEXT FILES, EXCEL AND THE WEB AND EXPLORING VARIOUS COMMANDS FOR DOING DESCRIPTIVE ANALYTICS ON THE IRIS DATA SET

Program
import pandas as pd

# Reading the CSV file
df = pd.read_csv("Iris.csv")

# Printing top 5 rows
df.head()

Output:

Getting Information about the Dataset

df.shape

Output: (150, 6)

df.info()
Output

df.describe()

Checking Missing Values

df.isnull().sum()

Checking Duplicates

data = df.drop_duplicates(subset="Species")
data

Output

df.value_counts("Species")
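
The program above reads a CSV file only. As a hedged sketch of the other sources named in the heading, the example below reads tab-separated text through pd.read_csv with the sep parameter. The three sample rows are invented for illustration, the PetalLengthCm and PetalWidthCm column names are assumptions (the exercise itself only uses Id, SepalLengthCm, SepalWidthCm and Species), and the Excel path and URL in the comments are placeholders.

```python
import io
import pandas as pd

# Hypothetical tab-separated text with six columns, like the (150, 6) Iris file.
text = (
    "Id\tSepalLengthCm\tSepalWidthCm\tPetalLengthCm\tPetalWidthCm\tSpecies\n"
    "1\t5.1\t3.5\t1.4\t0.2\tIris-setosa\n"
    "2\t7.0\t3.2\t4.7\t1.4\tIris-versicolor\n"
    "3\t6.3\t3.3\t6.0\t2.5\tIris-virginica\n"
)

# read_csv accepts any file-like object; sep selects the delimiter.
df_text = pd.read_csv(io.StringIO(text), sep="\t")
print(df_text.shape)                      # (3, 6)
print(df_text["Species"].value_counts())

# Excel and web sources follow the same pattern (path and URL are placeholders):
# df_xlsx = pd.read_excel("Iris.xlsx")                  # needs an Excel engine such as openpyxl
# df_web = pd.read_csv("https://example.com/Iris.csv")  # read_csv also accepts a URL
```

Once loaded, the resulting DataFrame supports the same shape, info(), describe() and value_counts() commands regardless of where the data came from.
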

Data Visualization

# importing packages
import seaborn as sns
import matplotlib.pyplot as plt

sns.countplot(x='Species', data=df)
plt.show()

Comparing Sepal Length and Sepal Width

# importing packages
import seaborn as sns
import matplotlib.pyplot as plt

sns.scatterplot(x='SepalLengthCm', y='SepalWidthCm',
                hue='Species', data=df)

# Placing Legend outside the Figure
plt.legend(bbox_to_anchor=(1, 1), loc=2)
plt.show()

# importing packages
import seaborn as sns
import matplotlib.pyplot as plt

sns.pairplot(df.drop(['Id'], axis=1),
             hue='Species', height=2)
plt.show()

Output:

USE THE DIABETES DATA SET FROM UCI AND PIMA INDIANS DIABETES DATA SET FOR PERFORMING THE FOLLOWING

Program
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline
from sklearn.linear_model import LogisticRegression
# joblib is imported directly; sklearn.externals.joblib was removed from scikit-learn
import joblib

df = pd.read_csv('C:/Users/Downloads/diabetes.csv')
count = df['Glucose'].value_counts()
display(count)

df.head()

df.describe()

df.mean()

df.mode()

df.var()

df.std()

df.skew()

Pregnancies                 0.901674
Glucose                     0.173754
BloodPressure              -1.843608
SkinThickness               0.109372
Insulin                     2.272251
BMI                        -0.428982
DiabetesPedigreeFunction    1.919911
Age                         1.129597
Outcome                     0.635017
dtype: float64

df.kurtosis()

Pregnancies                 0.159220
Glucose                     0.640780
BloodPressure               5.180157
SkinThickness              -0.520072
Insulin                     7.214260
BMI                         3.290443
DiabetesPedigreeFunction    5.594954
Age                         0.643159
Outcome                    -1.600930
dtype: float64
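
To help read the figures above: a positive skew (for example Insulin at 2.27) indicates a long right tail, a negative skew (BloodPressure at -1.84) a long left tail, and pandas reports kurtosis as excess kurtosis, so values near 0 resemble a normal distribution. The sketch below illustrates these signs on two small hypothetical samples rather than on the diabetes data.

```python
import pandas as pd

# Hypothetical samples chosen so the expected signs are easy to see.
symmetric = pd.Series([1, 2, 3, 4, 5])
right_tailed = pd.Series([1, 1, 1, 2, 2, 3, 10])

# Symmetric data has zero skew; a long right tail gives positive skew.
print(symmetric.skew())
print(right_tailed.skew())

# kurtosis() is excess kurtosis: about 0 for normal data, negative for flat
# distributions, positive for heavy tails or sharp peaks.
print(symmetric.kurtosis())
print(right_tailed.kurtosis())
```
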

corr = df.corr()
sns.heatmap(corr,
            xticklabels=corr.columns,
            yticklabels=corr.columns)
plt.show()

# Recent seaborn versions require the column to be passed by keyword (x=...)
sns.countplot(x='Outcome', data=df)
plt.show()

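# Computing the percentage of diabetic and non-diabetic patients in the sample
# (Outcome == 1 is diabetic; summing a boolean Series counts its True values)
Out1 = int((df.Outcome == 1).sum())
Out0 = int((df.Outcome == 0).sum())
Total = Out0 + Out1
PC_of_1 = Out1 * 100 / Total
PC_of_0 = Out0 * 100 / Total
round(PC_of_1, 2), round(PC_of_0, 2)

(34.9, 65.1)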
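
A common pitfall when counting classes is to wrap the comparison in a list and take its length, which always yields 1 and therefore produces an even split regardless of the data. The sketch below illustrates the difference on a hypothetical ten-row Outcome column and shows value_counts(normalize=True) as one alternative way to obtain the shares.

```python
import pandas as pd

# Hypothetical Outcome column: 1 = diabetic, 0 = non-diabetic.
outcome = pd.Series([1, 0, 0, 1, 0, 0, 0, 1, 0, 0], name="Outcome")

mask = outcome == 1            # boolean Series, one entry per row
print(len([mask]))             # 1: a list holding a single Series
print(int(mask.sum()))         # 3: True counts as 1 when summed

# value_counts(normalize=True) gives the class shares directly.
shares = outcome.value_counts(normalize=True) * 100
print(shares.round(1).to_dict())
```
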

import numpy as np

plt.figure(dpi=120, figsize=(5, 4))
# Mask the upper triangle so each pair of variables appears only once
mask = np.triu(np.ones_like(df.corr(), dtype=bool))
sns.heatmap(df.corr(), mask=mask, fmt=".2f", annot=True, linewidths=1, cmap='plasma')
plt.yticks(rotation=0)
plt.xticks(rotation=90)
plt.title('Correlation Heatmap')
plt.show()

APPLY AND EXPLORE VARIOUS PLOTTING FUNCTIONS ON UCI DATA SETS

Normal curves

Program
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import norm

df = pd.read_csv("C:/Users/Downloads/dataset_diabetes/diabetic_data.csv")
df.head()

mean = df['time_in_hospital'].mean()
std = df['time_in_hospital'].std()
x_axis = np.arange(1, 10, 0.01)
plt.plot(x_axis, norm.pdf(x_axis, mean, std))
plt.show()
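
The curve plotted by this program is the normal density with the sample mean and standard deviation of time_in_hospital. As a sketch of what that density computes, the function below evaluates the standard formula with NumPy alone. The function name normal_pdf and the values 4.4 and 3.0 are hypothetical placeholders, not statistics taken from the dataset.

```python
import numpy as np

def normal_pdf(x, mean, std):
    """Density of the normal distribution N(mean, std**2) evaluated at x."""
    z = (x - mean) / std
    return np.exp(-0.5 * z ** 2) / (std * np.sqrt(2 * np.pi))

# Hypothetical values standing in for the time_in_hospital mean and standard deviation.
mean, std = 4.4, 3.0

x_axis = np.arange(1, 10, 0.01)
density = normal_pdf(x_axis, mean, std)

# The curve peaks at the grid point closest to the mean.
print(x_axis[np.argmax(density)])

# The area over 1..10 is below 1 because the tails outside that range are cut off.
print(density.sum() * 0.01)
```
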

Output

Density and contour plots

Program

df.time_in_hospital.plot.density(color='green')

plt.title('Density plot for time_in_hospital')

plt.show()

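Output

Correlation and scatter plots

Program
import seaborn as sb

plt.figure(figsize=(20, 10))
# numeric_only=True restricts the correlation to the numeric columns
dataplot = sb.heatmap(df.corr(numeric_only=True), cmap="YlGnBu", annot=True)
plt.show()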

Output

Histogram

Program
import seaborn as sb

df.hist(figsize=(12, 12), layout=(5, 3))

# plotting histogram for num_lab_procedures using histplot()
# (histplot replaces the deprecated distplot)
sb.histplot(x=df.num_lab_procedures, kde=False)
# visualizing plot using matplotlib.pyplot library
plt.show()

Output

Three dimensional plotting

Program
fig = plt.figure()
ax = plt.axes(projection='3d')

x = df['number_emergency']
x = pd.Series(x, name='')
y = df['number_inpatient']
y = pd.Series(y, name='')
z = df['number_outpatient']
z = pd.Series(z, name='')

ax.plot3D(x, y, z, 'green')
ax.set_title('3D line plot diabetes dataset')
plt.show()

Output

BASEMAP

import numpy as np
from matplotlib import pyplot as plt
from mpl_toolkits.basemap import Basemap

plt.figure(figsize=(8, 8))
m = Basemap(projection='ortho', resolution=None, lat_0=50, lon_0=-100,
            llcrnrlat=-90, urcrnrlat=90, llcrnrlon=-180, urcrnrlon=180)
m.bluemarble(scale=0.5)

fig = plt.figure(figsize=(8, 8))
m = Basemap(projection='lcc', resolution=None, width=8E6, height=8E6,
            lat_0=45, lon_0=-100)
m.etopo(scale=0.5, alpha=0.5)

# Map (longitude, latitude) to (x, y) for plotting
x, y = m(-122.3, 47.6)
plt.plot(x, y, 'ok', markersize=5)
plt.text(x, y, 'Seattle', fontsize=12)

from itertools import chain

def draw_map(m, scale=0.2):
    # Draw a shaded-relief image
    m.shadedrelief(scale=scale)

    # drawparallels and drawmeridians return dictionaries
    lats = m.drawparallels(np.linspace(-90, 90, 13))
    lons = m.drawmeridians(np.linspace(-180, 180, 13))

    # The dictionary values hold the line objects
    lat_lines = chain(*(tup[1][0] for tup in lats.items()))
    lon_lines = chain(*(tup[1][0] for tup in lons.items()))
    all_lines = chain(lat_lines, lon_lines)

    # Cycle through the lines and set the desired style
    for line in all_lines:
        line.set(linestyle='-', alpha=0.3, color='w')

fig = plt.figure(figsize=(8, 6), edgecolor='w')
m = Basemap(projection='cyl', resolution=None,
            llcrnrlat=-90, urcrnrlat=90, llcrnrlon=-180, urcrnrlon=180)
draw_map(m)

fig = plt.figure(figsize=(8, 6), edgecolor='w')
m = Basemap(projection='moll', lon_0=0, resolution='c')
m.drawcoastlines()
m.fillcontinents(color='coral', lake_color='aqua')
draw_map(m)

fig = plt.figure(figsize=(8, 8))
m = Basemap(projection='ortho', resolution=None, lat_0=50, lon_0=0,
            llcrnrlat=-90, urcrnrlat=90, llcrnrlon=-180, urcrnrlon=180)
draw_map(m)

fig = plt.figure(figsize=(8, 8))
m = Basemap(projection='lcc', resolution=None, lon_0=0, lat_0=50,
            lat_1=45, lat_2=55, width=1.6E7, height=1.2E7)
draw_map(m)
plt.show()

Output:
