
DS Manual

The document provides instructions on installing various Python packages like NumPy, SciPy, Jupyter, Pandas, and statsmodels using pip. It also includes code snippets on creating a data frame from lists and dictionaries in Pandas, performing descriptive analysis on the Iris and Pima Indian Diabetes datasets, and using logistic regression, multiple regression, and Basemap to visualize geographic data.

Uploaded by

suganthi

INSTALLATION PROCESS:

Numpy Installation: pip install numpy

Scipy Installation: pip install scipy


Jupyter Installation: pip install jupyter

Statsmodels installation: pip install statsmodels


Pandas installation: pip install pandas
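After running the pip commands above, it can help to confirm that each package is importable. The following is a minimal sketch using only the standard library; the helper name check_installed is hypothetical and not part of the manual.

```python
import importlib.util

# Packages installed in the steps above
packages = ["numpy", "scipy", "jupyter", "statsmodels", "pandas"]

def check_installed(names):
    """Return {package name: True/False} without actually importing anything."""
    return {name: importlib.util.find_spec(name) is not None for name in names}

status = check_installed(packages)
for name, ok in status.items():
    print(f"{name}: {'installed' if ok else 'missing'}")
```

A package reported as missing can be installed by re-running its pip command.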
PROGRAM & OUTPUT:
Creating a data frame using List:
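The original program for this step is not reproduced here. As a minimal sketch, assuming a plain Python list of strings (the names and the column label are hypothetical sample data), a data frame can be built like this:

```python
import pandas as pd

# Hypothetical sample data; the manual's own list is not shown.
names = ["Alice", "Bob", "Carol"]

# Each list element becomes one row; `columns` labels the single column.
df = pd.DataFrame(names, columns=["Name"])
print(df)
```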

Creating Data frame from dict of ndarray/lists:
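The original program for this step is also missing. A minimal sketch, assuming a dictionary whose values are equal-length lists (keys and values are hypothetical sample data):

```python
import pandas as pd

# Hypothetical sample data: each key becomes a column, each list its values.
data = {
    "Name": ["Alice", "Bob", "Carol"],
    "Age": [20, 21, 19],
}

# All lists must have the same length, otherwise pandas raises a ValueError.
df = pd.DataFrame(data)
print(df)
```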

Dealing with Rows and Columns:
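The operations the manual had in mind here are not shown. The sketch below illustrates common row and column operations on a small, hypothetical data frame; the column names and values are assumptions.

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Carol"],
    "Age": [20, 21, 19],
    "City": ["Chennai", "Madurai", "Salem"],
})

# Column selection: a list of labels returns a smaller data frame.
subset = df[["Name", "City"]]

# Row selection: iloc uses the integer position, loc uses the index label.
first_row = df.iloc[0]
row_by_label = df.loc[1]

# Adding a column computed from an existing one.
df["Over19"] = df["Age"] > 19

# Dropping a column returns a new data frame; df itself is unchanged.
trimmed = df.drop(columns=["City"])
print(trimmed)
```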


PROGRAM:

OUTPUT:
IRIS DATA SET:
ALGORITHM:
Step 1: Download the IRIS dataset from the Kaggle website and save it in
Documents or any other folder you want.
Link: https://www.kaggle.com/code/bharath25/descriptive-statistics-and-machine-learning-iris/data

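import pandas as pd

# File name assumed; use the path where the dataset was saved
iris = pd.read_csv("Iris.csv")

iris.head(10)

iris.shape
iris.info()

iris.describe()
iris.isnull().sum()

iris.value_counts("Species")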
ALGORITHM:
Step 1: Download the Pima Indians Diabetes dataset
Link: https://www.kaggle.com/datasets/uciml/pima-indians-diabetes-database?resource=download
PROGRAM:
a) Univariate analysis: Frequency, Mean, Median, Mode, Variance,
Standard Deviation, Skewness and Kurtosis.

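import pandas as pd

# File name assumed; use the path where the dataset was saved
df = pd.read_csv("diabetes.csv")

print(df.shape)
df.info()  # info() prints its report itself and returns None

print(df.mean())

print(df.median())

print(df.mode())

print(df.std())

print(df.var())

print(df.skew())

print(df.kurtosis())

print(df.describe())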
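The list of univariate measures includes frequency, which the calls above do not cover. A minimal sketch on a tiny, hypothetical data frame follows; the values are made up for illustration, and the column names Glucose and Outcome are assumed to match the downloaded file.

```python
import pandas as pd

# Hypothetical values for illustration only; not taken from the real dataset.
df = pd.DataFrame({
    "Glucose": [100, 120, 140, 160],
    "Outcome": [0, 1, 0, 1],
})

# Frequency: number of rows in each Outcome class.
freq = df["Outcome"].value_counts()
print(freq)

# Central tendency and spread for a single numeric column.
glucose_mean = df["Glucose"].mean()
glucose_median = df["Glucose"].median()
glucose_var = df["Glucose"].var()  # sample variance (divides by n - 1)
print(glucose_mean, glucose_median, glucose_var)
```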
5 b) Bivariate Analysis: Linear and Logistic Regression Modeling.
LOGISTIC REGRESSION:
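The manual's logistic regression program is not reproduced here. The sketch below is a from-scratch illustration in NumPy, fitted by batch gradient descent on the log-loss; it is not the manual's program and not a library routine. The function names, learning rate, iteration count, and the single standardized feature are all assumptions chosen for illustration.

```python
import numpy as np

def sigmoid(z):
    """Logistic function mapping any real value to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.1, n_iter=5000):
    """Fit weights (intercept first) by batch gradient descent on the mean log-loss."""
    Xb = np.column_stack([np.ones(len(X)), X])  # prepend intercept column
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = sigmoid(Xb @ w)                # predicted probabilities
        grad = Xb.T @ (p - y) / len(y)     # gradient of the mean log-loss
        w -= lr * grad
    return w

def predict(X, w):
    """Classify as 1 when the predicted probability is at least 0.5."""
    Xb = np.column_stack([np.ones(len(X)), X])
    return (sigmoid(Xb @ w) >= 0.5).astype(int)

# Hypothetical standardized feature and binary outcome (not real patient data)
X = np.array([[-2.0], [-1.5], [-1.0], [-0.5], [0.5], [1.0], [1.5], [2.0]])
y = np.array([0, 0, 0, 1, 0, 1, 1, 1])

w = fit_logistic(X, y)
pred = predict(X, w)
accuracy = (pred == y).mean()
print("weights:", w, "accuracy:", accuracy)
```

On real data, a library implementation such as the one in statsmodels would normally be preferred over a hand-written loop.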
5 c) MULTIPLE REGRESSION ANALYSIS.
ALGORITHM:
Step 1: Import Libraries.
Step 2: Import dataset.
Step 3: Define x and y.
Step 4: Train the model on the training set.
Step 5: Predict the test set results.
Step 6: Evaluate the model.
Step 7: Plot the results.
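Steps 1 to 6 above can be sketched as follows. This is an illustration under stated assumptions: the data is synthetic (generated from y = 3 + 2*x1 - x2 plus noise), the model is fitted by ordinary least squares with NumPy rather than with whichever library the manual used, and the 80/20 split is an arbitrary choice. Step 7 (plotting) is left out.

```python
import numpy as np

# Steps 1-3: import libraries, create a (synthetic) dataset, define X and y
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))                      # two predictors
y = 3.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.1, size=100)

# Step 4: train on the first 80 rows using ordinary least squares
X_train, X_test = X[:80], X[80:]
y_train, y_test = y[:80], y[80:]
A_train = np.column_stack([np.ones(len(X_train)), X_train])  # intercept column
coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)

# Step 5: predict the held-out 20 rows
A_test = np.column_stack([np.ones(len(X_test)), X_test])
y_pred = A_test @ coef

# Step 6: evaluate with the coefficient of determination (R squared)
ss_res = np.sum((y_test - y_pred) ** 2)
ss_tot = np.sum((y_test - y_test.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot
print("intercept and slopes:", coef)
print("R squared on test set:", r2)
```

The fitted coefficients should land close to the values used to generate the data, which is a quick sanity check that the fit worked.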
ALGORITHM:
Step 1: Download the Heart dataset from Kaggle.
Link: https://www.kaggle.com/datasets/zhaoyingzhu/heartcsv
Step 2: Save it in Downloads or any other folder, and install the required packages.
PROGRAM:

BOX PLOT:
a) Normal Curve:
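The plotting program for this item is not reproduced here. As a minimal sketch, a normal curve can be drawn by evaluating the normal density on a grid; the mean and standard deviation below are hypothetical and would normally come from a column of the dataset.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical parameters; in practice use a column's mean and standard deviation
mu, sigma = 0.0, 1.0

# Evaluate the normal density on a grid covering four standard deviations each side
x = np.linspace(mu - 4 * sigma, mu + 4 * sigma, 401)
pdf = np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

fig, ax = plt.subplots()
ax.plot(x, pdf)
ax.set_xlabel("x")
ax.set_ylabel("density")
ax.set_title("Normal curve")
```

In a Jupyter notebook the figure renders inline; in a script, call plt.show() at the end.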

b) Density Plots:
c) Correlation and Scatter plots:

Correlation plot
Scatter plot
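The figures for this item are missing. The sketch below shows the correlation matrix that a correlation plot would display. The values are made up for illustration, and the column names Age, MaxHR, and Chol are assumed to match the Heart dataset.

```python
import pandas as pd

# Hypothetical values for illustration only; not taken from the real dataset.
df = pd.DataFrame({
    "Age": [40, 50, 60, 70],
    "MaxHR": [180, 170, 160, 150],
    "Chol": [200, 240, 220, 260],
})

# Pairwise Pearson correlation between all numeric columns
corr = df.corr()
print(corr)
```

Each cell holds the correlation between two columns, so the diagonal is always 1. A scatter plot of any pair of columns shows the same relationship point by point.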

d) Histogram:
e) Three Dimensional Plotting:
RESULT:
Thus the program was executed successfully.
AIM:
To gain insight into geographic data using Basemap.

ALGORITHM:
Step 1: Install Basemap. If it is downloaded as a zip file, extract it first.
Step 2: Import the packages.
Step 3: Save the files in Downloads or any other folder.
Step 4: Run the following commands.
Step 5: The output will be displayed.

PROGRAM & OUTPUT:


%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap

# Orthographic (globe) view centred on North America
plt.figure(figsize=(8, 8))
m = Basemap(projection='ortho', resolution=None, lat_0=50, lon_0=-100)
m.bluemarble(scale=0.5)

# Lambert conformal conic map, 8000 km wide and high
fig = plt.figure(figsize=(8, 8))
m = Basemap(projection='lcc', resolution=None,
            width=8E6, height=8E6,
            lat_0=45, lon_0=-100)
m.etopo(scale=0.5, alpha=0.5)

# Map (long, lat) to (x, y) for plotting
x, y = m(-122.3, 47.6)
plt.plot(x, y, 'ok', markersize=5)
plt.text(x, y, ' Seattle', fontsize=12)
RESULT:

Thus the program was executed successfully.


REFERENCES:

1. Kaggle.com
2. UCI.
3. PIMA Indian Diabetes Data Set.
