Dal Programs With Output

Data Analytics using Python

Experiment no.1

Lists in Python
my_list = ['Sujit','Sandip','Milind','Makarand']
print(my_list)
print(my_list[1])

Output:
['Sujit', 'Sandip', 'Milind', 'Makarand']
Sandip

Arithmetic Operations on Numpy Arrays

Addition

import numpy as np
a = np.array([5, 72, 13, 100])
b = np.array([2, 5, 10, 30])
add_ans = np.add(a, b)
print(add_ans)
# np.add takes two operands (a third positional argument is the output array),
# so chain the calls to add more than two arrays
c = np.array([1, 2, 3, 4])
add_ans = np.add(np.add(a, b), c)
print(add_ans)

Output:
[ 7 77 23 130]
[ 8 79 26 134]

Subtraction:
import numpy as np
a = np.array([5, 72, 13, 100])
b = np.array([2, 5, 10, 30])
sub_ans = np.subtract(a, b)
print(sub_ans)

Output:
[ 3 67 3 70]

Multiplication:
import numpy as np
a = np.array([5, 72, 13, 100])
b = np.array([2, 5, 10, 30])
mul_ans = np.multiply(a, b)
print(mul_ans)

Output:
[ 10 360 130 3000]

Division:
import numpy as np
a = np.array([5, 72, 13, 100])
b = np.array([2, 5, 10, 30])
div_ans = np.divide(a, b)
print(div_ans)

Output:
[ 2.5 14.4 1.3 3.33333333]
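
The arithmetic operators +, -, *, and / also work element-wise on NumPy arrays and should give the same results as the four functions used above. A short sketch, reusing the same two arrays:

```python
import numpy as np

a = np.array([5, 72, 13, 100])
b = np.array([2, 5, 10, 30])

# Each operator is the element-wise counterpart of a NumPy function
print(a + b)   # same as np.add(a, b)
print(a - b)   # same as np.subtract(a, b)
print(a * b)   # same as np.multiply(a, b)
print(a / b)   # same as np.divide(a, b); the result is a float array
```

The operator form is usually easier to read, and it chains naturally when more than two arrays are combined (for example a + b + c).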

Dictionary in Python
my_dict = {
    "brand": "Tata",
    "model": "Punch",
    "year": 2021
}

Experiment no.6

Min-Max Normalization:

from sklearn import preprocessing


import numpy as np
X_train = np.array([[ 1., -1., 2.],
[ 2., 0., 0.],
[ 0., 1., -1.]])
min_max_scaler = preprocessing.MinMaxScaler()
X_train_minmax = min_max_scaler.fit_transform(X_train)
print(X_train_minmax)

Output:
[[0.5 0. 1. ]
[1. 0.5 0.33333333]
[0. 1. 0. ]]
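
With its default settings, MinMaxScaler rescales each column to the range [0, 1] using x' = (x - min) / (max - min). The sketch below writes that formula out in plain Python for the same matrix, as a way to check the output by hand. The helper name min_max_scale is made up for this illustration, and it assumes no column is constant (otherwise max - min would be zero).

```python
# Column-wise min-max scaling written out by hand (illustrative sketch).
X_train = [[1.0, -1.0, 2.0],
           [2.0, 0.0, 0.0],
           [0.0, 1.0, -1.0]]

def min_max_scale(rows):
    """Rescale every column of a list of rows to the range [0, 1]."""
    columns = list(zip(*rows))              # transpose rows into columns
    mins = [min(col) for col in columns]    # per-column minimum
    maxs = [max(col) for col in columns]    # per-column maximum
    return [
        [(value - lo) / (hi - lo) for value, lo, hi in zip(row, mins, maxs)]
        for row in rows
    ]

scaled = min_max_scale(X_train)
for row in scaled:
    print([round(v, 8) for v in row])
```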

Calculate Z-Scores in Python


import numpy as np
import scipy.stats as stats

data = np.array([6, 7, 7, 12, 13, 13, 15, 16, 19, 22])

# print the z-score of each value (population standard deviation, ddof=0)
print(stats.zscore(data))

Output (rounded to three decimals):
[-1.394, -1.195, -1.195, -0.199, 0, 0, 0.398, 0.598, 1.195, 1.793]
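
A z-score is z = (x - mean) / std. The sketch below recomputes the values with only the standard library, assuming the population standard deviation (dividing by n), which is what stats.zscore uses by default.

```python
import statistics

data = [6, 7, 7, 12, 13, 13, 15, 16, 19, 22]

mean = statistics.fmean(data)     # arithmetic mean
std = statistics.pstdev(data)     # population standard deviation (divides by n)

# z-score of each value: distance from the mean in units of std
z_scores = [(x - mean) / std for x in data]
print([round(z, 3) for z in z_scores])
```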

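Equal-Frequency Binning in Python

import pandas as pd

#create the DataFrame shown in the output below
df = pd.DataFrame({'points': [4, 4, 7, 8, 12, 13, 15, 18, 22, 23, 23, 25],
                   'assists': [2, 5, 4, 7, 7, 8, 5, 4, 5, 11, 13, 8],
                   'rebounds': [7, 7, 4, 6, 3, 8, 9, 9, 12, 11, 8, 9]})

#perform binning with 3 bins
df['points_bin'] = pd.qcut(df['points'], q=3)

#view updated DataFrame
print(df)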

Output:
    points  assists  rebounds         points_bin
0        4        2         7    (3.999, 10.667]
1        4        5         7    (3.999, 10.667]
2        7        4         4    (3.999, 10.667]
3        8        7         6    (3.999, 10.667]
4       12        7         3   (10.667, 19.333]
5       13        8         8   (10.667, 19.333]
6       15        5         9   (10.667, 19.333]
7       18        4         9   (10.667, 19.333]
8       22        5        12     (19.333, 25.0]
9       23       11        11     (19.333, 25.0]
10      23       13         8     (19.333, 25.0]
11      25        8         9     (19.333, 25.0]

We can use the value_counts() function to find how many rows have been placed in each bin:

#count frequency of each bin
print(df['points_bin'].value_counts())

Output:
(3.999, 10.667]     4
(10.667, 19.333]    4
(19.333, 25.0]      4
Name: points_bin, dtype: int64

Experiment 7: Statistical Data Analysis using Python

# Python program to get variance of a list

# Importing the NumPy module
import numpy as np

# named "data" so the built-in list() is not shadowed
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(np.var(data))

Output:
4.0

# Python program to get standard deviation of a list

import numpy as np

# named "data" so the built-in list() is not shadowed
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(np.std(data))

Output:
2.0
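
By default, np.var and np.std compute the population variance and standard deviation (dividing by n). The sketch below spells out that arithmetic in plain Python and, for comparison, the sample version that divides by n - 1 (what passing ddof=1 to the NumPy functions would select).

```python
data = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(data)
mean = sum(data) / n
squared_deviations = sum((x - mean) ** 2 for x in data)

# Population statistics: divide by n (NumPy's default, ddof=0)
population_variance = squared_deviations / n
population_std = population_variance ** 0.5

# Sample statistics: divide by n - 1 (ddof=1)
sample_variance = squared_deviations / (n - 1)
sample_std = sample_variance ** 0.5

print(population_variance, population_std)
print(sample_variance, sample_std)
```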
# Python program to get mean of speed

import numpy as np

speed = [99,86,87,88,111,86,103,87,94,78,77,85,86]

x = np.mean(speed)

print(x)

Output:
89.76923076923077

# Python program to get median value of speed

import numpy as np

speed = [99,86,87,88,86,103,87,94,78,77,85,86]

x = np.median(speed)

print(x)

Output:
86.5

# Python program to get mode of speed

from scipy import stats

speed = [99,86,87,88,111,86,103,87,94,78,77,85,86]

x = stats.mode(speed)

print(x)
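
Output:
ModeResult(mode=array([86]), count=array([3]))
(SciPy 1.11 and later return scalars for mode and count instead of one-element arrays, i.e. mode 86 with count 3.)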

Experiment 8: Exploratory Data Analysis using Python, Groupby

# pandas_legislators.py

import pandas as pd

dtypes = {
    "first_name": "category",
    "gender": "category",
    "type": "category",
    "state": "category",
    "party": "category",
}
df = pd.read_csv(
    "groupby-data/legislators-historical.csv",
    dtype=dtypes,
    usecols=list(dtypes) + ["birthday", "last_name"],
    parse_dates=["birthday"]
)

>>> from pandas_legislators import df
>>> df.tail()
      last_name first_name   birthday gender type state       party
11970   Garrett     Thomas 1972-03-27      M  rep    VA  Republican
11971    Handel      Karen 1962-04-18      F  rep    GA  Republican
11972     Jones     Brenda 1959-10-24      F  rep    MI    Democrat
11973    Marino        Tom 1952-08-15      M  rep    PA  Republican
11974     Jones     Walter 1943-02-10      M  rep    NC  Republican

>>> df.groupby(["state", "gender"])["last_name"].count()
state  gender
AK     F           0
       M          16
AL     F           3
       M         203
AR     F           5
                ...
WI     M         196
WV     F           1
       M         119
WY     F           2
       M          38
Name: last_name, Length: 116, dtype: int64

Experiment 9: Linear Regression Program:

import matplotlib.pyplot as plt


from scipy import stats

x = [5,7,8,7,2,17,2,9,4,11,12,9,6]
y = [99,86,87,88,111,86,103,87,94,78,77,85,86]

slope, intercept, r, p, std_err = stats.linregress(x, y)

# return the fitted line's y value for a given x
def myfunc(x):
    return slope * x + intercept

mymodel = list(map(myfunc, x))

plt.scatter(x, y)
plt.plot(x, mymodel)
plt.show()

Result:
A scatter plot of the data points with the fitted regression line drawn through them.
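
The slope and intercept returned by stats.linregress are the ordinary least-squares estimates. As a rough check, the sketch below computes them by hand in plain Python for the same data and uses the fitted line for a prediction. The helper name predict and the input value x = 10 are chosen only for illustration.

```python
# Ordinary least-squares fit written out by hand (illustrative sketch).
x = [5, 7, 8, 7, 2, 17, 2, 9, 4, 11, 12, 9, 6]
y = [99, 86, 87, 88, 111, 86, 103, 87, 94, 78, 77, 85, 86]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# slope = covariance(x, y) / variance(x); the 1/n factors cancel out
s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
s_xx = sum((xi - mean_x) ** 2 for xi in x)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

def predict(x_new):
    """Return the fitted line's y value at x_new."""
    return slope * x_new + intercept

print(round(slope, 4), round(intercept, 4))
print(round(predict(10), 2))   # x = 10 is an arbitrary example input
```

The negative slope means the fitted y value falls as x grows, which matches the downward trend of the line in the plot.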

Experiment 10: Multiple Regression Program:

import pandas
from sklearn import linear_model

df = pandas.read_csv("data.csv")

X = df[['Weight', 'Volume']]
y = df['CO2']

regr = linear_model.LinearRegression()
regr.fit(X, y)

# predict the CO2 emission of a car where the weight is 2300 kg
# and the volume is 1300 cm3:
predictedCO2 = regr.predict([[2300, 1300]])

print(predictedCO2)
Result:
[107.2087328]
