DV Lab Manual Modified

Uploaded by

karthiksvr26
Ex no: 1 Download, install and explore the features of NumPy, SciPy, Jupyter, statsmodels and pandas packages
Date:

Aim
To download, install and explore the features of the NumPy, SciPy, Jupyter, statsmodels and pandas
packages.
Procedure
Step 1: Make sure the computer is connected to the Internet, because pip downloads the NumPy, SciPy,
Jupyter, statsmodels and pandas packages from the network.
Step 2: Open the Run window, type cmd and press Enter to open the command prompt.
Step 3: Move to the Python Scripts folder using the change directory command, cd.
Step 4: For example, type cd C:\Users\students\AppData\Local\programs\Python\Python36-32\scripts
Step 5: After setting the path, check the Python and pip versions. If a version does not meet the
requirements, update it to a newer version. Then install the given packages one by one.
Step 6: To install NumPy, enter pip install numpy
Step 7: To install SciPy, enter pip install scipy
Step 8: To install Jupyter, enter pip install jupyter
Step 9: To install statsmodels, enter pip install statsmodels
Step 10: To install pandas, enter pip install pandas
To check Python & pip version
python --version
pip --version
Output
To install NumPy
pip install numpy
To install SciPy
pip install scipy
To install Jupyter
pip install jupyter
To install statsmodels
pip install statsmodels
To install pandas
pip install pandas
Output:
Result:
Thus the packages NumPy, SciPy, Jupyter, statsmodels and pandas have been successfully
downloaded and installed using pip.
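
Once pip reports success, the installs can also be confirmed from Python itself. The sketch below is a minimal check and not part of the original procedure: the helper name installed_versions is hypothetical, and it relies only on the standard-library importlib.metadata module, which assumes Python 3.8 or later.

```python
from importlib import metadata

# Distribution names as they were passed to "pip install".
PACKAGES = ["numpy", "scipy", "jupyter", "statsmodels", "pandas"]

def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # pip has not installed this package
    return versions

report = installed_versions(PACKAGES)
for name, version in report.items():
    print(name, version if version is not None else "not installed")
```

A package reported as "not installed" usually means that pip ran under a different Python installation than the one running this script.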
a. Output
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,12, 13, 14, 15, 16, 17, 18, 19])

b. Output
0

c. Output
First element: 2.1
Second element: 4.5
Last element: 5.5
array('d', [3.5, 4.2, 3.3])
array('d', [2.1, 4.5, 3.5, 4.2, 3.3, 5.5])
Ex no: 2 Working with NumPy arrays
Date:

Aim
To write programs that create NumPy arrays and perform operations on them.
Algorithm
Step 1: Import the numpy package as np.
Step 2: Create 1D and 2D arrays using NumPy functions.
Step 3: Find the index of a value in an array.
Step 4: Use the insert() function to insert an element into an array.
Step 5: Use the remove() function to delete an element from an array.
Step 6: Use the slicing operator [:] to slice elements of an array.
Step 7: Use the concatenate() function to merge two arrays.
Program
a. Creating arrays in NumPy
import numpy as np
array = np.arange(20)
array

b. Finding the index value of an element in the array

import array as arr
numbers = arr.array('i', [10, 20, 30])
print(numbers.index(10))

c. Accessing the elements in the array

import array as ar
newarr = ar.array('d', [2.1, 4.5, 3.5, 4.2, 3.3, 5.5])
print("First element:", newarr[0])
print("Second element:", newarr[1])
print("Last element:", newarr[-1])
print(newarr[2:5])
print(newarr[:])
d. Output
1
9
2
3
5
7
10

e. Output
1
2
3
5
10

f. Output
[2 3 4 5]

g. Output
[1 2 3 4 5 6]
[1 2 3 4 5 6]
[1 2 3 4 5 6]

d. Inserting an element in an array

import array as ar
num = ar.array('i', [1, 2, 3, 5, 7, 10])
num.insert(1, 9)
for x in num:
    print(x)

e. Deleting an element in an array

import array as ar
num = ar.array('i', [1, 2, 3, 5, 7, 10])
num.remove(7)
for x in num:
    print(x)
f. Slicing elements in an array
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[1:5])

g. Concatenation
import numpy as np
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
arr = np.concatenate((arr1, arr2))
print(arr)
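
Programs b to e use Python's built-in array module rather than NumPy. As a hedged sketch of how the same steps might look with NumPy arrays (the variable names are illustrative): np.where finds the position of a value, while np.insert and np.delete return new arrays instead of modifying the original in place.

```python
import numpy as np

num = np.array([1, 2, 3, 5, 7, 10])

# Index of the first occurrence of the value 5
index_of_5 = int(np.where(num == 5)[0][0])
print(index_of_5)            # 3

# Insert 9 before position 1; the original array is left unchanged
inserted = np.insert(num, 1, 9)
print(inserted.tolist())     # [1, 9, 2, 3, 5, 7, 10]

# Delete the element whose value is 7 by looking up its position first
deleted = np.delete(num, np.where(num == 7)[0])
print(deleted.tolist())      # [1, 2, 3, 5, 10]
```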

Result:
Thus the programs and operations on NumPy arrays in Python have been executed
successfully and verified.
a. Output
   calories  duration
0       420        50
1       380        40
2       390        45

b. Output
      calories  duration
day1       420        50
day2       380        40
day3       390        45

c. Output
0
0 1
1 2
2 3
3 4
4 5
Ex no:3 Working with Pandas data frames
Date:
Aim
To write programs that create pandas data frames and perform operations on them.
Algorithm:
Step 1: Install the pandas package, then import it.
Step 2: Create a simple pandas DataFrame.
Step 3: Add a list of names to the DataFrame to give each row a name.
Step 4: Create a DataFrame from a list.
Step 5: Create a DataFrame from a dictionary of ndarrays or lists.
Step 6: Add rows to and delete rows from a pandas DataFrame.

a. Create a simple pandas DataFrame

import pandas as pd
data = {"calories": [420, 380, 390], "duration": [50, 40, 45]}
# load data into a DataFrame object:
df = pd.DataFrame(data)
print(df)

b.Add a list of names to give each row a name


import pandas as pd
data = {"calories": [420, 380, 390],"duration": [50, 40, 45]}
df = pd.DataFrame(data, index = ["day1", "day2", "day3"])

print(df)

c.Create a DataFrame from Lists


import pandas as pd
data = [1,2,3,4,5]
df = pd.DataFrame(data)
print(df)
d. Output
    Name  Age
0    Tom   28
1   Jack   34
2  Steve   29
3  Ricky   42

e. Output
a b
0 1 2
1 3 4
0 5 6
1 7 8

f. Output
a b
1 3 4
1 7 8
d. Create a DataFrame from a dictionary of ndarrays / lists
import pandas as pd
data = {'Name':['Tom', 'Jack', 'Steve', 'Ricky'],'Age':[28,34,29,42]}
df = pd.DataFrame(data)
print(df)

e. Addition of Rows
import pandas as pd
df = pd.DataFrame([[1, 2], [3, 4]], columns=['a', 'b'])
df2 = pd.DataFrame([[5, 6], [7, 8]], columns=['a', 'b'])
# DataFrame.append() was removed in pandas 2.0; pd.concat() gives the same result
df = pd.concat([df, df2])
print(df)

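f. Deletion of Rows
import pandas as pd
df = pd.DataFrame([[1, 2], [3, 4]], columns=['a', 'b'])
df2 = pd.DataFrame([[5, 6], [7, 8]], columns=['a', 'b'])
# DataFrame.append() was removed in pandas 2.0; pd.concat() gives the same result
df = pd.concat([df, df2])
# Drop rows with label 0 (both frames contribute a row labelled 0)
df = df.drop(0)
print(df)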
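
Program f removes two rows with a single drop(0), because after the two frames are combined the label 0 occurs twice. A minimal sketch of one way to avoid this, assuming a fresh 0 to n-1 index is acceptable, is to pass ignore_index=True to pd.concat so that every row gets a unique label:

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]], columns=["a", "b"])
df2 = pd.DataFrame([[5, 6], [7, 8]], columns=["a", "b"])

# ignore_index=True renumbers the combined rows 0, 1, 2, 3
combined = pd.concat([df, df2], ignore_index=True)

# Now label 0 refers to exactly one row
remaining = combined.drop(0)
print(remaining)
```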

Result:
Thus the programs and operations using pandas have been successfully executed and verified.
1. Output:

2. Output:

3. Output:
Ex no: 4 Reading data from text files, Excel and the web and exploring various commands for doing descriptive analytics on the Iris data set
Date:

Aim:

To read data from text files, Excel and the web, and to explore various commands for doing
descriptive analytics on the Iris data set.
Procedure:

Download the Dataset “Iris.csv”

1) Display the whole dataset

Code:

import pandas as pd

# A raw string (r"...") keeps the backslashes in the Windows path from being read as escapes
data = pd.read_csv(r"D:\FDS lab-python\iris_csv.csv")
print(data)
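
The exercise title also mentions text files, Excel and the web, while the programs here read only a local CSV file. As a rough sketch, pd.read_csv accepts other text sources and a sep argument for other delimiters. The tab-separated sample below is made-up data held in memory so that the example runs without any file on disk. pd.read_csv can likewise be given a URL string, and pd.read_excel reads Excel workbooks, provided an Excel engine such as openpyxl is installed.

```python
import io
import pandas as pd

# Made-up, tab-separated sample standing in for a text file on disk
text_data = (
    "sepallength\tsepalwidth\tclass\n"
    "5.1\t3.5\tIris-setosa\n"
    "7.0\t3.2\tIris-versicolor\n"
)

# sep="\t" tells read_csv that fields are separated by tabs rather than commas
sample = pd.read_csv(io.StringIO(text_data), sep="\t")
print(sample)
print(sample.shape)   # (2, 3)
```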

2) Displaying a number of rows at random.

The sample() function returns as many rows as its argument specifies, chosen at random
from the dataset.

print(data.sample(2))

3) Displaying the number of columns and names of the columns.

The columns attribute holds the names of all the columns of the dataset.

Code:

import pandas as pd
data = pd.read_csv(r"D:\FDS lab-python\iris_csv.csv")
print(data.columns)
4. Output:

5. Output:

6. Output:
4) Displaying the shape of the dataset.
The shape of the dataset is the total number of rows (entries) and the total
number of columns (features) of that dataset.

Code:

import pandas as pd
data = pd.read_csv(r"D:\FDS lab-python\iris_csv.csv")
print(data.shape)

5) Slicing the rows.

Slicing selects a particular group of rows to print or work on. The code below selects the rows
at positions 2 to 4; the end position, 5, is not included.

Code:

import pandas as pd
data = pd.read_csv(r"D:\FDS lab-python\iris_csv.csv")
sliced_data = data[2:5]
print(sliced_data)

6) Displaying only specific columns.

It is sometimes necessary to work on only specific features or columns of a dataset. This can
be done with the following code.

Code:

import pandas as pd
data = pd.read_csv(r"D:\FDS lab-python\iris_csv.csv")
specific_data = data[["sepallength", "petallength"]]
print(specific_data.head(4))
7. Output:

8. Output:
7) Displaying specific rows using “iloc” and “loc”.
The “loc” indexer selects rows by their index label, or by a condition on the data.
The “iloc” indexer selects rows by integer position and returns all the values of that
row.

Code:
data.iloc[5]

# Assumes the species column is named "class" and holds values such as "Iris-setosa"
data.loc[data["class"] == "Iris-setosa"]
print(data.iloc[5])
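
The difference between the two indexers is easiest to see when the row labels are not integers. The sketch below uses a small made-up frame (the labels r1 to r3 and the values are illustrative, not taken from the Iris file):

```python
import pandas as pd

# Small made-up frame with string labels so that labels and positions differ
df = pd.DataFrame(
    {
        "sepallength": [5.1, 4.9, 6.3],
        "class": ["Iris-setosa", "Iris-setosa", "Iris-virginica"],
    },
    index=["r1", "r2", "r3"],
)

by_label = df.loc["r2"]        # row whose index label is "r2"
by_position = df.iloc[1]       # second row, counting from 0
setosa = df.loc[df["class"] == "Iris-setosa"]   # rows selected by a condition

print(by_label["sepallength"], by_position["sepallength"])   # 4.9 4.9
print(len(setosa))                                           # 2
```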

8) Counting the occurrences of unique values using “value_counts()”.

The value_counts() function counts the number of times each distinct value occurs
in a column.

Code:

import pandas as pd
data = pd.read_csv(r"D:\FDS lab-python\iris_csv.csv")
print(data["sepallength"].value_counts())

9) Calculating sum, mean and median of a particular column.

We can also calculate the sum, mean and median of any numeric column, as in the
following code.

Code:

import pandas as pd
data = pd.read_csv(r"D:\FDS lab-python\iris_csv.csv")
sum_data = data["sepallength"].sum()
mean_data = data["sepallength"].mean()
9. Output:

10. Output:

11. Output:
median_data = data["sepallength"].median()
print("sum:", sum_data, "\n mean:", mean_data, "\n median:", median_data)

10) Extracting minimum and maximum from a column.

The minimum and maximum values of a particular column or row of a dataset can also be
identified.

Code:

import pandas as pd
data = pd.read_csv(r"D:\FDS lab-python\iris_csv.csv")
min_data = data["sepallength"].min()
max_data = data["sepallength"].max()
print("minimum:", min_data, "\n maximum:", max_data)
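
Several of these statistics can also be obtained in one call. As a brief sketch, describe() returns the count, mean, standard deviation, minimum, quartiles and maximum of a numeric column; the values below are made up for illustration and stand in for one column of the Iris data.

```python
import pandas as pd

# Made-up measurements standing in for one column of the Iris data
values = pd.Series([5.1, 4.9, 4.7, 4.6, 5.0], name="sepallength")

summary = values.describe()
print(summary)

# Individual statistics can be read from the summary by name
print(summary["min"], summary["max"])   # 4.6 5.1
```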

11) Adding a column to the dataset.

To keep the result of a calculation, or information extracted from the dataset, we can save it
as a new column. The following code takes the case where the values of all numeric columns
in each row are added together.

Code:
import pandas as pd
data = pd.read_csv(r"D:\FDS lab-python\iris_csv.csv")
cols = data.columns
print(cols)
# Add up the numeric columns of each row and store the result in a new column
data["total_values"] = data.select_dtypes("number").sum(axis=1)
print(data.head())

Result:

Thus reading data from files and exploring various commands for doing descriptive analytics on
the Iris data set has been executed successfully.
Output- Importing the Diabetes CSV data file:

Output- To print full information about the data set:


Ex no: 5 Use the diabetes data set from UCI and Pima Indians diabetes data set for performing the following
Date:

a) Univariate analysis: Frequency, Mean, Median, Mode, Variance, Standard Deviation, Skewness and Kurtosis.
Aim:

To find the Frequency, Mean, Median, Mode, Variance, Standard Deviation, Skewness and Kurtosis for
the given UCI and Pima Indians diabetes data set.
Procedure:

1. Importing necessary packages

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import statistics as st

2. Importing the Diabetes CSV data file

diab = pd.read_csv(r"D:\FDS lab-python\pima-indians-diabetes.csv")

To print full data set

print(diab)

To print full information about the data set

print(diab.info())
Output- Mean:

Output mean(axis=1)[1:5]:

Output- Median:

Output- Mode:
3. Univariate Analysis

Mean
Mean represents the arithmetic average of the data. The line of code below prints the mean of the
numerical variables in the data.

print(diab.mean())

print(diab.mean(axis=1)[1:5])

Median
In simple terms, median represents the 50th percentile, or the middle value of the data, that
separates the distribution into two halves.

print(diab.median())

Mode
Mode represents the most frequent value of a variable in the data. The mode() function returns the
most common value or most repeated value of a variable.

print(diab.mode())
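
The aim also lists frequency, which is closely related to the mode: the mode is simply the value with the highest frequency. As a minimal sketch on made-up data (the series name outcome is an assumption, not a column name confirmed by the data file), value_counts() gives the frequency of each distinct value:

```python
import pandas as pd

# Made-up 0/1 values standing in for one column of the diabetes data
outcome = pd.Series([0, 1, 0, 0, 1, 0], name="outcome")

# Absolute frequency of each distinct value, most frequent first
freq = outcome.value_counts()
print(freq)

# Relative frequency (proportion of rows) for each value
rel_freq = outcome.value_counts(normalize=True)
print(rel_freq)
```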

Standard Deviation
Standard deviation is a measure that is used to quantify the amount of variation of a set of data
values from its mean.

print(diab.std())
Output -Standard Deviation:

Output-Variance:

Output- Skewness:

Output- Kurtosis
Variance
Variance is another measure of dispersion. It is the square of the standard deviation and the
covariance of the random variable with itself.

print(diab.var())

Skewness

Another useful statistic is skewness, which measures the asymmetry of the distribution of a
real-valued random variable about its mean. The skewness value can be positive (longer right
tail), negative (longer left tail), zero (symmetric), or undefined.

print(diab.skew())

Kurtosis

Kurtosis is often described as the peakedness of the data at the mean value. More precisely,
“The kurtosis parameter is a measure of the combined weight of the tails relative to the rest of
the distribution.” This means it measures the tail heaviness of the given distribution.

print(diab.kurt())
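
The relationships described above can be checked on a small sample. The sketch below uses made-up numbers with one large value, so that the right tail is long; it is an illustration under that assumption, not output from the diabetes data.

```python
import pandas as pd

# Made-up sample with one large value, giving a long right tail
s = pd.Series([1, 2, 2, 3, 3, 3, 10])

variance = s.var()     # sample variance (divides by n - 1)
std_dev = s.std()      # sample standard deviation
print(abs(variance - std_dev ** 2) < 1e-9)   # True: variance is the square of the std

print(s.skew() > 0)    # True: the long tail is on the right
print(s.kurt())        # excess kurtosis; above 0 suggests heavier tails than a normal distribution
```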

Result:

Thus the Frequency, Mean, Median, Mode, Variance, Standard Deviation, Skewness and Kurtosis for
the given UCI and Pima Indians diabetes data set have been found successfully.
Sample Output- Linear Regression:
Ex no: 6 Prime or not using R
Date:

Aim:

To write a program to check whether the input number is prime or not.

Program:

# Program to check if the input number is prime or not

# take input from the user
num = as.integer(readline(prompt="Enter a number: "))
flag = 0
# prime numbers are greater than 1
if(num > 1) {
  # check for factors
  flag = 1
  for(i in 2:(num-1)) {
    if ((num %% i) == 0) {
      flag = 0
      break
    }
  }
}
# 2:(num-1) counts down to 1 when num is 2, so handle 2 separately
if(num == 2) flag = 1
if(flag == 1) {
  print(paste(num,"is a prime number"))
} else {
  print(paste(num,"is not a prime number"))
}
Output 1
Enter a number: 25
[1] "25 is not a prime number"
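
For comparison with the Python used in the earlier exercises, the same check can be sketched in Python. This is an illustrative variant rather than a line-by-line translation of the R program: the function name is_prime is hypothetical, and it tests divisors only up to the square root of the number, which is sufficient because any factor larger than the square root pairs with one smaller than it. It assumes Python 3.8 or later for math.isqrt.

```python
import math

def is_prime(num):
    """Return True if num is a prime number, otherwise False."""
    if num < 2:
        return False          # prime numbers are greater than 1
    for i in range(2, math.isqrt(num) + 1):
        if num % i == 0:
            return False      # found a factor, so num is not prime
    return True

for n in (2, 25, 29):
    print(n, "is a prime number" if is_prime(n) else "is not a prime number")
```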
Result:

Thus the program to find whether the given number is prime or not is executed successfully.
