DV Lab Manual Modified

Uploaded by

karthiksvr26


Ex no: 1 Download, install and explore the features of NumPy, SciPy, Jupyter, statsmodels and pandas packages
Date:

Aim
To download, install and explore the features of the NumPy, SciPy, Jupyter, statsmodels and pandas packages.
Procedure
Step 1: Make sure the computer is connected to the internet, because pip downloads the NumPy, SciPy, Jupyter, statsmodels and pandas packages over the network.
Step 2: Open the Run window, type cmd and press Enter to open the command prompt.
Step 3: Change to the Python scripts folder using the change-directory command cd.
Step 4: For example, type cd C:\Users\students\AppData\Local\programs\Python\Python36-32\scripts
Step 5: After setting the path, check the Python and pip versions. If a version does not meet the requirements, update it to a newer version. Then install the given packages one by one.
Step 6: Install NumPy by entering pip install numpy
Step 7: Install SciPy by entering pip install scipy
Step 8: Install Jupyter by entering pip install jupyter
Step 9: Install statsmodels by entering pip install statsmodels
Step 10: Install pandas by entering pip install pandas
To check the Python and pip versions
python --version
pip --version
To install NumPy
pip install numpy
To install SciPy
pip install scipy
To install Jupyter
pip install jupyter
To install statsmodels
pip install statsmodels
To install pandas
pip install pandas
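Once the pip commands have finished, the result can be checked from Python itself. The following is a minimal sketch, assuming Python 3.8 or later (for importlib.metadata); the helper name installed_versions is illustrative and not part of the manual. It reports the installed version of each package, or marks it as missing.

```python
# Report the installed version of each package, or None if it is missing.
from importlib.metadata import PackageNotFoundError, version

PACKAGES = ["numpy", "scipy", "jupyter", "statsmodels", "pandas"]


def installed_versions(names):
    """Map each distribution name to its version string, or None if not installed."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


versions = installed_versions(PACKAGES)
for name, ver in versions.items():
    print(f"{name:12s} {ver if ver else 'NOT INSTALLED'}")
```

Any package shown as NOT INSTALLED can be installed again with the matching pip install command above.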
Result:
Thus the packages NumPy, SciPy, Jupyter, statsmodels and pandas have been successfully downloaded and installed using pip.
a. Output
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19])

b. Output
0

c. Output
First element: 2.1
Second element: 4.5
Last element: 5.5
array('d', [3.5, 4.2, 3.3])
array('d', [2.1, 4.5, 3.5, 4.2, 3.3, 5.5])
Ex no: 2 Working with Numpy arrays
Date:

Aim
To write programs that create NumPy arrays and perform operations on them.
Algorithm
Step 1: Import the numpy package as np.
Step 2: Create 1D and 2D arrays using NumPy functions.
Step 3: Find the index of a value in an array.
Step 4: Use the insert() function to insert an element into an array.
Step 5: Use the delete() function to delete elements from an array.
Step 6: Use the slicing operator [:] to slice elements of an array.
Step 7: Use the concatenate() function to merge two arrays.
Program
a. Creating arrays in NumPy
import numpy as np
array = np.arange(20)
array

b. Finding index value of the array

# Note: parts b to e use Python's built-in array module rather than NumPy
import array as arr
numbers = arr.array('i', [10, 20, 30])
print(numbers.index(10))

c. Access the elements in the array

import array as ar
newarr = ar.array('d', [2.1, 4.5, 3.5, 4.2, 3.3, 5.5])
print("First element:", newarr[0])
print("Second element:", newarr[1])
print("Last element:", newarr[-1])
print(newarr[2:5])
print(newarr[:])
d. Output
1
9
2
3
5
7
10

e. Output
1
2
3
5
10

f. Output
[2 3 4 5]

g. Output
[1 2 3 4 5 6]

d. Inserting an element in an array

import array as ar
num = ar.array('i', [1, 2, 3, 5, 7, 10])
num.insert(1, 9)
for x in num:
    print(x)

e. Deleting an element in an array

import array as ar
num = ar.array('i', [1, 2, 3, 5, 7, 10])
num.remove(7)
for x in num:
    print(x)

f. Slice elements in an array
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[1:5])

g. Concatenation
import numpy as np
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
arr = np.concatenate((arr1, arr2))
print(arr)
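Parts b to e above use Python's built-in array module, while the algorithm refers to NumPy's insert() and delete() functions. As a hedged sketch of the NumPy counterparts (the variable names are illustrative), the same index lookup, insertion and deletion can be written as follows. Unlike the in-place methods of the array module, np.insert and np.delete return new arrays.

```python
import numpy as np

num = np.array([1, 2, 3, 5, 7, 10])

# Index of the first element equal to 5 (np.where returns a tuple of index arrays).
idx = int(np.where(num == 5)[0][0])
print(idx)                  # 3

# np.insert returns a new array with 9 placed before position 1.
inserted = np.insert(num, 1, 9)
print(inserted.tolist())    # [1, 9, 2, 3, 5, 7, 10]

# np.delete removes by position, so look up the position of the value 7 first.
deleted = np.delete(num, np.where(num == 7)[0])
print(deleted.tolist())     # [1, 2, 3, 5, 10]

# The original array is unchanged because both calls return copies.
print(num.tolist())         # [1, 2, 3, 5, 7, 10]
```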

Result:
Thus the programs and operations on NumPy arrays in Python have been executed successfully and verified.
a. Output
   calories  duration
0       420        50
1       380        40
2       390        45

b. Output
      calories  duration
day1       420        50
day2       380        40
day3       390        45

c. Output
   0
0  1
1  2
2  3
3  4
4  5
Ex no: 3 Working with Pandas data frames
Date:
Aim
To write programs that create Pandas data frames and perform operations on them.
Algorithm:
Step 1: Install the pandas package, then import it.
Step 2: Create a simple pandas DataFrame.
Step 3: Add a list of names to give each row of the DataFrame a name.
Step 4: Create a DataFrame from a list.
Step 5: Create a DataFrame from a dictionary of ndarrays or lists.
Step 6: Add rows to and delete rows from the DataFrame.

a.Create a simple Pandas DataFrame

import pandas as pd
data = { "calories": [420, 380, 390], "duration": [50, 40, 45]}
#load data into a DataFrame object:
df = pd.DataFrame(data)
print(df)

b.Add a list of names to give each row a name

import pandas as pd
data = {"calories": [420, 380, 390],"duration": [50, 40, 45]}
df = pd.DataFrame(data, index = ["day1", "day2", "day3"])

print(df)

c.Create a DataFrame from Lists

import pandas as pd
data = [1,2,3,4,5]
df = pd.DataFrame(data)
print(df)
d. Output
    Name  Age
0    Tom   28
1   Jack   34
2  Steve   29
3  Ricky   42

e. Output
   a  b
0  1  2
1  3  4
0  5  6
1  7  8

f. Output
   a  b
1  3  4
1  7  8
d. Create a DataFrame from Dictionary of ndarrays / Lists
import pandas as pd
data = {'Name': ['Tom', 'Jack', 'Steve', 'Ricky'], 'Age': [28, 34, 29, 42]}
df = pd.DataFrame(data)
print(df)

e. Addition of Rows
import pandas as pd
df = pd.DataFrame([[1, 2], [3, 4]], columns=['a', 'b'])
df2 = pd.DataFrame([[5, 6], [7, 8]], columns=['a', 'b'])
# DataFrame.append was removed in pandas 2.0; pd.concat gives the same result
df = pd.concat([df, df2])
print(df)

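f. Deletion of Rows
import pandas as pd
df = pd.DataFrame([[1, 2], [3, 4]], columns=['a', 'b'])
df2 = pd.DataFrame([[5, 6], [7, 8]], columns=['a', 'b'])
# DataFrame.append was removed in pandas 2.0; pd.concat gives the same result
df = pd.concat([df, df2])
# Drop rows with label 0 (both rows labelled 0 are removed)
df = df.drop(0)
print(df)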
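The deletion in part f removes two rows because stacking the frames keeps the original labels 0, 1, 0, 1. A short sketch of the difference, assuming a pandas version that provides pd.concat with the ignore_index option (the names stacked and renumbered are illustrative):

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]], columns=["a", "b"])
df2 = pd.DataFrame([[5, 6], [7, 8]], columns=["a", "b"])

# Keeping the original labels gives 0, 1, 0, 1, so drop(0) removes two rows.
stacked = pd.concat([df, df2])
print(stacked.drop(0))

# ignore_index=True renumbers the rows 0..3, so drop(0) removes only the first row.
renumbered = pd.concat([df, df2], ignore_index=True)
print(renumbered.drop(0))
```

Renumbering is usually the safer choice when row labels carry no meaning of their own.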

Result:
Thus the programs and operations using pandas have been successfully executed and verified.
Ex no: 4 Reading data from text files, Excel and the web and exploring various commands for doing descriptive analytics on the Iris data set
Date:

Aim:

To read data from text files, Excel and the web, and to explore various commands for doing descriptive analytics on the Iris data set.
Procedure:

Download the Dataset “Iris.csv”

1) Display the whole dataset

Code:

import pandas as pd

# A raw string (r"...") keeps the backslashes in the Windows path literal
data = pd.read_csv(r"D:\FDS lab-python\iris_csv.csv")
print(data)
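The aim also mentions text files, Excel and the web, while the steps only read a local CSV file. The sketch below shows how the same pandas readers might be used for those sources. It assumes the column names sepallength, sepalwidth, petallength, petalwidth and class; only sepallength and petallength appear in this manual, so the others are assumptions, and the three sample rows, the Excel file name and the URL are placeholders.

```python
import io

import pandas as pd

# Hypothetical three-row sample standing in for the contents of iris_csv.csv.
SAMPLE = """sepallength,sepalwidth,petallength,petalwidth,class
5.1,3.5,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
"""

# read_csv accepts a path, a URL string, or any file-like object.
data = pd.read_csv(io.StringIO(SAMPLE))
print(data.head())

# Tab-separated text files use the same function with a different separator.
tsv = pd.read_csv(io.StringIO(SAMPLE.replace(",", "\t")), sep="\t")
print(tsv.equals(data))

# For the other sources named in the aim (file name and URL are placeholders):
# data = pd.read_excel("iris.xlsx")   # needs an Excel engine such as openpyxl
# data = pd.read_csv("https://example.com/iris.csv")
```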

2) Displaying a random sample of rows.

The sample() function displays as many rows as its argument requests, chosen at random.

print(data.sample(2))

3) Displaying the number of columns and names of the columns.

The columns attribute lists all the column names of the dataset.

Code:

import pandas as pd
data = pd.read_csv(r"D:\FDS lab-python\iris_csv.csv")
print(data.columns)
4) Displaying the shape of the dataset.
The shape of the dataset is the total number of rows (entries) and the total number of columns (features) of that dataset.

Code:

import pandas as pd
data = pd.read_csv(r"D:\FDS lab-python\iris_csv.csv")
print(data.shape)

5) Slicing the rows.

Slicing means selecting a particular group of rows to print or work on, for example the 10th to the 20th row. The code below selects the rows at positions 2, 3 and 4.

Code:

import pandas as pd
data = pd.read_csv(r"D:\FDS lab-python\iris_csv.csv")
sliced_data = data[2:5]
print(sliced_data)

6) Displaying only specific columns.

It is sometimes necessary to work on only specific features or columns of a dataset, which can be done with the following code.

Code:

import pandas as pd
data = pd.read_csv(r"D:\FDS lab-python\iris_csv.csv")
specific_data = data[["sepallength", "petallength"]]
print(specific_data.head(4))
7) Displaying specific rows using "loc" and "iloc".
The "loc" indexer selects rows by index label or by a boolean condition.
The "iloc" indexer selects rows by integer position and returns the complete row.

Code:
# Row at integer position 5
print(data.iloc[5])
# Rows whose species is Iris-setosa (assumes the species column is named "class")
print(data.loc[data["class"] == "Iris-setosa"])
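The difference between label-based and position-based selection is easier to see when the row labels are not integers. A minimal sketch with hypothetical rows and labels r1 to r4; the "class" column name is an assumption about the CSV:

```python
import pandas as pd

# Hypothetical rows; the "class" column name is an assumption about the CSV.
data = pd.DataFrame(
    {
        "sepallength": [5.1, 7.0, 6.3, 4.9],
        "class": ["Iris-setosa", "Iris-versicolor", "Iris-virginica", "Iris-setosa"],
    },
    index=["r1", "r2", "r3", "r4"],
)

# iloc selects by integer position: the second row, whatever its label is.
print(data.iloc[1])

# loc selects by label ...
print(data.loc["r2"])

# ... or by a boolean mask built from a column.
setosa = data.loc[data["class"] == "Iris-setosa"]
print(setosa)
```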

8) Counting the occurrences of unique values using "value_counts()".
The value_counts() function counts the number of times each distinct value occurs.

Code:

import pandas as pd
data = pd.read_csv(r"D:\FDS lab-python\iris_csv.csv")
print(data["sepallength"].value_counts())

9) Calculating sum, mean and median of a particular column.

We can also calculate the sum, mean and median of any numeric column, as in the following code.

Code:

import pandas as pd
data = pd.read_csv(r"D:\FDS lab-python\iris_csv.csv")
sum_data = data["sepallength"].sum()
mean_data = data["sepallength"].mean()
median_data = data["sepallength"].median()
print("sum:", sum_data, "\n mean:", mean_data, "\n median:", median_data)

10) Extracting minimum and maximum from a column.

The minimum and maximum values of a particular column or row can also be extracted from a dataset.

Code:

import pandas as pd
data = pd.read_csv(r"D:\FDS lab-python\iris_csv.csv")
min_data = data["sepallength"].min()
max_data = data["sepallength"].max()
print("minimum:", min_data, "\n maximum:", max_data)
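Several of the statistics from steps 9 and 10 can be obtained in one call with describe(). The sketch below uses five hypothetical values in place of the sepallength column, and also shows mode(), which returns a Series because several values can tie for most frequent.

```python
import pandas as pd

# Hypothetical values standing in for the sepallength column.
col = pd.Series([5.1, 4.9, 5.1, 6.3, 7.0], name="sepallength")

summary = col.describe()      # count, mean, std, min, quartiles, max
print(summary)

# mode() returns a Series because several values can tie for most frequent.
print(col.mode().tolist())    # [5.1]
```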

11) Adding a column to the dataset.

To save the result of a calculation or some extracted information, add it to the dataset as a new column. The following code adds a column holding the sum of all numeric columns in each row.

Code:
import pandas as pd
data = pd.read_csv(r"D:\FDS lab-python\iris_csv.csv")
cols = data.columns
print(cols)
# Row-wise sum of the numeric columns; "total_values" is an illustrative column name
data["total_values"] = data.select_dtypes("number").sum(axis=1)
print(data.head())

Result:

Thus reading data from text files, Excel and the web and exploring various commands for descriptive analytics on the Iris data set has been executed successfully.

Ex no: 5 Use the diabetes data set from UCI and Pima Indians diabetes data set for performing the following
Date:

a) Univariate analysis: Frequency, Mean, Median, Mode, Variance, Standard Deviation, Skewness and Kurtosis.
Aim:

To find the Frequency, Mean, Median, Mode, Variance, Standard Deviation, Skewness and Kurtosis for the given UCI and Pima Indians diabetes data set.
Procedure:

1. Importing necessary packages

import numpy as np

import pandas as pd

import seaborn as sns

import matplotlib.pyplot as plt

import statistics as st

2. Importing the Diabetes CSV data file

diab = pd.read_csv(r"D:\FDS lab-python\pima-indians-diabetes.csv")

To print full data set

print(diab)

To print full information about the data set

print(diab.info())
3. Univariate Analysis

Mean
Mean represents the arithmetic average of the data. The first line of code below prints the mean of each numerical variable in the data; the second prints the row-wise means of rows 1 to 4.

print(diab.mean())

print(diab.mean(axis=1)[1:5])

Median
In simple terms, median represents the 50th percentile, or the middle value of the data, that
separates the distribution into two halves.

print(diab.median())

Mode
Mode represents the most frequent value of a variable in the data. The mode() function returns the most common value of each variable; if several values tie, it returns all of them.

print(diab.mode())
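Frequency is listed in the aim but has no code of its own. One way to obtain it is value_counts(), sketched here on a hypothetical stand-in for a single column of the diabetes data. The column name "Outcome" is an assumption, since the manual does not list the column names of the CSV file.

```python
import pandas as pd

# Hypothetical stand-in for one column of the diabetes data; "Outcome" is an assumed name.
diab = pd.DataFrame({"Outcome": [0, 1, 0, 0, 1, 0]})

counts = diab["Outcome"].value_counts()                  # absolute frequency
shares = diab["Outcome"].value_counts(normalize=True)    # relative frequency
print(counts)
print(shares)
```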

Standard Deviation
Standard deviation is a measure that is used to quantify the amount of variation of a set of data
values from its mean.

print(diab.std())
Variance
Variance is another measure of dispersion. It is the square of the standard deviation and the
covariance of the random variable with itself.

print(diab.var())

Skewness

Another useful statistic is skewness, which is the measure of the symmetry, or lack of it, for a
real-valued random variable about its mean. The skewness value can be positive, negative, or
undefined.

print(diab.skew())

Kurtosis

Kurtosis is sometimes described as the peakedness of the data around the mean value. More precisely, "The kurtosis parameter is a measure of the combined weight of the tails relative to the rest of the distribution." In other words, it measures the tail heaviness of the given distribution.

print(diab.kurt())
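To make the definitions of variance, skewness and kurtosis concrete, the sketch below computes them from central moments using only the standard library. The eight sample values are hypothetical. The pandas documentation describes skew() and kurt() as unbiased estimators, with kurtosis in Fisher's (excess) form; the adjusted formulas at the end are the usual small-sample corrections that give such estimators, shown here as an illustration rather than as pandas' exact implementation.

```python
import math
import statistics as st

values = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical sample
n = len(values)
mean = st.mean(values)


def central_moment(k):
    """k-th central moment: the average of (x - mean) ** k."""
    return sum((x - mean) ** k for x in values) / n


m2, m3, m4 = central_moment(2), central_moment(3), central_moment(4)

sample_var = st.variance(values)     # divides by n - 1
sample_std = st.stdev(values)        # square root of the sample variance

skew_pop = m3 / m2 ** 1.5            # moment-based skewness g1
kurt_pop = m4 / m2 ** 2 - 3          # excess kurtosis g2 (0 for a normal distribution)

# Usual small-sample (bias-adjusted) forms of skewness and excess kurtosis.
skew_adj = math.sqrt(n * (n - 1)) / (n - 2) * skew_pop
kurt_adj = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * kurt_pop + 6)

print(mean, sample_var, sample_std)
print(skew_pop, kurt_pop)
print(skew_adj, kurt_adj)
```

A positive skewness, as in this sample, indicates a longer right tail; a negative excess kurtosis indicates lighter tails than a normal distribution.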

Result:

Thus the Frequency, Mean, Median, Mode, Variance, Standard Deviation, Skewness and Kurtosis were found successfully for the given UCI and Pima Indians diabetes data set.
Ex no: 6 Prime or not using R
Date:

Aim:

To write an R program to check whether the input number is prime or not.

Program:

# Program to check if the input number is prime or not

# take input from the user
num = as.integer(readline(prompt="Enter a number: "))
flag = 0
# prime numbers are greater than 1
if (num > 1) {
  # check for factors
  flag = 1
  for (i in 2:(num-1)) {
    if ((num %% i) == 0) {
      flag = 0
      break
    }
  }
}
# for num == 2 the loop runs over 2:1 and clears the flag, so restore it here
if (num == 2) flag = 1
if (flag == 1) {
  print(paste(num, "is a prime number"))
} else {
  print(paste(num, "is not a prime number"))
}
Output 1
Enter a number: 25
[1] "25 is not a prime number"
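For comparison with the Python exercises earlier in this manual, the same check can be sketched in Python. This version is not a line-by-line translation of the R program: it uses trial division only up to the integer square root of the number instead of looping to num-1, and it needs no special case for 2. The function name is_prime is illustrative.

```python
import math


def is_prime(num: int) -> bool:
    """Trial division: test divisors from 2 up to the integer square root."""
    if num < 2:
        return False
    for i in range(2, math.isqrt(num) + 1):
        if num % i == 0:
            return False
    return True


for n in (1, 2, 25, 29):
    print(n, "is a prime number" if is_prime(n) else "is not a prime number")
```

Stopping at the square root is enough because any factor larger than the square root must pair with one smaller than it.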
Result:

Thus the program to find whether the given number is prime or not is executed successfully.
