DV Lab Manual Modified
Aim
To download, install and explore the features of the NumPy, SciPy, Jupyter, statsmodels and
pandas packages.
Procedure
Step 1: Make sure the computer is connected to the internet so that the NumPy, SciPy, Jupyter,
statsmodels and pandas packages can be downloaded.
Step 2: Open the Run window, type cmd and press Enter to open the command prompt.
Step 3: Move to the Python scripts folder using the change directory command, cd.
Step 4: For example, type cd C:\Users\students\AppData\Local\programs\Python\Python36-32\scripts
Step 5: After changing to this folder, check the Python and pip versions. If a version does not
meet the requirements, update it to a newer version. Then install the given packages one by one.
Step 6: Install the NumPy package by entering pip install numpy
Step 7: Install the SciPy package by entering pip install scipy
Step 8: Install Jupyter by entering pip install jupyter
Step 9: Install statsmodels by entering pip install statsmodels
Step 10: Install the pandas package by entering pip install pandas
To check Python & pip version
python --version
pip --version
Output
To install NumPy
pip install numpy
To install SciPy
pip install scipy
To install Jupyter
pip install jupyter
To install statsmodels
pip install statsmodels
To install Pandas
pip install pandas
Output:
Result:
Thus the packages NumPy, SciPy, Jupyter, statsmodels and pandas have been successfully
downloaded and installed using pip.
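The installation can also be checked from inside Python. The following is a minimal sketch, assuming Python 3.8 or later (where importlib.metadata is part of the standard library); the helper name installed_version is hypothetical and not part of the original procedure.

```python
from importlib import metadata

# Distribution names as passed to pip in the steps above.
PACKAGES = ["numpy", "scipy", "jupyter", "statsmodels", "pandas"]


def installed_version(name):
    """Return the installed version string of a package, or None if it is absent.

    Hypothetical helper for illustration only.
    """
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


for name in PACKAGES:
    version = installed_version(name)
    print(name, version if version is not None else "not installed")
```

A package reported as "not installed" can be installed again with the matching pip install command.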
a. Output
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,12, 13, 14, 15, 16, 17, 18, 19])
b. Output
0
c. Output
First element: 2.1
Second element: 4.5
Last element: 5.5
array('d', [3.5, 4.2, 3.3])
array('d', [2.1, 4.5, 3.5, 4.2, 3.3, 5.5])
Ex no: 2 Working with Numpy arrays
Date:
Aim
To write programs that create NumPy arrays and perform operations on them.
Algorithm
step 1: Import the numpy package as np
step 2: Create 1D and 2D arrays using numpy functions
step 3: Find the value at a given index in an array using numpy functions
step 4: Use the insert() function to insert elements into an array
step 5: Delete elements from an array using the delete() function
step 6: Slice elements of an array using the slicing [:] operator
step 7: Merge two arrays using the concatenate() function
Program
a.Creating arrays in NumPy
import numpy as np
array=np.arange(20)
array
e. Output
1
2
3
5
10
f. Output
[2 3 4 5]
g. Output
[1 2 3 4 5 6]
[1 2 3 4 5 6]
[1 2 3 4 5 6]
g.Concatenation
import numpy as np
arr1=np.array([1, 2, 3])
arr2=np.array([4, 5, 6])
arr=np.concatenate((arr1,arr2))
print(arr)
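Steps 2 to 6 of the algorithm (creation, indexing, insertion, deletion and slicing) can be sketched as follows. The array values are chosen only for illustration and are an assumption; this is not the program from the original manual.

```python
import numpy as np

# Illustrative 1D and 2D arrays (values are assumptions, not from the manual).
arr = np.array([1, 2, 3, 4, 5])
grid = np.arange(6).reshape(2, 3)   # 2 rows, 3 columns

# Indexing: first and last element of the 1D array, one cell of the 2D array.
first = arr[0]
last = arr[-1]
cell = grid[1, 2]

# insert() and delete() return new arrays; the original is left unchanged.
inserted = np.insert(arr, 1, 10)    # put 10 before index 1
deleted = np.delete(arr, 0)         # drop the element at index 0

# Slicing with the [:] operator: elements at indices 1, 2 and 3.
sliced = arr[1:4]

print(first, last, cell)
print(inserted)
print(deleted)
print(sliced)
```

Because insert() and delete() do not modify the array in place, the result has to be assigned to a name if it is needed later.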
Result:
Thus the programs and operations on NumPy arrays in Python have been executed
successfully and verified.
a. Output
Calories Duration
0 420 50
1 380 40
2 390 45
b. Output
Calories Duration
Day1 420 50
Day2 380 40
Day3 390 45
c. Output
0
0 1
1 2
2 3
3 4
4 5
Ex no:3 Working with Pandas data frames
Date:
Aim
To write programs that create pandas data frames and perform operations on them.
Algorithm:
Step 1: Import the pandas package after installing it.
Step 2: Create a simple pandas dataframe.
Step 3: After creating the simple dataframe, add a list of names to give each row a name using
pandas.
Step 4: Create a dataframe from a list using pandas.
Step 5: Create a dataframe from a dictionary of ndarrays or lists.
Step 6: Check the addition and deletion of rows in the pandas dataframe.
print(df)
e. Output
a b
0 1 2
1 3 4
0 5 6
1 7 8
f. Output
a b
1 3 4
1 7 8
d.Create a DataFrame from a Dictionary of ndarrays / Lists
import pandas as pd
data = {'Name':['Tom', 'Jack', 'Steve', 'Ricky'],'Age':[28,34,29,42]}
df = pd.DataFrame(data)
print(df)
e.Addition of Rows
import pandas as pd
df = pd.DataFrame([[1, 2], [3, 4]], columns = ['a','b'])
df2 = pd.DataFrame([[5, 6], [7, 8]], columns = ['a','b'])
# DataFrame.append() was removed in pandas 2.0; pd.concat() gives the same result
df = pd.concat([df, df2])
print(df)
f.Deletion of Rows
import pandas as pd
df = pd.DataFrame([[1, 2], [3, 4]], columns = ['a','b'])
df2 = pd.DataFrame([[5, 6], [7, 8]], columns = ['a','b'])
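# DataFrame.append() was removed in pandas 2.0; pd.concat() gives the same result
df = pd.concat([df, df2])
# Drop rows with label 0 (both rows that carry the index label 0 are removed)
df = df.drop(0)
print(df)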
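The programs for outputs a and b are not reproduced above. A minimal sketch that is consistent with those outputs is given below; it is an assumed reconstruction (the variable names data, df and named are hypothetical), not the original program.

```python
import pandas as pd

# Column values taken from outputs a and b of this exercise.
data = {"Calories": [420, 380, 390], "Duration": [50, 40, 45]}

# a. Simple dataframe with the default integer index 0, 1, 2.
df = pd.DataFrame(data)
print(df)

# b. The same data with a list of names used as row labels.
named = pd.DataFrame(data, index=["Day1", "Day2", "Day3"])
print(named)

# A named row can then be looked up by its label.
print(named.loc["Day2"])
```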
Result:
Thus the programs and operations using pandas have been successfully executed and verified.
1. Output:
2. Output:
3. Output:
Ex no: 4 Reading data from text files, Excel and the web and exploring various commands for doing descriptive analytics on the Iris data set
Date:
Aim:
To read data from text files, Excel and the web, and to explore various commands for doing
descriptive analytics on the Iris data set.
Procedure:
Code:
import pandas as pd
# Raw string (r"...") keeps the backslashes in the Windows path literal
data=pd.read_csv(r"D:\FDS lab-python\iris_csv.csv")
print(data)
print(data.sample(2))
Code:
import pandas as pd
data=pd.read_csv(r"D:\FDS lab-python\iris_csv.csv")
print(data.columns)
4. Output:
5. Output:
6. Output:
4) Displaying the shape of the dataset.
The shape of the dataset is the total number of rows (entries) and the total number of
columns (features) of that particular dataset.
Code:
import pandas as pd
data=pd.read_csv(r"D:\FDS lab-python\iris_csv.csv")
print(data.shape)
Code:
import pandas as pd
data=pd.read_csv(r"D:\FDS lab-python\iris_csv.csv")
# Rows at positions 2, 3 and 4
sliced_data=data[2:5]
print(sliced_data)
Code:
import pandas as pd
data=pd.read_csv(r"D:\FDS lab-python\iris_csv.csv")
# Select two columns and show their first four rows
specific_data=data[["sepallength","petallength"]]
print(specific_data.head(4))
7. Output:
8. Output:
7) Displaying the specific rows using “iloc” and “loc” functions.
The “loc” function uses the index label of the row to display a particular row of the
dataset.
The “iloc” function uses the integer position of the row and returns the complete information
about that row.
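Code:
# Row at integer position 5
print(data.iloc[5])
# Rows selected with a condition inside loc. The original compared "sepallength" with a
# species name, which matches no rows; the column name "class" and the value "Iris-setosa"
# used here are assumptions about the CSV file.
print(data.loc[data["class"]=="Iris-setosa"])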
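The difference between label-based and position-based selection is easier to see when the row labels are not integers. The sketch below uses a small hand-made dataframe; the row labels, the values and the column name "class" are assumptions for illustration and do not come from the Iris CSV file.

```python
import pandas as pd

# Small illustrative dataframe with string row labels (hypothetical values).
df = pd.DataFrame(
    {
        "sepallength": [5.1, 4.9, 6.3],
        "class": ["Iris-setosa", "Iris-setosa", "Iris-virginica"],
    },
    index=["a", "b", "c"],
)

# loc selects by row label, iloc selects by integer position.
by_label = df.loc["b"]
by_position = df.iloc[1]

# loc also accepts a boolean condition and returns every matching row.
setosa_rows = df.loc[df["class"] == "Iris-setosa"]

print(by_label)
print(by_position)
print(setosa_rows)
```

Here df.loc["b"] and df.iloc[1] return the same row, because the label "b" sits at position 1.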
Code:
import pandas as pd
data=pd.read_csv(r"D:\FDS lab-python\iris_csv.csv")
# Number of times each distinct value occurs in the column
print(data["sepallength"].value_counts())
Code:
import pandas as pd
data=pd.read_csv(r"D:\FDS lab-python\iris_csv.csv")
sum_data=data["sepallength"].sum()
mean_data=data["sepallength"].mean()
9. Output:
10. Output:
11. Output:
median_data=data["sepallength"].median()
print("median:",median_data)
Code:
import pandas as pd
data=pd.read_csv(r"D:\FDS lab-python\iris_csv.csv")
min_data=data["sepallength"].min()
max_data=data["sepallength"].max()
print("maximum:",max_data)
Code:
import pandas as pd
data=pd.read_csv(r"D:\FDS lab-python\iris_csv.csv")
cols=data.columns
print(cols)
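The programs above read a CSV file from a local path. As a self-contained sketch of the same idea, the example below reads CSV text from memory and prints a descriptive summary with describe(). The three data rows and the column names are assumptions made for illustration, not the contents of iris_csv.csv.

```python
import io

import pandas as pd

# A few illustrative rows in CSV form (hypothetical values).
csv_text = (
    "sepallength,sepalwidth,class\n"
    "5.1,3.5,Iris-setosa\n"
    "4.9,3.0,Iris-setosa\n"
    "6.3,3.3,Iris-virginica\n"
)

# read_csv accepts a file path, a URL or any file-like object.
data = pd.read_csv(io.StringIO(csv_text))

# describe() reports count, mean, std, min, quartiles and max per numeric column.
summary = data.describe()
print(summary)

# An Excel sheet is read in the same way with pd.read_excel(path),
# which needs an Excel engine such as openpyxl to be installed.
```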
Result:
Thus reading data from text files, Excel and the web and exploring various commands for
doing descriptive analytics on the Iris data set has been successfully executed.
Output- Importing the Diabetes CSV data file:
To find the frequency, mean, median, mode, variance, standard deviation, skewness and kurtosis
for the given UCI and Pima Indians diabetes data set.
Procedure:
import numpy as np
import pandas as pd
import statistics as st
lab-python\pima-indians-diabetes.csv")
print(diab)
print(diab.info())
Output- Mean:
Output mean(axis=1)[1:5]:
Output- Median:
Output- Mode:
3. Univariate Analysis
Mean
Mean represents the arithmetic average of the data. The line of code below prints the mean of the
numerical variables in the data.
print(diab.mean())
print(diab.mean(axis=1)[1:5])
Median
In simple terms, median represents the 50th percentile, or the middle value of the data, that
separates the distribution into two halves.
print(diab.median())
Mode
Mode represents the most frequent value of a variable in the data. The mode() function returns
the most common value or most repeated value of a variable.
print(diab.mode())
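The three measures of central tendency, together with the frequency of each value, can be checked by hand on a very small series. The values below are assumptions chosen for illustration and are not taken from the diabetes data set.

```python
import pandas as pd

# Small illustrative series; the value 10 acts as an outlier.
values = pd.Series([1, 2, 2, 3, 10])

mean_value = values.mean()          # (1 + 2 + 2 + 3 + 10) / 5
median_value = values.median()      # middle value after sorting
mode_values = values.mode()         # a Series, since several modes are possible
frequency = values.value_counts()   # how often each distinct value occurs

print("mean:", mean_value)
print("median:", median_value)
print("mode:", mode_values.tolist())
print(frequency)
```

The outlier pulls the mean above the median, which is why both are usually reported together.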
Standard Deviation
Standard deviation is a measure that is used to quantify the amount of variation of a set of data
values from its mean.
print(diab.std())
Output -Standard Deviation:
Output-Variance:
Output- Skewness:
Output- Kurtosis
Variance
Variance is another measure of dispersion. It is the square of the standard deviation and the
covariance of the random variable with itself.
print(diab.var())
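The two statements about variance can be verified numerically. This is a minimal sketch on assumed values; note that pandas computes the sample variance (ddof=1) by default, and ddof=0 gives the population variance.

```python
import pandas as pd

# Illustrative values (assumption); their mean is 5.
values = pd.Series([2, 4, 4, 4, 5, 5, 7, 9])

sample_var = values.var()             # divides by n - 1 (pandas default)
sample_std = values.std()             # square root of the sample variance
population_var = values.var(ddof=0)   # divides by n
self_cov = values.cov(values)         # covariance of the series with itself

print("sample variance:", sample_var)
print("std squared:", sample_std ** 2)
print("population variance:", population_var)
print("covariance with itself:", self_cov)
```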
Skewness
Another useful statistic is skewness, which is the measure of the symmetry, or lack of it, for a
real-valued random variable about its mean. The skewness value can be positive, negative, or
undefined.
print(diab.skew())
Kurtosis
Kurtosis is often described as the peakedness of the data at the mean value. “The kurtosis
parameter is a measure of the combined weight of the tails relative to the rest of the
distribution.” This means we measure the tail heaviness of the given distribution.
print(diab.kurt())
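The sign of skewness and kurtosis is easier to interpret on small hand-made series. The sketch below uses assumed values; pandas reports bias-corrected sample skewness and excess kurtosis, so a normal distribution would score about 0 on both.

```python
import pandas as pd

# Illustrative series (assumptions, not from the diabetes data set).
symmetric = pd.Series([1, 2, 3, 4, 5])       # evenly spread around the mean
right_skewed = pd.Series([1, 1, 1, 2, 10])   # long tail towards large values
heavy_tail = pd.Series([0] * 9 + [100])      # one extreme value in the tail

print("skew, symmetric:", symmetric.skew())
print("skew, right-skewed:", right_skewed.skew())
print("kurtosis, flat:", symmetric.kurt())
print("kurtosis, heavy tail:", heavy_tail.kurt())
```

A positive skewness indicates a longer right tail, and a positive excess kurtosis indicates heavier tails than a normal distribution.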
Result:
Thus the frequency, mean, median, mode, variance, standard deviation, skewness and kurtosis
for the given UCI and Pima Indians diabetes data set have been found successfully.
Sample Output- Linear Regression:
Ex no: 6 Prime or not using R
Date:
Aim:
Program:
Thus the program to find whether the given number is prime or not is executed successfully.