Data Science Practical

The document provides a series of Python programs that demonstrate the use of various libraries such as NumPy, SciPy, Jupyter, Statsmodels, and Pandas for data manipulation and analysis. Key functionalities explored include array creation, statistical operations, data visualization, and correlation analysis. Each section includes code snippets and their corresponding outputs, showcasing practical applications of these libraries.

Uploaded by

borleajay45

Q1.

Download, install and explore the features of the NumPy, SciPy, Jupyter, Statsmodels and Pandas packages.
# Importing necessary libraries

import numpy as np

import pandas as pd

import scipy.stats as stats

import statsmodels.api as sm

import matplotlib.pyplot as plt

# Function to explore NumPy features

def explore_numpy():
    print("NumPy Operations:")
    arr = np.array([1, 2, 3, 4, 5])
    print("NumPy Array:", arr)
    print("Mean of the array:", np.mean(arr))
    print("Sum of the array:", np.sum(arr))
    print("Standard Deviation of the array:", np.std(arr))
    print("\n")

# Function to explore SciPy features

def explore_scipy():
    print("SciPy Normal Distribution:")
    mu, sigma = 0, 0.1  # mean and standard deviation
    # Draw 1000 samples with scipy.stats (imported above as `stats`)
    s = stats.norm.rvs(loc=mu, scale=sigma, size=1000)
    # Plotting the histogram
    plt.figure(figsize=(10, 6))
    plt.hist(s, bins=30, density=True, alpha=0.6, color='g')
    plt.title('Histogram of Normally Distributed Data')
    plt.xlabel('Value')
    plt.ylabel('Density')
    plt.grid()
    plt.show()
    print("\n")

# Function to explore Statsmodels features

def explore_statsmodels():
    print("Statsmodels Linear Regression:")
    # Sample data: y is roughly 2 * x plus Gaussian noise
    x = np.arange(100)
    y = 2 * x + np.random.normal(0, 10, size=x.size)
    # add_constant adds the intercept column expected by OLS
    model = sm.OLS(y, sm.add_constant(x)).fit()
    print(model.summary())
    print("\n")

# Function to explore Pandas features

def explore_pandas():
    print("Pandas DataFrame Operations:")
    data = {'A': [1, 2, 3], 'B': [4, 5, 6]}
    df = pd.DataFrame(data)
    print("DataFrame:\n", df)
    print("Mean of column A:", df['A'].mean())
    print("Standard Deviation of column B:", df['B'].std())
    print("\n")

# Execute the exploration functions

explore_numpy()

explore_scipy()

explore_statsmodels()

explore_pandas()

Output -

NumPy Operations:

NumPy Array: [1 2 3 4 5]

Mean of the array: 3.0

Sum of the array: 15

Standard Deviation of the array: 1.4142135623730951
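
The packages in Q1 are normally installed with pip, for example `python -m pip install numpy scipy pandas statsmodels matplotlib jupyter`. As a minimal sketch, the snippet below checks which of them are present and reports their versions. The package list and the helper name `installed_versions` are assumptions made for illustration; they are not part of the original program.

```python
from importlib import metadata

# Distribution names as published on PyPI (assumed to match how they were installed).
PACKAGES = ["numpy", "scipy", "pandas", "statsmodels", "matplotlib", "jupyter"]

def installed_versions(names):
    """Return {package: version string, or None if the package is not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

versions = installed_versions(PACKAGES)
for name, version in versions.items():
    print(f"{name}: {version if version else 'not installed'}")
```

A package reported as "not installed" has to be installed before the exploration functions above will run.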


Q2. Write a program for working with NumPy arrays.

# Importing the NumPy library

import numpy as np

# 1. Creating NumPy Arrays

print("1. Creating NumPy Arrays:")

# Creating a 1D array

array_1d = np.array([1, 2, 3, 4, 5])

print("1D Array:", array_1d)

# Creating a 2D array

array_2d = np.array([[1, 2, 3], [4, 5, 6]])

print("2D Array:\n", array_2d)

# Creating an array of zeros

zeros_array = np.zeros((2, 3))

print("Array of Zeros:\n", zeros_array)

# Creating an array of ones

ones_array = np.ones((3, 2))

print("Array of Ones:\n", ones_array)

# Creating an array with a range of values

range_array = np.arange(10, 20, 2)

print("Array with a Range of Values:", range_array)

# Creating an array with evenly spaced values

linspace_array = np.linspace(0, 1, 5)

print("Evenly Spaced Array:", linspace_array)

print("\n")

# 2. Array Operations

print("2. Array Operations:")

# Basic arithmetic operations

array_a = np.array([10, 20, 30])

array_b = np.array([1, 2, 3])

print("Array A:", array_a)

print("Array B:", array_b)


# Addition

print("Addition:", array_a + array_b)

# Subtraction

print("Subtraction:", array_a - array_b)

# Multiplication

print("Multiplication:", array_a * array_b)

# Division

print("Division:", array_a / array_b)

# Element-wise square

print("Element-wise Square:", np.square(array_a))

print("\n")

# 3. Indexing and Slicing

print("3. Indexing and Slicing:")

# Accessing elements

print("First element of 1D array:", array_1d[0])

print("Element at (1, 2) in 2D array:", array_2d[1, 2])

# Slicing

print("Slicing 1D array (elements 1 to 3):", array_1d[1:4])

print("Slicing 2D array (first row):", array_2d[0, :])

print("Slicing 2D array (second column):", array_2d[:, 1])

print("\n")

# 4. Reshaping Arrays

print("4. Reshaping Arrays:")

reshaped_array = np.arange(12).reshape(3, 4)

print("Reshaped Array (3x4):\n", reshaped_array)

# Flattening an array

flattened_array = reshaped_array.flatten()

print("Flattened Array:", flattened_array)

print("\n")

# 5. Statistical Operations

print("5. Statistical Operations:")


print("Mean of 1D array:", np.mean(array_1d))

print("Sum of elements in 2D array:", np.sum(array_2d))

print("Standard Deviation of 1D array:", np.std(array_1d))

print("Maximum value in 2D array:", np.max(array_2d))

print("Minimum value in 2D array:", np.min(array_2d))

print("\n")

# 6. Boolean Indexing

print("6. Boolean Indexing:")

bool_index = array_1d > 2

print("Boolean Index (elements > 2):", bool_index)

print("Elements greater than 2:", array_1d[bool_index])

Output –

1. Creating NumPy Arrays:

1D Array: [1 2 3 4 5]

2D Array:

[[1 2 3]

[4 5 6]]

Array of Zeros:

[[0. 0. 0.]

[0. 0. 0.]]

Array of Ones:

[[1. 1.]

[1. 1.]

[1. 1.]]

Array with a Range of Values: [10 12 14 16 18]

Evenly Spaced Array: [0. 0.25 0.5 0.75 1. ]

2. Array Operations:

Array A: [10 20 30]

Array B: [1 2 3]

Addition: [11 22 33]


Subtraction: [ 9 18 27]

Multiplication: [10 40 90]

Division: [10. 10. 10.]

Element-wise Square: [100 400 900]

3. Indexing and Slicing:

First element of 1D array: 1

Element at (1, 2) in 2D array: 6

Slicing 1D array (elements 1 to 3): [2 3 4]

Slicing 2D array (first row): [1 2 3]

Slicing 2D array (second column): [2 5]

4. Reshaping Arrays:

Reshaped Array (3x4):

 [[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]

Flattened Array: [ 0  1  2  3  4  5  6  7  8  9 10 11]

5. Statistical Operations:

Mean of 1D array: 3.0

Sum of elements in 2D array: 21

Standard Deviation of 1D array: 1.4142135623730951

Maximum value in 2D array: 6

Minimum value in 2D array: 1

6. Boolean Indexing:

Boolean Index (elements > 2): [False False True True True]

Elements greater than 2: [3 4 5]


Q3. Program to perform array slicing.

# Importing the NumPy library

import numpy as np

# Creating a 1D NumPy array

array_1d = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90, 100])

print("1D Array:", array_1d)

# Slicing the 1D array

print("\nSlicing 1D Array:")

print("Elements from index 2 to 5:", array_1d[2:6]) # Slices index 2 through 5 (the stop index 6 is excluded)

print("Elements from the start to index 4:", array_1d[:5]) # Slices from start to index 4

print("Elements from index 5 to the end:", array_1d[5:]) # Slices from index 5 to the end

print("Last three elements:", array_1d[-3:]) # Slices the last three elements

print("Every second element:", array_1d[::2]) # Slices every second element

print("\n")

# Creating a 2D NumPy array

array_2d = np.array([[1, 2, 3, 4],

[5, 6, 7, 8],

[9, 10, 11, 12],

[13, 14, 15, 16]])

print("2D Array:\n", array_2d)

# Slicing the 2D array

print("\nSlicing 2D Array:")

print("First row:", array_2d[0, :]) # Slices the first row

print("Second column:", array_2d[:, 1]) # Slices the second column

print("Elements from row 1 to 2 and column 1 to 3:\n", array_2d[1:3, 1:3]) # Slices rows 1-2 and columns 1-2 (the stop index 3 is excluded)

print("Last two rows:\n", array_2d[-2:, :]) # Slices the last two rows

print("Every second row and every second column:\n", array_2d[::2, ::2]) # Slices every second row and column
Output –

1D Array: [ 10 20 30 40 50 60 70 80 90 100]

Slicing 1D Array:

Elements from index 2 to 5: [30 40 50 60]

Elements from the start to index 4: [10 20 30 40 50]

Elements from index 5 to the end: [ 60 70 80 90 100]

Last three elements: [ 80 90 100]

Every second element: [10 30 50 70 90]

2D Array:

[[ 1 2 3 4]

[ 5 6 7 8]

[ 9 10 11 12]

[13 14 15 16]]

Slicing 2D Array:

First row: [1 2 3 4]

Second column: [ 2 6 10 14]

Elements from row 1 to 2 and column 1 to 3:

[[ 6 7]

[10 11]]

Last two rows:

[[ 9 10 11 12]

[13 14 15 16]]

Every second row and every second column:

[[ 1 3]

[ 9 11]]
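
One detail the slicing programs above do not show is that a basic NumPy slice is a view onto the original data, not a copy. The sketch below illustrates this with a shorter array of its own (the variable names `view` and `copy` are chosen for illustration).

```python
import numpy as np

array_1d = np.array([10, 20, 30, 40, 50])

view = array_1d[1:4]          # basic slice: shares memory with array_1d
copy = array_1d[1:4].copy()   # explicit copy: independent data

view[0] = 99                  # also changes array_1d[1]
copy[1] = -1                  # leaves array_1d untouched

print(array_1d)                           # [10 99 30 40 50]
print(np.shares_memory(array_1d, view))   # True
print(np.shares_memory(array_1d, copy))   # False
```

Calling `.copy()` is the usual choice when a slice will be modified and the original array must stay unchanged.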
Q4. Program for pandas DataFrames.

import pandas as pd

# Creating a DataFrame from a dictionary

data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago']
}

df = pd.DataFrame(data)

print(df)

Output –

      Name  Age         City
0    Alice   25     New York
1      Bob   30  Los Angeles
2  Charlie   35      Chicago
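
Beyond creating and printing a DataFrame, a few common operations can be sketched on the same data: selecting a column, filtering rows, and adding a derived column. The column name `AgeNextYear` is a hypothetical name introduced only for this example.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago'],
})

ages = df['Age']                    # select one column (a Series)
over_28 = df[df['Age'] > 28]        # filter rows with a boolean mask
df['AgeNextYear'] = df['Age'] + 1   # add a derived column (hypothetical name)

print(ages.mean())                  # 30.0
print(over_28['Name'].tolist())     # ['Bob', 'Charlie']
print(df.shape)                     # (3, 4)
```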
Q5. Program to compute weighted averages in Python, either by defining your own function or by using NumPy.

import numpy as np

def weighted_average_numpy(values, weights):
    values = np.array(values)
    weights = np.array(weights)
    # Each value needs exactly one weight
    if len(values) != len(weights):
        raise ValueError("The length of values and weights must be the same.")
    # A zero total weight would cause a division by zero
    if np.sum(weights) == 0:
        raise ValueError("The sum of weights must not be zero.")
    return np.average(values, weights=weights)

# Example usage

values = [80, 90, 70]

weights = [0.2, 0.5, 0.3]

result = weighted_average_numpy(values, weights)

print(f"Weighted Average (NumPy): {result}")

Output –

Weighted Average (NumPy): 82.0
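
The question also allows a hand-written function. A minimal sketch without NumPy is shown below; it computes sum(w_i * x_i) / sum(w_i) directly. The function name `weighted_average` and the integer weights 2, 5, 3 (the same proportions as 0.2, 0.5, 0.3) are assumptions made for this example.

```python
def weighted_average(values, weights):
    """Weighted mean: sum(w_i * x_i) / sum(w_i), without NumPy."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length.")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("The sum of weights must not be zero.")
    return sum(v * w for v, w in zip(values, weights)) / total_weight

# Weights need not sum to 1; only their proportions matter.
result = weighted_average([80, 90, 70], [2, 5, 3])
print(f"Weighted Average (manual): {result}")   # 82.0
```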


Q6. Program to calculate variance.

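def calculate_variance(data):
    # An empty dataset has no spread; return 0 instead of dividing by zero
    if len(data) == 0:
        return 0
    mean = sum(data) / len(data)
    # Population variance: average squared deviation from the mean
    variance = sum((x - mean) ** 2 for x in data) / len(data)
    return variance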

# Example usage

data = [10, 12, 23, 23, 16, 23, 21, 16]

variance = calculate_variance(data)

print(f"The variance of the dataset is: {variance}")

Output –

The variance of the dataset is: 24.0
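
The function above divides by n, so it returns the population variance. As a sketch of the distinction, the snippet below compares it with the sample variance (dividing by n - 1) using NumPy's `ddof` argument and the standard-library `statistics` module.

```python
import statistics
import numpy as np

data = [10, 12, 23, 23, 16, 23, 21, 16]

population_var = np.var(data)        # divides by n (ddof=0, the default)
sample_var = np.var(data, ddof=1)    # divides by n - 1

print(f"Population variance: {population_var:.4f}")   # 24.0000
print(f"Sample variance: {sample_var:.4f}")           # 27.4286

# The statistics module offers the same two definitions.
stdlib_population = statistics.pvariance(data)
stdlib_sample = statistics.variance(data)
```

The sample variance is the usual choice when the data are a sample drawn from a larger population.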


Q7. Program to create normal curve.

import numpy as np

import matplotlib.pyplot as plt

def normal_curve(mean, std_dev, num_points=1000):
    # Generate x values spanning four standard deviations on each side of the mean
    x = np.linspace(mean - 4*std_dev, mean + 4*std_dev, num_points)
    # Calculate the normal distribution (PDF)
    y = (1 / (std_dev * np.sqrt(2 * np.pi))) * np.exp(-0.5 * ((x - mean) / std_dev) ** 2)
    return x, y

# Parameters for the normal distribution

mean = 0

std_dev = 1

# Generate the normal curve data

x, y = normal_curve(mean, std_dev)

# Plotting the normal curve

plt.figure(figsize=(10, 6))

plt.plot(x, y, color='blue', label='Normal Distribution Curve')

plt.title('Normal Distribution Curve')

plt.xlabel('X-axis')

plt.ylabel('Probability Density')

plt.axvline(mean, color='red', linestyle='--', label='Mean')

plt.legend()

plt.grid()

plt.show()
Output –

[Figure: bell-shaped normal distribution curve for mean 0 and standard deviation 1, plotted over x from -4 to 4 (x-axis "X-axis", y-axis "Probability Density"), with a dashed red vertical line at the mean.]
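
The density formula used in `normal_curve` can be cross-checked against `scipy.stats.norm.pdf`. This is a sketch assuming SciPy is installed; it compares values only and does not plot.

```python
import numpy as np
from scipy.stats import norm

mean, std_dev = 0, 1
x = np.linspace(mean - 4 * std_dev, mean + 4 * std_dev, 1000)

# Hand-written density, same formula as in normal_curve
manual_pdf = (1 / (std_dev * np.sqrt(2 * np.pi))) * np.exp(-0.5 * ((x - mean) / std_dev) ** 2)
# Library density for the same mean (loc) and standard deviation (scale)
scipy_pdf = norm.pdf(x, loc=mean, scale=std_dev)

print(np.allclose(manual_pdf, scipy_pdf))   # True

# Probability mass covered by the plotted range [-4, 4]
area = norm.cdf(4) - norm.cdf(-4)
print(f"Area within 4 standard deviations: {area:.5f}")
```

The area is just under 1, which is why plotting four standard deviations on each side shows essentially the whole curve.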
Q8. Program for correlation with scatter plot.

import numpy as np

import matplotlib.pyplot as plt

from scipy.stats import pearsonr

# Generate sample data

np.random.seed(0) # For reproducibility

x = np.random.rand(100) * 10 # 100 random points scaled to 0-10

y = 2.5 * x + np.random.normal(0, 2, 100) # Linear relationship with some noise

# Calculate the Pearson correlation coefficient

correlation_coefficient, _ = pearsonr(x, y)

# Create a scatter plot

plt.figure(figsize=(10, 6))

plt.scatter(x, y, color='blue', alpha=0.6, edgecolors='w', s=100)

plt.title('Scatter Plot with Correlation')

plt.xlabel('X-axis')

plt.ylabel('Y-axis')

plt.axhline(0, color='black', lw=0.8, ls='--')

plt.axvline(0, color='black', lw=0.8, ls='--')

# Display the correlation coefficient on the plot

plt.text(1, 20, f'Correlation Coefficient: {correlation_coefficient:.2f}', fontsize=12, color='red')

plt.grid()

plt.show()
Output –

[Figure: scatter plot titled "Scatter Plot with Correlation" (x-axis "X-axis", y-axis "Y-axis"), with the Pearson correlation coefficient printed on the plot in red.]
Q9. Program to compute the correlation coefficient.

import numpy as np

from scipy.stats import pearsonr

# Sample data

data_x = np.array([10, 20, 30, 40, 50])

data_y = np.array([12, 24, 36, 48, 60])

# Calculate the Pearson correlation coefficient

correlation_coefficient, p_value = pearsonr(data_x, data_y)

# Output the correlation coefficient

print(f"Data X: {data_x}")

print(f"Data Y: {data_y}")

print(f"Correlation Coefficient: {correlation_coefficient:.2f}")

print(f"P-value: {p_value:.4f}")

Output –

Data X: [10 20 30 40 50]

Data Y: [12 24 36 48 60]

Correlation Coefficient: 1.00

P-value: 0.0000
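
To show what `pearsonr` computes, here is a minimal sketch of the Pearson coefficient from its definition: the sum of products of deviations divided by the square root of the product of the summed squared deviations. The helper name `pearson_r` is an assumption for this example; `np.corrcoef` is used as a cross-check.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient computed from its definition."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()   # deviations from the mean of x
    dy = y - y.mean()   # deviations from the mean of y
    return np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2))

data_x = [10, 20, 30, 40, 50]
data_y = [12, 24, 36, 48, 60]

r_manual = pearson_r(data_x, data_y)
r_numpy = np.corrcoef(data_x, data_y)[0, 1]

print(f"Manual: {r_manual:.2f}")        # 1.00
print(f"np.corrcoef: {r_numpy:.2f}")    # 1.00
```

Because `data_y` is exactly 1.2 times `data_x`, the relationship is perfectly linear and the coefficient is 1.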
Q10. Program for simple linear Regression.

import numpy as np

import matplotlib.pyplot as plt

from sklearn.linear_model import LinearRegression

# Sample data

# Independent variable (X)

X = np.array([[1], [2], [3], [4], [5]])

# Dependent variable (y)

y = np.array([2, 3, 5, 7, 11])

# Create a linear regression model

model = LinearRegression()

# Fit the model

model.fit(X, y)

# Make predictions

y_pred = model.predict(X)

# Output the coefficients

slope = model.coef_[0]

intercept = model.intercept_

print(f"Slope: {slope}")

print(f"Intercept: {intercept}")

# Plotting the results

plt.scatter(X, y, color='blue', label='Data points')

plt.plot(X, y_pred, color='red', label='Regression line')

plt.title('Simple Linear Regression')

plt.xlabel('Independent Variable (X)')

plt.ylabel('Dependent Variable (y)')

plt.legend()

plt.show()
Output –

Slope: 2.2

Intercept: -1.0

Values are shown to one decimal place; the printed result may carry extra digits from floating-point rounding.
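
The fitted line can be verified by hand with the closed-form least-squares formulas: slope = sum((x - x_mean) * (y - y_mean)) / sum((x - x_mean)^2) and intercept = y_mean - slope * x_mean. For this data the formulas give a slope of 2.2 and an intercept of -1.0. The sketch below uses NumPy only, with `np.polyfit` as a cross-check.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([2, 3, 5, 7, 11], dtype=float)

x_mean, y_mean = x.mean(), y.mean()

# Closed-form ordinary least squares for a single predictor
slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
intercept = y_mean - slope * x_mean

print(f"Slope: {slope:.4f}")           # 2.2000
print(f"Intercept: {intercept:.4f}")   # -1.0000

# Cross-check: a degree-1 polynomial fit returns [slope, intercept]
poly_slope, poly_intercept = np.polyfit(x, y, 1)
```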
Q11. Create a NumPy ndarray object by using the array() function.

import numpy as np

# Creating a NumPy array from a list

list_data = [1, 2, 3, 4, 5]

array_from_list = np.array(list_data)

# Creating a NumPy array from a tuple

tuple_data = (6, 7, 8, 9, 10)

array_from_tuple = np.array(tuple_data)

# Output the arrays

print("Array from list:", array_from_list)

print("Array from tuple:", array_from_tuple)

Output –

Array from list: [1 2 3 4 5]

Array from tuple: [ 6 7 8 9 10]


Q12. Use a tuple to create a NumPy array.

import numpy as np

# Step 1: Create a tuple

data_tuple = (10, 20, 30, 40, 50)

# Step 2: Convert the tuple to a NumPy array

array_from_tuple = np.array(data_tuple)

# Step 3: Display the result

print("NumPy Array from Tuple:", array_from_tuple)

Output –

NumPy Array from Tuple: [10 20 30 40 50]


Q13. Create a 2-D array containing two arrays with the values 1, 2, 3 and 4, 5, 6.

import numpy as np

# Step 1: Create a nested tuple (or you can use a list of lists)

data_2d = ((1, 2, 3), (4, 5, 6))

# Step 2: Convert the nested tuple to a 2-D NumPy array

array_2d = np.array(data_2d)

# Step 3: Display the result

print("2-D NumPy Array:")

print(array_2d)

Output –

2-D NumPy Array:

[[1 2 3]

[4 5 6]]
Q14. Display arrays of different dimensions built from the values 0 to 3.

import numpy as np

# Step 1: Create a 1-D array with values from 0 to 3

array_1d = np.arange(4) # This will create an array with values [0, 1, 2, 3]

# Step 2: Reshape the array to different dimensions

array_2d = array_1d.reshape(2, 2) # Reshape to 2x2

array_3d = array_1d.reshape(2, 2, 1) # Reshape to 2x2x1

# Step 3: Display the results

print("1-D Array:")

print(array_1d)

print("\n2-D Array (2x2):")

print(array_2d)

print("\n3-D Array (2x2x1):")

print(array_3d)

Output –

1-D Array:

[0 1 2 3]

2-D Array (2x2):

[[0 1]

[2 3]]

3-D Array (2x2x1):

[[[0]
  [1]]

 [[2]
  [3]]]
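
Another possible reading of this question is "arrays with 0 to 3 dimensions". Under that assumption, a short sketch that builds a 0-D, 1-D, 2-D and 3-D array and reports `ndim` and `shape` for each could look like this (the variable names are illustrative).

```python
import numpy as np

arr_0d = np.array(42)                           # 0-D: a single scalar value
arr_1d = np.array([0, 1, 2, 3])                 # 1-D
arr_2d = np.array([[0, 1], [2, 3]])             # 2-D
arr_3d = np.array([[[0], [1]], [[2], [3]]])     # 3-D

for arr in (arr_0d, arr_1d, arr_2d, arr_3d):
    print(f"ndim={arr.ndim}, shape={arr.shape}")
```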
Q15. Program to access array elements by index and add them.

import numpy as np

# Step 1: Create a NumPy array

array = np.array([10, 20, 30, 40, 50])

# Step 2: Access elements by indexing

first_element = array[0] # Accessing the first element (10)

second_element = array[1] # Accessing the second element (20)

# Step 3: Add the accessed elements

sum_of_elements = first_element + second_element

# Step 4: Display the results

print("First Element:", first_element)

print("Second Element:", second_element)

print("Sum of First and Second Elements:", sum_of_elements)

Output –

First Element: 10

Second Element: 20

Sum of First and Second Elements: 30


Q16. Program to slice elements from index 1 to 5.

import numpy as np

# Step 1: Create a NumPy array

array = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90, 100])

# Step 2: Slice elements from index 1 to 5

sliced_array = array[1:6] # Slicing from index 1 to 5 (the stop index 6 is excluded)

# Step 3: Display the results

print("Original Array:")

print(array)

print("\nSliced Array (from index 1 to 5):")

print(sliced_array)

Output –

Original Array:

[ 10 20 30 40 50 60 70 80 90 100]

Sliced Array (from index 1 to 5):

[20 30 40 50 60]
Q17. Print the shape of an array.

import numpy as np

# Step 1: Create a NumPy array

array_1d = np.array([10, 20, 30, 40, 50]) # 1-D array

array_2d = np.array([[1, 2, 3], [4, 5, 6]]) # 2-D array

array_3d = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]]) # 3-D array

# Step 2: Print the shape of the arrays

print("Shape of 1-D Array:", array_1d.shape)

print("Shape of 2-D Array:", array_2d.shape)

print("Shape of 3-D Array:", array_3d.shape)

Output –

Shape of 1-D Array: (5,)

Shape of 2-D Array: (2, 3)

Shape of 3-D Array: (2, 2, 2)


Q18. Iterate over the elements of a 1-D array.

import numpy as np

# Step 1: Create a 1-D NumPy array

array_1d = np.array([10, 20, 30, 40, 50])

# Step 2: Iterate over the elements of the array

print("Elements of the 1-D array:")

for index, value in enumerate(array_1d):
    print(f"Index {index}: Value {value}")

Output –

Elements of the 1-D array:

Index 0: Value 10

Index 1: Value 20

Index 2: Value 30

Index 3: Value 40

Index 4: Value 50
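
For arrays with more than one dimension, `enumerate` yields whole rows, not single elements. As a sketch, `np.ndenumerate` gives each element together with its full index tuple; the 2-D array here is an assumed example, not taken from the program above.

```python
import numpy as np

array_2d = np.array([[1, 2, 3], [4, 5, 6]])

# ndenumerate yields ((row, column), value) pairs in row-major order
pairs = [(index, int(value)) for index, value in np.ndenumerate(array_2d)]

for index, value in pairs:
    print(f"Index {index}: Value {value}")
```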
Q19. Program to split the array into three parts.

import numpy as np

# Step 1: Create a NumPy array

array = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# Step 2: Split the array into three parts

split_arrays = np.array_split(array, 3)

# Step 3: Display the results

print("Original Array:")

print(array)

print("\nSplit Arrays:")

for i, part in enumerate(split_arrays):
    print(f"Part {i + 1}: {part}")

Output –

Original Array:

[ 1 2 3 4 5 6 7 8 9 10]

Split Arrays:

Part 1: [1 2 3 4]

Part 2: [5 6 7]

Part 3: [ 8 9 10]
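
The program uses `np.array_split` because 10 elements cannot be divided evenly into three parts. The sketch below contrasts it with `np.split`, which raises a `ValueError` when an equal division is not possible.

```python
import numpy as np

array = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# array_split tolerates uneven division; earlier parts get the extra elements
parts = np.array_split(array, 3)
sizes = [len(p) for p in parts]
print(sizes)   # [4, 3, 3]

# split requires an equal division and raises ValueError otherwise
try:
    np.split(array, 3)
    equal_split_ok = True
except ValueError:
    equal_split_ok = False
print(equal_split_ok)   # False

# An even division works with either function
halves = np.split(array, 2)
```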
Q20. Program to find indexes where the value is even.

import numpy as np

# Step 1: Create a NumPy array

array = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# Step 2: Find indexes where the value is even

even_indexes = np.where(array % 2 == 0)[0]

# Step 3: Display the results

print("Original Array:")

print(array)

print("\nIndexes of even values:")

print(even_indexes)

Output –

Original Array:

[ 1 2 3 4 5 6 7 8 9 10]

Indexes of even values:

[1 3 5 7 9]
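
The same condition can be used as a boolean mask to fetch the even values themselves, and `np.flatnonzero` is an equivalent way to get the indexes of a 1-D mask. A short sketch (the names `mask` and `even_values` are illustrative):

```python
import numpy as np

array = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

mask = array % 2 == 0                 # True where the value is even
even_indexes = np.flatnonzero(mask)   # same result as np.where(mask)[0]
even_values = array[mask]             # the even values themselves

print(even_indexes)   # [1 3 5 7 9]
print(even_values)    # [ 2  4  6  8 10]
```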
