
Uploaded by

Zishan Khan
Copyright
© © All Rights Reserved

GURU TEGH BAHADUR INSTITUTE OF TECHNOLOGY

Machine Learning & Data Analytics Framework

Practical File

SUBMITTED TO: - Mr. Pradeep Gulati

NAME: - Kunal Kumar Tanwar

BRANCH: - IT-2 (ML&DA)

ENROLLMENT NO: - 09713203121


INDEX

SERIAL NO.   DATE   TITLE                                                          TEACHER'S SIGN

1                   Calculator in R (without using objects)
2                   Calculator in R using Mathematical functions.
3                   R Script: Calculator Application with R Objects.
4                   Write an R script to find basic descriptive statistics using summary, structure, and quartile functions on mtcars and cars datasets.
5                   Write an R script to find subsets of datasets using subset() and aggregate() functions on the IRIS dataset.
6                   Write a program in R for reading different types of datasets (e.g., .csv) from web and disk and writing them to a file in a specific disk location.
7                   Program in R for reading Excel datasheets.
8                   Program to read XML datasets in R.
9                   Write an R program to find data distribution using box plots and scatter plots.
10                  Program in R to detect outliers using box plots.

Experiment 1
Calculator in R (without using objects)
SOURCE CODE
# Without using objects (just performing operations directly)

# Addition

3+5

# Subtraction

10 - 2

# Multiplication

4*7

# Division

20 / 4

# Power (Exponentiation)

2^3
Output
Calculator in R (using objects)
SOURCE CODE
# Using objects (variables) for operands and results

# Step 1: Assign values to variables (objects)

num1 <- 10

num2 <- 5

# Step 2: Perform operations using these variables

# Addition

result_add <- num1 + num2

print(paste("Addition:", result_add))

# Subtraction

result_sub <- num1 - num2

print(paste("Subtraction:", result_sub))

# Multiplication

result_mult <- num1 * num2

print(paste("Multiplication:", result_mult))

# Division

result_div <- num1 / num2

print(paste("Division:", result_div))
# Power (Exponentiation)

result_pow <- num1^2

print(paste("Exponentiation:", result_pow))

Output
Experiment 2
AIM: Calculator in R using Mathematical functions.
SOURCE CODE
# Function to perform basic operations

calculator <- function() {

print("Welcome to the R calculator!")

# Ask user for first number

num1 <- as.numeric(readline(prompt = "Enter the first number: "))

# Ask user for the operation

operation <- readline(prompt = "Enter operation (+, -, *, /, ^, sqrt, log, sin, cos, tan): ")

if (operation == "sqrt") {

# Square root function

result <- sqrt(num1)

print(paste("Square root of", num1, "is:", result))

} else if (operation == "log") {

# Logarithm (natural log by default)


result <- log(num1)

print(paste("Logarithm of", num1, "is:", result))

} else if (operation == "sin") {

# Sine function

result <- sin(num1)

print(paste("Sine of", num1, "is:", result))

} else if (operation == "cos") {

# Cosine function

result <- cos(num1)

print(paste("Cosine of", num1, "is:", result))

} else if (operation == "tan") {

# Tangent function

result <- tan(num1)

print(paste("Tangent of", num1, "is:", result))

} else if (operation == "^") {

# Exponentiation

num2 <- as.numeric(readline(prompt = "Enter the exponent: "))

result <- num1^num2

print(paste(num1, "raised to the power of", num2, "is:", result))

} else if (operation == "+") {

# Addition

num2 <- as.numeric(readline(prompt = "Enter the second number: "))

result <- num1 + num2

print(paste(num1, "+", num2, "=", result))

} else if (operation == "-") {


# Subtraction

num2 <- as.numeric(readline(prompt = "Enter the second number: "))

result <- num1 - num2

print(paste(num1, "-", num2, "=", result))

} else if (operation == "*") {

# Multiplication

num2 <- as.numeric(readline(prompt = "Enter the second number: "))

result <- num1 * num2

print(paste(num1, "*", num2, "=", result))

} else if (operation == "/") {

# Division

num2 <- as.numeric(readline(prompt = "Enter the second number: "))

if (num2 != 0) {

result <- num1 / num2

print(paste(num1, "/", num2, "=", result))

} else {

print("Error: Cannot divide by zero!")

}

} else {

print("Invalid operation!")

}

}

# Run the calculator function

calculator()
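
Because calculator() depends on readline(), it only works in an interactive session. A minimal sketch of the same operations as a plain function is given below; the name calculate() and its arguments are illustrative assumptions, not part of the original script.

```r
# Hypothetical helper: the same operations as calculator(), with the
# operation and operands passed as arguments instead of read via readline()
calculate <- function(operation, num1, num2 = NA) {
  switch(operation,
    "+"    = num1 + num2,
    "-"    = num1 - num2,
    "*"    = num1 * num2,
    "/"    = if (num2 != 0) num1 / num2 else stop("Cannot divide by zero!"),
    "^"    = num1^num2,
    "sqrt" = sqrt(num1),
    "log"  = log(num1),   # natural logarithm
    "sin"  = sin(num1),   # argument in radians
    "cos"  = cos(num1),
    "tan"  = tan(num1),
    stop("Invalid operation!")   # default branch for unknown operations
  )
}

print(calculate("+", 10, 5))
print(calculate("sqrt", 16))
```

switch() evaluates only the matching branch, so the single-operand operations never touch num2.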
Output
Experiment 3
AIM: R Script: Calculator Application with R Objects
SOURCE CODE
# Calculator Application using R Objects

# Define initial operands (these can be modified based on user input)

num1 <- 10 # First number

num2 <- 5 # Second number

# Define result variables to store operation outcomes

result_add <- NULL

result_sub <- NULL

result_mult <- NULL

result_div <- NULL

result_exp <- NULL

result_sqrt1 <- NULL

result_sqrt2 <- NULL

result_log1 <- NULL

result_sin1 <- NULL

result_cos1 <- NULL

# Perform basic arithmetic operations and store in objects

# Addition

result_add <- num1 + num2

print(paste(num1, "+", num2, "=", result_add))


# Subtraction

result_sub <- num1 - num2

print(paste(num1, "-", num2, "=", result_sub))

# Multiplication

result_mult <- num1 * num2

print(paste(num1, "*", num2, "=", result_mult))

# Division

if (num2 != 0) {

result_div <- num1 / num2

print(paste(num1, "/", num2, "=", result_div))

} else {

print("Error: Division by zero is not allowed!")

}

# Exponentiation (num1 raised to the power of num2)

result_exp <- num1^num2

print(paste(num1, "^", num2, "=", result_exp))

# Square root of num1

result_sqrt1 <- sqrt(num1)

print(paste("Square root of", num1, "is:", result_sqrt1))

# Square root of num2

result_sqrt2 <- sqrt(num2)

print(paste("Square root of", num2, "is:", result_sqrt2))

# Logarithm of num1 (natural log)

result_log1 <- log(num1)

print(paste("Natural log of", num1, "is:", result_log1))


# Sine of num1 (in radians)

result_sin1 <- sin(num1)

print(paste("Sine of", num1, "is:", result_sin1))

# Cosine of num1 (in radians)

result_cos1 <- cos(num1)

print(paste("Cosine of", num1, "is:", result_cos1))
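
As a possible alternative to one variable per result, the outcomes can be kept in a single named vector. This is only a sketch; the object name results and its element names are assumptions made for illustration.

```r
# Operands, as in the script above
num1 <- 10
num2 <- 5

# Hypothetical container: one named numeric vector holding every result
results <- c(
  add       = num1 + num2,
  subtract  = num1 - num2,
  multiply  = num1 * num2,
  divide    = num1 / num2,
  power     = num1^num2,
  sqrt_num1 = sqrt(num1)
)

# Print each result together with its name
for (op in names(results)) {
  cat(op, ":", results[[op]], "\n")
}
```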

Output
Experiment 4
AIM: Write an R script to find basic descriptive statistics using the
summary(), str(), and quantile() functions on the mtcars and cars datasets.

SOURCE CODE
# Load the datasets

data(mtcars)

data(cars)

# ---- Descriptive statistics for the mtcars dataset ----

cat("Descriptive Statistics for mtcars Dataset:\n")

# Summary of mtcars dataset

print(summary(mtcars))

# Structure of mtcars dataset (shows data types and sample data)

cat("\nStructure of mtcars dataset:\n")

str(mtcars)

# Quartiles for mtcars dataset (for each numerical variable)

cat("\nQuartiles for mtcars dataset:\n")

print(apply(mtcars, 2, quantile)) # Apply quantile function to each column

# ---- Descriptive statistics for the cars dataset ----

cat("\nDescriptive Statistics for cars Dataset:\n")

# Summary of cars dataset


print(summary(cars))

# Structure of cars dataset (shows data types and sample data)

cat("\nStructure of cars dataset:\n")

str(cars)

# Quartiles for cars dataset

cat("\nQuartiles for cars dataset:\n")

print(quantile(cars$speed)) # Quartiles for the 'speed' column

print(quantile(cars$dist)) # Quartiles for the 'dist' column.
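
By default quantile() returns the minimum, the three quartiles, and the maximum. The sketch below shows one way to request only the quartiles through the probs argument and to obtain the interquartile range with IQR(); the variable names are illustrative assumptions.

```r
data(cars)

# Only the first quartile, median, and third quartile
quartile_probs <- c(0.25, 0.50, 0.75)

speed_quartiles <- quantile(cars$speed, probs = quartile_probs)
print(speed_quartiles)

# Interquartile range (third quartile minus first quartile)
speed_iqr <- IQR(cars$speed)
print(speed_iqr)

# Same quartiles for every column of the data frame at once
cars_quartiles <- sapply(cars, quantile, probs = quartile_probs)
print(cars_quartiles)
```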


Output
Experiment 5
AIM: Write an R script to find subsets of datasets by using the
subset() and aggregate() functions on the IRIS dataset.

SOURCE CODE
# Load the iris dataset

data(iris)

# ---- Using subset() function ----

# Subset the data to include only the rows where species is 'setosa'

setosa_data <- subset(iris, Species == "setosa")

cat("Subset of data where Species is 'setosa':\n")

print(head(setosa_data))

# Subset the data to include only rows where Sepal.Length is greater than 5

sepal_length_subset <- subset(iris, Sepal.Length > 5)

cat("\nSubset of data where Sepal.Length > 5:\n")

print(head(sepal_length_subset))

# Subset the data to include only columns Sepal.Length, Sepal.Width, and Petal.Length

subset_columns <- subset(iris, select = c(Sepal.Length, Sepal.Width, Petal.Length))

cat("\nSubset with selected columns Sepal.Length, Sepal.Width, and Petal.Length:\n")

print(head(subset_columns))
# ---- Using aggregate() function ----

# Aggregate data by Species, calculating the mean of Sepal.Length, Sepal.Width, Petal.Length, and Petal.Width

aggregated_data <- aggregate(. ~ Species, data = iris, FUN = mean)

cat("\nAggregated data (mean of Sepal.Length, Sepal.Width, Petal.Length, Petal.Width by Species):\n")

print(aggregated_data)

# Aggregate the data to calculate the sum of Sepal.Length and Petal.Width by Species

aggregate_sum <- aggregate(cbind(Sepal.Length, Petal.Width) ~ Species, data = iris, FUN = sum)

cat("\nAggregated data (sum of Sepal.Length and Petal.Width by Species):\n")

print(aggregate_sum)

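# Aggregate the data to find the count of observations for each Species
# (one measurement column is counted per group, because Species ~ Species leaves no grouping variable)

aggregate_count <- aggregate(Sepal.Length ~ Species, data = iris, FUN = length)

names(aggregate_count)[2] <- "Count"

cat("\nAggregated data (count of observations for each Species):\n")

print(aggregate_count)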
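
The row filter and the column selection of subset() can also be combined in one call, and table() offers a quick cross-check of the per-species counts. The following is a minimal sketch; the object names are assumptions chosen for illustration.

```r
data(iris)

# Rows: setosa flowers with Sepal.Length above 5; columns: two selected variables
setosa_long <- subset(iris,
                      Species == "setosa" & Sepal.Length > 5,
                      select = c(Sepal.Length, Species))
print(head(setosa_long))

# Count of observations per species, as a cross-check for the aggregate() result
species_counts <- table(iris$Species)
print(species_counts)
```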
Output
Experiment 6
AIM: Write a program in R for reading different
types of datasets (e.g., .csv) from web and disk and writing
them to a file in a specific disk location.

SOURCE CODE
R Script for Reading and Writing Datasets

1. Reading Datasets from the Web

We can use read.csv(), read.table(), or read.delim() for reading datasets from the web.
For this example, we'll use read.csv() to load a .csv file from the web.

2. Reading Datasets from Disk

For reading data from the local disk, we can use read.csv(), read.table(), or readRDS() (for
R-specific files). Here, we'll demonstrate reading .csv and .txt files.

3. Writing Data to Disk

For writing data, we can use write.csv(), write.table(), or saveRDS() depending on the
format we want to save.

R Script

# ---- 1. Read Dataset from the Web (CSV File) ----

# Example: Reading a dataset from a URL

url <- "https://people.sc.fsu.edu/~jburkardt/data/csv/hw_200.csv" # Example CSV URL

cat("Reading CSV file from the web:\n")

web_data <- read.csv(url)

cat("First few rows of the dataset from the web:\n")


print(head(web_data))

# ---- 2. Read Dataset from the Disk (CSV and TXT Files) ----

# Example: Reading a CSV file from the local disk

csv_file_path <- "path/to/your/dataset.csv" # Change this to your file path

cat("\nReading CSV file from local disk:\n")

local_csv_data <- read.csv(csv_file_path)

cat("First few rows of the local CSV dataset:\n")

print(head(local_csv_data))

# Example: Reading a TXT file from the local disk

txt_file_path <- "path/to/your/dataset.txt" # Change this to your file path

cat("\nReading TXT file from local disk:\n")

local_txt_data <- read.table(txt_file_path, header = TRUE, sep = "\t") # Assumes tab-separated

cat("First few rows of the local TXT dataset:\n")

print(head(local_txt_data))

# ---- 3. Write Data to Disk (CSV and TXT Files) ----

# Example: Writing a CSV file to a specific location on the disk

output_csv_path <- "path/to/output/dataset_output.csv" # Change this to your desired output location

cat("\nWriting data to CSV file on disk:\n")

write.csv(web_data, output_csv_path, row.names = FALSE) # Writing web data to CSV


# Example: Writing a TXT file to a specific location on the disk

output_txt_path <- "path/to/output/dataset_output.txt" # Change this to your desired output location

cat("\nWriting data to TXT file on disk:\n")

write.table(local_txt_data, output_txt_path, row.names = FALSE, sep = "\t", quote = FALSE)

# Writing to TXT

cat("\nData successfully written to disk at specified locations.")
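
The script above relies on placeholder paths that must be edited before it runs. A self-contained round trip can be sketched with a built-in dataset and the session's temporary directory; the file names below are assumptions, and saveRDS()/readRDS() are included because they are mentioned in the notes above but not demonstrated.

```r
data(cars)

# Hypothetical output locations inside the session's temporary directory
csv_path <- file.path(tempdir(), "cars_copy.csv")
rds_path <- file.path(tempdir(), "cars_copy.rds")

# Write the same data frame in two formats
write.csv(cars, csv_path, row.names = FALSE)
saveRDS(cars, rds_path)

# Read both files back
cars_from_csv <- read.csv(csv_path)
cars_from_rds <- readRDS(rds_path)

# The CSV keeps the values; the RDS file also keeps the exact R object
print(dim(cars_from_csv))
print(identical(cars, cars_from_rds))
```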


Output
Experiment 7
AIM: Program in R for reading Excel datasheets.

SOURCE CODE
To read Excel files in R, we can use a package like readxl or openxlsx. The most commonly used
package for reading Excel files in R is readxl because it is simple to use and does not require any
external dependencies (like Java).

1. Install and Load Required Packages


# Install readxl package if not installed
if (!requireNamespace("readxl", quietly = TRUE)) install.packages("readxl")

# Load the readxl package


library(readxl)

2. Reading Excel Files


You can use read_excel() to read Excel files. It automatically detects the file format (.xls
or .xlsx) and works with both.
# File path to the Excel file

file_path <- "path/to/your/excel_file.xlsx" # Change to your file path

# Reading the Excel file

excel_data <- read_excel(file_path)

# Show the first few rows of the data

cat("First few rows of the Excel dataset:\n")

print(head(excel_data))

3. Reading Specific Sheets and Range


# Read a specific sheet by name
sheet_data <- read_excel(file_path, sheet = "Sheet1") # Replace "Sheet1" with your sheet name

# Read a specific sheet by index (e.g., second sheet)


sheet_data_by_index <- read_excel(file_path, sheet = 2)

# Read a specific range (e.g., cells A1 to D20 in Sheet1)


range_data <- read_excel(file_path, sheet = "Sheet1", range = "A1:D20")

# Show the data


cat("Data from specific sheet (Sheet1):\n")
print(head(sheet_data))

4. Handling Excel Files with Multiple Sheets


# Get names of all sheets in the Excel file
sheet_names <- excel_sheets(file_path)

cat("Sheet names in the Excel file:\n")


print(sheet_names)

cat("\nData from specific range (A1:D20):\n")


print(range_data)
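
Once the sheet names are known, every sheet can be read in one pass. The sketch below assumes the readxl package is installed and uses readxl_example(), which returns the path of a sample workbook shipped with the package, so that no placeholder path is needed; the object names are illustrative.

```r
library(readxl)

# Sample workbook bundled with readxl (stands in for your own file path)
example_path <- readxl_example("datasets.xlsx")

# Names of all sheets in the workbook
example_sheets <- excel_sheets(example_path)

# Read each sheet into its own data frame, collected in a named list
all_sheets <- lapply(example_sheets, function(s) read_excel(example_path, sheet = s))
names(all_sheets) <- example_sheets

# Number of rows read from each sheet
print(sapply(all_sheets, nrow))
```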
5. Reading Excel File with Specific Column Types
# Specify column types (e.g., first column as character, second as numeric, etc.)
# The vector needs one entry per column in the sheet (here, four columns are assumed)
col_types <- c("text", "numeric", "date", "logical")

# Read the data with specific column types


typed_data <- read_excel(file_path, col_types = col_types)

cat("Data with specified column types:\n")


print(head(typed_data))
Output
Experiment 8
AIM: Program to read XML datasets in R.
SOURCE CODE
Reading XML data in R can be done using the xml2 package, which provides an easy interface to
parse, read, and manipulate XML data. Here's how you can use this package to read and process
XML datasets in R.

1. Install and Load the xml2 Package


# Install xml2 package if not installed
if (!requireNamespace("xml2", quietly = TRUE)) install.packages("xml2")
# Load the xml2 package
library(xml2)
2. Read an XML File
The main function in the xml2 package for reading XML files is read_xml(). This function
reads the XML file into an R object, which can then be manipulated.
# Path to the XML file

xml_file_path <- "path/to/your/file.xml" # Replace with your actual file path

# Read the XML file

xml_data <- read_xml(xml_file_path)

# Print the XML content

cat("XML Content:\n")

print(xml_data)

3. Extracting Data from XML


# Example XML structure:

# <person>

# <name>John Doe</name>

# <age>30</age>

# <address>New York</address>
# </person>

# Extract specific elements (e.g., 'name' and 'age')

names <- xml_find_all(xml_data, ".//name")

ages <- xml_find_all(xml_data, ".//age")

# Get the text content of the extracted elements

name_text <- xml_text(names)

age_text <- xml_text(ages)

# Print extracted data

cat("\nExtracted Data:\n")

cat("Name: ", name_text, "\n")

cat("Age: ", age_text, "\n")

4. Working with Attributes in XML


# Example XML structure:

# <person id="123">

# <name>John Doe</name>

# <age>30</age>

# </person>

# Extract the 'id' attribute from the 'person' element

person_id <- xml_attr(xml_find_first(xml_data, ".//person"), "id")

# Print the extracted attribute

cat("\nPerson ID: ", person_id, "\n")

5. Extracting All Data from a Complex XML File


# Example XML structure:

# <people>

# <person>

# <name>John</name>
# <age>30</age>

# <address>New York</address>

# </person>

# <person>

# <name>Jane</name>

# <age>25</age>

# <address>Los Angeles</address>

# </person>

# </people>

# Extract all 'person' elements

people_nodes <- xml_find_all(xml_data, ".//person")

# Extract name, age, and address for each person

names <- xml_find_all(people_nodes, ".//name")

ages <- xml_find_all(people_nodes, ".//age")

addresses <- xml_find_all(people_nodes, ".//address")

# Get the text content

person_names <- xml_text(names)

person_ages <- xml_text(ages)

person_addresses <- xml_text(addresses)

# Combine the extracted data into a data frame

people_data <- data.frame(

Name = person_names,

Age = person_ages,

Address = person_addresses

)

# Print the resulting data frame

cat("\nExtracted Data from Multiple Persons:\n")

print(people_data)
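
Collecting names, ages, and addresses as three separate node sets only lines up when every person has all three children. A sketch that stays aligned when a child element is missing is shown below; it assumes the xml2 package is installed, and the inline XML string and object names are made up for illustration.

```r
library(xml2)

# Small literal XML document; the second person has no <address>
doc <- read_xml("<people>
  <person id='1'><name>John</name><age>30</age><address>New York</address></person>
  <person id='2'><name>Jane</name><age>25</age></person>
</people>")

people <- xml_find_all(doc, ".//person")

# xml_find_first() returns one result per person, so the columns stay aligned;
# a missing child yields NA instead of shifting later values up
people_df <- data.frame(
  id      = xml_attr(people, "id"),
  name    = xml_text(xml_find_first(people, "./name")),
  age     = as.integer(xml_text(xml_find_first(people, "./age"))),
  address = xml_text(xml_find_first(people, "./address")),
  stringsAsFactors = FALSE
)

print(people_df)
```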
6. Working with XML Namespaces
# Example XML with namespaces:

# <ns:person xmlns:ns="http://www.example.com">

# <ns:name>John Doe</ns:name>

# <ns:age>30</ns:age>

# </ns:person>

# Define the namespace

ns <- xml_ns(xml_data)

# Extract elements using the namespace

name_ns <- xml_find_all(xml_data, ".//ns:name", ns)

age_ns <- xml_find_all(xml_data, ".//ns:age", ns)

# Get the text content

name_text <- xml_text(name_ns)

age_text <- xml_text(age_ns)

# Print extracted data

cat("\nName (with namespace handling): ", name_text, "\n")

cat("Age (with namespace handling): ", age_text, "\n")

7. Writing Data to XML Files


# Create a simple XML structure

root <- xml_new_root("people")

person_node <- xml_add_child(root, "person")

xml_add_child(person_node, "name", "John")

xml_add_child(person_node, "age", "30")

xml_add_child(person_node, "address", "New York")

# Write the XML content to a file (the xml2 function is write_xml())

output_xml_path <- "path/to/output.xml"

write_xml(root, output_xml_path)

cat("\nXML file has been written to:", output_xml_path, "\n")


Output
Experiment 9
AIM: Write an R program to find data distribution using
box plots and scatter plots.

SOURCE CODE
Step 1: Install and Load Necessary Libraries

# Install ggplot2 if not already installed

if (!requireNamespace("ggplot2", quietly = TRUE)) install.packages("ggplot2")

# Load the ggplot2 package

library(ggplot2)

Step 2: Example Data

# Create an example data set (normal distribution)

set.seed(42) # For reproducibility

data <- data.frame(value = rnorm(100)) # 100 random numbers from a normal distribution

Step 3: Create a Boxplot

# Basic boxplot in base R

boxplot(data$value, main = "Boxplot of Data Distribution", ylab = "Value", col = "lightblue")

Step 4: Create a Scatter Plot

# Basic scatter plot in base R

plot(data$value, main = "Scatter Plot of Data Distribution",

xlab = "Index", ylab = "Value", pch = 19, col = "blue")
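
The plots above use a single simulated variable. As a further sketch in base R, a grouped box plot and a two-variable scatter plot can be drawn from the built-in iris and cars datasets; the choice of datasets and the object name species_box are assumptions made for illustration.

```r
data(iris)
data(cars)

# Box plot of one variable split by group: one box per species
species_box <- boxplot(Sepal.Length ~ Species, data = iris,
                       main = "Sepal Length by Species",
                       ylab = "Sepal Length", col = "lightblue")

# Scatter plot of two variables: stopping distance against speed
plot(dist ~ speed, data = cars,
     main = "Stopping Distance vs Speed",
     xlab = "Speed", ylab = "Distance", pch = 19, col = "blue")

# boxplot() invisibly returns the statistics it drew; n is the group size
print(species_box$n)
```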


Experiment 10
AIM: Program in R to detect outliers using box plots.
Steps to Detect Outliers Using a Boxplot in R
1. Create a boxplot: outliers appear as individual points beyond the "whiskers" of the
boxplot.
2. Use the IQR to define outliers: a data point is conventionally treated as an outlier if it
falls below $Q_1 - 1.5 \times \text{IQR}$ or above $Q_3 + 1.5 \times \text{IQR}$, where $Q_1$ is
the first quartile, $Q_3$ is the third quartile, and $\text{IQR} = Q_3 - Q_1$ is the
interquartile range.

SOURCE CODE
# Step 1: Create a sample dataset

set.seed(42) # For reproducibility

data <- c(rnorm(100), 10, 12, 14) # Normal distribution + some outliers

# Step 2: Calculate the IQR (Interquartile Range)

Q1 <- quantile(data, 0.25) # First quartile

Q3 <- quantile(data, 0.75) # Third quartile

IQR_value <- Q3 - Q1 # Interquartile range

# Step 3: Calculate outlier thresholds

lower_bound <- Q1 - 1.5 * IQR_value

upper_bound <- Q3 + 1.5 * IQR_value

# Step 4: Identify outliers

outliers <- data[data < lower_bound | data > upper_bound]

cat("Outliers detected:", outliers, "\n")

boxplot(data, main = "Boxplot of Data (Outliers)", ylab = "Value", col = "lightblue")

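# A single boxplot is drawn at x = 1, so the outliers are overlaid at that position
# (pch = 19, a filled circle, is an assumed value)

points(rep(1, length(outliers)), outliers, col = "red", pch = 19)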
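
Base R also exposes the points that a box plot flags through boxplot.stats(). The sketch below applies it to the same simulated data; the object names are assumptions. Note that boxplot.stats() works from the hinges rather than from quantile(), so its bounds may differ slightly from the manual calculation above.

```r
set.seed(42) # Same seed as above, for reproducibility

values <- c(rnorm(100), 10, 12, 14) # Normal data plus three injected outliers

# Statistics used to draw the box plot; $out holds the points beyond the whiskers
box_stats <- boxplot.stats(values)

cat("Outliers reported by boxplot.stats():", box_stats$out, "\n")
```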


Output
