
MQP R Answers


MODULE-1

Q1 (a): Explain with example Environments and Functions (12 Marks)

Environments in R
An environment is a collection of name-value pairs: names, such as those of variables or
functions, bound to their associated values. It is where R looks up the value bound to a
name during execution.

• Global Environment: The default environment where user-defined variables and functions reside.

• Local Environment: A temporary environment created each time a function is executed.

Key Features of Environments:

1. Parent Environment: Every environment has a parent, except for the empty
environment.

2. Search Path: R searches through environments in the order defined by the parent
structure when resolving variable names.

Example of Environments:

x <- 10

my_function <- function() {
  x <- 5   # local x, visible only inside the function
  print(paste("Local x:", x))
}

my_function() # Output: Local x: 5

print(paste("Global x:", x)) # Output: Global x: 10
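The parent-environment lookup described above can also be seen with a function that returns another function. This is a minimal sketch; the names make_counter and counter are illustrative and not part of the original answer. The inner function finds count in its enclosing environment, and the <<- operator assigns there instead of creating a new local variable.

```r
make_counter <- function() {
  count <- 0                 # lives in the environment created by this call
  function() {
    count <<- count + 1      # <<- modifies count in the enclosing environment
    count
  }
}

counter <- make_counter()
first  <- counter()
second <- counter()
print(c(first, second))

# The enclosing environment of counter still holds the updated value
print(environment(counter)$count)
```

Each call to make_counter() creates a fresh environment, so two counters built this way would not share their count.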

Functions in R
A function is a block of code that performs a specific task. Functions are first-class objects in
R, meaning they can be treated like any other object.

Key Components of a Function:

1. Name: The identifier for the function.

2. Arguments: Inputs required for the function to operate.

3. Body: The set of instructions the function executes.

4. Return Value: The result of the function.

Example of Functions:

add_numbers <- function(a, b) {
  result <- a + b
  return(result)
}

sum_result <- add_numbers(10, 15)

print(paste("The sum is:", sum_result))
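Because functions are first-class objects, they can take default arguments, be passed to other functions, and be created without a name. The sketch below illustrates this; add_default and apply_twice are hypothetical names chosen for the example.

```r
# Default argument: b is 1 unless the caller supplies another value
add_default <- function(a, b = 1) {
  a + b   # the value of the last expression is returned automatically
}

# A function that receives another function as an argument
apply_twice <- function(f, x) {
  f(f(x))
}

print(add_default(10))                 # uses the default b = 1
print(apply_twice(add_default, 10))    # add_default applied two times
print(sapply(1:3, function(n) n^2))    # anonymous function passed to sapply()
```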

Q1 (b): Explain with example flow control and loops in R programming (8 Marks)

Flow Control
Flow control in R determines the logical flow of execution. The main constructs are:

1. If-Else Statements

2. Switch Statements

Example of If-Else:

x <- 10

if (x > 0) {
  print("Positive number")
} else {
  print("Non-positive number")
}

Example of Switch:

fruit <- "apple"

result <- switch(fruit,

"apple" = "Apple is red",

"banana" = "Banana is yellow",

"Unknown fruit")

print(result)
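Two common extensions of the constructs above are the else if chain, for more than two branches, and the vectorised ifelse(), which tests every element of a vector at once. A minimal sketch follows; classify, values and signs are illustrative names.

```r
classify <- function(x) {
  if (x > 0) {
    "positive"
  } else if (x < 0) {
    "negative"
  } else {
    "zero"
  }
}

print(classify(-3))

# if () expects a single condition; ifelse() works element by element
values <- c(-2, 0, 5)
signs <- ifelse(values > 0, "positive", "non-positive")
print(signs)
```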
Loops
Loops are used to repeat actions. Common loops in R include:

1. For Loop: Iterates over a sequence.

2. While Loop: Repeats while a condition is TRUE.

Example of For Loop:

for (i in 1:5) {
  print(i)
}

Example of While Loop:

count <- 1

while (count <= 3) {
  print(paste("Count is:", count))
  count <- count + 1
}
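Besides for and while, R has a repeat loop and the next and break statements for controlling iteration. The sketch below is an illustrative addition with hypothetical variable names, not part of the original answer.

```r
# next skips the rest of the current iteration
evens <- c()
for (i in 1:10) {
  if (i %% 2 != 0) next
  evens <- c(evens, i)
}
print(evens)

# repeat runs until break is reached, so the exit test sits inside the body
n <- 1
repeat {
  n <- n * 2
  if (n > 100) break
}
print(n)
```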

Q2 (a): List and explain different data types in R (10 Marks)

R supports several data types:

1. Numeric:

o Numbers with decimal values.

o Default data type for numbers in R.

o Example: x <- 3.14

2. Integer:

o Whole numbers stored without a fractional part. A literal such as 5 is numeric unless it carries the L suffix.

o Use L to denote integers.

o Example: y <- 5L

3. Character:

o Stores text strings.

o Enclosed in quotes.
o Example: z <- "Hello"

4. Logical:

o Boolean values (TRUE or FALSE).

o Useful for conditions and logical operations.

o Example: w <- TRUE

5. Complex:

o Stores complex numbers in the form a + bi.

o Example: c <- 2 + 3i

6. Raw:

o Represents raw bytes.

o Example: raw_val <- charToRaw("A")

Code Example:

num <- 12.5

int <- 10L

char <- "Data Analytics"

logi <- TRUE

comp <- 2 + 3i

print(c(class(num), class(int), class(char), class(logi), class(comp)))
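The type of a value can be inspected with class() and typeof(), and converted with the as.*() family of functions. The following sketch uses the example values from the list above; the variable name values is an assumption made for illustration.

```r
values <- list(3.14, 5L, "Hello", TRUE, 2 + 3i, charToRaw("A"))

print(sapply(values, class))    # high-level class of each value
print(sapply(values, typeof))   # internal storage type of each value

print(is.numeric(5L))           # integers also count as numeric
print(as.integer("42"))         # character converted to integer
print(as.numeric(TRUE))         # TRUE is converted to 1
print(class(5))                 # without the L suffix, 5 is numeric
```

Note that class() reports "numeric" for 3.14 while typeof() reports "double", the underlying storage type.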

Q2 (b): Explain different steps in initiating R (10 Marks)

Steps to Initiate R Programming

1. Download and Install R:

o Visit the CRAN website (https://cran.r-project.org/).

o Choose the appropriate version for your operating system and install.

2. Install an IDE:

o Use IDEs like RStudio for a better coding environment.

o Download RStudio from RStudio's website.

3. Load Necessary Libraries:


o Use install.packages("package_name") to install required packages.

o Load libraries using library(package_name).

4. Understand the R Environment:

o Get familiar with R's Console, Environment, and Script Editor in RStudio.

5. Write and Execute Code:

o Use .R files for scripts or directly type commands in the Console.

o Execute code by pressing Ctrl + Enter (Windows) or Cmd + Enter (Mac).

6. Save and Export Results:

o Save scripts and data for reuse using save() or write.csv().

Example Session:

install.packages("dplyr")

library(dplyr)

print("Welcome to R!")

MODULE-2
Q3 (a): Explain Lists and Data Frames with examples (10 Marks)

Lists
A list in R is a collection of elements that can hold multiple types of data (e.g., numeric,
character, logical). It is a flexible data structure and can even store other lists or data frames.

Features of Lists:

• Can store heterogeneous elements.

• Accessed using [[ ]] or $.

Example of Lists:

# Creating a list
my_list <- list(name = "John", age = 25, scores = c(90, 85, 88))

# Accessing elements
print(my_list$name) # Access by name
print(my_list[[2]]) # Access by position
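Lists can also be modified after creation. The sketch below adds, updates and removes elements, and shows the difference between [ ] and [[ ]]; the city value is a made-up example.

```r
my_list <- list(name = "John", age = 25, scores = c(90, 85, 88))

my_list$city <- "Pune"            # add a new element (hypothetical value)
my_list$age <- my_list$age + 1    # update an existing element
my_list$scores <- NULL            # assigning NULL removes the element

print(names(my_list))
print(class(my_list["name"]))     # [ ] returns a list containing the element
print(class(my_list[["name"]]))   # [[ ]] returns the element itself
```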

Data Frames
A data frame is a table-like structure where each column can have a different type of data,
but all columns must have the same number of rows. It is the most common data structure
for storing datasets.

Features of Data Frames:

• Columns can hold different types (numeric, character, etc.).

• Accessed using column names or indices.

Example of Data Frames:

# Creating a data frame
df <- data.frame(
  ID = 1:3,
  Name = c("Alice", "Bob", "Charlie"),
  Age = c(25, 30, 35)
)

# Accessing elements
print(df$Name) # Access column
print(df[1, ]) # Access row
print(df[1, "Age"]) # Access specific value
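Typical follow-up operations on a data frame are adding a derived column and selecting rows by a condition. A minimal sketch on the same data follows; the Senior column and the older variable are illustrative additions.

```r
df <- data.frame(
  ID = 1:3,
  Name = c("Alice", "Bob", "Charlie"),
  Age = c(25, 30, 35)
)

df$Senior <- df$Age >= 30        # add a logical column
older <- df[df$Age > 25, ]       # keep only rows where Age exceeds 25

print(older)
print(dim(df))                   # number of rows and columns
print(mean(df$Age))              # summary of a single column
```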

Q3 (b): Develop a program to create two 3x3 matrices and perform matrix operations (10
Marks)

Solution:

1. Create Matrices:
Use matrix() to define two 3x3 matrices.

2. Matrix Operations:
Perform transpose, addition, subtraction, and multiplication.

Code:

# Create two 3x3 matrices (matrix() fills column by column by default)
A <- matrix(c(1, 2, 3, 4, 5, 6, 7, 8, 9), nrow = 3, ncol = 3)
B <- matrix(c(9, 8, 7, 6, 5, 4, 3, 2, 1), nrow = 3, ncol = 3)

print("Matrix A:")
print(A)
print("Matrix B:")
print(B)

# 1. Transpose of the matrices
transpose_A <- t(A)
transpose_B <- t(B)
print("Transpose of Matrix A:")
print(transpose_A)
print("Transpose of Matrix B:")
print(transpose_B)

# 2. Addition of the two matrices
C <- A + B
print("Addition of Matrix A and B:")
print(C)

# 3. Subtraction of the two matrices
C <- A - B
print("Subtraction of Matrix A and B:")
print(C)

# 4. Multiplication of the two matrices (%*% is the matrix product)
C <- A %*% B
print("Matrix Multiplication of A and B:")
print(C)
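A frequent source of confusion is the difference between the * operator, which multiplies element by element, and %*%, which computes the matrix product. The sketch below contrasts the two on the same matrices and also shows the byrow argument of matrix(); the variable names are illustrative.

```r
A <- matrix(1:9, nrow = 3)   # filled column by column: first column is 1, 2, 3
B <- matrix(9:1, nrow = 3)

elementwise <- A * B         # multiplies matching entries
product <- A %*% B           # rows of A combined with columns of B

print(elementwise[1, 1])     # 1 * 9
print(product[1, 1])         # 1*9 + 4*8 + 7*7
print(dim(product))

# byrow = TRUE fills row by row, which here gives the transpose of A
R <- matrix(1:9, nrow = 3, byrow = TRUE)
print(all(R == t(A)))
```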

Q4 (a): Explain Factors and Strings (10 Marks)

Factors
Factors are used to handle categorical data in R. They store data as levels and are helpful in
statistical modeling.

Features of Factors:

• Represent categorical data.

• Levels can be ordered or unordered.

• Memory-efficient, as each unique level is stored once and the values are kept as integer codes.

Example of Factors:

# Creating a factor

gender <- factor(c("Male", "Female", "Female", "Male"))

print(gender)

print(levels(gender)) # View levels
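The example above creates an unordered factor. For ordered categories, the level order can be set explicitly, as in this sketch; the sizes data is a hypothetical example.

```r
sizes <- factor(c("medium", "small", "large", "small"),
                levels = c("small", "medium", "large"),
                ordered = TRUE)

print(sizes)
print(as.integer(sizes))   # the underlying integer codes
print(sizes < "large")     # comparisons are allowed for ordered factors
print(table(sizes))        # count of observations per level
```

Without the levels argument, R sorts the levels alphabetically, which is rarely the intended order for such data.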

Strings
Strings in R represent textual data. They are created using quotes (" or '). R provides
functions like paste(), substr(), and nchar() to manipulate strings.

Example of Strings:

# Working with strings

text <- "Data Analytics"

print(nchar(text)) # Number of characters

print(substr(text, 1, 4)) # Extract substring
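The example above does not use paste(), so the following sketch shows it together with a few other base R string functions. The variable names are illustrative.

```r
first <- "Data"
second <- "Analytics"

full <- paste(first, second)           # joins with a space by default
print(full)
print(paste0(first, second))           # joins without a separator
print(toupper(full))                   # converts to upper case
print(sub("Data", "Big Data", full))   # replaces the first match
print(strsplit(full, " ")[[1]])        # splits into words
```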

Q4 (b): Develop an R program to find all prime numbers up to a specified number using
the sieve of Eratosthenes (10 Marks)

Solution:

sieve_of_eratosthenes <- function(limit) {
  if (limit >= 2) {
    numbers <- 2:limit
    for (p in numbers) {
      # Stop once p^2 exceeds the limit: all remaining numbers are prime
      if (p^2 > limit) {
        break
      }
      # Keep p itself and drop every other multiple of p
      numbers <- numbers[numbers == p | numbers %% p != 0]
    }
    return(numbers)
  } else {
    return(numeric(0))
  }
}

prime_numbers <- sieve_of_eratosthenes(30)

print(prime_numbers)
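The solution above removes multiples by testing divisibility. The classical form of the sieve instead marks multiples in a logical vector, with no division at all. The following is a sketch of that variant; the function name sieve_marking is an assumption made for this example.

```r
sieve_marking <- function(limit) {
  if (limit < 2) return(integer(0))

  is_prime <- rep(TRUE, limit)   # is_prime[i] answers "is i still a candidate?"
  is_prime[1] <- FALSE

  p <- 2
  while (p * p <= limit) {
    if (is_prime[p]) {
      # Cross out p^2, p^2 + p, ... ; smaller multiples were already crossed out
      is_prime[seq(p * p, limit, by = p)] <- FALSE
    }
    p <- p + 1
  }
  which(is_prime)                # positions still marked TRUE are the primes
}

print(sieve_marking(30))
```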

Module 3

Q5 (a): Explain briefly about importing and exporting files (10 Marks)

Importing Files in R
Importing refers to loading external data into R for analysis. Common functions for importing
files include:

1. CSV Files: Use read.csv().

2. Excel Files: Use the readxl package (read_excel() function).

3. Text Files: Use read.table() or readLines().

4. Databases: Use libraries like RSQLite or DBI.

Example - Importing a CSV File:

data <- read.csv("data.csv")

print(head(data))

Exporting Files in R
Exporting refers to saving data or results into external files. Common functions include:

1. CSV Files: Use write.csv().

2. Excel Files: Use the writexl package (write_xlsx() function).

3. Text Files: Use write.table().

Example - Exporting a CSV File:

write.csv(data, "output.csv", row.names = FALSE)
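The two examples above assume that a file named data.csv already exists. A self-contained round trip can be sketched with a temporary file, as below; sample_data, path and restored are hypothetical names.

```r
sample_data <- data.frame(
  ID = 1:3,
  Name = c("Alice", "Bob", "Charlie"),
  stringsAsFactors = FALSE
)

path <- tempfile(fileext = ".csv")                # temporary file location
write.csv(sample_data, path, row.names = FALSE)   # export

restored <- read.csv(path, stringsAsFactors = FALSE)   # import
print(restored)
print(all.equal(sample_data, restored))           # compare original and re-imported data

unlink(path)                                      # remove the temporary file
```

Setting row.names = FALSE keeps the row numbers from being written as an extra column, which would otherwise appear as a new column on import.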


Q5 (b): Explain the steps involved in data cleaning and transforming (10 Marks)

Data Cleaning
Data cleaning ensures the data is free from errors or inconsistencies. Key steps include:

1. Handle Missing Values:

o Remove missing values using na.omit().

o Replace missing values with mean/median.

Example:

data$Age[is.na(data$Age)] <- mean(data$Age, na.rm = TRUE)

2. Remove Duplicates: Use duplicated() to identify and remove duplicates.

Example:

data <- data[!duplicated(data), ]

3. Standardize Data: Normalize numeric columns to bring them into a similar scale.

4. Detect and Handle Outliers: Use visualization (boxplots) or statistical methods.

Data Transformation
Data transformation involves converting data into a suitable format for analysis. Steps
include:

1. Scaling Data: Put numeric columns on a common scale. scale() centers each column to
mean 0 and rescales it to standard deviation 1.

2. Encoding Categorical Variables: Convert categorical variables into numeric form
using factor() or one-hot encoding.

3. Feature Engineering: Create new variables from existing data.

Example:

data$total_score <- data$math_score + data$science_score

4. Filtering and Selecting Columns: Use dplyr functions like filter() and select().

Example:

library(dplyr)

filtered_data <- data %>% filter(Age > 25)
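The cleaning and transformation steps can be combined into one short workflow. The sketch below uses base R only, on a small made-up data frame; all names and values in raw are hypothetical and chosen to include one duplicate row and one missing value.

```r
# Hypothetical raw data with a duplicated row and a missing Age
raw <- data.frame(
  Name = c("Asha", "Ravi", "Ravi", "Meena"),
  Age = c(24, 31, 31, NA),
  math_score = c(80, 65, 65, 90),
  science_score = c(70, 75, 75, 85)
)

clean <- raw[!duplicated(raw), ]                               # remove duplicate rows
clean$Age[is.na(clean$Age)] <- mean(clean$Age, na.rm = TRUE)   # impute missing Age with the mean
clean$total_score <- clean$math_score + clean$science_score    # feature engineering
clean$total_scaled <- as.numeric(scale(clean$total_score))     # standardise: mean 0, sd 1
older <- clean[clean$Age > 25, ]                               # filter rows in base R

print(clean)
print(older)
```

Removing duplicates before imputing matters here: otherwise the repeated row would be counted twice in the mean.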

Q6 (a): Develop a data frame in R to store 20 employee details and analyze (10 Marks)
Solution:

Code:

Program no 9
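The referenced program is not reproduced in this document. As a stand-in, here is a minimal sketch of one way to approach the question. It is not the referenced program: the column names, the departments and the randomly generated salaries are all assumptions made for illustration.

```r
set.seed(42)   # makes the synthetic salaries reproducible

# Synthetic data frame with 20 employees (all values are made up)
employees <- data.frame(
  emp_id = 1:20,
  name = paste("Employee", 1:20),
  department = rep(c("HR", "IT", "Finance", "Sales"), times = 5),
  salary = round(runif(20, min = 25000, max = 80000)),
  stringsAsFactors = FALSE
)

str(employees)                                       # structure of the data frame
print(summary(employees$salary))                     # salary summary
print(aggregate(salary ~ department,                 # mean salary per department
                data = employees, FUN = mean))
print(employees[which.max(employees$salary), ])      # highest-paid employee
```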

Q6 (b): Explain briefly with example accessing databases (10 Marks)

R provides tools to connect to and interact with databases, such as MySQL, SQLite, or
PostgreSQL. The DBI and RSQLite packages are commonly used for database operations.

Steps to Access a Database:

1. Install Necessary Libraries:
Use install.packages() to install database-related packages.

2. Establish Connection:
Use dbConnect() to connect to the database.

3. Query the Database:
Use dbGetQuery(), dbSendQuery() or dbReadTable() to fetch data.

4. Close the Connection:
Use dbDisconnect() to terminate the connection.

Example - SQLite Database:

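# Load libraries
library(DBI)
library(RSQLite)

# Connect to an in-memory SQLite database
con <- dbConnect(SQLite(), ":memory:")

# Sample data (hypothetical values for illustration)
employee_data <- data.frame(
  name = c("Asha", "Ravi", "Meena"),
  salary = c(28000, 45000, 52000)
)

# Create a table from the data frame
dbWriteTable(con, "employees", employee_data)

# Query the table
query_result <- dbGetQuery(con, "SELECT * FROM employees WHERE salary > 30000")
print(query_result)

# Disconnect
dbDisconnect(con)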
Module 4

Q7 (a): Explain briefly Exploratory Data Analysis (EDA) and Scatter Plots (10 Marks)

Exploratory Data Analysis (EDA)
EDA is the process of analyzing datasets to summarize their main characteristics, often using
visual methods. It helps in understanding the structure, patterns, and relationships within
the data.

Key Steps in EDA:

1. Understand Data Structure:

o Use functions like str(), summary(), and head() to inspect data.

2. Handle Missing Values:

o Identify and impute/remove missing data.

3. Analyze Data Distribution:

o Use histograms, boxplots, and density plots.

4. Check Relationships Between Variables:

o Use scatter plots, correlation matrices, or pair plots.

Example:

summary(mtcars)

boxplot(mtcars$mpg, main = "Miles Per Gallon (mpg)", ylab = "mpg")
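The non-graphical EDA steps listed above can be sketched on the same built-in mtcars dataset. This is an illustrative sequence, not a complete analysis.

```r
str(mtcars)                       # step 1: structure (types and dimensions)
print(head(mtcars, 3))            # step 1: first few rows

print(colSums(is.na(mtcars)))     # step 2: count of missing values per column

print(quantile(mtcars$mpg))       # step 3: distribution of mpg in quartiles

# step 4: relationship between weight and fuel efficiency
print(round(cor(mtcars$wt, mtcars$mpg), 2))
```

A negative correlation here indicates that heavier cars tend to have lower mpg, which is the pattern a scatter plot of the two variables makes visible.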

Scatter Plots
Scatter plots visualize relationships between two numeric variables. Each point represents
an observation.

Key Features:

• X-axis: Independent variable.

• Y-axis: Dependent variable.

Example:

plot(mtcars$wt, mtcars$mpg,

main = "Weight vs MPG",


xlab = "Weight (1000 lbs)",

ylab = "Miles Per Gallon (mpg)",

col = "blue",

pch = 19)

Q7 (b): Explain base graphics and lattice graphics with the help of box plots (10 Marks)

Base Graphics
Base graphics use simple plotting functions for visualizations. They are part of R's core
functionality.

Example - Box Plot (Base Graphics):

boxplot(mpg ~ cyl, data = mtcars,

main = "Boxplot of MPG by Cylinders",

xlab = "Cylinders",

ylab = "Miles Per Gallon",

col = c("red", "blue", "green"))

Lattice Graphics
Lattice is a powerful visualization package for creating multivariate graphics. It focuses on
conditioning (subsetting) data for plots.

Example - Box Plot (Lattice Graphics):

library(lattice)

bwplot(mpg ~ factor(cyl), data = mtcars,
       main = "Boxplot of MPG by Cylinders (Lattice)",
       xlab = "Cylinders",
       ylab = "Miles Per Gallon",
       col = c("red", "blue", "green"))

Q8 (a): Demonstrate progression of salary with years of experience using lm() (10 Marks)

Steps:

1. Create a dataset with years_of_experience and salary.


2. Fit a linear model using lm().

3. Plot the data and add the regression line.

4. Save the results to a file.

Code:

# Create dataset
data <- data.frame(
  years_of_experience = 1:10,
  salary = c(30000, 35000, 40000, 45000, 50000, 55000, 60000, 65000, 70000, 75000)
)

# Fit linear model

model <- lm(salary ~ years_of_experience, data = data)

# Plot data and regression line

plot(data$years_of_experience, data$salary,

main = "Salary vs Years of Experience",

xlab = "Years of Experience",

ylab = "Salary",

col = "blue",

pch = 19)

abline(model, col = "red")

# Save coefficients

write.csv(coef(model), "coefficients.csv")

# Add predicted values to data

data$predicted_salary <- predict(model, data)


# Save updated dataset

write.csv(data, "updated_data.csv")
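To read the progression off the fitted model, the coefficients can be inspected and the model used to predict salaries for new experience values. The sketch below rebuilds the same dataset under different names (salary_data, salary_model, new_data) so that it runs on its own.

```r
salary_data <- data.frame(
  years_of_experience = 1:10,
  salary = seq(30000, 75000, by = 5000)   # same values as the dataset above
)

salary_model <- lm(salary ~ years_of_experience, data = salary_data)

# Intercept = estimated salary at 0 years; slope = increase per additional year
print(coef(salary_model))

# Predict salaries for experience values outside the observed range
new_data <- data.frame(years_of_experience = c(12, 15))
print(predict(salary_model, newdata = new_data))
```

Because this example data lies exactly on a straight line, the fit is perfect; real salary data would show scatter around the line.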

Q8 (b): Explain histograms in base graphics, lattice graphics, and ggplot2 (10 Marks)

Histograms
Histograms visualize the distribution of a single variable by dividing it into bins.

1. Base Graphics:
Use the hist() function.
Example:

hist(mtcars$mpg, main = "Histogram of MPG (Base Graphics)",
     xlab = "Miles Per Gallon", col = "blue", breaks = 10)

2. Lattice Graphics:
Use the histogram() function.
Example:

library(lattice)

histogram(~mpg, data = mtcars,
          main = "Histogram of MPG (Lattice Graphics)",
          xlab = "Miles Per Gallon", col = "blue")

3. ggplot2:
Use geom_histogram().
Example:

library(ggplot2)

ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(binwidth = 2, fill = "blue", color = "black") +
  labs(title = "Histogram of MPG (ggplot2)", x = "Miles Per Gallon", y = "Count")
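The binning that a histogram performs can be inspected without drawing anything, because hist() returns the bin boundaries and counts. A short sketch follows; the variable name h is illustrative.

```r
# plot = FALSE returns the computed bins instead of drawing the histogram
h <- hist(mtcars$mpg, breaks = 10, plot = FALSE)

print(h$breaks)        # bin boundaries chosen by hist()
print(h$counts)        # number of observations in each bin
print(sum(h$counts))   # every observation falls into exactly one bin
```

A single number passed as breaks is treated by hist() as a suggestion, so the actual number of bins may differ from 10.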
Module 5

Q9 (a): Explain briefly basic statistical measures available in R (10 Marks)

R provides several basic statistical functions to summarize data and understand its
characteristics.

Key Statistical Measures:

1. Mean: The average of the data.
Function: mean()
Example:

data <- c(10, 20, 30, 40, 50)
mean_val <- mean(data)
print(mean_val) # Output: 30

2. Median: The middle value of the data.
Function: median()
Example:

median_val <- median(data)
print(median_val) # Output: 30

3. Mode: The most frequently occurring value.
R doesn't have a built-in function for the statistical mode, but it can be computed.
Example:

mode_func <- function(x) {
  unique_x <- unique(x)
  unique_x[which.max(tabulate(match(x, unique_x)))]
}
mode_val <- mode_func(c(10, 20, 20, 30, 40))
print(mode_val) # Output: 20

4. Standard Deviation: Measure of data spread.
Function: sd()
Example:

sd_val <- sd(data)
print(sd_val)

5. Variance: The square of the standard deviation.
Function: var()
Example:

var_val <- var(data)
print(var_val)

6. Range: The minimum and maximum values of the data.
Function: range() returns both values; diff(range(data)) gives the difference between them.
Example:

range_val <- range(data)
print(range_val) # Output: 10 50

7. Summary: A quick summary of all key measures.
Function: summary()
Example:

summary(data)
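To connect sd() and var() to their definitions, the sketch below recomputes the sample variance by hand (dividing by n - 1) and compares it with the built-in functions. The variable names are illustrative.

```r
values <- c(10, 20, 30, 40, 50)
n <- length(values)

# Sample variance: sum of squared deviations from the mean, divided by n - 1
manual_var <- sum((values - mean(values))^2) / (n - 1)
manual_sd <- sqrt(manual_var)

print(c(manual_var, var(values)))   # both values agree
print(c(manual_sd, sd(values)))     # both values agree

print(diff(range(values)))          # spread between minimum and maximum
print(IQR(values))                  # spread of the middle half of the data
```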

Q9 (b): Explain four in-built functions to generate normal distribution in R (10 Marks)

The normal distribution is a continuous probability distribution. R provides functions to work
with it:

1. dnorm(): Calculates the density (height of the probability density function) at a point.
Example:

density <- dnorm(0, mean = 0, sd = 1)
print(density) # Output: 0.3989423

2. pnorm(): Computes the cumulative probability up to a point.
Example:

cumulative_prob <- pnorm(1, mean = 0, sd = 1)
print(cumulative_prob) # Output: 0.8413447

3. qnorm(): Finds the quantile for a given cumulative probability. It is the inverse of pnorm().
Example:

quantile <- qnorm(0.8413447, mean = 0, sd = 1)
print(quantile) # Output: approximately 1

4. rnorm(): Generates random numbers from a normal distribution.
Example:

random_numbers <- rnorm(5, mean = 0, sd = 1)
print(random_numbers)
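The four functions are related to each other, which the following sketch illustrates. The seed, the sample size and the mean of 50 with standard deviation 10 are arbitrary choices made for the example.

```r
# qnorm() undoes pnorm()
p <- pnorm(1.5)
print(qnorm(p))

# Probability of falling within one standard deviation of the mean
within_one_sd <- pnorm(1) - pnorm(-1)
print(round(within_one_sd, 4))

# Quantile that leaves 2.5% in the upper tail (used for 95% intervals)
print(round(qnorm(0.975), 2))

# Random samples approach the requested mean and sd as the sample grows
set.seed(123)   # fixed seed so the draw is reproducible
sample_values <- rnorm(1000, mean = 50, sd = 10)
print(mean(sample_values))
print(sd(sample_values))
```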

Q10 (a): Explain Correlation Analysis and Linear Regression (10 Marks)

Correlation Analysis
Correlation measures the strength and direction of the relationship between two variables.

Types of Correlation:

1. Positive Correlation: Both variables increase together.

2. Negative Correlation: One variable increases while the other decreases.

Function: cor()

Example:

x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 6, 8, 10)

correlation <- cor(x, y)

print(correlation) # Output: 1 (Perfect Positive Correlation)

Linear Regression
Linear regression models the relationship between a dependent variable (y) and one or
more independent variables (x).

Steps in Linear Regression:

1. Fit a model using lm().

2. Interpret coefficients.

3. Use the model for prediction.

Example:

# Dataset
data <- data.frame(
  x = c(1, 2, 3, 4, 5),
  y = c(2, 4, 6, 8, 10)
)

# Fit linear model
model <- lm(y ~ x, data = data)

summary(model)
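The dataset above lies exactly on the line y = 2x, so it shows a perfect fit. The sketch below repeats the three steps on slightly noisy data, which is closer to what regression is used for in practice; the values in obs are made up for illustration.

```r
# Hypothetical observations that deviate slightly from a straight line
obs <- data.frame(
  x = 1:5,
  y = c(2.1, 3.9, 6.2, 7.8, 10.1)
)

fit <- lm(y ~ x, data = obs)                      # step 1: fit the model

print(coef(fit))                                  # step 2: intercept and slope
print(summary(fit)$r.squared)                     # share of variance explained

print(predict(fit, newdata = data.frame(x = 6)))  # step 3: predict for a new x

# With one predictor, R-squared equals the squared correlation
print(cor(obs$x, obs$y)^2)
```

This also links the two halves of the question: in simple linear regression, the correlation coefficient and the quality of the fit carry the same information.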

Q10 (b): Describe Analysis of Variance (ANOVA) (10 Marks)

Analysis of Variance (ANOVA)
ANOVA is a statistical method to compare the means of three or more groups to see if at
least one group is significantly different.

Types of ANOVA:

1. One-Way ANOVA: Compares group means across the levels of one factor.

2. Two-Way ANOVA: Examines two factors simultaneously.

Steps in One-Way ANOVA:

1. Define the groups and their observations.

2. Use aov() to perform ANOVA.

3. Check the significance using the p-value.

Example:

# Dataset
data <- data.frame(
  group = rep(c("A", "B", "C"), each = 5),
  score = c(5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19)
)

# Perform ANOVA
anova_result <- aov(score ~ group, data = data)

summary(anova_result)

Interpretation:
If the p-value < 0.05, at least one group mean is significantly different from others.
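The p-value can also be extracted from the ANOVA table in code, and a post-hoc test can show which groups differ. The following is a sketch on the same data; the variable names are illustrative, and TukeyHSD() is one common choice of post-hoc test rather than the only one.

```r
scores <- data.frame(
  group = factor(rep(c("A", "B", "C"), each = 5)),
  score = 5:19   # same scores as the dataset above
)

result <- aov(score ~ group, data = scores)

# The first element of the summary is the ANOVA table
anova_table <- summary(result)[[1]]
f_value <- anova_table[["F value"]][1]
p_value <- anova_table[["Pr(>F)"]][1]

print(f_value)
print(p_value < 0.05)      # TRUE means at least one group mean differs

# Pairwise comparisons between the groups
print(TukeyHSD(result))
```

ANOVA itself only says that some difference exists; the pairwise output is what identifies the groups involved.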
