Statistics and Data Science with R Part -4
Arrays
Accessing Data Frame Elements
print(my_data_frame)
# Accessing elements of a data frame using column indices
element <- my_data_frame[2, 3] # Accessing the element in the 2nd row and 3rd column
print(element)
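The indexing above can be tried on a small, self-contained data frame. The sketch below is illustrative only: example_df and its columns (name, age, score) are hypothetical and are not the my_data_frame used in this deck.

```r
# Hypothetical data frame, for illustration only
example_df <- data.frame(
  name  = c("Asha", "Ben", "Chen"),
  age   = c(28, 35, 42),
  score = c(88.5, 92.0, 79.5)
)

# [row, column] indexing: 2nd row, 3rd column
element <- example_df[2, 3]
print(element)

# The same cell, addressed by column name instead of column index
same_element <- example_df[2, "score"]

# A whole column (returned as a vector) and a whole row (a one-row data frame)
age_column <- example_df$age
second_row <- example_df[2, ]
```

Indexing by column name is usually more robust than indexing by position, because it keeps working if columns are reordered.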
Levels
• Factors have levels that define unique categories
or groups within the variable.
• Levels can be predefined or assigned explicitly
using the levels argument.
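As a minimal sketch of assigning levels explicitly, the example below passes a levels argument that includes a category with no observations. The responses vector and the "Other" category are assumptions made for illustration.

```r
# Hypothetical responses; "Other" never occurs in the data
responses <- c("Male", "Female", "Female", "Male")

# Levels assigned explicitly, including a category with no observations
factor_responses <- factor(responses, levels = c("Female", "Male", "Other"))

print(levels(factor_responses))  # the three predefined categories
print(table(factor_responses))   # "Other" is reported with a count of 0
```

Predefining levels in this way keeps empty categories visible in summaries and fixes the order in which categories are reported.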
Ordered Factors
• Factors can be ordered or unordered. Ordered
factors have a specific sequence or hierarchy
among their levels (e.g., low, medium, high).
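The low, medium, high example can be sketched as follows. The ratings vector is hypothetical; the ordering comes from the levels argument combined with ordered = TRUE.

```r
# Hypothetical ratings, for illustration only
ratings <- c("medium", "low", "high", "medium")

# The order of the levels argument defines the hierarchy: low < medium < high
ordered_ratings <- factor(
  ratings,
  levels  = c("low", "medium", "high"),
  ordered = TRUE
)
print(ordered_ratings)

# Comparisons respect the declared order rather than alphabetical order
is_above_low <- ordered_ratings > "low"
print(is_above_low)

# max() is defined for ordered factors
highest <- max(ordered_ratings)
print(highest)
```

For an unordered factor, the comparison with > would produce a warning and NA values, which is one practical reason to declare the order when it exists.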
Creating a Factor
# Creating a factor; the levels are inferred from the unique values in the data
gender <- c("Male", "Female", "Female", "Male", "Male")
factor_gender <- factor(gender)
print(factor_gender)
Importing and exporting data are fundamental tasks in R: importing brings data from
external sources into your R environment for analysis, and exporting saves results for
further use or sharing.
Data Import
1. CSV Files
Reading CSV files: Use read.csv() (base R) or read_csv() (from the readr package)
to read CSV files.
install.packages("readr")
library(readr)
data <- read_csv("path/to/your/file.csv")
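A self-contained sketch of reading a CSV file with base R is shown below. The file contents and the names csv_path and sales_data are hypothetical; a temporary file stands in for "path/to/your/file.csv" so the example can run anywhere.

```r
# Write a small hypothetical CSV to a temporary file so the example is self-contained
csv_path <- tempfile(fileext = ".csv")
writeLines(
  c("id,city,sales",
    "1,Pune,120.5",
    "2,Delhi,98.0",
    "3,Kochi,143.2"),
  csv_path
)

# Base R reader: returns a data.frame, using the first line as column names
sales_data <- read.csv(csv_path)

# Inspect the column types and the first rows
str(sales_data)
head(sales_data)
```

If readr is installed, read_csv(csv_path) reads the same file and returns a tibble instead of a plain data frame.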
2. Excel Files
Reading Excel files: Use the readxl package, which provides the read_excel() function.
install.packages("readxl")
library(readxl)
data <- read_excel("path/to/your/file.xlsx", sheet = 1)
Data Export
1. CSV Files
Writing to CSV files: Use the write.csv() or write_csv() (from
the readr package) functions.
write.csv(data, "path/to/save/your/file.csv")
write_csv(data, "path/to/save/your/file.csv")
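One difference worth checking when choosing between the two writers: write.csv() includes the row names as an extra first column by default, whereas write_csv() does not write row names. The sketch below illustrates the base R behaviour; export_data and out_path are hypothetical names, and a temporary file stands in for the real path.

```r
# Hypothetical data to export
export_data <- data.frame(id = 1:3, value = c(10.5, 20.1, 30.7))
out_path <- tempfile(fileext = ".csv")

# Default behaviour: row names are written as an extra first column
write.csv(export_data, out_path)
with_row_names <- read.csv(out_path)
print(names(with_row_names))

# row.names = FALSE writes only the data columns
write.csv(export_data, out_path, row.names = FALSE)
without_row_names <- read.csv(out_path)
print(names(without_row_names))
```

Setting row.names = FALSE is usually what is wanted when the file will be read back in or shared with other tools.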
2. Excel Files
Writing to Excel files: Use the writexl package, which provides the write_xlsx() function.
install.packages("writexl")
library(writexl)
write_xlsx(data, "path/to/save/your/file.xlsx")
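A round trip is a quick way to confirm that an Excel export worked. The sketch below assumes both writexl and readxl are installed and skips the round trip otherwise; excel_data and xlsx_path are hypothetical names, and a temporary file stands in for the real path.

```r
# Hypothetical data to export
excel_data <- data.frame(id = 1:3, city = c("Pune", "Delhi", "Kochi"))
xlsx_path <- tempfile(fileext = ".xlsx")

round_trip <- NULL
if (requireNamespace("writexl", quietly = TRUE) &&
    requireNamespace("readxl", quietly = TRUE)) {
  # Write the data frame to an .xlsx file, then read the first sheet back
  writexl::write_xlsx(excel_data, xlsx_path)
  round_trip <- readxl::read_excel(xlsx_path, sheet = 1)
  print(round_trip)
} else {
  message("Install the writexl and readxl packages to run this example.")
}
```

The package::function() form is used here so the example does not depend on library() calls having been run earlier.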
Built-in Datasets
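R ships with sample datasets in the datasets package, which is attached by default. One of them, mtcars, is used in the data manipulation examples in this deck. A brief sketch of inspecting it:

```r
# mtcars is available without loading any package
head(mtcars)   # first six rows
str(mtcars)    # column names and types

# Number of rows and columns
dim_mtcars <- dim(mtcars)
print(dim_mtcars)

# To list the datasets available in the attached packages, run: data()
```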
Selecting Variables
Use the select() function to choose specific columns. The select(), rename() and
filter() functions and the %>% pipe are provided by the dplyr package, so load it
first with library(dplyr).
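A minimal sketch of select() on the built-in mtcars dataset follows. The choice of columns (mpg, cyl, hp) is an assumption made for illustration; the name selected_data matches the object used in the renaming example.

```r
library(dplyr)

# Keep only three columns of mtcars (the column choice is illustrative)
selected_data <- mtcars %>%
  select(mpg, cyl, hp)

head(selected_data)
```

Columns are returned in the order they are listed in select(), and all rows are kept.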
Renaming Variables
Use the rename() function to rename columns. The syntax is new_name = old_name.
renamed_data <- selected_data %>%
  rename(
    Miles_per_Gallon = mpg,
    Cylinder_Count = cyl
  )
Filtering Rows
The filter() function allows you to subset rows based on certain conditions.
# Filter rows where mpg is greater than 20
filtered_data <- mtcars %>%
  filter(mpg > 20)
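Conditions can also be combined. The sketch below shows two common patterns on mtcars; the thresholds and the object names efficient_four_cyl and extreme_mpg are chosen for illustration only.

```r
library(dplyr)

# Conditions separated by commas are combined with AND
efficient_four_cyl <- mtcars %>%
  filter(mpg > 20, cyl == 4)

# Use | to keep rows that satisfy either condition (OR)
extreme_mpg <- mtcars %>%
  filter(mpg < 15 | mpg > 30)

nrow(efficient_four_cyl)
nrow(extreme_mpg)
```

Rows for which a condition evaluates to NA are dropped by filter(), which differs from base R subsetting with square brackets.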