R in Data Analysis
install.packages("tidyverse")
library(tidyverse)
The same two steps (install, then load) work for the following packages: here, skimr,
janitor, and dplyr, which are useful for basic data cleaning. dplyr is already loaded
as part of tidyverse.
The separate function splits one column into two at a separator we define:
separate(employee, name, into = c("first_name", "last_name"), sep = " ")
The unite function does the opposite and combines two columns into one:
unite(employee, "name", first_name, last_name, sep = " ")
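The two calls above can be tried on a small table. This is a minimal sketch; the employee data frame and the names in it are made up for illustration, and only the tidyr package is assumed to be installed.

```r
library(tidyr)

# Hypothetical employee table; the names are illustrative only.
employee <- data.frame(
  id = c(1, 2),
  name = c("Ana Silva", "Omar Khan"),
  stringsAsFactors = FALSE
)

# Split the name column into two columns at the space.
split_names <- separate(employee, name,
                        into = c("first_name", "last_name"), sep = " ")

# Combine the two columns back into a single name column.
joined_names <- unite(split_names, "name", first_name, last_name, sep = " ")

print(split_names)
print(joined_names)
```

By default both functions drop the columns they consume, so joined_names has the same columns as the original table.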
The mutate function adds a new column (or changes an existing one) based on a
calculation. Applied to a data frame, for example through a pipe:
mutate(body_mass_kg = body_mass_g / 1000)
Here body_mass_kg is a new column computed from the existing body_mass_g
column.
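A minimal sketch of the mutate call on a complete data frame. The animals table and its values are hypothetical; only the column name body_mass_g comes from the notes.

```r
library(dplyr)

# Hypothetical measurements in grams.
animals <- data.frame(id = 1:3, body_mass_g = c(3750, 3800, 3250))

# Add a kilogram column derived from the gram column.
animals_kg <- animals %>%
  mutate(body_mass_kg = body_mass_g / 1000)

print(animals_kg)
```

The original animals data frame is unchanged; mutate returns a new data frame, which is why the result is assigned to animals_kg.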
typeof(name_of_variable) returns the data type of a variable.
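A few base R examples of what typeof returns for common values. The variable names are illustrative.

```r
# Store a few values of different kinds.
a_string  <- "hello"
a_number  <- 3.5
a_whole   <- 5L      # the L suffix makes an integer
a_logical <- TRUE

typeof(a_string)   # "character"
typeof(a_number)   # "double"
typeof(a_whole)    # "integer"
typeof(a_logical)  # "logical"

# A number typed without the L suffix is stored as a double.
typeof(5)          # "double"
```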
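The bias function (from the SimDesign package) measures the average difference
between two sets of values, not their correlation. For example, with
actual_temp and predicted_temp columns in our dataset, it shows whether the
predictions run too low or too high on average. A result close to 0 means the
predictions are unbiased.
bias(actual_temp, predicted_temp)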
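To see why an average difference is not a correlation, the idea can be sketched in base R without the bias function itself. The temperature values below are made up for illustration, and mean_difference is a hand-written stand-in, not the package implementation.

```r
# Hypothetical temperature readings and forecasts.
actual_temp    <- c(68.3, 70, 72.4, 71, 67, 70)
predicted_temp <- c(67.9, 69, 71.5, 70, 67, 69)

# Average amount by which the actual values exceed the predictions.
mean_difference <- mean(actual_temp - predicted_temp)

# Correlation answers a different question: do the two move together?
temp_correlation <- cor(actual_temp, predicted_temp)

print(mean_difference)
print(temp_correlation)
```

Here the forecasts track the actual values closely (high correlation) and still run consistently low (positive average difference), so the two numbers describe different things.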
Indexing
vec2[1] returns the first element of the vector vec2 (R counts from 1). In these
notes that element is the string "men nigari cox sevirem" ("I love Nigar very much").
df$M or df$N returns the column M or N of the data frame df.
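A short base R sketch of both kinds of indexing. The contents of vec2 and df are placeholders chosen for the example.

```r
# A character vector; positions start at 1, not 0.
vec2 <- c("first", "second", "third")
first_value <- vec2[1]

# A data frame with two columns, M and N.
df <- data.frame(M = c(1, 2, 3), N = c("a", "b", "c"))
column_m <- df$M   # the whole M column as a vector
column_n <- df$N   # the whole N column as a vector

print(first_value)
print(column_m)
```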
select(diamonds, -price) returns every column except price.
The dataset (diamonds) is the first argument of the function. Instead of
writing it inside the call, we can pass it in with the pipe operator %>%.
For example:
diamonds %>%
  select(-price)
The pipe passes diamonds into select as its first argument.
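A quick check that the two forms give the same result. This sketch assumes the dplyr and ggplot2 packages are installed; the diamonds dataset ships with ggplot2.

```r
library(dplyr)
library(ggplot2)  # provides the diamonds dataset

# Dataset as the first argument.
without_price_direct <- select(diamonds, -price)

# Dataset passed in through the pipe.
without_price_piped <- diamonds %>%
  select(-price)

# Both forms produce the same table, with price removed.
identical(without_price_direct, without_price_piped)
names(without_price_piped)
```

The piped form becomes more useful once several steps are chained, because each step reads top to bottom instead of being nested inside the next call.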
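The arrange function sorts rows by a column, in ascending order by default:
arranged_diamonds <- diamonds %>%
  arrange(color)
view(arranged_diamonds)
To sort in descending order, wrap the column in desc():
diamonds %>%
  arrange(desc(carat))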
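A small sketch of both sort directions, again assuming dplyr and ggplot2 (for the diamonds dataset) are installed. head is used in place of view so the example also runs outside RStudio.

```r
library(dplyr)
library(ggplot2)  # provides the diamonds dataset

# Ascending sort by color.
arranged_diamonds <- diamonds %>%
  arrange(color)

# Descending sort by carat: the largest stones come first.
largest_first <- diamonds %>%
  arrange(desc(carat))

head(arranged_diamonds)
head(largest_first)
```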
Data visualization in R
plot(dataset_name$column_name)
barplot(dataset_name$column_name), such as barplot(mtcars$cyl)
hist(dataset_name$column_name)
hist(iris$Petal.Length[iris$Species == "setosa"])
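The plotting calls above can be run with the built-in iris and mtcars datasets. The titles and axis labels are additions for readability. One point to be aware of: barplot(mtcars$cyl) draws one bar per car, so counting the cars in each cylinder group needs table() first.

```r
# Petal lengths for one species only; column and level names are case-sensitive.
setosa_lengths <- iris$Petal.Length[iris$Species == "setosa"]

# Histogram of the selected values.
hist(setosa_lengths,
     main = "Setosa petal length",
     xlab = "Petal length (cm)")

# Count the cars in each cylinder group, then plot the counts.
cyl_counts <- table(mtcars$cyl)
barplot(cyl_counts,
        main = "Cars by number of cylinders",
        xlab = "Cylinders",
        ylab = "Number of cars")
```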
Regarding work experience, I have participated in several internship programs
in Azerbaijan. I worked as an IT analyst at the State Committee on the Affairs
of Refugees and IDPs. During the internship, we were instructed to collect
complaints and feedback from refugees living in specific areas about their
quality of life. We analyzed the responses in Microsoft Excel and created some
visualizations to better understand the context of the crisis. Our main task
was to ensure that every complaint was addressed as needed and that the
reports were sent to the administrative officials for further action. The
internship lasted 6 months, and the main tools for the task were Microsoft
Excel and Power BI.
1 item at 10 AZN = 10
Sincerely,
Manaf Əhmədov