Tutorial 1 - Answers.


ICT583

3 March 2023

ICT583 Data Science Applications


Tutorial 1

1. Introduction to RStudio
https://education.rstudio.com/learn/beginner/
https://cran.r-project.org/doc/contrib/Paradis-rdebuts_en.pdf

RStudio Layout

(Source: https://datacarpentry.org/genomics-r-intro/00-introduction/index.html)


Task 1.1
Generate your first R Script file, plot a histogram for a built-in dataset, save and execute code.

# show the names of built-in datasets


data()

# Loading
data(mtcars)
# Print the first 6 rows
head(mtcars, 6)

# show description of the dataset


?mtcars

# you can also input character for the first argument of the function data()
data("mtcars")
# Print the first few rows, six rows by default
head(mtcars)
# Number of rows (observations)
nrow(mtcars)
# Number of columns (variables)
ncol(mtcars)

str(mtcars)
#> try another dataset
data("iris")
head(iris)
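
A quick way to compare the two datasets above is to look at their dimensions, column names and per-column summaries. The following is a minimal sketch using only base R functions; it is one possible set of checks, not a required part of the task.

```r
# Both datasets ship with base R, so no packages are needed.
data(mtcars)
data(iris)

# dim() returns the number of rows and columns in a single call
dim(mtcars)
dim(iris)

# names() lists the column (variable) names
names(iris)

# summary() reports quartiles for numeric columns and
# counts per level for factor columns such as iris$Species
summary(iris)

# sapply() applies class() to every column to show its data type
sapply(iris, class)
```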


#> try using iris data to generate a histogram


# import data
data(iris)

# store sepal length as object i


i = iris$Sepal.Length

# input i for the argument of the function hist(), and store the result as object h
h <- hist(i)
# show values of the plot
h
# you can also input the sepal length directly as the argument of hist()
hist(iris$Sepal.Length)

# you can specify other arguments of hist()


hist(i, main="my iris", xlab="iris sepal length",
     xlim=c(3,9), ylim=c(0,35), col="blue", freq=TRUE)
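
The object returned by hist() is a list, so the values behind the plot can be read off by name. The sketch below assumes you only want the numbers and therefore sets plot = FALSE; the value 20 passed to breaks is an arbitrary choice for illustration, and R treats it as a suggestion rather than an exact bin count.

```r
# Compute the histogram without drawing it
h <- hist(iris$Sepal.Length, plot = FALSE)

# Bin boundaries, number of observations per bin, and bin midpoints
h$breaks
h$counts
h$mids

# Every observation falls in exactly one bin, so the counts sum to the sample size
sum(h$counts) == length(iris$Sepal.Length)

# Ask for finer bins; R picks "pretty" break points close to the requested number
h20 <- hist(iris$Sepal.Length, breaks = 20, plot = FALSE)
length(h20$counts)
```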
# if you are unsure about an R function, you can check https://www.rdocumentation.org/ or https://rdrr.io/
https://www.rdocumentation.org/packages/graphics/versions/3.6.2/topics/hist
# useful sites
https://www.tutorialspoint.com/r/r_histograms.htm
http://www.sthda.com/english/wiki/ggplot2-histogram-plot-quick-start-guide-r-software-and-data-visualization
https://www.datacamp.com/tutorial/make-histogram-basic-r


Task 1.2:
R Markdown
Alternatively, you can try creating an R Markdown file.
https://rmarkdown.rstudio.com/articles_intro.html

It runs the same code as an R script, but also generates a report that embeds the code with text,
outputs, etc., in HTML or another file type.

#>
---
title: "tut1.2"
output: html_document
date: "2023-02-27"
---

```{r setup, include=FALSE}


knitr::opts_chunk$set(echo = TRUE)
```

## R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML,
PDF, and MS Word documents. For more details on using R Markdown see
<http://rmarkdown.rstudio.com>.

When you click the **Knit** button a document will be generated that includes both content
as well as the output of any embedded R code chunks within the document. You can embed an
R code chunk like this:


```{r carss}

summary(cars)

str(cars)

```

## Including Plots

You can also embed plots, for example:

```{r pressure, echo=T}

plot(pressure)

str(pressure)

```

Note that this chunk sets `echo=T`, so the R code that generated the plot is printed in the
report. Setting `echo = FALSE` instead would hide the code and show only its output.
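
The Knit button can also be reproduced from the R console with rmarkdown::render(). The sketch below assumes the document above was saved as tut1.2.Rmd in the working directory; that file name is hypothetical, so replace it with your own path.

```r
# Hypothetical file name: replace with the path to your own .Rmd file
rmd_file <- "tut1.2.Rmd"

if (requireNamespace("rmarkdown", quietly = TRUE) && file.exists(rmd_file)) {
  # render() knits the document and returns the path of the generated file
  out_file <- rmarkdown::render(rmd_file, output_format = "html_document")
  print(out_file)
} else {
  message("Install the rmarkdown package and check the file path before rendering.")
}
```

Rendering from the console is convenient when the report has to be regenerated by a script rather than by hand.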

2. What data science applications can R achieve - Explore R shiny gallery!



Try the following apps


# Radiant
https://shiny.rstudio.com/gallery/radiant.html
https://github.com/radiant-rstats/radiant
# the diamonds data ships with ggplot2, which is loaded as part of the tidyverse package
# install tidyverse
install.packages("tidyverse")
# load tidyverse
library(tidyverse)
# read basic info about diamonds
str(diamonds)
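
If you only need the diamonds data, loading the whole tidyverse is optional. The sketch below is one alternative, assuming the ggplot2 package is already installed; it checks for the package first so the script does not fail when it is missing.

```r
if (requireNamespace("ggplot2", quietly = TRUE)) {
  # Copy the dataset out of the ggplot2 namespace without attaching the package
  diamonds <- ggplot2::diamonds

  # Number of rows and columns
  print(dim(diamonds))

  # How many diamonds fall into each cut category
  print(table(diamonds$cut))

  # Five-number summary (plus mean) of the price column
  print(summary(diamonds$price))
} else {
  message("ggplot2 is not installed; run install.packages(\"ggplot2\") first.")
}
```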

# health spending and life expectancy


https://shiny.rstudio.com/gallery/google-charts.html
https://github.com/rstudio/shiny-examples/tree/main/182-google-charts
https://databank.worldbank.org/

Discussion:
Can you describe their application task?
- What are the aims of the project?
What data were presented?
- What are the variables, R functions and results?
Can you summarize any new insights after observing the generated results?
- Did the original authors achieve their goals? How accurate was it?

3. Where to find the publicly available datasets for analysis – explore Kaggle and UCI!


Visit the following websites, which host some of the most popular data repositories:
https://archive.ics.uci.edu/ml/datasets.php
https://www.kaggle.com/

# Choose one data set that you are most interested in.
# after downloading it onto your drive, you can read it with an R function, e.g., read.csv()
# the first argument of read.csv() should be the path to your file

my_data = read.csv("D:/Users/SK/Downloads/abalone.data", header = F)


# my_data is the object name,
# "D:/Users/SK/Downloads/abalone.data" is the file path,
# F is shorthand for FALSE (writing FALSE in full is safer, because F can be reassigned);
# it sets the header argument, which is the second argument of read.csv()

# read more about read.csv() and relevant functions


https://www.rdocumentation.org/packages/utils/versions/3.6.2/topics/read.table

# Investigate the first and last few rows of the data frame.


head(my_data)
tail(my_data)

# Understand the dataset variables and their data types.


str(my_data)
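
Because the file is read with header = F, R names the columns V1, V2, and so on. The sketch below shows one way to attach readable names through the col.names argument. To stay self-contained it writes three placeholder rows in the same comma-separated layout to a temporary file instead of using the downloaded file, and the column names are an assumption based on the attribute list published with the abalone dataset, shortened here into R-friendly identifiers.

```r
# Placeholder rows in the abalone.data layout (no header line, comma-separated);
# they stand in for the real file only so that this example runs anywhere.
sample_lines <- c(
  "M,0.455,0.365,0.095,0.514,0.2245,0.101,0.15,15",
  "F,0.53,0.42,0.135,0.677,0.2565,0.1415,0.21,9",
  "I,0.33,0.255,0.08,0.205,0.0895,0.0395,0.055,7"
)
tmp <- tempfile(fileext = ".data")
writeLines(sample_lines, tmp)

# Assumed column names (not stored in the file itself)
abalone_cols <- c("Sex", "Length", "Diameter", "Height", "WholeWeight",
                  "ShuckedWeight", "VisceraWeight", "ShellWeight", "Rings")

# col.names is passed on to read.table() and replaces the default V1..V9
abalone <- read.csv(tmp, header = FALSE, col.names = abalone_cols)

str(abalone)
head(abalone)
```

With your own download, replace tmp with the path to the file, as in the read.csv() call above.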
