BDA 2024 Section 01
Week Topics
W1 Introduction and practical course overview
W2 Using the R Graphical User Interface
W3 Getting Data into (and out of) R
W4 Data Types used in R, and the basic R operations
W5 Basic Statistics, Visualization Using R
W6 R Graphics Package Plots and Hypothesis Tests
W7 Classification Using R
W8 Clustering Using R
W9 Introduction to Hadoop and server installation
W10 MapReduce (Analytics) with Hadoop
W11 SQL Essentials topics
W12 Advanced SQL topics
W13 Microsoft Power BI (part 1) / or Tableau Public
W14 Microsoft Power BI (part 1) / or Tableau Public
W15 Final project submission and discussion
[Figure: Big Data application areas. 2 Public Services: preventing pandemics. 3 Life Sciences: genomic mapping. 4 IT Infrastructure: unstructured data analysis. Other labels in the figure: Government, Internet, Data Collectors, Phone/TV, Retail.]
Why R?
• Open source
• Cross-platform compatible
• Large community
• Various statistical and graphical packages
• User friendly
You can download RStudio from
https://rstudio.com/products/rstudio/download/#download
Who uses R?
Getting Started With R-Studio
Open R-Studio and a window appears. Across the top is a standard menu bar with typical menu
items that we will look at in a little while.
On the right is a "Workspace" panel that shows the variables (also known as
"objects") that you are working with in R.
More on Rstudio
(scripts)
You'll now see a "Script" panel appear.
Try typing a command into this panel and then
press the "Run" button shown below.
You should see the results of running the script
appear in the console panel.
When nothing is selected, the Run button executes the single line containing the blinking
caret; otherwise, it executes the selected block of code.
The scripts in R-Studio work very much like individual files in editors and spreadsheet
packages.
However, the other panels are a little different and are saved together in a workspace
when you exit R-Studio.
R SYNTAX
To output text in R, use single or double quotes. To output a number or the result of a calculation, type it without quotes.
• Example
'Hello World!'
"Hello World!"
5 + 5
R PRINT OUTPUT
Unlike many other programming languages, R can output a value without a print function; typing an expression at the console displays its result:
• Example
"Hello World!"
However, R does have a print() function available if you want to use it. This might be
useful if you are familiar with other programming languages, which often use
a print() function for output.
• Example
print("Hello World!")
Use the print() function to output values when working with for loops:
• Example
for (x in 1:10) {
  print(x)
}
Conclusion: at the top level, it is up to you whether you use the print() function.
However, when your code is inside a for loop (within curly braces {}, as in the
example above), use the print() function to output the result.
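The difference between auto-printing and print() inside a loop can be checked directly. The following is a minimal sketch, not part of the original slides: it uses base R's capture.output() to collect whatever each loop prints, and the variable names silent_output and printed_output are made up for illustration.

```r
# Inside a for loop, a bare expression is evaluated but not auto-printed.
silent_output <- capture.output(
  for (x in 1:3) {
    x            # evaluated, nothing is shown
  }
)

# Calling print() explicitly shows each value.
printed_output <- capture.output(
  for (x in 1:3) {
    print(x)
  }
)

length(silent_output)   # 0: the first loop printed nothing
printed_output          # "[1] 1" "[1] 2" "[1] 3"
```

The first loop produces no output at all, which is why print() is needed there.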
R COMMENT
❖Comments can be used to explain R code and to make it more readable. They can also
be used to prevent execution when testing alternative code.
❖Comments start with a #. When executing code, R ignores everything from the # to the
end of the line.
This example uses a comment before a line of code:
EXAMPLE
# This is a comment
"Hello World!"
Comments do not have to be text that explains the code; they can also be used to prevent R from
executing a line of code:
EXAMPLE
# "Good morning!"
"Good night!"
Multiline Comments
R has no dedicated syntax for multiline comments; start each line with a # instead:
• Example
# This is a comment
# written in
# more than just one line
"Hello World!"
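A comment can also follow code on the same line. The short sketch below is an illustration added here, with made-up variable names (greeting, tag); it also shows that a # inside quotes is part of the string and does not start a comment.

```r
greeting <- "Hello World!"   # a comment can follow code on the same line
tag <- "#not-a-comment"      # a # inside quotes belongs to the string

# greeting <- "Good morning!"   (commented out, so this line never runs)

greeting
tag
```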
R VARIABLES
Creating Variables in R
❖Variables are containers for storing data values. R does not have a command for
declaring a variable. A variable is created the moment you first assign a value to it.
To assign a value to a variable, use the <- operator. To output (or print) the variable
value, just type the variable name:
EXAMPLE
name <- "John"
age <- 40
name   # output "John"
age    # output 40
In the example above, name and age are variables, while "John" and 40 are values. In
other programming languages, it is common to use = as an assignment operator. In R, we
can use both = and <- as assignment operators. However, <- is preferred in most cases
because the = operator is not allowed, or means something different, in some contexts in R.
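One context where the two operators behave differently is inside a function call. The sketch below is an added illustration, not taken from the slides: the variable names (result_eq, scores and so on) are hypothetical, and mean() and exists() are base R functions.

```r
# At the top level, both operators create a variable.
age <- 40
height = 180

# Inside a function call, = only names an argument; no variable x is created.
result_eq <- mean(x = c(2, 4, 6))
x_created_by_eq <- exists("x", inherits = FALSE)

# Inside a function call, <- assigns scores first and then passes it on.
result_arrow <- mean(scores <- c(2, 4, 6))
scores_created <- exists("scores", inherits = FALSE)

x_created_by_eq   # FALSE when run in a fresh session
scores_created    # TRUE
```

Using <- consistently for assignment keeps = free for naming function arguments.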
Print / Output Variables
Compared to many other programming languages, you do not have to use a function to
print/output variables in R. You can just type the name of the variable:
EXAMPLE
name <- "John Doe"
name  # auto-print the value of the "name" variable
EXAMPLE
name <- "John Doe"
print(name)  # print the value of the "name" variable using the print() function
R VARIABLE NAMES (IDENTIFIERS)
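As a brief reminder of R's naming rules: a variable name may contain letters, digits, periods and underscores; it must start with a letter, or with a period that is not followed by a digit; it cannot be a reserved word such as TRUE or NULL; and names are case-sensitive. The sketch below is an added illustration with made-up candidate names. It uses base R's make.names(), which returns its input unchanged only when the input is already a syntactically valid name.

```r
candidates <- c("myvar", "my_var", "myVar", ".myvar",
                "my var", "2myvar", "_my_var", "TRUE")

# A candidate is a valid name if make.names() leaves it unchanged.
is_valid <- candidates == make.names(candidates)
is_valid   # TRUE TRUE TRUE TRUE FALSE FALSE FALSE FALSE

# Names are case-sensitive: age and Age are two different variables.
age <- 40
Age <- 41
age == Age   # FALSE
```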
R MULTIPLE VARIABLES
• R allows you to assign the same value to multiple variables in one line:
• EXAMPLE
# Assign the same value to multiple variables in one line
var1 <- var2 <- var3 <- "Orange"
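A chained assignment gives each variable its own copy of the value, so changing one variable later does not affect the others. A minimal sketch, using lower-case variable names and an illustrative second value ("Apple") that is not in the slides:

```r
# Assign the same value to three variables in one line.
var1 <- var2 <- var3 <- "Orange"
c(var1, var2, var3)   # "Orange" "Orange" "Orange"

# Reassigning one variable leaves the other two unchanged.
var2 <- "Apple"
c(var1, var2, var3)   # "Orange" "Apple" "Orange"
```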
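For numbers, the + character works as a mathematical operator:
• EXAMPLE
num1 <- 5    # illustrative value
num2 <- 10   # illustrative value
num1 + num2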
If you try to combine a string (text) and a number, R will give you an error:
• EXAMPLE
num <- 5
text <- "some text"
num + text
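To see the error without stopping a script, the failing expression can be wrapped in tryCatch(). The sketch below is an added illustration: error_message and combined are hypothetical names, and paste() is shown as one common base R way to join a number and a string, not necessarily the approach the course intends.

```r
num <- 5
text <- "some text"

# Adding a number to a string raises an error; tryCatch() captures its message.
error_message <- tryCatch(
  num + text,
  error = function(e) conditionMessage(e)
)
error_message

# paste() converts the number to text and joins the pieces with a space.
combined <- paste(num, text)
combined   # "5 some text"
```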
Thanks