BDA 2024 Section 01

ARAB ACADEMY FOR SCIENCE, TECHNOLOGY & MARITIME TRANSPORT

COLLEGE OF COMPUTING AND INFORMATION TECHNOLOGY

Big Data Analytics


Sec 01
Eng. Ahmed Mahmoud Nazif

Week Topics
W1 Introduction and practical course overview
W2 Using the R Graphical User Interface
W3 Getting Data into (and out of) R
W4 Data Types used in R, and the basic R operations
W5 Basic Statistics, Visualization Using R
W6 R Graphics Package Plots and Hypothesis Tests
W7 Classification Using R
W8 Clustering Using R
W9 Introduction to Hadoop and server installation
W10 MapReduce (Analytics) with Hadoop
W11 SQL Essentials topics
W12 Advanced SQL topics
W13 Microsoft PowerBI (part 1) / or Tableau Public
W14 Microsoft PowerBI (part 1) / or Tableau Public
W 15 Final project submission and discussion

Source: From EMC 2012


Big Data Analytics: Industry Examples
1 Health Care
• Reducing Cost of Care
2 Public Services
• Preventing Pandemics
3 Life Sciences
• Genomic Mapping
4 IT Infrastructure
• Unstructured Data Analysis
5 Online Services
• Social Media for Professionals
[Figure labels: Medical, Government, Internet, Data Collectors, Phone/TV, Retail, Financial]

Introduction to Big Data Analytics + Data Analytics Lifecycle
• Big Data Overview
• State of the Practice in Data Analytics
• The Data Scientist
• Big Data Analytics in Industry Verticals
• Data Analytics Lifecycle

Review of Basic Data Analytic Methods Using R
• Using R to Look at Data - Introduction to R
• Analyzing and Exploring the Data
• Statistics for Model Building and Evaluation

Advanced Analytics – Theory and Methods
• K-means Clustering
• Association Rules
• Linear Regression
• Logistic Regression
• Naive Bayesian Classifier
• Decision Trees
• Time Series Analysis
• Text Analysis

Advanced Analytics - Technology and Tools
• Analytics for Unstructured Data (MapReduce and Hadoop)
• The Hadoop Ecosystem
• In-database Analytics – SQL Essentials
• Advanced SQL

The Endgame, or Putting it All Together
• Operationalizing an Analytics Project
• Creating the Final Deliverables
• Data Visualization Techniques
• Microsoft PowerBI OR Tableau Public

Why R?

R is a programming and statistical language.
R is used for data analysis and visualization.
R is simple and easy to learn, read and write.
R is an example of FLOSS (Free/Libre and Open-Source Software).

Why R?
• Open source
• Cross-platform compatible
• Large community
• Various statistical and graphical packages
• User friendly
You can download RStudio from
https://rstudio.com/products/rstudio/download/#download
Who uses R?

• Google uses R to predict economic activity.
• Mozilla, the foundation responsible for the Firefox web browser, uses R to visualize
web activity.

Getting Started With RStudio

• Open RStudio and a window appears. Across the top is a standard menu bar with typical menu
items that we will look at in a little while.

• In the left-hand pane is a "console" window for R commands.

• On the right is a "Workspace" panel that shows the variables (also known as
"objects") that you are working with in R.

• On the bottom right is a pane displaying the plots you create.

More on RStudio
(scripts)

• You'll want to start writing larger scripts.
• In RStudio, select "New Script" from the
"File" menu.

• You'll now see a "Script" panel appear.
• Try typing a command into this panel and then
press the "Run" button.
• You should see the results of running the script
appear in the console panel.

• The Run button executes the line that contains the blinking caret. If a block of code is
selected, it executes the selected block instead.

• Try entering several lines of code now and clicking Run.

• The scripts in RStudio work very much like individual files in editors and spreadsheet
packages.

• However, the other panels are a little different and are saved together in a workspace
when you exit RStudio.

• This workspace will be reopened each time you open RStudio.

R SYNTAX
To output text in R, use single or double quotes:
• Example
'Hello World!'
"Hello World!"

To output numbers, just type the number (without quotes):
• Example
5
10
25

To do simple calculations, add numbers together:
• Example
5 + 5
R PRINT OUTPUT
Unlike many other programming languages, R lets you output a value without using a print function:
• Example
"Hello World!"

However, R does have a print() function available if you want to use it. This might be
useful if you are familiar with other programming languages, which often use
a print() function for output.
• Example
print("Hello World!")

You need the print() function to output values when working with for loops:
• Example
for (x in 1:10) {
  print(x)
}

Conclusion: at the top level, it is up to you whether you want to use the print() function.
However, when your code is inside an R expression (e.g. inside curly braces {} like in the
example above), use the print() function to output the result.
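
The difference between the two loop styles can be sketched as follows. This is a minimal illustration, not part of the original slides: the names silent and shown are made up for the example, and capture.output() is used only to collect what each loop would display in the console.

```r
# A bare value inside a loop body is evaluated but not displayed,
# because auto-printing only happens at the top level.
silent <- capture.output(
  for (x in 1:3) {
    x
  }
)

# With print(), each value is displayed.
shown <- capture.output(
  for (x in 1:3) {
    print(x)
  }
)

length(silent)  # 0, the first loop produced no output
shown           # "[1] 1" "[1] 2" "[1] 3"
```

In everyday use you would simply write the second loop; the capture step is only there to make the comparison visible.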
R COMMENT
❖Comments can be used to explain R code and to make it more readable. They can also
be used to prevent execution when testing alternative code.
❖Comments start with a #. When executing code, R ignores everything from the # to the
end of the line.
This example uses a comment before a line of code:
EXAMPLE
# This is a comment
"Hello World!"

This example uses a comment at the end of a line of code:
EXAMPLE
"Hello World!" # This is a comment

Comments do not have to be text that explains the code. They can also be used to prevent R from
executing a line of code:
EXAMPLE
# "Good morning!"
"Good night!"
Multiline Comments

Unlike other programming languages, such as Java, R has no
syntax for multiline comments. However, we can just insert
a # at the start of each line to create multiline comments:

• Example
# This is a comment
# written in
# more than just one line
"Hello World!"
R VARIABLES
Creating Variables in R
❖Variables are containers for storing data values. R does not have a command for
declaring a variable. A variable is created the moment you first assign a value to it.
To assign a value to a variable, use the <- operator. To output (or print) the variable
value, just type the variable name:

EXAMPLE
name <- "John"
age <- 40

name # output "John"
age # output 40

From the example above, name and age are variables, while "John" and 40 are values. In
other programming languages, it is common to use = as an assignment operator. In R, we
can use both = and <- as assignment operators. However, <- is preferred in most cases
because the = operator cannot be used for assignment in some contexts in R.
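
One context where the two operators behave differently is inside a function call. The sketch below assumes a fresh R session and uses a small helper, average(), that is hypothetical and defined only for this illustration.

```r
# A small helper for illustration (hypothetical, not from the slides).
average <- function(scores) mean(scores)

# Inside a call, = only names the argument.
# No variable called scores is created in the workspace.
a <- average(scores = c(2, 4, 6))
exists("scores", inherits = FALSE)   # FALSE in a fresh session

# Inside a call, <- creates the variable and passes its value along.
b <- average(scores <- c(2, 4, 6))
exists("scores", inherits = FALSE)   # TRUE
scores                               # 2 4 6
```

Both calls return the same average, 4. The only difference is whether scores exists afterwards.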
Print / Output Variables
Compared to many other programming languages, you do not have to use a function to
print/output variables in R. You can just type the name of the variable:

EXAMPLE
name <- "John Doe"
name # auto-print the value of the “name” variable

However, you can also use the print() function to output variables.

EXAMPLE
name <- "John Doe"
print(name) #print the value of the “name” variable using print function
R VARIABLE NAMES (IDENTIFIERS)
R MULTIPLE VARIABLES
• R allows you to assign the same value to multiple variables in one line:

• EXAMPLE
# Assign the same value to multiple variables in one line
VAR1 <- VAR2 <- VAR3 <- "Orange"

# Print variable values


VAR1
VAR2
VAR3
R CONCATENATE
ELEMENTS
You can concatenate, or join, two or more elements, by using the paste()
function:
EXAMPLE
text <- "awesome"
paste("R is", text)

You can also use it to add a variable to another variable:


EXAMPLE
text1 <- "R is"
text2 <- "awesome"
paste(text1, text2)

You can also use it inside print() function:

print(paste("R is", "awesome"))


Note: in the previous examples, there are spaces between words because the
default separator in the paste() function is a single space.

To concatenate words without spaces, use paste0()


EXAMPLE
text1 <- "R is"
text2 <- "awesome"
paste0(text1, text2)
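
As a further sketch, the separator can also be chosen explicitly through the sep argument of paste(). The variable names below are made up for this illustration.

```r
first  <- "R"
second <- "is"
third  <- "awesome"

with_space <- paste(first, second, third)             # "R is awesome"
with_dash  <- paste(first, second, third, sep = "-")  # "R-is-awesome"
no_space   <- paste0(first, second, third)            # "Risawesome"
```

paste0() gives the same result as calling paste() with sep = "".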
For numbers, the + character works as a mathematical operator:
• EXAMPLE
num1 <- 5
num2 <- 10

num1 + num2

If you try to combine a string (text) and a number with +, R will give you an error:
• EXAMPLE
num <- 5
text <- "some text"

num + text
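
The sketch below shows one way to look at that error without stopping the script, and how to join the two values with paste() instead. It is a minimal illustration: the names msg and combined are made up for the example, and the exact wording of the message depends on the language settings of your R installation.

```r
num <- 5
text <- "some text"

# tryCatch() returns the error message instead of halting the script.
msg <- tryCatch(
  num + text,
  error = function(e) conditionMessage(e)
)
msg   # in an English locale: "non-numeric argument to binary operator"

# To join a number and a string, use paste() rather than +.
combined <- paste(num, text)   # "5 some text"
```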
Thanks