Week 3 4th Revolution
By
1
BIG DATA
We live surrounded by and submerged in DATA
DIGITAL DATA
2
3
Introduction
Since the invention of computers, people have
used the term data to refer to computer
information.
DATA
Data can be text or numbers written on
paper, bits and bytes inside the memory of
electronic devices, or facts that are stored
in a person’s mind.
5
DATA EXPLOSION WITH INTERNET AND SOCIAL MEDIA
6
7
WORLD OF DATA
8
Internet and Data
You are on the Internet almost daily. You check
your email, send replies, maybe browse
websites, and even click on things (image, link).
9
SOCIAL MEDIA GENERATE DATA
10
The Internet & DATA & CORONAVIRUS
The coronavirus pandemic shuttered offices,
schools, restaurants, and other establishments.
As a result, people spent more time on the
Internet for work, learning, and entertainment.
11
2.5 quintillion bytes of data were created
every day (SG Analytics, 2020).
12
As of August 2020, in one Internet minute
there were:
41,666,667 messages
13
Netflix users streamed 404,444 hours of video
every minute. (Domo, 2020)
14
Email users sent 306.4 billion emails per
day in 2020. In contrast, 293.6 billion were
exchanged in 2019. (Radicati Group, 2019;
TechJury, 2020)
15
300 hours of video were uploaded on
YouTube per minute. (e-Learning
Infographics, 2020)
16
Smart Transportation
1. DATA STORAGE
2. DATA PROCESSING
3. INFORMATION RETRIEVAL
4. SEARCHING DATA
5. ORGANISING DATA
6. DATA CLASSIFICATION
7. DATA CLEANING
8. COMPLETING MISSING DATA
9. REASONING, PREDICTION, PLANNING
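Two of the tasks listed above, data cleaning and completing missing data, can be sketched in a few lines of Python. This is only a minimal illustration: the sensor readings, the helper names `clean` and `fill_missing`, and the choice to fill gaps with the mean of the known values are all assumptions made for the example, not part of the slides.

```python
from statistics import mean

# Hypothetical raw speed readings (km/h) from a road sensor.
# Some values are stored as messy text and some are missing (None).
raw_readings = [" 62", "58 ", None, "n/a", "71", None, "65"]


def clean(values):
    """Data cleaning: strip whitespace and turn unreadable entries into None."""
    cleaned = []
    for value in values:
        try:
            cleaned.append(float(value.strip()))
        except (AttributeError, ValueError):
            # None has no .strip(); "n/a" cannot be converted to a number.
            cleaned.append(None)
    return cleaned


def fill_missing(values):
    """Completing missing data: replace None with the mean of the known values."""
    known = [v for v in values if v is not None]
    average = mean(known)
    return [average if v is None else v for v in values]


cleaned = clean(raw_readings)
completed = fill_missing(cleaned)
print(cleaned)
print(completed)
```

Filling with the mean is the simplest possible choice; other strategies (the previous value, a prediction from a model) may suit real data better.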
18
DATA & EVENTS
19
Facebook created 4 PB of data in one day.
(Raconteur, 2020)
Users posted 350 million photos in a day on
Facebook. (Raconteur, 2020)
20
Instagram users uploaded 95 million photos
per day over the year. (e-Learning
Infographics, 2020).
22
That would be six billion searches in 365
days. (Internet Live Stats, 2021)
23
Each month, users publish 70 million blog
posts and post 77 million new comments on
WordPress. (GrowthBadger, 2021)
24
WORLD OF ZETTABYTE
25
DATA = KNOWLEDGE
26
27
A PILE OF DVDs THAT REACHES THE MOON
WHEN STACKED
DIFFICULTIES with DATA
28
Importance of DATA
IMPROVE OUTCOMES
29
Quality
Improving quality is first and foremost among the
reasons why organizations should be using data.
DATA = KNOWLEDGE
MORE DATA = MORE KNOWLEDGE
31
PROACTIVE Versus REACTIVE
32
Example.
Data: weather forecast for next week
Day:          Monday  Tuesday  Wednesday  Thursday  Friday  Saturday  Sunday
Temperature:  35°C    38°C     45°C       51°C      55°C    60°C      65°C
PROACTIVE:
Inform people now so that they can prepare.
Provide sufficient bottles of water.
Check air conditioners at home and at work.
Ban work between 12:00 and 15:00.
Thursday Friday Saturday Sunday
51°C 55°C 60°C 65°C
REACTIVE:
Wait until Thursday, when some people die from
the heat, and only then take ACTION.
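The proactive approach amounts to scanning the forecast in advance and acting as soon as a dangerous day appears in it. The sketch below shows that reasoning in Python using the forecast from the example; the 50°C alert threshold and the function name `first_dangerous_day` are assumptions chosen for illustration.

```python
# Forecast from the example above (degrees Celsius).
forecast = {
    "Monday": 35, "Tuesday": 38, "Wednesday": 45, "Thursday": 51,
    "Friday": 55, "Saturday": 60, "Sunday": 65,
}

DANGER_THRESHOLD_C = 50  # assumed alert level, not taken from the slides


def first_dangerous_day(forecast, threshold):
    """Return (day, days_after_start) for the first day above the threshold."""
    for days_ahead, (day, temperature) in enumerate(forecast.items()):
        if temperature > threshold:
            return day, days_ahead
    return None, None


day, lead_time = first_dangerous_day(forecast, DANGER_THRESHOLD_C)
if day is not None:
    print(f"Warn people now: {day} is forecast above {DANGER_THRESHOLD_C}°C, "
          f"{lead_time} days after the start of the week.")
```

With the data available on Monday, the warning can go out days before the heat arrives, instead of after the damage is done.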
33
DATA = MONEY
Strategy:
1. Give more homework
2. Use videos to explain theoretical concepts
3. Reduce the weight of midterm and final exam scores
4. Ask the students to work in groups
Data Collection:
Collect data over a period of six months and see whether
this strategy solves the problem.
REDUCE THE NUMBER OF FAILING STUDENTS.
36
Find Solutions to Problems
Data allows organizations to more effectively
determine the cause of problems.
37
Example: Travel Agency
The AGA travel agency has 4 offices. Collect the sales
data of every office over the year.
Collect DATA
Office     Sales        Consumption
Office-1   5 million    3 million
Office-2   25 million   8 million
Office-3   18 million   8 million
Office-4   1 million    2 million
DATA ANALYSIS
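A first analysis of the figures above could compare what each office sells with what it consumes. The sketch below does this in Python; it assumes that "consumption" means the office's running costs, so that sales minus consumption is the net result. That reading is an interpretation, not something the slides state.

```python
# Yearly figures per office, in millions, as listed in the example.
offices = {
    "Office-1": {"sales": 5, "consumption": 3},
    "Office-2": {"sales": 25, "consumption": 8},
    "Office-3": {"sales": 18, "consumption": 8},
    "Office-4": {"sales": 1, "consumption": 2},
}

# Assumption: "consumption" is what the office costs to run,
# so the net result is sales minus consumption.
net = {name: f["sales"] - f["consumption"] for name, f in offices.items()}

# List offices from best to worst net result.
for name, value in sorted(net.items(), key=lambda item: item[1], reverse=True):
    status = "loss" if value < 0 else "profit"
    print(f"{name}: {value:+d} million ({status})")

loss_making = [name for name, value in net.items() if value < 0]
```

Under that assumption, the data points directly at the office that costs more than it earns, which is where management would look first.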
38
Systems Advocacy
Data is a key component of systems advocacy.
Utilizing data will help you present a strong
argument for systems change.
Strategic Planning
Data allows you to replicate areas of strength
across your organization. Data analysis will help
you identify high-performing programs, service
areas, and people.
40
ALL BUSINESSES NEED BIG DATA TO FLOURISH
https://www.naukri.com/learning/articles/top-industries-hiring-data-scientists/
41
WE CANNOT GROW BUSINESSES
WITHOUT DATA
The value of the data science market
is slated to reach $16 billion by 2025
42
Top Recruiters of DATA SCIENTISTS
Amazon
Flipkart
Walmart
ITC
43
How is Data Stored?
44
Data
Example
47
We can run queries against structured data.
FROM Twitter
Twitter.UserId=Users.UserId
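The fragment above joins a `Twitter` table to a `Users` table on `UserId`. A complete query of that shape might look like the sketch below, which uses Python's built-in `sqlite3` module. Only the table names and the `UserId` join condition come from the slide; the other columns (`Name`, `TweetId`, `Text`) and all the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
    CREATE TABLE Users   (UserId INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Twitter (TweetId INTEGER PRIMARY KEY, UserId INTEGER, Text TEXT);
    INSERT INTO Users   VALUES (1, 'Amal'), (2, 'Omar');
    INSERT INTO Twitter VALUES (10, 1, 'Hello data'), (11, 2, 'Big Data!'),
                               (12, 1, 'Second post');
""")

# Match every tweet with the user who wrote it.
rows = conn.execute("""
    SELECT Users.Name, Twitter.Text
    FROM Twitter
    JOIN Users ON Twitter.UserId = Users.UserId
    ORDER BY Twitter.TweetId
""").fetchall()

for name, text in rows:
    print(f"{name}: {text}")

conn.close()
```

Queries like this work because structured data has fixed columns that the database can match and filter on.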
Organising Data
Standard data processing is made up of three basic
steps:
48
Together, these three steps make up the data
processing cycle.
49
Employee Timecards: ATTENDANCE
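The timecard example can be walked through the data processing cycle in code. The sketch below assumes the three steps are input, processing, and output, which is the usual textbook formulation; the slide's own diagram is not reproduced in this text. The employee IDs, times, and the helper name `hours_worked` are made up for illustration.

```python
from datetime import datetime

# INPUT: raw timecard entries (employee, clock-in, clock-out); values are made up.
timecards = [
    ("E001", "08:00", "16:30"),
    ("E002", "09:15", "17:00"),
    ("E001", "08:05", "12:05"),
]


def hours_worked(clock_in, clock_out):
    """Length of one shift in hours."""
    start = datetime.strptime(clock_in, "%H:%M")
    end = datetime.strptime(clock_out, "%H:%M")
    return (end - start).total_seconds() / 3600


# PROCESSING: total the hours per employee.
totals = {}
for employee, clock_in, clock_out in timecards:
    totals[employee] = totals.get(employee, 0.0) + hours_worked(clock_in, clock_out)

# OUTPUT: an attendance summary.
for employee, hours in sorted(totals.items()):
    print(f"{employee}: {hours:.2f} hours")
```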
ERA OF THE ZETTABYTE
51
TONS OF DATA
Data that is very large in size, on the scale
of ZETTABYTES, is called Big Data.
53
Units of Memory
Byte: 01011111 01111111 00000000
54
Name      Equal To          Size (In Bytes)
Byte      8 Bits            1
Kilobyte  1,024 Bytes       1,024
Megabyte  1,024 Kilobytes   1,048,576
Gigabyte  1,024 Megabytes   1,073,741,824
Terabyte  1,024 Gigabytes   1,099,511,627,776
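The sizes in the table above follow one rule: each unit is 1,024 times the previous one. A short Python sketch of that rule follows; the variable names are chosen for illustration.

```python
UNITS = ["Byte", "Kilobyte", "Megabyte", "Gigabyte", "Terabyte"]

# Each unit is 1,024 times the previous one, so unit number n holds 1024**n bytes.
sizes = {name: 1024 ** power for power, name in enumerate(UNITS)}

for name, size in sizes.items():
    print(f"{name:<9} {size:>20,} bytes")
```

Extending the `UNITS` list with Petabyte, Exabyte, and Zettabyte continues the same pattern up to the sizes discussed in the following slides.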
55
DELUGE OF DATA
58
Current systems would be very slow when
dealing with BIG DATA, or would find it
almost impossible.
Normally we work with data measured in MB
(Word documents, Excel sheets) or at most GB
(movies). Data measured in petabytes or
zettabytes, i.e. 10^15 or 10^21 bytes, is called
Big Data, and it is practically impossible to
handle with conventional systems.
59
DATA SCIENCE ENGINEERS
60
Google processes more
than 20 petabytes of data
every day. This includes
around 3.5 billion search
queries.
61
Volume of DATA
62
The amount of data in the world was
estimated to be 44 zettabytes at the
dawn of 2020.
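To get a feel for 44 zettabytes, the figure can be converted into a stack of DVDs and compared with the distance to the Moon. The calculation below is a rough sketch: it assumes single-layer DVDs of 4.7 GB, a disc thickness of 1.2 mm, the decimal definition of a zettabyte (10^21 bytes), and an average Earth-Moon distance of 384,400 km.

```python
ZETTABYTE = 10 ** 21            # bytes (decimal definition)
DVD_CAPACITY = 4.7 * 10 ** 9    # bytes on a single-layer DVD
DVD_THICKNESS_M = 1.2e-3        # a DVD is about 1.2 mm thick
EARTH_MOON_M = 384_400_000      # average Earth-Moon distance in metres

world_data = 44 * ZETTABYTE     # estimate quoted above

dvd_count = world_data / DVD_CAPACITY
stack_height_m = dvd_count * DVD_THICKNESS_M
trips_to_moon = stack_height_m / EARTH_MOON_M

print(f"DVDs needed: {dvd_count:.2e}")
print(f"Stack height: {stack_height_m / 1000:,.0f} km")
print(f"That is about {trips_to_moon:.0f} times the distance to the Moon")
```

Under these assumptions, a single zettabyte alone already gives a stack that covers a large part of the way to the Moon.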
63
By 2030, nine out of every ten people
aged six and above will be digitally
active.
ZETTABYTES
64
Social media platforms (Facebook, Google,
LinkedIn, …) generate 500 terabytes of new
data every day. This data is mainly generated
through photo and video uploads, message
exchanges, comments, etc.
65
WEATHER PREDICTION COMPANIES CAN SELL
DATA TO ORGANISATIONS
66
SOFTWARE TO HANDLE BIG DATA
67
Identification: Your IP address.
Trends: Web pages you visited
Items you are interested in.
69
How to process Big Data
70
From DATA to KNOWLEDGE
71
HADOOP FRAMEWORK FOR BIG DATA
73
Hadoop Distributed File System (HDFS)
STORAGE
74
Distributed Storage into Blocks
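The idea of distributed storage into blocks can be illustrated with a toy model: a large file is cut into fixed-size blocks, and each block is copied to several machines. The sketch below uses the HDFS defaults of 128 MB blocks and 3 copies per block, but the round-robin placement, the node names, and the function `plan_blocks` are simplifications invented for this example; real HDFS uses a rack-aware placement policy.

```python
import math

BLOCK_SIZE_MB = 128   # HDFS default block size in recent Hadoop versions
REPLICATION = 3       # HDFS default number of copies of each block


def plan_blocks(file_size_mb, nodes):
    """Split a file into blocks and assign each block's copies to nodes.

    Round-robin placement is a simplification for illustration only.
    """
    block_count = math.ceil(file_size_mb / BLOCK_SIZE_MB)
    plan = []
    for block in range(block_count):
        holders = [nodes[(block + copy) % len(nodes)] for copy in range(REPLICATION)]
        plan.append((block, holders))
    return plan


nodes = ["node-1", "node-2", "node-3", "node-4"]  # hypothetical cluster
plan = plan_blocks(500, nodes)                    # a 500 MB file
for block, holders in plan:
    print(f"block {block}: stored on {', '.join(holders)}")
```

Because every block lives on several machines, the file survives the failure of any single node, and different blocks can be read in parallel.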
77
78
DATA TYPES
79
Structured Data
80
STRUCTURED DATA
81
Semi-Structured Data
To consider what semi-structured data is,
let's start with an analogy -- interviewing.
Let's say you're conducting a semi-
structured interview. This, as the name
implies, falls somewhere in-between a
structured and unstructured interview.
82
An unstructured interview, on the other hand,
is one in which the questions, and the order in
which they are asked, are up to the discretion of
the interviewer -- and could be entirely different
for each candidate.
83
Semi-structured data is information that does
not reside in a relational database or any other
data table, but nonetheless has some
organizational properties to make it easier to
analyze. A good example of semi-structured
data is HTML code to build web pages.
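The HTML example can be made concrete. An HTML page does not fit into rows and columns, yet its tags give a program enough structure to pull information out. The sketch below uses Python's built-in `html.parser` module; the page content, the URL, and the class name `TagCollector` are made up for illustration.

```python
from html.parser import HTMLParser

# A small, made-up web page: no table rows or columns, but the tags give it structure.
page = """
<html><body>
  <h1>Big Data</h1>
  <p>Data is everywhere.</p>
  <a href="https://example.com/hadoop">Hadoop</a>
</body></html>
"""


class TagCollector(HTMLParser):
    """Record every opening tag and every link target found in the page."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)
        if tag == "a":
            self.links.extend(value for name, value in attrs if name == "href")


collector = TagCollector()
collector.feed(page)
print(collector.tags)
print(collector.links)
```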
84
Unstructured data: data in any format, with no
predefined structure (for example free text,
images, audio, and video).
85
Data Velocity Defined
Data velocity refers to the speed at which data
is generated and collected.
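Velocity is easiest to grasp as a rate. Taking the figure of 41,666,667 messages per Internet minute quoted earlier in these slides, the short Python sketch below converts it to messages per second and per day; the variable names are chosen for illustration.

```python
MESSAGES_PER_MINUTE = 41_666_667   # figure quoted earlier for one Internet minute

per_second = MESSAGES_PER_MINUTE / 60
per_day = MESSAGES_PER_MINUTE * 60 * 24

print(f"{per_second:,.0f} messages per second")
print(f"{per_day:,} messages per day")
```

A system that collects this data must keep up with the per-second rate continuously, not just store the daily total.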
87
Search for Coco Chanel Perfume
PROPOSE DISCOUNTS
88
89
DATA visualization – 3D DATA
TO DISSEMINATE IDEAS
90
1. STRUCTURED
2. SEMI-STRUCTURED
3. UNSTRUCTURED
91
92
HADOOP IS SOFTWARE THAT HAS
MANY TOOLS TO
WORK WITH BIG DATA
USING CLUSTERS OF COMPUTERS
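One of the tools in Hadoop is MapReduce, in which every computer of the cluster works on its own piece of the data (the map step) and the partial results are then combined (the reduce step). The sketch below imitates that idea on a single machine in plain Python. It does not use Hadoop itself, and the text chunks and function names are hypothetical.

```python
from collections import Counter
from functools import reduce

# Hypothetical documents, each imagined to live on a different machine.
chunks = [
    "big data needs big storage",
    "hadoop stores data in blocks",
    "clusters process big data",
]


def map_chunk(text):
    """Map step: each machine counts the words in its own chunk."""
    return Counter(text.split())


def merge_counts(left, right):
    """Reduce step: combine two partial counts into one."""
    return left + right


partial_counts = [map_chunk(chunk) for chunk in chunks]   # would run in parallel
total = reduce(merge_counts, partial_counts, Counter())

print(total.most_common(3))
```

On a real cluster the map step runs at the same time on many machines, which is what makes very large data sets workable.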
93
HADOOP IS FREE
94
95