Big Data Lec1
❑ Final exam 70
❑ Mid-term exam 10
❑ Practice exam 10
❑ Coursework 10
Timing:
❑ Lecture 3
❑ Practice 3
What is big data?
❑The notion of big data predates the advances in database
technologies. It arose from the need for solutions to handle the huge
deluge of datasets and the resulting lack of sufficient storage
capacity.
❑The notion of big data has evolved over the past decades,
with each decade described in terms of computer disk space,
from the megabyte (MB) in the 1970s to the exabyte (EB), which was
introduced in 2011.
What is big data?
❑Big data is a term for a collection of data sets so large and
complex that it often becomes difficult to process using
traditional data processing applications.
❑ In fact, as early as 2013, more than 2.5 quintillion (2.5 × 10^18) bytes
were being created daily from every post, share, search, click, stream,
and many other data producers.
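To put that daily figure in perspective, the arithmetic below converts 2.5 quintillion bytes into larger storage units. It is a minimal sketch that assumes decimal (SI) units, where each step up is a factor of 1000; the names `UNITS` and `to_unit` are hypothetical and chosen only for illustration.

```python
# Decimal (SI) storage units; each step up is a factor of 1000 (assumption).
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB"]


def to_unit(num_bytes, unit):
    """Convert a byte count to the given decimal unit."""
    return num_bytes / 1000 ** UNITS.index(unit)


daily_bytes = 25 * 10**17  # 2.5 quintillion bytes per day (figure from the slide)

print(to_unit(daily_bytes, "EB"))  # exabytes per day
print(to_unit(daily_bytes, "PB"))  # petabytes per day
```

Under these assumptions, the daily volume works out to 2.5 EB, or 2500 PB.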
Velocity
❑ Represents the accumulation of data at high speed, in near real time
or real time, from dissimilar data sources.
Variety (Format)
❑ Involves collecting data from various sources, in fuzzy and
heterogeneous types.
❑Current storage systems are not capable of storing this
data.
❑The demand for people with good analytical skills in big data is
increasing.
Technical Issues
❑Fault Tolerance
❑Scalability
❑Quality of Data
❑Heterogeneous Data
Fault Tolerance
❑A system's ability to continue operating uninterrupted
despite the failure of one or more of its components.
❑Fault-tolerant systems use backup components that
automatically take the place of failed components, ensuring
no loss of service.
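The failover idea described above can be sketched as follows. This is a toy illustration, not a real system: the `Component` class, the `serve` function, and the request string are all hypothetical, and a failure is simulated with a simple flag.

```python
class Component:
    """A hypothetical service component that may be down."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy

    def handle(self, request):
        if not self.healthy:
            raise RuntimeError(f"{self.name} is down")
        return f"{self.name} handled {request}"


def serve(request, components):
    """Try the primary first, then each backup in order."""
    for component in components:
        try:
            return component.handle(request)
        except RuntimeError:
            continue  # fail over to the next component
    raise RuntimeError("all components failed")


primary = Component("primary", healthy=False)  # simulate a failed primary
backup = Component("backup")
print(serve("read block 7", [primary, backup]))
```

The caller never sees the primary's failure, because the backup answers the request. Service is lost only if every component fails.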
Scalability
The property of a system to handle a growing amount of work by adding
resources to the system.
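One way to picture this is to split a fixed workload across a growing number of workers. The sketch below assumes the records are independent and can be divided evenly, which real systems rarely achieve perfectly; `partition` is a hypothetical helper, and the record list is a stand-in for real data.

```python
def partition(items, num_workers):
    """Split items round-robin into one list per worker."""
    return [items[i::num_workers] for i in range(num_workers)]


records = list(range(1000))  # stand-in for a growing workload

for workers in (1, 2, 4, 8):
    shares = partition(records, workers)
    largest = max(len(share) for share in shares)
    print(f"{workers} workers: at most {largest} records per worker")
```

Each time the number of workers doubles, the largest share is halved. This is the behavior a scalable system aims for when resources are added.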