Big Data Introduction
Black Box Data − The black box is a component of helicopters, airplanes, and jets. It
captures the voices of the flight crew, recordings from microphones and earphones, and
the performance information of the aircraft.
Social Media Data − Social media such as Facebook and Twitter hold the information
and views posted by millions of people across the globe.
Stock Exchange Data − The stock exchange data holds information about the ‘buy’
and ‘sell’ decisions that customers make on shares of different companies.
Power Grid Data − The power grid data holds information about the power consumed by
a particular node with respect to a base station.
Transport Data − Transport data includes the model, capacity, distance, and availability
of a vehicle.
Search Engine Data − Search engines retrieve lots of data from different databases.
Thus, Big Data involves huge volume, high velocity, and an extensible variety of data. The data
in it falls into three types: structured, semi-structured, and unstructured.
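As a minimal illustration, assuming the usual split into structured, semi-structured, and unstructured data, the Python sketch below reads one sample of each. The sample records and the function names are invented for this example and do not come from any real system.

```python
import csv
import io
import json
import re

# Hypothetical sample records, invented for illustration only.
structured = "symbol,action,quantity\nACME,buy,100\nACME,sell,40\n"
semi_structured = '{"user": "alice", "post": {"text": "Great flight!", "tags": ["travel"]}}'
unstructured = "Tower, this is flight 12, requesting clearance to land."


def read_structured(text):
    """Structured data has a fixed schema, so each row maps onto named columns."""
    return list(csv.DictReader(io.StringIO(text)))


def read_semi_structured(text):
    """Semi-structured data carries its own, possibly nested, field names."""
    return json.loads(text)


def read_unstructured(text):
    """Unstructured data has no schema; here it is only split into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())


rows = read_structured(structured)
doc = read_semi_structured(semi_structured)
words = read_unstructured(unstructured)
print(rows[0]["action"], doc["post"]["tags"], len(words))
```

The point of the sketch is the difference in effort: the structured and semi-structured samples can be queried by field name straight away, while the unstructured text needs further processing before it yields anything comparable.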
To harness the power of big data, you need an infrastructure that can manage and
process huge volumes of structured and unstructured data in real time and can protect data
privacy and security.
Various vendors, including Amazon, IBM, and Microsoft, offer technologies to handle
big data. While looking into the technologies that handle big
data, we examine the following two classes of technology −
NoSQL Big Data systems are designed to take advantage of new cloud computing
architectures that have emerged over the past decade to allow massive computations to be
run inexpensively and efficiently. This makes operational big data workloads much easier
to manage, cheaper, and faster to implement.
Some NoSQL systems can provide insights into patterns and trends based on real-time data
with minimal coding and without the need for data scientists and additional infrastructure.
Capturing data
Curation
Storage
Searching
Sharing
Transfer
Analysis
Presentation
Top 5 sources of big data
preferences and changing trends. Since it is self-broadcast and crosses all physical and
demographic barriers, it is the fastest way for businesses to get an in-depth overview of their
target audience, draw patterns and conclusions, and enhance their decision-making. Media includes
social media and interactive platforms, like Google, Facebook, Twitter, YouTube, Instagram, as well
as generic media like images, videos, audio, and podcasts that provide quantitative and qualitative
Cloud storage accommodates structured and unstructured data and provides businesses with
real-time information and on-demand insights. The main attributes of cloud computing are its
flexibility and scalability. As big data can be stored and sourced on public or private clouds, via networks and
‘Internet’ is commonly available to individuals and companies alike. Moreover, web services such as
Wikipedia provide free and quick informational insights to everyone. The enormity of the Web
ensures its diverse usability and is especially beneficial to start-ups and SMEs, as they don’t
have to wait to develop their own big data infrastructure and repositories before they can leverage
big data.
data is usually generated from the sensors that are connected to electronic devices. The sourcing
capacity depends on the ability of the sensors to provide accurate information in real time. IoT is now
gaining momentum and includes big data generated not only from computers and smartphones
but also, potentially, from every device that can emit data. With IoT, data can now be sourced from
medical devices, vehicular processes, video games, meters, cameras, household appliances, and the
like.
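To make the idea of sensor-sourced data concrete, here is a minimal Python sketch that groups readings by device and averages them. The device names, timestamps, and values are hypothetical, and a real IoT pipeline would receive them from a message stream, not from a list in memory.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical readings: (device_id, timestamp_seconds, value). Invented for illustration.
readings = [
    ("meter-1", 0, 20.0),
    ("meter-1", 60, 22.0),
    ("camera-7", 30, 1.0),
    ("meter-1", 120, 24.0),
]


def average_by_device(samples):
    """Group readings by device and return the mean value for each device."""
    grouped = defaultdict(list)
    for device_id, _timestamp, value in samples:
        grouped[device_id].append(value)
    return {device_id: mean(values) for device_id, values in grouped.items()}


averages = average_by_device(readings)
print(averages)
```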
relevant big data. This integration paves the way for a hybrid data model and requires low
investment and IT infrastructure costs. Furthermore, these databases are deployed for several
business intelligence purposes as well. They support the extraction of
insights that are used to drive business profits. Popular databases include
MS Access, DB2, Oracle, SQL, and Amazon Simple, among others.
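As a rough sketch of extracting an insight from a database, the example below uses Python's built-in sqlite3 module as a stand-in for the database products named above; that substitution is an assumption made for illustration. The `sales` table and its rows are also invented.

```python
import sqlite3

# An in-memory database stands in for a production data store (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")

# Hypothetical rows, invented for illustration.
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 120.0), ("south", 80.0), ("north", 30.0)],
)

# A simple business-intelligence style query: total sales per region.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
conn.close()

print(totals)
```

The same aggregate query would typically run with little change against other SQL databases, although connection setup and SQL dialect details differ by vendor.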
Extracting and analyzing data across extensive big data sources is complex
and can be frustrating and time-consuming. These complications can be resolved if
organizations address all the necessary considerations of big data, take into account relevant
data sources, and deploy them in a manner that is well tuned to their organizational goals.